Showing posts with label database. Show all posts
Showing posts with label database. Show all posts

Tuesday, November 30, 2010

Database isolation

When multiple threads are accessing data from the same table or tables in a relational database, care needs to be taken that updates from one thread or transaction do not interfere with others. Isolation is a property of database transactions that determine when changes made by one transaction are visible to other concurrent transactions.

Relational databases support different isolation levels. Isolation is typically achieved either by locking data or by serializing access to data, both of which lead to loss in concurrency ( and thus performance). It is thus important to pick the correct isolation level so you have optimal performance without loss in correctness.

To understand isolation levels, it is useful to first talk about the type of queries than can happen.

1. Dirty reads
A dirty read is one that reads uncommitted data.This is dangerous for obvious reason. The data you just read may get rolled back and never exist in the database in the future.

2. Non repeatable read
A non repeatable read reads committed data. But if you do the read again, you will see the effect of any changes to the data that were committed by other transaction.

At time a1, say a transaction t1 executes query:

select quantity from Order
where orderid = 1

It returns 2.

At time a2, transaction t2 updates the quantity to 3.

At time a3, t1 executes the same query which returns 3.

3. Phantom reads
A phantom read reads committed data as well. But a subsequent read in the same
transaction may see new data added and committed by another transaction.

At time a1, a transaction t1 does

Select OrderId from Order
where itemid = '23'

The querry returns

1
2

At time a2 , another transaction t2 inserts a new order for the same item id.

At time a3, transaction t1 executes the same query and it might return

1
2
3

Now that we understand the types of reads, let us look at the isolation levels. There are 4 isolation levels defined by ANSI

1. READ UNCOMMITTED
This is the least stringent isolation level. At this level, dirty reads are allowed. You are reading data that may or may not be committed and it is unreliable.

2. READ COMMITTED
This isolation level ensures that only committed data is read. Dirty reads are thus not allowed. But Non repeatable reads or phantom reads are possible. This isolation level is sufficient if you just want to get a snapshot of the data at a particular time and do not care that might be updated.

3. REPEATABLE READ
This isolation level ensures that rows read within a transaction will not be updated by another transaction. New rows might added, but ones already read will not change. The reads are thus repeatable. Non repeatable read is not possible, but phantom reads are possible.

4. SERIALIZABLE
This is the most stringent isolation level. Access to data is serialized. This is very expensive but none of the problem reads are possible.

The default isolation level in SQL Server, Oracle and DB2 is Read Committed. Most the time this is sufficient. Read Uncommitted is too dangerous and serializable can lead to unacceptable performance.

Repeatable Read is necessary when you say read rows from a database and based on the value, do an update or delete within the same transaction. You need repeatable read because within the transaction, the conditions that lead to update following the read ,should not change.

The default isolation level is sufficient for most routine database applications. However problems due to isolation generally show up in large scale environments where thousands of transaction are hitting the same database tables at the same time. By choosing the right isolation levels, you can ensure the performance stays acceptable while avoiding difficult to troubleshoot problems.

Sunday, September 19, 2010

Database access made simple with Spring

In "The promise of spring" I talked about some of the benefits of Spring, one of which is the simplification of the usage of JDBC.

Typical JDBC programming requires the programmer to write repeatedly the same code for basic things like loading the JDBC driver, getting a connection, closing a connection. After getting a connection, one has to create a PreparedStatement to execute the SQL. The PreparedStatement may return a ResultSet, over which the programmer needs to iterate to extract the data.

Spring addresses the above the issues as follows:

1. The cornerstone of the Spring framework is dependency injection. With Spring, the connection information such as driver, connection URL can be defined as metadata and a datasource object injected into the application. This frees the programmer from the burden of writting code to manage connections.

2. Spring provides a class JdbcTemplate that abstracts away the repetetive code involving Statements and ResultSets.

3. Lastly Spring maps JDBC checked exceptions to a runtime exception hierarchy. This helps to unclutter application code because the code no longer needs to be cluttered with database specific try catch logic.

Let us see how this works with a sample. Let us write a DAO (data access object) to insert and retrieve information from a database. Before you proceed any further, you can download the complete source code from springjdbc.zip.  In the blog below, for brevity, I show only code snippets. To run the sample, you will also need Spring 3.0, which you can download from Spring 3.0.x.

Step 1. For a database, I am going to use Apache Derby. We are going to store and retrieve user information as is typically required in any application. The schema is
create table puma_members ( 

  firstname  varchar(20), 
  lastname   varchar(30) not null, 
  street     varchar(40), 
  city       varchar(15), 
  zip        varchar(6), 
  email      varchar(30) not null primary key, 
  password   varchar(8) 

) ;
To download derby and for help on creating a database and table , see Derby documentation

Step 2. The DAO interface to access this table is
public interface MemberDAO {

    public int insertMember(Member m) ;
    public int deleteMember(String email) ;
    public Member getMember(String email) ;
}
Member is a class that has get and set methods for every column in puma_members. For brevity, the code is not shown here.

Step 3. The class MemberSpringJDBCDAO shall provide an implementation for MemberDAO using Spring JDBC.

The heavylifting in this class is done by an instance of org.springframework.jdbc.core.JdbcTemplate. However we don'nt need to instantiate it explicitly.We will let Spring create an instance and inject it into this class.

To help Spring, we however need to provide getter/setter method for JdbcTemplate.

So the class needs a private variable to hold the jdbcTemplate and setter/getter methods that Spring can call to set its value.
private JdbcTemplate jdbcTemplate ;
public JdbcTemplate getJdbcTemplate() {
    return jdbcTemplate ;
}
public void setJdbcTemplate(JdbcTemplate template) {
    jdbcTemplate = template ;
}
This is a form of dependency injection called setter injection. Spring calls the setJdbcTemplate method to provide our class with an instance of JdbcTemplate.

Step 4. How do we tell Spring we need a JdbcTemplate ? And how does Spring know what database driver to load and what database to connect to ?
All the bean definitions are in springjdbcdao.xml.The main bean memberdao has a property that references a jdbcTemplate.
<bean class="com.mj.spring.jdbc.MemberSpringJDBCDAO" id="memberdao">
    <property name="jdbcTemplate">
       <ref bean="jdbcTemplate"></ref>
    </property>
</bean>

<bean class="org.springframework.jdbc.core.JdbcTemplate" id="jdbcTemplate">
    <constructor-arg>
        <ref bean="dataSource"></ref>
    </constructor-arg>
</bean>
The jdbcTemplate bean has as it implementation org.springframework.jdbc.core.JdbcTemplate which is the class we are interested in. It references another bean dataSource , which has all the necessary jdbc configuration.
<bean id="dataSource"
  class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName" value="org.apache.derby.jdbc.EmbeddedDriver"/>
    <property name="url" 
         value="jdbc:derby:/home/mdk/mjprojects/database/pumausers"/>
</bean>
This configuration has all the information necessary for Spring to create a JdbcTemplate with a dataSource and inject it into memberDAO.

Step 5. JdbcTemplate has a number of helper methods to execute SQL commands that make implementing MemberDAO methods easy.
private static final String insert_sql = "INSERT into puma_members VALUES(?,?,?,?,?,?,?)" ;
private static final String select_sql = "Select * from puma_members where email = ?" ;
public int insertMember(Member member) {
    JdbcTemplate jt = getJdbcTemplate() ;
    Object[] params = new Object[] {member.getFirstname(),member.getLastname(),
                           member.getStreet(),member.getCity(),member.getZip(),
                           member.getEmail(),member.getPassword()} ;
    int ret = jt.update(insert_sql, params) ;
    return ret ;
}
public Member getMember(String email) {
    JdbcTemplate jt = getJdbcTemplate() ;
    Object[] params = new Object[] {email} ;
    List result = jt.query(select_sql,params, new MemberRowMapper()) ;
    Member member = (Member)result.get(0) ;
    return member;
}
private class MemberRowMapper implements RowMapper {
    public Object mapRow(ResultSet rs, int arg1) throws SQLException {
        Member member = new Member(rs.getString("firstname"), rs.getString("lastname"), 
                                   rs.getString("street"), rs.getString("city"), rs.getString("zip"),
                                   rs.getString("email"), rs.getString("password"));
         
        return member ;
    }
}
In insertMember, the update method of JdbcTemplate takes 2 parameters, the SQL insert statement and an array that contains the data to be inserted. In getMember, the query method takes an additional parameter, a class that implements the RowMapper interface, and this maps the JDBC resultSet to the object we want, which is an instance of Member. The Spring javadocs very clearly state that JdbcTemplate is star of the Spring JDBC package. It has several variations of query, update, execute methods. Too many,one might think.

Step 6. The class MemberSpringJDBCDAOTest has the junit tests that test MemberSPringJDBCDAO. A Snippet is below
public void insertMember() {
    ApplicationContext context = new ClassPathXmlApplicationContext(
                                                "springjdbcdao.xml");
    BeanFactory factory = (BeanFactory) context;
    MemberSpringJDBCDAO mDAO=(MemberSpringJDBCDAO) factory.getBean("memberdao");
    Member newMember = new Member("John","Doe","2121 FirstStreet","Doecity",
                                   "42345","jdoe@gmail.com","jondoe") ;
    int ret = mDAO.insertMember(newMember) ;
}
This is typical Spring client code. First we create a BeanFactory and load the metadata in springjdbcdao.xml. Then we request the factory to create a memberDAO. Insert a record into the database by calling the insertMember method.

Clearly the code is a lot simpler than if you implemented MemberDAO in plain JDBC. If you are new to Spring and are intimated with buzz words like inversion of control (IOC), then using Spring for database access is a good way to start benefiting from Spring while learning to use it. Note the loose coupling between the interface MemberDAO and its implementation. The loose coupling is good design and a reason for the popularity of frameworks like Spring. In future blogs, I will implement MemberDAO using other persistence APIs like JPA and may be Hibernate and show how the implementation can be switched without having to change client code.