Sunday, September 19, 2010

Database access made simple with Spring

In "The promise of spring" I talked about some of the benefits of Spring, one of which is the simplification of the usage of JDBC.

Typical JDBC programming requires the programmer to repeatedly write the same code for basic things like loading the JDBC driver, getting a connection and closing a connection. After getting a connection, one has to create a PreparedStatement to execute the SQL. The PreparedStatement may return a ResultSet, over which the programmer needs to iterate to extract the data.

Spring addresses the above issues as follows:

1. The cornerstone of the Spring framework is dependency injection. With Spring, connection information such as the driver and connection URL can be defined as metadata, and a datasource object is injected into the application. This frees the programmer from the burden of writing code to manage connections.

2. Spring provides a class JdbcTemplate that abstracts away the repetitive code involving Statements and ResultSets.

3. Lastly, Spring maps JDBC checked exceptions to a runtime exception hierarchy. This unclutters application code, which no longer needs database-specific try/catch logic.
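To make point 3 concrete, here is a minimal sketch of exception translation in plain Java, with no Spring on the classpath. The names DataAccessFailure, SqlWork and execute are made up for illustration; Spring's own unchecked hierarchy is rooted at DataAccessException and is considerably richer than this.

```java
import java.sql.SQLException;

// Hypothetical unchecked exception, standing in for Spring's DataAccessException hierarchy.
class DataAccessFailure extends RuntimeException {
    DataAccessFailure(String message, Throwable cause) {
        super(message, cause);
    }
}

class ExceptionTranslationSketch {

    // A JDBC-style operation that declares the checked SQLException.
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    // Runs the work and rethrows any SQLException as an unchecked exception,
    // so callers are not forced to write try/catch blocks.
    static <T> T execute(SqlWork<T> work) {
        try {
            return work.run();
        } catch (SQLException e) {
            throw new DataAccessFailure("Database call failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        // No try/catch is required at the call site.
        int rows = execute(() -> 1);
        System.out.println("rows = " + rows);

        // A caller that does care can still catch the unchecked exception.
        try {
            execute(() -> { throw new SQLException("duplicate key"); });
        } catch (DataAccessFailure e) {
            System.out.println("caught unchecked: " + e.getMessage());
        }
    }
}
```

The original SQLException is kept as the cause, so no diagnostic information is lost by the translation.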

Let us see how this works with a sample. Let us write a DAO (data access object) to insert and retrieve information from a database. Before you proceed any further, you can download the complete source code from springjdbc.zip.  In the blog below, for brevity, I show only code snippets. To run the sample, you will also need Spring 3.0, which you can download from Spring 3.0.x.

Step 1. For a database, I am going to use Apache Derby. We are going to store and retrieve user information as is typically required in any application. The schema is
create table puma_members ( 

  firstname  varchar(20), 
  lastname   varchar(30) not null, 
  street     varchar(40), 
  city       varchar(15), 
  zip        varchar(6), 
  email      varchar(30) not null primary key, 
  password   varchar(8) 

) ;
To download Derby and for help on creating a database and table, see the Derby documentation.

Step 2. The DAO interface to access this table is
public interface MemberDAO {

    public int insertMember(Member m) ;
    public int deleteMember(String email) ;
    public Member getMember(String email) ;
}
Member is a class that has get and set methods for every column in puma_members. For brevity, the code is not shown here.

Step 3. The class MemberSpringJDBCDAO shall provide an implementation for MemberDAO using Spring JDBC.

The heavy lifting in this class is done by an instance of org.springframework.jdbc.core.JdbcTemplate. However, we don't need to instantiate it explicitly. We will let Spring create an instance and inject it into this class.

To help Spring, however, we need to provide getter/setter methods for JdbcTemplate.

So the class needs a private variable to hold the jdbcTemplate and setter/getter methods that Spring can call to set its value.
private JdbcTemplate jdbcTemplate ;
public JdbcTemplate getJdbcTemplate() {
    return jdbcTemplate ;
}
public void setJdbcTemplate(JdbcTemplate template) {
    jdbcTemplate = template ;
}
This is a form of dependency injection called setter injection. Spring calls the setJdbcTemplate method to provide our class with an instance of JdbcTemplate.
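If you are curious what "Spring calls the setter" amounts to, the sketch below mimics it in plain Java using reflection. It is only an illustration under my own assumptions: FakeTemplate, FakeDao and inject are hypothetical names, and the real Spring container does far more (type conversion, lifecycle callbacks, and so on).

```java
import java.lang.reflect.Method;

// Hypothetical stand-ins for JdbcTemplate and the DAO; no Spring involved.
class FakeTemplate {
    final String name;
    FakeTemplate(String name) { this.name = name; }
}

class FakeDao {
    private FakeTemplate jdbcTemplate;
    public FakeTemplate getJdbcTemplate() { return jdbcTemplate; }
    public void setJdbcTemplate(FakeTemplate template) { jdbcTemplate = template; }
}

class SetterInjectionSketch {

    // Finds the public setter for the named property and calls it with the dependency.
    static void inject(Object target, String property, Object dependency) {
        String setter = "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(setter)
                    && m.getParameterCount() == 1
                    && m.getParameterTypes()[0].isInstance(dependency)) {
                try {
                    m.invoke(target, dependency);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Could not call " + setter, e);
                }
                return;
            }
        }
        throw new IllegalArgumentException("No setter " + setter + " on " + target.getClass());
    }

    public static void main(String[] args) {
        FakeDao dao = new FakeDao();
        // The "container" creates the dependency and hands it to the bean.
        inject(dao, "jdbcTemplate", new FakeTemplate("derby"));
        System.out.println(dao.getJdbcTemplate().name); // prints derby
    }
}
```

The point is that the DAO never creates its own template; something outside the class decides which instance it gets.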

Step 4. How do we tell Spring we need a JdbcTemplate ? And how does Spring know what database driver to load and what database to connect to ?
All the bean definitions are in springjdbcdao.xml. The main bean memberdao has a property that references a jdbcTemplate.
<bean class="com.mj.spring.jdbc.MemberSpringJDBCDAO" id="memberdao">
    <property name="jdbcTemplate">
       <ref bean="jdbcTemplate"></ref>
    </property>
</bean>

<bean class="org.springframework.jdbc.core.JdbcTemplate" id="jdbcTemplate">
    <constructor-arg>
        <ref bean="dataSource"></ref>
    </constructor-arg>
</bean>
The jdbcTemplate bean has as its implementation org.springframework.jdbc.core.JdbcTemplate, which is the class we are interested in. It references another bean, dataSource, which has all the necessary JDBC configuration.
<bean id="dataSource"
  class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName" value="org.apache.derby.jdbc.EmbeddedDriver"/>
    <property name="url" 
         value="jdbc:derby:/home/mdk/mjprojects/database/pumausers"/>
</bean>
This configuration has all the information necessary for Spring to create a JdbcTemplate with a dataSource and inject it into memberDAO.

Step 5. JdbcTemplate has a number of helper methods to execute SQL commands that make implementing MemberDAO methods easy.
private static final String insert_sql = "INSERT into puma_members VALUES(?,?,?,?,?,?,?)" ;
private static final String select_sql = "Select * from puma_members where email = ?" ;
public int insertMember(Member member) {
    JdbcTemplate jt = getJdbcTemplate() ;
    Object[] params = new Object[] {member.getFirstname(),member.getLastname(),
                           member.getStreet(),member.getCity(),member.getZip(),
                           member.getEmail(),member.getPassword()} ;
    int ret = jt.update(insert_sql, params) ;
    return ret ;
}
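public Member getMember(String email) {
    JdbcTemplate jt = getJdbcTemplate() ;
    Object[] params = new Object[] {email} ;
    List result = jt.query(select_sql,params, new MemberRowMapper()) ;
    // query returns an empty list when no row matches the email;
    // return null instead of failing with IndexOutOfBoundsException
    if (result.isEmpty()) {
        return null ;
    }
    Member member = (Member)result.get(0) ;
    return member;
}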
private class MemberRowMapper implements RowMapper {
    public Object mapRow(ResultSet rs, int arg1) throws SQLException {
        Member member = new Member(rs.getString("firstname"), rs.getString("lastname"), 
                                   rs.getString("street"), rs.getString("city"), rs.getString("zip"),
                                   rs.getString("email"), rs.getString("password"));
         
        return member ;
    }
}
In insertMember, the update method of JdbcTemplate takes 2 parameters: the SQL insert statement and an array that contains the data to be inserted. In getMember, the query method takes an additional parameter, a class that implements the RowMapper interface, and this maps the JDBC ResultSet to the object we want, which is an instance of Member. The Spring javadocs very clearly state that JdbcTemplate is the star of the Spring JDBC package. It has several variations of the query, update and execute methods. Too many, one might think.
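The division of labor between query and RowMapper is a template-with-callback pattern: the template owns the loop over rows, and the callback knows how to turn one row into one object. Here is a rough sketch of that idea in plain Java. It does not use Spring or JDBC; rows are simulated as maps, and SimpleRowMapper and query are hypothetical names of my own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowMappingSketch {

    // Hypothetical callback, playing the role of Spring's RowMapper.
    interface SimpleRowMapper<T> {
        T mapRow(Map<String, Object> row, int rowNum);
    }

    // Plays the role of JdbcTemplate.query: owns the loop, delegates per-row mapping.
    static <T> List<T> query(List<Map<String, Object>> rows, SimpleRowMapper<T> mapper) {
        List<T> results = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            results.add(mapper.mapRow(rows.get(i), i));
        }
        return results;
    }

    public static void main(String[] args) {
        // One simulated row with two of the puma_members columns.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("firstname", "John");
        row.put("lastname", "Doe");
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row);

        List<String> names = query(rows, (r, n) -> r.get("firstname") + " " + r.get("lastname"));
        System.out.println(names); // prints [John Doe]
    }
}
```

With the real JdbcTemplate the rows come from a ResultSet and the template also takes care of the connection and statement, which is exactly the code we no longer write.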

Step 6. The class MemberSpringJDBCDAOTest has the JUnit tests that test MemberSpringJDBCDAO. A snippet is below.
public void insertMember() {
    ApplicationContext context = new ClassPathXmlApplicationContext(
                                                "springjdbcdao.xml");
    BeanFactory factory = (BeanFactory) context;
    MemberSpringJDBCDAO mDAO=(MemberSpringJDBCDAO) factory.getBean("memberdao");
    Member newMember = new Member("John","Doe","2121 FirstStreet","Doecity",
                                   "42345","jdoe@gmail.com","jondoe") ;
    int ret = mDAO.insertMember(newMember) ;
}
This is typical Spring client code. First we create a BeanFactory and load the metadata in springjdbcdao.xml. Then we request the factory to create a memberdao. Finally, we insert a record into the database by calling the insertMember method.

Clearly the code is a lot simpler than if you implemented MemberDAO in plain JDBC. If you are new to Spring and are intimidated by buzzwords like inversion of control (IoC), then using Spring for database access is a good way to start benefiting from Spring while learning to use it. Note the loose coupling between the interface MemberDAO and its implementation. The loose coupling is good design and a reason for the popularity of frameworks like Spring. In future blogs, I will implement MemberDAO using other persistence APIs like JPA and maybe Hibernate and show how the implementation can be switched without having to change client code.
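As a small taste of that loose coupling, here is a sketch of a second MemberDAO implementation that keeps members in memory, which can be handy in unit tests. This is my own illustration and not part of springjdbc.zip: InMemoryMemberDAO and MemberClient are hypothetical, and Member is cut down to two fields to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the Member class (only two of the seven columns).
class Member {
    private final String email;
    private final String lastname;
    Member(String email, String lastname) { this.email = email; this.lastname = lastname; }
    String getEmail() { return email; }
    String getLastname() { return lastname; }
}

// Same contract as the MemberDAO interface in Step 2.
interface MemberDAO {
    int insertMember(Member m);
    int deleteMember(String email);
    Member getMember(String email);
}

// Hypothetical in-memory implementation; returns row counts like the JDBC version.
class InMemoryMemberDAO implements MemberDAO {
    private final Map<String, Member> byEmail = new HashMap<>();

    public int insertMember(Member m) {
        return byEmail.putIfAbsent(m.getEmail(), m) == null ? 1 : 0;
    }
    public int deleteMember(String email) {
        return byEmail.remove(email) == null ? 0 : 1;
    }
    public Member getMember(String email) {
        return byEmail.get(email);
    }
}

class MemberClient {
    // Client code depends only on the interface, not on any implementation.
    static String lastNameOf(MemberDAO dao, String email) {
        Member m = dao.getMember(email);
        return m == null ? null : m.getLastname();
    }

    public static void main(String[] args) {
        MemberDAO dao = new InMemoryMemberDAO(); // could equally be the Spring JDBC implementation
        dao.insertMember(new Member("jdoe@gmail.com", "Doe"));
        System.out.println(lastNameOf(dao, "jdoe@gmail.com")); // prints Doe
    }
}
```

Because lastNameOf only sees MemberDAO, swapping the implementation is a one-line change (or, with Spring, a change to the XML file).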

Tuesday, August 31, 2010

Security concerns with Cloud Computing

A few years ago, there was a news story about an online tax service mixing up people's tax returns. That scared the hell out of me. While I continue to e-file my returns, I use old-fashioned desktop tax preparation software to prepare my return and then e-file it. I am not yet comfortable with someone else hosting all the personal information in a tax return.

Concerns regarding security are a primary barrier to businesses adopting cloud based services for critical stuff. As a cloud consumer, below are some issues to think about and ask the provider. As a cloud service provider, these are issues to think about and address.

1. Physical security

Are the premises that host the servers, databases etc. physically secure? The buildings that host the systems need state-of-the-art security technology to restrict entry, monitor who goes in and who goes out, and record who is doing what. Location is important as well. Who would want their business data hosted in a high crime area or in a country with a track record of wars?

2. Isolation

A cloud service provider is hosting data from multiple customers. That is something users should never have to care about. Any mixup, like the one described in the first paragraph, is completely unacceptable.

3. Authentication and Access control

When sensitive data is hosted outside your enterprise, are the people who manage or access the data properly authenticated ? Is the access limited to those that absolutely need it ? Is the access control policy available for review and reviewed periodically ?

4. Data security

Is sensitive data encrypted ? Operations staff such as system administrators manage files and databases. They need to move, back up and copy stuff, but they do not necessarily need to be able to read credit card numbers from a customer table.

5. Audit trail

As is the case in any business, things can go wrong. There will be the bad apple who happens to come across some sensitive information and decides to misuse it. To be able to investigate such issues, a detailed audit trail is required. Who entered or left the premises ? Who logged on to the system ? What actions did they perform ?

As the saying goes "forewarned is forearmed". If you know the security practices of your provider, you can weigh risk versus benefit and decide what is appropriate to be hosted by the provider.

Sunday, August 15, 2010

The promise of Spring

The Spring framework  is a popular alternative to J2EE for enterprise application development.  To download Spring or read about it, visit http://www.springsource.org/

The top 5 documented benefits of Spring are:

1. Spring saves the developer the time and effort required to write boilerplate code like JNDI lookups, creating JDBC connections, API specific exception handling etc. Everyone has their horror story about not finding a JNDI reference, and a framework that abstracts such plumbing code away from the application developer is certainly useful.

2. Spring is a lightweight framework. A lightweight framework should be small in size, conceptually simple and easy to use.

3. Spring simplifies database access. Most applications need to store and retrieve data from a relational database. There are many alternatives for database access such as JDBC, JPA, Hibernate, with varying levels of complexity. Spring provides a simpler abstraction over these APIs. Similarly it simplifies web development with its MVC framework.

4. Spring can be used in various environments. It can be used in standalone J2SE applications. It can be used with a web server such as Tomcat. It can be used with a full blown application server like JBoss or WebSphere.

5. Spring lets you develop applications by assembling loosely coupled components. Loose coupling is a better design practice because it allows you to swap out moving parts without having to do major changes to the application.  Spring achieves this by what it calls "Inversion of Control" and "Dependency Injection".  Actually, it is mostly dependency injection.

Inversion of Control and Dependency Injection are a topic for a separate blog. But very briefly, in Spring, when you author a "bean", you specify the dependencies in an XML file and the Spring container makes them available. You don't have to explicitly create each dependency.

In subsequent blogs, using some code examples, let us see if Spring delivers on its promise. I will dig deeper with examples and point out where Spring really simplifies development and where it just adds another layer of complexity. So stay tuned.

Sunday, July 25, 2010

Cloud computing: I can see the "cloud" clearly now the rain ....

There are many blogs and articles on the internet on cloud computing and perhaps there are too many. Yet the question keeps popping up. What is cloud computing ? Let us try to clear the fog around this topic.

Almost everyone uses online services like Gmail, Google Docs, Yahoo Mail etc. To use these services, there is no investment to the consumer. There are no software licensing costs, and there is no time spent in installing, managing and upgrading software - not even client software. The infrastructure is provided and managed behind the scenes by providers like Google. Behind the scenes, Google adds the necessary hardware and upgrades the software when necessary, to ensure that you and I as end users get the same quality of service when we use Gmail. This is an example of what is called software as a service and is a form of cloud computing. Some companies do the same for enterprise software, the best example of which is SalesForce.com. Enterprise software is extremely hard to install and manage. With cloud computing, you can pay a fee and start using the software and let the provider take care of installation, management, maintenance and customization.

Companies like Amazon offer storage and servers that an IT department can use on demand. If I am running an IT department and all of a sudden the enterprise needs several gigabytes of disk space or additional servers to run some additional jobs, I want to be able to rent the disk space or servers and use them temporarily - without an 18 wheeler bringing in boxes of hardware that I need to install, configure and manage. This is an example of the cloud offering computing services just the way utility companies offer electricity or water.

There are many more variations of this concept of providing some computing infrastructure whether hardware or software, over the internet for use by a customer.

Why is this stuff called the cloud ? The services we are talking about are generally offered over the internet. In computing literature, the cloud drawing is a popular way to show the internet as a computer network. The idea is that as a user you do not care about what is in the cloud, but you can reliably get some computing service from it. Cloud is an abstraction for the underlying computing technology which makes it easy to offer services in a highly dynamic and scalable manner. What distinguishes cloud computing from other forms of distributed computing is that one can view the cloud as a vast supply of computing power, some of which you can buy for a fee.

If you are providing a cloud based service, a key requirement is that the architecture needs to be dynamic. Depending on user demand, the provider should be able to scale seamlessly by adding hardware or software as necessary. This could be anything from adding more storage to starting more server processes to handle user requests.

The consumer of the service should not have to do any infrastructure setup related to the service. Ideally the consumer should be able to use the service with minimal help from the provider. When you move into a new apartment or house, you call up the utility company to turn on the electricity service. After that you have uninterrupted service as long as you are paying your monthly bill. Using a cloud service should be as easy as that.

A good cloud platform is self managed and self healing. Services and usage are monitored and resources allocated optimally. Problems should be detected automatically and fixed without interruption of service.

Security for applications as well as data is critical for obvious reasons. Customers would suffer severe financial damages if their private data falls into the wrong hands.

So just putting up a web site that is hosted from your garage does not necessarily make you a cloud service provider.

The biggest advantage of cloud based services for the customer is that they provide a low cost entry point to a variety of computing services. Whether it is storage, web hosting, or application services for email, documents or even things like payroll, being able to get them with no infrastructure investment is a huge plus. Obviously, as with any outsourcing, you give up some control and become dependent on the service provider. However, when you are starting out as a business, the lack of control might be tolerable when compared to the cost of setting up a data center or huge IT department. As the business grows and you have solid revenues, you can of course decide to move some functions in house.

There are advantages for the service provider as well, especially for software.

If you have developed commercial software, you know how expensive  it is to support several different versions. If your software is available as a cloud service, every user is on the same version.

Since users are not installing the software on their hardware, you do not have to support different platforms.

Since you control hardware and software, making updates/upgrades is easier. They can happen behind the scenes.

The application can be tuned to the hardware/OS platform of your choice and hence can deliver the best performance.

To conclude, it is clear that the cloud computing paradigm does provide value to both customers and providers and it is not just a rehash of old technology - cloud service providers have to solve some real technical problems to ensure the quality of service for their customers.

Sunday, June 27, 2010

What Java Map class should I use ?

The interface java.util.Map is an interface for mapping keys to values. Implementations of Map provide in memory maps of keys to values. The interface provides methods to insert (key,value) pairs into a map and methods to look up values based on the key.

The JDK provides a few implementations of the Map interface such as HashMap, LinkedHashMap, TreeMap and IdentityHashMap. Which implementation should you use ? The answer depends on what your requirements are.

The HashMap is implemented using a hash table. Hash tables use a hash function to calculate the index into an array where the value is placed. Hence HashMap provides the fastest search and insertion times, O(1) on average. HashMaps however take up more memory. Also they do not guarantee any ordering for the keys. If you iterate over the keys, you will find them in no particular order.

The LinkedHashMap is implemented using both a hash table and a linked list. In addition to using a hash table for fast retrieval, it maintains a doubly linked list through the items. The linked list helps preserve the order in which keys were inserted into the map. A search or get operation is still as fast as with a HashMap, but inserts or deletes are a little slower because of the additional work involved in maintaining the linked list. LinkedHashMap can be used if you are going to need to iterate over the keys in, say, the order in which the keys were inserted into the Map. LinkedHashMap also has a constructor parameter accessOrder, which when set to true, changes the ordering from insertion order to access order - from least recently accessed to most recently accessed. This is particularly useful if you are using the Map to build a cache.
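Here is a small sketch of the cache idea: a least-recently-used cache built by turning on accessOrder and overriding removeEldestEntry. The class name LruCache and the limit of 2 entries are just choices for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A small least-recently-used cache built on LinkedHashMap's access order.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // initial capacity 16, load factor 0.75, accessOrder = true
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // "a" is now the most recently accessed
        cache.put("c", 3);   // evicts "b", the least recently accessed
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Note that this sketch is not thread safe; a cache shared between threads would need external synchronization.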

TreeMap is implemented using a red-black tree, which is a balanced tree. When items are inserted into a tree, they are placed at a position based on their natural ordering or based on a provided comparator. The ordering is thus a sorted order. Search and insertion for TreeMap are of the order O(log(n)). This is slower than HashMap and LinkedHashMap. You might need to use a TreeMap if you are going to need to iterate over the keys in the map in a sorted order.
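A quick sketch of both orderings, natural and comparator based. The helper name sorted and the sample keys are made up for the example; passing null as the comparator tells TreeMap to use the natural ordering of the keys.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class TreeMapOrderSketch {

    // Copies the entries into a TreeMap ordered by the given comparator
    // (null means the natural ordering of the keys).
    static TreeMap<String, Integer> sorted(Map<String, Integer> source, Comparator<String> order) {
        TreeMap<String, Integer> result = new TreeMap<>(order);
        result.putAll(source);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> fruit = new HashMap<>();
        fruit.put("pear", 3);
        fruit.put("apple", 1);
        fruit.put("fig", 2);

        System.out.println(sorted(fruit, null));                      // prints {apple=1, fig=2, pear=3}
        System.out.println(sorted(fruit, Comparator.reverseOrder())); // prints {pear=3, fig=2, apple=1}
    }
}
```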

IdentityHashMap is implemented using a hash table and is very fast - O(1) for both get and insert. But unlike other Map classes which use the equals method when comparing keys, it just uses object reference equality. For 2 keys key1 and key2, it will compare using key1 == key2 rather than key1.equals(key2). The JDK documentation is clear that this is not to be used as a general purpose Map where you need lookups based on values. The specification of containsKey(Object key) in the Map interface states that it returns true if the map contains a key k such that key.equals(k). This is not true for the IdentityHashMap. IdentityHashMap is rarely used. But you might use it if you are doing something that needs keeping track of object references.
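The difference is easy to see with two String objects that are equal but are not the same object. The helper countDistinctKeys is just a name picked for this sketch.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityMapSketch {

    // Puts both keys into the map and reports how many entries the map ends up with.
    static int countDistinctKeys(Map<String, String> map, String k1, String k2) {
        map.put(k1, "first");
        map.put(k2, "second");
        return map.size();
    }

    public static void main(String[] args) {
        // Two distinct String objects with equal contents.
        String key1 = new String("id");
        String key2 = new String("id");

        // HashMap compares with equals(), so the second put overwrites the first.
        System.out.println(countDistinctKeys(new HashMap<>(), key1, key2));         // prints 1
        // IdentityHashMap compares with ==, so both keys are kept.
        System.out.println(countDistinctKeys(new IdentityHashMap<>(), key1, key2)); // prints 2
    }
}
```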

In summary,
  • HashMap is the fastest map with O(1) search and insertion times.
  • LinkedHashMap is a little slower for inserts, but maintains the insertion order.
  • TreeMap is the slowest map, but lets you iterate over the keys in a sorted order.

Sunday, April 18, 2010

The Java Memory Model

The Java memory model describes the rules that define how variables written to memory are seen, when such variables are written and read by multiple threads.

When a thread reads a variable, it is not necessarily getting the latest value from memory. The processor might return a cached value. Additionally, even though the programmer authored code where a variable is first written and later read, the compiler might reorder the statements as long as it does not change the program semantics. It is quite common for processors and compilers to do this for performance optimization. As a result, a thread might not see the values it expects to see. This can result in hard to fix bugs in concurrent programs.

The Java programming language provides synchronized, volatile, final to help write safe multithreaded code. However earlier versions of Java had several issues because the memory model was underspecified. JSR 133 fixed some of the flaws in the earlier memory model.

Most programmers know that entering a synchronized block means obtaining a lock on a monitor that ensures that no other thread can enter the synchronized block. Less familiar but equally important are the facts that
(1) Acquiring a lock and entering a synchronized block forces the thread to refresh data from memory.
(2) On exiting the synchronized block, data written is flushed to memory.

This ensures that values written by a thread in a synchronized block are visible to other threads in synchronized blocks.
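As a small illustration, here is a counter shared by two threads. Both the increment and the read go through synchronized methods on the same object, so each thread sees the other's updates and no increment is lost. The class name and the iteration count are arbitrary choices for this sketch.

```java
class SynchronizedCounter {
    private int count = 0;

    // Entering and leaving a synchronized method gives both mutual exclusion and visibility.
    public synchronized void increment() { count++; }
    public synchronized int get() { return count; }

    // Runs two threads that each increment the same counter perThread times.
    static int runTwoThreads(int perThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(10000)); // prints 20000
    }
}
```

Without synchronized on increment, the result could be anything up to 20000, because count++ is a read followed by a write.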

Ever heard of "happens before" in the context of Java ? JSR 133 introduced the term "happens before" and provided some guarantees about the ordering of actions within a program. These guarantees are

(1) Every action in a thread happens before every other action that comes after it in the thread.
(2) An unlock on a monitor happens before a subsequent lock on the same monitor
(3) A volatile write on a variable happens before a subsequent volatile read on the same variable
(4) A call to Thread.start() happens before any action in the started thread
(5) All actions in a thread happen before any other thread returns from a join() on that thread

Where action is defined in section 17.4.2 of the Java language specification as statements that can be detected or influenced by other threads. Normal read/write, volatile read/write, lock/unlock are some actions.

Rule 1 guarantees that within a single thread all actions appear to execute in the order in which they appear in the authored program, and rules 4 and 5 extend that ordering across starting a thread and joining on it. Rules 2 and 3 guarantee that between multiple threads working on shared data, the relative ordering of synchronized blocks and the order of read/writes on volatile variables is preserved.

Rules 2 and 3 make volatile very similar to a synchronized block. Prior to JSR 133, volatile still meant that a write to a volatile variable is written directly to memory and a read is read from memory. But a compiler could reorder volatile read/writes with non-volatile read/writes, causing incorrect results. This is not possible after JSR 133.
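The classic use of this guarantee is publishing ordinary data through a volatile flag. In the sketch below (class and field names are my own), the writer sets a plain field and then a volatile flag. By rule 1 the plain write comes before the volatile write, and by rule 3 the volatile write happens before the reader's volatile read, so the reader is guaranteed to see the data once it sees the flag.

```java
class VolatilePublishSketch {
    private int data = 0;                    // ordinary, non-volatile field
    private volatile boolean ready = false;  // volatile flag used to publish data

    void writer() {
        data = 42;     // rule 1: this write comes before the volatile write below
        ready = true;  // rule 3: volatile write happens before a later volatile read
    }

    int reader() {
        while (!ready) {      // volatile read; loops until the writer's flag is seen
            Thread.yield();
        }
        return data;          // sees 42 because of the ordering above
    }

    static int run() {
        VolatilePublishSketch shared = new VolatilePublishSketch();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = shared.reader());
        Thread writer = new Thread(shared::writer);
        reader.start();
        writer.start();
        try {
            reader.join();   // rule 5: the reader's write to seen[0] is visible after join
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 42
    }
}
```

If ready were not volatile, the reader would have no such guarantee and might spin forever or read a stale value of data.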

One additional notable point. This is related to final members that are initialized in the constructor of a class. As long as the constructor completes execution properly, the final members are visible to other threads without synchronization. If you however share the reference to the object from within the constructor, then all bets are off.
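One common way to stay on the safe side is to keep the constructor private and share the reference only from a factory method, after construction is complete. The sketch below shows the shape of that idea; SafeListener and createAndRegister are hypothetical names, and the registry is just a list standing in for whatever place other threads might find the object.

```java
import java.util.ArrayList;
import java.util.List;

class SafeListener {
    private final String name;   // final: visible to other threads once the constructor completes

    private SafeListener(String name) {
        this.name = name;
        // Adding `this` to a registry here would let others see a half-built object.
    }

    // Factory: finish construction first, then share the reference.
    static SafeListener createAndRegister(String name, List<SafeListener> registry) {
        SafeListener listener = new SafeListener(name);
        registry.add(listener);
        return listener;
    }

    String getName() { return name; }

    public static void main(String[] args) {
        List<SafeListener> registry = new ArrayList<>();
        createAndRegister("audit", registry);
        System.out.println(registry.get(0).getName()); // prints audit
    }
}
```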

Saturday, March 27, 2010

REST in a nutshell

REST or Representational State Transfer is the architectural style of the world wide web. REST was defined by Roy Fielding in his PhD thesis sometime around 2000. But it has come to the foreground only recently with the realization and acceptance of REST as the preferred architecture for providing web services. In recent years, REST has replaced SOAP/WSDL as the preferred way to build web services.

Representational State Transfer is a term that sounds very academic. It is best explained using the web where it is put to use every time you use the web. A typical web interaction involves a user in front of a web browser ( the client ) making a request to say get some resource from a server by typing in a URL. The representation of the resource is a document. The location of the document is described by the URL. Getting the resource from the server and displaying in the browser causes the state transition in the browser.

The protocol used in the web browser-server communication is HTTP described in RFC2616.

When you type the URL http://www.xyz.com/path/mydoc.html, the browser requests the document from the host by sending the GET command.

Everyone has at some point filled out a form on a web page, maybe to open an account or to buy something online. When you click the submit button, the browser sends the data to the server using the HTTP POST command.

Other common HTTP commands are PUT, which is used to update a document on the server, and DELETE, which is used to delete a document from the server.

This sort of interaction does not necessarily have to take place only between a browser and a server. It could very well be two servers or applications exchanging documents or data. The term web services is generally used to describe this kind of interaction. The two applications could have been written in different programming languages. But that does not matter since they are talking HTTP.  Ironically, web services were popularized by SOAP/WSDL, the technology that REST is displacing.

SOAP based web services use XML to describe the operation or method to be invoked on the server, the data that is passed as input and the data that is output. In SOAP, describing the interface, operation and data has to be done for each and every application. For example, for an application that manages users, you define methods like getUser, createUser, deleteUser and so on. For an application that manages accounts, you might define methods like getAccount, createAccount, deleteAccount and so on.

Contrast this with REST, where the operations stay the same no matter what the application is - GET, PUT, POST, DELETE. The resources are identified by URLs such as http://www.xyz.com/user/userid or
http://www.xyz.com/account/accountid. REST is thus a lot simpler. Data is transferred as the body of the HTTP message. Popular formats are XML and JSON.

An example of a GET request is

GET /account/A12345 HTTP/1.1
Host: www.xyz.com


and the server response could be

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Length: length

{"account":
  {"id":"A12345",
   "type":"checking",
    "balance":"150"
   }
}

Similarly to create a new account, the request would be

POST /account HTTP/1.1
Host: www.xyz.com
Content-Length: length
Content-Type: application/json

{"account":
   {"id":"A12346",
     "type":"checking",
     "balance":"200"
    }
}
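To see requests like these in action without a real www.xyz.com, the sketch below starts a tiny in-process server using the JDK's built-in com.sun.net.httpserver package and talks to it with HttpURLConnection. It is an illustration under my own assumptions, not a production design: the class name, the in-memory map and the port are invented, and for simplicity the account is created with PUT /account/{id} (so the id comes from the URL) rather than the POST /account shown above.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class AccountRestSketch {
    // In-memory store: account id -> JSON document.
    private static final Map<String, String> ACCOUNTS = new ConcurrentHashMap<>();

    // Server side: the HTTP verb selects the operation, the URL selects the resource.
    private static void handle(HttpExchange ex) throws IOException {
        String id = ex.getRequestURI().getPath().replaceFirst("^/account/?", "");
        String body = readAll(ex.getRequestBody());
        String method = ex.getRequestMethod();
        if (method.equals("PUT") && !id.isEmpty()) {
            ACCOUNTS.put(id, body);
            respond(ex, 200, body);
        } else if (method.equals("GET") && ACCOUNTS.containsKey(id)) {
            respond(ex, 200, ACCOUNTS.get(id));
        } else {
            respond(ex, 404, "");
        }
    }

    private static void respond(HttpExchange ex, int status, String body) throws IOException {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        ex.getResponseHeaders().set("Content-Type", "application/json; charset=utf-8");
        // A length of -1 tells the server that no response body follows.
        ex.sendResponseHeaders(status, bytes.length == 0 ? -1 : bytes.length);
        if (bytes.length > 0) {
            ex.getResponseBody().write(bytes);
        }
        ex.close();
    }

    private static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        for (int n; (n = in.read(chunk)) != -1; ) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    // Client side: sends one request and returns "status body".
    static String call(String method, String url, String body) throws IOException {
        HttpURLConnection con =
                (HttpURLConnection) URI.create(url).toURL().openConnection(Proxy.NO_PROXY);
        con.setRequestMethod(method);
        if (body != null) {
            con.setDoOutput(true);
            con.setRequestProperty("Content-Type", "application/json");
            try (OutputStream out = con.getOutputStream()) {
                out.write(body.getBytes(StandardCharsets.UTF_8));
            }
        }
        int status = con.getResponseCode();
        InputStream in = status < 400 ? con.getInputStream() : con.getErrorStream();
        return status + " " + (in == null ? "" : readAll(in));
    }

    // Starts a local server on a free port, runs three requests, and stops the server.
    static String[] roundTrip() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/account", AccountRestSketch::handle);
            server.start();
            String base = "http://127.0.0.1:" + server.getAddress().getPort() + "/account/";
            try {
                return new String[] {
                    call("PUT", base + "A12346", "{\"type\":\"checking\",\"balance\":\"200\"}"),
                    call("GET", base + "A12346", null),
                    call("GET", base + "missing", null)
                };
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String result : roundTrip()) {
            System.out.println(result);
        }
    }
}
```

The handler never looks at anything except the verb, the URL and the body of the current request, which is the stateless behavior a RESTful service aims for.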

An important characteristic of RESTful applications is that they are stateless. Each request has all the information necessary for the server to respond to the request. This is a critical requirement for such web services applications to be highly scalable. By being stateless, each subsequent request can be serviced by any other peer server, which means you can use a cluster of servers to service requests and you can have high availability and failover.

In summary, REST is a simple and powerful architectural style for building web services. REST requires identifying the data or resources in your application using URLs. The HTTP commands such as GET, PUT, POST and DELETE are the verbs or operations exposed by the application. The client and server communicate using the HTTP protocol with the data carried in the body of the message.