Tuesday, November 30, 2010

Database isolation

When multiple threads are accessing data from the same table or tables in a relational database, care needs to be taken that updates from one thread or transaction do not interfere with others. Isolation is a property of database transactions that determines when changes made by one transaction are visible to other concurrent transactions.

Relational databases support different isolation levels. Isolation is typically achieved either by locking data or by serializing access to data, both of which lead to a loss in concurrency (and thus performance). It is thus important to pick the correct isolation level so you have optimal performance without loss in correctness.

To understand isolation levels, it is useful to first talk about the types of reads that can happen.

1. Dirty reads
A dirty read is one that reads uncommitted data. This is dangerous for an obvious reason: the data you just read may get rolled back and never exist in the database.

2. Non repeatable read
A non repeatable read reads committed data. But if you do the read again within the same transaction, you will see the effect of any changes to that data that were committed by other transactions in the meantime.

At time a1, say a transaction t1 executes query:

select quantity from Order
where orderid = 1

It returns 2.

At time a2, transaction t2 updates the quantity to 3.

At time a3, t1 executes the same query which returns 3.

3. Phantom reads
A phantom read reads committed data as well. But a subsequent read in the same
transaction may see new data added and committed by another transaction.

At time a1, a transaction t1 does

Select OrderId from Order
where itemid = '23'

The query returns

1
2

At time a2, another transaction t2 inserts a new order for the same item id.

At time a3, transaction t1 executes the same query and it might return

1
2
3

Now that we understand the types of reads, let us look at the isolation levels. There are 4 isolation levels defined by ANSI:

1. READ UNCOMMITTED
This is the least stringent isolation level. At this level, dirty reads are allowed. You are reading data that may or may not be committed and it is unreliable.

2. READ COMMITTED
This isolation level ensures that only committed data is read. Dirty reads are thus not allowed, but non repeatable reads and phantom reads are possible. This isolation level is sufficient if you just want to get a snapshot of the data at a particular time and do not care that it might be updated afterwards.

3. REPEATABLE READ
This isolation level ensures that rows read within a transaction will not be updated by another transaction. New rows might be added, but ones already read will not change. The reads are thus repeatable. Non repeatable reads are not possible, but phantom reads are possible.

4. SERIALIZABLE
This is the most stringent isolation level. Access to data is serialized. This is very expensive but none of the problem reads are possible.
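
The four levels and the problem reads each one still permits can be summarized as a small lookup table. The sketch below is only an illustration: the enum names IsolationLevel and ReadAnomaly and the helper weakestPreventing are made up for this post, while the integer constants come from java.sql.Connection, which is where JDBC names the same four levels.

```java
import java.sql.Connection;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names for the three problem reads described above.
enum ReadAnomaly { DIRTY_READ, NON_REPEATABLE_READ, PHANTOM_READ }

// Hypothetical enum pairing each ANSI level with its JDBC constant
// and with the problem reads that the level still permits.
enum IsolationLevel {
    READ_UNCOMMITTED(Connection.TRANSACTION_READ_UNCOMMITTED,
            EnumSet.allOf(ReadAnomaly.class)),
    READ_COMMITTED(Connection.TRANSACTION_READ_COMMITTED,
            EnumSet.of(ReadAnomaly.NON_REPEATABLE_READ, ReadAnomaly.PHANTOM_READ)),
    REPEATABLE_READ(Connection.TRANSACTION_REPEATABLE_READ,
            EnumSet.of(ReadAnomaly.PHANTOM_READ)),
    SERIALIZABLE(Connection.TRANSACTION_SERIALIZABLE,
            EnumSet.noneOf(ReadAnomaly.class));

    final int jdbcConstant;
    private final Set<ReadAnomaly> allowed;

    IsolationLevel(int jdbcConstant, Set<ReadAnomaly> allowed) {
        this.jdbcConstant = jdbcConstant;
        this.allowed = allowed;
    }

    boolean allows(ReadAnomaly anomaly) {
        return allowed.contains(anomaly);
    }

    // Least strict level that rules out every problem read in the given set.
    static IsolationLevel weakestPreventing(Set<ReadAnomaly> unwanted) {
        for (IsolationLevel level : values()) {   // declared from weakest to strictest
            boolean rulesOutAll = true;
            for (ReadAnomaly anomaly : unwanted) {
                if (level.allows(anomaly)) {
                    rulesOutAll = false;
                }
            }
            if (rulesOutAll) {
                return level;
            }
        }
        return SERIALIZABLE;
    }
}
```

For example, asking for the weakest level that prevents non repeatable reads gives REPEATABLE_READ, which matches the descriptions above. Actual behavior can differ between database products, so treat the table as the ANSI definition rather than a guarantee about any one database.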

The default isolation level in SQL Server, Oracle and DB2 is Read Committed. Most of the time this is sufficient. Read Uncommitted is too dangerous and Serializable can lead to unacceptable performance.

Repeatable Read is necessary when you read rows from a database and, based on the values read, do an update or delete within the same transaction. You need repeatable reads because the conditions that led to the update must not change between the read and the update.
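
With JDBC, the isolation level can be raised for just the one transaction that needs it. The sketch below is a minimal illustration under several assumptions: the class RepeatableReadSketch, the callback type SqlWork, and the orders table with quantity and orderid columns (modeled on the Order example earlier in this post) are all hypothetical. Connection.setTransactionIsolation and TRANSACTION_REPEATABLE_READ are standard java.sql, but not every driver supports every level.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class RepeatableReadSketch {

    // Hypothetical callback type: the work to run inside the transaction.
    interface SqlWork<T> {
        T run(Connection con) throws SQLException;
    }

    // Runs the work in one REPEATABLE READ transaction, then restores
    // the connection's previous auto-commit mode and isolation level.
    static <T> T inRepeatableRead(Connection con, SqlWork<T> work) throws SQLException {
        int oldLevel = con.getTransactionIsolation();
        boolean oldAutoCommit = con.getAutoCommit();
        con.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
        con.setAutoCommit(false);
        try {
            T result = work.run(con);
            con.commit();
            return result;
        } catch (SQLException | RuntimeException e) {
            con.rollback();
            throw e;
        } finally {
            con.setAutoCommit(oldAutoCommit);
            con.setTransactionIsolation(oldLevel);
        }
    }

    // Reads a row, then updates it based on the value just read.
    // Table and column names are assumptions, not a real schema.
    static boolean decrementQuantity(Connection con, int orderId) throws SQLException {
        return RepeatableReadSketch.<Boolean>inRepeatableRead(con, c -> {
            int quantity;
            try (PreparedStatement read = c.prepareStatement(
                    "select quantity from orders where orderid = ?")) {
                read.setInt(1, orderId);
                try (ResultSet rs = read.executeQuery()) {
                    if (!rs.next()) {
                        return false;              // no such order
                    }
                    quantity = rs.getInt(1);
                }
            }
            if (quantity <= 0) {
                return false;                      // nothing left to decrement
            }
            try (PreparedStatement update = c.prepareStatement(
                    "update orders set quantity = ? where orderid = ?")) {
                update.setInt(1, quantity - 1);
                update.setInt(2, orderId);
                return update.executeUpdate() == 1;
            }
        });
    }
}
```

The point of the helper is that the read and the update share one transaction, so the quantity the decision was based on is the quantity that gets updated. Restoring the old settings in the finally block matters when connections come from a pool and are reused by other code.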

The default isolation level is sufficient for most routine database applications. However, problems due to isolation generally show up in large scale environments where thousands of transactions are hitting the same database tables at the same time. By choosing the right isolation levels, you can ensure that performance stays acceptable while avoiding problems that are difficult to troubleshoot.

Sunday, October 24, 2010

Developing web applications with Spring

Spring MVC enables easy web application development with a framework based on the Model View Controller (MVC) architectural pattern. The MVC pattern requires the separation of the user interface (View), the data being processed (Model) and the Controller, which manages the interactions between the view and the model.

At the core of Spring MVC is a servlet, the DispatcherServlet, that handles every request. The DispatcherServlet routes the HTTP request to a Controller class authored by the application developer. The controller class handles the request and decides which view should be displayed to the user as part of the response.

Let us develop a simple web application that takes a request and sends some data back to the user. Before you proceed any further, I recommend you download the source code at springmvc.zip

For this tutorial you will also need

(1) A webserver like Tomcat
(2) Spring 3.0
(3) Eclipse is optional. I use eclipse as my IDE. Eclipse lets you export the war that can be deployed to Tomcat. But you can use other IDEs or command line tools as well.
(4) Some familiarity with JSPs and Servlets is required.

Step 1: If you were to develop a web application in J2EE, you would typically do it by developing servlets or JSPs that are packaged in a .war file. Also necessary is a deployment descriptor, web.xml, that contains configuration metadata. The war is deployed to a web server like Tomcat.

With Spring, the first thing to do is to wire Spring to this J2EE web infrastructure by defining org.springframework.web.servlet.DispatcherServlet as the servlet class for this application. You also need to define org.springframework.web.context.ContextLoaderListener as a listener. ContextLoaderListener is responsible for loading the spring specific application context which has Spring metadata.

The web.xml setup ensures that every request to the application that matches the *.htm URL pattern is routed by the servlet engine to DispatcherServlet. The updated web.xml is shown below:
<listener>
    <listener-class>
        org.springframework.web.context.ContextLoaderListener
    </listener-class>
</listener>
<servlet>
    <servlet-name>springmvc</servlet-name>
    <servlet-class>
        org.springframework.web.servlet.DispatcherServlet
    </servlet-class>
    <load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
    <servlet-name>springmvc</servlet-name>
    <url-pattern>*.htm</url-pattern>
</servlet-mapping>
Step 2: The heavy lifting in this web application is done by a controller class. This is an ordinary java class or bean that extends org.springframework.web.servlet.mvc.AbstractController. We override the handleRequestInternal method. In this method, you would do the things necessary to handle the request which may include for example reading from a database.

The method returns an org.springframework.web.servlet.ModelAndView object which encapsulates the name of the view and any data (model) that needs to be displayed by the view. ModelAndView holds data as name value pairs. This data is later made available to the view. If the view is a JSP, then you can access the data using either JSTL techniques or by directly querying the request object. The code for our controller is shown below:
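import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.web.servlet.ModelAndView;
import org.springframework.web.servlet.mvc.AbstractController;

public class SpringMVCController extends AbstractController {
    @Override
    protected ModelAndView handleRequestInternal(HttpServletRequest request, HttpServletResponse response) {
        // "springmvc" is the logical name of the view to render
        ModelAndView mview = new ModelAndView("springmvc") ;
        // model objects are stored as name/value pairs
        mview.addObject("greeting", "Greetings from SpringMVC") ;
        mview.addObject("member1", new Member("John","Doe", 
            "1234 Main St","Pleasanton","94588","kh@gmail.com","1234")) ;
        return mview ;
    }
}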
The name of the view springmvc is passed in to the constructor of ModelAndView. The addObject methods add 2 model objects greeting and member1. Later you will see how the view can retrieve the objects and display them.

Step 3: Every Spring application needs metadata that defines the beans and their dependencies. For this application, we create a springmvc-servlet.xml. We help spring find it by specifying its location in web.xml.
<context-param>
    <param-name>contextConfigLocation</param-name>
    <param-value>/WEB-INF/springmvc-servlet.xml</param-value>
</context-param>
In springmvc-servlet.xml, the controller bean is defined as
<bean name="/*.htm" class="com.mj.spring.mvc.SpringMVCController"/>
Step 4: How does DispatcherServlet know which Controller should handle the request ?

Spring uses handler mappings to associate controllers with requests. 2 commonly used handler mappings are BeanNameUrlHandlerMapping and SimpleUrlHandlerMapping.

In BeanNameUrlHandlerMapping, when the request url matches the name of the bean, the class in the bean definition is the controller that will handle the request.

In our example, we use BeanNameUrlHandlerMapping as shown below. Every request url ending in .htm is handled by SpringMVCController.
<bean name="/*.htm" class="com.mj.spring.mvc.SpringMVCController"/>
In SimpleUrlHandlerMapping, the mapping is more explicit. You can specify a number of urls and each URL can be explicitly associated with a controller.
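
The idea behind an explicit mapping can be sketched in plain Java. This is not Spring's implementation, which has its own path matcher with richer rules. The class UrlMappingSketch and the controller name MemberController are invented for illustration, and only the `*` wildcard is handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class UrlMappingSketch {
    // Patterns are checked in the order they were registered; the first match wins.
    private final Map<String, String> controllersByPattern = new LinkedHashMap<>();

    void register(String urlPattern, String controllerName) {
        controllersByPattern.put(urlPattern, controllerName);
    }

    // Returns the controller name for the URL, or null if nothing matches.
    String lookup(String url) {
        for (Map.Entry<String, String> entry : controllersByPattern.entrySet()) {
            if (matches(entry.getKey(), url)) {
                return entry.getValue();
            }
        }
        return null;
    }

    // Treats '*' as "any run of characters except '/'"; everything else is literal.
    private static boolean matches(String pattern, String url) {
        String[] literals = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append("[^/]*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return Pattern.matches(regex.toString(), url);
    }
}
```

Registering "/members.htm" for a hypothetical MemberController and then "/*.htm" for SpringMVCController would send /members.htm to the first and every other .htm request to the second.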

Step 5: How does the DispatcherServlet know what to return as the response ?

As mentioned earlier, the handleRequestInternal method of the controller returns a ModelAndView object.

In the controller code shown above, the name of the view "springmvc" is passed in the constructor to ModelAndView. At this point we have just given the name of the view. We have not said what file or classes or artifacts help produce the html, nor have we said whether the view technology used is JSP or velocity templates or XSLT. For this you need a ViewResolver, which provides that mapping between view name and a concrete view. Spring lets you produce a concrete view using many different technologies, but for this example we shall use JSP.

Spring provides a class InternalResourceViewResolver that supports JSPs and the declaration below in springmvc-servlet.xml tells spring that we use this resolver. The prefix and suffix get added to view name to produce the path to the jsp file that renders the view.
<bean id="viewResolver" class="org.springframework.web.servlet.view.InternalResourceViewResolver">
    <property name="prefix" value="/WEB-INF/jsp/"></property>
    <property name="suffix" value=".jsp"></property>
</bean>
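
What the prefix and suffix do to a view name amounts to simple string concatenation. The class below is a hypothetical stand-in written for this post, not the Spring resolver itself, which does considerably more (for example, it builds a view object that forwards to the resource).

```java
class ViewPathSketch {
    private final String prefix;
    private final String suffix;

    ViewPathSketch(String prefix, String suffix) {
        this.prefix = prefix;
        this.suffix = suffix;
    }

    // Logical view name in, path of the file that renders it out.
    String resolve(String viewName) {
        return prefix + viewName + suffix;
    }
}
```

With the prefix and suffix from the configuration above, the view name springmvc becomes /WEB-INF/jsp/springmvc.jsp.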

Step 6: In this example, the view resolves to springmvc.jsp, which uses JSTL to get the data and display it. Spring makes the model objects greeting and member1 available to the JSP as request scope objects. For educational purposes, the code below also gets the objects directly from the request.
// Using JSTL to get the model data
${greeting}
${member1.lastname}
// Using java to get the model directly from the request
Map props = request.getParameterMap() ;
System.out.println("PARAMS =" + props) ;
Enumeration em = request.getAttributeNames() ;
while (em.hasMoreElements()) {
    String name = (String) em.nextElement() ;
    System.out.println("name = "+name) ;
}
System.out.println("Attrs are "+request.getAttributeNames()) ;
System.out.println("greeting is "+ request.getAttribute("greeting")) ;
Member m = (Member)request.getAttribute("member1") ;
System.out.println("member is "+m.toString()) ;
Step 7: All files we have developed so far should be packaged into a war file as you would in any web application. The war may be deployed to tomcat by copying to tomcat_install\webapps. I built a war that you can download at springmvc.war

Step 8: Point your web browser to http://localhost:8080/springmvc/test.htm to run the application. The browser should display the data.


To summarize, Spring simplifies web application development by providing building blocks that you can assemble easily. We built a web application using Spring MVC. Spring provides an easy way to wire together our model, the controller SpringMVCController and the view springmvc.jsp. We did not have to explicitly code any request/response handling logic. By changing the metadata in springmvc-servlet.xml, you can switch to a different controller or a different view technology.

Sunday, September 19, 2010

Database access made simple with Spring

In "The promise of spring" I talked about some of the benefits of Spring, one of which is the simplification of the usage of JDBC.

Typical JDBC programming requires the programmer to write repeatedly the same code for basic things like loading the JDBC driver, getting a connection, closing a connection. After getting a connection, one has to create a PreparedStatement to execute the SQL. The PreparedStatement may return a ResultSet, over which the programmer needs to iterate to extract the data.

Spring addresses the above issues as follows:

1. The cornerstone of the Spring framework is dependency injection. With Spring, the connection information such as the driver and connection URL can be defined as metadata and a datasource object injected into the application. This frees the programmer from the burden of writing code to manage connections.

2. Spring provides a class JdbcTemplate that abstracts away the repetitive code involving Statements and ResultSets.

3. Lastly, Spring maps JDBC checked exceptions to a runtime exception hierarchy. This helps to unclutter application code because it no longer needs database specific try/catch logic.
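
Points 2 and 3 can be illustrated without Spring. The sketch below is a toy, not JdbcTemplate: MiniTemplate, ConnectionWork and DataAccessFailure are hypothetical names. It only shows the general shape of the idea, which is that the template owns the open/close and try/catch plumbing, the caller supplies just the interesting work as a callback, and any checked SQLException is rethrown as an unchecked exception.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical unchecked wrapper, standing in for a runtime exception hierarchy.
class DataAccessFailure extends RuntimeException {
    DataAccessFailure(String message, SQLException cause) {
        super(message, cause);
    }
}

class MiniTemplate {
    // Hypothetical callback: the only code the caller has to write.
    interface ConnectionWork<T> {
        T doWith(Connection con) throws SQLException;
    }

    private final DataSource dataSource;

    MiniTemplate(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    <T> T execute(ConnectionWork<T> work) {
        // try-with-resources closes the connection even if the work fails
        try (Connection con = dataSource.getConnection()) {
            return work.doWith(con);
        } catch (SQLException e) {
            // translate the checked exception into an unchecked one
            throw new DataAccessFailure("database call failed", e);
        }
    }
}
```

Callers of execute need neither a try/catch block nor a throws clause, and they never see the code that opens and closes the connection.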

Let us see how this works with a sample. Let us write a DAO (data access object) to insert and retrieve information from a database. Before you proceed any further, you can download the complete source code from springjdbc.zip.  In the blog below, for brevity, I show only code snippets. To run the sample, you will also need Spring 3.0, which you can download from Spring 3.0.x.

Step 1. For a database, I am going to use Apache Derby. We are going to store and retrieve user information as is typically required in any application. The schema is
create table puma_members ( 

  firstname  varchar(20), 
  lastname   varchar(30) not null, 
  street     varchar(40), 
  city       varchar(15), 
  zip        varchar(6), 
  email      varchar(30) not null primary key, 
  password   varchar(8) 

) ;
To download Derby and for help on creating a database and table, see the Derby documentation.

Step 2. The DAO interface to access this table is
public interface MemberDAO {

    public int insertMember(Member m) ;
    public int deleteMember(String email) ;
    public Member getMember(String email) ;
}
Member is a class that has get and set methods for every column in puma_members. For brevity, the code is not shown here.

Step 3. The class MemberSpringJDBCDAO shall provide an implementation for MemberDAO using Spring JDBC.

The heavy lifting in this class is done by an instance of org.springframework.jdbc.core.JdbcTemplate. However, we don't need to instantiate it explicitly. We will let Spring create an instance and inject it into this class.

To help Spring, we however need to provide getter/setter methods for JdbcTemplate.

So the class needs a private variable to hold the jdbcTemplate and setter/getter methods that Spring can call to set its value.
private JdbcTemplate jdbcTemplate ;
public JdbcTemplate getJdbcTemplate() {
    return jdbcTemplate ;
}
public void setJdbcTemplate(JdbcTemplate template) {
    jdbcTemplate = template ;
}
This is a form of dependency injection called setter injection. Spring calls the setJdbcTemplate method to provide our class with an instance of JdbcTemplate.

Step 4. How do we tell Spring we need a JdbcTemplate ? And how does Spring know what database driver to load and what database to connect to ?
All the bean definitions are in springjdbcdao.xml. The main bean memberdao has a property that references a jdbcTemplate.
<bean class="com.mj.spring.jdbc.MemberSpringJDBCDAO" id="memberdao">
    <property name="jdbcTemplate">
       <ref bean="jdbcTemplate"></ref>
    </property>
</bean>

<bean class="org.springframework.jdbc.core.JdbcTemplate" id="jdbcTemplate">
    <constructor-arg>
        <ref bean="dataSource"></ref>
    </constructor-arg>
</bean>
The jdbcTemplate bean has as its implementation org.springframework.jdbc.core.JdbcTemplate, which is the class we are interested in. It references another bean, dataSource, which has all the necessary JDBC configuration.
<bean id="dataSource"
  class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName" value="org.apache.derby.jdbc.EmbeddedDriver"/>
    <property name="url" 
         value="jdbc:derby:/home/mdk/mjprojects/database/pumausers"/>
</bean>
This configuration has all the information necessary for Spring to create a JdbcTemplate with a dataSource and inject it into memberDAO.

Step 5. JdbcTemplate has a number of helper methods to execute SQL commands that make implementing MemberDAO methods easy.
private static final String insert_sql = "INSERT into puma_members VALUES(?,?,?,?,?,?,?)" ;
private static final String select_sql = "Select * from puma_members where email = ?" ;
public int insertMember(Member member) {
    JdbcTemplate jt = getJdbcTemplate() ;
    // the values are bound to the ? placeholders in column order
    Object[] params = new Object[] {member.getFirstname(),member.getLastname(),
                           member.getStreet(),member.getCity(),member.getZip(),
                           member.getEmail(),member.getPassword()} ;
    int ret = jt.update(insert_sql, params) ;
    return ret ;
}
public Member getMember(String email) {
    JdbcTemplate jt = getJdbcTemplate() ;
    Object[] params = new Object[] {email} ;
    List<Member> result = jt.query(select_sql,params, new MemberRowMapper()) ;
    // email is the primary key, so there is at most one row; return null if there is none
    return result.isEmpty() ? null : result.get(0) ;
}
private class MemberRowMapper implements RowMapper<Member> {
    // called once per row of the ResultSet; rowNum is the index of the current row
    public Member mapRow(ResultSet rs, int rowNum) throws SQLException {
        Member member = new Member(rs.getString("firstname"), rs.getString("lastname"), 
                                   rs.getString("street"), rs.getString("city"), rs.getString("zip"),
                                   rs.getString("email"), rs.getString("password"));
        return member ;
    }
}
In insertMember, the update method of JdbcTemplate takes 2 parameters: the SQL insert statement and an array that contains the data to be inserted. In getMember, the query method takes an additional parameter, a class that implements the RowMapper interface, and this maps the JDBC ResultSet to the object we want, which is an instance of Member. The Spring javadocs very clearly state that JdbcTemplate is the star of the Spring JDBC package. It has several variations of the query, update and execute methods. Too many, one might think.

Step 6. The class MemberSpringJDBCDAOTest has the JUnit tests that test MemberSpringJDBCDAO. A snippet is below:
public void insertMember() {
    ApplicationContext context = new ClassPathXmlApplicationContext(
                                                "springjdbcdao.xml");
    BeanFactory factory = (BeanFactory) context;
    MemberSpringJDBCDAO mDAO=(MemberSpringJDBCDAO) factory.getBean("memberdao");
    Member newMember = new Member("John","Doe","2121 FirstStreet","Doecity",
                                   "42345","jdoe@gmail.com","jondoe") ;
    int ret = mDAO.insertMember(newMember) ;
}
This is typical Spring client code. First we create a BeanFactory and load the metadata in springjdbcdao.xml. Then we request the factory to create a memberDAO. Finally, we insert a record into the database by calling the insertMember method.

Clearly the code is a lot simpler than if you implemented MemberDAO in plain JDBC. If you are new to Spring and are intimidated by buzz words like inversion of control (IOC), then using Spring for database access is a good way to start benefiting from Spring while learning to use it. Note the loose coupling between the interface MemberDAO and its implementation. The loose coupling is good design and a reason for the popularity of frameworks like Spring. In future blogs, I will implement MemberDAO using other persistence APIs like JPA and maybe Hibernate and show how the implementation can be switched without having to change client code.

Tuesday, August 31, 2010

Security concerns with Cloud Computing

A few years ago, there was a news story about an online tax service mixing up people's tax returns. That scared the hell out of me. While I continue to e-file my returns, I use old fashioned desktop tax preparation software to prepare my return and then e-file it. I am not yet comfortable with someone else hosting all the personal information in a tax return.

Concerns regarding security are a primary barrier to businesses adopting cloud based services for critical stuff. As a cloud consumer, below are some issues to think about and ask the provider about. As a cloud service provider, these are issues to think about and address.

1. Physical security

Are the premises that host the servers, databases etc. physically secure? The buildings that host the systems need state of the art security technology to restrict entry and monitor who goes in and who goes out, as well as record who is doing what. Location is important as well. Who would want their business data hosted in a high crime area or in a country with a track record of wars?

2. Isolation

A cloud service provider is hosting data from multiple customers. That is something users should never have to care about. Any mixup, like the one described in the first paragraph, is completely unacceptable.

3. Authentication and Access control

When sensitive data is hosted outside your enterprise, are the people who manage or access the data properly authenticated? Is the access limited to those that absolutely need it? Is the access control policy available for review and reviewed periodically?

4. Data security

Is sensitive data encrypted? Operations staff such as system administrators manage files and databases. They need to move, back up and copy stuff, but they do not necessarily need to be able to read credit card numbers from a customer table.

5. Audit trail

As is the case in any business, things can go wrong. There will be the bad apple who happens to come across some sensitive information and decides to misuse it. To be able to investigate such issues, a detailed audit trail is required. Who entered or left the premises? Who logged on to the system? What actions did they perform?

As the saying goes "forewarned is forearmed". If you know the security practices of your provider, you can weigh risk versus benefit and decide what is appropriate to be hosted by the provider.

Sunday, August 15, 2010

The promise of Spring

The Spring framework is a popular alternative to J2EE for enterprise application development. To download Spring or read about it, visit http://www.springsource.org/

The top 5 documented benefits of Spring are:

1. Spring saves the developer the time and effort required to write boilerplate code like JNDI lookups, creating JDBC connections, API specific exception handling etc. Everyone has their horror story about not finding a JNDI reference, and a framework that abstracts such plumbing code away from the application developer is certainly useful.

2. Spring is a lightweight framework. A lightweight framework should be small in size, conceptually simple and easy to use.

3. Spring simplifies database access. Most applications need to store and retrieve data from a relational database. There are many alternatives for database access such as JDBC, JPA, Hibernate, with varying levels of complexity. Spring provides a simpler abstraction over these APIs. Similarly it simplifies web development with its MVC framework.

4. Spring can be used in various environments. It can be used in standalone J2SE applications. It can be used with a web server such as Tomcat. It can be used with a full blown application server like JBoss or WebSphere.

5. Spring lets you develop applications by assembling loosely coupled components. Loose coupling is a better design practice because it allows you to swap out moving parts without having to do major changes to the application.  Spring achieves this by what it calls "Inversion of Control" and "Dependency Injection".  Actually, it is mostly dependency injection.

Inversion of Control and Dependency Injection are a topic for a separate blog. But very briefly, in Spring, when you author a "bean", you specify the dependencies in an XML file and the Spring container makes them available. You don't have to explicitly create each dependency.
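
As a rough illustration of what the container does on your behalf, the sketch below wires two objects by hand using setter injection. All the names (GreetingSource, EnglishGreetingSource, Greeter, HandWiredContainer) are hypothetical and no Spring classes are involved. With Spring, the assemble method would be replaced by bean definitions in an XML file.

```java
// The dependency is expressed as an interface, so implementations can be swapped.
interface GreetingSource {
    String greeting();
}

class EnglishGreetingSource implements GreetingSource {
    public String greeting() {
        return "Hello";
    }
}

// The bean never creates its own dependency; it only exposes a setter.
class Greeter {
    private GreetingSource source;

    public void setSource(GreetingSource source) {
        this.source = source;
    }

    String greet(String name) {
        return source.greeting() + ", " + name;
    }
}

class HandWiredContainer {
    // Stands in for the container: create the objects, then inject the dependency.
    static Greeter assemble() {
        Greeter greeter = new Greeter();
        greeter.setSource(new EnglishGreetingSource());
        return greeter;
    }
}
```

Because Greeter only knows about the GreetingSource interface, a different implementation can be injected without touching Greeter at all, which is the loose coupling described in point 5.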

In subsequent blogs, using some code examples, let us see if Spring delivers on its promise. I will dig deeper with examples and point out where Spring really simplifies development and where it just adds another layer of complexity. So stay tuned.

Sunday, July 25, 2010

Cloud computing: I can see the "cloud" clearly now the rain ....

There are many blogs and articles on the internet on cloud computing and perhaps there are too many. Yet the question keeps popping up. What is cloud computing ? Let us try to clear the fog around this topic.

Almost everyone uses online services like Gmail, Google Docs, Yahoo Mail etc. To use these services, there is no investment by the consumer. There are no software licensing costs, and there is no time spent installing, managing and upgrading software - not even client software. The infrastructure is provided and managed behind the scenes by providers like Google. Behind the scenes, Google adds the necessary hardware and upgrades the software when necessary, to ensure that you and I as end users get the same quality of service when we use Gmail. This is an example of what is called software as a service and is a form of cloud computing. Some companies do the same for enterprise software, the best example of which is SalesForce.com. Enterprise software is extremely hard to install and manage. With cloud computing, you can pay a fee and start using the software and let the provider take care of installation, management, maintenance and customization.

Companies like Amazon offer storage and servers that an IT department can use on demand. If I am running an IT department and all of a sudden the enterprise needs several gigabytes of disk space or additional servers to run some additional jobs, I want to be able to rent the disk space or servers and use them temporarily - without an 18 wheeler bringing in boxes of hardware that I need to install, configure and manage. This is an example of the cloud offering computing services just the way utility companies offer electricity or water.

There are many more variations of this concept of providing some computing infrastructure whether hardware or software, over the internet for use by a customer.

Why is this stuff called the cloud? The services we are talking about are generally offered over the internet. In computing literature, the cloud drawing is a popular way to show the internet as a computer network. The idea is that as a user you do not care about what is in the cloud, but you can reliably get some computing service from it. Cloud is an abstraction for the underlying computing technology which makes it easy to offer services in a highly dynamic and scalable manner. What distinguishes cloud computing from other forms of distributed computing is that one can view the cloud as a vast supply of computing power, some of which you can buy for a fee.

If you are providing a cloud based service, a key requirement is that the architecture needs to be dynamic. Depending on user demand, the provider should be able to scale seamlessly by adding hardware or software as necessary. This could be anything from adding more storage to starting more server processes to handle user requests.

The consumer of the service should not have to do any infrastructure setup related to the service. Ideally the consumer should be able to use the service with minimal help from the provider. When you move into a new apartment or house, you call up the utility company to turn on the electricity service. After that you have uninterrupted service as long as you are paying your monthly bill. Using a cloud service should be as easy as that.

A good cloud platform is self managed and self healing. Services and usage are monitored and resources allocated optimally. Problems should be detected automatically and fixed without interruption of service.

Security for applications as well as data is critical for obvious reasons. Customers could suffer severe financial damage if their private data fell into the wrong hands.

So just putting up a web site that is hosted from your garage does not necessarily make you a cloud service provider.

The biggest advantage of cloud based services for the customer is that they provide a low cost entry point to a variety of computing services. Whether it is storage, web hosting, or application services for email, documents or even things like payroll, being able to get them with no infrastructure investment is a huge plus. Obviously, as with any outsourcing, you give up some control and become dependent on the service provider. However, when you are starting out as a business, the lack of control might be tolerable when compared to the cost of setting up a data center or a huge IT department. As the business grows and you have solid revenues, you can of course decide to move some functions in house.

There are advantages for the service provider as well, especially for software.

If you have developed commercial software, you know how expensive  it is to support several different versions. If your software is available as a cloud service, every user is on the same version.

Since users are not installing the software on their hardware, you do not have to support different platforms.

Since you control hardware and software, making updates/upgrades is easier. They can happen behind the scenes.

The application can be tuned to the hardware/OS platform of your choice and hence can deliver the best performance.

To conclude, it is clear that the cloud computing paradigm does provide value to both customers and providers and it is not just a rehash of old technology - cloud service providers have to solve some real technical problems to ensure the quality of service for their customers.

Sunday, June 27, 2010

What Java Map class should I use ?

The interface java.util.Map is an interface for mapping keys to values. Implementations of Map provide in memory maps of keys to values. The interface provides methods to insert (key,value) pairs into a map and methods to look up values based on the key.

The JDK provides a few implementations of the Map interface such as HashMap, LinkedHashMap, TreeMap and IdentityHashMap. Which implementation should you use? The answer depends on what your requirements are.

The HashMap is implemented using a hash table. Hash tables use a hash function to calculate the index into an array where the value is placed. Hence HashMap provides the fastest search and insertion times, O(1) on average. HashMaps however take up more memory. Also, they do not guarantee any ordering for the keys. If you iterate over the keys, you will find them in no particular order.

The LinkedHashMap is implemented using both a hash table and a linked list. In addition to using a hash table for fast retrieval, it maintains a doubly linked list through the entries. The linked list preserves the order in which keys were inserted into the map. A search or get operation is still as fast as with a HashMap, but inserts and deletes are a little slower because of the additional work involved in maintaining the linked list. LinkedHashMap can be used if you need to iterate over the keys in the order in which they were inserted into the Map. LinkedHashMap also has a constructor parameter accessOrder which, when set to true, changes the ordering from insertion order to access order - from least recently accessed to most recently accessed. This is particularly useful if you are using the Map to build a cache.
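
A common way to build such a cache is to combine access order with the removeEldestEntry hook that LinkedHashMap provides. The class name LruCache and the capacity parameter below are choices made for this sketch; the three-argument constructor and removeEldestEntry are part of the JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // initial capacity, load factor, accessOrder = true
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insert; returning true
    // drops the eldest (least recently accessed) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "b" is the least recently accessed entry at that point. Note that this sketch is not thread safe, just like LinkedHashMap itself.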

TreeMap is implemented using a red-black tree, which is a self-balancing binary search tree. When items are inserted into the tree, they are placed at a position based on their natural ordering or based on a provided comparator. The ordering is thus a sorted order. Search and insertion for TreeMap are of the order O(log(n)). This is slower than HashMap and LinkedHashMap. You might need to use a TreeMap if you are going to need to iterate over the keys in the map in a sorted order.
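
Here is a small sketch of both orderings. The class and method names are made up for this example; passing null as the comparator makes TreeMap fall back to the natural ordering of the keys.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SortedKeysSketch {
    // Copies the entries into a TreeMap and returns the keys in iteration order.
    static List<String> keysInOrder(Map<String, Integer> source, Comparator<String> order) {
        Map<String, Integer> sorted = new TreeMap<>(order);  // null means natural ordering
        sorted.putAll(source);
        return new ArrayList<>(sorted.keySet());
    }
}
```

For a map with the keys pear, apple and fig, natural ordering yields apple, fig, pear, while Comparator.reverseOrder() yields pear, fig, apple, regardless of the order in which the keys were inserted.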

IdentityHashMap is implemented using a hash table and is very fast - O(1) for both get and insert. But unlike other Map classes, which use the equals method when comparing keys, it uses object reference equality. For 2 keys key1 and key2, it will compare using key1 == key2 rather than key1.equals(key2). The JDK documentation is clear that this is not to be used as a general purpose Map where you need lookups based on values. The specification of containsKey(Object key) in the Map interface states that it returns true if the map contains a key k such that key.equals(k). This is not true for IdentityHashMap. IdentityHashMap is rarely used. But you might use it if you are doing something that needs to keep track of object references.
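
The difference is easy to see with two keys that are equal by value but are distinct objects. The class and method names in this sketch are hypothetical.

```java
import java.util.Map;

class IdentityVsEquals {
    // Puts two equal-but-distinct keys into the map and reports how many entries result.
    static int entriesAfterTwoPuts(Map<String, String> map) {
        String key1 = new String("order-1");   // two separate objects ...
        String key2 = new String("order-1");   // ... with the same contents
        map.put(key1, "first");
        map.put(key2, "second");
        return map.size();
    }
}
```

Passing in a HashMap gives 1, because key1.equals(key2) is true and the second put overwrites the first. Passing in an IdentityHashMap gives 2, because key1 == key2 is false.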

In summary,
  • HashMap is the fastest map with O(1) search and insertion times.
  • LinkedHashMap is a little slower for inserts, but maintains the insertion order.
  • TreeMap is the slowest map, but lets you iterate over the keys in a sorted order.