
Tuesday, October 20, 2015

JAVA 8 : Lambdas tutorial

Lambdas are the biggest addition to Java in several releases, not just release 8. But when you first look at the cryptic lambda syntax, like most regular programmers, you are left wondering why anyone should write code this way. The purpose of this tutorial is to introduce lambdas so that you can start using them in real code.

Overview

Lambdas make it easy to define blocks of code, store them, and pass them around as parameters. A block may be stored in a variable for later use or passed to a method, which may invoke the code. This style of programming is known as functional programming.

You might argue that JAVA already supported functional programming using anonymous classes. But that approach is considered verbose.

Example

Listing 1 shows the old way to pass executable code to a thread.

  public void Listing1_oldWayRunnable() {
        Runnable r = new Runnable() {
            @Override
            public void run() {
                System.out.println("Hello Anonymous") ;
            }
        } ;
        Thread t = new Thread(r) ;
        t.start() ;
    }

Listing 2 shows the new way using lambdas.

public void Listing2() {

        Thread t = new Thread(()->System.out.println("Hello Lambdas")) ;
        t.start() ;
    }

Listing 2 has no anonymous class.  It is much more compact.

()->System.out.println("Hello Lambdas") is the lambda.

Syntax

The syntax is

(param)->statement
 where param is the parameter passed in. In our example there was no parameter, hence the syntax was ()->statement.

If you had multiple parameters, the syntax would be
(param1,param2)->statement

If you had multiple statements, the syntax would be
(param)->{statement1; statement2;}
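These forms can be sketched with a couple of standard functional interfaces from java.util.function (the class and variable names here are illustrative):

```java
public class LambdaSyntaxDemo {

    // two parameters -> single expression
    public static int add(int a, int b) {
        java.util.function.BiFunction<Integer, Integer, Integer> f = (x, y) -> x + y;
        return f.apply(a, b);
    }

    // one parameter -> block of statements in braces
    public static int square(int n) {
        java.util.function.Function<Integer, Integer> f = (x) -> {
            int result = x * x;
            return result;
        };
        return f.apply(n);
    }

    public static void main(String[] args) {
        // no parameters -> single statement
        Runnable r = () -> System.out.println("no params");
        r.run();

        System.out.println(add(2, 3));  // 5
        System.out.println(square(4));  // 16
    }
}
```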

Storing in a variable

The lambda expression can also be stored in a variable and passed around, as shown in Listing 3.

 public void Listing3() {
        Runnable r = ()->System.out.println("Hello functional interface") ;
        Thread t = new Thread(r) ;
        t.start() ;
    }

Functional interface

JAVA 8 introduces a new term: functional interface. A functional interface is an interface with just one abstract method that needs to be implemented. A lambda expression provides the implementation for that method. For this reason, lambda expressions can be assigned to variables whose type is a functional interface. In the example above, Runnable is the functional interface.

You can create new functional interfaces. They are ordinary interfaces but with only one abstract method. @FunctionalInterface is an annotation that may be used to document the fact that an interface is functional.

Listing 5 shows the definition and usage of a functional interface.

@FunctionalInterface
    public interface Greeting {
        public void sayGreeting() ;
    }

    public static void greet(Greeting s) {
        s.sayGreeting();
    }

    @Test
    public void Listing5() {
        // old way
        greet(new Greeting() {
            @Override
            public void sayGreeting() {
                System.out.println("Hello old way") ;
            }
        }) ;

        // lambda new way
        greet(()->System.out.println("Hello lambdas")) ;
    }

Once again you can see that the code with lambdas is much more compact. One subtle difference: within an anonymous class, the "this" variable resolves to the anonymous class instance, but within a lambda, "this" resolves to the enclosing class.
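A small sketch of that difference (the class and method names here are mine, not from the listings above):

```java
public class ThisDemo {

    // returns {anonymousThisIsEnclosing, lambdaThisIsEnclosing}
    public boolean[] probe() {
        final Object enclosing = this;
        final boolean[] results = new boolean[2];

        Runnable anon = new Runnable() {
            @Override
            public void run() {
                // inside an anonymous class, 'this' is the anonymous instance
                results[0] = (this == enclosing);
            }
        };

        Runnable lambda = () -> {
            // inside a lambda, 'this' is the enclosing ThisDemo instance
            results[1] = (this == enclosing);
        };

        anon.run();
        lambda.run();
        return results;
    }

    public static void main(String[] args) {
        boolean[] r = new ThisDemo().probe();
        System.out.println("anonymous: " + r[0] + ", lambda: " + r[1]); // anonymous: false, lambda: true
    }
}
```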

java.util.function

The java.util.function package in JDK 8 ships with several ready-to-use functional interfaces. For example, the Consumer interface takes a single argument and returns no result. It is widely used by new methods across the collections framework. Listing 6 shows one such use with the forEach method added to the Iterable interface, which can be used to process all elements in a collection.

@Test
    public void Listing6() {
         List<Integer> l = Arrays.asList(1,2,3,4,5,6,7,8,9) ;
         l.forEach((i)->System.out.println(i*i)) ;
    }

 

In summary, Java 8 lambdas introduce a new programming style to Java, bringing it up to par with other languages whose claim to superiority is their support for functional programming. And it is not all just style: lambdas can also provide some performance advantages. I will examine those in future blogs.


Friday, November 22, 2013

JAVA Comparable vs Comparator

The java.lang.Comparable and java.util.Comparator interfaces provide similar functionality, and there is sometimes confusion about which interface to use where. Some documentation adds to the confusion by implying that Comparable should be used for natural ordering whereas Comparator should be used for total ordering, without explaining what that means.

The Comparable interface is defined as

public interface Comparable<T> {
   int compareTo(T o) ;
}

The compareTo method is for comparing this object with the object that is passed in. Instances of classes that implement Comparable use this method to compare other instances to themselves.

Let us say the class Employee implements Comparable. If an array of Employees is passed to Arrays.sort(Object[] o), the sorting code calls the compareTo method to compare Employee objects with one another and put them in the correct order. In fact, Arrays.sort(Object[] o) expects the objects to implement the Comparable interface and throws a ClassCastException if they do not.

class Employee implements Comparable<Employee> {

    private int id ;
    private String name ;

    // natural ordering  compare by id and by name
    // simplistic 
    public int compareTo(Employee e) {

        if ( id == e.id)
             return name.compareTo(e.name) ;
        
        return Integer.compare(id, e.id) ; // avoids the overflow risk of (id - e.id)
   }

}

Generally the Comparable interface is used when

(1) You have control of the class and can make the class implement Comparable
(2) The objects need to be sorted in their natural order. The natural order of a class is the order defined by its compareTo method. For number types everyone understands the order: 23 is greater than 22. For Strings it is lexicographic order.
(3) Generally there is only one natural order for each type. In the above example, the ordering is by id, then name. If you needed a second kind of ordering, by just the id or just the name, it would not be possible with Comparable alone. But you could write multiple Comparators.

If you need to sort an array of objects which do not implement Comparable and the code for the classes is not under your control, you can write a new class that implements the Comparator interface and call the following method in java.util.Arrays.

public static <T> void sort(T[] a, Comparator<? super T> c) ;

interface Comparator<T> {

   int compare(T o1, T o2) ;

   boolean equals(Object o) ;
}

public class EmpIdComparator implements Comparator<Employee> {

   public int compare(Employee e1, Employee e2) {

          return Integer.compare(e1.id, e2.id) ; // avoids the overflow risk of e1.id - e2.id
    }

}

public class EmpNameComparator implements  Comparator<Employee> {

   public int compare(Employee e1, Employee e2) {

         return e1.name.compareTo(e2.name) ;

   }

}

If you need to sort employees by id, pass an instance of EmpIdComparator to the sort method. If you need employees sorted by name, use EmpNameComparator.
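A minimal, self-contained sketch of sorting with the two comparators (using a stripped-down Employee class so the example compiles on its own):

```java
import java.util.Arrays;
import java.util.Comparator;

public class ComparatorDemo {
    // stripped-down Employee, just enough for the demo
    static class Employee {
        final int id;
        final String name;
        Employee(int id, String name) { this.id = id; this.name = name; }
    }

    // equivalent to the article's EmpIdComparator
    static final Comparator<Employee> BY_ID = new Comparator<Employee>() {
        public int compare(Employee e1, Employee e2) {
            return Integer.compare(e1.id, e2.id);
        }
    };

    // equivalent to the article's EmpNameComparator
    static final Comparator<Employee> BY_NAME = new Comparator<Employee>() {
        public int compare(Employee e1, Employee e2) {
            return e1.name.compareTo(e2.name);
        }
    };

    public static void main(String[] args) {
        Employee[] emps = {
            new Employee(3, "Alice"), new Employee(1, "Carol"), new Employee(2, "Bob")
        };

        Arrays.sort(emps, BY_ID);
        System.out.println(emps[0].name);  // Carol (lowest id)

        Arrays.sort(emps, BY_NAME);
        System.out.println(emps[0].name);  // Alice (first alphabetically)
    }
}
```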

Here the compare method lets you compare any two objects. The Comparator interface is used when

(1) You do not have control of the class or the class does not implement Comparable for some reason.
(2) You want ordering that is different from what is generally accepted as the natural order.
(3)  You need many different kinds of ordering.

All the sorting methods and the collections that maintain order (SortedMap, SortedSet) expect the contained objects either to implement Comparable or to be accompanied by a Comparator instance that you pass in.
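For example, TreeSet (an implementation of SortedSet) accepts a Comparator at construction time; without one it falls back to the elements' natural order. A small sketch (method names are mine):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

public class SortedSetDemo {
    // a TreeSet ordered by a custom Comparator (string length) instead of natural order;
    // note: a TreeSet treats compare() == 0 as duplicates, so equal-length words collapse
    public static String shortest(String... words) {
        TreeSet<String> byLength = new TreeSet<String>(new Comparator<String>() {
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        byLength.addAll(Arrays.asList(words));
        return byLength.first();
    }

    public static void main(String[] args) {
        // natural (lexicographic) order via Comparable<String>
        TreeSet<String> natural = new TreeSet<String>(Arrays.asList("pear", "apple", "fig"));
        System.out.println(natural.first());                   // apple
        System.out.println(shortest("pear", "apple", "fig"));  // fig
    }
}
```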

Lastly, the Comparable and Comparator interfaces are useful design patterns that you can copy whenever you need to write code that applies to generics. Given a collection of type T, if you need to perform some operation on each element of type T, consider defining a generic interface with a generic method. The caller of your API can then pass in an implementation of the interface for the appropriate type.

Friday, August 30, 2013

Java Serializable vs Externalizable

Serializable and Externalizable are almost forgotten interfaces in Java, and for good reason. Many applications do not need to serialize objects explicitly. Most applications write to a database, and to do that they use either an API like JDBC or a framework like Hibernate or JPA. However, if you are writing to a file or a network, it is worthwhile to understand the default serialization mechanism.

Serializable

To leverage the default serialization mechanism in Java, you need to implement the java.io.Serializable interface. This is a marker interface: for default behavior, you do not need to implement any methods.

public class Employee implements Serializable {
    private int id ;
    private String name ;
    private String address ;

    public Employee(int i, String nm, String adr) {
        id = i ;
        name = nm ;
        address = adr ;
    }

    public int getId() {
        return id ;
    }
    
    public String getName() {
        return name ;
    }

    public String getAddress() {
        return address ;
   }

}

To Serialize an Employee Object:

try {
    Employee e = new Employee(1,"John Doe","123 5th Street, New City , CA") ;
    ObjectOutputStream os = new ObjectOutputStream(new FileOutputStream("employees.dat")) ;
    os.writeObject(e) ;
    os.close() ;
} catch (IOException e) {
    log(e) ;
}

To Deserialize an employee Object:

Employee er = null ;
try {
     ObjectInputStream is = new ObjectInputStream(new FileInputStream("employees.dat")) ;
     er = (Employee) is.readObject() ;
     is.close() ;
} catch(IOException | ClassNotFoundException e1) {
    log(e1) ;
}

You can customize the default serialization by adding the following methods:

private void writeObject(ObjectOutputStream out) throws IOException {
    out.writeInt(id) ;
    out.writeUTF(name) ;
    out.writeUTF(address) ;

   // add customization here
}
private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    id = in.readInt() ;
    name = in.readUTF() ;
    address = in.readUTF() ;
    // add customization here
}

If these methods exist, the JVM calls them during serialization. You can also call defaultWriteObject and defaultReadObject to get the default behavior first and then add to it.
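A sketch of mixing the default mechanism with custom logic, here rebuilding a transient field after deserialization (the Session class and its fields are illustrative, not from the text above):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Session implements Serializable {
    private String user;
    // transient fields are skipped by default serialization
    private transient String cachedGreeting;

    public Session(String user) {
        this.user = user;
        this.cachedGreeting = "Hello " + user;
    }

    public String getGreeting() { return cachedGreeting; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();        // serialize non-transient fields as usual
        // custom additions could go here
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();          // restore non-transient fields
        cachedGreeting = "Hello " + user; // rebuild the transient field
    }

    // serialize to a byte array and back, to exercise the methods above
    public static Session roundTrip(Session s)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(s);
        oos.close();
        ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()));
        return (Session) ois.readObject();
    }

    public static void main(String[] args) throws Exception {
        Session copy = roundTrip(new Session("John"));
        System.out.println(copy.getGreeting()); // Hello John
    }
}
```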

Default serialization is quite slow and should not be used except in the simplest use cases. Implementing readObject and writeObject does not give much improvement, because the JVM has to use reflection to determine whether those private methods exist.

Externalizable
The default serialization protocol is considered a little heavy. If you wish to completely override it and use a different protocol, you can implement the Externalizable interface which has 2 methods.

public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException

public void writeExternal(ObjectOutput out) throws IOException

Once you implement these methods, the client code for serialization is the same as above. Studies have shown that Externalizable gives better performance: since these are public methods, the JVM does not have to resort to reflection. The implementations of readExternal and writeExternal would be similar to the readObject and writeObject shown above.
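A minimal sketch of an Externalizable class (the Point class is illustrative; note that Externalizable requires a public no-arg constructor, which the JVM uses before calling readExternal):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

public class Point implements Externalizable {
    private int x;
    private int y;

    public Point() { }  // required by Externalizable
    public Point(int x, int y) { this.x = x; this.y = y; }

    public int getX() { return x; }
    public int getY() { return y; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // we fully control the wire format: just two ints, no field metadata
        out.writeInt(x);
        out.writeInt(y);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        x = in.readInt();
        y = in.readInt();
    }

    public static Point roundTrip(Point p)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(p);
        oos.close();
        ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()));
        return (Point) ois.readObject();
    }

    public static void main(String[] args) throws Exception {
        Point p = roundTrip(new Point(3, 4));
        System.out.println(p.getX() + "," + p.getY()); // 3,4
    }
}
```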

Though for most industrial-grade applications, you will want to forgo serializing objects directly and instead serialize the types that make up your objects.

In Summary,

You may use the Serializable interface for simple use cases. It is easy to use, but the performance is not good.

Using the Externalizable interface is a little more work, and it gives a little better performance.

For best performance, design a record format for the data in the objects and serialize the types directly to memory and write blocks to disk. Of course, this is much more work.

Wednesday, January 16, 2013

Generics : Array Creation

How do you write a method to convert a generic collection to an Array ? A naive implementation would be:
 
public static <T> T[] convertToArray(Collection<T> c) {

   T[] a = new T[c.size()] ; // compilation error: generic array creation
   int i = 0 ;
   for (T x : c) {
      a[i++] = x ;
   }
   return a ;
}
The code does not compile because the element type is required at runtime to create an array. An array is what is known as a reifiable type; a type is reifiable if its type information is available at runtime. Any Java class or primitive type is reifiable.
Generics, on the other hand, are implemented by erasure: the type information is erased at compile time, and the runtime uses casts to get the appropriate behavior. So while
 
List<T> a = new ArrayList<T> () ; 
works, because T is erased: under the hood just an ArrayList is created, and casts are added when getting a T. However,
 
T[] a = new T[size] ; // compile error
 
will not work, because for arrays the type information is required.
The solution is to use reflection, which is what you would do if you wanted to dynamically create an instance of any reifiable type, such as a plain Java class. The method signature in our example changes a little to take an array of the required type as an additional parameter, since that is the easiest way to convey the component type.
 
public static <T> T[] convertToArray(Collection<T> c, T[] arry) {
   if (arry.length < c.size()) {
      // create an array of the right component type using reflection
      arry = (T[]) java.lang.reflect.Array.newInstance(
                             arry.getClass().getComponentType(),c.size()) ;
   }
   int i = 0 ;
   for (T x : c) {
       arry[i++] = x ;
   }
   return arry ;
} 
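A self-contained sketch of the same technique, with a usage example (the class and variable names are mine):

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

public class ArrayConvert {
    @SuppressWarnings("unchecked")
    public static <T> T[] convertToArray(Collection<T> c, T[] arry) {
        if (arry.length < c.size()) {
            // create an array of the right runtime component type via reflection
            arry = (T[]) java.lang.reflect.Array.newInstance(
                    arry.getClass().getComponentType(), c.size());
        }
        int i = 0;
        for (T x : c) {
            arry[i++] = x;
        }
        return arry;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "bob", "cal");
        // the String[0] argument supplies the component type at runtime
        String[] arr = convertToArray(names, new String[0]);
        System.out.println(Arrays.toString(arr)); // [ann, bob, cal]
    }
}
```

This is the same idiom the JDK uses for Collection.toArray(T[]): the caller's array supplies the runtime component type that erasure removed.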
The newInstance method of Array creates an array of the required type; getComponentType returns the type of the array's elements. This is analogous to using reflection to create an instance of a class K from an existing instance k:
 
k.getClass().newInstance() ;
 
In summary, in generic methods you can use the new operator to create non-reifiable types (e.g. List<T>) because the type information is erased during compilation (a plain List is created). But for reifiable types such as arrays, you need to use reflection, because the type information is required and cannot be erased.

Friday, August 17, 2012

JAVA enum tutorial

The Java language has supported the enum type for several releases. Yet many programmers do not use it or do not fully understand all of its features.

We still see a lot of code like this:

public static final int LIGHT = 1 ;
public static final int MEDIUM = 2 ;
public static final int HEAVY = 3 ;
public static final int SUPERHEAVY = 4 ;

int weight_range  = getRange() ;
if (weight_range == LIGHT  ) {

} else if (weight_range == MEDIUM) {

} else if (weight_range == HEAVY) {

}

Such code is error prone and lacks type safety. If weight_range is serialized and deserialized somewhere, you are going to have to remember what 1, 2, 3 represent.

Java enum is a cleaner type safe way of working with constants. It is a type that has a fixed set of constant fields that are instances of the type.

1. Defining enum

Defining enum is like defining a class.

public enum WeightRange {
 LIGHT, MEDIUM, HEAVY, SUPERHEAVY 
}

defines a WeightRange enum type with 4 constant fields.

2. Creating a variable of type enum

WeightRange wclass = WeightRange.MEDIUM ;

is like declaring any other type.

3. Using the enum

WeightRange boxer_class  = getWtRangeFromSomeWhere();

if (boxer_class == WeightRange.LIGHT) {

} else if (boxer_class == WeightRange.HEAVY) {

}

is more type safe than the code without enums.

4. Enum is a class.

As mentioned above, an enum is a class. Every enum type implicitly extends java.lang.Enum. All enum types can thus have additional fields and constructors.

The above WeightRange enum can be enhanced to add fields for low and high range. The values are provided in the constructor.

public enum WeightRange {
   
    LIGHT(0,70) ,
    MEDIUM(71,150),
    HEAVY(151,225),
    SUPERHEAVY(226,350) ;
   
    private final int low ;
    private final int high ;   
   
    WeightRange(int low, int high) {       
        this.low = low ;
        this.high = high ; 
    }
}

5. Enum can also have methods.

In the above enum we can add a method to check if a given weight is within a weight range.

public boolean isInRange(int wt) {       
        return wt >= low && wt <= high ;
}

6. It can have a static factory method that takes a weight as a parameter and returns the correct enum constant.

public static WeightRange getWeightRange(int weight) {
       
        if (weight <= 70)
            return LIGHT ;
        else if (weight <= 150)
            return MEDIUM ;
        else if (weight <= 225)
            return HEAVY ;
        else
            return SUPERHEAVY ;        
}

7. Calling toString on an enum constant returns the name used to define it.

System.out.println(WeightRange.LIGHT) ; 
prints LIGHT

8. Conversely, an enum constant can be obtained from a String using the valueOf method.

WeightRange w3 = WeightRange.valueOf("MEDIUM") ;
System.out.println(w3) ;
will print MEDIUM

9. You can iterate over the constants defined in the enum.

for (WeightRange r : WeightRange.values()) {
            System.out.println(r) ;
}

10. enum constants are final.

WeightRange.LIGHT = WeightRange.HEAVY ; // compilation error

11. The only instances of an enum that can ever exist are the constants defined in the enum definition.

WeightRange r = new WeightRange(12,100) ; // compilation error

Next time you need a fixed set of constants, consider using an enum. It is type safe, leads to better code, and your constants live within a namespace.

Wednesday, May 16, 2012

When to use explicit Locks in JAVA ?

Prior to JDK 5, the only way to protect data from concurrent access was to use the synchronized keyword. The limitations of using synchronized are

(1) A thread that tries to acquire a lock has to wait until it gets the lock. There is no way to time out.
(2) A thread that is waiting for a lock cannot be interrupted.
(3) Since synchronized applies to a block of code, the lock has to be acquired and released in the same block. While this is good most of the time, there are cases where you need the flexibility of acquiring and releasing the lock in different blocks.

The Lock interfaces and classes are well documented at java.util.concurrent.locks
The basic usage of the new Lock interface is

Lock l = new ReentrantLock() ;
l.lock() ;
try {
// update
} finally {
l.unlock() ;
}

You might be tempted to say that this can be done using synchronized. However the new Lock interface has several additional features.

1. Non-blocking method
The tryLock() method (without parameters) acquires the lock if it is available; if it is not available, it returns immediately. This is very useful for avoiding deadlocks when you are trying to acquire multiple locks.

2. Timed
tryLock(long time, TimeUnit unit) acquires the lock if it becomes free within the given time; otherwise it returns false. The thread can be interrupted during the wait.

This is useful when you have service-time requirements, such as in real-time bidding. Say the method needs to respond within 10 milliseconds; otherwise the response is of no use because the bid is lost.
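A sketch of the timed form (the bid method and the timeout value are illustrative):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class TryLockDemo {
    private final Lock lock = new ReentrantLock();

    // returns the bid response, or null if the lock could not be taken in time
    public String bid(long timeoutMillis) throws InterruptedException {
        if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return null; // deadline missed: give up on this bid
        }
        try {
            return "bid-placed";
        } finally {
            lock.unlock(); // always release in a finally block
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TryLockDemo d = new TryLockDemo();
        System.out.println(d.bid(10)); // uncontended lock, prints bid-placed
    }
}
```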

3. Interruptible
The lockInterruptibly method waits for the lock but can be interrupted while waiting.
This is useful for implementing abort or cancel features.

4. Non-block-structured locking
You can acquire the lock in one method and release it in another, or wrap the lock and unlock calls in your own domain-specific acquireLock() and releaseLock() methods.

This is useful for avoiding race conditions in read-update-save operations on data stored in caches. The synchronization provided by ConcurrentHashMap or a synchronized Map protects the data only for the duration of the get or put operation, not while the data is being modified.

cache.acquireLock(key) ;
Data d = cache.get(key) ;
d.update1() ;
d.update2() ;
d.update3() ;
cache.put(key,d) ;
cache.releaseLock(key) ;

Acquiring and releasing the lock are abstracted away in the acquireLock and releaseLock methods.

5. Read /Write Locks
This is my favorite feature. The ReadWriteLock interface exposes two lock objects: a read lock and a write lock.

You acquire the read lock when all you are doing is reading. Multiple threads can hold the read lock at once; by allowing multiple readers, you achieve greater concurrency. A read lock cannot be acquired while the write lock is held by another thread.

You acquire the write lock when you need to write data. Only one thread can acquire a write lock at a time. A write lock cannot be acquired while other threads have acquired read locks.
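A minimal sketch of a cache guarded by a ReentrantReadWriteLock (the class and method names are mine):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RWCache {
    private final Map<String, String> map = new HashMap<>();
    private final ReadWriteLock rw = new ReentrantReadWriteLock();

    public String get(String key) {
        rw.readLock().lock();   // shared: many readers may hold this at once
        try {
            return map.get(key);
        } finally {
            rw.readLock().unlock();
        }
    }

    public void put(String key, String value) {
        rw.writeLock().lock();  // exclusive: blocks both readers and writers
        try {
            map.put(key, value);
        } finally {
            rw.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        RWCache c = new RWCache();
        c.put("a", "1");
        System.out.println(c.get("a")); // 1
    }
}
```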

Is the use of synchronized obsolete ?
Not really. Synchronized blocks are simple to use and widely used; most programmers are very familiar with them. They are less error prone because the lock is automatically released. It is reasonable to continue using synchronized for the simpler locking use cases. But if you need any of the features described above, using explicit locks is well worth the extra coding. Performance-wise there is not much difference, though some studies have shown explicit locks to be slightly faster.

Sunday, February 19, 2012

Java Generics #2 : what is "? super T" ?

Consider the merge method below that copies all elements from List source to List target.
public static <T> void merge(List<? super T> target, List<? extends T> source)
The <T> following static declares T as a new type variable for this method. We discussed "? extends T" in the blog Java Generics #1. Here, let us examine "? super T". One can guess that it is a wildcard meaning any type that is a supertype of T. If T is Integer, then List<? super T> could be List<Integer>, List<Number> or List<Object>. The code below shows the use of the merge method. In line 6, T is Integer and "? super T" is Number. In line 10, T is Number and "? super T" is Object.
1 List<Integer> aInts = new ArrayList<Integer>() ;
2 aInts.add(5) ;
3 aInts.add(7) ;
4 List<Number> aNums = new ArrayList<Number>() ;
5 aNums.add(12.5) ;
6 MCollections.<Integer>merge(aNums,aInts) ; // works
7 System.out.println(aNums.toString()) ; // aNums has 5,7,12.5
8 List<Object> aObjs = new ArrayList<Object>() ;
9 aObjs.add("hello") ;
10 MCollections.<Number>merge(aObjs,aNums) ; // works as well
11 System.out.println(aObjs.toString()) ; // aObjs has hello,5,7,12.5
We discussed in the last blog that if you have a Collection<? extends T> you can get values out of it, but you cannot put stuff into it. So what can you do with Collection<? super T> ?

In our merge example above, List<? super T> is the target, which means the implementation is putting elements into it. "? super T" means any supertype of T, so it is logically safe to put a T into the list: whatever the list's element type actually is, a T is an instance of it.

The implementation of merge could be
1 public class MCollections {
2 public static <T> void merge(List<? super T> target, 
3                                List<? extends T> source) {
4   for(int i = 0 ; i < source.size(); i++) {
5     T e = source.get(i) ;
6     target.add(e) ;
7   }
8 }
9 } 
But if you were to do a get, what would the returned type be? There would be no way to know. Hence it is not allowed, as shown in line 4 of the code below.
1 List<? super Integer> aNums = new ArrayList<Number>() ;
2 aNums.add(11) ;
3 aNums.add(12) ;   
4 Number n = aNums.get(0) ; // Compilation Error - not allowed 
5 Object o = aNums.get(0) ; // allowed -- No compile error 
The exception to the rule is getting an Object, which is allowed because Object is a supertype of every other Java type.

In summary, you can enable subtyping using "? super T" when you need to put objects into the collection (but you can get them out only as Object). You can enable subtyping using "? extends T" when you need to get objects out of the collection. It follows that if you need to do both get and put, then you cannot use either of these wildcard mechanisms; you need an explicit type.

Sunday, January 15, 2012

Java Generics #1 : Subtyping using wildcard with extends

Generics is one of the more complicated language features in Java and is not well understood by many programmers; many avoid it altogether. This is not without reason: while writing a program, if you have to stop and think hard about syntax, there is a good chance you will try to avoid that language construct. In this blog I discuss one kind of subtyping with generics that can be tricky.

In Java we know that Integer extends Number; in other words, Integer is a subtype of Number. Anywhere a Number is required, you can pass in an Integer. But does this mean that List<Integer> is a subtype of List<Number>?

Consider the code. Will it work ?
1 List<Integer> aList = new ArrayList<Integer>() ;
2 aList.add(11) ;
3 aList.add(13) ;
4 List<Number> nList = aList ;
5 nList.add(11.5) ;
aList is a list of Integers and nList is a list of Numbers. In line 4, nList is made to reference aList. In line 5 we add a double to nList, and therefore to aList. But aList is a list of Integers, so this is obviously not correct, and Java will not allow it: line 4 causes a compilation error. But sometimes we do want to use subtypes, and generics have the concept of wildcards, which enable subtyping where it is logically appropriate.

Consider the addAll method of the Collection interface.
interface Collection<T> {
   public boolean addAll(Collection<? extends T> x) ;

}
"? extends T" says that, given a collection of type T, you can add to it elements from any collection whose element type is a subtype of T. The following code is valid.
1 List<Number> aList = new ArrayList<Number>() ;
2 List<Integer> intList = Arrays.asList(11,12) ;
3 List<Double> dList = Arrays.asList(15.15) ;
4 aList.addAll(intList) ;
5 aList.addAll(dList) ;
The implementation of the addAll method gets elements from the list passed in as a parameter and puts them into the target collection. Note that this is only a get operation on the Collection<? extends T>. A put on a Collection<? extends T> would not be allowed. To understand why, consider the code below.
1 List <? extends Number> numList ;
2 List<Integer> intList = Arrays.asList(11,12) ;
3 numList = intList ; // Will this work ?
4 numList.add(5.67) ; // Will this work ?
Should line 3 work? What about line 4?
The Java compiler allows line 3 because List<Integer> is considered a subtype of List<? extends Number>. But line 4 is a compilation error, because you should not be allowed to add a double to a List<Integer>.

In summary, when you have a Collection<? extends T>, it is safe to get elements out of the collection but not safe to put elements into it. Hence the compiler does not allow it.

Sunday, June 5, 2011

Java Executors

The old way of creating threads in Java was to extend the java.lang.Thread class, or to implement the java.lang.Runnable interface and pass the Runnable to a Thread as an argument. In this approach the task is modeled as a Runnable, and you create one or more threads for each task. There were no built-in facilities for re-using threads, such as thread pools. Additionally, once a task was started, there was no easy way to know when it completed without implementing the wait/notify mechanism.

Since JDK5, another abstraction for concurrent execution of tasks is the Executor interface.
public interface Executor {
   void execute(Runnable cmd) ; 
}
The task to be executed is coded by implementing the Runnable interface, just as in the older model. In the old model, however, execution is typically hard-coded by extending the java.lang.Thread class.

With the executor framework, submission and execution are decoupled: the Executors class creates different kinds of Executor implementations that can execute the runnable.

The ExecutorService interface extends Executor and provides additional methods that let callers submit tasks for concurrent execution.

While one can implement Executor or ExecutorService and delegate to Thread to do the actual work, that is generally not the recommended way.

java.util.concurrent.Executors is a class with factory methods that create concrete implementations of ExecutorService, including flexible thread-pool-based implementations. Using thread pools for concurrent execution has the advantage of letting your application scale.

public static ExecutorService newFixedThreadPool(int nThreads) creates a thread pool that creates threads up to the size nThreads. After that, it creates additional threads only if one of the existing threads dies.

public static ExecutorService newCachedThreadPool() creates a thread pool with no upper bound on the number of threads. It will create threads as needed but can reuse existing idle threads.

public static ExecutorService newSingleThreadExecutor() creates an executor with a single thread. If the existing thread dies, it will create a new one, but you will never have more than one thread.

public static ScheduledExecutorService newScheduledThreadPool(int corePoolSize) creates a thread pool whose tasks can be scheduled to run after a delay or periodically.
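A sketch of the scheduled variant, using a CountDownLatch so we can wait for a fixed number of periodic runs (the names and the 10 ms period are illustrative):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SchedulerDemo {
    // schedules a periodic task and returns after it has run n times
    public static int runTicks(int n) throws InterruptedException {
        ScheduledExecutorService ses = Executors.newScheduledThreadPool(1);
        final CountDownLatch done = new CountDownLatch(n);

        // run every 10 ms after an initial 0 ms delay
        ses.scheduleAtFixedRate(new Runnable() {
            public void run() { done.countDown(); }
        }, 0, 10, TimeUnit.MILLISECONDS);

        done.await();       // block until the task has run n times
        ses.shutdown();
        return n - (int) done.getCount();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTicks(3) + " ticks completed"); // 3 ticks completed
    }
}
```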

Let us write a simple concurrent server using fixed thread pool.

1 public class WebServer {
2  private final ExecutorService tpool = Executors.newFixedThreadPool(50) ;
3
4  public void start() throws IOException {
5    ServerSocket socket = new ServerSocket(8080) ;
6    while(!tpool.isShutdown()) {
7        try {
8           final Socket c = socket.accept() ;
9           tpool.execute(new Runnable() {
10               public void run() { process(c) ; }
11           }) ;
12        } catch(Exception e) {
13           log("Error occurred .." + e) ;
14        }
15    }
16  }
17  private void process(Socket c) {
18    // service the request   
19  }
20  public void stop() {
21    tpool.shutdown() ;
22  }
23 }
In line 2 we create an ExecutorService. In lines 9-11 we create a runnable for each task and submit it for execution. The shutdown call in line 21 is a graceful shutdown. Tasks that are already submitted will be completed and no new tasks are accepted.

The advantages of using Executors are:
   Decoupling of task creation from submission/execution.
   Built-in thread pools.
   Built-in orderly shutdown.
   Built-in scheduling (mentioned above but not discussed).
   The ability to check or block on completion of tasks.
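The last point deserves a sketch: ExecutorService.submit returns a Future, which lets the caller block on, or poll for, a task's completion (the computeSquare helper is illustrative):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureDemo {
    public static int computeSquare(final int n)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // submit returns a Future we can poll or block on
            Future<Integer> f = pool.submit(new Callable<Integer>() {
                public Integer call() { return n * n; }
            });
            return f.get(); // blocks until the task completes
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(computeSquare(7)); // 49
    }
}
```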

Today, if you need a data structure such as a List or a Map, you use the classes in the collections framework, java.util.*. Similarly, for concurrent programming you should use the executor framework in java.util.concurrent.*, as it gives you a lot of functionality that you would otherwise have to code yourself.

Sunday, January 23, 2011

Designing Java Classes For Inheritance

In object-oriented programming, inheritance is a concept that lets programmers re-use code as well as specialize existing code. In Java a class (the subclass) can extend another class (the superclass) and thereby inherit the attributes and behavior (methods) of the superclass. Specializing behavior is really a related concept called polymorphism; however, you cannot specialize without inheriting.

Popular belief is that inheritance is always a good thing. When used inappropriately, however, it can lead to buggy code. The problems have been well documented in several books and articles. Yet we often come across code with a complex inheritance hierarchy that looks cute but is difficult to use and, more often than not, broken. Below I recap a few key points to remember when using inheritance in Java.

Consider the class MemberStore, which is used to persist Members to a database.
public class MemberStore {
  public void insert(Member m) {
    // insert into database
  } 
  public void insertAll(List<Member> c) {
    // insert all members into the database
    // by calling insert(m) for each Member
    Iterator<Member> i = c.iterator() ;
    while(i.hasNext()) {
      insert(i.next()) ;
    }
  }
}
Let us say there is a need for a class that not only stores Members but also keeps a count of recently stored members. The wrong way to do it would be to extend the above class.
public class MemberStoreWithCounter extends MemberStore {
  private int counter = 0 ;
  public void insert(Member m) {
    counter++ ;
    super.insert(m) ;
  } 
  public void insertAll(List<Member> c) {
    counter = counter + c.size() ;
    super.insertAll(c) ;
  }
  public int getCount() {
    return counter ;
  }
}

Let us try to use the new class
List<Member> list = new ArrayList<Member>() ;
list.add(new Member(......)) ;
list.add(new Member(......)) ;
MemberStoreWithCounter m = new MemberStoreWithCounter() ;
m.insertAll(list) ;
System.out.println(m.getCount()) ;
What do you suppose the last line prints? You were probably expecting 2, but it will print 4. The problem is that the insertAll method is implemented by calling the insert method, so the counter is incremented twice for each Member. This is the classic wrong use of inheritance.

The problem is that inheritance breaks encapsulation. Inheritance requires the programmer implementing the subclass class to know the implementation details of the super class. So generally it is safe to use inheritance only when the subclass and the superclass are under the ownership of the same programmer or team of programmers.

If you are designing your class so that other programmers may extend it, then it is preferable that the implemented methods of your class not call other overridable methods as was done by insertAll. If they do, then the javadocs must clearly document it. In the above example, the javadocs should say that insertAll calls insert. But this is not sufficient.

The javadocs for methods that can be safely overridden should say under what conditions and for what behavior the method should be overridden.

Classes that are not meant to be extended and methods that should not be overridden should be marked final so that programmers cannot extend them.

Constructors should never invoke overridable methods.
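To see why, consider a self-contained sketch (class names here are illustrative): the superclass constructor calls an overridable method, and the override runs before the subclass's own fields have been initialized.

```java
public class ConstructorPitfall {
    static class Base {
        Base() {
            describe(); // dangerous: dispatches to the subclass override
        }
        void describe() { }
    }
    static class Derived extends Base {
        // field initializers run only AFTER the Base() constructor finishes
        private String name = "derived";
        private String seenInConstructor;
        @Override
        void describe() {
            seenInConstructor = name; // name is still null at this point
        }
        String seen() { return seenInConstructor; }
    }
    public static void main(String[] args) {
        System.out.println(new Derived().seen()); // prints null
    }
}
```

The override observed null instead of "derived", the kind of subtle bug that is very hard to track down in a larger class.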

Lastly, in Java inheritance implies a "generalization/specialization" or "is a" relationship. Even if you do not intend to specialize, if you add a method to the subclass with the same signature as a method in the superclass, the method in the subclass overrides the method in the superclass. If there is no "is a" relationship, it is never a good idea to use inheritance.

The better way to add a counter to MemberStore is to write a wrapper around MemberStore that has MemberStore as a private member and delegates to it. This is otherwise known as composition.
public class BetterMemberStoreWithCounter {
  private MemberStore ms = new MemberStore() ;
  private int count = 0 ;
  public void insert(Member m) {
     ms.insert(m) ;
     count++ ;
  }
  public void insertAll(List<Member> c) {
     count = count + c.size() ;
     ms.insertAll(c) ;
  }
  public int getCount() {
     return count ;
  }
}
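A quick self-contained sketch (with simplified stand-ins for Member and MemberStore, since the full classes are not shown here) confirms that the composition-based counter reports the expected count:

```java
import java.util.ArrayList;
import java.util.List;

public class CompositionDemo {
    // Simplified stand-ins for the Member and MemberStore classes
    static class Member { }
    static class MemberStore {
        final List<Member> stored = new ArrayList<>();
        void insert(Member m) { stored.add(m); }
        void insertAll(List<Member> c) {
            for (Member m : c) insert(m); // insertAll delegates to insert
        }
    }
    // Composition: wrap MemberStore instead of extending it
    static class StoreWithCounter {
        private final MemberStore ms = new MemberStore();
        private int count = 0;
        void insert(Member m) { ms.insert(m); count++; }
        void insertAll(List<Member> c) { count += c.size(); ms.insertAll(c); }
        int getCount() { return count; }
    }
    public static void main(String[] args) {
        StoreWithCounter s = new StoreWithCounter();
        List<Member> list = new ArrayList<>();
        list.add(new Member());
        list.add(new Member());
        s.insertAll(list);
        System.out.println(s.getCount()); // prints 2, not 4
    }
}
```

Because the wrapper does not override anything, it is immune to how MemberStore implements insertAll internally.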

Sunday, October 24, 2010

Developing web applications with Spring

Spring MVC enables easy web application development with a framework based on the Model View Controller architecture (MVC) pattern. The MVC architectural pattern requires the separation of the user interface (View), the data being processed (Model) and the Controller which manages the interactions between the view and the model.

At the core of Spring MVC is a servlet, the DispatcherServlet, that handles every request. The DispatcherServlet routes the HTTP request to a Controller class authored by the application developer. The controller class handles the request and decides which view should be displayed to the user as part of the response.

Let us develop a simple web application that takes a request and sends some data back to the user. Before you proceed any further, I recommend you download the source code at springmvc.zip

For this tutorial you will also need

(1) A webserver like Tomcat
(2) Spring 3.0
(3) Eclipse is optional. I use eclipse as my IDE. Eclipse lets you export the war that can be deployed to Tomcat. But you can use other IDEs or command line tools as well.
(4) Some familiarity with JSPs and Servlets is required.

Step 1: If you were to develop a web application in J2EE, you would typically do it by developing servlets or JSPs that are packaged in a .war file. Also necessary is a deployment descriptor, web.xml, that contains configuration metadata. The war is deployed to a web server like Tomcat.

With Spring, the first thing to do is to wire Spring to this J2EE web infrastructure by defining org.springframework.web.servlet.DispatcherServlet as the servlet class for this application. You also need to define org.springframework.web.context.ContextLoaderListener as a listener. ContextLoaderListener is responsible for loading the spring specific application context which has Spring metadata.

The web.xml setup ensures that every request to the application is routed by the servlet engine to DispatcherServlet. The update to web.xml is shown below:
<listener>
    <listener-class>
        org.springframework.web.context.ContextLoaderListener
    </listener-class>
</listener>
<servlet>
    <servlet-name>springmvc</servlet-name>
    <servlet-class>
        org.springframework.web.servlet.DispatcherServlet
    </servlet-class>
    <load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
    <servlet-name>springmvc</servlet-name>
    <url-pattern>*.htm</url-pattern>
</servlet-mapping>
Step 2: The heavy lifting in this web application is done by a controller class. This is an ordinary java class or bean that extends org.springframework.web.servlet.mvc.AbstractController. We override the handleRequestInternal method. In this method, you would do the things necessary to handle the request which may include for example reading from a database.

The method returns a org.springframework.web.servlet.ModelAndView object which encapsulates the name of the view and any data (model) that needs to be displayed by the view. ModelAndView holds data as name/value pairs. This data is later made available to the view. If the view is a JSP, then you can access the data either using JSTL or by directly querying the Request object. The code for our controller is shown below:
public class SpringMVCController extends AbstractController {
    protected ModelAndView handleRequestInternal(HttpServletRequest request, HttpServletResponse response) {
        ModelAndView mview = new ModelAndView("springmvc") ;
        mview.addObject("greeting", "Greetings from SpringMVC") ;
        mview.addObject("member1", new Member("John","Doe", 
            "1234 Main St","Pleasanton","94588","kh@gmail.com","1234")) ;
        return mview ;
    }
}
The name of the view springmvc is passed in to the constructor of ModelAndView. The addObject methods add 2 model objects greeting and member1. Later you will see how the view can retrieve the objects and display them.

Step 3: Every Spring application needs metadata that defines the beans and their dependencies. For this application, we create a springmvc-servlet.xml. We help spring find it by specifying its location in web.xml.
<context-param>
    <param-name>contextConfigLocation</param-name>
    <param-value>/WEB-INF/springmvc-servlet.xml</param-value>
</context-param>
In springmvc-servlet.xml, the controller bean is defined as
<bean name="/*.htm" class="com.mj.spring.mvc.SpringMVCController"/>
Step 4: How does DispatcherServlet know which Controller should handle the request ?

Spring uses handler mappings to associate controllers with requests. Two commonly used handler mappings are BeanNameUrlHandlerMapping and SimpleUrlHandlerMapping.

In BeanNameUrlHandlerMapping, when the request url matches the name of the bean, the class in the bean definition is the controller that will handle the request.

In our example, we use BeanNameUrlHandlerMapping as shown below. Every request url ending in .htm is handled by SpringMVCController.
<bean name="/*.htm" class="com.mj.spring.mvc.SpringMVCController"/>
In SimpleUrlHandlerMapping, the mapping is more explicit. You can specify a number of urls and each URL can be explicitly associated with a controller.
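For illustration, a SimpleUrlHandlerMapping declaration might look like the sketch below (the URLs and controller bean names here are hypothetical, not part of the sample application):

```xml
<bean class="org.springframework.web.servlet.handler.SimpleUrlHandlerMapping">
    <property name="mappings">
        <props>
            <prop key="/test.htm">springMVCController</prop>
            <prop key="/other.htm">otherController</prop>
        </props>
    </property>
</bean>
```

Each prop key is a URL pattern and the value is the name of the controller bean that handles it.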

Step 5: How does the DispatcherServlet know what to return as the response ?

As mentioned earlier, the handleRequestInternal method of the controller returns a ModelAndView object.

In the controller code shown above, the name of the view "springmvc" is passed in the constructor to ModelAndView. At this point we have just given the name of the view. We have not said what file or classes or artifacts help produce the html, nor have we said whether the view technology used is JSP or velocity templates or XSLT. For this you need a ViewResolver, which provides that mapping between view name and a concrete view. Spring lets you produce a concrete view using many different technologies, but for this example we shall use JSP.

Spring provides a class InternalResourceViewResolver that supports JSPs and the declaration below in springmvc-servlet.xml tells spring that we use this resolver. The prefix and suffix get added to view name to produce the path to the jsp file that renders the view.
<bean id="viewResolver" class="org.springframework.web.servlet.view.InternalResourceViewResolver">
    <property name="prefix" value="/WEB-INF/jsp/"></property>
    <property name="suffix" value=".jsp"></property>
</bean>

Step 6: In this example, the view resolves to springmvc.jsp, which uses JSTL to get the data and display it. Spring makes the model objects greeting and member1 available to the JSP as request scope objects. For educational purposes, the code below also gets the objects directly from the request.
// Using JSTL to get the model data
${greeting}
${member1.lastname}
// Using java to get the model directly from the request
Map props = request.getParameterMap() ;
System.out.println("PARAMS =" + props) ;
Enumeration em = request.getAttributeNames() ;
while (em.hasMoreElements()) {
    String name = (String) em.nextElement() ;
    System.out.println("name = "+name) ;
}
System.out.println("Attrs are "+request.getAttributeNames()) ;
System.out.println("greeting is "+ request.getAttribute("greeting")) ;
Member m = (Member)request.getAttribute("member1") ;
System.out.println("member is "+m.toString()) ;
Step 7: All files we have developed so far should be packaged into a war file as you would in any web application. The war may be deployed to tomcat by copying to tomcat_install\webapps. I built a war that you can download at springmvc.war

Step 8: Point your web browser to http://localhost:8080/springmvc/test.htm to run the application. The browser should display the data.


To summarize, Spring simplifies web application development by providing building blocks that you can assemble easily. We built a web application using Spring MVC. Spring provides an easy way to wire together our model, the controller SpringMVCController and the view springmvc.jsp. We did not have to explicitly code any request/response handling logic. By changing the metadata in springmvc-servlet.xml, you can switch to a different controller or a different view technology.

Sunday, September 19, 2010

Database access made simple with Spring

In "The promise of spring" I talked about some of the benefits of Spring, one of which is the simplification of the usage of JDBC.

Typical JDBC programming requires the programmer to repeatedly write the same code for basic things like loading the JDBC driver, getting a connection, and closing a connection. After getting a connection, one has to create a PreparedStatement to execute the SQL. The PreparedStatement may return a ResultSet, over which the programmer needs to iterate to extract the data.

Spring addresses the above issues as follows:

1. The cornerstone of the Spring framework is dependency injection. With Spring, the connection information such as the driver and connection URL can be defined as metadata and a datasource object injected into the application. This frees the programmer from the burden of writing code to manage connections.

2. Spring provides a class JdbcTemplate that abstracts away the repetitive code involving Statements and ResultSets.

3. Lastly, Spring maps JDBC checked exceptions to a runtime exception hierarchy. This helps to unclutter application code, because the code no longer needs database-specific try/catch logic.

Let us see how this works with a sample. Let us write a DAO (data access object) to insert and retrieve information from a database. Before you proceed any further, you can download the complete source code from springjdbc.zip.  In the blog below, for brevity, I show only code snippets. To run the sample, you will also need Spring 3.0, which you can download from Spring 3.0.x.

Step 1. For a database, I am going to use Apache Derby. We are going to store and retrieve user information as is typically required in any application. The schema is
create table puma_members ( 

  firstname  varchar(20), 
  lastname   varchar(30) not null, 
  street     varchar(40), 
  city       varchar(15), 
  zip        varchar(6), 
  email      varchar(30) not null primary key, 
  password   varchar(8) 

) ;
To download derby and for help on creating a database and table , see Derby documentation

Step 2. The DAO interface to access this table is
public interface MemberDAO {

    public int insertMember(Member m) ;
    public int deleteMember(String email) ;
    public Member getMember(String email) ;
}
Member is a class that has get and set methods for every column in puma_members. For brevity, the code is not shown here.

Step 3. The class MemberSpringJDBCDAO shall provide an implementation for MemberDAO using Spring JDBC.

The heavy lifting in this class is done by an instance of org.springframework.jdbc.core.JdbcTemplate. However, we don't need to instantiate it explicitly. We will let Spring create an instance and inject it into this class.

To help Spring, however, we need to provide getter/setter methods for the JdbcTemplate.

So the class needs a private variable to hold the jdbcTemplate and setter/getter methods that Spring can call to set its value.
private JdbcTemplate jdbcTemplate ;
public JdbcTemplate getJdbcTemplate() {
    return jdbcTemplate ;
}
public void setJdbcTemplate(JdbcTemplate template) {
    jdbcTemplate = template ;
}
This is a form of dependency injection called setter injection. Spring calls the setJdbcTemplate method to provide our class with an instance of JdbcTemplate.

Step 4. How do we tell Spring we need a JdbcTemplate ? And how does Spring know what database driver to load and what database to connect to ?
All the bean definitions are in springjdbcdao.xml. The main bean memberdao has a property that references a jdbcTemplate.
<bean class="com.mj.spring.jdbc.MemberSpringJDBCDAO" id="memberdao">
    <property name="jdbcTemplate">
       <ref bean="jdbcTemplate"></ref>
    </property>
</bean>

<bean class="org.springframework.jdbc.core.JdbcTemplate" id="jdbcTemplate">
    <constructor-arg>
        <ref bean="dataSource"></ref>
    </constructor-arg>
</bean>
The jdbcTemplate bean has as its implementation org.springframework.jdbc.core.JdbcTemplate, which is the class we are interested in. It references another bean, dataSource, which has all the necessary JDBC configuration.
<bean id="dataSource"
  class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    <property name="driverClassName" value="org.apache.derby.jdbc.EmbeddedDriver"/>
    <property name="url" 
         value="jdbc:derby:/home/mdk/mjprojects/database/pumausers"/>
</bean>
This configuration has all the information necessary for Spring to create a JdbcTemplate with a dataSource and inject it into memberDAO.

Step 5. JdbcTemplate has a number of helper methods to execute SQL commands that make implementing MemberDAO methods easy.
private static final String insert_sql = "INSERT into puma_members VALUES(?,?,?,?,?,?,?)" ;
private static final String select_sql = "Select * from puma_members where email = ?" ;
public int insertMember(Member member) {
    JdbcTemplate jt = getJdbcTemplate() ;
    Object[] params = new Object[] {member.getFirstname(),member.getLastname(),
                           member.getStreet(),member.getCity(),member.getZip(),
                           member.getEmail(),member.getPassword()} ;
    int ret = jt.update(insert_sql, params) ;
    return ret ;
}
public Member getMember(String email) {
    JdbcTemplate jt = getJdbcTemplate() ;
    Object[] params = new Object[] {email} ;
    List result = jt.query(select_sql,params, new MemberRowMapper()) ;
    Member member = (Member)result.get(0) ;
    return member;
}
private class MemberRowMapper implements RowMapper {
    public Object mapRow(ResultSet rs, int arg1) throws SQLException {
        Member member = new Member(rs.getString("firstname"), rs.getString("lastname"), 
                                   rs.getString("street"), rs.getString("city"), rs.getString("zip"),
                                   rs.getString("email"), rs.getString("password"));
         
        return member ;
    }
}
In insertMember, the update method of JdbcTemplate takes 2 parameters: the SQL insert statement and an array that contains the data to be inserted. In getMember, the query method takes an additional parameter, a class that implements the RowMapper interface, which maps the JDBC ResultSet to the object we want, an instance of Member. The Spring javadocs very clearly state that JdbcTemplate is the star of the Spring JDBC package. It has several variations of query, update, and execute methods. Too many, one might think.

Step 6. The class MemberSpringJDBCDAOTest has the junit tests that test MemberSpringJDBCDAO. A snippet is below:
public void insertMember() {
    ApplicationContext context = new ClassPathXmlApplicationContext(
                                                "springjdbcdao.xml");
    BeanFactory factory = (BeanFactory) context;
    MemberSpringJDBCDAO mDAO=(MemberSpringJDBCDAO) factory.getBean("memberdao");
    Member newMember = new Member("John","Doe","2121 FirstStreet","Doecity",
                                   "42345","jdoe@gmail.com","jondoe") ;
    int ret = mDAO.insertMember(newMember) ;
}
This is typical Spring client code. First we create a BeanFactory and load the metadata in springjdbcdao.xml. Then we request the factory to create a memberdao and insert a record into the database by calling the insertMember method.

Clearly the code is a lot simpler than if you implemented MemberDAO in plain JDBC. If you are new to Spring and are intimidated by buzzwords like inversion of control (IOC), then using Spring for database access is a good way to start benefiting from Spring while learning to use it. Note the loose coupling between the interface MemberDAO and its implementation. The loose coupling is good design and a reason for the popularity of frameworks like Spring. In future blogs, I will implement MemberDAO using other persistence APIs like JPA and maybe Hibernate, and show how the implementation can be switched without having to change client code.

Sunday, August 15, 2010

The promise of Spring

The Spring framework  is a popular alternative to J2EE for enterprise application development.  To download Spring or read about it, visit http://www.springsource.org/

The top 5 documented benefits of Spring are:

1. Spring saves the developer the time and effort required to write boilerplate code like JNDI lookups, creating JDBC connections, API-specific exception handling, etc. Everyone has a horror story about not finding a JNDI reference, and a framework that abstracts such plumbing code away from the application developer is certainly useful.

2. Spring is a lightweight framework. A lightweight framework should be small in size, conceptually simple and easy to use.

3. Spring simplifies database access. Most applications need to store and retrieve data from a relational database. There are many alternatives for database access such as JDBC, JPA, Hibernate, with varying levels of complexity. Spring provides a simpler abstraction over these APIs. Similarly it simplifies web development with its MVC framework.

4. Spring can be used in various environments. It can be used in standalone J2SE applications, with a web server such as Tomcat, or with a full-blown application server like JBoss or WebSphere.

5. Spring lets you develop applications by assembling loosely coupled components. Loose coupling is a better design practice because it allows you to swap out moving parts without having to do major changes to the application.  Spring achieves this by what it calls "Inversion of Control" and "Dependency Injection".  Actually, it is mostly dependency injection.

Inversion of Control and Dependency Injection are a topic for a separate blog. But very briefly, in Spring, when you author a "bean", you specify its dependencies in an XML file and the Spring container makes them available. You don't have to explicitly create each dependency.

In subsequent blogs, using some code examples, let us see if Spring delivers on its promise. I will dig deeper with examples and point out where Spring really simplifies development and where it just adds another layer of complexity. So stay tuned.

Sunday, April 18, 2010

The Java Memory Model

The Java memory model describes the rules that define how variables written to memory are seen, when such variables are written and read by multiple threads.

When a thread reads a variable, it is not necessarily getting the latest value from memory. The processor might return a cached value. Additionally, even though the programmer authored code where a variable is first written and later read, the compiler might reorder the statements as long as it does not change the program semantics. It is quite common for processors and compilers to do this for performance optimization. As a result, a thread might not see the values it expects to see. This can result in hard to fix bugs in concurrent programs.

The Java programming language provides synchronized, volatile, and final to help write safe multithreaded code. However, earlier versions of Java had several issues because the memory model was underspecified. JSR 133 fixed some of the flaws in the earlier memory model.

Most programmers are familiar with the fact that entering a synchronized block means obtaining a lock on a monitor, which ensures that no other thread can enter the synchronized block. Less familiar but equally important are the facts that
(1) acquiring a lock and entering a synchronized block forces the thread to refresh data from memory, and
(2) on exiting the synchronized block, data written is flushed to memory.

This ensures that values written by a thread in a synchronized block are visible to other threads in synchronized blocks.

Ever heard of "happens before" in the context of Java? JSR 133 introduced the term "happens before" and provided some guarantees about the ordering of actions within a program. These guarantees are:

(1) Every action in a thread happens before every other action that comes after it in the thread.
(2) An unlock on a monitor happens before a subsequent lock on the same monitor
(3) A volatile write on a variable happens before a subsequent volatile read on the same variable
(4) A call to  Thread.start() happens before any other statement in that thread
(5) All actions in thread happen before any other thread returns from a join() on that thread

Here, an action is defined in section 17.4.2 of the Java language specification as a statement that can be detected or influenced by other threads. Normal reads/writes, volatile reads/writes, and lock/unlock are some actions.

Rules 1, 4 and 5 guarantee that within a single thread all actions appear to execute in the order in which they appear in the authored program. Rules 2 and 3 guarantee that between multiple threads working on shared data, the relative ordering of synchronized blocks and the order of reads/writes on volatile variables is preserved.

Rules 2 and 3 make volatile very similar to a synchronized block. Prior to JSR 133, volatile already meant that a write to a volatile variable is written directly to memory and a read is read from memory. But a compiler could reorder volatile reads/writes with non-volatile reads/writes, causing incorrect results. That is no longer possible after JSR 133.
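The volatile happens-before rule is what makes the common "stop flag" idiom work. The sketch below (class and method names are illustrative) relies on the volatile write in stop() being visible to the worker thread's subsequent volatile read; without volatile, the loop could legally spin forever on a stale cached value.

```java
public class VolatileFlag {
    // volatile guarantees the write in stop() happens-before
    // the worker's next read of the flag
    private volatile boolean running = true;

    public Thread startWorker() {
        Thread t = new Thread(() -> {
            while (running) {   // volatile read: always sees the latest value
                Thread.yield(); // stand-in for real work
            }
        });
        t.start();
        return t;
    }

    public void stop() { running = false; } // volatile write

    public static void main(String[] args) throws InterruptedException {
        VolatileFlag v = new VolatileFlag();
        Thread t = v.startWorker();
        Thread.sleep(50);
        v.stop();
        t.join(5000); // the worker observes the write and exits the loop
        System.out.println(t.isAlive()); // prints false
    }
}
```

Note that volatile gives visibility and ordering, not atomicity: it is suitable for a flag like this, but not for compound operations like counter++.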

One additional notable point relates to final members that are initialized in the constructor of a class. As long as the constructor completes execution properly, the final members are visible to other threads without synchronization. If, however, you share the reference to the object from within the constructor, then all bets are off.

Thursday, March 11, 2010

The wait/notify mechanism in JAVA

Consider the problem where one thread produces some data and other threads consume the data. This is the producer-consumer pattern. If the producer is not producing data fast enough, the consumers might have to wait. The wrong way to handle this is for consumer threads to go to sleep and then check for available data at periodic intervals. This eats up CPU cycles. Fortunately, Java provides a more elegant way.

The wait/notify mechanism allows one thread to wait for a notification from another thread. The consumer thread checks a variable that indicates whether data is available and, if not, calls the wait() method. Calling the wait method puts the thread to sleep until it is notified. Whenever the producer thread produces data and updates the variable, it calls notify() or notifyAll() to notify the waiting threads.

The class WorkQueue in Listing 1 shows how this works. The addWork method is called by producer threads to queue up work that needs to be done. The getWork method is called by consumer threads that will do the work. When there is no work, the consumer threads need to wait. 

public class WorkQueue {
    private final ArrayDeque<Object> workq = new ArrayDeque<Object>() ;
    public void addWork(Object work) {
        synchronized(workq) {
            workq.addLast(work) ;
            workq.notifyAll() ;
            System.out.println("Thread producer added work notifying waiter") ;
        }
    }
    public Object getWork() {
        Object ret = null ;
        synchronized(workq) {
            while((ret = workq.pollFirst()) == null) {
                try {
                    System.out.println("No Work Thread consumer going to block") ;
                    workq.wait() ;
                    System.out.println("Thread consumer woken") ;
                } catch(Exception e) {
                    System.out.println(e) ;
                }
            }
        }
        return ret ;
    }
}
Listing 1

wait(), notify() and notifyAll() are methods of java.lang.Object. The thread needs to acquire a lock on an object before calling any of these methods. The getWork method gets a lock on the variable workq. The condition in the while loop checks if there is a work item. If there is an item, the condition is false: we break out of the loop, exit the synchronized block, thereby releasing the lock on workq, and return the work item. If workq.pollFirst() returns null, there is no work queued; we enter the loop and call the wait method. Calling the wait method causes the thread to give up the lock and go to sleep.

When a producer thread calls the addWork method, the statement synchronized(workq) ensures that it first acquires a lock. If another thread has a lock on workq, this thread will wait until it gets the lock. It then adds the work item to the workq. Lastly, it calls notifyAll() on workq to notify all waiting threads that work is available.

When the consumer thread receives the notification, it wakes up. However, to return from the wait() method, it needs to reacquire the lock first. On acquiring the lock, it returns from wait() and continues to execute the loop. The condition workq.pollFirst() == null is checked again. If false, the thread got an item and exits the loop and the method. If not, it calls wait again and sleeps till the next notification.

The code in listing 2 exercises the class in listing 1 with 2 threads.
public static void main(String[] args) {
        final WorkQueue wq = new WorkQueue() ;
        Runnable producer = new Runnable() {
            public void run() {
                for (int i = 1 ; i <=10 ; i++) {
                    wq.addWork(Integer.toString(i)) ;
                    System.out.println("Thread producer created work :" + i) ;
                    try {
                        Thread.sleep(5000) ;
                    } catch(Exception e) {
                        System.out.println(e) ;
                    }
                }
               
            }

        } ;
       
        Runnable consumer = new Runnable() {
            public void run() {
                for (int i = 1 ; i <=10 ; i++) {
                    String work = (String)wq.getWork() ;
                    System.out.println("Thread consumer Got work:" + work) ;
                }
            }
        } ;
       
        Thread tconsumer = new Thread(consumer,"tconsumer") ;
        tconsumer.start() ;
        Thread tproducer = new Thread(producer,"tproducer") ;
        tproducer.start() ;
   
    }
Listing 2

Running the code should produce output shown below.
No Work Thread consumer going to block
Thread producer added work notifying waiter
Thread producer created work :1
Thread consumer woken
Thread consumer Got work:1
No Work Thread consumer going to block
Thread producer added work notifying waiter
Thread producer created work :2
Thread consumer woken
Thread consumer Got work:2
.
.
.
.......

Note that wait() must always be called within a loop that checks the condition the thread is waiting on. Testing the condition before the wait() ensures that the thread waits only when there is no work. Testing the condition again after the wait(), that is, after the thread is woken up, ensures that the thread continues to wait if the condition still holds.

In summary, wait/notify is a powerful mechanism for threads to communicate with each other. However, use it with care and remember to call wait() within a while loop.
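Also worth knowing: since Java 5, java.util.concurrent provides ready-made producer/consumer queues, so you rarely need to hand-code wait/notify yourself. A sketch of the same WorkQueue scenario using ArrayBlockingQueue (the structure mirrors Listing 2; names are illustrative):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BlockingQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        // take() blocks when the queue is empty, put() blocks when it is full:
        // the wait/notify logic is handled inside the queue
        BlockingQueue<String> workq = new ArrayBlockingQueue<>(10);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 10; i++) {
                    workq.put(Integer.toString(i));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // the consumer simply takes; no explicit lock, loop-with-wait, or notify
        for (int i = 1; i <= 10; i++) {
            String work = workq.take(); // waits for the producer if needed
            System.out.println("Thread consumer Got work:" + work);
        }
        producer.join();
    }
}
```

The loop-around-wait discipline is still being followed, just inside the library class instead of in your code.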

Thursday, February 25, 2010

The dreaded double check pattern in java

The problem with double check locking in java is well documented. Yet even a seasoned programmer can get overzealous trying to optimize synchronization of code that creates singletons and fall prey to the trap. Consider the code
public class Sample {
  private static Sample s = null ;
  public static Sample getSample() {
    if (s == null) {
      s = new Sample() ;
    }
  return s ;
  }
}
Listing 1
This code is not thread safe. If two threads t1 and t2 enter the getSample() method at the same time, they are likely to get different instances of Sample. This can be fixed easily by adding the synchronized keyword to the getSample() method.
public class Sample {
  private static Sample s = null ;
  public static synchronized Sample getSample() {
    if (s == null) {
      s = new Sample() ;
    }
    return s ;
  }
}
Listing 2
Now the getSample method works correctly. Before entering the getSample method, thread t1 acquires a lock. Any other thread t2 that needs to enter the method will block until t1 exits the method and releases the lock. Code works. Life is good. This is where the smart programmer, if not careful, can outsmart himself. He will notice that in reality only the first call to getSample, which creates the instance, needs to be synchronized, and subsequent calls that merely return s are paying an unnecessary penalty. He decides to optimize the code to:
public class Sample {
  private static Sample s = null ;
  public static Sample getSample() {
    if (s == null) {
      synchronized(Sample.class) {
        s = new Sample() ;
      }
    }
    return s ;
  }
}
Listing 3
Our java guru quickly realizes that this code has the same problem that listing 1 has. So he fine tunes it further.
public class Sample {
  private static Sample s = null ;
  public static Sample getSample() {
    if (s == null) {
      synchronized(Sample.class) {
        if (s == null) {
          s = new Sample() ;
        }
      }
    }
    return s ;
  }
}
Listing 4
By adding an additional check within the synchronized block, he has ensured that only one thread will ever create an instance of Sample. This is the double check pattern. Our guru's friend, a java expert, buddy-reviews the code. Code is checked in and the product is shipped. Life is good, right?

Wrong!! Let us say thread t1 enters getSample. s is null. It gets a lock. Within the synchronized block, it checks that s is still null and then executes the constructor for Sample. Before the execution of the constructor completes, t1 is swapped out and t2 gets control. Since the constructor did not complete, s is partially initialized: it is not null, but it has a corrupt or incomplete value. When t2 enters getSample, it sees that s is not null and returns the corrupt value.

In summary, the double check pattern does not work. The options are to synchronize at the method level as in listing 2, or to forego lazy initialization. Another option that works is to use a nested holder class that delays initialization until the get method is called.
public class Sample {
  private static class SampleHolder {
    public static final Sample INSTANCE = new Sample() ;
  }
  public static Sample getSample()  {
    return SampleHolder.INSTANCE ;
  }
}

Listing 5
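A quick self-contained check (the class layout is illustrative) that the holder idiom hands every caller, on every thread, the same lazily created instance:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HolderDemo {
    static class Sample { }
    static class SampleHolder {
        // the JVM guarantees this runs exactly once, on first access,
        // with proper synchronization of class initialization
        static final Sample INSTANCE = new Sample();
    }
    static Sample getSample() { return SampleHolder.INSTANCE; }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Set<Sample> seen = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < 100; i++) {
            pool.submit(() -> seen.add(getSample()));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(seen.size()); // prints 1: every thread saw the same instance
    }
}
```

The heavy lifting is done by the class loader: SampleHolder is not initialized until getSample() first touches it, which gives you lazy initialization with no locking code at all.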