Friday, August 30, 2013

Java Serializable vs Externalizable

Serializable and Externalizable are almost forgotten interfaces in Java. And for good reason. Many applications do not need to serialize objects explicitly. Most applications write to a database, and to write to a database you use either an API like JDBC or a framework like Hibernate or JPA. However, if you are writing to a file or to the network, it is worthwhile to understand the default serialization mechanism.

Serializable

To leverage the default serialization mechanism in Java, you need to implement the java.io.Serializable interface. This is a marker interface: for the default behavior, you do not need to implement any methods.

import java.io.Serializable;

public class Employee implements Serializable {
    private int id ;
    private String name ;
    private String address ;

    public Employee(int i, String nm, String adr) {
        id = i ;
        name = nm ;
        address = adr ;
    }

    public int getId() {
        return id ;
    }
    
    public String getName() {
        return name ;
    }

    public String getAddress() {
        return address ;
    }

}

To serialize an Employee object:

// try-with-resources closes the stream even if writeObject fails
try (ObjectOutputStream os = new ObjectOutputStream(new FileOutputStream("employee.dat"))) {
    Employee e = new Employee(1, "John Doe", "123 5th Street, New City, CA");
    os.writeObject(e);
} catch (IOException ex) {
    log(ex);
}

To deserialize an Employee object:

Employee er = null;
// readObject can also throw ClassNotFoundException, so both exceptions are caught
try (ObjectInputStream is = new ObjectInputStream(new FileInputStream("employee.dat"))) {
    er = (Employee) is.readObject();
} catch (IOException | ClassNotFoundException e1) {
    log(e1);
}

You can customize the default serialization by adding the following methods:

private void writeObject(ObjectOutputStream out) throws IOException {
    out.writeInt(id);
    out.writeUTF(name);
    out.writeUTF(address);
    // add customization here
}

private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    // fields must be read in the same order they were written
    id = in.readInt();
    name = in.readUTF();
    address = in.readUTF();
    // add customization here
}

If these methods exist, the JVM calls them during serialization and deserialization. Inside them you can also call out.defaultWriteObject() and in.defaultReadObject() to get the default behaviour first and then add to it.
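As a minimal sketch of combining the default behaviour with custom logic, the example below calls defaultWriteObject and defaultReadObject and then handles some extra data itself. The TaggedEmployee class, its transient label field, and the FORMAT_VERSION constant are hypothetical names invented for this illustration; they are not part of the Employee class above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class for illustration only.
class TaggedEmployee implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int FORMAT_VERSION = 1; // assumed versioning scheme

    private final int id;
    private final String name;
    private transient String label; // derived value, skipped by default serialization

    TaggedEmployee(int id, String name) {
        this.id = id;
        this.name = name;
        this.label = buildLabel();
    }

    int getId() { return id; }
    String getName() { return name; }
    String getLabel() { return label; }

    private String buildLabel() { return "#" + id + " " + name; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();     // writes the non-transient fields id and name
        out.writeInt(FORMAT_VERSION); // extra data appended after the default fields
    }

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();       // restores id and name
        int version = in.readInt();   // extras are read in the order they were written
        if (version != FORMAT_VERSION) {
            throw new IOException("Unsupported format version: " + version);
        }
        label = buildLabel();         // rebuild the transient field
    }
}

class DefaultPlusCustomDemo {
    // Serializes to an in-memory buffer and reads the object back.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        TaggedEmployee copy = (TaggedEmployee) roundTrip(new TaggedEmployee(1, "John Doe"));
        System.out.println(copy.getLabel());
    }
}
```

The transient field never reaches the stream; it is recomputed in readObject, which is one common reason to add these methods rather than replace the default behaviour entirely.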

Default serialization is quite slow and should be reserved for the simplest use cases. Implementing readObject and writeObject does not improve matters much, because the JVM still has to use reflection to discover and invoke those private methods.

Externalizable

The default serialization protocol is considered a little heavy. If you wish to completely override it and use a different protocol, you can implement the Externalizable interface, which has two methods:

public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException

public void writeExternal(ObjectOutput out) throws IOException

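Once you implement these methods, the client code for serialization is the same as above. Note that an Externalizable class must also have a public no-argument constructor, because deserialization creates the instance through that constructor before calling readExternal. Externalizable usually performs better than default serialization. Since these are public interface methods, the JVM does not have to resort to reflection to find them. The implementations of readExternal and writeExternal would be similar to readObject and writeObject shown above.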
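A minimal sketch of an Externalizable version of the employee class might look like the following. The class name ExternalEmployee and the in-memory roundTrip helper are assumptions made for this illustration; the field layout simply mirrors the Employee class used earlier.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical Externalizable counterpart of the Employee class.
class ExternalEmployee implements Externalizable {
    private int id;
    private String name;
    private String address;

    // Required: deserialization calls this public no-arg constructor,
    // then fills in the fields through readExternal.
    public ExternalEmployee() { }

    public ExternalEmployee(int id, String name, String address) {
        this.id = id;
        this.name = name;
        this.address = address;
    }

    public int getId() { return id; }
    public String getName() { return name; }
    public String getAddress() { return address; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(id);
        out.writeUTF(name);    // writeUTF does not accept null
        out.writeUTF(address);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Read in exactly the order used by writeExternal.
        id = in.readInt();
        name = in.readUTF();
        address = in.readUTF();
    }
}

class ExternalizableDemo {
    // Same client code as for Serializable: writeObject / readObject.
    static ExternalEmployee roundTrip(ExternalEmployee e)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(e);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (ExternalEmployee) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ExternalEmployee copy =
            roundTrip(new ExternalEmployee(1, "John Doe", "123 5th Street, New City, CA"));
        System.out.println(copy.getId() + " " + copy.getName());
    }
}
```

Because the class controls every byte it writes, any later change to the field list has to be handled by hand in both methods.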

For most industrial-grade applications, though, you will want to forgo serializing objects directly and instead serialize the primitive types and strings that make up your objects.

In Summary,

You may use the Serializable interface for simple use cases. It is easy to use, but the performance is not good.

Using the Externalizable interface is a little more work, and it gives a little better performance.

For best performance, design a record format for the data in the objects and serialize the types directly to memory and write blocks to disk. Of course, this is much more work.
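One way to picture such a record format is the sketch below, which packs records into an in-memory ByteBuffer block. Everything in it is an assumption made for illustration: the EmployeeRecord class, the layout (an int id followed by length-prefixed UTF-8 strings), the record count at the start of the block, and the 4096-byte block size.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical record layout: [int id][int nameLen][name bytes][int addrLen][addr bytes]
final class EmployeeRecord {
    final int id;
    final String name;
    final String address;

    EmployeeRecord(int id, String name, String address) {
        this.id = id;
        this.name = name;
        this.address = address;
    }

    void writeTo(ByteBuffer buf) {
        buf.putInt(id);
        putString(buf, name);
        putString(buf, address);
    }

    static EmployeeRecord readFrom(ByteBuffer buf) {
        int id = buf.getInt();
        String name = getString(buf);
        String address = getString(buf);
        return new EmployeeRecord(id, name, address);
    }

    private static void putString(ByteBuffer buf, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        buf.putInt(bytes.length); // length prefix
        buf.put(bytes);
    }

    private static String getString(ByteBuffer buf) {
        byte[] bytes = new byte[buf.getInt()];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}

class RecordBlockDemo {
    static final int BLOCK_SIZE = 4096; // assumed block size

    // Packs records into one block; the result is flipped and ready to be written out.
    static ByteBuffer pack(EmployeeRecord... records) {
        ByteBuffer block = ByteBuffer.allocate(BLOCK_SIZE);
        block.putInt(records.length); // record count header
        for (EmployeeRecord r : records) {
            r.writeTo(block);
        }
        block.flip();
        return block;
    }

    // Reads back every record from a block produced by pack.
    static EmployeeRecord[] unpack(ByteBuffer block) {
        EmployeeRecord[] records = new EmployeeRecord[block.getInt()];
        for (int i = 0; i < records.length; i++) {
            records[i] = EmployeeRecord.readFrom(block);
        }
        return records;
    }

    public static void main(String[] args) {
        ByteBuffer block = pack(new EmployeeRecord(1, "John Doe", "123 5th Street"));
        System.out.println(unpack(block)[0].name);
    }
}
```

The packed buffer can be handed to FileChannel.write(ByteBuffer) to put the whole block on disk in one call. A real format would also need to handle records that do not fit in a block, which this sketch leaves out.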