Tuesday, June 12, 2012

5 Tips for building low latency web applications

Low latency applications are those that need to service requests in a few milliseconds or microseconds. Examples of a low latency application are
  • Servers that serve ads to be shown on web pages. If you don'nt serve the ad as per SLA dictated by publishers, you will not be given the opportunity to serve the ad. Typically the server has a few milliseconds to respond.
  • Servers that participate in real time bidding. Again, taking internet advertising as an example, if you are bidding for the opportunity to show an ad, you have just a few milli-seconds to make a bid.
  • Applications that provide real time quotes such as a travel portal. The portal makes requests to its partner agents. For the portal to be usable, the quotes need to displayed before the user loses interest and moves on to another travel site. Typically the portal app gives each partner a few milliseconds to respond. Otherwise the response is ignored.
Here are 5 simple guidelines to remember when building low latency applications.

1. Read from Memory: Keep the data required to serve the request in memory. Keep as much as possible in local memory and then use external caches like memcached, Ehcache and the more recent NoSql key value stores. Having to read data from a database or filesystem is slow and should be avoided.

2. Write asynchronously: The thread that services the request should never be involved in writing to any external storage such database or disk. The main thread should hand off the write task to worker threads that do the writing asynchronously.

3. Avoid locks and contention: Design the application in way that multiple threads are not contending for the same data and requiring locks. This is generally not an issue if the application mostly reads data. But multiple threads trying to write requires accquiring locks that slows down the application. You should consider a design where write operations are delegated to a single thread. You might need to relax the requirement of ACID properties on data. In many cases, your users will be able to tolerate data that becomes eventually consistent.

4. Architect with loosely coupled components: When tuning for low latency, you might need to co locate certain components. Co location reduces hops and hence latency. In other cases, you might need to geographically distribute components, so that they are closer to the end user. An application built with loosely coupled components can accommodate such physical changes without requiring a rewrite of the application.

5. Horizontal scalability: Architect the application so that you can scale horizontally as the load on the application goes up. Even if you read all data from a cache, as the number of requests go up and size of data increases, because of things like garbage collection(in the case of JAVA) the latency will go up. To handle more requests without increasing latency, you will need to add more servers. If you are using a key-value store, you might need to shard the data across multiple servers to keep sub millisecond response time.

Most importantly, building and maintaining low latency is an iterative process that involves designing correctly, building, measuring performance and then tuning by looking at the design and code for improvements.

Wednesday, May 16, 2012

When to use explicit Locks in JAVA ?

Prior to JDK 5, the only way to protect data from concurrent access was to use the synchronized keyword. The limitations of using synchronized are

(1) A thread that tries to acquire a lock has to wait till it gets the lock. There is no way to timeout.
(2) A thread that is waiting for a lock cannot be interrupted.
(3) Since synchronized applies to a block of code, the lock has to be acquired and released in the same block. While this is good most of the time, there are cases where you need the flexibility of acquiring and releasing the lock in different blocks.

The Lock interfaces and classes are well documented at java.util.concurrent.locks
The basic usage of the new Lock interface is

Lock l = new ReentrantLock() ;
l.lock() ;
try {
// update
} finally {
l.unlock() ;
}

You might be tempted to say that this can be done using synchronized. However the new Lock interface has several additional features.

1. Non blocking method
The trylock() method (without params) acquires the lock if it is available. If it is not available it returns immediately. This is very useful in avoiding deadlocks when you are trying to acquire multiple locks .

2. Timed
trylock(time......)  acquires the lock if it is free within the time. Otherwise it returns false. The thread can be interrupted during the wait time.

This is useful when you have service time requirements such as in real time bidding. Say the method needs to response in 10 milli secs, otherwise the response is of no use because the bid is lost.

3. Interruptible
The lockInterruptibly method will try to acquire the lock till it is interrupted.
This is useful in implementing abort or cancel features.

4. Non block structured locking
You can acquire the lock in one method and release it in another. Or you can wrap the lock and unlock in your domain specific accquireLock() and releaseLock() methods.

This is useful in avoiding race conditions on read,update,save operations on data stored in caches. The synchronization provided by ConcurrentHashMap or Synchronized Map protects only for the duration of get and set operation. Not while the data is modified.

cache.acquireLock(key) ;
Data d = cache.get(key) ;
d.update1() ;
d.update2() ;
d.update3() ;
cache.put(key,d) ;
cache.releaseLock(key) ;

Acquiring and releasing the lock are abstracted away in acquirelock and releaseLock methods.

5. Read /Write Locks
This is my favorite feature. The ReadWriteLock interface exposes 2 locks objects. A read lock and a write lock.

You acquire the read lock when all you are doing is reading. Multiple threads can acquire the read lock.By allowing multiple readers, you achieve greater concurrency. A read lock cannot be acquired while a write lock is held by another thread.

You acquire the write lock when you need to write data. Only one thread can acquire a write lock at a time. A write lock cannot be acquired while other threads have acquired read locks.

Is the use of synchronized obsolete ?
Not really. Synchronized blocks are simple to use and are widely used. Most programmers are very familiar with its usage. They are less error prone as the lock is automatically release. It is reasonable to continue using synchronized for the the simpler use cases of locking. But if you need any of the features described above, using explicit locks is well worth the extra coding. Performance wise there is not much difference though studies have shown that explicit locks are slightly faster.

Friday, April 20, 2012

Build distributed applications using Spring HTTP Invoker

Buliding distributed applications involves calling methods on objects that are remote - on different machines and/or different JVMs. Code running on machine A invokes a method on an object running on machine B and it works just as if the caller and the target were in the same JVM. In the past, CORBA, RMI and EJBs were technologies used for remote invocation. But they are complicated to use. The protocols are binary and difficult to troubleshoot. Also they are not suitable for use across intranets because they use ports that networks admins hate to open.

Since 2000, SOAP based web services enabled remote invocation using HTTP as the transport and XML for payload. While HTTP solved the problems of troubleshooting and firewalls, the performance of using XML was not very good. Some developers prefer web services using JSON over HTTP, but that requires modeling the data in JSON.

Spring HTTP Invoker is a remoting mechanism, where the programming model is plain java, but HTTP is used as the transport and the payload is created using java serialization. Spring HTTP gives developers the benefit of HTTP without the performance overhead of XML based web services. In the rest of the article, we explain with a simple example remoting using Sprint HTTP Invokers.

For this tutorial you will need

(1) Spring
(2) JDK
(3) Eclipse
(4) Tomcat

In this example, we create a service AccountService, with a method getAccount. The service is deployed to tomcat. We invoke the getAccount method from J2SE client in a different JVM. You may download the full source code for this sample at RemoteService

Step 1: create the service and implementation

Let us define an interface AccountService and its implementation AccountServiceImpl in plain JAVA.
public interface AccountService {
  public Account getAccount(int id) ;
}
public class AccountServiceImpl implements AccountService {
 @Override
 public Account getAccount(int id) {
  // TODO Auto-generated method stub
  return new Account(id,"testacct",100,2999.99F) ;
 }
}
Step 2: Spring Application context for server side
The Spring application context is defined in the file remoting-servlet.xml.
<beans>
    <bean id="accountService" class="com.mj.account.AccountServiceImpl"/>
    <bean> name="/AccountService" class="org.springframework.remoting.httpinvoker.HttpInvokerServiceExporter">
       <property name="service" ref="accountService"/>
       <property name="serviceInterface" value="com.mj.account.AccountService"/>
   </bean>
</beans>
The first bean accountService needs no explanation - it is a simple spring bean. The 2nd exports a bean /AccountService. This is exported by HttpInvokerServiceExporter, a Spring provided class. The service exported is accountService defined by the 1st bean. Since we will be invoking using HTTP, the url is /AccountService. (by convention).

Step 3: package as war and deploy to tomcat

The classes and context xml needs to be packaged as a war and deployed to tomcat. The standard spring MVC dispatcherServlet needs to be wired into the web.xml.
<servlet>
        <servlet-name>remoting</servlet-name>
        <servlet-class>
            org.springframework.web.servlet.DispatcherServlet
        </servlet-class>
        <load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
        <servlet-name>remoting</servlet-name>
        <url-pattern>/*</url-pattern>
</servlet-mapping>
Step 4: Create a application context for the client with the entry
<bean id="AccountProxy" class="org.springframework.remoting.httpinvoker.HttpInvokerProxyFactoryBean">
     <property name="serviceUrl" value="http://localhost:8080/remoteservice/AccountService"/>
     <property name="serviceInterface" value="com.mj.account.AccountService"/>
</bean>
This defines a bean AccountProxy whose implementation is the HttpInvokerProxyFactorybean, which will create the Http invoker. The url to invoke is http://localhost:8080/remoteservice/AccountService. http://localhost:8080 is where the target web server is listening. /remoteservice is the tomcat context ( I deployed the service as remoteservice.war). We defined /AccountService as the url for our bean in remoting-servlet.xml.

Step 5: Remote invoke the service
public class AccountServiceClient {
 public static void main(String[] args) {
   ApplicationContext applicationContext = new ClassPathXmlApplicationContext("remoteclient.xml");  
   AccountService testService = (AccountService) applicationContext.getBean("AccountProxy");  
   Account a = testService.getAccount(25) ;   
   System.out.println(a) ;
 }
}
You should see the output 25 testacct 100 2999.99

In summary, it is very easy to do remote invocation and distribute services using Spring HTTP invokers. You get the ease of plain JAVA programming and the ease of maintainence and troubleshooting because of HTTP. There is simply no reason to use RMI like protocols any more.

Sunday, March 18, 2012

High Availability for Web applications

As more mission critical applications move to the cloud, making the application highly available becomes super critical. An application not available for whatever reason, web server down, database down etc mean lost users, lost revenue that can be devastating to your business. In this blog we examine some basic high availability concepts.

Availability means your web application is available to your users to use. We would all like our applications to available 100% of the time. But for various reasons it does not happen. The goal of high availability is to make the application available as much as possible. Generally, availability is expressed as a percent of time that application is available per year. One may say availability is 99% or 99.9% and so on.

Redundancy and failover are techniques used to achieve high availability. Redundancy is achieved by having multiple copies of your server. Instead of 1 apache web server, you have two. One is the active server. The active server is monitered and if for some reason it fails, you failover to the 2nd server which becomes active. Another approach is to use a cluster of active servers as is done in a tomcat clusters. All servers are active. A load balancer distributes load among the members of the cluster. If one or two member of the cluster go down, no users are affects because other servers continue processing. Of course, the load balancer can become a point of failure and needs redundancy and failover.

If you were launching a new web application to the cloud, you might start of with a basic architecture as shown below without any HA consideration.

Phase 1: 1 Tomcat web server




Very soon you hit 2 issues. First whenever the server goes down, by accident or by intent, your users cannot use the application. As the number of users goes up, your server is not able to handle the load.

Phase 2: Tomcat cluster

You add redundancy and scalability by using a tomcat cluster as shown in the figure below. The cluster is fronted by Apache Web server + mod_proxy which distributes requests to the individual server. Mod_proxy is the load balancer.


Now the application scales horizontally. Tomcat or application failure is not an issue because there are other servers in the cluster. But we have introduced a new point a failure, the load balancer. If Apache+mod_proxy goes down, the application is unavailable.

To read more about setting up a tomcat cluster see Tomcat clustering

To learn how to use a load balancer with tomcat see Loadbalancing with Tomcat

Phase 3: Highly available Tomcat cluster
The figure below shows how to eliminate the point of failure and make the load balancer highly available.
You add redundancy by adding a second apache+mod_proxy. However only one of the apache is active. The second apache is not handling any requests. It merely monitors the active server using a tool like heartbeat. If for some reason, the active server goes down, the 2nd server knows and the passive server takes over the ip address and starts handling requests. How does this happen ?

This is possible because the ip address for this application that is advertised to the world is shared by the two apache's. This is know as a virtual ip address. While the 2 servers share the virtual IP, TCP/IP routes packets to only the active server. When the active server goes down, the passive server tells TCP/IP to start routing packets intended for this ip address to it. There are TCP/IP commands that let the server start and stop listening on the virtual ip address.

Tools like heartbeat and Ultramonkey enable you to maintain a heartbeat with another and failover when necessary. With heartbeat, there is a heartbeat process on each server. Config files have information on the virtual ip address, active server, passive server. There are several articles on the internet on how to setup heartbeat.

In summary, you can build highly available applications using open source tools. The key cocepts of HA, redundancy, monitoring & failover, virtual ip address apply to any service and not just web servers. You can use the same concepts to make your database server highly available.

Sunday, February 19, 2012

Java Generics #2 : what is "? super T" ?

Consider the merge method below that copies all elements from List source to List target.
public static <T> void merge(List<? super T> target, List<? extends T> source)
The <T> following static declares T as a new type variable for this method. We discussed "? extends T" in the blog Java Generics #1. Here, let us examine "? super T". One can guess that it is a wildcard that means any types, that are a superclass of T. If T is Integer, then List<? super T> could be List<Integer>, List<Number> or List<Object>. The code below shows the use of the merge method. In line 6, T is Integer and "? super T" is Number. In line 10, T is Number and "? super T" is Object.
1 List<Integer> aInts = new ArrayList<Integer>() ;
2 aInts.add(5) ;
3 aInts.add(7) ;
4 List<Number> aNums = new ArrayList<Number>() ;
5 aNums.add(12.5) ;
6 MCollections.<Integer>merge(aNums,aInts) ; // works
7 System.out.println(aNums.toString()) ; // aNums has 5,7,12.5
8 List<Object> aObjs = new ArrayList<Object>() ;
9 aObjs.add("hello") ;
10 MCollections.<Number>merge(aObjs,aNums) ; // works as well
11 System.out.println(aObjs.toString()) ; // aObjs has hello,5,7,12.5
We discussed in the last blog that if you have a Collection<? extends T> you can get values out of it, but you cannot put stuff into it. So what can you do with Collection<? super T> ?

In our merge example above, List<? super T> is the target, which means the implementation is putting/setting elements into it. "? super T" means any supertype of T. Logically it makes sense that you can put any supertype into the List.

The implementation of merge could be
1 public class MCollections {
2 public static <T> void merge(List<? super T> target, 
3                                List<? extends T> source) {
4   for(int i = 0 ; i < source.size(); i++) {
5     T e = source.get(i) ;
6     target.add(e) ;
7   }
8 }
9 } 
But if you were to do a get, what would be the returned type ?. There would be no way to know. Hence it is not allowed as shown in line 4 of the code below.
1 List<? super Integer> aNums = new ArrayList<Number>() ;
2 aNums.add(11) ;
3 aNums.add(12) ;   
4 Number n = aNums.get(0) ; // Compilation Error - not allowed 
5 Object o = aNums.get(0) ; // allowed -- No compile error 
The exception to the rule is getting an Object, which is allowed because since Object is a supertype of every other java type.

In summary, you can enable subtyping using "? super T" when you need to put objects into the collection. ( But you can get them out only as Object). You can enable subtyping using ? extends T when you need to get objects out of the collection. It follows that if you need to do both get and put, then you cannot use either of these wildcard mechanisms and you need to use a explicit type.

Sunday, January 15, 2012

Java Generics #1 : Subtyping using wildcard with extends

Generics is one of those more complicated language features in Java that is not well understood by many programmers. Many avoid it altogether. This is not without reason. While writing your program, if you have to stop and think a lot about syntax, there is more than a good chance, you would try to avoid that language construct. In this blog I discuss one type of subtyping with generics which can be tricky.

In Java we know that Integer extends Number. In other words, Integer is a subtype of Number. Anywhere a Number required, you can pass in an Integer. But does this mean that List<Integer> is a subtype of List<Number> ?

Consider the code. Will it work ?
1 List<Integer> aList = new ArrayList<Integer>() ;
2 aList.add(11) ;
3 aList.add(13) ;
4 List<Number> nList = aList ;
5 nList.add(11.5) ;
aList is a list of Integers. nList is a list of Numbers. In line4 nList is made to reference aList. In line 5 we add a double to aList. But aList is a list of integers. This is obviously not correct. And Java will not allow it. Line 4 will cause a compilation error. But sometimes we want to be able use subtypes. Generics have the concept of wildcards that enable subtyping when logically approriate.

Consider the addAll method of the Collection interface.
interface Collection<T> {
   public boolean addAll(Collection<? extends T> x) ;

}
? extend T says that given a collection of Type T , you can add to it elements from any collection whose type is a subtype of T. The following code is valid.
1 List<Number> aList = new ArrayList<Number>() ;
2 List<Integer> intList = Arrays.asList(11,12) ;
3 List<Double> dList = Arrays.asList(15.15) ;
4 aList.addAll(intList) ;
5 aList.addAll(dList) ;
The implemention of addAll method will get elements from the list passed in as parameter and put it into the target collection. Note that it is only a get operation on Collection<? extends T>. A put on Collection<? extends T> would however not be allowed. To understand, consider the code below
1 List <? extends Number> numList ;
2 List<Integer> intList = Arrays.asList(11,12) ;
3 numList = intList ; // Will this work ?
4 numList.add(5.67) ; // Will this work ?
Should line 3 work ? What about line 4 ?
The Java compiler allows line 3 because List<Integer> is considered a subtype of List <? extends Number>. But line 4 is a compilation error because you should not be allowed to add a double to List<Integer>.

In summary, when you have a Collection<? extends T>, it is safe to get elements out of the collection but not safe to put elements into it. Hence the compiler does not allow it.

Sunday, December 18, 2011

Single Sign On for the cloud: SAML & OpenId

When accessing different applications owned by different organizations, having to authenticate everytime you go from one application to another is annoying. Not only is it time consuming, but you also have to remember multiple passwords, which are often lost.

Single sign on is the ability to authenticate once and be able to move between applications seamlessly using the authenticated identity.

Within an intranet or between applications that are under the control of one development organization, single sign on for web applications can be easily implemented by generating a sessionid and passing it around using cookies. However, such a solution is proprietary and will not work if you need to leave the intranet and access other applications on the cloud. To interoperate with applications on the cloud, a more standards based solution is required.

A related concept and benefit is federated identity. Organizations can agree to a common name to refer to users. The user and his attributes needs to be created only in once place and others can refer to this information.

In this blog, we briefly examine two popular protocols that can be used for single sign on on the cloud: SAML and OpenId.

OpenId

The problem OpenId solves is that you as a user do not have to maintain and provide a password to each and every site you visit.

You maintain your password or other identifying credential with one provider known as the OpenId provider.

The website or application that you visit and that requires proof of who you are, relies on the OpenId provider to verify that you are who you claim to be. This is known as the relying party.

The basics of the OpenId protocol are:

1. You visit a web application (relying party) and enter an OpenId

2. Based on your OpenId, the relying party determines who your OpenId provider is.

3. The relying party redirects your request to the OpenId provider.

4. If you are already authenticated, this step is skipped.

The OpenId provider authenticates you by asking for a password or other information. The provider warns you that the relying party is requesting information about you.

5. The request is redirected back to the relying party where you are shown the URL you were trying to access.

The protocol does not require providers or relying parties to be registered anywhere. It uses plain HTTP requests and responses. The protocol messages are plain text key value pairs. The protocol works well with modern "Web20" AJAX style applications.

The OpenId protocol originated from consumer oriented websites such as Google, Twitter, Facebook etc and that is where this protocol is popular.

The OpenId specification is described at OpenId specification
There is a java implementation of OpenId at openid4java

SAML (Security Assertion Markup Language)

SAML is a XML based protocol that enables web based authentication, authorization and single sign on.

SAML involves a relying party requesting an assertion and a SAML provider responding with the assertion.

Examples of assertions are :
Authentication Assertion : This user was authenticated using such and such method at time t.
Attribute Assertion : This user has a title supermanager.
Authorization Assertion : This user has permission to delete file xyz.doc.

A typical SAML interaction would be as follows:

1. A user tries to access a URL or a web application which is the relying party
2. The relying party creates a SAML authentication request.
3. The relying party redirects the users browser to a SAML provider. Embedded in the request is the SAML authentication request.
4. The SAML provider evaluates the SAML request and authenticates the user.
5. The SAML provides returns a SAML authentication response to the user browser.
6. The browser forwards the SAML response back to the relying party.
7. The relying party verifies and interprets the SAML response.
8. If the response implies successful authentication, the user is redirected to the URL, he was originally trying to reach.

SAML has the concept of profiles. The interaction is different based on the profile. The interaction above is the Web SSO profile.

SAML has its origins more in enterprise software, in web services, in B2B communication and is from the early 2000s when XML was very popular. In fact SAML1.x had only a SOAP binding.

The SAML specification is at SAML Specification
There is a SAML implementation at OpenSAML

Which protocol should I use ?

OpenId is a simpler protocol. But has SAML has more features.

OpenId supports discovery of the OpenId provider. Using SAML has generally required expensive SAML projects.

OpenId supports only service provider initiated SSO. You go to a service provider web site and they need authentication. They start the conversation with the OpenId provider. SAML can also support identity provider initiated SSO. You are authenticated into your companys portal. Your company has a partner travel website for business travel. With SAML, you can go from your companys portal ( SAML provider) to the partner website ( relying party) without needing reauthentication.

SAML has been around longer than OpenId. SAML is more popular in the enterprise where as OpenId is more popular in consumer oriented applications.

Both OpenId and SAML rely on external transport layer security protocols such as SSL for the security of protocol messages.

If you are starting a new website and want to accept users from other popular sites such as google or twitter, you might consider OpenId. However if you are an enterprise and you want your authenticated users to access your partner sites without re-authentication, you might need SAML.

In summary, SAML is a feature rich protocol more popular in the enterprise. OpenId is simpler protocol with some limitations.