Alex Blewitt tweeted an article by Paul Tyma titled: Thousands of Threads and Blocking I/O: The old way to write Java servers is new again. Paul is the Founder/CEO of ManyBrain, the creator of Mailinator.
Paul’s 65-slide presentation is a fast read for anyone interested in Java I/O, especially in a client/server setup. What makes it interesting is that Paul began his research into IO vs. NIO with the assumption most Java developers carry around: NIO is faster than IO because it’s asynchronous and non-blocking.
The more research he did, the more he found everyone repeating that claim, with a complete lack of benchmarks or research to back it up. Paul sat down and wrote a quick “blast the server with data” benchmark and found that in every case the NIO-based server was 25% slower than the blocking, thread-based IO server.
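To make the “blocking, thread-based IO server” concrete, here is a minimal sketch of the thread-per-connection model: one thread blocks in accept(), and each accepted socket gets its own thread that blocks in read(). This is not Paul’s benchmark code; the class name BlockingEchoServer, the echo behavior and the roundTrip helper are assumptions made purely for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical example class: a thread-per-connection echo server on loopback.
public class BlockingEchoServer implements AutoCloseable {
    private final ServerSocket listener;

    public BlockingEchoServer() {
        try {
            // Port 0 asks the OS for any free port; bind to loopback only.
            listener = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Thread acceptor = new Thread(this::acceptLoop, "acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    public int port() {
        return listener.getLocalPort();
    }

    private void acceptLoop() {
        while (!listener.isClosed()) {
            try {
                Socket client = listener.accept(); // blocks until a client connects
                // One dedicated thread per connection: the "old way".
                Thread worker = new Thread(() -> echo(client), "conn-" + client.getPort());
                worker.setDaemon(true);
                worker.start();
            } catch (IOException e) {
                // accept() throws once the listener is closed; the loop condition then ends the loop.
            }
        }
    }

    private static void echo(Socket client) {
        try (Socket s = client) {
            InputStream in = s.getInputStream();
            OutputStream out = s.getOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) { // blocks until data arrives or the peer closes
                out.write(buf, 0, n);
            }
        } catch (IOException ignored) {
            // Connection dropped; nothing to clean up beyond closing the socket.
        }
    }

    @Override
    public void close() {
        try {
            listener.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Blocking client helper: send a message, half-close, read the echo back.
    public static String roundTrip(int port, String message) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port)) {
            s.getOutputStream().write(message.getBytes(StandardCharsets.UTF_8));
            s.shutdownOutput(); // signals end-of-stream so the server's read() returns -1
            return new String(s.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The appeal of this style is that each connection’s logic reads top to bottom with no selector or state machine; the trade-off is one thread per open connection, which is exactly the cost the presentation argues has become cheap.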
Researching further, Paul found others online who had come across the exact same performance discrepancies. Here is a relevant quote from Rahul Bhargava, CTO of Rascal Systems:
Blocking model was consistently 25-35% faster than using NIO selectors. Lot of techniques suggested by EmberIO folks were employed – using multiple selectors, doing multiple (2) reads if the first read returned EAGAIN equivalent in Java. Yet we couldn’t beat the plain thread per connection model with Linux NPTL.
To work around not so performant/scalable poll() implementation on Linux’s we tried using epoll with Blackwidow JVM on a 2.6.5 kernel. while epoll improved the over scalability, the performance still remained 25% below the vanilla thread per connection model. With epoll we needed lot fewer threads to get to the best performance mark that we could get out of NIO.
Paul goes on to make the case that blocking I/O is now the new (old) way to write servers because thread synchronization is extremely cheap on modern operating systems and multi-core systems have become the norm. His benchmark of thread contention, whether with 2 threads or 1,000, shows the OS doing constant-time work to keep all those threads actively engaged, with the cost of an idle thread damn near zero.
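As a rough illustration of what “idle threads” means here (this is not Paul’s contention benchmark and measures nothing), the sketch below parks a batch of threads on a latch, where they consume no CPU, and then wakes them all at once. The class name IdleThreads and the method parkAndRelease are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class: park many threads, then release them together.
public class IdleThreads {
    // Starts threadCount threads that block on a latch, releases them, and
    // returns how many of them woke up and ran to completion.
    public static int parkAndRelease(int threadCount) {
        CountDownLatch started = new CountDownLatch(threadCount);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger woken = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                started.countDown();
                try {
                    release.await(); // the thread is idle here, parked by the OS scheduler
                    woken.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }

        try {
            started.await();      // wait until every thread has started running
            release.countDown();  // wake all parked threads
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return woken.get();
    }
}
```

Calling IdleThreads.parkAndRelease(1000) on a modern JVM and OS should complete quickly, which is the informal point: a thread that is blocked costs memory for its stack but essentially no CPU.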
Folks who remember the days of Java 1.1 and 1.2, chat clients and giant application servers will recall that blocking I/O was the bane of Java’s high-performance-server existence. It seems that while the software community solved that problem with Java’s NIO libraries, the OS and hardware community solved the original problem of expensive threads with advanced OS threading libraries like NPTL and with multi-core machines.
Naturally, the union of both approaches is the “perfect mix,” and that is what Paul touches on in his presentation: software is now advanced enough to give us what it once could not, and hardware has improved enough to magnify that benefit many times over.
Paul’s presentation goes on to discuss developing high-performance services and the different implementation details of things like work queues, blocking vs non-blocking work queues, waking sleeping threads and how to scale different approaches out to huge problem sets like he has for his own Mailinator service (which can handle millions of emails a day).
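For readers who have not built one, a blocking work queue can be sketched with the JDK’s LinkedBlockingQueue: worker threads sleep inside take() and are woken only when work arrives. This is an assumption-laden illustration, not the design from the slides or from Mailinator; the class name BlockingWorkQueue, the method sumWithWorkers and the “poison pill” shutdown marker are all hypothetical choices.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example class: a fixed set of workers draining a blocking queue.
public class BlockingWorkQueue {
    // Sentinel job that tells a worker to exit; compared by reference.
    private static final Runnable POISON = () -> { };

    // Submits the numbers 1..tasks as jobs that add themselves to a shared total,
    // processes them on the given number of worker threads, and returns the total.
    public static long sumWithWorkers(int workers, int tasks) {
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
        AtomicLong total = new AtomicLong();
        Thread[] pool = new Thread[workers];

        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    while (true) {
                        Runnable job = queue.take(); // sleeps until a job is available
                        if (job == POISON) {
                            return;
                        }
                        job.run();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool[i].start();
        }

        for (int n = 1; n <= tasks; n++) {
            final int value = n;
            queue.add(() -> total.addAndGet(value)); // wakes one sleeping worker
        }
        for (int i = 0; i < workers; i++) {
            queue.add(POISON); // one pill per worker, queued after all real jobs
        }

        try {
            for (Thread t : pool) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return total.get();
    }
}
```

Because the queue is FIFO and the pills are enqueued last, every real job is dequeued before any worker sees a pill, so joining the workers guarantees all jobs have run.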
It’s an excellent read if you get a chance to scroll through it.