posts - 12, comments - 0, trackbacks - 0, articles - 7
  BlogJava :: 首页 :: 新随笔 :: 联系 :: 聚合  :: 管理
from: Tricks and Tips With AIO Part 1: The Frightening Thread Pool


Jean-Francois 

A while ago JDK 1.4 introduced the notion of non-blocking I/O. With non-blocking I/O (NIO), you're getting events through a selector when there is some I/O ready to be processed, like read and write operations. before JDK 1.4, only blocking I/O was available. With locking I/O, you were just blocking on a stream, trying to read and write.
JDK 7 introduces asynchronous I/O (AIO). Asynchronous I/O gives you a notification when the I/O is completed. The big difference with non-blocking is with AIO you get the notification when the I/O operation complete, where with blocking you you get notified when the I/O operation is ready to be completed.
For example, with a socket channel in a non-blocking mode, you register with a selector, and the selector will give you a notification when there is data on that socket to read. With the asynchronous I/O, you actually start the read, and the I/O will complete sometime later when the read has happened and there is data in your byte buffer.
With AIO, you wait for completed I/O operation using we a completion handler (explained in details below). You specify a completion handler when you do your read, and the completion handler is invoked to tell you that the I/O operation has completed with the bytes that has been read. With non-blocking, you would have been notified and then you would have executed yourself the read operation to read bytes.
One of the nice thing you can do with AIO is to configure yourself the thread pool the kernel will uses to invoke a completion handler. A completion handler is an handler for consuming the result of an asynchronous I/O operation like accepting a remote connection or reading/writing some bytes. So an asynchronous channels (with NIO.1 we had SelectableChannel) allow a completion handler to be specified to consume the result of an asynchronous operation. The interface define three "callback":
   * completed(...): invoked when the I/O operation completes successfully.
   * failed(...): invoked if the I/O operations fails (like when the remote client close the connection).
   * cancelled(...): invoked when the I/O operation is cancelled by invoking the cancel method.
Below is an example (I will talk about it it much more details in part II) of how you can open a port and listen for requests:

 1 // Open a port
 2 final AsynchronousServerSocketChannel listener =
 3    AsynchronousServerSocketChannel.open().bind(new InetSocketAddress(port));
 4 
 5 // Accept connections
 6 listener.accept(nullnew CompletionHandler<AsynchronousSocketChannel,Void>() {
 7    public void completed(AsynchronousSocketChannel channel,Void> result) {}
 8    public void cancelled(AsynchronousSocketChannel channel,Void> result) {}
 9    public void failed(AsynchronousSocketChannel channel,Void> result) {}
10 }
11 

Now every time a connection is made to the port, the completed method will be invoked by a kernel's thread. Do you catch the difference with NIO.1? To achieve the same kind of operation with NIO.1 you would have listen for requests by doing: 

 1 selector.select(timeout);
 2 
 3 while(readyKeys.hasNext()){
 4  SelectionKey key = iterators.next();
 5  if (key.isAcceptable()){
 6    // Do something that doesn't block
 7    // because if it blocks, no more connection can be
 8    // accepted as the selector.select(..)
 9    // cannot be executed
10  }
11 }
12 

With AIO, the kernel is spawning the thread for us. Where this thread coming from? This is the topic of this Tricks and Tips with AIO.

By default, applications that do not create their own asynchronous channel group will use the default group that has an associated thread pool that is created automatically. What? The kernel will create a thread pool and manage it for me? Might be well suited for simple application, but for complex applications like the Grizzly Framework, relying on an 'external' thread pool is unthinkable as most of the time the application embedding Grizzly will configure its thread pool and pass it to Grizzly. Another reason is Grizzly has its own WorkerThread implementation that contains information about transactions (like ByteBuffer, attributes, etc.). At least the monster needs to be able to set the ThreadFactory!.

Note that I'm not saying using the kernel's thread pool is wrong, but for Grizzly, I prefer having full control of the thread pool. So what's my solution? There is two solutions, et c'est parti:

Fixed number of Threads (FixedThreadPool)

An asynchronous channel group associated with a fixed thread pool of size N creates N threads that are waiting for already processed I/O events. The kernel dispatch event directly to those threads, and those thread will first complete the I/O operation (like filling a ByteBuffer during a read operation). Once ready, the thread is re-used to directly invoke the completion handler that consumes the result. When the completion handler terminates normally then the thread returns to the thread pool and wait on a next event. If the completion handler terminates due to an uncaught error or runtime exception, the thread is returned to the pool and wait for new events as well (no thread are lost). For those cases, the thread is allowed to terminate (a new event is submitted to replace it). The reason the thread is allowed to terminate is so that the thread (or thread group) uncaught exception handler is executed.

So far so good? ....NOT. The first issue you must be aware when using fixed thread pool is if all threads "dead lock" inside a completion handler, your entire application can hangs until one thread becomes free to execute again. Hence this is critically important that the completion handler's methods complete in a timely manner so as to avoid keeping the invoking thread from dispatching to other completion handlers. If all completion handlers are blocked, any new event will be queued until one thread is 'delivered' from the lock. That can cause a really bad situation, is it? As an example, using a Future when waiting for a read operation to complete can lock you entire application:

1 Future result = ((AsynchronousSocketChannel)channel).read
2               (byteBuffer,,myCompletionHandler);
3 
4    try{
5        count = result.get(30, TimeUnit.SECONDS);
6    } catch (Throwable ex){
7        throw new EOFException(ex.getMessage());
8    }
9 

Like for OP_WRITE, I'm pretty sure nobody will ever code something like that, right? Well, some application needs to blocks until all the bytes are arrived (a Servlet Container is a good example) and if you don't paid attention, your server might hangs. Not convinced? Another example could be:

 1 channel.write(bb,30,TimeUnit.SECONDS,db_pool,new CompletionHandler() {
 2 
 3            public void completed(Integer byteWritten, DataBasePool attachment) {
 4                // Wait for a jdbc connection, blocking.
 5                MyDBConnection db_con = attachment.get();
 6            }
 7 
 8            public void failed(Throwable exc, DataBasePool attachment) {
 9            }
10 
11            public void cancelled(DataBasePool attachment) {
12            }
13        });
14 

Again, all threads may dead lock waiting for a database connection and your application might stop working as the kernel has no thread available to dispatch and complete I/O operation.

Grrr what's our solution? The first solution consists to carefully avoid blocking operations inside a completion handler, meaning any threads executing a kernel event must never block on something. I suspect this will be simple to achieve if you write an application from zero and you want to have a fully asynchronous application. Still, be careful and make sure you properly create enough threads. How you do that? Here is an example from Grizzly:

1 ThreadPoolExecutorServicePipeline executor = new ThreadPoolExecutorServicePipeline
2      (corePoolThreads,maxThreads,8192,30,TimeUnit.SECONDS);
3 AsynchronousChannelGroup asyncChannelGroup = AsynchronousChannelGroup
4       .withFixedThreadPool(executor,maxThreads);
5 

The second solution is to use a cached thread pool

Cached Thread Pool Configuration

An asynchronous channel group associated with a cached thread pool submits events to the thread pool that simply invoke the user's completion handler. Internal kernel's I/O operations are handled by one or more internal threads that are not visible to the user application. Yup! That means you have one hidden thread pool (not configurable via the official API, but as a system property) that dispatch events to a cached thread pool, which in turn invokes completion handler (Wait! you just win a price: a thread's context switch for free ;-). Since this is a cached thread pool, the probability of suffering the hangs problem described above is lower. I'm not saying it cannot happens as you can always create cached thread pool that cannot grows infinitely (those infinite thread pool should have never existed anyway!). But at least with cached thread pool you are guarantee that the kernel will be able to complete its I/O operations (like reading bytes). Just the invocation of the completion handler might be delayed when all the threads are blocked. Note that a cached thread pool must support unbounded queueing to works properly. How you do set a cached thread pool? Here is an example from Grizzly:

1 ThreadPoolExecutorServicePipeline executor = new ThreadPoolExecutorServicePipeline
2      (corePoolThreads,maxCachedThreadPoolSize,8192,
3            30,TimeUnit.SECONDS);
4 AsynchronousChannelGroup asyncChannelGroup = AsynchronousChannelGroup
5       .withCachedThreadPool(executor,maxCachedThreadPoolSize);
6 


What about the default that ship with the kernel?



If you do not create your own asynchronous channel group, then the kernel's default group that has an associated thread pool will be created automatically. This thread pool is a hybrid of the above configurations. It is a cached thread pool that creates threads on demand, and it has N threads that dequeue events and dispatch directly to the application's completion handler. The value of N defaults to the number of hardware threads but may be configured by a system property. In addition to N threads, there is one additional internal thread that dequeues events and submits tasks to the thread pool to invoke completion handlers. This internal thread ensures that the system doesn't stall when all of the fixed threads are blocked, or otherwise busy, executing completion handlers.

Conclusion

If you have one thing to learn from this part I is independently of which thread pool you decide to use (default, your own cached or fixed), make sure you at least limit blocking operations. This is specially true when a fixed thread pool is used as it may hangs your entire application as the kernel is running out of available threads. The situation can also occurs with cached thread pool, but at least the kernel can still execute the I/O operations.