[jetty-dev] Jetty-9 development

[jetty-dev] Jetty-9 development

Greg Wilkins-3
All,

Recently, we've been doing a bit too much of our development behind
"closed doors" within webtide/intalio.    Now that we've started
jetty-9 development, I'd like to reverse that trend and start having
more of the development discussions on the open forums (as we used to
do, and should have all along).

There is a jetty-9 branch already, but it is a long, long way from
being usable... or even compilable!  But if you want to follow the
discussions, the code is there.

So why are we having jetty-9?  Haven't we only just gone to jetty-8?  Is
there a need for another major version change?

The driver for another new version is the increasing use of new
protocols on the web, namely Websocket and SPDY.  We have supported
both of those in the 7/8 architecture, but it has been a bit of a
stretch.   The way that jetty currently works is as follows:

+ We have our own Buffer and EndPoint abstractions; an EndPoint is an
asynchronous end point that roughly corresponds to NIO2's
AsynchronousSocketChannel. We had our own abstractions so we could
support JVMs that did not support NIO.  NIO is pretty much universal
now, so we can definitely drop our own buffers, and might even be able
to move away from our own EndPoint abstraction.
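
To make that concrete, here is a minimal sketch of what an end point looks like once it is expressed directly in NIO terms (the AsyncEndPoint name and the fill/flush methods are illustrative only, not the real jetty API):

  import java.io.IOException;
  import java.nio.ByteBuffer;

  // Illustrative only: an asynchronous end point expressed directly in NIO terms,
  // roughly analogous to NIO2's AsynchronousSocketChannel.
  interface AsyncEndPoint
  {
      // Fill the buffer from the socket: returns bytes read, 0 if none available, -1 at EOF.
      int fill(ByteBuffer buffer) throws IOException;

      // Flush as much of the buffer as the socket will take: returns bytes written.
      int flush(ByteBuffer buffer) throws IOException;

      boolean isOpen();

      void close() throws IOException;
  }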

+ Connectors are responsible for accepting new EndPoints and turning
them into Connections, with the primary connection implementations
being AsyncHttpConnection and BlockingHttpConnection, but now also
SpdyConnection and WebSocketConnection.   In a world where even smart
phones are multi-threaded and have NIO, I don't think the blocking
connection style needs to be supported any more, and we can move to
all async.  We no longer have to run on Palm Pilots!
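
In outline, a connector's job then reduces to something like this sketch (SketchConnector and its methods are made up, not the actual connector classes):

  import java.io.IOException;
  import java.nio.channels.ServerSocketChannel;
  import java.nio.channels.SocketChannel;

  // Illustrative sketch only: accept a channel, wrap it as an EndPoint and give it
  // its first Connection (an async HTTP connection, possibly behind SSL).
  class SketchConnector
  {
      private final ServerSocketChannel acceptor;

      SketchConnector(ServerSocketChannel acceptor)
      {
          this.acceptor = acceptor;
      }

      void accept() throws IOException
      {
          SocketChannel channel = acceptor.accept();
          channel.configureBlocking(false);
          newEndPointAndConnection(channel);
      }

      void newEndPointAndConnection(SocketChannel channel)
      {
          // wrap the channel as an async EndPoint, register it with the selector,
          // and create its initial Connection
      }
  }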

+ The primary entry point for a Connection is the method
Connection handle(), which does the parsing of the request, calls
Server.handle(request,response) and then completes the generation
of the response.   Note that it returns a Connection, so that a
HttpConnection.handle() call can return a WebSocketConnection when an
upgrade request is received.    Also, SSL is now implemented as a
Connection interceptor, so you can upgrade the chain
EndPoint->HttpConnection    to
EndPoint->SslConnection->HttpConnection   and then to
EndPoint->SslConnection->WebSocketConnection.
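
In sketch form (the Sketch* names are made up, but the shape matches the description above), the contract and the upgrade look like this:

  import java.io.IOException;

  // Illustrative sketch of the single entry point, and of how returning a
  // Connection from handle() is what makes upgrade possible.
  interface Connection
  {
      Connection handle() throws IOException;
  }

  class SketchUpgradableHttpConnection implements Connection
  {
      public Connection handle() throws IOException
      {
          boolean upgraded = parseAndHandleRequest();
          if (upgraded)
              return new SketchWebSocketConnection();   // the EndPoint now talks websocket
          return this;                                  // otherwise keep parsing HTTP
      }

      private boolean parseAndHandleRequest()
      {
          // parse the request, call Server.handle(request, response), complete the
          // response; return true if an Upgrade was accepted
          return false;
      }
  }

  class SketchWebSocketConnection implements Connection
  {
      public Connection handle()
      {
          // parse websocket frames ...
          return this;
      }
  }

An SslConnection is just another Connection of the same shape that decrypts and delegates to the next Connection in the chain, which is why the same upgrade works underneath SSL.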

So far so good.   Except things are becoming even more complex, mostly
because of multiplexing over websocket and SPDY. Currently 1
Connection instance == 1 TCP socket == 1 protocol instance == 1
application channel (e.g. a stream of HTTP requests or
websocket messages).  But websocket will soon have a MUX extension
that will allow multiple websocket streams to be transported over the
one TCP connection, thus 1 TCP socket == 1 protocol instance == N
application channels.   This is already the case with SPDY, which
supports multiple channels, each with HTTP semantics, over the one TCP
connection.  With SPDY we have:   1 TCP socket == 1 SPDY connection
instance == N fake HTTP connection instances == N application
channels.   This is implemented today with
EndPoint->SslConnection->SpdyConnection->SpdyHttpConnection->app
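
As a rough sketch of what the muxing means in code (the SketchSpdy* names and the onFrame method are illustrative, not the real SPDY classes): one connection instance parses the frames of the single TCP connection and routes them to N channel objects keyed by stream id.

  import java.util.HashMap;
  import java.util.Map;

  // Illustrative sketch: one SPDY connection demultiplexes the frames of a single
  // TCP connection onto N application channels, each with HTTP semantics.
  class SketchSpdyConnection
  {
      // in this sketch only the single parsing thread touches the map
      private final Map<Integer, SketchSpdyChannel> channels = new HashMap<Integer, SketchSpdyChannel>();

      void onFrame(int streamId, byte[] frame)
      {
          SketchSpdyChannel channel = channels.get(streamId);
          if (channel == null)
          {
              channel = new SketchSpdyChannel(streamId);
              channels.put(streamId, channel);
          }
          channel.handle(frame);   // each channel behaves like an HTTP request/response cycle
      }
  }

  class SketchSpdyChannel
  {
      private final int streamId;

      SketchSpdyChannel(int streamId)
      {
          this.streamId = streamId;
      }

      void handle(byte[] frame)
      {
          // turn the frame into HTTP (or websocket) semantics and call the application
      }
  }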

But it is getting even more complex, as SPDY is also able to transport
WebSockets (at least the semantics of websocket, but not the
protocol), so we will need to be able to have
EndPoint->SslConnection->SpdyConnection->[SpdyHttpConnection|SpdyWebSocketConnection]->app

We are working towards this architecture, but currently in 7/8 it
means that we have to fake HTTP connections and have mock parsers.
That is working.... but it is not an elegant long-term solution.

So what we need to do in jetty-9 is to separate the wire protocol
handling from the application protocol handling, so that you can have:

 HTTP wire protocol -> 1 HTTP application protocol
 Websocket wire protocol -> 1 Websocket application protocol
 Websocket wire protocol -> N Websocket application protocols
 SPDY wire protocol -> N HTTP application protocols + M Websocket application protocols

Ideally the application protocol handlers will be able to be written
so that they are independent of the wire protocol they are being
used on.
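
In interface terms, the separation is something like the following sketch (the names are only indicative of the idea, not a proposed jetty-9 API):

  import java.nio.ByteBuffer;

  // Illustrative only: the jetty-9 idea of separating wire protocol handling
  // from application protocol handling.
  interface WireConnection              // HTTP framing, websocket framing, SPDY framing, ...
  {
      void onFillable();                // bytes are available from the EndPoint: parse them
  }

  interface ApplicationChannel          // an HTTP request/response cycle, a websocket session, ...
  {
      void onRequest();                 // semantic events, however they arrived on the wire
      void onContent(ByteBuffer content);
      void onComplete();
  }

One WireConnection then feeds 1 or N ApplicationChannels, which is exactly the table above.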

So where is the jetty-9 branch?

The jetty-util, jetty-io and jetty-http modules have all been worked
over to remove the old jetty Buffer abstraction and to use NIO
directly.  The HTTP parser and generator have also been refactored to
be independent of the IO, so they can be used with jetty EndPoints or
JDK7 NIO2 Channels etc.  These build and pass their tests, and have had
a lot of cruft from 15 years of development removed (I'm not saying
there is not more cruft to be removed).
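
The shape of that refactoring is roughly this (an illustrative sketch, not the exact jetty-9 signatures): the parser never touches a socket, it just consumes whatever buffer it is handed and reports events to a handler, so the caller decides whether the bytes come from a jetty EndPoint, an NIO2 channel or a unit test.

  import java.nio.ByteBuffer;

  // Illustrative sketch of an IO-independent parser: it never reads from a socket,
  // it only consumes the buffers it is handed and fires events on a handler.
  class SketchHttpParser
  {
      interface EventHandler
      {
          void startRequest(String method, String uri, String version);
          void parsedHeader(String name, String value);
          void content(ByteBuffer chunk);
          void messageComplete();
      }

      private final EventHandler handler;

      SketchHttpParser(EventHandler handler)
      {
          this.handler = handler;
      }

      // Consume as much of the buffer as forms complete tokens; return true once a
      // whole message has been parsed.  The caller refills the buffer however it likes.
      boolean parseNext(ByteBuffer buffer)
      {
          // ... state machine over the buffer, calling the handler methods ...
          return false;
      }
  }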

jetty-server is where the current development is being done.  The
AsyncHttpConnection class has been torn apart into a HttpConnection
and a HttpChannel class, representing the wire protocol and the
application protocol.   So parsers/generators are in HttpConnection,
while requests, responses and continuations are in the HttpChannel.
The current challenge is to find the right contract between these
classes so that it may be efficiently generalised, so that the
HttpChannel will work with a future SpdyConnection, and so that we'll
be able to have a WebSocketConnection and a WebSocketChannel, with the
latter being usable with a SpdyConnection.   It is an open question
how much (if any) behaviour will end up in an AbstractConnection and
an AbstractChannel.  This split is complicated by the fact that we are
trying to change to the new parser/generator style at the same time.
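
The intent of the split is roughly the following sketch (class shapes illustrative only):

  import java.nio.ByteBuffer;

  // Illustrative sketch of the split: the wire side owns the parser/generator and
  // talks to the EndPoint; the channel side owns request/response/continuation state.
  class SketchHttpConnection
  {
      private final SketchHttpChannel channel = new SketchHttpChannel(this);

      void onFillable(ByteBuffer buffer)
      {
          // run the parser over the buffer; as semantic events are produced,
          // feed them to the channel (here collapsed to a single call)
          channel.onRequestComplete();
      }

      void commitResponse()
      {
          // run the generator and flush the resulting buffers to the EndPoint
      }
  }

  class SketchHttpChannel
  {
      private final SketchHttpConnection connection;

      SketchHttpChannel(SketchHttpConnection connection)
      {
          this.connection = connection;
      }

      void onRequestComplete()
      {
          // build the Request/Response, call Server.handle(request, response),
          // deal with continuations/suspension, then ask the wire side to commit
          connection.commitResponse();
      }
  }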

Another complication is that the split also means a move away from
Jetty's single entry point for a connection.   Currently everything
starts with a call to Connection.handle(); even if you wake up a
suspended continuation, the request is redispatched via a call to
Connection.handle(), which means that there can be a nice safe finally
block in Connection.handle() that will eventually see the request exit
in a non-suspended state and thus know to complete the response and
move on to parsing the next request.      But if we are split between
HttpConnection and HttpChannel, things like resuming a suspended
request are not IO events, so they have no business being dispatched
via HttpConnection.handle(); instead they will probably be a direct
dispatch to HttpChannel.handle().     This in turn means that when the
HttpChannel knows the request/response cycle is complete, it will have
to call back to the HttpConnection to complete the response.   THIS IS
A BIG CHANGE, because it means that instead of the same thread doing
the complete in a finally block, the complete will be triggered by a
call from a thread that may be entirely different from the one that
called HttpConnection.handle().    Thus we have to make the code
thread safe in areas where we have not had to before (although async
servlets are already pushing us this way a bit, and so are SPDY muxed
channels).
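
As a sketch of what that thread safety means in practice (purely illustrative, not the real classes), the completion ends up guarded by atomic state, because the completing thread may be the IO thread or whatever thread resumed a suspended request:

  import java.util.concurrent.atomic.AtomicBoolean;

  // Illustrative sketch: completion may be triggered by the dispatched IO thread or
  // by the thread that resumed a suspended request, so only the first caller wins.
  class SketchCompletion
  {
      private final AtomicBoolean completed = new AtomicBoolean(false);

      void complete()
      {
          if (completed.compareAndSet(false, true))
          {
              // finish generating the response, then move on to parsing the
              // next request on the connection
          }
      }
  }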

It also makes upgrade a bit more complex.  Currently we essentially do:

 Connection connection = new HttpConnection(...);
 while (notDone)
     connection = connection.handle();

This will become:

 Connection connection = new HttpConnection(...);
 while (notDone)
     connection = connection.handle();

 // where HttpConnection.handle() now does roughly:
 public Connection handle()
 {
     Channel nc = _channel.handle();
     if (_channel != nc)
         return newConnectionFor(nc);
     return this;
 }

Anyway, that's enough brain farting for one email.  Just wanted to
explain some of the jetty-9 commits that you might have seen go by.
The stable branches remain 7/8 and they will continue to be developed
and improved for some time.  I hope 9 will be ready before Servlet 3.1
is out.... but we will see.
All/any feedback, comments etc. are welcome.

regards

Re: [jetty-dev] Jetty-9 development

Greg Wilkins-3

To continue my flow of consciousness brain dump on jetty-9.....


Today I'm considering refactoring the  dispatching of the EndPoints.

Currently the SelectorManager (as part of the connector) runs an NIO select set, and when it selects an endpoint for IO activity it calls the EndPoint#schedule method, which does (as sketched just after this list):
 
* If there is a thread blocked waiting to read/write and if the endpoint is now readable/writable, then wake up the thread and return.
* If we are not already dispatched, then dispatch a thread to run SelectChannelEndPoint#handle
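
A minimal sketch of that schedule logic (illustrative only, not the actual SelectChannelEndPoint code):

  import java.util.concurrent.Executor;

  // Illustrative sketch of the EndPoint#schedule logic described above.
  class SketchSelectableEndPoint
  {
      private boolean blocked;      // a thread is parked in a blockForXxx method
      private boolean dispatched;   // a thread is already running handle() for this endpoint

      // Called by the SelectorManager when the select set says we are readable/writable.
      synchronized void schedule(Executor threadPool)
      {
          if (blocked)
          {
              notifyAll();          // wake the blocked reader/writer and let it carry on
              return;
          }
          if (!dispatched)
          {
              dispatched = true;    // cleared again when handle() returns (not shown)
              threadPool.execute(new Runnable()
              {
                  public void run()
                  {
                      handle();     // normally ends up in AsyncHttpConnection#handle
                  }
              });
          }
      }

      void handle()
      {
          // connection.handle() ... may park in a blockForXxx method, setting 'blocked'
      }
  }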

SelectChannelEndPoint#handle calls the Connection#handle method, which is normally AsyncHttpConnection#handle.  That calls the parser to parse the request, which then calls the handlers to handle the request.   These handlers will read/write input/output to/from the request/response in a blocking manner.  If a read/write blocks, then the calling thread (which is the thread dispatched by the EndPoint to call EndPoint#handle) is parked in the EndPoint blockForXxx methods.  When the endpoint becomes readable/writable, its schedule method is again called by the SelectorManager, which finds the blocked thread and wakes it up.   The read/write then continues by calling the parser/generator to frame the input/output as HTTP.   This means that the parser is reentrant; the call stack is:


SelectorManager->EndPoint#schedule->dispatch(handle)
EndPoint#handle->Connection#handle->HttpParser#parse->Connection#handleRequest
    ->Server#service->HandlerOrServlet->Request.getInputStream().read()
    ->HttpParser#parse->EndPoint#blockForInput

SelectorManager->EndPoint#schedule->(wakeup blocked thread)



The reentrancy is a little complex, but it does make for very efficient waking up of blocked readers and writers.  However, now with async servlets and websockets, the readers/writers may be threads other than the dispatched thread, so the parser has to be reentrant and thread safe, which is tougher, more complex and requires slower locks.     Also, with the MUX that is needed for SPDY and websocket, the thread that parses the frames won't always be the thread that calls the handlers.


So for Jetty-9, I'm  reconsidering this design to see if we can do something simpler and/or more maintainable, but at least as fast.


I've already discussed how I'm splitting the AsyncHttpConnection into HttpConnection and HttpChannel, so the basic calling stack is going to be something like:

SelectorManager->EndPoint#schedule->dispatch(handle)
EndPoint#handle->HttpConnection#canRead->HttpParser#parse->HttpChannel#setXyz
                                                         ->HttpChannel#handleRequest
    ->Server#service->HandlerOrServlet->Request.getInputStream().read()->HttpChannel#blockForContent

SelectorManager->EndPoint#schedule->dispatch(handle)
EndPoint#handle->HttpConnection#canRead->HttpParser#parse->HttpChannel#handleContent->(wakeup blocked thread)


So you can see that the parser is neither reentrant nor multi-threaded.  It will always be called from the thread that is dispatched by the SelectorManager schedule call.   But because of this, we need another dispatch, so that 3 threads are now involved rather than 2. This is because previously we could use the selector thread to wake up the blocked thread, and it would do the parsing.  But now we can't get the selector thread to do the parsing (too long), so it has to dispatch another thread to do the parsing of the data, which will then wake up the blocked thread by passing the content to the HttpChannel.
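
Concretely, the hand-off between the parsing thread and a blocked reader would be something like this sketch (illustrative names, not the real HttpChannel code):

  import java.nio.ByteBuffer;

  // Illustrative sketch of the new hand-off: the thread dispatched by the selector
  // parses and calls handleContent(), which wakes an application thread blocked in
  // blockForContent().  The parser itself is only ever called by the dispatched thread.
  class SketchHttpChannelInput
  {
      private final Object lock = new Object();
      private ByteBuffer content;           // content parsed but not yet consumed

      // Called by the parsing thread (dispatched via EndPoint#handle).
      void handleContent(ByteBuffer chunk)
      {
          synchronized (lock)
          {
              content = chunk;
              lock.notifyAll();             // wake any blocked reader
          }
      }

      // Called by the application thread doing a blocking read.
      ByteBuffer blockForContent() throws InterruptedException
      {
          synchronized (lock)
          {
              while (content == null)
                  lock.wait();              // parked until the parsing thread delivers content
              ByteBuffer chunk = content;
              content = null;
              return chunk;
          }
      }
  }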

So this extra dispatch feels a little expensive... but I don't think it is.  It only happens when we have blocked on the network, so we are already looking at tens or hundreds of ms of latency while the network flow control works out when to continue, so the time for an extra dispatch is small.  Also, in SPDY and MUX websocket we would have this dispatch anyway, because we have a 1-to-N relationship between SpdyConnection and SpdyChannels (for example).

Note also that this puts the blocking logic in the Channel and that will simplify the connections, connectors and endpoints.

Also, I think that we can have separate canRead and canWrite callbacks, as the handling could be different.    We have to work out which thread(s) are going to be calling the generator and then flushing the resulting buffers.  Will it be the threads that call the outputStream.write methods, or another thread dispatched to the connection via canWrite?  Or is it the thread that called handleRequest, completing the response after it returns (this is the canRead thread)?  I have to think about that one a bit more.

cheers

Re: [jetty-dev] Jetty-9 development

Greg Wilkins-3
.... and also because SPDY and websocket MUX both have flow control, it is no longer OK to have the blocking in the EndPoint, as that would block multiple Channels.   The blocking has to be in the Channel itself.


