Flow control in AsyncMiddleManServlet

Flow control in AsyncMiddleManServlet

Raoul Duke
Hi,

I'm using AsyncMiddleManServlet to proxy plain HTTP clients to an upstream SSL server and wanted to ask your help in gaining an understanding of how flow control works and which configuration parameters influence it.

So:
* clients send plain HTTP/1.1 to the proxy
* the proxy then does SSL encryption/decryption to the upstream server (also HTTP/1.1)

Typically the local clients are near the proxy (e.g. on a LAN) and there is throttled bandwidth / latency to the upstream server.

What I seem to be seeing (anecdotally) is that under high load the cost of the SSL encryption to the upstream eventually maxes out the available CPU and causes clients to time out, which is of course to be expected at some level of load.

To try to zero in on the above I'm just trying to understand the basic flow control workflow. Let's say I had 1000 clients all concurrently sending large HTTP PUTs (large files, 10MB say) to the proxy.

My assumptions are as follows (please correct any that are wrong, as that would be a big help):
* a single socket read of $chunk_size will be performed on each client connection, up to the max read size (which seems to be 4K by default, but correct me if I'm wrong)
* each read from above would then have to be written to the upstream connection (which in my case will have SSL enabled) on a one-to-one basis, i.e. one read of $chunk_size will have to be written to the upstream /before/ the next read can happen on the client socket
* /OR/ is it the case that many reads can happen on the client socket which are then buffered / queued in the proxy, meaning a fast sender and a slow upstream can end up with bloat at this level? if this is the case, which configuration parameters will influence this?
* in the case of these reads/writes it looks like Jetty has a pool of threads, but I assume a thread is only occupied for the length of time the I/O operation takes and is not in any sense blocked on one connection. So even if there were only one thread for handling upstream connections, it could still handle (say) 100 client connections by polling for reads/writes, as something like libevent would do in the C world.

The above is just a basic sketch to give a rough idea of my current crude/flawed mental model, and hopefully serves as a way for people with knowledge to fill in some blanks for me. It may be that some of my assumptions are way off base, so please correct me where I'm wrong. Like I said, I'm just trying to get a broad-brush understanding to help with debugging.

All feedback welcome/appreciated.

Thanks,
RD

Re: Flow control in AsyncMiddleManServlet

Simone Bordet-3
Hi,

On Tue, Feb 19, 2019 at 2:11 AM Raoul Duke <[hidden email]> wrote:
> to try to zone in on the above I'm just trying to understand the basic flow control workflow.  lets say I had (say) 1000 clients all sending large HTTP PUTs to the proxy concurrently with large files (10MB, say).
>
> my assumptions are as follows (please correct any that are wrong as that would be a big help):
> * a single socket read of $chunk_size will be performed on each client connection to the extent of the max read size.  (which seems to be 4K by default but correct me if I'm wrong)

Correct.

Note that if your clients send concurrent requests, Jetty at the proxy
will allocate a number of threads to serve those requests
concurrently, where the number of threads depends on the thread pool
configuration.
So if you have the default thread pool of 200 threads, and you have
100 concurrent clients, Jetty will allocate 100 threads (or so) to
serve the concurrent requests.
Note also that AsyncMiddleManServlet is completely asynchronous, so if
those requests upload large files, Jetty will read the request content
asynchronously.
This means that if the clients are "slow" to send content, Jetty will
return idle threads to the thread pool and only use threads when there
is content available.
Think of it as Jetty using the minimum number of threads to read the
requests, but with the largest parallelism possible.
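
To make the "only use threads when there is content available" part
concrete: this is roughly the Servlet 3.1 async-read pattern the proxy
relies on, sketched as a plain servlet (illustrative only, not the
actual proxy code; the servlet name and buffer size are made up):

import java.io.IOException;
import javax.servlet.AsyncContext;
import javax.servlet.ReadListener;
import javax.servlet.ServletInputStream;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class AsyncReadSketchServlet extends HttpServlet {
    @Override
    protected void doPut(HttpServletRequest request, HttpServletResponse response) throws IOException {
        AsyncContext asyncContext = request.startAsync();
        ServletInputStream input = request.getInputStream();
        input.setReadListener(new ReadListener() {
            private final byte[] buffer = new byte[4096];

            @Override
            public void onDataAvailable() throws IOException {
                // A pool thread runs here only while content is ready; when
                // isReady() returns false the thread goes back to the pool
                // and is re-dispatched when more content arrives.
                while (input.isReady() && !input.isFinished()) {
                    int read = input.read(buffer);
                    // ... hand off 'read' bytes towards the upstream here ...
                }
            }

            @Override
            public void onAllDataRead() {
                asyncContext.complete();
            }

            @Override
            public void onError(Throwable failure) {
                asyncContext.complete();
            }
        });
    }
}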

> * each read from above would  then have to be written to the upstream connection (which in my case will have SSL enabled) on a one-to-one basis i.e. one read of $chunk_size will have to be then written to the upstream /before/ the next read can happen on the client socket
> * /OR/ is it the case that many reads can happen on the client socket which are then buffered / queued in the proxy meaning a fast sender and slow upstream can end up with bloat at this level?  if this is the case - then which configuration parameters will infleuence this?

AsyncMiddleManServlet will read a chunk of content, then pass the pair
(chunk, callback) to the application, then stop reading (for that
request).
It is the application that decides _when_ to complete the callback.
When the callback is completed, Jetty will resume reading.
The application can pass the callback to the proxy-to-server side for
the write, and the callback will be completed when the write is
completed.
This allows you to completely control the backpressure towards the client.
If you just want to forward the content, then you just pass the
(chunk, callback) pair to the writing side and you're good: the write
side will be slower to write because of TLS, and the reads will be
slowed down via backpressure, all the way to the client.
If you need to transform the content, then you need to manage the
callback completion yourself at the application level (e.g. maybe
buffer a couple of (chunk, callback) pairs, then transform them, then
write them as a single unit, and when the write is finished succeed
both callbacks).
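
In code, the application-level hook AsyncMiddleManServlet exposes for
the client-to-proxy content is a ContentTransformer returned from
newClientRequestContentTransformer(); returning the identity
transformer gives the pure pass-through/backpressure behaviour above.
A rough sketch of the buffering case (class name and buffering policy
are purely illustrative):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import javax.servlet.http.HttpServletRequest;
import org.eclipse.jetty.client.api.Request;
import org.eclipse.jetty.proxy.AsyncMiddleManServlet;

public class BufferingProxyServlet extends AsyncMiddleManServlet {
    @Override
    protected ContentTransformer newClientRequestContentTransformer(HttpServletRequest clientRequest, Request proxyRequest) {
        // Default behaviour (no transformation, pure backpressure) would be:
        // return ContentTransformer.IDENTITY;
        return new ContentTransformer() {
            private final List<ByteBuffer> chunks = new ArrayList<>();

            @Override
            public void transform(ByteBuffer input, boolean finished, List<ByteBuffer> output) throws IOException {
                // Copy the chunk, since the proxy may reuse the input buffer.
                ByteBuffer copy = ByteBuffer.allocate(input.remaining());
                copy.put(input).flip();
                chunks.add(copy);
                if (finished) {
                    // Only now hand everything to the upstream write.
                    output.addAll(chunks);
                    chunks.clear();
                }
            }
        };
    }
}

Note that buffering like this means the proxy may keep reading from the
client while nothing is being written upstream, i.e. you are trading
backpressure for proxy memory (the "bloat" case from your question).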

> * in the case of these reads/writes it looks like jetty has a pool of threads but I assume the thread is only occupied for the length of time the I/O operation takes and is not in any sense blocked on one connection.

Correct, AsyncMiddleManServlet is completely asynchronous.

>  so even if there was only one thread for handling upstream connection  it could still handle (say) 100 client connections by polling for reads/writes as something like libevent would do in the C world.

Behavior WRT the upstream depends on the configuration of HttpClient,
in particular its thread pool and its maxConnectionsPerDestination.
HttpClient will open at most maxConnectionsPerDestination connections
towards the server, so the parallelism you can obtain towards one
server will depend on that setting.
On the receiving side (receiving responses from the upstream servers),
HttpClient uses the same mechanism used by Jetty server: it will try
to read concurrently from as many connections as possible without
blocking - within the HttpClient thread pool limits.
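
To give an idea, the upstream side is typically tuned by overriding
newHttpClient() in your AsyncMiddleManServlet subclass. A rough sketch
(Jetty 9.4 APIs; the class name and all numbers are placeholders, not
recommendations):

import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.proxy.AsyncMiddleManServlet;
import org.eclipse.jetty.util.ssl.SslContextFactory;
import org.eclipse.jetty.util.thread.QueuedThreadPool;

public class TunedProxyServlet extends AsyncMiddleManServlet {
    @Override
    protected HttpClient newHttpClient() {
        // TLS towards the upstream (trust configuration omitted; on older
        // 9.4.x releases use new SslContextFactory() instead of .Client()).
        HttpClient httpClient = new HttpClient(new SslContextFactory.Client());

        // Upper bound on parallel requests towards one upstream host:port.
        httpClient.setMaxConnectionsPerDestination(64);

        // Requests beyond that limit queue here.
        httpClient.setMaxRequestsQueuedPerDestination(1024);

        // Client-side thread pool, independent of the server's pool.
        QueuedThreadPool clientThreads = new QueuedThreadPool(32);
        clientThreads.setName("proxy-client");
        httpClient.setExecutor(clientThreads);

        return httpClient;
    }
}

IIRC most of these can also be set through the proxy servlet's init
parameters (maxConnections, maxThreads, etc.) if you prefer
configuration to code.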

Bottom line is that AsyncMiddleManServlet is completely asynchronous,
will try to be as parallel as possible but without ever blocking, and
that backpressure can be controlled by the application (and by default
there is no buffering but backpressure is always applied).

Hope this helped.

--
Simone Bordet
----
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.

Re: Flow control in AsyncMiddleManServlet

Raoul Duke


Hi Simone,

On Tue, Feb 19, 2019 at 12:32 PM Simone Bordet <[hidden email]> wrote:
[...]
> This means that if the clients are "slow" to send content, Jetty will
> return idle threads to the thread pool and only use threads when there
> is content available.

What happens in the case where there are (say) 100 threads and 100 concurrent clients all sending furiously (at LAN speed) for very large uploads?
It looks like the thread will spin in a "while" loop while there is still more data to read. If that is correct, couldn't all 100 threads be occupied by those long-lived and very fast uploads, such that concurrent client 101 is frozen out of getting its upload payload shunted?

It wouldn't be my expectation that it would work as above, but I see some evidence that it does. Are there configuration parameters I should be checking here that would influence this?


> Hope this helped.

It does help. Thanks so much for taking the time to respond.

RD.
 


Re: Flow control in AsyncMiddleManServlet

Simone Bordet-3
Hi,

On Tue, Feb 19, 2019 at 4:48 PM Raoul Duke <[hidden email]> wrote:
> What happens in the case where there are (say) 100 threads and 100 concurrent clients all sending furiously (at LAN speed) for very large uploads.
> it looks like the thread will spin in a "while" loop while there is still more data to read.  so if that is correct then couldn't all 100 threads be occupied without those
> long lived and very fast uploads such that concurrent client 101 is frozen out of getting its upload payload shunted?

You have 100 clients _trying_ to upload at network speed.
Let's assume we have exactly 100 threads available in the proxy to
handle them concurrently.
Each thread will read a chunk and pass it to the slow write to the
server. The write is non-blocking so it may take a long time but won't
block any thread.
If the write completes synchronously, the thread will finish the write
and go back to read another chunk, and so on.
If the write goes asynchronous (i.e. it remains pending), the thread
will return to the thread pool and will be able to handle another
client.

Chances are that in your setup the 100 clients will eventually be
slowed down by TCP congestion, and therefore won't be able to upload
at network speed.
This is because the proxy is not reading fast enough, since it has to
do slow writes.

The moment one I/O operation on the proxy goes asynchronous (i.e. a
read of 0 bytes, or a write of fewer bytes than expected), the thread
will go back to the thread pool and potentially be available for
another client.

In the perfect case, all 100 threads will be busy reading and writing
so the 101st client will be a job queued in the thread pool waiting
for a thread to be freed.
In the real case, I expect that some I/O operation or some scheduling
imbalance (I assume you don't have 100 hardware cores on the server)
will make one thread available to serve the 101st client before all
the previous 100 are finished.
E.g. client #13 finishes first so its thread will be able to serve
client #101 while the other 99 are still running.

For HTTP/1.1, your knob is the thread pool max size: the larger it is,
the more concurrent clients you should be able to handle.
The smaller it is, the more you are queueing on the proxy and
therefore pushing back on the clients (because you won't read them).
If it is too small, the queueing on the proxy may be so large that a
client may timeout before the proxy has the chance to read from it.
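
For example, in an embedded setup (the values are placeholders that
just show the knob; in a jetty-home/distribution setup the equivalent
is the jetty.threadPool.maxThreads property):

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.util.thread.QueuedThreadPool;

public class ProxyServerSetup {
    public static void main(String[] args) throws Exception {
        // maxThreads=400, minThreads=8, idleTimeout=60s: more max threads
        // means more clients can make progress concurrently.
        QueuedThreadPool serverThreads = new QueuedThreadPool(400, 8, 60_000);
        serverThreads.setName("proxy-server");

        Server server = new Server(serverThreads);
        // ... add connector and the context holding AsyncMiddleManServlet ...
        server.start();
        server.join();
    }
}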

Alternatively, you can configure the proxy with the QoSFilter, which
limits the number of requests that are served concurrently.
Or you can use AcceptRateLimit or ConnectionLimit to throttle things
at the TCP level.
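
Roughly, for an embedded setup (both limits below are placeholders):

import java.util.EnumSet;
import javax.servlet.DispatcherType;
import org.eclipse.jetty.server.ConnectionLimit;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.FilterHolder;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlets.QoSFilter;

public class ProxyThrottlingSetup {
    public static void configure(Server server, ServletContextHandler proxyContext) {
        // QoSFilter (from jetty-servlets): cap concurrently served requests;
        // excess requests are suspended until a slot frees up, not rejected.
        FilterHolder qos = new FilterHolder(QoSFilter.class);
        qos.setAsyncSupported(true);
        qos.setInitParameter("maxRequests", "100");
        proxyContext.addFilter(qos, "/*", EnumSet.of(DispatcherType.REQUEST, DispatcherType.ASYNC));

        // ConnectionLimit: cap accepted TCP connections across the server.
        server.addBean(new ConnectionLimit(500, server));
    }
}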

--
Simone Bordet
----
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.

Re: Flow control in AsyncMiddleManServlet

Raoul Duke
Hi Simone,

On Tue, Feb 19, 2019 at 5:37 PM Simone Bordet <[hidden email]> wrote:
[...]
> For HTTP/1.1, your knob is the thread pool max size: the larger it is,
> the more concurrent clients you should be able to handle.
> The smaller it is, the more you are queueing on the proxy and
> therefore pushing back on the clients (because you won't read them).
[...]
> Or you can use AcceptRateLimit or ConnectionLimit to throttle things
> at the TCP level.

Thanks for the superb write-up. It really gives a lot of context I was missing, and avenues to explore. I will do some further experiments/analysis, try to qualify my findings a bit better, and follow up again if I have more questions/observations.

One follow-up question: it occurred to me that it may be an option to use HTTP/2 on the upstream connection, and I wanted to ask whether that would be in any way helpful in my situation. For example: would it be reasonable to expect less contention for the (conceptually discussed) 100 upstream threads if those requests could be multiplexed over existing backend connections (rather than each connection "blocking" head-of-line on a particular PUT operation to complete before the connection can be used by another PUT, as in HTTP/1.1)?

RD
 


Re: Flow control in AsyncMiddleManServlet

Simone Bordet-3
Hi,

On Wed, Feb 20, 2019 at 12:59 AM Raoul Duke <[hidden email]> wrote:
> One follow-up question.  It occured to me that it may be an option to use HTTP/2 on the upstream connection and I wanted to ask you if that would be in any way helpful in the situation I am in?  for example: would it be reasonable to expect there would be less contention for the (conceptually discussed) 100 upstream threads if those requests could be multiplexed over existing backend connections (rather than each connection being head-of-line "blocking" on a particular PUT operation to complete before the connection can be used by another PUT as in HTTP/1.1).
>

HTTP/2 is a beast on its own.

While it is true that you can multiplex requests onto a single
connection, it is also true that you have to play nice with HTTP/2
flow control. This is usually a huge bottleneck, but if you can
control both the server and the client, you can enlarge their flow
control windows to fit your needs.
I would not recommend using just one connection to the server: we
have seen good effects when we can parallelize the HTTP/2 traffic onto
multiple connections. Certainly not the number of connections that
HTTP/1.1 requires, but there is no point in insisting on having just
one.
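
To make that concrete, a rough sketch of the proxy's upstream
HttpClient over HTTP/2 (Jetty 9.4 APIs from memory, so double-check
against your version; window sizes and connection counts are just
placeholders):

import org.eclipse.jetty.client.HttpClient;
import org.eclipse.jetty.http2.client.HTTP2Client;
import org.eclipse.jetty.http2.client.http.HttpClientTransportOverHTTP2;
import org.eclipse.jetty.proxy.AsyncMiddleManServlet;
import org.eclipse.jetty.util.ssl.SslContextFactory;

public class Http2UpstreamProxyServlet extends AsyncMiddleManServlet {
    @Override
    protected HttpClient newHttpClient() {
        HTTP2Client http2Client = new HTTP2Client();
        // The client's receive windows matter for response bodies; for the
        // upload (PUT) direction it is the upstream server's receive windows
        // that throttle, so those must be enlarged on the server side.
        http2Client.setInitialSessionRecvWindow(8 * 1024 * 1024);
        http2Client.setInitialStreamRecvWindow(4 * 1024 * 1024);

        HttpClientTransportOverHTTP2 transport = new HttpClientTransportOverHTTP2(http2Client);
        HttpClient httpClient = new HttpClient(transport, new SslContextFactory.Client());

        // Spread the multiplexed streams over a few connections, not just one.
        httpClient.setMaxConnectionsPerDestination(4);
        return httpClient;
    }
}

Remember that h2 over TLS also needs ALPN available on the client side.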

HTTP/2 also typically requires stronger encryption, so you may have to
pay some additional cost there.

For your case, if your numbers are really 100 concurrent requests,
then you may be OK multiplexing them onto one or a few connections.
In general, however, both an HTTP/2 client and an HTTP/2 server will
impose a limit on the max number of concurrent streams, so you may
still hit head-of-line blocking if you saturate them all - though much
later than with HTTP/1.1.
Imagine you can multiplex 50 requests per connection and you have 3
connections available: you start queueing at the 151st concurrent
request (while with HTTP/1.1 you would be queueing at the 4th
concurrent request).

In general we have seen better performance, reduced CPU usage and
fewer resources when using HTTP/2 in a proxy (or similar) scenario, so
we typically recommend it, although it needs more careful tuning.

--
Simone Bordet
----
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.