Re: FW: avoiding earlyEOF

Re: FW: avoiding earlyEOF

Shawn Heisey
On 6/20/2018 11:23 PM, Robben, Bert wrote:
> The problem that we face is that we regularly see IOExceptions
> occurring in the communication between these components. See
> stacktrace below. These IOExceptions always contain the earlyEOF. A long
> time ago I already posted a similar message on this forum (see
> https://dev.eclipse.org/mhonarc/lists/jetty-users/msg07965.html). I
> followed the advice mentioned, upgraded to the latest version and
> explicitly set a different idleTimeout on both the server and the client.

I'm not an expert by any means.  Starting off with that in case I say
something wrong.

I come from the Apache Solr project, which uses Jetty.  We see
EOFException quite a lot on the solr-user mailing list.

In my experience with Solr, this is almost always caused by the client
disconnecting (usually due to TCP socket timeout) before the server has
completed the request, because the user has set their socket timeout too
low on the client.  When the server finally does try to respond,
EOFException is logged, because the connection is gone.  If the client
is the one logging the exception, then the server may have closed the
connection, followed by the client trying to send more data, and failing.
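
To make the mechanism concrete, here is a minimal sketch using plain `java.net` sockets rather than Jetty. It assumes a reader that expects a fixed number of bytes and a peer that closes the connection after sending only part of them, which is roughly the situation an "early EOF" describes. The class and method names (`EarlyEofDemo`, `readHitsEarlyEof`) are made up for illustration and are not part of Jetty or Solr.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class EarlyEofDemo {

    /**
     * Accepts one loopback connection and tries to read {@code expected} bytes
     * while the peer sends only {@code sent} bytes and then closes.
     * Returns true if the read ended with an EOFException.
     */
    static boolean readHitsEarlyEof(int expected, int sent) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            server.setSoTimeout(5_000); // do not hang forever if the peer never connects

            // The peer: connects, writes fewer bytes than promised, then closes.
            Thread peer = new Thread(() -> {
                try (Socket s = new Socket(server.getInetAddress(), server.getLocalPort());
                     OutputStream out = s.getOutputStream()) {
                    out.write(new byte[sent]);
                    out.flush();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            peer.start();

            // The reader: insists on the full payload and fails if the stream ends first.
            try (Socket accepted = server.accept();
                 DataInputStream in = new DataInputStream(accepted.getInputStream())) {
                in.readFully(new byte[expected]);
                return false;
            } catch (EOFException e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("peer sent 4 of 10 bytes, early EOF: " + readHitsEarlyEof(10, 4));
        System.out.println("peer sent 10 of 10 bytes, early EOF: " + readHitsEarlyEof(10, 10));
    }
}
```

The side that logs the exception is the side that was still expecting data, which is why it matters whether the client or the server reports it.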

I am not seeing any direct mention of what you are setting the idle
timeout to, but I do see this in your message:  "so it’s clear to me
that nothing is idle for a second".  In my opinion, setting socket or
idle timeouts to low numbers is asking for problems.  Going as low as
one second will be extremely likely to lead to timeout issues.

I do understand the value of having these timeouts, but the timeout
needs to be significantly longer than you expect the requests to
actually take, because there may be situations where requests do take
longer than you expect them to.

Java software is prone to experiencing noticeable GC pauses, especially
as the heap size grows.  I have seen pauses of 10-15 seconds happen with
an 8GB heap unless garbage collection is extensively tuned to avoid full
GC.  For Solr, an 8GB heap could actually be quite small.

If there is no GC tuning beyond setting which collector to use, even the
G1 collector, which is Oracle's best option for low-pause collection,
will not avoid full GCs effectively.  It is the full GC that causes the
most problems with pauses.  GC tuning is very much an art form, and
settings that work well for one application may produce awful results on
another.
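
Before tuning anything, it can help to check whether GC is a plausible suspect at all. The sketch below reads the standard `java.lang.management` beans from inside the JVM. Note that `getCollectionTime()` is a cumulative figure per collector, not the length of any single pause, so it only hints at whether long pauses are likely. The class name `GcStats` is made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {

    /** Sums the cumulative collection time in milliseconds over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 means "undefined for this collector"
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("total GC time so far (ms): " + totalGcMillis());
    }
}
```

Sampling these counters periodically and logging the difference gives a rough picture of how much time each interval lost to collection.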

Let's say that you expect all requests to complete in 10 milliseconds or
less.  So you set your timeout to 1 second, thinking that's always going
to be plenty of time.  But then your application fills up its 2GB heap
right in the middle of handling one of those requests, and the resulting
garbage collection pauses the JVM for two seconds.  The entity at the
other end of the connection is going to give up and close the connection
before the program experiencing the GC pause can respond.  Tuning
garbage collection to reduce GC pauses is certainly a good idea, but if
the timeout were 10 seconds instead of one second, it probably would not
have had any problem.
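
The scenario above can be reproduced in miniature with plain sockets. This is a sketch, not Jetty code: a `Thread.sleep` stands in for the GC pause, the durations are scaled down so it runs quickly, and the names (`TimeoutMarginDemo`, `clientGetsAnswer`) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class TimeoutMarginDemo {

    /**
     * Starts a one-shot server that stalls for stallMillis (standing in for a GC
     * pause) before answering, then reads the answer with the given read timeout.
     * Returns true if the client got its answer, false if it timed out first.
     */
    static boolean clientGetsAnswer(int stallMillis, int readTimeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread stalled = new Thread(() -> {
                try (Socket s = server.accept()) {
                    Thread.sleep(stallMillis);       // the simulated pause
                    s.getOutputStream().write(42);   // the late response
                } catch (IOException | InterruptedException e) {
                    // The client may already be gone; nothing to do in this sketch.
                }
            });
            stalled.start();

            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setSoTimeout(readTimeoutMillis);
                return client.getInputStream().read() == 42;
            } catch (SocketTimeoutException e) {
                return false; // the client gave up before the server responded
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Timeout shorter than the stall: the client gives up.
        System.out.println("stall 400 ms, timeout 100 ms: " + clientGetsAnswer(400, 100));
        // Timeout with a generous margin: the same kind of stall is harmless.
        System.out.println("stall 50 ms, timeout 2000 ms: " + clientGetsAnswer(50, 2000));
    }
}
```

The point of the margin is that the timeout only has to exceed the worst stall you can plausibly hit, not the typical request time.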

Thanks,
Shawn
_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users

Re: avoiding earlyEOF

Steven Schlansker

> On Jun 22, 2018, at 9:56 AM, Shawn Heisey <[hidden email]> wrote:
>
> On 6/20/2018 11:23 PM, Robben, Bert wrote:
>> The problem that we face is that we regularly see IOExceptions
>> occurring in the communication between these components.

> Let's say that you expect all requests to complete in 10 milliseconds or
> less.  So you set your timeout to 1 second, thinking that's always going
> to be plenty of time.  But then your application fills up its 2GB heap
> right in the middle of handling one of those requests, and the resulting
> garbage collection pauses the JVM for two seconds.  The entity at the
> other end of the connection is going to give up and close the connection
> before the program experiencing the GC pause can respond.  Tuning
> garbage collection to reduce GC pauses is certainly a good idea, but if
> the timeout were 10 seconds instead of one second, it probably would not
> have had any problem.
You can (and should!) explicitly monitor these conditions.  The JVM provides interesting
diagnostics output through JMX to monitor it, or you can directly measure:

https://github.com/opentable/otj-pausedetector

I run this in *every* application -- unexpected pauses cause all sorts of troubles,
monitoring it is cheap, and you'll save hours when you have a big warning
"hey, the JVM went to lunch for 30 seconds here, that might be why all this stuff broke"
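
For readers who want to see what "directly measure" can look like, here is a minimal sketch of one common approach: a background thread sleeps for a short, fixed interval and reports how much longer than requested it actually slept. This is not the otj-pausedetector implementation; the class `PauseWatcher`, its two parameters (sleep interval and warning threshold, both in milliseconds) and its output format are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

class PauseWatcher implements AutoCloseable {
    private final long sleepMillis;
    private final long warnMillis;
    private final AtomicLong worstOvershootMillis = new AtomicLong();
    private final Thread thread;
    private volatile boolean running = true;

    PauseWatcher(long sleepMillis, long warnMillis) {
        this.sleepMillis = sleepMillis;
        this.warnMillis = warnMillis;
        this.thread = new Thread(this::watch, "pause-watcher");
        this.thread.setDaemon(true); // never keep the JVM alive just for monitoring
    }

    PauseWatcher start() {
        thread.start();
        return this;
    }

    private void watch() {
        while (running) {
            long before = System.nanoTime();
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                return; // close() was called
            }
            // Anything beyond the requested sleep is time this thread could not run.
            long overshoot = (System.nanoTime() - before) / 1_000_000 - sleepMillis;
            worstOvershootMillis.accumulateAndGet(overshoot, Math::max);
            if (overshoot >= warnMillis) {
                System.err.printf("JVM appears to have stalled for about %d ms%n", overshoot);
            }
        }
    }

    /** Largest overshoot seen so far, in milliseconds (never negative). */
    long worstOvershootMillis() {
        return worstOvershootMillis.get();
    }

    @Override
    public void close() {
        running = false;
        thread.interrupt();
    }
}
```

A typical use would wrap the application's lifetime, for example `try (PauseWatcher watcher = new PauseWatcher(100, 400).start()) { ... }`. The overshoot covers any stall that keeps the thread from running, whether it comes from GC, swapping, or a starved CPU, so a warning is a prompt to investigate rather than proof of a GC problem.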



Re: avoiding earlyEOF

Bill Ross-2

Is there a simple way to incorporate this in a start.jar from script startup? If not, would it be worth building into jetty?

Thanks,

Bill

> https://github.com/opentable/otj-pausedetector

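public class MyCoolApp {
  public static void main(String[] args) {
    // try-with-resources needs a declared resource variable. This assumes
    // start() returns the closeable JvmPauseAlarm, as the snippet implies.
    try (JvmPauseAlarm alarm = new JvmPauseAlarm(100, 400).start()) {
      runMyCoolApp();
    }
  }
}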


Re: avoiding earlyEOF

Joakim Erdfelt-8
Jetty start uses Jetty XML.
Jetty XML doesn't support try-with-resources, or try-catch.
Also, Jetty XML expects an Object to be configured and returned.

Joakim Erdfelt / [hidden email]


Re: avoiding earlyEOF

Steven Schlansker
The particular library I included was just an example;
if there's a simple way to make it drop nicely into Jetty, that'd be totally great by us.
We embed all our Jettys, so making it work nicely with XML-based deployments was never a goal for us.


Re: avoiding earlyEOF

Robben, Bert
Thanks guys for trying to help me.

Some more info.

(1) It is always the client logging the timeout.
(2) The server timeout settings are configured to about 34 seconds.

for (Connector con : server.getConnectors()) {
    if (con instanceof AbstractConnector) {
        // Idle timeout in milliseconds.
        ((AbstractConnector) con).setIdleTimeout(34321);
    }
}

(3) We monitor activity and GC. There are no long pauses (certainly none longer than 2-3 seconds). The app continues processing all the time without pausing.

The track that we're currently investigating is the role of the proxy. Because we run our app in DC/OS, all traffic is routed through HAProxy: the HTTP client connects to HAProxy, and HAProxy forwards the request to the HTTP server. It may be that HAProxy is not properly configured in this setup. See https://stackoverflow.com/questions/44204603/marathon-lb-not-returning-keep-alive-headers and https://stackoverflow.com/questions/21550337/haproxy-netty-way-to-prevent-exceptions-on-connection-reset/40005338#40005338. As I understand it, because the proxy sits in between, an incorrect proxy configuration could also cause connections to be closed unexpectedly. That seems especially plausible given that the connections are long-lived (the clients keep sending one message after another to the same server).

Tbc,

Bert Robben




