Problem with ClosedByInterruptException

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Problem with ClosedByInterruptException

John Gardiner Myers
I've been having problems with some Jetty client (embedded) instances
encounter frequent ClosedByInterruptException errors attempting to query
a different service.

I have four instances in our testing environment that have gotten into
this state. Three are Jetty 9.3.7, one is Jetty 9.3.6. All are either
Java 8u72 or 8u73.

The stack trace is:

java.nio.channels.ClosedByInterruptException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:753)
        at org.eclipse.jetty.io.SelectorManager.finishConnect(SelectorManager.java:337)
        at org.eclipse.jetty.io.ManagedSelector.processConnect(ManagedSelector.java:341)
        at org.eclipse.jetty.io.ManagedSelector.access$900(ManagedSelector.java:56)
        at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processSelected(ManagedSelector.java:278)
        at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:170)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume
.java:162)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:147)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
        at java.lang.Thread.run(Thread.java:745)

Looking at AbstractInterruptibleChannel.end(boolean) for 8u92, I see
what appears to be a bug in NIO. Once an AbstractInterruptibleChannel
interrupts on a given Thread, all subsequent calls to #end(boolean) from
that thread will throw ClosedByInterruptException. The line "interrupted
= null;" in that method has no effect and probably should have been
"this.interrupted = null;"

I'm not sure what the source of interruptions are, but I do see
TimeoutException("DNS timeout " + getTimeout() + " ms") exceptions from
within SocketAddressResolver.Async.resolve() being logged.

So my suspicion is that DNS timeouts might be causing worker threads to
get into a state where they cannot complete opening connections.

I'd appreciate any suggestions for how to diagnose the problem further
or otherwise proceed.


_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users
Reply | Threaded
Open this post in threaded view
|

Re: Problem with ClosedByInterruptException

Fabian Lange
Hi John,
while I cannot comment on your actual problem, I had a look at
AbstractInterruptibleChannel.end(boolean) and I agree with you that
the current implementation is probably broken.
May I ask you to raise an OpenJDK bug? http://bugreport.java.com/ or
at least check on a relevant Java dev mailing list

Fabian

On Sat, Nov 5, 2016 at 12:37 AM, John Gardiner Myers
<[hidden email]> wrote:

> I've been having problems with some Jetty client (embedded) instances
> encounter frequent ClosedByInterruptException errors attempting to query a
> different service.
>
> I have four instances in our testing environment that have gotten into this
> state. Three are Jetty 9.3.7, one is Jetty 9.3.6. All are either Java 8u72
> or 8u73.
>
> The stack trace is:
>
> java.nio.channels.ClosedByInterruptException
>         at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>         at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:753)
>         at
> org.eclipse.jetty.io.SelectorManager.finishConnect(SelectorManager.java:337)
>         at
> org.eclipse.jetty.io.ManagedSelector.processConnect(ManagedSelector.java:341)
>         at
> org.eclipse.jetty.io.ManagedSelector.access$900(ManagedSelector.java:56)
>         at
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processSelected(ManagedSelector.java:278)
>         at
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:170)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume
> .java:162)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:147)
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
>         at java.lang.Thread.run(Thread.java:745)
>
> Looking at AbstractInterruptibleChannel.end(boolean) for 8u92, I see what
> appears to be a bug in NIO. Once an AbstractInterruptibleChannel interrupts
> on a given Thread, all subsequent calls to #end(boolean) from that thread
> will throw ClosedByInterruptException. The line "interrupted = null;" in
> that method has no effect and probably should have been "this.interrupted =
> null;"
>
> I'm not sure what the source of interruptions are, but I do see
> TimeoutException("DNS timeout " + getTimeout() + " ms") exceptions from
> within SocketAddressResolver.Async.resolve() being logged.
>
> So my suspicion is that DNS timeouts might be causing worker threads to get
> into a state where they cannot complete opening connections.
>
> I'd appreciate any suggestions for how to diagnose the problem further or
> otherwise proceed.
>
>
> _______________________________________________
> jetty-users mailing list
> [hidden email]
> To change your delivery options, retrieve your password, or unsubscribe from
> this list, visit
> https://dev.eclipse.org/mailman/listinfo/jetty-users
_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users
Reply | Threaded
Open this post in threaded view
|

Re: Problem with ClosedByInterruptException

John Gardiner Myers
On 11/6/16 2:35 PM, Fabian Lange wrote:
> May I ask you to raise an OpenJDK bug?
It appears to be already filed as JDK-8065720. The fix doesn't look
destined to be backported to JDK 8.

_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users
Reply | Threaded
Open this post in threaded view
|

Re: Problem with ClosedByInterruptException

Simone Bordet-3
In reply to this post by John Gardiner Myers
Hi,

On Sat, Nov 5, 2016 at 12:37 AM, John Gardiner Myers
<[hidden email]> wrote:

> I've been having problems with some Jetty client (embedded) instances
> encounter frequent ClosedByInterruptException errors attempting to query a
> different service.
>
> I have four instances in our testing environment that have gotten into this
> state. Three are Jetty 9.3.7, one is Jetty 9.3.6. All are either Java 8u72
> or 8u73.
>
> The stack trace is:
>
> java.nio.channels.ClosedByInterruptException
>         at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>         at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:753)
>         at
> org.eclipse.jetty.io.SelectorManager.finishConnect(SelectorManager.java:337)
>         at
> org.eclipse.jetty.io.ManagedSelector.processConnect(ManagedSelector.java:341)
>         at
> org.eclipse.jetty.io.ManagedSelector.access$900(ManagedSelector.java:56)
>         at
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processSelected(ManagedSelector.java:278)
>         at
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:170)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume
> .java:162)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:147)
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
>         at java.lang.Thread.run(Thread.java:745)
>
> Looking at AbstractInterruptibleChannel.end(boolean) for 8u92, I see what
> appears to be a bug in NIO. Once an AbstractInterruptibleChannel interrupts
> on a given Thread, all subsequent calls to #end(boolean) from that thread
> will throw ClosedByInterruptException. The line "interrupted = null;" in
> that method has no effect and probably should have been "this.interrupted =
> null;"
>
> I'm not sure what the source of interruptions are, but I do see
> TimeoutException("DNS timeout " + getTimeout() + " ms") exceptions from
> within SocketAddressResolver.Async.resolve() being logged.

The next line after the promise is failed in
SocketAddressResolver.Async.resolve(), Jetty interrupts the thread
that triggered the DNS lookup, in the attempt to return it from what
it was doing back into the pool.
I did not know about this JDK bug.

> So my suspicion is that DNS timeouts might be causing worker threads to get
> into a state where they cannot complete opening connections.
>
> I'd appreciate any suggestions for how to diagnose the problem further or
> otherwise proceed.

I think you have identified the issue.
AFAIK, DNS lookups in Java don't timeout. If your DNS lookup hangs
forever, that thread is forever stuck there.
Jetty adds a DNS timeout, but that is not enough, since the thread
doing the DNS lookup will still be stuck.
So Jetty tries to interrupt it.
However, this hits the JDK bug.

We can make the interruption configurable (or perhaps removing it).

Would you be able to modify the code (or pass your own implementation
of SocketAddressResolver - basically the implementation in Async
without the interruption) and verify that without interruption things
work ?
What would the DNS thread do after the timeout expire ?

--
Simone Bordet
----
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.
_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users
Reply | Threaded
Open this post in threaded view
|

Re: Problem with ClosedByInterruptException

John Gardiner Myers


On 11/8/16 6:33 AM, Simone Bordet wrote:
> Would you be able to modify the code (or pass your own implementation
> of SocketAddressResolver - basically the implementation in Async
> without the interruption) and verify that without interruption things
> work ?
I can pass my own implementation of SocketAddressResolver, though I have
not done the work to reproduce the problem on demand. I would not be
able to confirm a fix with confidence in any reasonable amount of time.
> What would the DNS thread do after the timeout expire ?
>
My preferred workaround would be to have the thread terminate after
being interrupted. Then the pool could create a fresh thread which won't
trigger the bug. The thread won't be able to be GCed until the next DNS
timeout, but that's minor. You could condition the workaround on Java < 9.

And perhaps someone could let OpenJDK know the effect of JDK-8065720 is
more severe than they thought.

_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users
Reply | Threaded
Open this post in threaded view
|

Re: Problem with ClosedByInterruptException

Simone Bordet-3
In reply to this post by John Gardiner Myers
Hi,

On Sat, Nov 5, 2016 at 12:37 AM, John Gardiner Myers
<[hidden email]> wrote:

> I've been having problems with some Jetty client (embedded) instances
> encounter frequent ClosedByInterruptException errors attempting to query a
> different service.
>
> I have four instances in our testing environment that have gotten into this
> state. Three are Jetty 9.3.7, one is Jetty 9.3.6. All are either Java 8u72
> or 8u73.
>
> The stack trace is:
>
> java.nio.channels.ClosedByInterruptException
>         at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>         at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:753)
>         at
> org.eclipse.jetty.io.SelectorManager.finishConnect(SelectorManager.java:337)
>         at
> org.eclipse.jetty.io.ManagedSelector.processConnect(ManagedSelector.java:341)
>         at
> org.eclipse.jetty.io.ManagedSelector.access$900(ManagedSelector.java:56)
>         at
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer.processSelected(ManagedSelector.java:278)
>         at
> org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:170)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume
> .java:162)
>         at
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:147)
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
>         at java.lang.Thread.run(Thread.java:745)
>
> Looking at AbstractInterruptibleChannel.end(boolean) for 8u92, I see what
> appears to be a bug in NIO. Once an AbstractInterruptibleChannel interrupts
> on a given Thread, all subsequent calls to #end(boolean) from that thread
> will throw ClosedByInterruptException. The line "interrupted = null;" in
> that method has no effect and probably should have been "this.interrupted =
> null;"
>
> I'm not sure what the source of interruptions are, but I do see
> TimeoutException("DNS timeout " + getTimeout() + " ms") exceptions from
> within SocketAddressResolver.Async.resolve() being logged.
>
> So my suspicion is that DNS timeouts might be causing worker threads to get
> into a state where they cannot complete opening connections.
>
> I'd appreciate any suggestions for how to diagnose the problem further or
> otherwise proceed.

While looking again at this, I don't think it's a big problem, and
JDK-8065720 is probably indeed harmless.

So the DNS lookup fails, so the thread is interrupted and a
TimeoutException reported to the application.

If and when the DNS finally resolves, it will find that *for that
particular socket* the AbstractInterruptibleChannel.interrupted field
is set and throw ClosedByInterruptException.
Although the field is not reset, that socket instance is now dead, and
it was dead since the TimeoutException failed it.

So I don't think that the same pooled thread could affect other sockets.
I think what you were experiencing were just DNS failures.
My guess is that the notification of the TimeoutException raced with
the ClosedByInterruptException, so sometimes you see one, sometimes
the other.

--
Simone Bordet
----
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.
_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users
Reply | Threaded
Open this post in threaded view
|

Re: Problem with ClosedByInterruptException

John Gardiner Myers


On 11/17/16 8:38 AM, Simone Bordet wrote:
> So I don't think that the same pooled thread could affect other sockets.
> I think what you were experiencing were just DNS failures.
> My guess is that the notification of the TimeoutException raced with
> the ClosedByInterruptException, so sometimes you see one, sometimes
> the other.
>
The thing is the ClosedByInterruptException kept recurring until the
process was restarted, whereupon it stopped. That suggests something in
the process got wedged. It also seemed to affect only one of a few
HttpClient instances.

Could you give me a clue as to how I could find out which socket the
AbstractInterruptibleChannel might be and what code processes its
exceptions? From my reading of HttpClient.newConnection(...) the DNS
failure would be happening before it creates the socket for the HTTP
connection it is trying to open.

Thank you very much for the help.

_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users
Reply | Threaded
Open this post in threaded view
|

Re: Problem with ClosedByInterruptException

John Gardiner Myers
I should mention my hypothesis that this is some persistent DNS
connection, used by Java's DNS resolver code, which doesn't get closed
on ClosedByInterruptException.


On 11/17/16 10:01 AM, John Gardiner Myers wrote:

>
>
> On 11/17/16 8:38 AM, Simone Bordet wrote:
>> So I don't think that the same pooled thread could affect other sockets.
>> I think what you were experiencing were just DNS failures.
>> My guess is that the notification of the TimeoutException raced with
>> the ClosedByInterruptException, so sometimes you see one, sometimes
>> the other.
>>
> The thing is the ClosedByInterruptException kept recurring until the
> process was restarted, whereupon it stopped. That suggests something
> in the process got wedged. It also seemed to affect only one of a few
> HttpClient instances.
>
> Could you give me a clue as to how I could find out which socket the
> AbstractInterruptibleChannel might be and what code processes its
> exceptions? From my reading of HttpClient.newConnection(...) the DNS
> failure would be happening before it creates the socket for the HTTP
> connection it is trying to open.
>
> Thank you very much for the help.
>
> _______________________________________________
> jetty-users mailing list
> [hidden email]
> To change your delivery options, retrieve your password, or
> unsubscribe from this list, visit
> https://urldefense.proofpoint.com/v2/url?u=https-3A__dev.eclipse.org_mailman_listinfo_jetty-2Dusers&d=DwICAg&c=Vxt5e0Osvvt2gflwSlsJ5DmPGcPvTRKLJyp031rXjhg&r=q9XS-DioVZXN7OIaivO8bmd7sZlV6a51ELpGujR1Dlc&m=iZAygNd6K_GeXj4JJbmJ0JK9qtoP6tcCMBYd-QID6ag&s=aPY88mytL6GR9mIiGUoRlhd6j9H_yZLIkN3VWaGJx9s&e= 


_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users
Reply | Threaded
Open this post in threaded view
|

Re: Problem with ClosedByInterruptException

John Gardiner Myers
...But QueuedThreadPool would let the thread die after logging the
ClosedByInterruptException. So it must be something else wedging DNS
lookups. I'll continue investigating on my side.


On 11/17/16 10:50 AM, John Gardiner Myers wrote:

> I should mention my hypothesis that this is some persistent DNS
> connection, used by Java's DNS resolver code, which doesn't get closed
> on ClosedByInterruptException.
>
>
> On 11/17/16 10:01 AM, John Gardiner Myers wrote:
>>
>>
>> On 11/17/16 8:38 AM, Simone Bordet wrote:
>>> So I don't think that the same pooled thread could affect other
>>> sockets.
>>> I think what you were experiencing were just DNS failures.
>>> My guess is that the notification of the TimeoutException raced with
>>> the ClosedByInterruptException, so sometimes you see one, sometimes
>>> the other.
>>>
>> The thing is the ClosedByInterruptException kept recurring until the
>> process was restarted, whereupon it stopped. That suggests something
>> in the process got wedged. It also seemed to affect only one of a few
>> HttpClient instances.
>>
>> Could you give me a clue as to how I could find out which socket the
>> AbstractInterruptibleChannel might be and what code processes its
>> exceptions? From my reading of HttpClient.newConnection(...) the DNS
>> failure would be happening before it creates the socket for the HTTP
>> connection it is trying to open.
>>
>> Thank you very much for the help.
>>
>> _______________________________________________
>> jetty-users mailing list
>> [hidden email]
>> To change your delivery options, retrieve your password, or
>> unsubscribe from this list, visit
>> https://urldefense.proofpoint.com/v2/url?u=https-3A__dev.eclipse.org_mailman_listinfo_jetty-2Dusers&d=DwICAg&c=Vxt5e0Osvvt2gflwSlsJ5DmPGcPvTRKLJyp031rXjhg&r=q9XS-DioVZXN7OIaivO8bmd7sZlV6a51ELpGujR1Dlc&m=iZAygNd6K_GeXj4JJbmJ0JK9qtoP6tcCMBYd-QID6ag&s=aPY88mytL6GR9mIiGUoRlhd6j9H_yZLIkN3VWaGJx9s&e= 
>
>
>
> _______________________________________________
> jetty-users mailing list
> [hidden email]
> To change your delivery options, retrieve your password, or
> unsubscribe from this list, visit
> https://urldefense.proofpoint.com/v2/url?u=https-3A__dev.eclipse.org_mailman_listinfo_jetty-2Dusers&d=DwICAg&c=Vxt5e0Osvvt2gflwSlsJ5DmPGcPvTRKLJyp031rXjhg&r=q9XS-DioVZXN7OIaivO8bmd7sZlV6a51ELpGujR1Dlc&m=UVp1zURa8nL6gFPEp9EYah_0pfUEF_EvT38AZhVz1Gs&s=ieyi6wedoR2CmQ0S3UNQHyiH0nwSlQMTH5ElCMCsqls&e= 


_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users
Reply | Threaded
Open this post in threaded view
|

Re: Problem with ClosedByInterruptException

Simone Bordet-3
Hi,

On Thu, Nov 17, 2016 at 8:50 PM, John Gardiner Myers
<[hidden email]> wrote:
> ...But QueuedThreadPool would let the thread die after logging the
> ClosedByInterruptException. So it must be something else wedging DNS
> lookups. I'll continue investigating on my side.

And your stack trace shows that the ClosedByInterruptException comes
from an attempt to connect from Jetty's ManagedSelector.
So unless you are programmatically feeding the ManagedSelector with
SocketChannels you create, this comes from HttpClient.

Would it be possible that you have a lot of DNS requests that fail ?
Perhaps you can tune/improve logging to catch what domain fails DNS ?
Note that the ClosedByInterruptException is thrown by the JDK, but
caught and notified to a callback.

If Jetty's thread pool detects a thread that has been interrupted but
its interrupted status not reset, then it lets the thread die.

Let us know if you find more.

--
Simone Bordet
----
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.
_______________________________________________
jetty-users mailing list
[hidden email]
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.eclipse.org/mailman/listinfo/jetty-users