java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Feb 17, 2009 12:53 AM
During an execution of java.util.concurrent.locks.Condition.await(timeout, TimeUnit.MILLISECONDS), a thread hangs on Solaris/AMD x64 instead of resuming once the timeout has elapsed.
The environment is:
java version "1.6.0_12"
Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
Java HotSpot(TM) Server VM (build 11.2-b01, mixed mode)
SunOS x2001 5.10 Generic_127128-11 i86pc i386 i86pc
Update from 10.06.2009: Sorry, the version of Solaris is: SunOS x2001 5.10 Generic_137138-09 i86pc i386 i86pc
AMD Opteron 2356 2 CPU x Quad-Core 2312 MHz
Steps to reproduce the bug:
1) Run the attached test application on a Solaris/AMD x64 platform.
2) Within 2-20 minutes the bug should reproduce, with the message "JVM Bug found in 8 threads !!!!" in the log.
Here is a part of source code:
// one thread:
lock.lock();
try {
    try {
        while (queueSize == 0) {
            if (condition.await(AWAIT, TimeUnit.MILLISECONDS)) {
                conditionCount++;
            }
        }
        awaitCount++;
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
} finally {
    lock.unlock();
}
// another thread:
lock.lock();
try {
    bufferPos = (int) (eventsCount % LONG_DATA_COUNT);
    if (queueSize == 0 && Math.random() > SIGNAL_PROBABILITY) {
        condition.signal();
    }
    eventsCount++;
    queueSize++;
} finally {
    lock.unlock();
}
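For readers without the attachment, here is a minimal, self-contained sketch of how a stuck timed await could be detected from outside the waiting thread. It is not the attached test: the class name TimedAwaitCheck, the method awaitReturnsInTime and the slack value are assumptions made for illustration. A waiter thread performs one unsignalled timed await, and the caller checks whether that thread has finished within the timeout plus a generous slack.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not the attached test application.
final class TimedAwaitCheck {

    /**
     * Starts a thread that does one timed await nobody signals, then reports
     * whether that thread finished within awaitMs + slackMs milliseconds.
     */
    static boolean awaitReturnsInTime(long awaitMs, long slackMs) {
        ReentrantLock lock = new ReentrantLock();
        Condition condition = lock.newCondition();

        Thread waiter = new Thread(() -> {
            lock.lock();
            try {
                // Should return after roughly awaitMs; an early (spurious)
                // return is harmless for this check.
                condition.await(awaitMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        waiter.setDaemon(true); // a stuck waiter must not keep the JVM alive
        waiter.start();

        try {
            waiter.join(awaitMs + slackMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        boolean ok = awaitReturnsInTime(100, 5000); // assumed timeout and slack
        System.out.println(ok ? "await returned in time" : "await appears to hang");
    }
}
```

The check runs in a second thread because a wait that never returns cannot report itself. Note that join also relies on a timed wait, so this is only an indicator and not a proof.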
The bug is NOT reproduced on the following platforms:
Solaris / UltraSPARC T1
Solaris / UltraSPARC T2+
Solaris / UltraSPARC IV+
Linux / Xeon
Solaris version is corrected from "Generic_127128-11" to "Generic_137138-09"
Message was edited by: neighbour
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Feb 19, 2009 2:54 PM
in response to: neighbour
Could you please submit a bug report at http://bugreport.sun.com with a complete runnable sample?
Thanks, Roger Y.
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Feb 23, 2009 8:30 AM
in response to: rogyeu
Hello,
Our new Sun X2200 M2 with Opteron 2356 quad-core CPUs and Java 1.5.0_17-b04 has the same issue. We first noticed it in the gc.log: the elapsed time since JVM start jumps from a valid value of several tens of thousands of seconds to something like 2 million seconds. From then on Tomcat becomes unstable. We have already submitted a bug report.
Thanks for the code snippet, it seems to be a good indicator.
best regards Paul
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Feb 24, 2009 2:57 AM
in response to: neighbour
Hello,
The bug seemed to be fixed after we applied the latest Solaris patches. After we applied them we were no longer able to reproduce the bug (using the code provided in that forum thread).
best regards Paul
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Feb 25, 2009 4:04 PM
in response to: pmehrer
Thanks for the info. neighbour, can you please make sure you have the latest Solaris patches?
-- RY
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Feb 27, 2009 2:12 PM
in response to: neighbour
Please try the suggestion in the bug report comment. We would like to isolate the issue. You may post on the bug report.
Thanks, RY
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Apr 8, 2009 4:59 AM
in response to: rogyeu
Sorry, what is the suggestion you are talking about? What else should I post on the bug report?
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: May 18, 2009 12:45 AM
in response to: neighbour
The suggestion in the bug report was to replace Thread.sleep with Object.wait to exclude the possibility of Thread.sleep returning early - which is a problem on Solaris in some circumstances.
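As a rough illustration of that substitution, the sketch below pauses using Object.wait in a deadline loop instead of Thread.sleep. The class and method names (WaitBasedPause, pauseMillis) are hypothetical and this is not code from the bug report. The loop re-checks the deadline so that an early or spurious wakeup does not end the pause prematurely.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, assumed for illustration only.
final class WaitBasedPause {
    private static final Object MONITOR = new Object();

    /**
     * Blocks for at least the given number of milliseconds using Object.wait,
     * unless interrupted. Returns the measured pause in milliseconds.
     */
    static long pauseMillis(long millis) {
        long start = System.nanoTime();
        long deadline = start + TimeUnit.MILLISECONDS.toNanos(millis);
        synchronized (MONITOR) {
            try {
                long remaining;
                while ((remaining = deadline - System.nanoTime()) > 0) {
                    // wait(0) would block indefinitely, so wait at least 1 ms.
                    MONITOR.wait(Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remaining)));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // give up the pause, keep the flag
            }
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        System.out.println("paused for " + pauseMillis(200) + " ms");
    }
}
```

The deadline check itself uses System.nanoTime, so the loop is only as trustworthy as that clock.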
The latest request is for a pstack/jstack dump of the hung process showing where the threads are. Taking a few in quick succession would help establish which threads are truly blocked.
There are also issues with synchronization (or lack thereof) in your test program. This is unlikely to be the cause of the problem unless it occurs when values overflow from needing 32 bits to needing 33 bits. Note that volatile longs still need to be updated under a lock, as a read-modify-write update to them (such as ++) is NOT atomic.
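A minimal sketch of that last point, using hypothetical names (GuardedCounterDemo, run) rather than anything from the test program: several threads increment one volatile long without a lock and one plain long under a ReentrantLock. Only the lock-guarded counter is guaranteed to reach the expected total; the volatile one may lose updates because ++ is a separate read and write.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class, assumed for illustration only.
final class GuardedCounterDemo {
    private volatile long unguarded; // volatile makes reads/writes visible, but ++ is not atomic
    private long guarded;            // only modified while holding the lock
    private final ReentrantLock lock = new ReentrantLock();

    /** Returns { unguarded total, guarded total } after all threads finish. */
    static long[] run(int threads, int incrementsPerThread) {
        GuardedCounterDemo demo = new GuardedCounterDemo();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    demo.unguarded++; // racy read-modify-write
                    demo.lock.lock();
                    try {
                        demo.guarded++; // safe: one thread at a time
                    } finally {
                        demo.lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // join makes the workers' writes visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return new long[] { demo.unguarded, demo.guarded };
    }

    public static void main(String[] args) {
        long[] totals = run(4, 100_000);
        System.out.println("unguarded volatile: " + totals[0] + ", lock-guarded: " + totals[1]);
    }
}
```

An AtomicLong would be another way to make the increment atomic without an explicit lock.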
Please forward information via the bug report as I don't follow these forums.
Thank you. David Holmes
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: May 29, 2009 2:39 AM
in response to: dholmes
> The suggestion in the bug report was to replace Thread.sleep with Object.wait to exclude the possibility of Thread.sleep returning early - which is a problem on Solaris in some circumstances.
The provided test does not misuse any methods. Making the reproducing code no longer reproduce the problem is not the right way to "solve" it, is it? Apart from this test, our production code does not use Thread.sleep at all, but it does use "await(timeout, units)" and other java.util.concurrent methods with timeouts, and that is the point.
> The latest request is for a pstack/jstack dump of the hung process showing where the threads are. Taking a few in quick succession would help establish which threads are truly blocked.
The attached file "park-nanos-pstack-jstack.zip" contains the output of the commands "pstack", "jstack -l" and "jstack -m", taken a few seconds apart.
> Please forward information via the bug report as I don't follow these forums.
It looks like there is no way to attach anything to a ticket on bugs.sun.com; only a text description is allowed. So I am putting the attachment here and will link to it from the bug http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6807483.
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Jun 8, 2009 9:57 PM
in response to: pmehrer
Paul,
> The bug seemed to be fixed after we applied the latest Solaris patches. After we applied them we were no longer able to reproduce the bug (using the code provided in that forum thread).
Do you recall exactly which patches you applied that fixed this? I'm trying to pin down the root cause.
Thanks, David Holmes
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Jun 4, 2009 4:12 AM
in response to: neighbour
After upgrading to the latest Solaris release from May 2009 ("Kernel version: SunOS 5.10 Generic_139556-08, Solaris 10 5/09 s10x_u7wos_08 X86"), the bug no longer reproduces.
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Jun 8, 2009 9:46 PM
in response to: neighbour
Thanks for the update.
So you went from a Solaris 10 update 5 install to Solaris 10 update 7, and that has seemingly fixed the problem.
We were unable to reproduce the problem. I will try to track down what fix in S10u6 or S10u7 might have fixed this.
David Holmes
Edited to correct version info.
Message was edited by: dholmes
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Jun 8, 2009 10:14 PM
in response to: dholmes
It is possible that the root cause here was Solaris bug 6600939, which was fixed in Solaris 10 update 6.
David Holmes
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Jun 8, 2009 11:50 PM
in response to: dholmes
No, we went from Solaris 10 update 6 (Solaris 10 10/08 s10x_u6wos_07b X86) to Solaris 10 update 7 (Solaris 10 5/09 s10x_u7wos_08 X86).
Indeed, the root cause was Solaris bug 6600939; I reported this problem too: http://forums.java.net/jive/thread.jspa?threadID=61998. But the point is that Solaris bug 6600939 is stated as fixed in patch 137112-01 (http://sunsolve.sun.com/search/document.do?assetkey=1-21-137112-01-1), dated June 2008. We had "Solaris 10 10/08 s10x_u6wos_07b X86", dated October 2008, so that Solaris release contained this patch, but the bug was still there.
Only when we moved to update 7 did the problem go away.
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Jun 9, 2009 1:04 AM
in response to: neighbour
Neighbor,
Your original post states:
SunOS x2001 5.10 Generic_127128-11 i86pc i386 i86pc
and 127128-11 corresponds to Solaris 5/08 which is update 5. Hence my comment.
If you are indeed seeing this with update 6 then I need to dig further in update 7 to see if there is any additional fix related to 6600939.
When a future time is reported due to this bug, subsequent calls to nanoTime will return the same value until time catches up with the erroneous value. This guarantees the monotonic non-decreasing property of nanoTime (at least on Solaris).
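The clamping behaviour described here can be sketched as follows. This is an assumed illustration, not the actual HotSpot or Solaris code; the class ClampedClock and its clamp method are hypothetical. Each raw reading is replaced by the largest value returned so far, so one erroneous forward jump freezes the reported time until the raw readings catch up.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of a "never go backwards" clamp over raw clock readings.
final class ClampedClock {
    private final AtomicLong lastReturned = new AtomicLong(Long.MIN_VALUE);

    /** Returns the larger of the raw reading and the largest value returned so far. */
    long clamp(long rawNanos) {
        return lastReturned.accumulateAndGet(rawNanos, Math::max);
    }

    public static void main(String[] args) {
        ClampedClock clock = new ClampedClock();
        // Simulated raw readings: the third one is an erroneous jump into the future.
        long[] raw = { 100, 200, 2_000_000, 300, 400, 2_000_100 };
        for (long reading : raw) {
            System.out.println(reading + " -> " + clock.clamp(reading));
        }
        // Prints:
        // 100 -> 100
        // 200 -> 200
        // 2000000 -> 2000000
        // 300 -> 2000000
        // 400 -> 2000000
        // 2000100 -> 2000100
    }
}
```

Code that measures elapsed time against such a frozen value would see no time passing until the real clock catches up. That is one plausible way a timed wait could look hung, although this mechanism is not confirmed here.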
David Holmes
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Jun 10, 2009 1:28 AM
in response to: dholmes
The version "Generic_127128-11" was a mistake. The real version was "Generic_137138-09". BTW in the attached file "x2001-show-rev.txt" the correct version is mentioned from the very beginning.
I've updated the top message in this thread accordingly.
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Jun 10, 2009 1:56 AM
in response to: neighbour
Thanks for the clarification. Unfortunately it means we've gone from "problem solved" to having a new mystery. 
David Holmes
Re: java.util.concurrent.locks.Condition.await(timeout, units) hangs forever
Posted: Oct 21, 2009 1:30 AM
in response to: neighbour
Unfortunately, the attached test and showrev output were removed somehow. Here they are: "jvm-await-bug.zip", "x2001-show-rev.txt".