Replies:
9
-
Last Post:
Jul 9, 2008 11:48 AM
by: Hinkmond Wong
PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted:
Jul 4, 2008 8:16 AM
Hi
I am using PP with Linux 2.6.22/Qtopia 2.2 on the i.MX31 (ARM11), mostly with great success and stability. The project is a PDA-like device to be used for human sciences research/surveys. My import of PP was at revision 10877 (April 2008).
I have found what seems to be a definite bug, and I have a proposed solution that seems to be working for me. I wanted to find out whether anyone else has seen this and whether my solution is sensible. Unfortunately, I don't have a clear way of reproducing the problem: it happens sporadically with my heavily multi-threaded application, but not with the standard test cases.
PROBLEM
The problem is a SEGV in CVM. Attaching GDB, I get a backtrace indicating that one of the threads has faulted in:
file: share/personal/native/awt/qt/QtImageRepresentation.cc
function: Java_sun_awt_qt_QtImageRepresentation_disposePixmapEntry
Further investigation reveals that this is due to the occasional double deletion of the QPixmap* p passed as an opaque handle in the JNI call from:
file: share/personal/classes/awt/peer_based/sun/awt/qt/QtImageRepresentation.java
method: disposeImage
This method is called from three places in share/basis/classes/common/sun/awt/image/ImageRepresentation.java
FIX?
The clue is that two of the calls from ImageRepresentation.java are in synchronized methods, the third is not (look at the method doFinalization). It turns out that doFinalization() is called by the AWTFinalizer thread (see share/personal/classes/common/sun/awt/AWTFinalizer.java), so it seems there are multiple threads calling non-reentrant code, hence the double delete.
The simple fix, which seems to have worked for me, is to add the synchronized keyword to the doFinalization() method. The patch is copied at the bottom of this entry.
If this is a genuine bug & fix, should it be submitted to the Sun bug database?
Regards
Mike Walton
Technical Director
Far South Networks (Pty) Ltd
http://www.farsouthnet.com
svn diff -r99 cdc/src/share/basis/classes/common/sun/awt/image/ImageRepresentation.java
Index: cdc/src/share/basis/classes/common/sun/awt/image/ImageRepresentation.java
===================================================================
--- cdc/src/share/basis/classes/common/sun/awt/image/ImageRepresentation.java (revision 99)
+++ cdc/src/share/basis/classes/common/sun/awt/image/ImageRepresentation.java (working copy)
@@ -459,7 +459,10 @@
         AWTFinalizer.addFinalizeable(this);
     }
 
-    public void doFinalization() {
+    // MW: SEGV due to double delete in Qtnative code under disposeImage()
+    // I have added "synchronized" keyword hoping to provide thread safety
+    // between other threads and the AWTFinalizer that call this method
+    public synchronized void doFinalization() {
         disposeImage();
     }
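The intent of the patch above can be sketched as a small stand-alone Java program. This is only an illustration written for a current desktop JDK, not the phoneME source: the class SketchImageRep, its handle field and the nativeDeletes counter are hypothetical stand-ins, and the sketch assumes that disposeImage() clears its handle once the native object has been released, which has not been verified against QtImageRepresentation.java. Under that assumption, making doFinalization() share the monitor already used by abort() means the check-and-clear cannot interleave, so the native release runs once.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for ImageRepresentation; not the phoneME class.
class SketchImageRep {
    // Counts simulated native deletes so that a double delete is observable.
    static final AtomicInteger nativeDeletes = new AtomicInteger();

    private long handle = 11963488L; // pretend QPixmap* handle

    // Assumption: the real disposeImage() clears its handle after deleting.
    private void disposeImage() {
        if (handle != 0) {
            nativeDeletes.incrementAndGet(); // stands in for disposePixmapEntry
            handle = 0;
        }
    }

    // Already synchronized in the original code, per the post above.
    public synchronized void abort() {
        disposeImage();
    }

    // The patched method: it now takes the same monitor as abort().
    public synchronized void doFinalization() {
        disposeImage();
    }

    // Runs abort() and doFinalization() on two threads and returns the
    // number of simulated native deletes that happened.
    static int runOnce() {
        nativeDeletes.set(0);
        final SketchImageRep rep = new SketchImageRep();
        Thread app = new Thread(() -> rep.abort());
        Thread awtFinalizer = new Thread(() -> rep.doFinalization());
        app.start();
        awtFinalizer.start();
        try {
            app.join();
            awtFinalizer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return nativeDeletes.get();
    }

    public static void main(String[] args) {
        System.out.println("native deletes: " + runOnce());
    }
}
```

If disposeImage() does not clear the handle, synchronization alone would only serialize the two deletes rather than prevent the second one, so that assumption is the part worth checking in the real source.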
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted:
Jul 7, 2008 11:26 AM
in response to: michael_walton
Hi Mike,
Good catch! Very good analysis too. To make this official and to give you proper credit for this code change, please take a look at our code submission process:
https://mobileandembedded.dev.java.net/content/contribute.html
Please sign the Sun Contributor Agreement and follow the instructions on this Web page to send, fax, or e-mail your signature:
http://www.sun.com/software/opensource/contributor_agreement.jsp
Once you've done so, let me know, and I can start the process of having your code change reviewed, tested, and then committed to our repository, with your name added to the commit logs to give you credit.
Would that be OK with you?
Thanks, Hinkmond
--------------------------------------------------------------------- To unsubscribe, e-mail: advanced-unsubscribe@phoneme.dev.java.net For additional commands, e-mail: advanced-help@phoneme.dev.java.net
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted:
Jul 7, 2008 4:25 PM
in response to: Hinkmond Wong
If the issue is indeed QT-specific then the fix should probably go into QtImageRepresentation.java and not ImageRepresentation.java. However, we have a proposed fix without an explanation of the nature of the race condition. Finalizers run when no other thread has a reference to the object, and disposePixmapEntry uses AWT_QT_LOCK, so it is not clear why additional synchronization would help.
Dean
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted:
Jul 7, 2008 4:35 PM
in response to: xyzzy (Dean)
Mike, I know you said you did not have a reproducible testcase, but I think Dean might be right that there is some ambiguity here that needs clarification.
Can you tell us a little about the case where you did see the failure? What did the test code you ran look like? How did you run (under what conditions and which builds)? And, what was the frequency of failure you saw?
Thanks, Hinkmond
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted:
Jul 8, 2008 2:19 AM
in response to: Hinkmond Wong
See comments inline.
> xyzzy (Dean) wrote:
> > If the issue is indeed QT-specific then the fix should probably go into
> > QtImageRepresentation.java and not ImageRepresentation.java. However,
The issue may or may not be Qt-specific: the method disposeImage() which is called unsynchronized from ImageRepresentation.java is a JNI method for the cases of GTK and PocketPC (as opposed to being a Java method in Qt), and I haven't investigated whether it is re-entrant in these cases.
> > we have a proposed fix without an explanation of the nature of the race
> > condition. Finalizers run when no other thread has a reference to the
> > object, and disposePixmapEntry uses AWT_QT_LOCK, so it is not clear why
> > additional synchronization would help.
The method doFinalization() (as opposed to the normal finalize()) is not called as part of the normal finalizer/GC framework, but in a special AWTFinalizer thread. HOWEVER, the doFinalization() task is added to the AWTFinalizer queue in ImageRepresentation::finalize(), so your point is valid as far as stating that no other thread should have a reference to the object at this point.
The AWT_QT_LOCK won't make a difference, since the double delete consists of two separate atomic operations with respect to this lock: each call to disposePixmapEntry is synchronized, but the loop that calls it is not.
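The point about lock granularity can be shown with a small model. This is a hypothetical sketch for a desktop JDK, not the Qt peer code: PixmapTable, AWT_LOCK and disposeEntry are invented names, and a set of live handles stands in for native memory. Each call is fully protected by the lock, yet nothing stops a second call with the same handle from being made afterwards.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of a native handle table guarded by one global lock.
class PixmapTable {
    private static final Object AWT_LOCK = new Object(); // stands in for AWT_QT_LOCK
    private static final Set<Long> live = new HashSet<Long>();

    // Registers a handle as allocated and returns it.
    static long create(long handle) {
        synchronized (AWT_LOCK) {
            live.add(handle);
        }
        return handle;
    }

    // Each call is atomic under the lock, like disposePixmapEntry.
    // Returns false when the handle had already been deleted; at that
    // point real native code would be freeing the same pointer twice.
    static boolean disposeEntry(long handle) {
        synchronized (AWT_LOCK) {
            return live.remove(handle);
        }
    }

    public static void main(String[] args) {
        long h = create(11963488L);
        System.out.println("first dispose ok:  " + disposeEntry(h));
        System.out.println("second dispose ok: " + disposeEntry(h));
    }
}
```

In the model the second call is merely reported as a failure; in native code there is no such table to consult, so the protection has to come from the caller deciding only once whether to dispose.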
I will have to investigate further as to how the AWTFinalizer could be hitting a race condition with its call to doFinalization() - as you say, this should not be possible.
For now, I would tend to agree that this is not a confirmed bug. More like, it "seems" to work for me.
> Mike, I know you said you did not have a reproducible testcase, but I
> think Dean might be right that there is some ambiguity here that needs
> clarification.
>
> Can you tell us a little about the case where you did see the failure?
> What did the test code you ran look like? How did you run (under what
> conditions and which builds)? And, what was the frequency of failure
> you saw?
I am trying now to reproduce it, and I am not seeing consistent results. The conditions under which it occurs are when I am frequently creating and disposing new full screen panels with various graphic icons on them. The build is as follows, also with JIT and VFP enabled:
make bin \
    BINARY_BUNDLE_DIRNAME=phoneme_adv \
    HOST_CC="$BUILDCC" HOST_CCC="$BUILDCXX" TARGET_CC="$CC" \
    JAVAME_LEGAL_DIR="../../../legal" \
    J2ME_CLASSLIB=personal \
    JDK_HOME=/opt/j2sdk1.4.2_17 \
    TOOLS_DIR=$TOP_DIR/tools \
    USE_MIDP=false \
    MIDP_DIR=$TOP_DIR/midp \
    PCSL_DIR=$TOP_DIR/pcsl \
    USE_JUMP=false \
    JUMP_DIR=$TOP_DIR/jump \
    QT_TARGET_DIR=$TOP_DIR/../qtopia-free-2.2.0/qt2 \
    QTOPIA_TARGET_DIR=$TOP_DIR/../qtopia-free-2.2.0/qtopia \
    QTEMBEDDED=true \
    QTOPIA=true \
    MOC=$TOP_DIR/../qtopia-free-2.2.0/qt2/bin/moc \
    QT_TARGET_LIB_DIR=$TOP_DIR/../qtopia-free-2.2.0/qtopia/lib
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted:
Jul 8, 2008 2:40 AM
in response to: michael_walton
Just after posting, I reproduced the problem. I had debug prints (see below) showing the calls to ImageRepresentation, with object hash codes in parentheses. This shows that abort() is being called AFTER finalize() for a given object, which should not be possible given the GC contract...
Note the double call to disposePixmapEntry with the same Qt handle, resulting in the SEGV. This would be the result of the AWTFinalizer thread calling doFinalization() and interrupting the disposeImage() call that started in abort(). Hence, adding "synchronized" to the doFinalization declaration would be a "fix".
Note: in this run, I am explicitly calling System.gc() every time I replace my full screen panel, which may be stimulating this problem (although it can occur, without the explicit GC).
Note: in trying to isolate this, I have seen at least two other SEGVs, both very occasional:
1. QtToolkitEventHandler::postMouseButtonEvent (original JNI call is java_sun_awt_qt_QtToolkit_runNative)
2. Java_sun_awt_qt_QtImageRepresentation_disposeImageNative (also called from QtImageRepresentation::disposeImage, so most likely the same cause as the original bug, just a different location of the race condition)
I am now officially out of my depth on this!
DEBUG TRACE: (object hash code/reference is in parentheses)
ImageRepresentation::finalize(-823558358)
ImageRepresentation::abort(-823558358)
QtImageRepresentation::disposeImage disposePixmapEntry(11963488)
ImageRepresentation::doFinalization(-823558358)
QtImageRepresentation::disposeImage disposePixmapEntry(11963488)
Process #3291 received signal 11
Process #3291 being suspended
(gdb) thread 15
[Switching to thread 15 (Thread 3293)]#0 0x400897bc in kill ()
from rootfs/lib/libc.so.6
(gdb) bt
#0 0x400897bc in kill () from rootfs/lib/libc.so.6
#1 <signal handler called>
#2 0x00000018 in ?? ()
#3 0x465819f4 in Java_sun_awt_qt_QtImageRepresentation_disposePixmapEntry ()
from rootfs/phoneme_adv-rev100/lib/libqtawt.so
#4 0x000f32f0 in args_done ()
Backtrace stopped: frame did not save the PC
(gdb)
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted:
Jul 8, 2008 2:50 PM
in response to: michael_walton
Interesting. Can you get a Java stack backtrace in abort() and find out who is calling it?
Dean
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted:
Jul 9, 2008 12:51 AM
in response to: xyzzy (Dean)
Hi Dean
First, how can I get a Java stack trace after a crash? I've tried using the calls from gdb in the Porting Guide (e.g. call CVMdumpStack(&ee->interpreterStack,0,0,0)), but I don't have the global variable ee.
Simple code inspection shows that there is only one place calling abort(). Here is the code snippet from sun.awt.image.Image:
public void flush() {
    if (src != null) {
        src.checkSecurity(null, false);
    }
    if (!(source instanceof OffScreenImageSource)) {
        ImageRepresentation ir;
        synchronized (this) {
            availinfo &= ~ImageObserver.ERROR;
            ir = imagerep;
            imagerep = null;
        }
        if (ir != null) {
            ir.abort();
        }
        if (src != null) {
            src.flush();
        }
    }
}
It is interesting to note that the object reference to the ImageRepresentation is set to null just before calling abort, which must be what is triggering the finalization, even though there is still a local reference ir. Does this point to a problem in the VM/GC?
Another clue lies in this code snippet from my own application:
protected void finalize() throws Throwable {
    super.finalize();
    myImage.flush();
}
I am doing this in an attempt to reduce resource usage - the object being finalized here is my lightweight component containing an image, and I want to make sure image resources are freed. I guess this is the trigger point for the code above.
Any ideas?
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted:
Jul 9, 2008 8:21 AM
in response to: michael_walton
phonemeadvanced@mobileandembedded.org wrote:
> Hi Dean
>
> Firstly - how can I get a Java stack trace after a crash, I've tried using the calls from gdb in the Porting Guide, but I don't have the global variable ee (e.g. call CVMdumpStack(&ee->interpreterStack,0,0,0)).
If you can't find the value for "ee", then you can try calling CVMgetEE():
CVMdumpStack(&CVMgetEE()->interpreterStack,0,0,0)
> Simple code inspection shows that there is only one place calling abort(), here is the code snippet from sun.awt.image.Image:
>
> public void flush() {
>     if (src != null) {
>         src.checkSecurity(null, false);
>     }
>     if (!(source instanceof OffScreenImageSource)) {
>         ImageRepresentation ir;
>         synchronized (this) {
>             availinfo &= ~ImageObserver.ERROR;
>             ir = imagerep;
>             imagerep = null;
>         }
>         if (ir != null) {
>             ir.abort();
>         }
>         if (src != null) {
>             src.flush();
>         }
>     }
> }
>
> It is interesting to note that the object reference to the ImageRepresentation is set to null just before calling abort, which must be what is triggering the finalization, even though there is still a local reference ir. Does this point to a problem in the VM/GC?
Maybe, but I don't think so.
> Another clue lies in this code snippet from my own application:
>
> protected void finalize() throws Throwable {
>     super.finalize();
>     myImage.flush();
> }
>
> I am doing this in an attempt to reduce resource usage - the object being finalized here is my lightweight component containing an image, and I want to make sure image resources are freed. I guess this is the trigger point for the code above.
>
> Any ideas?
I think this is the problem. There are two threads calling disposePixmapEntry, the JVM's finalizer thread, and the AWT finalizer thread. I don't think you need to call flush() here, but I don't see anything wrong with it either, so your proposed fix is probably a good idea. But I'm not an AWT expert. Hinkmond, do you agree?
Dean
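Since the flush() call in the application's finalize() is the likely trigger, one alternative is to release the image explicitly when the panel is replaced, instead of from a finalizer. The following is a minimal sketch under stated assumptions, not a tested fix for phoneME: IconPanel, ImageLike and release() are hypothetical names, ImageLike stands in for java.awt.Image so that the sketch runs without AWT, and the code targets a desktop JDK.

```java
// Hypothetical application-side wrapper; not part of phoneME or AWT.
class IconPanel {
    // Stand-in for java.awt.Image, reduced to the one method used here.
    interface ImageLike {
        void flush();
    }

    private ImageLike image;

    IconPanel(ImageLike image) {
        this.image = image;
    }

    // Called by the application when the panel is taken off screen.
    // Flushes at most once, on the caller's thread, and returns whether
    // this call actually released the image.
    public synchronized boolean release() {
        if (image == null) {
            return false;
        }
        image.flush();
        image = null;
        return true;
    }

    // Releases the same panel twice and returns how many flushes ran.
    static int demo() {
        final int[] flushes = {0};
        IconPanel panel = new IconPanel(() -> flushes[0]++);
        panel.release();
        panel.release(); // second call is a no-op
        return flushes[0];
    }

    public static void main(String[] args) {
        System.out.println("flush calls: " + demo());
    }
}
```

With this shape the application no longer calls flush() from the JVM finalizer thread, so only the AWT finalizer path remains for any image that was never explicitly released.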
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted:
Jul 9, 2008 11:48 AM
in response to: xyzzy (Dean)
I agree with Dean. Since the JVM finalizer and AWT finalizer are on different threads, there could be contention for the disposePixmapEntry call which might be the cause of the problem. Even though we might not be able to inspect and verify the actual failure to confirm this, I also think the proposed fix is good, plus it doesn't seem to hurt anything.
Hinkmond