Thread: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete

Replies: 9 - Last Post: Jul 9, 2008 11:48 AM by: Hinkmond Wong
michael_walton

Posts: 4
PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted: Jul 4, 2008 8:16 AM

Hi

I am using PP with Linux 2.6.22/Qtopia 2.2 on the i.MX31 (ARM11), mostly with great success and stability. The project is a PDA-like device to be used for human sciences research/surveys. My import of PP was at revision 10877 (April 2008).

I have found what seems to be a definite bug, and I have a proposed solution that seems to be working for me. I just wanted to find out whether anyone else has seen this and whether my solution is sensible. Unfortunately, I don't have a clear way of reproducing the problem: it happens sporadically with my heavily multi-threaded application, but not with the standard test cases.

PROBLEM
So: the problem is a SEGV in CVM. Attaching GDB, I get a backtrace indicating that one of the threads has faulted in:
file: share/personal/native/awt/qt/QtImageRepresentation.cc
function: Java_sun_awt_qt_QtImageRepresentation_disposePixmapEntry
Further investigation reveals that this is due to the occasional double deletion of the QPixmap* p passed as an opaque handle in the JNI call from:
file: share/personal/classes/awt/peer_based/sun/awt/qt/QtImageRepresentation.java
method: disposeImage
This method is called from three places in share/basis/classes/common/sun/awt/image/ImageRepresentation.java

FIX?
The clue is that two of the calls from ImageRepresentation.java are in synchronized methods, the third is not (look at the method doFinalization). It turns out that doFinalization() is called by the AWTFinalizer thread (see share/personal/classes/common/sun/awt/AWTFinalizer.java), so it seems there are multiple threads calling non-reentrant code, hence the double delete.

The simple fix, which seems to have worked for me, is to add the synchronized keyword to the doFinalization() method. The patch is copied at the bottom of this entry.
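The idea behind the fix can be illustrated with a minimal, self-contained sketch. It is not the phoneME code: the class Rep, its handle field, and the check-and-clear shape of disposeImage() are assumptions made for illustration, and the "native delete" is only a counter. With both entry points synchronized on the same object, the handle is released once no matter which thread gets there first.

```java
public class SynchronizedDisposeSketch {
    /** Hypothetical stand-in for ImageRepresentation; not the phoneME class. */
    static class Rep {
        private int handle;         // opaque native handle, 0 once released
        private int nativeDeletes;  // counts simulated native deletes

        Rep(int handle) {
            this.handle = handle;
        }

        // Assumed shape of disposeImage(): check the handle, delete, clear.
        private void disposeImage() {
            if (handle != 0) {
                nativeDeletes++;    // stands in for the JNI delete
                handle = 0;
            }
        }

        // Both callers take the same monitor, so the check and the clear
        // cannot interleave between threads.
        public synchronized void abort() {
            disposeImage();
        }

        public synchronized void doFinalization() {
            disposeImage();
        }

        public synchronized int nativeDeletes() {
            return nativeDeletes;
        }
    }

    /** Runs abort() and doFinalization() on two threads; returns the delete count. */
    public static int raceOnce() {
        final Rep rep = new Rep(11963488);
        Thread a = new Thread(new Runnable() {
            public void run() { rep.abort(); }
        });
        Thread b = new Thread(new Runnable() {
            public void run() { rep.doFinalization(); }
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("interrupted while joining");
        }
        return rep.nativeDeletes();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            if (raceOnce() != 1) {
                throw new Error("handle deleted more than once");
            }
        }
        System.out.println("each handle was deleted exactly once");
    }
}
```

Removing "synchronized" from doFinalization() in this sketch reintroduces the window in which both threads can see a non-zero handle, although a failing run is not guaranteed on any given attempt.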

If this is a genuine bug & fix, should it be submitted to the Sun bug database?

Regards

Mike Walton
Technical Director
Far South Networks (Pty) Ltd
http://www.farsouthnet.com


svn diff -r99 cdc/src/share/basis/classes/common/sun/awt/image/ImageRepresentation.java
Index: cdc/src/share/basis/classes/common/sun/awt/image/ImageRepresentation.java
===================================================================
--- cdc/src/share/basis/classes/common/sun/awt/image/ImageRepresentation.java (revision 99)
+++ cdc/src/share/basis/classes/common/sun/awt/image/ImageRepresentation.java (working copy)
@@ -459,7 +459,10 @@
         AWTFinalizer.addFinalizeable(this);
     }
 
-    public void doFinalization() {
+    // MW: SEGV due to double delete in Qtnative code under disposeImage()
+    // I have added "synchronized" keyword hoping to provide thread safety
+    // between other threads and the AWTFinalizer that call this method
+    public synchronized void doFinalization() {
         disposeImage();
     }

Hinkmond Wong
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted: Jul 7, 2008 11:26 AM   in response to: michael_walton

Hi Mike,

Good catch! Very good analysis too. To make this official and to give
you proper credit for this code change, please take a look at our code
submission process:

https://mobileandembedded.dev.java.net/content/contribute.html

If you sign the Sun Contributor Agreement and follow the instructions to
send, fax, or e-mail your signature according to this Web page:

http://www.sun.com/software/opensource/contributor_agreement.jsp

then let me know once you've done so, and I can start the process of having
your code change reviewed, tested, and then committed to our repository,
adding your name to the commit logs to give you credit.


Would that be OK with you?


Thanks,
Hinkmond


---------------------------------------------------------------------
To unsubscribe, e-mail: advanced-unsubscribe@phoneme.dev.java.net
For additional commands, e-mail: advanced-help@phoneme.dev.java.net


xyzzy (Dean)
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted: Jul 7, 2008 4:25 PM   in response to: Hinkmond Wong

If the issue is indeed QT-specific then the fix should probably go into
QtImageRepresentation.java and not ImageRepresentation.java. However,
we have a proposed fix without an explanation of the nature of the race
condition. Finalizers run when no other thread has a reference to the
object, and disposePixmapEntry uses AWT_QT_LOCK, so it is not clear why
additional synchronization would help.

Dean


Hinkmond Wong
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted: Jul 7, 2008 4:35 PM   in response to: xyzzy (Dean)

xyzzy (Dean) wrote:
> If the issue is indeed QT-specific then the fix should probably go into
> QtImageRepresentation.java and not ImageRepresentation.java. However,
> we have a proposed fix without an explanation of the nature of the race
> condition. Finalizers run when no other thread has a reference to the
> object, and disposePixmapEntry uses AWT_QT_LOCK, so it is not clear why
> additional synchronization would help.

Mike, I know you said you did not have a reproducible testcase, but I
think Dean might be right that there is some ambiguity here that needs
clarification.

Can you tell us a little about the case where you did see the failure?
What did the test code you ran look like? How did you run (under what
conditions and which builds)? And, what was the frequency of failure
you saw?


Thanks,
Hinkmond





michael_walton

Posts: 4
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted: Jul 8, 2008 2:19 AM   in response to: Hinkmond Wong

See comments in line

> xyzzy (Dean) wrote:
> > If the issue is indeed QT-specific then the fix should probably go into
> > QtImageRepresentation.java and not ImageRepresentation.java. However,

The issue may or may not be Qt-specific: the method disposeImage(), which is called unsynchronized from ImageRepresentation.java, is a JNI method in the GTK and PocketPC cases (as opposed to a Java method in the Qt case), and I haven't investigated whether it is re-entrant in those cases.

> > we have a proposed fix without an explanation of the nature of the race
> > condition. Finalizers run when no other thread has a reference to the
> > object, and disposePixmapEntry uses AWT_QT_LOCK, so it is not clear why
> > additional synchronization would help.

The method doFinalization() (as opposed to the normal finalize()) is not called as part of the normal finalizer/GC framework, but in a special AWTFinalizer thread. HOWEVER, the doFinalization() task is added to the AWTFinalizer queue in ImageRepresentation::finalize(), so your point is valid as far as stating that no other thread should have a reference to the object at this point.

The AWT_QT_LOCK won't make a difference, since the double delete consists of two separate atomic steps w.r.t. this lock, i.e. each call to disposePixmapEntry is synchronized, but the loop that calls it is not.
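The point about lock granularity can be shown with a small deterministic sketch. Everything in it is hypothetical: NATIVE_LOCK merely stands in for AWT_QT_LOCK, the "delete" is a counter, the loop is reduced to a single handle, and the bad interleaving is replayed by hand on one thread rather than produced by real threads. It shows that a lock held only inside the delete call does not stop two callers from deleting the same handle, whereas making the read-and-clear of the handle atomic does.

```java
public class LockGranularitySketch {
    private static final Object NATIVE_LOCK = new Object(); // stands in for AWT_QT_LOCK
    private static int deletes;   // number of simulated native deletes
    private static int field;     // the Java-side handle field

    // Each individual call is atomic with respect to the lock,
    // as described for the native function.
    static void disposePixmapEntry(int handle) {
        synchronized (NATIVE_LOCK) {
            deletes++;
        }
    }

    // Atomic read-and-clear: only one caller can ever obtain a non-zero handle.
    static synchronized int takeHandle() {
        int h = field;
        field = 0;
        return h;
    }

    /** Replays the bad interleaving: both callers read before either clears. */
    public static int unguarded() {
        deletes = 0;
        field = 11963488;
        int seenByA = field;   // caller A reads the handle
        int seenByB = field;   // caller B reads it before A has cleared it
        if (seenByA != 0) { disposePixmapEntry(seenByA); field = 0; }
        if (seenByB != 0) { disposePixmapEntry(seenByB); field = 0; }
        return deletes;
    }

    /** Same order of events, but each caller claims the handle atomically. */
    public static int guarded() {
        deletes = 0;
        field = 11963488;
        int seenByA = takeHandle();
        int seenByB = takeHandle();   // gets 0, because A already claimed it
        if (seenByA != 0) { disposePixmapEntry(seenByA); }
        if (seenByB != 0) { disposePixmapEntry(seenByB); }
        return deletes;
    }

    public static void main(String[] args) {
        System.out.println("unguarded deletes: " + unguarded()); // prints 2
        System.out.println("guarded deletes: " + guarded());     // prints 1
    }
}
```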

I will have to investigate further as to how the AWTFinalizer could be hitting a race condition with its call to doFinalization() - as you say, this should not be possible.

For now, I would tend to agree that this is not a confirmed bug. It is more that the change "seems" to work for me.

> Mike, I know you said you did not have a reproducible testcase, but I
> think Dean might be right that there is some ambiguity here that needs
> clarification.
>
> Can you tell us a little about the case where you did see the failure?
> What did the test code you ran look like? How did you run (under what
> conditions and which builds)? And, what was the frequency of failure
> you saw?

I am trying now to reproduce it, and I am not seeing consistent results. The conditions under which it occurs are when I am frequently creating and disposing new full screen panels with various graphic icons on them. The build is as follows, also with JIT and VFP enabled:

make bin \
BINARY_BUNDLE_DIRNAME=phoneme_adv \
HOST_CC="$BUILDCC" HOST_CCC="$BUILDCXX" TARGET_CC="$CC" \
JAVAME_LEGAL_DIR="../../../legal" \
J2ME_CLASSLIB=personal \
JDK_HOME=/opt/j2sdk1.4.2_17 \
TOOLS_DIR=$TOP_DIR/tools \
USE_MIDP=false \
MIDP_DIR=$TOP_DIR/midp \
PCSL_DIR=$TOP_DIR/pcsl \
USE_JUMP=false \
JUMP_DIR=$TOP_DIR/jump \
QT_TARGET_DIR=$TOP_DIR/../qtopia-free-2.2.0/qt2 \
QTOPIA_TARGET_DIR=$TOP_DIR/../qtopia-free-2.2.0/qtopia \
QTEMBEDDED=true \
QTOPIA=true \
MOC=$TOP_DIR/../qtopia-free-2.2.0/qt2/bin/moc \
QT_TARGET_LIB_DIR=$TOP_DIR/../qtopia-free-2.2.0/qtopia/lib


michael_walton

Posts: 4
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted: Jul 8, 2008 2:40 AM   in response to: michael_walton

Just after posting, I reproduced the problem. I had debug prints in place (see below) showing the calls into ImageRepresentation, with the object hash codes in parentheses. The trace shows abort() being called AFTER finalize() for the same object, which should not be possible given the GC contract... Note the two calls to disposePixmapEntry with the same Qt handle, resulting in the SEGV. This would be the result of the AWTFinalizer thread calling doFinalization() while the disposeImage() call that started in abort() is still in progress. Hence, adding "synchronized" to the doFinalization() declaration would be a "fix".

Note: in this run, I am explicitly calling System.gc() every time I replace my full screen panel, which may be stimulating this problem (although it can also occur without the explicit GC).

Note: in trying to isolate this, I have seen at least two other SEGVs, both very occasional:
1. QtToolkitEventHandler::postMouseButtonEvent (original JNI call is java_sun_awt_qt_QtToolkit_runNative)
2. Java_sun_awt_qt_QtImageRepresentation_disposeImageNative (also called from QtImageRepresentation::disposeImage, so most likely the same cause as the original bug, just a different location of the race condition)

I am now officially out of my depth on this!

DEBUG TRACE: (object hash code is in parentheses)
ImageRepresentation::finalize(-823558358)
ImageRepresentation::abort(-823558358)
QtImageRepresentation::disposeImage disposePixmapEntry(11963488)
ImageRepresentation::doFinalization(-823558358)
QtImageRepresentation::disposeImage disposePixmapEntry(11963488)
Process #3291 received signal 11
Process #3291 being suspended

(gdb) thread 15
[Switching to thread 15 (Thread 3293)]#0 0x400897bc in kill ()
from rootfs/lib/libc.so.6
(gdb) bt
#0 0x400897bc in kill () from rootfs/lib/libc.so.6
#1 <signal handler called>
#2 0x00000018 in ?? ()
#3 0x465819f4 in Java_sun_awt_qt_QtImageRepresentation_disposePixmapEntry ()
from rootfs/phoneme_adv-rev100/lib/libqtawt.so
#4 0x000f32f0 in args_done ()
Backtrace stopped: frame did not save the PC
(gdb)

xyzzy (Dean)
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted: Jul 8, 2008 2:50 PM   in response to: michael_walton

Interesting. Can you get a Java stack backtrace in abort() and
find out who is calling it?
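One way to capture the caller from the Java side, sketched here as an assumption rather than as anything taken from the phoneME sources, is to record a stack trace from inside the method being investigated. The helper trace() and the class Rep below are hypothetical; only System.identityHashCode, Thread.currentThread().getName() and Throwable.getStackTrace() are standard library calls. Logging the thread name alongside the stack would also show which thread each disposal came from.

```java
public class CallerTraceSketch {
    /** Builds one trace line plus the Java call stack of whoever called us. */
    public static String trace(String method, Object target) {
        StringBuffer sb = new StringBuffer();
        sb.append(method).append('(')
          .append(System.identityHashCode(target)).append(')');
        sb.append(" on thread ")
          .append(Thread.currentThread().getName()).append('\n');
        StackTraceElement[] frames = new Throwable().getStackTrace();
        // frames[0] is trace() itself, so start at the caller.
        for (int i = 1; i < frames.length; i++) {
            sb.append("    at ").append(frames[i]).append('\n');
        }
        return sb.toString();
    }

    /** Hypothetical stand-in for the class being instrumented. */
    static class Rep {
        String lastTrace;

        public void abort() {
            // In a real debugging session this would be printed, not stored.
            lastTrace = trace("ImageRepresentation::abort", this);
        }
    }

    /** Calls abort() once and returns the captured trace text. */
    public static String demo() {
        Rep rep = new Rep();
        rep.abort();
        return rep.lastTrace;
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```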

Dean


michael_walton

Posts: 4
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted: Jul 9, 2008 12:51 AM   in response to: xyzzy (Dean)

Hi Dean

Firstly, how can I get a Java stack trace after a crash? I've tried using the calls from gdb in the Porting Guide, but I don't have the global variable ee (e.g. call CVMdumpStack(&ee->interpreterStack,0,0,0)).

Simple code inspection shows that there is only one place calling abort(), here is the code snippet from sun.awt.image.Image:

public void flush() {
    if (src != null) {
        src.checkSecurity(null, false);
    }
    if (!(source instanceof OffScreenImageSource)) {
        ImageRepresentation ir;
        synchronized (this) {
            availinfo &= ~ImageObserver.ERROR;
            ir = imagerep;
            imagerep = null;
        }
        if (ir != null) {
            ir.abort();
        }
        if (src != null) {
            src.flush();
        }
    }
}

It is interesting to note that the object reference to the ImageRepresentation is set to null just before calling abort, which must be what is triggering the finalization, even though there is still a local reference ir. Does this point to a problem in the VM/GC?

Another clue lies in this code snippet from my own application:

protected void finalize() throws Throwable {
    super.finalize();
    myImage.flush();
}

I am doing this in an attempt to reduce resource usage - the object being finalized here is my lightweight component containing an image, and I want to make sure image resources are freed. I guess this is the trigger point for the code above.

Any ideas?

xyzzy (Dean)
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted: Jul 9, 2008 8:21 AM   in response to: michael_walton

phonemeadvanced@mobileandembedded.org wrote:
> Hi Dean
>
> Firstly - how can I get a Java stack trace after a crash, I've tried using the calls from gdb in the Porting Guide, but I don't have the global variable ee (e.g. call CVMdumpStack(&ee->interpreterStack,0,0,0)).

If you can't find the value for "ee", then you can try calling CVMgetEE():

CVMdumpStack(&CVMgetEE()->interpreterStack,0,0,0)

> Simple code inspection shows that there is only one place calling abort(), here is the code snippet from sun.awt.image.Image:
>
> public void flush() {
> if (src != null) {
> src.checkSecurity(null, false);
> }
> if (!(source instanceof OffScreenImageSource)) {
> ImageRepresentation ir;
> synchronized (this) {
> availinfo &= ~ImageObserver.ERROR;
> ir = imagerep;
> imagerep = null;
> }
> if (ir != null) {
> ir.abort();
> }
> if (src != null) {
> src.flush();
> }
> }
> }
>
> It is interesting to note that the object reference to the ImageRepresentation is set to null just before calling abort, which must be what is triggering the finalization, even though there is still a local reference ir. Does this point to a problem in the VM/GC?

Maybe, but I don't think so.

> Another clue lies in this code snippet from my own application:
>
> protected void finalize() throws Throwable {
> super.finalize();
> myImage.flush();
> }
>
> I am doing this in an attempt to reduce resource usage - the object being finalized here is my lightweight component containing an image, and I want to make sure image resources are freed. I guess this is the trigger point for the code above.
>
> Any ideas?

I think this is the problem. There are two threads calling disposePixmapEntry,
the JVM's finalizer thread, and the AWT finalizer thread.
I don't think you need to call flush() here, but I don't see anything
wrong with it either, so your proposed fix is probably a good idea.
But I'm not an AWT expert. Hinkmond, do you agree?
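If the flush() call in finalize() is dropped, one possible alternative is to release the image explicitly when the panel is replaced, so that disposal happens on an application thread rather than on a finalizer thread. The sketch below is only an illustration under that assumption: IconPanel, FakeImage and release() are hypothetical names, and FakeImage simply counts flush() calls in place of a real java.awt.Image.

```java
public class ExplicitReleaseSketch {
    /** Hypothetical stand-in for java.awt.Image; counts flush() calls. */
    static class FakeImage {
        int flushes;

        void flush() {
            flushes++;
        }
    }

    /** Hypothetical lightweight component that owns an image. */
    static class IconPanel {
        private FakeImage image;

        IconPanel(FakeImage image) {
            this.image = image;
        }

        // Called by the application when the panel is taken off screen.
        // Clearing the field makes a second call a harmless no-op.
        public synchronized void release() {
            if (image != null) {
                image.flush();
                image = null;
            }
        }
        // No finalize() override: nothing here runs on a finalizer thread.
    }

    /** Releases the same panel twice and returns how often flush() ran. */
    public static int demo() {
        FakeImage img = new FakeImage();
        IconPanel panel = new IconPanel(img);
        panel.release();
        panel.release();
        return img.flushes;
    }

    public static void main(String[] args) {
        System.out.println("flush() calls: " + demo()); // prints 1
    }
}
```

This does not replace the proposed synchronized fix; it only avoids triggering image disposal from the application's own finalizer.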

Dean



Hinkmond Wong
Re: PP/Qt crashes with SEGV in disposePixmapEntry due to double delete
Posted: Jul 9, 2008 11:48 AM   in response to: xyzzy (Dean)

xyzzy (Dean) wrote:
>> Another clue lies in this code snippet from my own application:
>>
>> protected void finalize() throws Throwable {
>> super.finalize();
>> myImage.flush();
>> }
>>
>> I am doing this in an attempt to reduce resource usage - the object
>> being finalized here is my lightweight component containing an image,
>> and I want to make sure image resources are freed. I guess this is
>> the trigger point for the code above.
>>
>> Any ideas?
>
> I think this is the problem. There are two threads calling
> disposePixmapEntry,
> the JVM's finalizer thread, and the AWT finalizer thread.
> I don't think you need to call flush() here, but I don't see anything
> wrong with it either, so your proposed fix is probably a good idea.
> But I'm not an AWT expert. Hinkmond, do you agree?

I agree with Dean. Since the JVM finalizer and AWT finalizer are on
different threads, there could be contention for the disposePixmapEntry
call which might be the cause of the problem. Even though we might not
be able to inspect and verify the actual failure to confirm this, I also
think the proposed fix is good, plus it doesn't seem to hurt anything.


Hinkmond
