Thread: TopLink Essentials cannot handle large transactions, makes JVM run out of memory.

Replies: 3 - Last Post: Sep 13, 2007 4:31 AM by: hewagn00
Witold Szczerba
TopLink Essentials cannot handle large transactions, makes JVM run out of memory.
Posted: Sep 9, 2007 11:18 AM

Hi there,
I have a problem: something looks wrong with TopLink
Essentials (tested with GFv2RC4).
It cannot handle large transactions. It uses more and more
memory and eventually crashes the JVM with an
OutOfMemoryError.

Here is a simple case to show what the problem is:

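import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class PersonServiceBean implements PersonServiceRemote {

    @PersistenceContext
    private EntityManager em;

    public PersonServiceBean() {
    }

    // With the default settings, the whole method runs in one container-managed transaction.
    public void go() {
        for (long i = 10000000; i > 0; i--) {
            Person person = new Person();
            person.setId(i);
            person.setFirstName("firstName_" + i);
            person.setLastName("lastName_" + i);
            em.persist(person);
            // Every 10,000 records: print progress, write pending inserts, detach entities.
            if (i % 10000 == 0) {
                System.out.format("%,d%n", i);
                em.flush();
                em.clear();
            }
            // Throw on the last record, which rolls back the whole transaction.
            if (i == 1) throw new RuntimeException();
        }
    }
}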

The method "go" never finishes. With 512MB of RAM for GlassFish,
the JVM crashes after processing 1,000,000 records. Keep in mind
that this is just a test case: the Person entity contains only two
string properties and nothing more. I know one can split one big
transaction into smaller ones (for example one tx per 10,000
records), but what if I want all or nothing?

I know it is very rare to use such big transactions, but here is an
example where large transactions are required. At the company where I
work, my co-workers and I are developing a new JavaEE application that
is going to replace an old one written in M$Access. From the very
beginning, we had to develop a parallel application that converts
data from the legacy system (using the ODBC-JDBC bridge) into the
new structures. Because our new JavaEE application uses JPA, we
also use it in the module that transfers data from Access (that
really helps with refactoring: we can see problems in the transfer
project immediately when we change/enhance entity classes).

After many days of "fighting" with TopLink (it was somewhere at the
beginning of this year; at that time I thought it was a memory leak and
posted many emails here), we switched to Hibernate (only in the
transfer project) and it works. I mean, performing a transaction of
any size is not a problem for Hibernate. It just pumps data into the
database and uses almost no extra memory.

Can this be considered a bug? Should I file an issue?

Regards,
Witold Szczerba


hewagn00

Posts: 48
Re: TopLink Essentials cannot handle large transactions, makes JVM run out of memory.
Posted: Sep 10, 2007 7:27 AM   in response to: Witold Szczerba

Hi Witold,

this is a known problem. I have already discussed it with Gordon Yorke. I took a short look at what is happening, but I had no time to investigate in detail. It seems that TopLink creates ChangeSets that reflect the changes within one transaction. The code does not seem to be very memory efficient, especially when you perform multiple flush operations on the EntityManager within one transaction (entire ChangeSets seem to be copied/duplicated). I have found no workaround for this so far. You could use a different persistence provider. According to my findings, OpenJPA took up only 1/10 of the memory of TopLink in my use case.

Regards
Heiko

james_sutherland

Posts: 10
Re: TopLink Essentials cannot handle large transactions, makes JVM run out of memory.
Posted: Sep 11, 2007 12:29 PM   in response to: Witold Szczerba

The issue is that TopLink has a shared cache into which it wants to merge the changes for the transaction. TopLink cannot merge into the shared cache until the final commit of the transaction, so all the object changes are held in memory until then. Technically we should throw away the changes on a clear(), so please log a bug for this. As a workaround, you can create a separate transaction for each batch of objects instead of processing them all in a single transaction. This will also be much more efficient on the database. If you cannot do this, you could potentially work around the issue by going under the covers in TopLink: call beginTransaction() and commitTransaction() on the AbstractSession of the TopLink EntityManager, then use a separate JPA transaction begin()/commit() for each batch. TopLink will hold the transaction open until the final commitTransaction(). (This is somewhat advanced.)
--- James Sutherland
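
The one-transaction-per-batch workaround can be sketched roughly as follows. This is only an illustration in plain Java, not TopLink or GlassFish code: BatchPlanner, Batch, BatchWork, plan and runAll are hypothetical names, and the callback stands in for whatever actually persists one batch in its own transaction (in an EJB container that would typically be a business method marked @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)). Note that all-or-nothing semantics are lost with this approach: batches committed before a failure stay in the database, which is why runAll reports how many batches completed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits ids total..1 into fixed-size batches, highest ids first. */
final class BatchPlanner {

    /** One batch of ids, inclusive on both ends. */
    static final class Batch {
        final long highId;
        final long lowId;

        Batch(long highId, long lowId) {
            this.highId = highId;
            this.lowId = lowId;
        }

        long size() {
            return highId - lowId + 1;
        }
    }

    /** Stand-in for "persist this batch in its own transaction" (an assumption, not a JPA API). */
    interface BatchWork {
        void run(Batch batch) throws Exception;
    }

    /** Plans batches covering ids total down to 1; the last batch may be smaller. */
    static List<Batch> plan(long total, long batchSize) {
        if (total < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("total must be >= 0 and batchSize > 0");
        }
        List<Batch> batches = new ArrayList<>();
        for (long high = total; high > 0; high -= batchSize) {
            long low = Math.max(1, high - batchSize + 1);
            batches.add(new Batch(high, low));
        }
        return batches;
    }

    /** Runs batches in order, stops at the first failure, returns the number that completed. */
    static int runAll(List<Batch> batches, BatchWork work) {
        int completed = 0;
        for (Batch batch : batches) {
            try {
                work.run(batch);
            } catch (Exception e) {
                // Earlier batches are already committed; the caller decides how to recover.
                break;
            }
            completed++;
        }
        return completed;
    }

    public static void main(String[] args) {
        List<Batch> batches = plan(10_000_000L, 10_000L);
        int completed = runAll(batches, batch -> { /* persist ids batch.highId..batch.lowId here */ });
        System.out.println(completed + " of " + batches.size() + " batches completed");
    }
}
```

With the numbers from the original post, plan(10000000, 10000) yields 1,000 batches of 10,000 ids each, so no single transaction ever has to hold more than 10,000 pending changes.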

hewagn00

Posts: 48
Re: TopLink Essentials cannot handle large transactions, makes JVM run out of memory.
Posted: Sep 13, 2007 4:31 AM   in response to: james_sutherland

I would also greatly appreciate a feature in TopLink that cleans up/prunes the ChangeSets after an EntityManager.flush() is performed. My application's semantics require a large number of objects to be created in one transaction. I have to flush several times during that process to get reasonable progress information/timing behavior. Contrary to what one might assume (that a flush writes all pending changes and clears the "changes buffer"), it does not reduce memory consumption; it actually increases it dramatically. When I perform no flush, memory consumption drops drastically, but I get no progress information, since everything happens at transaction commit.

Regards
Heiko



