Replies: 3 - Last Post: Sep 13, 2007 4:31 AM by: hewagn00
TopLink Essentials cannot handle large transactions, makes JVM run out of memory.
Posted: Sep 9, 2007 11:18 AM
Hi there, I have a problem: something looks wrong with TopLink Essentials (tested with GFv2RC4). It cannot handle large transactions, because it uses more and more memory and eventually crashes the JVM with an OutOfMemoryError.
Here is a simple case to show what the problem is:
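    import javax.ejb.Stateless;
    import javax.persistence.EntityManager;
    import javax.persistence.PersistenceContext;

    @Stateless
    public class PersonServiceBean implements PersonServiceRemote {

        @PersistenceContext
        private EntityManager em;

        public PersonServiceBean() { }

        public void go() {
            for (long i = 10000000; i > 0; i--) {
                Person person = new Person();
                person.setId(i);
                person.setFirstName("firstName_" + i);
                person.setLastName("lastName_" + i);
                em.persist(person);
                // Every 10,000 records: print progress, then flush and clear.
                if (i % 10000 == 0) {
                    System.out.format("%,d%n", i);
                    em.flush();
                    em.clear();
                }
                // Throw on the last record, so nothing should be committed.
                if (i == 1) throw new RuntimeException();
            }
        }
    }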
The method "go" never finishes. With 512MB of RAM for Glassfish, the JVM crashes after processing 1,000,000 records. Keep in mind this is just a test case: the Person entity contains only two string properties and nothing more. I know one can split one big transaction into smaller ones (for example, one tx per 10,000 records), but what if I want all or nothing?
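To see how memory grows between batches, the progress line in the test case could also report the heap in use. The following is only a minimal sketch: HeapProbe, usedHeapBytes and progressLine are hypothetical names, not part of the test case above, and the value reported by Runtime is an approximation that includes garbage not yet collected.

```java
import java.util.Locale;

// Hypothetical helper (not part of the original test case): reports
// approximate heap usage next to the progress counter.
class HeapProbe {

    // Approximate bytes currently in use on the heap. This includes
    // objects that are garbage but have not been collected yet.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Formats one progress line from the loop counter and a heap reading.
    static String progressLine(long recordsLeft, long usedBytes) {
        long usedMb = usedBytes / (1024 * 1024);
        return String.format(Locale.US,
                "%,d records left, %,d MB heap in use", recordsLeft, usedMb);
    }

    public static void main(String[] args) {
        // In the test case this call would sit inside the
        // "if (i % 10000 == 0)" branch, with i as the first argument.
        System.out.println(progressLine(10000, usedHeapBytes()));
    }
}
```

If the number keeps climbing after each flush()/clear() pair, the memory is being retained by the persistence provider and not by the application code.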
I know it is very rare to use such big transactions, but here is an example where large transactions are required: at the company where I work, my co-workers and I are developing a new JavaEE application that is going to replace an old one written in M$Access. From the very beginning, we had to develop a parallel application that converts data from the legacy system (using the ODBC-JDBC bridge) into the new structures. Because we use JPA in our new JavaEE application, we also use it in the module that transfers data from Access (that really helps with refactoring: we see problems in the transfer project immediately when we change or enhance entity classes).
After many days of "fighting" with TopLink (it was somewhere at the beginning of this year; at that time I thought it was a memory leak, and I posted many emails here), we switched to Hibernate (only in the transfer project) and it works. I mean, it is not a problem for Hibernate to perform a transaction of any size. It just pumps data into the database and uses almost no extra memory.
Can this be considered a bug? Should I file an issue?
Regards, Witold Szczerba
Re: TopLink Essentials cannot handle large transactions, makes JVM run out of memory.
Posted: Sep 10, 2007 7:27 AM
in response to: Witold Szczerba
Hi Witold,
this is a known problem. I already discussed it with Gordon Yorke. I took a short look into what is happening, but I had no time to investigate in detail. It seems that TopLink creates ChangeSets that reflect the changes within one transaction. The code does not seem to be very memory efficient, especially when you perform multiple flush operations on the EntityManager within one transaction (entire ChangeSets seem to be copied/duplicated). I have found no workaround for this so far. You could use a different persistence provider. According to my findings, OpenJPA took up only 1/10 of the memory of TopLink in my use case.
Regards Heiko
Re: TopLink Essentials cannot handle large transactions, makes JVM run out of memory.
Posted: Sep 11, 2007 12:29 PM
in response to: Witold Szczerba
The issue is that TopLink has a shared cache that it wants to merge the changes for the transaction into. TopLink cannot merge into the shared cache until the final commit of the transaction, so all the object changes are held in memory until the final commit. Technically we should throw away the changes on a clear(), so please log a bug for this.
As a workaround you can create a separate transaction for each batch of objects, instead of processing them all in a single transaction. This will also be much more efficient on the database.
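A rough sketch of the batching part of this workaround, in plain Java so it runs without a container. BatchedWriter, runInBatches and the perBatchTx callback are hypothetical names; the callback stands in for whatever opens and commits one transaction per batch (in an EJB this might be a business method marked @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) and invoked through the bean's business interface, but that wiring is an assumption and is not shown here).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: splits the ids 1..total into batches and hands
// each batch to a callback that is expected to run in its own transaction.
class BatchedWriter {

    // Returns the number of batches handed to perBatchTx.
    static int runInBatches(long total, int batchSize,
                            Consumer<List<Long>> perBatchTx) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<Long> batch = new ArrayList<>(batchSize);
        for (long id = 1; id <= total; id++) {
            batch.add(id);
            // Hand over a full batch, or the final partial one.
            if (batch.size() == batchSize || id == total) {
                perBatchTx.accept(batch);
                batches++;
                batch = new ArrayList<>(batchSize);
            }
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-in callback: a real one would persist the entities for
        // these ids and commit before returning.
        int n = runInBatches(25, 10,
                ids -> System.out.println("committing " + ids.size() + " records"));
        System.out.println(n + " batches");
    }
}
```

Note that this gives up the all-or-nothing behavior: batches committed before a failure stay in the database, so the caller needs its own cleanup or restart logic.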
If you cannot do this, you could potentially work around the issue by going under the covers in TopLink: call beginTransaction() and commitTransaction() on the TopLink EntityManager's AbstractSession, then use a separate JPA transaction begin()/commit() for each batch. TopLink will hold the transaction open until the final commitTransaction(). (This is somewhat advanced, though.)
---
James Sutherland
Re: TopLink Essentials cannot handle large transactions, makes JVM run out of memory.
Posted: Sep 13, 2007 4:31 AM
in response to: james_sutherland
I would also greatly appreciate a feature in TopLink that cleans up or prunes the ChangeSets after an EntityManager.flush() is performed. In my application's semantics, a large number of objects must be created in one transaction. I have to flush several times during that process to get reasonable progress information and timing behavior. Contrary to what one might assume (that a flush writes all pending changes and clears the "changes buffer"), flushing does not reduce memory consumption; it actually increases it dramatically. When I perform no flush, memory consumption drops drastically, but I get no information about progress, since everything happens at transaction commit.
Regards Heiko