Replies: 6 - Last Post: Sep 21, 2005 8:44 AM by: t_rex
N medium-size heaps instead of only one huge heap.
Posted: Sep 13, 2005 8:28 AM
Hi, are there plans to use several medium-size heaps rather than a single huge heap?
For large and very large applications (more than 2 GB of heap) running on large SMP machines, this would allow several GCs to run on smaller heaps, reducing pauses and making better use of the CPUs.
Java classes could allocate objects with different "life expectancies" in different heaps. Objects that will live forever could then go in a different heap from the one used for short-lived and very short-lived objects.
This could also relieve the bottleneck of many threads trying to create new objects in the same heap at the same time.
What is your opinion?
Tony
Re: N medium-size heaps instead of only one huge heap.
Posted: Sep 13, 2005 9:08 AM, in response to: t_rex
The heap is currently divided into young, old and perm spaces. Young space is further divided into Eden and two survivor spaces. There are flags to set the size of each of these areas, either as a percentage of the total or as an absolute value. In 1.5 you can also specify percentage-free values that help the JVM decide how big each of the spaces should be. There are also adaptive policies that can be specified.
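The layout described above can be inspected from inside a running JVM. The following is a minimal sketch using the standard java.lang.management API; the class name HeapPools and the method describeHeapPools are made up for this illustration, and the pool names printed depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any library.
class HeapPools {

    /** Returns one descriptive line per heap memory pool the running JVM reports. */
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when the maximum is undefined.
            lines.add(pool.getName() + ": used=" + usage.getUsed()
                    + " bytes, max=" + usage.getMax() + " bytes");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeHeapPools()) {
            System.out.println(line);
        }
    }
}
```

On a generational collector the list typically contains an Eden pool, a survivor pool and an old (tenured) pool, but the exact names are up to the JVM.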
Splitting up the heap is not free, though, because it becomes almost impossible to predict how objects are connected across spaces. For example, it is quite possible to have an object in Eden connected to one in old space, which in turn connects to another one in Eden. When you do a GC, you have to consider the consequences of finding references in one space that point into another. This is one of the reasons that GC is so slow in old space: the GC step must consider young space as well (hence full GC).
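One common way for a collector to cope with references that cross spaces is a write barrier that feeds a "remembered set". The following is only a toy model of that idea, written to mirror the Eden -> old -> Eden example above. The class ToyGenerationalHeap and all of its members are invented for this sketch; it is not how any particular JVM is implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: every name here is hypothetical.
class ToyGenerationalHeap {

    /** A toy object that lives in one space and references other toy objects. */
    static final class Obj {
        final String name;
        final boolean inOld;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name, boolean inOld) {
            this.name = name;
            this.inOld = inOld;
        }
    }

    // Old-space objects that have been given a reference into young space.
    private final Set<Obj> rememberedSet = new LinkedHashSet<>();

    /** Write barrier: every reference store goes through here. */
    void writeRef(Obj from, Obj to) {
        from.refs.add(to);
        if (from.inOld && !to.inOld) {
            rememberedSet.add(from); // record the old -> young edge
        }
    }

    /** Finds live young objects without scanning the whole of old space. */
    Set<Obj> liveYoung(List<Obj> roots) {
        Deque<Obj> work = new ArrayDeque<>();
        for (Obj root : roots) {
            if (!root.inOld) {
                work.push(root);
            }
        }
        // Remembered old objects are treated as extra roots, even if they are
        // themselves garbage: a young-only pass cannot tell.
        for (Obj old : rememberedSet) {
            for (Obj target : old.refs) {
                if (!target.inOld) {
                    work.push(target);
                }
            }
        }
        Set<Obj> live = new LinkedHashSet<>();
        while (!work.isEmpty()) {
            Obj current = work.pop();
            if (!live.add(current)) {
                continue; // already visited
            }
            for (Obj target : current.refs) {
                if (!target.inOld) {
                    work.push(target); // stay inside young space
                }
            }
        }
        return live;
    }

    /** Young A -> old B -> young C, plus an unreachable young D. */
    static Set<String> demo() {
        ToyGenerationalHeap heap = new ToyGenerationalHeap();
        Obj a = new Obj("A", false);
        Obj b = new Obj("B", true);
        Obj c = new Obj("C", false);
        Obj d = new Obj("D", false); // never referenced, so it is garbage
        heap.writeRef(a, b);
        heap.writeRef(b, c);
        List<Obj> roots = new ArrayList<>();
        roots.add(a);
        Set<String> names = new TreeSet<>();
        for (Obj o : heap.liveYoung(roots)) {
            names.add(o.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A, C]
    }
}
```

The bookkeeping grows with every additional space: with N heaps, each pair of heaps potentially needs its own record of cross-references, which is the cost the paragraph above is pointing at.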
This is just the beginning of the story, and it is far from finished. There is still a lot of research going on in the area of GC. Sun has taken one track; JRockit (BEA) has taken a similar yet different track, as has IBM. It will be interesting to see who wins.
Anyway, this is about as much as I'm willing to write on the subject in this forum at this time. There are some excellent J1 presentations available, as well as some wonderful talks at BeJUG from the JRockit guys on how their GC works. I suggest that you take a look at them.
Re: N medium-size heaps instead of only one huge heap.
Posted: Sep 13, 2005 9:43 AM, in response to: kcpeppe
Thanks for the explanations! I'm aware of the GC optimizations done by JVMs.
I'm talking about a way in the Java language to use different heaps, just as there is a way to use different threads.
Why not imagine different parts of a big application using different heaps? Would that open up ways to improve performance?
Also, parts of an application could allocate objects that are known to be "alive forever" directly in a heap reserved for VERY long-lived objects, where a GC is run only once a day (or never?). This would avoid moving these objects several times from the young zone to the old zone. It would also remove these objects from the list of objects that are screened by the GC, speeding up its work.
This would add some complexity to Java programming, since developers would have to use different heaps depending on the lifetime of their objects. But it would certainly improve performance. The "naive" way threads were used in the first Java version has evolved in 1.5. Why not do the same for the heap in future versions? A clean API or language add-on would have to be defined.
Yes, as you said, there will be links between the different heaps, leading to a complex GC. But, as you also said, this already occurs, since there are links between the young and old spaces. So maybe this can be addressed without too much pain.
Tony
Re: N medium-size heaps instead of only one huge heap.
Posted: Sep 13, 2005 10:08 AM, in response to: t_rex
> I'm talking about a way in Java language to use different heaps like there is a way to use different threads.
Again, there have been some products that accidentally provide this type of behavior. For example, GemStone/J, an application server, provided an orthogonal persistence mechanism. Using that product, you'd get a persistent root and then connect your object to it. The most common roots were collections. Once connected, the object would be flushed or sucked up by the other cache when the application reached a transactional boundary. I believe that the same technique is being used in their GemFire product. There are other products around that may offer the same type of heap partitioning in the name of persistence, clustering or some other related activity.
So what this suggests is that there is some validity to your fully explained idea. However, beyond perm space I'd be hesitant to give developers the ability to pin objects into one heap space over another. The reason is that they are not so good at managing memory as it is, and this is one more implement they could use to harm themselves. Certainly I would not like to see this functionality in the language proper. Maybe in the JVM and libraries, but I think only for special cases. Or (thinking out loud) as part of the classloader. Classloading is dynamically scoped and thus does impose some memory access constraints. It is possible that these constraints could be used to partition heap space so that the GC would work on a per-classloader basis. I don't know! Certainly I'd rather have Peter Hagger and crowd do that work for me, that is for sure 8^)
Re: N medium-size heaps instead of only one huge heap.
Posted: Sep 13, 2005 10:55 AM, in response to: t_rex
AFAIK, the reason there haven't been multiple heaps so far is the architectural design of the heap space, which depends on a single, global referencing hash map for all the objects within the system.
So one can say that the reason for having only a single heap so far is simplicity of design, and hence ease of programming (read: easier to maintain).
Looking over your arguments for the advantages of having multiple heap spaces:
1) The heap space is already divided into various regions (as mentioned by others in the earlier posts) to accommodate objects with various "life expectancies".
2) At first sight it may seem that multiple GCs can be run simultaneously on multiple heap spaces; but since we expect objects to reference others in separate heap spaces, this cannot be expected to be simple (higher algorithmic complexity, and thus run-time and memory overheads), and it would not offer as much concurrency.
3) Being able to create new objects concurrently seems to be the only real benefit left; however, the added complexity of resolving references across heap spaces might offset the gains.
Just my 2 cents.
Re: N medium-size heaps instead of only one huge heap.
Posted: Sep 13, 2005 11:24 AM, in response to: alexlamsl
> AFAIK, the reason why there hasn't been multiple heaps so far is because of the architectural design of the heap space - which depends on a single, global referencing hash-map for all the objects within the System.
Is this really the case? It is my understanding (not having looked at the source code) that each space has its own OOP table.
> 1) The heap space is divided into various regions (as mentioned by others in the earlier posts) to accommodate for objects with various "life expectancies"
And these are mostly unpredictable.
> 2) At first sight it may seem that multiple GC can be run simultaneously upon multiple heap spaces; but since we are expecting objects referencing to others in separate heap spaces this cannot be expected to be simple (hence higher algorithm complexity thus run-time & memory overheads) with not as much concurrency.
Bingo. Young GC doesn't expect references in old and therefore doesn't check for them, IIRC. In this case, old garbage is left behind. However, GC in old must also GC young. If there are more spaces in the equation, things get much more complex. Oh yeah, I forgot: it is possible that objects in old will get pulled back into young space. My, it does get messy, doesn't it?
> 3) Being able to create new objects concurrently seems to be the only real benefit
Hmm, my micro-benchmarking would suggest that creating objects is cheap, so this gain is dubious at best. Collecting them, however, is expensive, so that looks like the side of the problem that needs to be tackled. "Big gains first."
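A claim like "creating objects is cheap" can be checked roughly with a measurement along the following lines. This is a crude sketch, not the benchmark the poster ran: the class AllocationTiming, the method timeAllocations, the 64-byte block size and the thread counts are all assumptions chosen for illustration, and JIT warm-up, thread start-up cost and escape analysis can easily distort the numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical timing harness, for illustration only.
class AllocationTiming {

    // Results are written here so the JIT is less likely to discard the work.
    static volatile long sink;

    /**
     * Allocates allocationsPerThread small arrays on each of threadCount
     * threads and returns the elapsed wall-clock time in nanoseconds.
     */
    static long timeAllocations(int threadCount, final int allocationsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int t = 0; t < threadCount; t++) {
                tasks.add(() -> {
                    long checksum = 0;
                    for (int i = 0; i < allocationsPerThread; i++) {
                        byte[] block = new byte[64]; // the allocation being timed
                        block[0] = (byte) i;
                        checksum += block[0];
                    }
                    return checksum;
                });
            }
            long start = System.nanoTime();
            long total = 0;
            // invokeAll blocks until every task has finished.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            long elapsed = System.nanoTime() - start;
            sink = total;
            return elapsed;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("allocation run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int allocationsPerThread = 1_000_000;
        for (int threads : new int[] {1, 2, 4, 8}) {
            long nanos = timeAllocations(threads, allocationsPerThread);
            double perAllocation = (double) nanos / ((long) threads * allocationsPerThread);
            System.out.println(threads + " threads: " + perAllocation + " ns per allocation");
        }
    }
}
```

Comparing the per-allocation figure as the thread count rises past the number of processors gives a first impression of whether allocation itself becomes a bottleneck on a given machine and JVM.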
Re: N medium-size heaps instead of only one huge heap.
Posted: Sep 21, 2005 8:44 AM, in response to: kcpeppe
> humm, my micro-benchmarking would suggest that creating objects is cheap so this gain is dubious at best. However collecting them is expensive so this looks like the side of the problem that needs to be tackled.
How many concurrent threads, on how many processors, did you use for your micro-benchmarking? What if 500 active Java threads are running on a machine with 2 or 4 processors? This can happen with J2EE applications that have many clients.
(I designed and built a Smalltalk-80-like object-oriented language 17 years ago, but it seems the complexity of GCs has reached a level I have difficulty imagining!)