The Source for Java Technology Collaboration

Home » java.net Forums » GlassFish » GlassFish

Thread: Debugging Glassfish CPU usage

Welcome, Guest Help
Login Login
Guest Settings Guest Settings
This question is answered. Helpful answers available: 2. Correct answers available: 1.

Reply to this Thread Reply to this Thread Search Forum Search Forum Back to Thread List Back to Thread List

Permlink Replies: 44 - Last Post: Apr 10, 2009 8:05 AM by: rwillie6
rwillie6

Posts: 78
Debugging Glassfish CPU usage
Posted: Mar 23, 2009 5:53 PM
 
  Click to reply to this thread Reply

I'm trying to determine the cause of elevated CPU and Interrupt / Context-switch activity on our application server that occurs at non-peak traffic times and at times causes the application server to become non-responsive. Munin graphs of the CPU usage and Interrupts / Context-Switches are attached.

These spikes are odd because they occur during non-peak times and do not resolve themselves. So far, the only solution I have found is to restart glassfish, which usually makes the problem go away for 8-24 hours. Sometimes, these spikes do not affect load times, but sometimes the spike is more severe and the application server becomes non-responsive.

I have confirmed during these times that there are no other processes running that are using a significant amount of CPU; i.e. the CPU spikes are glassfish CPU spikes. IO activity is no higher than usual during there periods. Munin memory graphs showed no differences between problematic and non-problematic times.

The app is J2EE using JPA + Hibernate, no EJBs.

I'm looking for ideas / advice on tracking down the root cause. Thanks!

System:
- x-large high-cpu Amazon EC2 instance (7.5GB of memory, 8 cpus, 20 amazon "compute units")
- ubuntu 8.04, 64-bit
- dedicated to glassfish

Notable JVM options:
<jvm-options>-server</jvm-options>
<jvm-options>-Xmx2500m</jvm-options>
<jvm-options>-Xms2500m</jvm-options>
<jvm-options>-Xmn1000m</jvm-options>
<jvm-options>-Xss128k</jvm-options>
<jvm-options>-XX:+AggressiveOpts</jvm-options>
<jvm-options>-XX:+AggressiveHeap</jvm-options>
<jvm-options>-XX:+DisableExplicitGC</jvm-options>
<jvm-options>-XX:+UseParallelGC</jvm-options>
<jvm-options>-XX:+UseParallelOldGC</jvm-options>
<jvm-options>-XX:ParallelGCThreads=10</jvm-options>
<jvm-options>-Dcom.sun.enterprise.server.ss.ASQuickStartup=false</jvm-options>
<jvm-options>-XX:MaxPermSize=192m</jvm-options>
<jvm-options>-XX:NewRatio=2</jvm-options>

Notable Glassfish Settings:
<http-listener acceptor-threads="5" address="0.0.0.0" blocking-enabled="false" default-virtual-server="server" enabled="true" family="inet" id="http-listener-1" port="80" security-enabled="false" server-name="" xpowered-by="false"><property name="compression" value="on"/></http-listener>
<request-processing header-buffer-length-in-bytes="8192" initial-thread-count="50" request-timeout-in-seconds="60" thread-count="250" thread-increment="10"/>
<keep-alive max-connections="256" thread-count="5" timeout-in-seconds="30"/>
<connection-pool max-pending-count="4096" queue-size-in-bytes="4096" receive-buffer-size-in-bytes="4096" send-buffer-size-in-bytes="8192"/>
<http-file-cache file-caching-enabled="true" file-transmission-enabled="false" globally-enabled="true" hash-init-size="0" max-age-in-seconds="60" max-files-count="1024" medium-file-size-limit-in-bytes="537600" medium-file-space-in-bytes="10485760" small-file-size-limit-in-bytes="2048" small-file-space-in-bytes="1048576"/>

km

Posts: 681
Re: Debugging Glassfish CPU usage
Posted: Mar 23, 2009 7:32 PM   in response to: rwillie6
 
  Click to reply to this thread Reply

Thanks for all the data. One thing I'd like to suggest is getting the Java Thread Dump (send a
kill -3 to JVM process) in order to get the same (it goes to domain/logs/jvm.log), when you
experience the CPU spikes.

The number of Grizzly threads seem to be slightly higher than usual. With 8 cores, your max
number of Grizzly threads go to 250, which is rather high. That's just an observation.

It looks to me that you can consider sending "-d64" instead of "-server".

Also, what is the average time per HTTP request that is spent in the database tier?

Some other data like pfiles would help too. But a number of stack dumps are required at a minimum.

Regards,
Kedar

rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 3:26 AM   in response to: km
 
  Click to reply to this thread Reply

Hi Kedar,

I've taken a couple of thread dumps with jstack, are those the same as would be created with a kill -3? If so, see attached.

I've knocked the threads down to 200, would that still be rather high? Also, I noticed an error in the jvm-options above; all are correct except we're running -Xmx3000m and -Xms3000m, instead of 2500 as I previously listed.

Also, I thought that -d64 only applied to Solaris. If I'm incorrect, what are the benefits of -d64 over -server?

As for average time in the database tier, we don't calculate that, but from watching mytop (filtering inactive) while a spike was occurring I would guesstimate it at or under 1s.

I'm unfamiliar with pfiles, could you explain a bit more there?

Thanks!

Jeanfrancois Ar...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 3:07 AM   in response to: rwillie6
  Click to reply to this thread Reply

Salut,

glassfish@javadesktop.org wrote:
> I'm trying to determine the cause of elevated CPU and Interrupt / Context-switch activity on our application server that occurs at non-peak traffic times and at times causes the application server to become non-responsive. Munin graphs of the CPU usage and Interrupts / Context-Switches are attached.
>
> These spikes are odd because they occur during non-peak times and do not resolve themselves. So far, the only solution I have found is to restart glassfish, which usually makes the problem go away for 8-24 hours. Sometimes, these spikes do not affect load times, but sometimes the spike is more severe and the application server becomes non-responsive.
>
> I have confirmed during these times that there are no other processes running that are using a significant amount of CPU; i.e. the CPU spikes are glassfish CPU spikes. IO activity is no higher than usual during there periods. Munin memory graphs showed no differences between problematic and non-problematic times.
>
> The app is J2EE using JPA + Hibernate, no EJBs.

any web related call? It looks like a possible JDK issue IMO (which JDK
are you using?). When it peak, can you grab a thread dump? I think the
NIO Selector might get into a spinning state...we are currently testing
a fix for JDK 7 b51 and if that work it will be backported to 6 urX.
Before I share a patch, I just want to make sure this is the issue.

Thanks

-- Jeanfrancois


>
> I'm looking for ideas / advice on tracking down the root cause. Thanks!
>
> System:
> - x-large high-cpu Amazon EC2 instance (7.5GB of memory, 8 cpus, 20 amazon "compute units")
> - ubuntu 8.04, 64-bit
> - dedicated to glassfish
>
> Notable JVM options:
> <jvm-options>-server</jvm-options>
> <jvm-options>-Xmx2500m</jvm-options>
> <jvm-options>-Xms2500m</jvm-options>
> <jvm-options>-Xmn1000m</jvm-options>
> <jvm-options>-Xss128k</jvm-options>
> <jvm-options>-XX:+AggressiveOpts</jvm-options>
> <jvm-options>-XX:+AggressiveHeap</jvm-options>
> <jvm-options>-XX:+DisableExplicitGC</jvm-options>
> <jvm-options>-XX:+UseParallelGC</jvm-options>
> <jvm-options>-XX:+UseParallelOldGC</jvm-options>
> <jvm-options>-XX:ParallelGCThreads=10</jvm-options>
> <jvm-options>-Dcom.sun.enterprise.server.ss.ASQuickStartup=false</jvm-options>
> <jvm-options>-XX:MaxPermSize=192m</jvm-options>
> <jvm-options>-XX:NewRatio=2</jvm-options>
>
> Notable Glassfish Settings:
> <http-listener acceptor-threads="5" address="0.0.0.0" blocking-enabled="false" default-virtual-server="server" enabled="true" family="inet" id="http-listener-1" port="80" security-enabled="false" server-name="" xpowered-by="false"><property name="compression" value="on"/></http-listener>
> <request-processing header-buffer-length-in-bytes="8192" initial-thread-count="50" request-timeout-in-seconds="60" thread-count="250" thread-increment="10"/>
> <keep-alive max-connections="256" thread-count="5" timeout-in-seconds="30"/>
> <connection-pool max-pending-count="4096" queue-size-in-bytes="4096" receive-buffer-size-in-bytes="4096" send-buffer-size-in-bytes="8192"/>
> <http-file-cache file-caching-enabled="true" file-transmission-enabled="false" globally-enabled="true" hash-init-size="0" max-age-in-seconds="60" max-files-count="1024" medium-file-size-limit-in-bytes="537600" medium-file-space-in-bytes="10485760" small-file-size-limit-in-bytes="2048" small-file-space-in-bytes="1048576"/>
> [Message sent by forum member 'rwillie6' (rwillie6)]
>
> http://forums.java.net/jive/thread.jspa?messageID=338592
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 3:37 AM   in response to: Jeanfrancois Ar...
 
  Click to reply to this thread Reply

Hi Jeanfrancois,

Could you clarify what you mean by "any web related call?"? In case it helps, we have anywhere from 500 - 900 users using the site during the "off-peak" times I mentioned before, compared to our "peak" times which range can reach 1600 concurrent users.

The Java version on the server is 1.6.0_07-b06. Though, it should be noted that our team runs Apple computers, so the development environment Java version is 1.5.0_16-b06-284. Even though the server is 64-bit, we are running the 32-bit jvm in mixed mode on the recommendation of Scott Oaks, who advised me when configuring glassfish last fall.

For thread dumps, see my reply to Kedar, above. I attached two jstack dumps, one from yesterday and one that just occurred. I have another from 2 days ago, but it was too large a file size to attach.

Thanks!

Jeanfrancois Ar...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 3:46 AM   in response to: rwillie6
  Click to reply to this thread Reply

Salut,

glassfish@javadesktop.org wrote:
> Hi Jeanfrancois,
>
> Could you clarify what you mean by "any web related call?"? In case it helps, we have anywhere from 500 - 900 users using the site during the "off-peak" times I mentioned before, compared to our "peak" times which range can reach 1600 concurrent users.
>
> The Java version on the server is 1.6.0_07-b06. Though, it should be noted that our team runs Apple computers, so the development environment Java version is 1.5.0_16-b06-284. Even though the server is 64-bit, we are running the 32-bit jvm in mixed mode on the recommendation of Scott Oaks, who advised me when configuring glassfish last fall.
>
> For thread dumps, see my reply to Kedar, above. I attached two jstack dumps, one from yesterday and one that just occurred. I have another from 2 days ago, but it was too large a file size to attach.
>

I've missed those. The issue might be

> "httpWorkerThread-80-197" daemon prio=10 tid=0x1d0a5000 nid=0xe0e waiting for monitor entry [0x04f72000..0x04f72dc0]
> java.lang.Thread.State: BLOCKED (on object monitor)
> at com.sun.enterprise.resource.AbstractResourcePool.transactionCompleted(AbstractResourcePool.java:1176)
> - waiting to lock <0x3bf7c598> (a com.sun.enterprise.resource.SJSASResourcePool)
> at com.sun.enterprise.resource.PoolManagerImpl.transactionCompleted(PoolManagerImpl.java:393)
> at com.sun.enterprise.resource.PoolManagerImpl$SynchronizationListener.afterCompletion(PoolManagerImpl.java:540)
> at com.sun.enterprise.distributedtx.J2EETransaction.commit(J2EETransaction.java:491)
> at com.sun.enterprise.distributedtx.J2EETransactionManagerOpt.commit(J2EETransactionManagerOpt.java:371)
> at com.sun.enterprise.distributedtx.UserTransactionImpl.commit(UserTransactionImpl.java:197)
> at org.j2free.jpa.Controller.endTransaction(Controller.java:158)
> at org.j2free.servlet.filter.InvokerFilter.doFilter(InvokerFilter.java:151)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:230)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:198)
> at org.j2free.servlet.filter.ErrorLogFilter.doFilter(ErrorLogFilter.java:103)
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:230)
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:198)
> at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:288)
> at org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:271)
> at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:202)
> at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:632)
> at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:577)
> at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:94)

but I also see in the second log:

> "httpWorkerThread-80-229" daemon prio=10 tid=0x08eccc00 nid=0x5833 runnable [0x35b59000..0x35b5b140]
> java.lang.Thread.State: RUNNABLE
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
> - locked <0xbad6e370> (a java.io.BufferedInputStream)
> at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2452)
> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2906)
> at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2895)
> at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3438)
> at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1951)
> at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2101)
> at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2554)
> - locked <0xbad6c648> (a java.lang.Object)
> at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1761)
> at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1912)
> - locked <0xbad6c648> (a java.lang.Object)
> at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:186)
> at org.hibernate.loader.Loader.getResultSet(Loader.java:1787)
> at org.hibernate.loader.Loader.doQuery(Loader.java:674)
> at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:236)
> at org.hibernate.loader.Loader.doList(Loader.java:2213)
> at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2104)
> at org.hibernate.loader.Loader.list(Loader.java:2099)
> at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:378)
> at org.hibernate.hql.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:338)
> at org.hibernate.engine.query.HQLQueryPlan.performList(HQLQueryPlan.java:172)
> at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1121)
> at org.hibernate.impl.QueryImpl.list(QueryImpl.java:79)
> at org.hibernate.ejb.QueryImpl.getSingleResult(QueryImpl.java:81)
> at org.j2free.jpa.Controller.query(Controller.java:653)
> at org.j2free.jpa.Controller.query(Controller.java:661)
> at com.tuneshero.servlet.ajax.FlexRequests.doGet(FlexRequests.java:309)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:718)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:831)

Jagadish, do you know if this could be the issue or not? But for sure
all Grizzly WorkerThread may gets locked by such call.

A+

-- Jeanfrancois


> Thanks!
> [Message sent by forum member 'rwillie6' (rwillie6)]
>
> http://forums.java.net/jive/thread.jspa?messageID=338656
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


Jagadish Prasat...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 4:01 AM   in response to: Jeanfrancois Ar...
  Click to reply to this thread Reply

@rwillie6 : can you post your connection pool configuration ?

Also, you can set connection-validation to "table" instead of
"auto-commit" and let us know the results.
http://blogs.sun.com/JagadishPrasath/entry/connection_validation_in_glassfish_jdbc

Thanks,
-Jagadish


On Tue, 2009-03-24 at 11:46 +0100, Jeanfrancois Arcand wrote:
> Salut,
>
> glassfish@javadesktop.org wrote:
> > Hi Jeanfrancois,
> >
> > Could you clarify what you mean by "any web related call?"? In case it helps, we have anywhere from 500 - 900 users using the site during the "off-peak" times I mentioned before, compared to our "peak" times which range can reach 1600 concurrent users.
> >
> > The Java version on the server is 1.6.0_07-b06. Though, it should be noted that our team runs Apple computers, so the development environment Java version is 1.5.0_16-b06-284. Even though the server is 64-bit, we are running the 32-bit jvm in mixed mode on the recommendation of Scott Oaks, who advised me when configuring glassfish last fall.
> >
> > For thread dumps, see my reply to Kedar, above. I attached two jstack dumps, one from yesterday and one that just occurred. I have another from 2 days ago, but it was too large a file size to attach.
> >
>
> I've missed those. The issue might be
>
> > "httpWorkerThread-80-197" daemon prio=10 tid=0x1d0a5000 nid=0xe0e waiting for monitor entry [0x04f72000..0x04f72dc0]
> > java.lang.Thread.State: BLOCKED (on object monitor)
> > at com.sun.enterprise.resource.AbstractResourcePool.transactionCompleted(AbstractResourcePool.java:1176)
> > - waiting to lock <0x3bf7c598> (a com.sun.enterprise.resource.SJSASResourcePool)
> > at com.sun.enterprise.resource.PoolManagerImpl.transactionCompleted(PoolManagerImpl.java:393)
> > at com.sun.enterprise.resource.PoolManagerImpl$SynchronizationListener.afterCompletion(PoolManagerImpl.java:540)
> > at com.sun.enterprise.distributedtx.J2EETransaction.commit(J2EETransaction.java:491)
> > at com.sun.enterprise.distributedtx.J2EETransactionManagerOpt.commit(J2EETransactionManagerOpt.java:371)
> > at com.sun.enterprise.distributedtx.UserTransactionImpl.commit(UserTransactionImpl.java:197)
> > at org.j2free.jpa.Controller.endTransaction(Controller.java:158)
> > at org.j2free.servlet.filter.InvokerFilter.doFilter(InvokerFilter.java:151)
> > at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:230)
> > at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:198)
> > at org.j2free.servlet.filter.ErrorLogFilter.doFilter(ErrorLogFilter.java:103)
> > at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:230)
> > at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:198)
> > at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:288)
> > at org.apache.catalina.core.StandardContextValve.invokeInternal(StandardContextValve.java:271)
> > at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:202)
> > at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:632)
> > at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:577)
> > at com.sun.enterprise.web.WebPipeline.invoke(WebPipeline.java:94)
>
> but I also see in the second log:
>
> > "httpWorkerThread-80-229" daemon prio=10 tid=0x08eccc00 nid=0x5833 runnable [0x35b59000..0x35b5b140]
> > java.lang.Thread.State: RUNNABLE
> > at java.net.SocketInputStream.socketRead0(Native Method)
> > at java.net.SocketInputStream.read(SocketInputStream.java:129)
> > at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> > at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
> > at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
> > - locked <0xbad6e370> (a java.io.BufferedInputStream)
> > at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2452)
> > at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2906)
> > at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2895)
> > at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3438)
> > at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1951)
> > at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2101)
> > at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2554)
> > - locked <0xbad6c648> (a java.lang.Object)
> > at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1761)
> > at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1912)
> > - locked <0xbad6c648> (a java.lang.Object)
> > at org.hibernate.jdbc.AbstractBatcher.getResultSet(AbstractBatcher.java:186)
> > at org.hibernate.loader.Loader.getResultSet(Loader.java:1787)
> > at org.hibernate.loader.Loader.doQuery(Loader.java:674)
> > at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:236)
> > at org.hibernate.loader.Loader.doList(Loader.java:2213)
> > at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2104)
> > at org.hibernate.loader.Loader.list(Loader.java:2099)
> > at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:378)
> > at org.hibernate.hql.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:338)
> > at org.hibernate.engine.query.HQLQueryPlan.performList(HQLQueryPlan.java:172)
> > at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1121)
> > at org.hibernate.impl.QueryImpl.list(QueryImpl.java:79)
> > at org.hibernate.ejb.QueryImpl.getSingleResult(QueryImpl.java:81)
> > at org.j2free.jpa.Controller.query(Controller.java:653)
> > at org.j2free.jpa.Controller.query(Controller.java:661)
> > at com.tuneshero.servlet.ajax.FlexRequests.doGet(FlexRequests.java:309)
> > at javax.servlet.http.HttpServlet.service(HttpServlet.java:718)
> > at javax.servlet.http.HttpServlet.service(HttpServlet.java:831)
>
> Jagadish, do you know if this could be the issue or not? But for sure
> all Grizzly WorkerThread may gets locked by such call.
>
> A+
>
> -- Jeanfrancois
>
>
> > Thanks!
> > [Message sent by forum member 'rwillie6' (rwillie6)]
> >
> > http://forums.java.net/jive/thread.jspa?messageID=338656
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> > For additional commands, e-mail: users-help@glassfish.dev.java.net
> >


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 11:44 AM   in response to: Jagadish Prasat...
 
  Click to reply to this thread Reply

I'm not sure which connection pool you were referring to, so both are included below. I changed auto-commit to table after pasting the config below and will report any results.

<connection-pool max-pending-count="4096" queue-size-in-bytes="4096" receive-buffer-size-in-bytes="4096" send-buffer-size-in-bytes="8192"/>

And the JDBC pool:
<jdbc-connection-pool allow-non-component-callers="false" associate-with-thread="false" connection-creation-retry-attempts="0" connection-creation-retry-interval-in-seconds="10" connection-leak-reclaim="false" connection-leak-timeout-in-seconds="0" connection-validation-method="auto-commit" datasource-classname="com.mysql.jdbc.jdbc2.optional.MysqlDataSource" fail-all-connections="false" idle-timeout-in-seconds="300" is-connection-validation-required="true" is-isolation-level-guaranteed="false" lazy-connection-association="false" lazy-connection-enlistment="false" match-connections="false" max-connection-usage-count="300" max-pool-size="400" max-wait-time-in-millis="60000" name="fathom-pool2" non-transactional-connections="false" pool-resize-quantity="10" res-type="javax.sql.DataSource" statement-timeout-in-seconds="55" steady-pool-size="40" validate-atmost-once-period-in-seconds="10" wrap-jdbc-objects="false">
<property name="SelfDestructOnPingMaxOperations" value="0"/>
<property name="UseUltraDevWorkAround" value="false"/>
<property name="UseStreamLengthsInPrepStmts" value="true"/>
<property name="TcpRcvBuf" value="0"/>
<property name="UltraDevHack" value="false"/>
<property name="InitialTimeout" value="2"/>
<property name="UseNanosForElapsedTime" value="false"/>
<property name="MetadataCacheSize" value="50"/>
<property name="UseColumnNamesInFindColumn" value="false"/>
<property name="RequireSSL" value="false"/>
<property name="LogXaCommands" value="false"/>
<property name="CacheCallableStatements" value="false"/>
<property name="NoDatetimeStringSync" value="false"/>
<property name="EmulateUnsupportedPstmts" value="true"/>
<property name="TreatUtilDateAsTimestamp" value="true"/>
<property name="EmulateLocators" value="false"/>
<property name="ClientInfoProvider" value="com.mysql.jdbc.JDBC4CommentClientInfoProvider"/>
<property name="IsInteractiveClient" value="false"/>
<property name="IncludeInnodbStatusInDeadlockExceptions" value="false"/>
<property name="ProcessEscapeCodesForPrepStmts" value="true"/>
<property name="GatherPerformanceMetrics" value="false"/>
<property name="UseOnlyServerErrorMessages" value="true"/>
<property name="HoldResultsOpenOverStatementClose" value="false"/>
<property name="LoadBalanceBlacklistTimeout" value="0"/>
<property name="UseServerPreparedStmts" value="false"/>
<property name="UseTimezone" value="false"/>
<property name="BlobsAreStrings" value="false"/>
<property name="IgnoreNonTxTables" value="false"/>
<property name="UseSSL" value="false"/>
<property name="AllowLoadLocalInfile" value="true"/>
<property name="SocketFactoryClassName" value="com.mysql.jdbc.StandardSocketFactory"/>
<property name="SelfDestructOnPingSecondsLifetime" value="0"/>
<property name="AutoClosePStmtStreams" value="false"/>
<property name="NullCatalogMeansCurrent" value="true"/>
<property name="UseInformationSchema" value="false"/>
<property name="DynamicCalendars" value="false"/>
<property name="CreateDatabaseIfNotExist" value="false"/>
<property name="UseServerPrepStmts" value="false"/>
<property name="Pedantic" value="false"/>
<property name="UseSSPSCompatibleTimezoneShift" value="false"/>
<property name="ReconnectAtTxEnd" value="false"/>
<property name="Url" value="jdbc:mysql://:3306/"/>
<property name="UseJvmCharsetConverters" value="false"/>
<property name="FunctionsNeverReturnBlobs" value="false"/>
<property name="NoAccessToProcedureBodies" value="false"/>
<property name="AllowNanAndInf" value="false"/>
<property name="CachePreparedStatements" value="false"/>
<property name="ExplainSlowQueries" value="false"/>
<property name="CacheResultSetMetadata" value="false"/>
<property name="TcpKeepAlive" value="true"/>
<property name="PreparedStatementCacheSqlLimit" value="256"/>
<property name="SlowQueryThresholdNanos" value="0"/>
<property name="LoginTimeout" value="0"/>
<property name="UseCompression" value="false"/>
<property name="AutoSlowLog" value="true"/>
<property name="TcpSndBuf" value="0"/>
<property name="TcpTrafficClass" value="0"/>
<property name="RoundRobinLoadBalance" value="false"/>
<property name="CallableStatementCacheSize" value="100"/>
<property name="InteractiveClient" value="false"/>
<property name="NoTimezoneConversionForTimeType" value="false"/>
<property name="UseHostsInPrivileges" value="true"/>
<property name="UseSqlStateCodes" value="true"/>
<property name="GatherPerfMetrics" value="false"/>
<property name="ReportMetricsIntervalMillis" value="30000"/>
<property name="UseFastDateParsing" value="true"/>
<property name="LoggerClassName" value="com.mysql.jdbc.log.StandardLogger"/>
<property name="DontTrackOpenResources" value="false"/>
<property name="UseOldUTF8Behavior" value="false"/>
<property name="OverrideSupportsIntegrityEnhancementFacility" value="false"/>
<property name="RollbackOnPooledClose" value="true"/>
<property name="GenerateSimpleParameterMetadata" value="false"/>
<property name="SecondsBeforeRetryMaster" value="30"/>
<property name="ResultSetSizeThreshold" value="100"/>
<property name="TransformedBitIsBoolean" value="false"/>
<property name="SocketTimeout" value="0"/>
<property name="Paranoid" value="false"/>
<property name="FailOverReadOnly" value="true"/>
<property name="CapitalizeTypeNames" value="true"/>
<property name="VerifyServerCertificate" value="true"/>
<property name="TinyInt1isBit" value="true"/>
<property name="UseDynamicCharsetInfo" value="true"/>
<property name="UseBlobToStoreUTF8OutsideBMP" value="false"/>
<property name="EnableQueryTimeouts" value="true"/>
<property name="PopulateInsertRowWithDefaultValues" value="false"/>
<property name="UseDirectRowUnpack" value="true"/>
<property name="PadCharsWithSpace" value="false"/>
<property name="UseOldAliasMetadataBehavior" value="false"/>
<property name="PrepStmtCacheSqlLimit" value="256"/>
<property name="MaxReconnects" value="3"/>
<property name="AutoDeserialize" value="false"/>
<property name="JdbcCompliantTruncationForReads" value="true"/>
<property name="CompensateOnDuplicateKeyUpdateCounts" value="false"/>
<property name="PinGlobalTxToPhysicalConnection" value="false"/>
<property name="ProfileSQL" value="false"/>
<property name="RewriteBatchedStatements" value="false"/>
<property name="UseUnicode" value="true"/>
<property name="Logger" value="com.mysql.jdbc.log.StandardLogger"/>
<property name="DumpMetadataOnColumnNotFound" value="false"/>
<property name="MaxQuerySizeToLog" value="2048"/>
<property name="LocatorFetchBufferSize" value="1048576"/>
<property name="CallableStmtCacheSize" value="100"/>
<property name="ClobberStreamingResults" value="false"/>
<property name="TcpNoDelay" value="true"/>
<property name="LoadBalanceStrategy" value="random"/>
<property name="StrictFloatingPoint" value="false"/>
<property name="AutoReconnectForPools" value="false"/>
<property name="CacheCallableStmts" value="false"/>
<property name="MaxRows" value="-1"/>
<property name="UseLocalTransactionState" value="false"/>
<property name="UseLegacyDatetimeCode" value="true"/>
<property name="BlobSendChunkSize" value="1048576"/>
<property name="NetTimeoutForStreamingResults" value="600"/>
<property name="UseGmtMillisForDatetimes" value="false"/>
<property name="AllowMultiQueries" value="false"/>
<property name="JdbcCompliantTruncation" value="true"/>
<property name="AutoGenerateTestcaseScript" value="false"/>
<property name="EnablePacketDebug" value="false"/>
<property name="ContinueBatchOnError" value="true"/>
<property name="TraceProtocol" value="false"/>
<property name="UseCursorFetch" value="false"/>
<property name="UseAffectedRows" value="false"/>
<property name="EmptyStringsConvertToZero" value="true"/>
<property name="PreparedStatementCacheSize" value="25"/>
<property name="StrictUpdates" value="true"/>
<property name="DumpQueriesOnException" value="false"/>
<property name="PacketDebugBufferSize" value="20"/>
<property name="AllowUrlInLocalInfile" value="false"/>
<property name="UseUsageAdvisor" value="false"/>
<property name="QueriesBeforeRetryMaster" value="50"/>
<property name="SocketFactory" value="com.mysql.jdbc.StandardSocketFactory"/>
<property name="DefaultFetchSize" value="0"/>
<property name="ProfilerEventHandler" value="com.mysql.jdbc.profiler.LoggingProfilerEventHandler"/>
<property name="ConnectTimeout" value="0"/>
<property name="YearIsDateType" value="true"/>
<property name="RunningCTS13" value="false"/>
<property name="UseJDBCCompliantTimezoneShift" value="false"/>
<property name="UseFastIntParsing" value="true"/>
<property name="ProfileSql" value="false"/>
<property name="RelaxAutoCommit" value="false"/>
<property name="NullNamePatternMatchesAll" value="true"/>
<property name="RetriesAllDown" value="120"/>
<property name="CachePrepStmts" value="true"/>
<property name="UseReadAheadInput" value="false"/>
<property name="SlowQueryThresholdMillis" value="30000"/>
<property name="LogSlowQueries" value="true"/>
<property name="MaintainTimeStats" value="false"/>
<property name="PrepStmtCacheSize" value="512"/>
<property name="ZeroDateTimeBehavior" value="convertToNull"/>
<property name="AlwaysSendSetIsolation" value="false"/>
<property name="UseUnbufferedInput" value="false"/>
<property name="CacheServerConfiguration" value="true"/>
<property name="ElideSetAutoCommits" value="true"/>
<property name="UseLocalSessionState" value="true"/>
<property name="DatabaseName" value="XXX"/>
<property name="PortNumber" value="XXX"/>
<property name="Port" value="XXX"/>
<property name="User" value="XXX"/>
<property name="URL" value="XXX"/>
<property name="Password" value=""/>
</jdbc-connection-pool>

Jagadish Prasat...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 9:52 PM   in response to: rwillie6
  Click to reply to this thread Reply

Hi Ryan,

Most of your pool settings look fine.
You can switch back to "auto-commit" mode of validation.

Regarding the following thread dump :
http://forums.java.net/jive/servlet/JiveServlet/download/56-59323-338842-8159/jstack_dump_2009-03-24_09:26:34.txt

As the resizer is cleaning & validating the pool, all the connection
requests are made to wait. This will be completed in few seconds, again
dependent on the number of connections to be validated. You can confirm
this by taking another thread dump in a couple of minutes.
However, you can tune the pool settings to minimize the frequency of
resizer calls.

This is not the solution to the actual issue you may be facing (as your
other stack traces confirm that the real issue is not w.r.t database),
but it will help to further reduce the number of database validation
calls / cleanup jobs.

idle-timeout : 300 seconds can be made to 1500 seconds
validate-atmost-once-period-in-seconds : 10 seconds can be made to 60
seconds
max-connection-usage : 300 can be made to 500

Hope this helps.

Thanks,
-Jagadish



On Tue, 2009-03-24 at 11:44 -0700, glassfish@javadesktop.org wrote:
> I'm not sure which connection pool you were referring to, so here are both:
>
> <connection-pool max-pending-count="4096" queue-size-in-bytes="4096" receive-buffer-size-in-bytes="4096" send-buffer-size-in-bytes="8192"/>
>
> And the JDBC pool:
> <jdbc-connection-pool allow-non-component-callers="false" associate-with-thread="false" connection-creation-retry-attempts="0" connection-creation-retry-interval-in-seconds="10" connection-leak-reclaim="false" connection-leak-timeout-in-seconds="0" connection-validation-method="auto-commit" datasource-classname="com.mysql.jdbc.jdbc2.optional.MysqlDataSource" fail-all-connections="false" idle-timeout-in-seconds="300" is-connection-validation-required="true" is-isolation-level-guaranteed="false" lazy-connection-association="false" lazy-connection-enlistment="false" match-connections="false" max-connection-usage-count="300" max-pool-size="400" max-wait-time-in-millis="60000" name="fathom-pool2" non-transactional-connections="false" pool-resize-quantity="10" res-type="javax.sql.DataSource" statement-timeout-in-seconds="55" steady-pool-size="40" validate-atmost-once-period-in-seconds="10" wrap-jdbc-objects="false">
> <property name="SelfDestructOnPingMaxOperations" value="0"/>
> <property name="UseUltraDevWorkAround" value="false"/>
> <property name="UseStreamLengthsInPrepStmts" value="true"/>
> <property name="TcpRcvBuf" value="0"/>
> <property name="UltraDevHack" value="false"/>
> <property name="InitialTimeout" value="2"/>
> <property name="UseNanosForElapsedTime" value="false"/>
> <property name="MetadataCacheSize" value="50"/>
> <property name="UseColumnNamesInFindColumn" value="false"/>
> <property name="RequireSSL" value="false"/>
> <property name="LogXaCommands" value="false"/>
> <property name="CacheCallableStatements" value="false"/>
> <property name="NoDatetimeStringSync" value="false"/>
> <property name="EmulateUnsupportedPstmts" value="true"/>
> <property name="TreatUtilDateAsTimestamp" value="true"/>
> <property name="EmulateLocators" value="false"/>
> <property name="ClientInfoProvider" value="com.mysql.jdbc.JDBC4CommentClientInfoProvider"/>
> <property name="IsInteractiveClient" value="false"/>
> <property name="IncludeInnodbStatusInDeadlockExceptions" value="false"/>
> <property name="ProcessEscapeCodesForPrepStmts" value="true"/>
> <property name="GatherPerformanceMetrics" value="false"/>
> <property name="UseOnlyServerErrorMessages" value="true"/>
> <property name="HoldResultsOpenOverStatementClose" value="false"/>
> <property name="LoadBalanceBlacklistTimeout" value="0"/>
> <property name="UseServerPreparedStmts" value="false"/>
> <property name="UseTimezone" value="false"/>
> <property name="BlobsAreStrings" value="false"/>
> <property name="IgnoreNonTxTables" value="false"/>
> <property name="UseSSL" value="false"/>
> <property name="AllowLoadLocalInfile" value="true"/>
> <property name="SocketFactoryClassName" value="com.mysql.jdbc.StandardSocketFactory"/>
> <property name="SelfDestructOnPingSecondsLifetime" value="0"/>
> <property name="AutoClosePStmtStreams" value="false"/>
> <property name="NullCatalogMeansCurrent" value="true"/>
> <property name="UseInformationSchema" value="false"/>
> <property name="DynamicCalendars" value="false"/>
> <property name="CreateDatabaseIfNotExist" value="false"/>
> <property name="UseServerPrepStmts" value="false"/>
> <property name="Pedantic" value="false"/>
> <property name="UseSSPSCompatibleTimezoneShift" value="false"/>
> <property name="ReconnectAtTxEnd" value="false"/>
> <property name="Url" value="jdbc:mysql://:3306/"/>
> <property name="UseJvmCharsetConverters" value="false"/>
> <property name="FunctionsNeverReturnBlobs" value="false"/>
> <property name="NoAccessToProcedureBodies" value="false"/>
> <property name="AllowNanAndInf" value="false"/>
> <property name="CachePreparedStatements" value="false"/>
> <property name="ExplainSlowQueries" value="false"/>
> <property name="CacheResultSetMetadata" value="false"/>
> <property name="TcpKeepAlive" value="true"/>
> <property name="PreparedStatementCacheSqlLimit" value="256"/>
> <property name="SlowQueryThresholdNanos" value="0"/>
> <property name="LoginTimeout" value="0"/>
> <property name="UseCompression" value="false"/>
> <property name="AutoSlowLog" value="true"/>
> <property name="TcpSndBuf" value="0"/>
> <property name="TcpTrafficClass" value="0"/>
> <property name="RoundRobinLoadBalance" value="false"/>
> <property name="CallableStatementCacheSize" value="100"/>
> <property name="InteractiveClient" value="false"/>
> <property name="NoTimezoneConversionForTimeType" value="false"/>
> <property name="UseHostsInPrivileges" value="true"/>
> <property name="UseSqlStateCodes" value="true"/>
> <property name="GatherPerfMetrics" value="false"/>
> <property name="ReportMetricsIntervalMillis" value="30000"/>
> <property name="UseFastDateParsing" value="true"/>
> <property name="LoggerClassName" value="com.mysql.jdbc.log.StandardLogger"/>
> <property name="DontTrackOpenResources" value="false"/>
> <property name="UseOldUTF8Behavior" value="false"/>
> <property name="OverrideSupportsIntegrityEnhancementFacility" value="false"/>
> <property name="RollbackOnPooledClose" value="true"/>
> <property name="GenerateSimpleParameterMetadata" value="false"/>
> <property name="SecondsBeforeRetryMaster" value="30"/>
> <property name="ResultSetSizeThreshold" value="100"/>
> <property name="TransformedBitIsBoolean" value="false"/>
> <property name="SocketTimeout" value="0"/>
> <property name="Paranoid" value="false"/>
> <property name="FailOverReadOnly" value="true"/>
> <property name="CapitalizeTypeNames" value="true"/>
> <property name="VerifyServerCertificate" value="true"/>
> <property name="TinyInt1isBit" value="true"/>
> <property name="UseDynamicCharsetInfo" value="true"/>
> <property name="UseBlobToStoreUTF8OutsideBMP" value="false"/>
> <property name="EnableQueryTimeouts" value="true"/>
> <property name="PopulateInsertRowWithDefaultValues" value="false"/>
> <property name="UseDirectRowUnpack" value="true"/>
> <property name="PadCharsWithSpace" value="false"/>
> <property name="UseOldAliasMetadataBehavior" value="false"/>
> <property name="PrepStmtCacheSqlLimit" value="256"/>
> <property name="MaxReconnects" value="3"/>
> <property name="AutoDeserialize" value="false"/>
> <property name="JdbcCompliantTruncationForReads" value="true"/>
> <property name="CompensateOnDuplicateKeyUpdateCounts" value="false"/>
> <property name="PinGlobalTxToPhysicalConnection" value="false"/>
> <property name="ProfileSQL" value="false"/>
> <property name="RewriteBatchedStatements" value="false"/>
> <property name="UseUnicode" value="true"/>
> <property name="Logger" value="com.mysql.jdbc.log.StandardLogger"/>
> <property name="DumpMetadataOnColumnNotFound" value="false"/>
> <property name="MaxQuerySizeToLog" value="2048"/>
> <property name="LocatorFetchBufferSize" value="1048576"/>
> <property name="CallableStmtCacheSize" value="100"/>
> <property name="ClobberStreamingResults" value="false"/>
> <property name="TcpNoDelay" value="true"/>
> <property name="LoadBalanceStrategy" value="random"/>
> <property name="StrictFloatingPoint" value="false"/>
> <property name="AutoReconnectForPools" value="false"/>
> <property name="CacheCallableStmts" value="false"/>
> <property name="MaxRows" value="-1"/>
> <property name="UseLocalTransactionState" value="false"/>
> <property name="UseLegacyDatetimeCode" value="true"/>
> <property name="BlobSendChunkSize" value="1048576"/>
> <property name="NetTimeoutForStreamingResults" value="600"/>
> <property name="UseGmtMillisForDatetimes" value="false"/>
> <property name="AllowMultiQueries" value="false"/>
> <property name="JdbcCompliantTruncation" value="true"/>
> <property name="AutoGenerateTestcaseScript" value="false"/>
> <property name="EnablePacketDebug" value="false"/>
> <property name="ContinueBatchOnError" value="true"/>
> <property name="TraceProtocol" value="false"/>
> <property name="UseCursorFetch" value="false"/>
> <property name="UseAffectedRows" value="false"/>
> <property name="EmptyStringsConvertToZero" value="true"/>
> <property name="PreparedStatementCacheSize" value="25"/>
> <property name="StrictUpdates" value="true"/>
> <property name="DumpQueriesOnException" value="false"/>
> <property name="PacketDebugBufferSize" value="20"/>
> <property name="AllowUrlInLocalInfile" value="false"/>
> <property name="UseUsageAdvisor" value="false"/>
> <property name="QueriesBeforeRetryMaster" value="50"/>
> <property name="SocketFactory" value="com.mysql.jdbc.StandardSocketFactory"/>
> <property name="DefaultFetchSize" value="0"/>
> <property name="ProfilerEventHandler" value="com.mysql.jdbc.profiler.LoggingProfilerEventHandler"/>
> <property name="ConnectTimeout" value="0"/>
> <property name="YearIsDateType" value="true"/>
> <property name="RunningCTS13" value="false"/>
> <property name="UseJDBCCompliantTimezoneShift" value="false"/>
> <property name="UseFastIntParsing" value="true"/>
> <property name="ProfileSql" value="false"/>
> <property name="RelaxAutoCommit" value="false"/>
> <property name="NullNamePatternMatchesAll" value="true"/>
> <property name="RetriesAllDown" value="120"/>
> <property name="CachePrepStmts" value="true"/>
> <property name="UseReadAheadInput" value="false"/>
> <property name="SlowQueryThresholdMillis" value="30000"/>
> <property name="LogSlowQueries" value="true"/>
> <property name="MaintainTimeStats" value="false"/>
> <property name="PrepStmtCacheSize" value="512"/>
> <property name="ZeroDateTimeBehavior" value="convertToNull"/>
> <property name="AlwaysSendSetIsolation" value="false"/>
> <property name="UseUnbufferedInput" value="false"/>
> <property name="CacheServerConfiguration" value="true"/>
> <property name="ElideSetAutoCommits" value="true"/>
> <property name="UseLocalSessionState" value="true"/>
> <property name="DatabaseName" value="XXX"/>
> <property name="PortNumber" value="XXX"/>
> <property name="Port" value="XXX"/>
> <property name="User" value="XXX"/>
> <property name="URL" value="XXX"/>
> <property name="Password" value=""/>
> </jdbc-connection-pool>
> [Message sent by forum member 'rwillie6' (rwillie6)]
>
> http://forums.java.net/jive/thread.jspa?messageID=338782
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net
>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 25, 2009 12:20 AM   in response to: Jagadish Prasat...
 
  Click to reply to this thread Reply

Hi Jagadish,

I have made your changes with respect to the connection pool, as well as disabled auto-commit for the time being. Also, I updated to jdk 1.6.0_13, as well as Glassfish 2.1-b60e.

Unfortunately, I am not sure whether these changes have had a positive effect because we started experiencing general sluggishness (10s+ page loads) about 12 hours ago that we have been unable to remedy. Our traffic during this period has been unremarkable, and our munin graphs have appeared normal. Nonetheless, the server has responding very slowly. I don't think it's related to our application, since this is the same version of the app we had deployed for the previous week without experiencing this slowness. Also, I don't think it's database related, since even static content is affected by the slow responses.

I thought the change may have been due to running at Xmx3000m and Xms3000m, instead of Xmx2500m and Xms2500m, so we changed back to 2500 but that was unremarkable as well.

I'm not sure what to do. Sometimes, page loads are quite snappy, but other times they take 20s, and I have been unable to pinpoint differences.

Please let me know if there is any other data I can provide.

rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 25, 2009 12:31 AM   in response to: rwillie6
 
  Click to reply to this thread Reply

Actually, there is one remarkable change: the application server is now using less CPU than ever. I suspect this is partially due to decreased traffic, but the decrease seems too large to be explainable by that alone...

Jagadish Prasat...
Re: Debugging Glassfish CPU usage
Posted: Mar 25, 2009 2:33 AM   in response to: rwillie6
  Click to reply to this thread Reply



On Wed, 2009-03-25 at 00:20 -0700, glassfish@javadesktop.org wrote:
> Nonetheless, the server has responding very slowly. I don't think
> it's related to our application, since this is the same version of the
> app we had deployed for the previous week without experiencing this
> slowness. Also, I don't think it's database related, since even
> static content is affected by the slow responses.
JeanFrancois : hope this clarifies that this is not w.r.t DB, can you
look into the issue ?

Thanks,
-Jagadish


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 25, 2009 3:50 AM   in response to: Jagadish Prasat...
 
  Click to reply to this thread Reply

On the whole am having general sluggishness in the last 12+ hours, this includes both static and dynamic content. I setup a new instance using the latest jdk and glassfish, configured it from a clean install changing only know settings and managed to reproduce the problem I was having on the previous server; though I was intending to solve the problem.

I'm not sure how it could be the code base, since the same code base was running for days previously without incident.

I'm checking with our DB expert to see if anything changed there that I was unaware of.

I've attached a jstack dump that was taken while I was waiting for a 20s page load, perhaps there is something in there.

Not sure what to do....

rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 25, 2009 4:52 AM   in response to: rwillie6
 
  Click to reply to this thread Reply

As an addendum, I did some further debugging to isolate glassfish with these results:

Bypassing ha-proxy as a load-balancer and directing traffic directly to glassfish does not reduce slowness.

Switching to a different database setup we had previously used without trouble does not reduce slowness. (We tried this because a little over 2 weeks ago we switch to a new database server and even though we hadn't had trouble with it in the last 2 weeks we wanted to confirm that it was not the issue).

We're on ec2 and the ec2 status panel is reporting norma behavior, so I'm not sure about any ways to confirm it isn't ec2 related.

This time I've attached the domain.xml from our currently problematic install; maybe there's something in there that matters that I was forgetting to mention.

Thanks for the help so far!

km

Posts: 681
Re: Debugging Glassfish CPU usage
Posted: Mar 25, 2009 6:55 AM   in response to: rwillie6
 
  Click to reply to this thread Reply

I don't see anything suspicious in your domain.xml.

In your stack dump however, *all* the 150 Grizzly threads are just waiting for someone to notify
them, since everyone grabs the pipeline and goes back to waiting state. JF, do you have any
comment on that?

-Kedar

Scott Oaks
Re: Debugging Glassfish CPU usage
Posted: Mar 25, 2009 7:55 AM   in response to: km
  Click to reply to this thread Reply

I'm not JF (nor do I play him on TV), but I'll weigh in with a few
observations.

In the stacks I saw, either all threads are just waiting (as Kedar
mentions), or most threads are waiting and a few are blocked reading
from the database. None of those threads would be causing CPU to be
used, nor would that slow down accessing static resources. [It could
mean that requests that need to go to the database are slow; the fact
that threads are either idle or reading from the database but not doing
anything else indicates that is a potential problem, but not necessarily
so either.]

When the OS reports that CPU is being used and it doesn't show up in
jstacks with threads actually running (in the RUNNABLE state, but not in
socketRead0), it typically means that the GC threads are running and
using the CPU. It would be good to use jstat to monitor how much time GC
(or use -Xloggc:gc.log in the domain.xml, which will print out the GC
usage to config/gc.log, which you can tail and see when GC is running).
If GC is spending more than a few % of total time, that's an issue, and
I'd guess that the hibernate cache is causing memory pressures.

The spinning poll issue JF mentioned earlier in the thread is still a
possibility in my mind if CPU is 100% utilized; I didn't see any
discussion of that specifically.

-Scott

On Wed, 2009-03-25 at 06:55 -0700, glassfish@javadesktop.org wrote:

> I don't see anything suspicious in your domain.xml.
>
> In your stack dump however, *all* the 150 Grizzly threads are just waiting for someone to notify
> them, since everyone grabs the pipeline and goes back to waiting state. JF, do you have any
> comment on that?
>
> -Kedar
> [Message sent by forum member 'km' (km)]
>
> http://forums.java.net/jive/thread.jspa?messageID=338958
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net


[att1.html]


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 25, 2009 12:28 PM   in response to: Scott Oaks
 
  Click to reply to this thread Reply

Hi Scott,

Just to clarify, the original issue for creating this thread was investigating increases in CPU and interrupts / context switching that occurred during non-peak traffic. Now, however, our graphs are unremarkable. We're barely using any CPU ( <8% ) and the interruptes /

Scott Oaks
Re: Debugging Glassfish CPU usage
Posted: Mar 25, 2009 2:58 PM   in response to: rwillie6
  Click to reply to this thread Reply

So if you're not seeing the CPU spikes anymore, it's hard to guess
what's going on -- but given that the glassfish threads are all idle in
the latest stack dumps, it seems likely that the problem is elsewhere.

The most likely culprit seems to me to be the network -- I did notice in
some of the stack dumps that a few glassfish threads were blocking
writing data back to the clients, which would normally only happen if
there is a network issue. I wonder if running tcpdump/snoop on both the
client and server machines would show any delays. Or seeing how much ftp
thoughput you get during those 30-second intervals. Or even simple ping
tests (or traceroutes) could show a long time delay.

We have seen on some platforms (like Windows 2003) that the OS fails to
send us the appropriate signal when a request is pending; I suppose
something like that could be going on. That might also explain why it
started happening suddenly, since an automatic OS update might have
introduced the problem. But that seems much less likely to me; it would
be a big OS bug (Windows 2003, not being a server-targeted OS, didn't
necessarily cause so many difficulties with its issue).

-Scott

On Wed, 2009-03-25 at 12:28 -0700, glassfish@javadesktop.org wrote:

> Hi Scott,
>
> Just to clarify, the original issue for creating this thread was investigating increases in CPU and interrupts / context switching that occurred during non-peak traffic. Now, however, our graphs are unremarkable. We're barely using any CPU ( <8% ) and the interruptes / [Message sent by forum member 'rwillie6' (rwillie6)]
>
> http://forums.java.net/jive/thread.jspa?messageID=339036
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net


[att1.html]


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 26, 2009 1:15 AM   in response to: Scott Oaks
 
  Click to reply to this thread Reply

Here's an update of our progress. We tried to isolate various parts of the setup today to find anything that would help. We tried creating an entirely new app server from scratch, using a different ec2 availability zone, the latest jdk6 and latest glassfish, but the results were the same.

So, I started investigating gc. I added the JVM options Xloggc and XX:+PrintGCDetails and then used apache benchmark to put the server under higher load to try to force memory to grow more quickly. After a while of doing this and watching the gc log, I concluded that gc was not a problem. For the duration, young gen collections took only a few milliseconds and full GC took 2-3 seconds (when it happened, which was rarely). Another lesson from this was that it occurred when the network variable was removed, as ab hit the same slow request loads when running locally on the app server.

I wanted to know more about where the slow requests were spending their time, so I added a filter to our app on /* to benchmark the processing time that each request spent in our app's code. I ran the server with this logging enabled until I'd collected 30K requests, then wrote a little util to parse that log and insert the results into my local DB. I ran some queries on the data and discovered that only 4 of the 30K requests had a processing time of >15s (1 at >20s, and 1 at >30s), of which 3 or those 4 requests were (oddly) to a HttpServlet that did nothing but call request.getSession() and then set two String constants to the session. Overall, 99.9% of requests were <2s.

This was confusing because I had data to show that our code was processing requests very quickly, but I was watching the counter on our down page hover around at 200+ users all afternoon (the down page is served by haproxy when it receives a 504 from glassfish). So, requests were processing slowly enough to timeout frequently, but they were not spending that time in our application code.

To be doubly sure that haproxy wasn't misbehaving, we directed traffic straight to glassfish for an hour or so, but that didn't change anything. On a whim, I cut the http acceptor threads down to 1 with an interesting effect: the site was still slow, but less erratic. Whereas, before, some requests were very fast and others very slow, moving to 1 acceptor just made every request in the range of sort-of-slow to more-slow. Also, with 1 acceptor, the server is noticeably snappier immediately after restart and becomes progressively slower thereafter.

Lacking other ideas, we're considering trying solaris instead of ubuntu, but are hoping to get access to an ec2 ami of OpenSolaris 2008.11 64-bit (if we can't, we might try the 64-bit xsce glassfish ami that we found).

Any other advice? Things to try? Data to collect?

arjunlall

Posts: 19
Re: Debugging Glassfish CPU usage
Posted: Mar 26, 2009 3:44 AM   in response to: rwillie6
 
  Click to reply to this thread Reply

I work with Ryan (rwillie6) and want to report that we believe we have narrowed down the root of our problem, and have resolved our server slowdowns for the time being.

While debugging our slowness we tried setting our acceptor threads count to 1 (it used to be 5). Unfortunately, when we did this something with lucene caused our threads to lock up permanently so we thought that setting the acceptor threads down to 1 didn't help. However after looking through thread dumps and finding out what lucene was doing we redeployed a version of our app with lucene commented out and have since had fantastic performance, even with just 1 acceptor thread.

However, thinking that the lucene conflict was the root of the problem, we set the acceptor threads back to 5 and discovered that the occasional slowness problem reappeared. After some more testing we discovered that the problem occurs with any acceptor thread value above 1.

I noticed that during the periods of "slowness" when the acceptor threads were greater than 1 our CPU usage on the server was 0%. I made a thread dump during this state. Im just shooting in the dark but I noticed there were two threads there were running sun.nio.ch.FileDispatcher.preClose(). Perhaps a file lock problem? I have attached the full dump so you can take a closer look.

Its still quite a mystery to us why we were able to run with 5 acceptor threads and no problems up until yesterday, especially given that yesterday was not an abnormal day in terms of traffic.

Thanks for all the great help so far. Let us know if we can do anything else to help detail the problem.

arjunlall

Posts: 19
Re: Debugging Glassfish CPU usage
Posted: Mar 26, 2009 3:48 AM   in response to: arjunlall
 
  Click to reply to this thread Reply

Forgot to attach the thread dump!

rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Apr 8, 2009 7:05 PM   in response to: Scott Oaks
 
  Click to reply to this thread Reply

Hi Scott,

See my most recent reply at the bottom. It does seem to be a GC issue, and I attached a couple of gc logs from the most recent occurrences. Unfortunately, I'm not sure it's the Hibernate cache, since we do not have any second-level cache enabled. Or, could it be elsewhere in Hibrnate? It seems to me from the gc logs that something is stuck, since full GC's are running continuously when the problem occurs. It's as if they are running, but failing, and so running over and over again. Where might I be able to look to try to find the problem?

Thanks for your help once again!

Scott Oaks
Re: Debugging Glassfish CPU usage
Posted: Apr 9, 2009 7:42 AM   in response to: rwillie6
  Click to reply to this thread Reply

There's definitely something there using more memory than you want; then
at each GC it cleans up just enough to limp along to the next full gc
(but not enough to do anything useful).

You'll have to look into some memory tools (jmap to generate a heap
dump, and then something like jhat or the Netbeans memory tool or the
Eclipse memory analyzer) to see what objects are using all that space
and see why they aren't getting reclaimed.

-Scott

On Wed, 2009-04-08 at 19:05 -0700, glassfish@javadesktop.org wrote:

> Hi Scott,
>
> See my most recent reply at the bottom. It does seem to be a GC issue, and I attached a couple of gc logs from the most recent occurrences. Unfortunately, I'm not sure it's the Hibernate cache, since we do not have any second-level cache enabled. Or, could it be elsewhere in Hibrnate? It seems to me from the gc logs that something is stuck, since full GC's are running continuously when the problem occurs. It's as if they are running, but failing, and so running over and over again. Where might I be able to look to try to find the problem?
>
> Thanks for your help once again!
> [Message sent by forum member 'rwillie6' (rwillie6)]
>
> http://forums.java.net/jive/thread.jspa?messageID=341268
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net


[att1.html]


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 5:59 PM   in response to: Jagadish Prasat...
 
  Click to reply to this thread Reply

Hi Jagadish,

Hmm... the only noticeable change after switching to "table" connection validation is the whole site seems to be more sluggish. Is that expected?

Jagadish Prasat...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 4:16 AM   in response to: Jeanfrancois Ar...
  Click to reply to this thread Reply




On Tue, 2009-03-24 at 11:46 +0100, Jeanfrancois Arcand wrote:
> Salut,
>
> glassfish@javadesktop.org wrote:

> Jagadish, do you know if this could be the issue or not? But for sure
> all Grizzly WorkerThread may gets locked by such call.
jstack_dump_2009-03-23_22:10:22.txt
I see 250 "httpWorkerThread-80*" threads of which 34 are used for
database.

Lets wait for "rwillie" to provide more details, output based on the
proposed settings and see.

Thanks,
-Jagadish


>
> A+
>
> -- Jeanfrancois
>
>
> > Thanks!
> > [Message sent by forum member 'rwillie6' (rwillie6)]
> >
> > http://forums.java.net/jive/thread.jspa?messageID=338656
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> > For additional commands, e-mail: users-help@glassfish.dev.java.net
> >


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


Jeanfrancois Ar...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 7:11 AM   in response to: Jagadish Prasat...
  Click to reply to this thread Reply

Salut,

Jagadish Prasath Ramu wrote:
>
>
> On Tue, 2009-03-24 at 11:46 +0100, Jeanfrancois Arcand wrote:
>> Salut,
>>
>> glassfish@javadesktop.org wrote:
>
>> Jagadish, do you know if this could be the issue or not? But for sure
>> all Grizzly WorkerThread may gets locked by such call.
> jstack_dump_2009-03-23_22:10:22.txt
> I see 250 "httpWorkerThread-80*" threads of which 34 are used for
> database.

Hum, for that one:

http://forums.java.net/jive/servlet/JiveServlet/download/56-59323-338654-8150/jstack_dump_2009-03-24_09:26:34.txt

All Grizzly WorkerThread are locked.

>
> Lets wait for "rwillie" to provide more details, output based on the
> proposed settings and see.

Thanks!

-- Jeanfrancois


>
> Thanks,
> -Jagadish
>
>
>> A+
>>
>> -- Jeanfrancois
>>
>>
>>> Thanks!
>>> [Message sent by forum member 'rwillie6' (rwillie6)]
>>>
>>> http://forums.java.net/jive/thread.jspa?messageID=338656
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
>>> For additional commands, e-mail: users-help@glassfish.dev.java.net
>>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


Jagadish Prasat...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 7:20 AM   in response to: Jeanfrancois Ar...
  Click to reply to this thread Reply



On Tue, 2009-03-24 at 15:11 +0100, Jeanfrancois Arcand wrote:
> Salut,
>
> Jagadish Prasath Ramu wrote:
> >
> >
> > On Tue, 2009-03-24 at 11:46 +0100, Jeanfrancois Arcand wrote:
> >> Salut,
> >>
> >> glassfish@javadesktop.org wrote:
> >
> >> Jagadish, do you know if this could be the issue or not? But for sure
> >> all Grizzly WorkerThread may gets locked by such call.
> > jstack_dump_2009-03-23_22:10:22.txt
> > I see 250 "httpWorkerThread-80*" threads of which 34 are used for
> > database.
>
> Hum, for that one:
>
> http://forums.java.net/jive/servlet/JiveServlet/download/56-59323-338654-8150/jstack_dump_2009-03-24_09:26:34.txt
>
> All Grizzly WorkerThread are locked.
When resizer (cleaner) is doing cleanup of the pool, all the connections
will be validated. This will make all the requesting threads to wait
till resizer completes its job. This is a temporary wait stage.

jstack_dump_2009-03-23_22:10:22.txt
For the above stack, only 34 out of 250 httpWorkerThreads are used by
database calls. So, what could be the issue ?

Thanks,
-Jagadish
>
> >
> > Lets wait for "rwillie" to provide more details, output based on the
> > proposed settings and see.
>
> Thanks!
>
> -- Jeanfrancois
>
>
> >
> > Thanks,
> > -Jagadish
> >
> >
> >> A+
> >>
> >> -- Jeanfrancois
> >>
> >>
> >>> Thanks!
> >>> [Message sent by forum member 'rwillie6' (rwillie6)]
> >>>
> >>> http://forums.java.net/jive/thread.jspa?messageID=338656
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> >>> For additional commands, e-mail: users-help@glassfish.dev.java.net
> >>>
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> > For additional commands, e-mail: users-help@glassfish.dev.java.net
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net
>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


Jeanfrancois Ar...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 7:27 AM   in response to: Jagadish Prasat...
  Click to reply to this thread Reply



Jagadish Prasath Ramu wrote:
>
> On Tue, 2009-03-24 at 15:11 +0100, Jeanfrancois Arcand wrote:
>> Salut,
>>
>> Jagadish Prasath Ramu wrote:
>>>
>>> On Tue, 2009-03-24 at 11:46 +0100, Jeanfrancois Arcand wrote:
>>>> Salut,
>>>>
>>>> glassfish@javadesktop.org wrote:
>>>> Jagadish, do you know if this could be the issue or not? But for sure
>>>> all Grizzly WorkerThread may gets locked by such call.
>>> jstack_dump_2009-03-23_22:10:22.txt
>>> I see 250 "httpWorkerThread-80*" threads of which 34 are used for
>>> database.
>> Hum, for that one:
>>
>> http://forums.java.net/jive/servlet/JiveServlet/download/56-59323-338654-8150/jstack_dump_2009-03-24_09:26:34.txt
>>
>> All Grizzly WorkerThread are locked.
> When resizer (cleaner) is doing cleanup of the pool, all the connections
> will be validated. This will make all the requesting threads to wait
> till resizer completes its job. This is a temporary wait stage.
>
> jstack_dump_2009-03-23_22:10:22.txt


Also 1 hour pass between the two jstack so I would think the slow down
happened during that time. How long does it take for the finalizer to
execute? That window might be the issue. It is really hard to say with
those 2 jstack as the time frame in between is really long.

A+

- Jeanfrancois


> For the above stack, only 34 out of 250 httpWorkerThreads are used by
> database calls. So, what could be the issue ?
>
> Thanks,
> -Jagadish
>>> Lets wait for "rwillie" to provide more details, output based on the
>>> proposed settings and see.
>> Thanks!
>>
>> -- Jeanfrancois
>>
>>
>>> Thanks,
>>> -Jagadish
>>>
>>>
>>>> A+
>>>>
>>>> -- Jeanfrancois
>>>>
>>>>
>>>>> Thanks!
>>>>> [Message sent by forum member 'rwillie6' (rwillie6)]
>>>>>
>>>>> http://forums.java.net/jive/thread.jspa?messageID=338656
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
>>>>> For additional commands, e-mail: users-help@glassfish.dev.java.net
>>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
>>> For additional commands, e-mail: users-help@glassfish.dev.java.net
>>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
>> For additional commands, e-mail: users-help@glassfish.dev.java.net
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


Jagadish Prasat...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 7:43 AM   in response to: Jeanfrancois Ar...
  Click to reply to this thread Reply



On Tue, 2009-03-24 at 15:27 +0100, Jeanfrancois Arcand wrote:
> > jstack_dump_2009-03-23_22:10:22.txt
>
>
> Also 1 hour pass between the two jstack so I would think the slow
> down
> happened during that time. How long does it take for the finalizer to
> execute?
It will be in the order of seconds. Again, it depends on the total
number of connections being validated.

Thanks,
-Jagadish
> That window might be the issue. It is really hard to say with
> those 2 jstack as the time frame in between is really long.
>
> A+
>
> - Jeanfrancois


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


Jagadish Prasat...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 7:45 AM   in response to: Jagadish Prasat...
  Click to reply to this thread Reply



On Tue, 2009-03-24 at 20:13 +0530, Jagadish Prasath Ramu wrote:
>
> On Tue, 2009-03-24 at 15:27 +0100, Jeanfrancois Arcand wrote:
> > > jstack_dump_2009-03-23_22:10:22.txt
> >
> >
> > Also 1 hour pass between the two jstack so I would think the slow
> > down
> > happened during that time. How long does it take for the finalizer to
> > execute?
> It will be in the order of seconds. Again, it depends on the total
> number of connections being validated.
Also, there is no resizer involved in the first jstack.


>
> Thanks,
> -Jagadish
> > That window might be the issue. It is really hard to say with
> > those 2 jstack as the time frame in between is really long.
> >
> > A+
> >
> > - Jeanfrancois


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 5:47 PM   in response to: Jagadish Prasat...
 
  Click to reply to this thread Reply

More jstack dumps attached from earlier today.

jr158900

Posts: 323
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 9:33 PM   in response to: rwillie6
 
  Click to reply to this thread Reply

attaching another thread dump from rwillie6

Jagadish Prasat...
Re: Debugging Glassfish CPU usage
Posted: Mar 24, 2009 9:52 PM   in response to: Jeanfrancois Ar...
  Click to reply to this thread Reply


Hi JeanFrancois :

Except the follwing thread dump
jstack_dump_2009-03-24_09:26:34.txt

all others are not related to connection pool /db.

As there are three other stack traces which are indicating that the
database operations are fine during the "appserver unresponsive"
moments, the issue is not really due to database calls.

Can you take a look at the three stack traces below ?

http://forums.java.net/jive/servlet/JiveServlet/download/56-59323-338857-8163/jstack_dump_2009-03-22_20-25-12.txt.tar.gz
grep -ri "httpWorkerThread-80" jstack_dump_2009-03-22_20:25:12.txt | wc
-l
250

grep -ri "AbstractResourcePool" jstack_dump_2009-03-22_20:25:12.txt | wc
-l
0

grep -ri "org.apache.log4j.Category.forcedLog"
jstack_dump_2009-03-22_20:25:12.txt | wc -l
67

grep -ri "BLOCKED" jstack_dump_2009-03-22_20:25:12.txt | wc -l
68



http://forums.java.net/jive/servlet/JiveServlet/download/56-59323-338654-8151/jstack_dump_2009-03-23_22:10:22.txt

grep -ri "httpWorkerThread-80" jstack_dump_2009-03-23_22:10:22.txt | wc
-l
250

grep -ri "com.mysql.jdbc.MysqlIO.readFully"
jstack_dump_2009-03-23_22:10:22.txt | wc -l
24

grep -ri "AbstractResourcePool.getResource"
jstack_dump_2009-03-23_22:10:22.txt | wc -l
5



http://forums.java.net/jive/servlet/JiveServlet/download/56-59323-338842-8160/jstack_dump_2009-03-24_18:49:21.txt
grep -ri "httpWorkerThread-80" jstack_dump_2009-03-24_18:49:21.txt | wc
-l
50

grep -ri "AbstractResourcePool" jstack_dump_2009-03-24_18:49:21.txt | wc
-l
0

grep -ri "com.mysql.jdbc.ConnectionImpl"
jstack_dump_2009-03-24_18:49:21.txt | wc -l
5


Thanks,
-Jagadish










On Tue, 2009-03-24 at 15:27 +0100, Jeanfrancois Arcand wrote:
>
> Jagadish Prasath Ramu wrote:
> >
> > On Tue, 2009-03-24 at 15:11 +0100, Jeanfrancois Arcand wrote:
> >> Salut,
> >>
> >> Jagadish Prasath Ramu wrote:
> >>>
> >>> On Tue, 2009-03-24 at 11:46 +0100, Jeanfrancois Arcand wrote:
> >>>> Salut,
> >>>>
> >>>> glassfish@javadesktop.org wrote:
> >>>> Jagadish, do you know if this could be the issue or not? But for sure
> >>>> all Grizzly WorkerThread may gets locked by such call.
> >>> jstack_dump_2009-03-23_22:10:22.txt
> >>> I see 250 "httpWorkerThread-80*" threads of which 34 are used for
> >>> database.
> >> Hum, for that one:
> >>
> >> http://forums.java.net/jive/servlet/JiveServlet/download/56-59323-338654-8150/jstack_dump_2009-03-24_09:26:34.txt
> >>
> >> All Grizzly WorkerThread are locked.
> > When resizer (cleaner) is doing cleanup of the pool, all the connections
> > will be validated. This will make all the requesting threads to wait
> > till resizer completes its job. This is a temporary wait stage.
> >
> > jstack_dump_2009-03-23_22:10:22.txt
>
>
> Also 1 hour pass between the two jstack so I would think the slow down
> happened during that time. How long does it take for the finalizer to
> execute? That window might be the issue. It is really hard to say with
> those 2 jstack as the time frame in between is really long.
>
> A+
>
> - Jeanfrancois
>
>
> > For the above stack, only 34 out of 250 httpWorkerThreads are used by
> > database calls. So, what could be the issue ?
> >
> > Thanks,
> > -Jagadish
> >>> Lets wait for "rwillie" to provide more details, output based on the
> >>> proposed settings and see.
> >> Thanks!
> >>
> >> -- Jeanfrancois
> >>
> >>
> >>> Thanks,
> >>> -Jagadish
> >>>
> >>>
> >>>> A+
> >>>>
> >>>> -- Jeanfrancois
> >>>>
> >>>>
> >>>>> Thanks!
> >>>>> [Message sent by forum member 'rwillie6' (rwillie6)]
> >>>>>
> >>>>> http://forums.java.net/jive/thread.jspa?messageID=338656
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> >>>>> For additional commands, e-mail: users-help@glassfish.dev.java.net
> >>>>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> >>> For additional commands, e-mail: users-help@glassfish.dev.java.net
> >>>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> >> For additional commands, e-mail: users-help@glassfish.dev.java.net
> >>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net
>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Mar 27, 2009 6:45 AM   in response to: Jeanfrancois Ar...
 
  Click to reply to this thread Reply

Ah, I misread the timestamps on the thread dumps I thought I had (turns out they were actually taken by Arjun 2 hours before the CPU spike I was investigating). Sorry, JF.

rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Apr 1, 2009 1:02 PM   in response to: Jeanfrancois Ar...
 
  Click to reply to this thread Reply

Hi JF,

I'm still having the intermittent spikes that were the original reason for the post and, because of increasing traffic, they've increased in frequency to every 24 hours or so. They are easy to see coming, because of increases in CPU and Interrupts / Context-Switches in my munin graphs, and can be remedied by restarting the app server. But, without a restart, the situation does not resolve itself. Eventually, the spikes go even higher and the server becomes non-responsive.

As an update, to resolve the sluggishness that we started experiencing after originally creating this thread, I cut the acceptors down to 1 and the thread-count down to 80 (I'm not sure why this helped, since we had been running 5 acceptors and 150 thread-count for 6 months without problem). After doing so, I'm now back to the original problem and am still hoping to find a solution, since waking up in the middle of the night to restart the server after it blew up is getting exhausting ...

I've attached two more thread dumps from when the CPU and Interrupts were escalating and I had to restart the server. Please let me know if these are indicative of the spin state you previously referenced or whether there is any other data I can provide.

Jeanfrancois Ar...
Re: Debugging Glassfish CPU usage
Posted: Apr 3, 2009 1:02 PM   in response to: rwillie6
  Click to reply to this thread Reply

Salut,

glassfish@javadesktop.org wrote:
> Hi JF,
>
> I'm still having the intermittent spikes that were the original reason for the post and, because of increasing traffic, they've increased in frequency to every 24 hours or so. They are easy to see coming, because of increases in CPU and Interrupts / Context-Switches in my munin graphs, and can be remedied by restarting the app server. But, without a restart, the situation does not resolve itself. Eventually, the spikes go even higher and the server becomes non-responsive.
>
> As an update, to resolve the sluggishness that we started experiencing after originally creating this thread, I cut the acceptors down to 1 and the thread-count down to 80 (I'm not sure why this helped, since we had been running 5 acceptors and 150 thread-count for 6 months without problem). After doing so, I'm now back to the original problem and am still hoping to find a solution, since waking up in the middle of the night to restart the server after it blew up is getting exhausting ...
>
> I've attached two more thread dumps from when the CPU and Interrupts were escalating and I had to restart the server. Please let me know if these are indicative of the spin state you previously referenced or whether there is any other data I can provide.
> [Message sent by forum member 'rwillie6' (rwillie6)]

Nothing strange. Can you try openJDK b 50? That one has the fix for the
spinning Selector so it will clear the issue.

Thanks

-- Jeanfrancois

>
> http://forums.java.net/jive/thread.jspa?messageID=340113
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Apr 5, 2009 2:32 PM   in response to: Jeanfrancois Ar...
 
  Click to reply to this thread Reply

Hi JF,

Thanks for the help. Unfortunately, after 12 hours of running on JDK 7 b50 the server exhibited the same increase in CPU usage and context-switch activity as before. Any other suggestions?

Burak Oguz
Re: Debugging Glassfish CPU usage
Posted: Apr 6, 2009 3:11 AM   in response to: rwillie6
  Click to reply to this thread Reply

Hi all,

We are suffering with that CPU usage peeks in GFv3 b43 promoted. We
discovered that it has nothing to do with the applications loaded to GFv3.
It is a very strange behaviour because when garbage collector starts to run,
CPU usage never drops from 100% in Linux. We have profiled it via
Netbeans6.5 profiler. Our gc related jvm settings is below:

<jvm-options>-server</jvm-options>
<jvm-options>-XX:NewRatio=2</jvm-options>
<jvm-options>-XX:PermSize=32m</jvm-options>
<jvm-options>-XX:MaxPermSize=256m</jvm-options>
<jvm-options>-XX:+UseConcMarkSweepGC</jvm-options>
<jvm-options>-XX:+CMSPermGenSweepingEnabled</jvm-options>
<jvm-options>-XX:+CMSClassUnloadingEnabled</jvm-options>
<jvm-options>-Dsun.rmi.dgc.server.gcInterval=3600000</jvm-options>
<jvm-options>-Dsun.rmi.dgc.client.gcInterval=3600000</jvm-options>
<jvm-options>-Dcom.sun.
enterprise.server.ss.ASQuickStartup=false</jvm-options>
<jvm-options>-Xms40m</jvm-options>
<jvm-options>-Xmx512m</jvm-options>

In order to avoid PermGenSpace exception we have changed our GC to ConcMark
and after that it starts to give this non-understandable CPU usage issue.
Also we observed that it starts this strange behaviour when memory usage is
higher than Xmx which means GC starts to work. We tried too many
configurations but could not solve that problem.

Any help or idea will be greatly appreciated.

- Burak


On Mon, Apr 6, 2009 at 12:32 AM, <glassfish@javadesktop.org> wrote:

> Hi JF,
>
> Thanks for the help. Unfortunately, after 12 hours of running on JDK 7 b50
> the server exhibited the same increase in CPU usage and context-switch
> activity as before. Any other suggestions?
> [Message sent by forum member 'rwillie6' (rwillie6)]
>
> http://forums.java.net/jive/thread.jspa?messageID=340680
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net
>
>
[att1.html]


Burak Oguz
Re: Debugging Glassfish CPU usage
Posted: Apr 6, 2009 7:58 AM   in response to: Burak Oguz
  Click to reply to this thread Reply

Hi all again,

We have tried it crossing these 1.5.0_18, 6u7, 6u12, 6u13 JDKs with GFv3
builds b43, b31. We experienced 100% CPU problem again.

-Burak

On Mon, Apr 6, 2009 at 1:11 PM, Burak Oguz <burak.oguz@gmail.com> wrote:

> Hi all,
>
> We are suffering with that CPU usage peeks in GFv3 b43 promoted. We
> discovered that it has nothing to do with the applications loaded to GFv3.
> It is a very strange behaviour because when garbage collector starts to run,
> CPU usage never drops from 100% in Linux. We have profiled it via
> Netbeans6.5 profiler. Our gc related jvm settings is below:
>
> <jvm-options>-server</jvm-options>
> <jvm-options>-XX:NewRatio=2</jvm-options>
> <jvm-options>-XX:PermSize=32m</jvm-options>
> <jvm-options>-XX:MaxPermSize=256m</jvm-options>
> <jvm-options>-XX:+UseConcMarkSweepGC</jvm-options>
> <jvm-options>-XX:+CMSPermGenSweepingEnabled</jvm-options>
> <jvm-options>-XX:+CMSClassUnloadingEnabled</jvm-options>
> <jvm-options>-Dsun.rmi.dgc.server.gcInterval=3600000</jvm-options>
> <jvm-options>-Dsun.rmi.dgc.client.gcInterval=3600000</jvm-options>
> <jvm-options>-Dcom.sun.
> enterprise.server.ss.ASQuickStartup=false</jvm-options>
> <jvm-options>-Xms40m</jvm-options>
> <jvm-options>-Xmx512m</jvm-options>
>
> In order to avoid PermGenSpace exception we have changed our GC to ConcMark
> and after that it starts to give this non-understandable CPU usage issue.
> Also we observed that it starts this strange behaviour when memory usage is
> higher than Xmx which means GC starts to work. We tried too many
> configurations but could not solve that problem.
>
> Any help or idea will be greatly appreciated.
>
> - Burak
>
>
>
> On Mon, Apr 6, 2009 at 12:32 AM, <glassfish@javadesktop.org> wrote:
>
>> Hi JF,
>>
>> Thanks for the help. Unfortunately, after 12 hours of running on JDK 7
>> b50 the server exhibited the same increase in CPU usage and context-switch
>> activity as before. Any other suggestions?
>> [Message sent by forum member 'rwillie6' (rwillie6)]
>>
>> http://forums.java.net/jive/thread.jspa?messageID=340680
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
>> For additional commands, e-mail: users-help@glassfish.dev.java.net
>>
>>
>
[att1.html]


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Apr 7, 2009 1:52 AM   in response to: Jeanfrancois Ar...
 
  Click to reply to this thread Reply

Unfortunately, this is still occurring, nearly like clockwork every 8-12 hours. I've attached two images; one of the CPU usage for the last day and week, the other for the context-switch activity during the same period.

Every 8-12 hours, there is an (as yet) unexplained increase in CPU and context-switch activity that is unrelated to traffic patterns. I've haven't noticed any unusual GC times, but I will post a GC log next time this happens in case anybody sees something I'm missing.

From looking at the stack traces repeatedly, many of which are posted above, I haven't seen anything indicative; mostly threads just waiting.

We're currently running on Ubuntu, but are considering trying Solaris. Any thoughts on whether that could help, or would be better for performance anyway?

rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Apr 8, 2009 7:00 PM   in response to: Jeanfrancois Ar...
 
  Click to reply to this thread Reply

Hi JF,

I had observed the GC log when the server was behaving appropriately but had not captured a GC log from when it was being problematic (since the gc.log was overwritten on server restart). After copying the GC log on the last two cpu / context-switch spikes, I think the problem is GC related. Attached are two zips, each containing a jstack dump and the gc.log when the problem was occurring.

I noticed for most of the GC log looks normal (many PSYoungGen collections with intermittent Full GC), but at the end, when the CPU usage was so high, Full GC's are continually running back to back. Since the jstack dumps have shown mostly waiting threads, it seems like the GC must be responsible for the increased CPU and context-switches and eventual non-responsiveness of the server if not restarted. Any idea why this might be? Is it configuration related? Application related? Etc?

Thanks for all your help so far. I can't believe I didn't notice this sooner!

For reference, here are the jvm-options from domain.xml:
-XX:+PrintGCDetails
-Xloggc:${com.sun.aas.instanceRoot}/logs/gc.log
-XX:HeapDumpPath=/mnt/dumps
-XX:+HeapDumpOnOutOfMemoryError
-server
-Xmx2500m
-Xms2500m
-Xmn1000m
-Xss128k
-XX:+AggressiveOpts
-XX:+AggressiveHeap
-XX:+DisableExplicitGC
-XX:+UseParallelGC
-XX:+UseParallelOldGC
-XX:ParallelGCThreads=8
-Dcom.sun.enterprise.server.ss.ASQuickStartup=false
-XX:MaxPermSize=192m
-Djava.endorsed.dirs=${com.sun.aas.installRoot}/lib/endorsed
-Djava.security.policy=${com.sun.aas.instanceRoot}/config/server.policy
-Djava.security.auth.login.config=${com.sun.aas.instanceRoot}/config/login.conf
-Djavax.net.ssl.keyStore=${com.sun.aas.instanceRoot}/config/keystore.jks
-Djavax.net.ssl.trustStore=${com.sun.aas.instanceRoot}/config/cacerts.jks
-Djava.ext.dirs=${com.sun.aas.javaRoot}/lib/ext${path.separator}${com.sun.aas.javaRoot}/jre/lib/ext${path.separator}${com.sun.aas.instanceRoot}/lib/ext${path.separator}${com.sun.aa\
s.derbyRoot}/lib
-Djdbc.drivers=org.apache.derby.jdbc.ClientDriver
-Djavax.management.builder.initial=com.sun.enterprise.admin.server.core.jmx.AppServerMBeanServerBuilder
-Dcom.sun.enterprise.config.config_environment_factory_class=com.sun.enterprise.config.serverbeans.AppserverConfigEnvironmentFactory
-Dcom.sun.enterprise.taglibs=appserv-jstl.jar,jsf-impl.jar
-Dcom.sun.enterprise.taglisteners=jsf-impl.jar
-XX:NewRatio=2

Jacob Kessler
Re: Debugging Glassfish CPU usage
Posted: Apr 9, 2009 10:05 AM   in response to: rwillie6
  Click to reply to this thread Reply

That looks like a memory leak to me (constantly increasing Old
Generation to near-full), and you start really being in trouble around
46,000 seconds in (on the first one, anyway), when a full GC doesn't
empty the young generation (and thus, the old generation is full). It
looks like you are leaking from the very beginning, though, even if you
don't notice until the old generation fills up and you start running a
GC every few seconds. Since it seems unlikely that the memory leak is
GF's fault, it's almost certainly going to be application related (it's
difficult to configure things to make them leak memory =).

To try to deal with it, I'd suggest taking a jmap dump (as Scott Oaks
suggested above) and looking through that. If you aren't afraid of a bit
of command-line diving, you can forgo the memory analyzers and use jmap
-histo:live [process id] (which will print out the types, instance
counts, and memory sizes of everything in memory). You'll want to run it
once towards the beginning and once when you start having trouble, then
compare the two for significant differences. The leaking object type
will likely stick out, and give you a clue as to what in your
application isn't going away properly.

glassfish@javadesktop.org wrote:
> Hi JF,
>
> I had observed the GC log when the server was behaving appropriately but had not captured a GC log from when it was being problematic (since the gc.log was overwritten on server restart). After copying the GC log on the last two cpu / context-switch spikes, I think the problem is GC related. Attached are two zips, each containing a jstack dump and the gc.log when the problem was occurring.
>
> I noticed for most of the GC log looks normal (many PSYoungGen collections with intermittent Full GC), but at the end, when the CPU usage was so high, Full GC's are continually running back to back. Since the jstack dumps have shown mostly waiting threads, it seems like the GC must be responsible for the increased CPU and context-switches and eventual non-responsiveness of the server if not restarted. Any idea why this might be? Is it configuration related? Application related? Etc?
>
> Thanks for all your help so far. I can't believe I didn't notice this sooner!
>
> For reference, here are the jvm-options from domain.xml:
> -XX:+PrintGCDetails
> -Xloggc:${com.sun.aas.instanceRoot}/logs/gc.log
> -XX:HeapDumpPath=/mnt/dumps
> -XX:+HeapDumpOnOutOfMemoryError
> -server
> -Xmx2500m
> -Xms2500m
> -Xmn1000m
> -Xss128k
> -XX:+AggressiveOpts
> -XX:+AggressiveHeap
> -XX:+DisableExplicitGC
> -XX:+UseParallelGC
> -XX:+UseParallelOldGC
> -XX:ParallelGCThreads=8
> -Dcom.sun.enterprise.server.ss.ASQuickStartup=false
> -XX:MaxPermSize=192m
> -Djava.endorsed.dirs=${com.sun.aas.installRoot}/lib/endorsed
> -Djava.security.policy=${com.sun.aas.instanceRoot}/config/server.policy
> -Djava.security.auth.login.config=${com.sun.aas.instanceRoot}/config/login.conf
> -Djavax.net.ssl.keyStore=${com.sun.aas.instanceRoot}/config/keystore.jks
> -Djavax.net.ssl.trustStore=${com.sun.aas.instanceRoot}/config/cacerts.jks
> -Djava.ext.dirs=${com.sun.aas.javaRoot}/lib/ext${path.separator}${com.sun.aas.javaRoot}/jre/lib/ext${path.separator}${com.sun.aas.instanceRoot}/lib/ext${path.separator}${com.sun.aa\
> s.derbyRoot}/lib
> -Djdbc.drivers=org.apache.derby.jdbc.ClientDriver
> -Djavax.management.builder.initial=com.sun.enterprise.admin.server.core.jmx.AppServerMBeanServerBuilder
> -Dcom.sun.enterprise.config.config_environment_factory_class=com.sun.enterprise.config.serverbeans.AppserverConfigEnvironmentFactory
> -Dcom.sun.enterprise.taglibs=appserv-jstl.jar,jsf-impl.jar
> -Dcom.sun.enterprise.taglisteners=jsf-impl.jar
> -XX:NewRatio=2
> [Message sent by forum member 'rwillie6' (rwillie6)]
>
> http://forums.java.net/jive/thread.jspa?messageID=341267
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
For additional commands, e-mail: users-help@glassfish.dev.java.net


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Apr 10, 2009 1:34 AM   in response to: Jacob Kessler
 
  Click to reply to this thread Reply

Hi Jacob,

love a good CLI session... so, I setup a cron to run jmap -histo:live every hour and collected the data until the problem resurfaced. After looking at the dumps, the class using the most memory was [C, which from what I have discovered seems to mean char arrays. When I started collecting the data (which was hours after restarting the server, but hours before the symptoms surfaced again), [C was using approx. 700MB. When the CPU spikes began to occur, [C was using 1.3GB. The next most memory intensive class in each dump was [B, but the usage was roughly the same in all dumps, fluctuating less than 20MB.

So, any advice on how to determine what is creating / using / holding on to the char arrays? I'm going to try generating a heap dump and analyzing it with eclipse memory analyzer or jhat. Hopefully that will show me the enclosing object ...

Scott Oaks
Re: Debugging Glassfish CPU usage
Posted: Apr 10, 2009 7:28 AM   in response to: rwillie6
  Click to reply to this thread Reply

There are several programs that will read a heap dump and show you what
is holding onto the memory -- you can see who is holding on to your char
arrays (likely strings, which in turn are held in some container, and so
on -- the idea is to look at those objects and determine who is holding
references to them and preventing them from being collected). The JDK
comes with jhat, which lets you use a web browser to look through it and
even do rudimentary queries on the heap; Netbeans will load a heap and
let you poke around inside of it; there is an Eclipse memory analyzer
which does a good job too.

--Scott

On Fri, 2009-04-10 at 01:34 -0700, glassfish@javadesktop.org wrote:

> Hi Jacob,
>
> love a good CLI session... so, I setup a cron to run jmap -histo:live every hour and collected the data until the problem resurfaced. After looking at the dumps, the class using the most memory was [C, which from what I have discovered seems to mean char arrays. When I started collecting the data (which was hours after restarting the server, but hours before the symptoms surfaced again), [C was using approx. 700MB. When the CPU spikes began to occur, [C was using 1.3GB. The next most memory intensive class in each dump was [B, but the usage was roughly the same in all dumps, fluctuating less than 20MB.
>
> So, any advice on how to determine what is creating / using / holding on to the char arrays? I'm going to try generating a heap dump and analyzing it with eclipse memory analyzer or jhat. Hopefully that will show me the enclosing object ...
> [Message sent by forum member 'rwillie6' (rwillie6)]
>
> http://forums.java.net/jive/thread.jspa?messageID=341594
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@glassfish.dev.java.net
> For additional commands, e-mail: users-help@glassfish.dev.java.net


[att1.html]


rwillie6

Posts: 78
Re: Debugging Glassfish CPU usage
Posted: Apr 10, 2009 8:05 AM   in response to: Scott Oaks
 
  Click to reply to this thread Reply

Tracked it down to a HTML FragmentCache that was not appropriately releasing expired items. It would replace expired items as needed with their new versions, but expired items that were not accessed would build up indefinitely. Quite a leak. Created a cleaner task to run every couple hours to remove the expired items, hopefully that will take care of it. If it doesn't, then we're just trying to cache too much in the java heap (and our machine can only address 4GB). If that's the case, we'll have to either cache less overall, cache for shorter periods and clean often, externalize the memory consumption of the cache to something like memcached, or try a 64-bit solaris installation so we can address more memory.

Thanks all for the help!




 XML java.net RSS