
Thread: After some time running, cluster becomes non-responsive


Replies: 1 - Last Post: Jan 3, 2008 2:28 AM by: granat
rwillie6

Posts: 78
After some time running, cluster becomes non-responsive
Posted: Jan 2, 2008 9:02 PM
 

I have a three-machine GlassFish cluster load balanced by SJSWS 7.0. If I reboot all machines and start up the cluster, everything is fine. But after some time, the cluster becomes non-responsive, with all instances returning HTTP 403 error codes. Executing an "asadmin stop-cluster cluster-name" command takes far longer than usual but eventually completes. The biggest problem is that after stopping the cluster, trying to restart it with "asadmin start-cluster cluster-name" fails with this error:

[root@glassfish1 ~]# asadmin start-cluster cluster-name
Operation 'startCluster' failed in 'clusters' Config Mbean.
Target exception message: All server instances in cluster cluster-name were not started.
Failed to retrieve RMIServer stub: javax.naming.NameNotFoundException: management/rmi-jmx-connector
Failed to retrieve RMIServer stub: javax.naming.NameNotFoundException: management/rmi-jmx-connector
Failed to retrieve RMIServer stub: javax.naming.NameNotFoundException: management/rmi-jmx-connector
CLI137 Command start-cluster failed.

Why does this happen? The server starts up and runs fine but eventually hits this after a variable amount of time. When it happens, I haven't found any solution yet besides rebooting all the machines in the cluster.

Also, rather than stopping/starting the cluster, trying to just stop/start any of the individual instances results in the same

Failed to retrieve RMIServer stub: javax.naming.NameNotFoundException: management/rmi-jmx-connector

message. Need help soon!

A few weeks ago the above error was thrown when trying to stop the cluster, but that time the instances were still serving web pages just fine. The error eventually went away on its own: the next day I was able to successfully execute the same command that had failed the day before.

What does this error mean? Why does it occur? Why does it occur intermittently? And why does it sometimes solve itself and other times not?

Thanks!

I'll be watching this closely, and will respond quickly. What other information would be helpful to diagnose this?

granat

Posts: 43
Re: After some time running, cluster becomes non-responsive
Posted: Jan 3, 2008 2:28 AM   in response to: rwillie6
 

Hi,

I think log files would be helpful.

Are there any dump files in the /<server>/config directory (they should be named something like hs_err_pid10680.log)?

What do your jvm.log files say about the failed start command (/<server>/logs/)?
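
If it helps, here is a minimal Python sketch of the two checks above: list any hs_err_pid*.log files in a config directory, and pull the lines mentioning the RMIServer stub error out of a log file. The function names find_crash_dumps and grep_log are made up for illustration, and the directories are passed in as arguments because I don't know your install layout.

```python
from pathlib import Path


def find_crash_dumps(config_dir):
    """Return hs_err_pid*.log files found directly in config_dir, newest first.

    config_dir is whatever your instance's config directory is (an assumption;
    pass the real path for your install).
    """
    dumps = Path(config_dir).glob("hs_err_pid*.log")
    return sorted(dumps, key=lambda p: p.stat().st_mtime, reverse=True)


def grep_log(log_path, needle):
    """Return (line_number, line) pairs from log_path whose line contains needle."""
    hits = []
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(log_path, errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            if needle in line:
                hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_logs.py <config-dir> <log-file>
    config_dir, log_file = sys.argv[1], sys.argv[2]
    for dump in find_crash_dumps(config_dir):
        print("crash dump:", dump)
    for number, line in grep_log(log_file, "RMIServer stub"):
        print(f"{log_file}:{number}: {line}")
```

Run it once per instance on each machine and post whatever it prints, together with the log lines around the failed start.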

greets
jeremie



