Tomcat 5.5.26 hangs

by conrad-tomcat.users.2009

Hi,

Our customer is running our web application on a cluster of Tomcat
servlet engines. The basic setup is

Loadbalancer <---> Apache 1.3.x with mod_jk <---> Tomcat

with 2-3 Apache servers and >30 Tomcat instances bundled into clusters
of 3-5 instances each. Apache + Tomcat servers are running on recent
SUN multi-core machines under Solaris. The basic setup hasn't changed
much over the past few years, apart from occasional software and
hardware updates and a steadily increasing number of Tomcat instances.

Currently, they're using Tomcat-5.5.26 on SUN's jdk 1.5.0_10 (64 bit)
and mod_jk 1.2.28. We have seen the same problem for years, going back
to before Tomcat-5.5.12.

Most of the time, things work nicely. Occasionally, though, the whole
system comes to a complete halt. A post-mortem thread dump shows all (!)
worker threads on all instances waiting for input from the Apache servers,
e.g.:

"TP-Processor2432" daemon prio=10 tid=0x00b2f258 nid=0x9f1 runnable [0x7cfbf000..0x7cfbfa70]
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:313)
        - locked <0x95947c70> (a java.io.BufferedInputStream)
        at org.apache.jk.common.ChannelSocket.read(ChannelSocket.java:626)
        at org.apache.jk.common.ChannelSocket.receive(ChannelSocket.java:564)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:691)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:895)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
        at java.lang.Thread.run(Thread.java:595)

Due to the large number of machines involved and the high number of client
requests, it is impossible to see how such a situation evolves. We have
ruled out lengthy garbage collection pauses (CMS collector is enabled).
There is no obviously relevant information in the logfiles.

Usually, the situation can be resolved by restarting Apache and/or
(some) Tomcat servers, which makes a DoS attack unlikely, IMO.

Has anyone seen this situation before? Any ideas what could be the
problem, and how to resolve it? Any idea how to gain more information?

Thanks,
        Peter
--
Peter Conrad
Tivano Software GmbH
Bahnhofstr. 18
63263 Neu-Isenburg
Tel: 06102 / 8099070
Fax: 06102 / 8099071
HRB 11680, AG Offenbach/Main
Managing Director: Martin Apel

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@...
For additional commands, e-mail: users-help@...


Re: Tomcat 5.5.26 hangs

by Mark Thomas

conrad-tomcat.users.2009@... wrote:

> Has anyone seen this situation before? Any ideas what could be the
> problem, and how to resolve it?

Have you tried
JkOptions     +DisableReuse


> Any idea how to gain more information?

Jk debug logs
wireshark
compare httpd and Tomcat access logs
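
For the log comparison, a minimal sketch in Python might look like the following: it counts requests per minute in each access log and lists the minutes where httpd logged more requests than Tomcat did. It assumes both logs carry Common Log Format timestamps; the function names are made up for illustration. In a real setup the httpd log would first have to be filtered to the URLs that mod_jk forwards, and the Tomcat logs merged across all instances.

```python
import re
from collections import Counter

# Timestamp of a Common Log Format line, truncated to the minute,
# e.g. "[14/Oct/2009:10:48:01 +0200]" -> "14/Oct/2009:10:48".
# Assumption: both httpd and Tomcat write this timestamp layout.
MINUTE = re.compile(r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [+-]\d{4}\]')


def requests_per_minute(lines):
    """Count log lines per minute; lines without a timestamp are ignored."""
    counts = Counter()
    for line in lines:
        match = MINUTE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def missing_at_backend(httpd_lines, tomcat_lines):
    """Return {minute: (httpd_count, tomcat_count)} for every minute in
    which Tomcat logged fewer requests than httpd."""
    front = requests_per_minute(httpd_lines)
    back = requests_per_minute(tomcat_lines)
    return {
        minute: (count, back.get(minute, 0))
        for minute, count in sorted(front.items())
        if back.get(minute, 0) < count
    }
```

Minutes that show up here are the ones worth looking at in the Jk debug log or a packet capture.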

Mark




Re: Tomcat 5.5.26 hangs

by Christopher Schultz-2

Peter,

On 10/14/2009 10:48 AM, conrad-tomcat.users.2009@... wrote:

> Currently, they're using Tomcat-5.5.26 on SUN's jdk 1.5.0_10 (64 bit)
> and mod_jk 1.2.28. Over the years, we have seen the same situation since
> before Tomcat-5.5.12.
>
> Most of the time, things work nicely. Occasionally, though, the whole
> system comes to a complete halt. A post-mortem thread dump shows all (!)
> worker threads on all instances waiting for input from the Apache servers,
> e.g.:
>
> "TP-Processor2432" daemon prio=10 tid=0x00b2f258 nid=0x9f1 runnable [0x7cfbf000..0x7cfbfa70]
>         at java.net.SocketInputStream.socketRead0(Native Method)

Although those threads say "runnable", they're really blocked at the OS
level waiting to receive data from the mod_jk connector. These threads
are actually idle, waiting for requests from httpd to come through the pipe.

You can probably confirm this by checking with 'top' to see that Tomcat
isn't using any CPU time, because it's just waiting.
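
To check the same thing across a whole dump rather than a single thread, something like this Python sketch could group threads by their innermost stack frame. It assumes the Sun JDK 5 dump layout shown above (a quoted thread name followed by "at ..." lines); `top_frames` and `summarize` are illustrative names, not part of any existing tool.

```python
import re
from collections import Counter

# A thread header line starts with the quoted thread name.
HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# A stack frame line looks like "    at package.Class.method(Source)".
FRAME = re.compile(r'^\s*at\s+(?P<frame>[\w.$<>]+)\(')


def top_frames(dump_text):
    """Map each thread name to its innermost stack frame.

    Threads that have no stack trace in the dump map to None.
    """
    result = {}
    current = None
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group("name")
            result[current] = None
            continue
        frame = FRAME.match(line)
        # Only the first frame after a header is the innermost one.
        if frame and current is not None and result[current] is None:
            result[current] = frame.group("frame")
    return result


def summarize(dump_text):
    """Count how many threads share each innermost frame."""
    return Counter(top_frames(dump_text).values())
```

Calling `summarize(text).most_common()` on the dump text lists the frames by thread count. If nearly every TP-Processor thread ends up under java.net.SocketInputStream.socketRead0, the pool is waiting for input rather than doing work.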

> Usually, the situation can be resolved by restarting Apache and/or
> (some) Tomcat servers, which makes DOS attacks unlikely, IMO.

If restarting Apache httpd solves the problem, this may be an httpd problem.

Is it feasible to remove httpd from the equation? Tomcat 5.5 can easily
compete with httpd for static file delivery if that's all you're using it for.

If you could post your httpd configuration for your worker/prefork stuff
AND your mod_jk configuration, it might be helpful.

-chris



Re: Tomcat 5.5.26 hangs

by Tsirkin Evgeny-2

On Wed, Oct 14, 2009 at 6:15 PM, Mark Thomas <markt@...> wrote:


> Have you tried
> JkOptions     +DisableReuse
>
>
And if that solves the problem, set up a connection timeout:
http://tomcat.apache.org/connectors-doc/generic_howto/timeouts.html
Evgeny

Re: Tomcat 5.5.26 hangs

by conrad-tomcat.users.2009

Hi,

On Wednesday, 14 October 2009, Christopher Schultz wrote:
>
> Although those threads say "runnable", they're really blocked at the OS
> level waiting to receive data from the mod_jk connector. These threads
> are actually idle, waiting for requests from httpd to come through the
> pipe.
>
> You can probably confirm this by checking with 'top' to see that Tomcat
> isn't using any CPU time, because it's just waiting.

Exactly. That's what I meant by "waiting for input from the Apache servers".
Thanks for confirming this.

> Is it feasible to remove httpd from the equation? Tomcat 5.5 can easily
> compete with httpd for static file delivery if that's all you're using it
> for.

Not really. We're relying on mod_jk for load-balancing with sticky
sessions, and for SSL termination. Getting rid of the Apaches would be
a major PITA.

> If you could post your httpd configuration for your worker/prefork stuff
> AND your mod_jk configuration, it might be helpful.

===workers.properties===
worker.list=lb,jkstatus

worker.jkstatus.type=status

worker.lb.type=lb

worker.lb.balance_workers=xx01E1, xx02E1, [...]

worker.xx01E1.port=31011
worker.xx01E1.host=appsrv01
worker.xx01E1.type=ajp13
worker.xx01E1.lbfactor=5
worker.xx01E1.activation=A
worker.xx01E1.domain=d01
worker.xx01E1.connect_timeout=15000
worker.xx01E1.prepost_timeout=15000

[...more workers with identical config except "host" and "domain"...]
===/workers.properties===

===httpd.conf===
<IfModule mod_jk.c>
        JkWorkersFile /...path.../conf/workers.properties
        JkShmFile /...path.../logs/apache_2_2/jk-shm.file
        JkLogFile /...path.../logs/apache_2_2/jk.log
        JkLogLevel info
#       JkLogLevel Fatal
#       JkLogLevel info
#       JkLogLevel trace
#       JkLogLevel debug
</IfModule>

# Manager config:
<Location /jkmanager/>
        JkMount jkstatus
        Order deny,allow
        Deny from all
        Allow from 10.207.69 10.64 192.168.7
</Location>

# Virtual Host config:
 JkMount /app/* lb
 JkMount jkstatus
===/httpd.conf===

Thanks,
        Peter


Re: Tomcat 5.5.26 hangs

by Christopher Schultz-2

Conrad,

On 10/15/2009 9:44 AM, conrad-tomcat.users.2009@... wrote:
>> If you could post your httpd configuration for your worker/prefork stuff
>> AND your mod_jk configuration, it might be helpful.
>
> ===workers.properties===

All that looks okay to me, though as Mark and Tsirkin suggest, you might
want to try using JkOptions +DisableReuse and, if that works, set up a
connection timeout (which forces mod_jk connections to be destroyed
after a certain amount of time, just to clean things up).

> [...more workers with identical config except "host" and "domain"...]

Check out the "template" capabilities of later mod_jks. You might save
yourself a lot of typing if something changes in your worker configuration.
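
As a sketch of what that could look like: mod_jk lets one worker inherit settings from another through the `reference` attribute, so the shared values only need to be written once. The Python helper below is purely illustrative (`render_workers` is a made-up name). It renders such a file from the values posted above; the second node (xx02E1 on appsrv02, domain d02) is a hypothetical placeholder for the elided workers.

```python
def render_workers(template, nodes, lb_name="lb"):
    """Render workers.properties text that uses mod_jk's `reference` attribute.

    template: dict of settings shared by all AJP workers.
    nodes:    list of (worker_name, host, domain) tuples.
    """
    names = ",".join(name for name, _host, _domain in nodes)
    lines = [
        "worker.list=%s,jkstatus" % lb_name,
        "worker.jkstatus.type=status",
        "worker.%s.type=lb" % lb_name,
        "worker.%s.balance_workers=%s" % (lb_name, names),
        "",
    ]
    # Shared settings live on a pseudo-worker that is never listed in
    # worker.list or balance_workers.
    for key, value in template.items():
        lines.append("worker.template.%s=%s" % (key, value))
    # Each real worker inherits from the template and overrides host/domain.
    for name, host, domain in nodes:
        lines.append("")
        lines.append("worker.%s.reference=worker.template" % name)
        lines.append("worker.%s.host=%s" % (name, host))
        lines.append("worker.%s.domain=%s" % (name, domain))
    return "\n".join(lines) + "\n"


# Shared values taken from the posted configuration.
SHARED = {
    "type": "ajp13",
    "port": 31011,
    "lbfactor": 5,
    "activation": "A",
    "connect_timeout": 15000,
    "prepost_timeout": 15000,
}

# The first node is from the post; the second one is a hypothetical example.
NODES = [("xx01E1", "appsrv01", "d01"), ("xx02E1", "appsrv02", "d02")]
```

`print(render_workers(SHARED, NODES))` writes out the file; a change to a timeout then touches one line instead of one line per worker.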

> <Location /jkmanager/>
>         JkMount jkstatus
>         Order deny,allow
>         Deny from all
>         Allow from 10.207.69 10.64 192.168.7
> </Location>

What does the jkstatus thing say when one of your app servers stops
responding? Anything in the mod_jk log file?

-chris



Re: Tomcat 5.5.26 hangs

by conrad-tomcat.users.2009

Hi,

for completeness: the issue seems to have been resolved.
The problems were apparently caused by a misconfigured
router between the webservers and the appservers.

On Wednesday, 14 October 2009, Mark Thomas wrote:
>
> > Any idea how to gain more information?
>
> Jk debug logs
> wireshark
> compare httpd and Tomcat access logs

netstat turned out to be very helpful: it showed non-empty send queues
and lots of connections in FIN_WAIT_1 on the webservers, which proved
that the problems were network-related and not due to software bugs.
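
For anyone who wants to automate that check, here is a minimal Python sketch that tallies connection states and lists sockets with unsent data. It is not what we actually ran. The default column positions assume the Linux net-tools layout of `netstat -tan` (Proto, Recv-Q, Send-Q, local, remote, state); Solaris netstat orders its columns differently and spells the state names differently, so the indices are parameters. The function name is made up.

```python
from collections import Counter


def summarize_netstat(lines, sendq_col=2, remote_col=4, state_col=5):
    """Tally TCP states and collect sockets with a non-empty send queue.

    Returns (states, stuck) where states is a Counter of state names and
    stuck is a list of (remote_address, state, send_queue_bytes).
    Column indices default to the Linux `netstat -tan` layout (assumption).
    """
    states = Counter()
    stuck = []
    needed = max(sendq_col, remote_col, state_col)
    for line in lines:
        fields = line.split()
        if len(fields) <= needed:
            continue  # too short to be a connection line
        if not fields[sendq_col].isdigit():
            continue  # header line
        state = fields[state_col]
        states[state] += 1
        send_q = int(fields[sendq_col])
        if send_q > 0:
            stuck.append((fields[remote_col], state, send_q))
    return states, stuck
```

Many entries in `stuck` that all point at the appserver AJP port, combined with a large FIN_WAIT count, is the pattern described above.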

Thanks for your help!

        Peter