[activemq-user] Application Hanging when Sending JMS Messages


by Sean Kleinjung

Hi,

[I sent this message yesterday, but it does not appear to have been
successfully delivered. Since I have seen others posting on this list
with a similar delivery problem, I figured I would resend the e-mail. If
it ends up being posted twice, I apologize.]
 
We are running into a deadlock of sorts that appears to be caused by a
JMS client blocking indefinitely when attempting to send a message.
 
We are using Kodo JDO as the persistence framework for a number of web
applications (six to be exact) -- all of which share the same database
tablespace (for example, a corporate intranet with maintenance screens
for controlling portions of a public website). In addition, each of
these applications is deployed on two separate nodes in a clustered
environment. This means that in total there are six separate Kodo data
caches in use, and we are using Kodo's JMS remote commit provider with
ActiveMQ as the JMS implementation to keep them in sync. Periodically,
one of the applications will hang, and a thread dump points to ActiveMQ
as a possible cause. Whenever the application is hung, we see one thread
that is actively sending data to the server via the
TcpBufferedOutputStream.
 
The client code executing the Kodo commit varies, but the thread will
always have a pattern like the following:
 
"resin-tcp-connection-*:7771-7" daemon prio=5 tid=0xe91668 nid=0x2fa
runnable [70afe000..70affc24]
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at org.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:109)
        at java.io.DataOutputStream.flush(DataOutputStream.java:101)
        at org.activemq.transport.tcp.TcpTransportChannel.doAsyncSend(TcpTransportChannel.java:465)
        - locked <842a6f20> (a java.lang.Object)
        at org.activemq.transport.tcp.TcpTransportChannel.asyncSend(TcpTransportChannel.java:285)
        at org.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:956)
        at org.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:935)
        at org.activemq.ActiveMQSession.send(ActiveMQSession.java:1458)
        at org.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:426)
        at org.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:337)
        at org.activemq.ActiveMQTopicPublisher.publish(ActiveMQTopicPublisher.java:129)
        at kodo.event.JMSRemoteCommitProvider.broadcast(JMSRemoteCommitProvider.java:110)
        at kodo.event.RemoteCommitEventManager.afterCommit(RemoteCommitEventManager.java:130)
        at kodo.event.TransactionEventManager.fireEvent(TransactionEventManager.java:66)
        at serp.util.AbstractEventManager.fireEvent(AbstractEventManager.java:111)
        - locked <969dda10> (a kodo.event.TransactionEventManager)
        at kodo.runtime.PersistenceManagerImpl.fireTransactionEvent(PersistenceManagerImpl.java:581)
        at kodo.runtime.PersistenceManagerImpl.endTransaction(PersistenceManagerImpl.java:1364)
        at kodo.runtime.PersistenceManagerImpl.afterCompletion(PersistenceManagerImpl.java:998)
        at kodo.runtime.LocalManagedRuntime.commit(LocalManagedRuntime.java:86)
        - locked <969ddb20> (a kodo.runtime.LocalManagedRuntime)
        at kodo.runtime.PersistenceManagerImpl.commit(PersistenceManagerImpl.java:629)
        at <snip>
 
When the problem occurs, the call to socketWrite0 will never return. In
addition to the above, any future attempts to commit to the database --
and, therefore, send a JMS message -- result in a stack trace like the
following:
 
"resin-tcp-connection-*:7771-35" daemon prio=5 tid=0x7fcbb0 nid=0x83
waiting for monitor entry [70e2e000..70e2fc24]
        at org.activemq.transport.tcp.TcpTransportChannel.doAsyncSend(TcpTransportChannel.java:464)
        - waiting to lock <842a6f20> (a java.lang.Object)
        at org.activemq.transport.tcp.TcpTransportChannel.asyncSend(TcpTransportChannel.java:285)
        at org.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:956)
        at org.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:935)
        at org.activemq.ActiveMQSession.send(ActiveMQSession.java:1458)
        at org.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:426)
        at org.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:337)
        at org.activemq.ActiveMQTopicPublisher.publish(ActiveMQTopicPublisher.java:129)
        at kodo.event.JMSRemoteCommitProvider.broadcast(JMSRemoteCommitProvider.java:110)
        at kodo.event.RemoteCommitEventManager.afterCommit(RemoteCommitEventManager.java:130)
        at kodo.event.TransactionEventManager.fireEvent(TransactionEventManager.java:66)
        at serp.util.AbstractEventManager.fireEvent(AbstractEventManager.java:111)
        - locked <969dde78> (a kodo.event.TransactionEventManager)
        at kodo.runtime.PersistenceManagerImpl.fireTransactionEvent(PersistenceManagerImpl.java:581)
        at kodo.runtime.PersistenceManagerImpl.endTransaction(PersistenceManagerImpl.java:1364)
        at kodo.runtime.PersistenceManagerImpl.afterCompletion(PersistenceManagerImpl.java:998)
        at kodo.runtime.LocalManagedRuntime.commit(LocalManagedRuntime.java:86)
        - locked <969ddf88> (a kodo.runtime.LocalManagedRuntime)
        at kodo.runtime.PersistenceManagerImpl.commit(PersistenceManagerImpl.java:629)
        at <snip>
 
These threads are blocked waiting for the lock held by the first
thread (<842a6f20> in this case). (I have the associated native thread
dumps for the above if that would be useful.) As stated above, the
snipped portion varies. Oftentimes, however, the commits will hang
during processes that perform batch updates and have many consecutive
commits (such as updating a database table from an uploaded XML file).
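
The pattern in the two dumps (one thread stalled while holding a lock,
every later sender waiting for monitor entry) can be reproduced without
any networking. The following is only a sketch under assumptions:
SEND_LOCK, the thread names and the latch that stands in for the
socketWrite0 call that never returns are all hypothetical, and none of
this is ActiveMQ or Kodo code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class BlockedSenderDemo {
    // Hypothetical stand-in for the lock guarding the shared connection.
    private static final Object SEND_LOCK = new Object();

    /** Returns a snapshot of the second sender taken while the first one stalls. */
    static ThreadInfo snapshotBlockedSender() throws InterruptedException {
        CountDownLatch writeStarted = new CountDownLatch(1);
        CountDownLatch releaseWrite = new CountDownLatch(1);

        Thread first = new Thread(() -> {
            synchronized (SEND_LOCK) {
                writeStarted.countDown();
                try {
                    // Stands in for a socket write that does not return.
                    releaseWrite.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "sender-1");

        Thread second = new Thread(() -> {
            synchronized (SEND_LOCK) {
                // A real sender would write its message here.
            }
        }, "sender-2");

        first.start();
        writeStarted.await();   // sender-1 now holds the lock
        second.start();

        // Poll until sender-2 is reported as waiting for the monitor.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        ThreadInfo info = threads.getThreadInfo(second.getId());
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        while ((info == null || info.getThreadState() != Thread.State.BLOCKED)
                && System.nanoTime() < deadline) {
            Thread.sleep(10);
            info = threads.getThreadInfo(second.getId());
        }

        // Let the stalled "write" finish so both threads can exit.
        releaseWrite.countDown();
        first.join();
        second.join();
        return info;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadInfo info = snapshotBlockedSender();
        System.out.println(info.getThreadName() + " is " + info.getThreadState()
                + ", lock owner: " + info.getLockOwnerName());
    }
}
```

In this sketch the second thread only proceeds once the first one
releases the lock, which matches what the dumps show: nothing frees the
waiting senders until the stalled write returns or fails.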
 
We are using a standalone broker, and I have included the activemq.xml
file at the end of this e-mail. The host environment is Solaris 9 and
the applications are running on Resin Professional 3.0.13. The JMS
topics are configured as follows for each web application (where jmsprod is
the name of the host running the ActiveMQ broker):
 
            <jndi-link>
                <jndi-name>jms/KodoCommitProviderTopic</jndi-name>
                <factory>org.activemq.jndi.ActiveMQInitialContextFactory</factory>
                <foreign-name>KodoCommitProviderTopic</foreign-name>
                <init-param brokerURL="tcp://jmsprod:61616"/>
                <init-param topic.KodoCommitProviderTopic="jms/KodoCommitProviderTopic"/>
            </jndi-link>
 
            <jndi-link>
                <jndi-name>jms/TopicConnectionFactory</jndi-name>
                <factory>org.activemq.jndi.ActiveMQInitialContextFactory</factory>
                <foreign-name>TopicConnectionFactory</foreign-name>
                <init-param brokerURL="tcp://jmsprod:61616"/>
            </jndi-link>

The broker's log file does not contain anything that appears relevant,
containing only client connects/disconnects and a bunch of 'Checkpoint
started.' / 'Checkpoint done.' pairs -- even with logging set to TRACE.
Unfortunately, since Kodo is performing the JMS work and not our own
code, I cannot speak specifically about how the messages are sent and
received.
 
The problem recurs quite frequently. Whenever an application hangs, we
restart it. If we do not also restart the broker, the problem will recur
in about 1-3 hours under peak load during business hours. If the broker
and all of the other web applications are restarted sequentially, the
problem will not recur for about 1-2 weeks.
 
At this point, I am left with the following list of possible causes for
this problem:
 
1) A bug in the Solaris socket implementation
2) A bug in ActiveMQ causing the client sockets to wait for a server
that is never going to respond
3) A flaw in Kodo's use of JMS that is causing ActiveMQ to misbehave
4) A misconfiguration of the Topics and/or broker
 
Some of these are much more likely than others, and since I am not very
familiar with the details of JMS I am leaning towards the
misconfiguration option -- hopefully if that is the case someone here
can help. I have been reading the ActiveMQ documentation, and was
wondering if perhaps what we are witnessing is an extreme case of the
'Fast Producer / Slow Consumer' behavior described on the ActiveMQ
website. If a consumer is too slow, will it cause the producer to
eventually hang like this? One reason I think this might be related is
because one of the other web applications -- never the one that hangs --
will have its logs flooded with the following error:
 
...
WARN  [TcpTransportChannel: Socket[addr=jmsprod/10.126.144.94,port=61616,localport=56004] | org.activemq.message.util.MemoryBoundedQueue] 07 Sep 2005 11:36:14: Queue is full, waiting for it to be dequeued.
WARN  [TcpTransportChannel: Socket[addr=jmsprod/10.126.144.94,port=61616,localport=56004] | org.activemq.message.util.MemoryBoundedQueue] 07 Sep 2005 11:36:14: Queue is full, waiting for it to be dequeued.
WARN  [TcpTransportChannel: Socket[addr=jmsprod/10.126.144.94,port=61616,localport=56004] | org.activemq.message.util.MemoryBoundedQueue] 07 Sep 2005 11:36:14: Queue is full, waiting for it to be dequeued.
...
 
Sometimes the application will function normally while this error is
occurring, however, and so it does not automatically indicate that the
first application is hung. Unfortunately I am not sure how to tell what
queue is full, or why it is full.
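
As a rough illustration of the general 'fast producer / slow consumer'
effect asked about above, here is a sketch using a bounded queue from
the JDK. It is an analogy only: ArrayBlockingQueue stands in for
whatever ActiveMQ uses internally, the capacity and timeout values are
made up, and it does not show how MemoryBoundedQueue itself behaves.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class BoundedQueueDemo {
    /**
     * Offers 'attempts' messages to a queue of the given capacity while
     * nothing consumes from it, and returns how many were accepted.
     */
    static int acceptedBeforeFull(int capacity, int attempts) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            // offer() with a timeout gives up when the queue stays full;
            // put() would instead block until a consumer makes room.
            if (queue.offer("msg-" + i, 50, TimeUnit.MILLISECONDS)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("accepted: " + acceptedBeforeFull(3, 5));
    }
}
```

With no consumer, only as many messages as the capacity are accepted;
a producer that used the blocking put() instead would simply wait,
which is the kind of stall a full queue can cause in general.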
 
This is a major quality of service problem for us, and is prompting us
to have to consider completely changing the architecture of our
applications. I am hoping that a much simpler solution exists, however.
As such, any help would be greatly appreciated.
 
Thanks in advance,
Sean Kleinjung
Application Services Coordinator
Sundog Interactive, Inc.
 
 
----------------------------------------------------
activemq.xml
----------------------------------------------------
 
 
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE beans PUBLIC  "-//ACTIVEMQ//DTD//EN"
"http://activemq.org/dtd/activemq.dtd">
<beans>
 
  <!-- ==================================================================== -->
  <!-- ActiveMQ Broker Configuration -->
  <!-- ==================================================================== -->
  <broker>
    <connector>
      <tcpServerTransport uri="tcp://localhost:61616" backlog="1000" useAsyncSend="true" maxOutstandingMessages="50"/>
    </connector>
 
    <persistence>
      <cachePersistence>
        <journalPersistence directory="../var/journal">
          <jdbcPersistence dataSourceRef="derby-ds"/>
        </journalPersistence>
      </cachePersistence>
    </persistence>
  </broker>
 
  <!-- ==================================================================== -->
  <!-- JDBC DataSource Configurations -->
  <!-- ==================================================================== -->
 
  <!-- The Derby Datasource that will be used by the Broker -->
  <bean id="derby-ds" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
    <property name="driverClassName">
      <value>org.apache.derby.jdbc.EmbeddedDriver</value>
    </property>
    <property name="url">
      <!-- Use a URL like 'jdbc:hsqldb:hsql://localhost:9001' if you want to connect to a remote hsqldb -->
      <value>jdbc:derby:derbydb;create=true</value>
    </property>
    <property name="username">
      <value></value>
    </property>
    <property name="password">
      <value></value>
    </property>
    <property name="poolPreparedStatements">
      <value>true</value>
    </property>
  </bean>
 
</beans>

Re: [activemq-user] Application Hanging when Sending JMS Messages

by etolson

We have experienced a similar problem, with only one overlapping variable - so clearly this is an ActiveMQ bug.  We are running a JBoss appserver on Linux using Java 1.5 and ActiveMQ v.3.2 to distribute data to client applications.  We are using topics and non-persistent messages.  ActiveMQ locks up with a thread stuck doing a socketWrite and holding locks needed by all the other ActiveMQ threads.  Here's the thread dump of the offending thread:

"Thread-224058" daemon prio=1 tid=0xe3954c10 nid=0x2a61 runnable [0xde12d000..0xde12d670]
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at org.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:109)
        at java.io.DataOutputStream.flush(DataOutputStream.java:106)
        at org.activemq.transport.tcp.TcpTransportChannel.doAsyncSend(TcpTransportChannel.java:475)
        - locked <0x4e66b458> (a java.lang.Object)
        at org.activemq.transport.tcp.TcpTransportChannel$1.run(TcpTransportChannel.java:262)
        at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:748)
        at java.lang.Thread.run(Thread.java:595)

Has any progress been made on resolving this bug since this post was made 7 months ago?

Re: [activemq-user] Application Hanging when Sending JMS Messages

by James.Strachan

BTW we've had ActiveMQ 4.x now for about a year. Have you tried
upgrading to 4.x (e.g. 4.0-RC2) to see if this bug has been fixed?
We've fixed lots and lots of bugs and issues in 4.x over the last year.

FWIW a common case of lockup is when the TCP timeout on sockets is set
to your operating system defaults, which can sometimes be an hour or
a day; so ActiveMQ will sometimes hang until the OS realises the socket
is dead.

James


--

James
-------
http://radio.weblogs.com/0112098/

Re: [activemq-user] Application Hanging when Sending JMS Messages

by etolson

Wait a sec, ActiveMQ 4.x isn't production yet right?  I'm using this in a production system, I need a production JMS implementation.  I am planning on trying 4.x as soon as it's a production release.

I appreciate the TCP timeout comment, that may address this problem.  Is this an ActiveMQ setting or how do I change the TCP socket timeout?  

Re: [activemq-user] Application Hanging when Sending JMS Messages

by James.Strachan

On 4/12/06, etolson <etolson@...> wrote:
> Wait a sec, ActiveMQ 4.x isn't production yet right?  I'm using this in a
> production system, I need a production JMS implementation.  I am planning on
> trying 4.x as soon as it's a production release.

ActiveMQ 4 has been kinda production ready all year and it's live at
lots of sites in production right now; what slowed us down was the
Apache move and joining the incubator; if we'd been at codehaus
we'd probably be on 4.3 by now I should think :).

So 4.0-RC2 can be considered a production ActiveMQ 4 release - I'd
urge anyone using 3.x to upgrade to 4.0-RC2 as soon as possible; it's
way more stable, heavily tested and has large numbers of issues fixed
such as dealing with large queues nicely. If folks don't find any
problems with it, we'll be shipping a full 4.0 release real soon.


> I appreciate the TCP timeout comment, that may address this problem.  Is
> this an ActiveMQ setting or how do I change the TCP socket timeout?

There's lots in 4.x :)
http://activemq.org/TCP+Transport+Reference
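
For orientation, this is what a socket timeout looks like at the plain
java.net.Socket level. It is a sketch, not ActiveMQ code: the class and
method names are made up, and whether a given transport option ends up
calling setSoTimeout is an assumption to check against the reference
above. Note that in the JDK, SO_TIMEOUT bounds blocking reads; it does
not bound a write that is stuck in socketWrite0.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SoTimeoutDemo {
    /** Returns true if a read on a silent loopback connection times out. */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // Limit how long read() may block on the client side.
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read();          // the peer never writes anything
                return false;
            } catch (SocketTimeoutException expected) {
                return true;        // read gave up after timeoutMillis
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```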

--

James
-------
http://radio.weblogs.com/0112098/

Re: [activemq-user] Application Hanging when Sending JMS Messages

by Pravin Kundal

I am seeing a similar problem where my producer becomes unresponsive. I simulate this by not running a consumer and letting the broker get flooded; the producer application then goes unresponsive. Did the TCP settings from the link below work for anyone?

http://activemq.apache.org/tcp-transport-reference.html

I want the producer's send() method to time out in, say, 1 sec, so I am trying to use the settings below. Can anyone confirm?

"trace=true&connectionTimeout=5000&soTimeout=1000"; (connection timeout = 5 secs, TCP timeout = 1 sec)
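
Independently of the transport options, one generic way to bound how long the caller waits on a blocking send is to run it on a separate thread and wait on a Future with a timeout. This is a sketch only, not an ActiveMQ feature: SendWithDeadline and sendWithin are hypothetical names, and the Runnable stands in for the real producer.send(...) call.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class SendWithDeadline {
    /** Runs blockingSend on a helper thread; returns true if it finished in time. */
    static boolean sendWithin(Runnable blockingSend, long timeoutMillis)
            throws InterruptedException {
        // Daemon thread so a send that never returns cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-sender");
            t.setDaemon(true);
            return t;
        });
        Future<?> result = executor.submit(blockingSend);
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Interrupts the helper thread; a thread stuck in a native
            // socket write may not react to the interrupt.
            result.cancel(true);
            return false;
        } catch (ExecutionException e) {
            return false;           // the send itself threw an exception
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        boolean ok = sendWithin(() -> { /* call producer.send(...) here */ }, 1000);
        System.out.println("sent in time: " + ok);
    }
}
```

The caller gets control back after the deadline, but the underlying send may still be stuck and still holding its lock, so this limits the damage to the calling thread rather than fixing the hang itself.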