[activemq-user] Application Hanging when Sending JMS Messages

Hi,
[I sent this message yesterday, but it does not appear to have been successfully delivered. Since I have seen others posting on this list with a similar delivery problem, I figured I would resend the e-mail. If it ends up being posted twice, I apologize.]

We are running into a deadlock of sorts that appears to be caused by a JMS client blocking indefinitely when attempting to send a message. We are using Kodo JDO as the persistence framework for a number of web applications (six to be exact) -- all of which share the same database tablespace (for example, a corporate intranet with maintenance screens for controlling portions of a public website). In addition, each of these applications is deployed on two separate nodes in a clustered environment. This means that in total there are six separate Kodo data caches in use, and we are using Kodo's JMS remote commit provider with ActiveMQ as the JMS implementation to keep them in sync.

Periodically, one of the applications will hang, and a thread dump points to ActiveMQ as a possible cause. Whenever the application is hung, we see one thread that is actively sending data to the server via the TcpBufferedOutputStream.
The client code executing the Kodo commit varies, but the thread will always have a pattern like the following:

    "resin-tcp-connection-*:7771-7" daemon prio=5 tid=0xe91668 nid=0x2fa runnable [70afe000..70affc24]
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at org.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:109)
        at java.io.DataOutputStream.flush(DataOutputStream.java:101)
        at org.activemq.transport.tcp.TcpTransportChannel.doAsyncSend(TcpTransportChannel.java:465)
        - locked <842a6f20> (a java.lang.Object)
        at org.activemq.transport.tcp.TcpTransportChannel.asyncSend(TcpTransportChannel.java:285)
        at org.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:956)
        at org.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:935)
        at org.activemq.ActiveMQSession.send(ActiveMQSession.java:1458)
        at org.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:426)
        at org.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:337)
        at org.activemq.ActiveMQTopicPublisher.publish(ActiveMQTopicPublisher.java:129)
        at kodo.event.JMSRemoteCommitProvider.broadcast(JMSRemoteCommitProvider.java:110)
        at kodo.event.RemoteCommitEventManager.afterCommit(RemoteCommitEventManager.java:130)
        at kodo.event.TransactionEventManager.fireEvent(TransactionEventManager.java:66)
        at serp.util.AbstractEventManager.fireEvent(AbstractEventManager.java:111)
        - locked <969dda10> (a kodo.event.TransactionEventManager)
        at kodo.runtime.PersistenceManagerImpl.fireTransactionEvent(PersistenceManagerImpl.java:581)
        at kodo.runtime.PersistenceManagerImpl.endTransaction(PersistenceManagerImpl.java:1364)
        at kodo.runtime.PersistenceManagerImpl.afterCompletion(PersistenceManagerImpl.java:998)
        at kodo.runtime.LocalManagedRuntime.commit(LocalManagedRuntime.java:86)
        - locked <969ddb20> (a kodo.runtime.LocalManagedRuntime)
        at kodo.runtime.PersistenceManagerImpl.commit(PersistenceManagerImpl.java:629)
        at <snip>

When the problem occurs, the call to socketWrite0 will never return. In addition to the above, any future attempts to commit to the database -- and, therefore, send a JMS message -- result in a stack trace like the following:

    "resin-tcp-connection-*:7771-35" daemon prio=5 tid=0x7fcbb0 nid=0x83 waiting for monitor entry [70e2e000..70e2fc24]
        at org.activemq.transport.tcp.TcpTransportChannel.doAsyncSend(TcpTransportChannel.java:464)
        - waiting to lock <842a6f20> (a java.lang.Object)
        at org.activemq.transport.tcp.TcpTransportChannel.asyncSend(TcpTransportChannel.java:285)
        at org.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:956)
        at org.activemq.ActiveMQConnection.asyncSendPacket(ActiveMQConnection.java:935)
        at org.activemq.ActiveMQSession.send(ActiveMQSession.java:1458)
        at org.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:426)
        at org.activemq.ActiveMQMessageProducer.send(ActiveMQMessageProducer.java:337)
        at org.activemq.ActiveMQTopicPublisher.publish(ActiveMQTopicPublisher.java:129)
        at kodo.event.JMSRemoteCommitProvider.broadcast(JMSRemoteCommitProvider.java:110)
        at kodo.event.RemoteCommitEventManager.afterCommit(RemoteCommitEventManager.java:130)
        at kodo.event.TransactionEventManager.fireEvent(TransactionEventManager.java:66)
        at serp.util.AbstractEventManager.fireEvent(AbstractEventManager.java:111)
        - locked <969dde78> (a kodo.event.TransactionEventManager)
        at kodo.runtime.PersistenceManagerImpl.fireTransactionEvent(PersistenceManagerImpl.java:581)
        at kodo.runtime.PersistenceManagerImpl.endTransaction(PersistenceManagerImpl.java:1364)
        at kodo.runtime.PersistenceManagerImpl.afterCompletion(PersistenceManagerImpl.java:998)
        at kodo.runtime.LocalManagedRuntime.commit(LocalManagedRuntime.java:86)
        - locked <969ddf88> (a kodo.runtime.LocalManagedRuntime)
        at kodo.runtime.PersistenceManagerImpl.commit(PersistenceManagerImpl.java:629)
        at <snip>

These threads are deadlocked waiting for the lock held by the first thread (<842a6f20> in this case). (I have the associated native thread dumps for the above if that would be useful.)

As stated above, the snipped portion varies. Oftentimes, however, the commits will hang during processes that perform batch updates and have many consecutive commits (such as updating a database table from an uploaded XML file).

We are using a standalone broker, and I have included the activemq.xml file at the end of this e-mail. The host environment is Solaris 9 and the applications are running on Resin Professional 3.0.13. The JMS topics are configured as such for each web application (where jmsprod is the name of the host running the ActiveMQ broker):

    <jndi-link>
      <jndi-name>jms/KodoCommitProviderTopic</jndi-name>
      <factory>org.activemq.jndi.ActiveMQInitialContextFactory</factory>
      <foreign-name>KodoCommitProviderTopic</foreign-name>
      <init-param brokerURL="tcp://jmsprod:61616"/>
      <init-param topic.KodoCommitProviderTopic="jms/KodoCommitProviderTopic"/>
    </jndi-link>

    <jndi-link>
      <jndi-name>jms/TopicConnectionFactory</jndi-name>
      <factory>org.activemq.jndi.ActiveMQInitialContextFactory</factory>
      <foreign-name>TopicConnectionFactory</foreign-name>
      <init-param brokerURL="tcp://jmsprod:61616"/>
    </jndi-link>

The broker's log file does not contain anything that appears relevant, containing only client connects/disconnects and a bunch of 'Checkpoint started.' / 'Checkpoint done.' pairs -- even with logging set to TRACE. Unfortunately, since Kodo is performing the JMS work and not our own code, I cannot speak specifically about how the messages are sent and received.

The problem recurs quite frequently. Whenever an application hangs, we restart it. If we do not also restart the broker, the problem will recur in about 1-3 hours under peak load during business hours.
If the broker and all of the other web applications are restarted sequentially, the problem will not recur for about 1-2 weeks.

At this point, I am left with the following list of possible causes for this problem:

1) A bug in the Solaris socket implementation
2) A bug in ActiveMQ causing the client sockets to wait for a server that is never going to respond
3) A flaw in Kodo's use of JMS that is causing ActiveMQ to misbehave
4) A misconfiguration of the Topics and/or broker

Some of these are much more likely than others, and since I am not very familiar with the details of JMS I am leaning towards the misconfiguration option -- hopefully if that is the case someone here can help.

I have been reading the ActiveMQ documentation, and was wondering if perhaps what we are witnessing is an extreme case of the 'Fast Producer / Slow Consumer' behavior described on the ActiveMQ website. If a consumer is too slow, will it cause the producer to eventually hang like this? One reason I think this might be related is because one of the other web applications -- never the one that hangs -- will have its logs flooded with the following error:

    ...
    WARN [TcpTransportChannel: Socket[addr=jmsprod/10.126.144.94,port=61616,localport=56004] | org.activemq.message.util.MemoryBoundedQueue] 07 Sep 2005 11:36:14: Queue is full, waiting for it to be dequeued.
    WARN [TcpTransportChannel: Socket[addr=jmsprod/10.126.144.94,port=61616,localport=56004] | org.activemq.message.util.MemoryBoundedQueue] 07 Sep 2005 11:36:14: Queue is full, waiting for it to be dequeued.
    WARN [TcpTransportChannel: Socket[addr=jmsprod/10.126.144.94,port=61616,localport=56004] | org.activemq.message.util.MemoryBoundedQueue] 07 Sep 2005 11:36:14: Queue is full, waiting for it to be dequeued.
    ...

Sometimes the application will function normally while this error is occurring, however, and so it does not automatically indicate that the first application is hung.
Unfortunately I am not sure how to tell what queue is full, or why it is full.

This is a major quality of service problem for us, and is prompting us to have to consider completely changing the architecture of our applications. I am hoping that a much simpler solution exists, however. As such, any help would be greatly appreciated.

Thanks in advance,

Sean Kleinjung
Application Services Coordinator
Sundog Interactive, Inc.

----------------------------------------------------
activemq.xml
----------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE beans PUBLIC "-//ACTIVEMQ//DTD//EN" "http://activemq.org/dtd/activemq.dtd">
<beans>

  <!-- ==================================================================== -->
  <!-- ActiveMQ Broker Configuration -->
  <!-- ==================================================================== -->
  <broker>
    <connector>
      <tcpServerTransport uri="tcp://localhost:61616" backlog="1000" useAsyncSend="true" maxOutstandingMessages="50"/>
    </connector>

    <persistence>
      <cachePersistence>
        <journalPersistence directory="../var/journal">
          <jdbcPersistence dataSourceRef="derby-ds"/>
        </journalPersistence>
      </cachePersistence>
    </persistence>
  </broker>

  <!-- ==================================================================== -->
  <!-- JDBC DataSource Configurations -->
  <!-- ==================================================================== -->

  <!-- The Derby Datasource that will be used by the Broker -->
  <bean id="derby-ds" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
    <property name="driverClassName">
      <value>org.apache.derby.jdbc.EmbeddedDriver</value>
    </property>
    <property name="url">
      <!-- Use a URL like 'jdbc:hsqldb:hsql://localhost:9001' if you want to connect to a remote hsqldb -->
      <value>jdbc:derby:derbydb;create=true</value>
    </property>
    <property name="username">
      <value></value>
    </property>
    <property name="password">
      <value></value>
    </property>
    <property name="poolPreparedStatements">
      <value>true</value>
    </property>
  </bean>
</beans>
|
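The two thread dumps in the post above show one pattern: a first sender takes the transport's send lock and then blocks inside the socket write, so every later sender waits on the same lock. The Java sketch below reproduces only that pattern and is not ActiveMQ code. The class and method names are hypothetical, a ReentrantLock stands in for the java.lang.Object monitor so that the second sender can give up after a timeout, and a CountDownLatch simulates the write that does not return.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class BlockedWriterDemo {
    /**
     * Returns whether a second sender obtains the send lock within waitMillis
     * while a first sender holds it and is stuck in a simulated write.
     */
    public static boolean secondSenderGetsLock(long waitMillis) {
        // Hypothetical stand-in for the monitor <842a6f20> in the thread dumps.
        ReentrantLock sendLock = new ReentrantLock();
        CountDownLatch writerHoldsLock = new CountDownLatch(1);
        CountDownLatch writeCompletes = new CountDownLatch(1);

        Thread writer = new Thread(() -> {
            sendLock.lock();
            try {
                writerHoldsLock.countDown();
                writeCompletes.await(); // simulated socketWrite0 that does not return
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                sendLock.unlock();
            }
        }, "first-sender");
        writer.setDaemon(true);
        writer.start();

        try {
            writerHoldsLock.await(); // make sure the first sender owns the lock
            boolean acquired = sendLock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            if (acquired) {
                sendLock.unlock();
            }
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            writeCompletes.countDown(); // release the simulated write so the demo can exit
        }
    }

    public static void main(String[] args) {
        // prints "second sender got the lock: false"
        System.out.println("second sender got the lock: " + secondSenderGetsLock(200));
    }
}
```

With a plain synchronized block, as in the dumps, the second sender has no timeout at all, which would match the "waiting for monitor entry" threads piling up behind the first one.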
|
Re: [activemq-user] Application Hanging when Sending JMS Messages

We have experienced a similar problem, with only one overlapping variable - so clearly this is an ActiveMQ bug. We are running a JBoss appserver on Linux using Java 1.5 and ActiveMQ v.3.2 to distribute data to client applications. We are using topics and non-persistent messages. ActiveMQ locks up with a thread stuck doing a socketWrite and holding locks needed by all the other ActiveMQ threads. Here's the thread dump of the offending thread:

    "Thread-224058" daemon prio=1 tid=0xe3954c10 nid=0x2a61 runnable [0xde12d000..0xde12d670]
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
        at org.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:109)
        at java.io.DataOutputStream.flush(DataOutputStream.java:106)
        at org.activemq.transport.tcp.TcpTransportChannel.doAsyncSend(TcpTransportChannel.java:475)
        - locked <0x4e66b458> (a java.lang.Object)
        at org.activemq.transport.tcp.TcpTransportChannel$1.run(TcpTransportChannel.java:262)
        at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:748)
        at java.lang.Thread.run(Thread.java:595)

Has any progress been made on resolving this bug since this post was made 7 months ago?
|
|
Re: [activemq-user] Application Hanging when Sending JMS Messages

BTW we've had ActiveMQ 4.x now for about a year - have you tried upgrading to 4.x (e.g. 4.0-RC2) to see if this bug has been fixed? We've fixed lots and lots of bugs and issues in 4.x over the last year.

FWIW a common case of lockup is when the TCP timeout on sockets is set to your operating system defaults, which sometimes can be an hour or a day; so ActiveMQ will sometimes hang until the OS realises the socket is dead.

James

On 4/12/06, etolson <etolson@...> wrote:
>
> We have experienced a similar problem, with only one overlapping variable -
> so clearly this is an ActiveMQ bug. We are running a JBoss appserver on
> Linux using Java 1.5 and Active MQ v.3.2 to distribute data to client
> applications. We are using topics and non-persistent messages. Active MQ
> locks up with a thread stuck doing a socketWrite and holding locks needed by
> all the other Active MQ threads. Here's the thread dump of the offending
> thread:
>
> "Thread-224058" daemon prio=1 tid=0xe3954c10 nid=0x2a61 runnable [0xde12d000..0xde12d670]
>     at java.net.SocketOutputStream.socketWrite0(Native Method)
>     at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>     at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>     at org.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:109)
>     at java.io.DataOutputStream.flush(DataOutputStream.java:106)
>     at org.activemq.transport.tcp.TcpTransportChannel.doAsyncSend(TcpTransportChannel.java:475)
>     - locked <0x4e66b458> (a java.lang.Object)
>     at org.activemq.transport.tcp.TcpTransportChannel$1.run(TcpTransportChannel.java:262)
>     at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(PooledExecutor.java:748)
>     at java.lang.Thread.run(Thread.java:595)
>
> Has any progress been made on resolving this bug since this post was made 7
> months ago?
> --
> View this message in context: http://www.nabble.com/-activemq-user-Application-Hanging-when-Sending-JMS-Messages-t290790.html#a3885539
> Sent from the ActiveMQ - User forum at Nabble.com.
>
>

--

James
-------
http://radio.weblogs.com/0112098/
|
|
Re: [activemq-user] Application Hanging when Sending JMS Messages

Wait a sec, ActiveMQ 4.x isn't production yet, right? I'm using this in a production system, I need a production JMS implementation. I am planning on trying 4.x as soon as it's a production release.

I appreciate the TCP timeout comment, that may address this problem. Is this an ActiveMQ setting or how do I change the TCP socket timeout?
|
|
Re: [activemq-user] Application Hanging when Sending JMS Messages

On 4/12/06, etolson <etolson@...> wrote:
> Wait a sec, ActiveMQ 4.x isn't production yet right? I'm using this in a
> production system, I need a production JMS implementation. I am planning on
> trying 4.x as soon as it's a production release.

ActiveMQ 4 has been kinda production ready all year and it's live at lots of sites in production right now; what slowed us down was the Apache move and joining the incubator; if we'd have been at codehaus we'd probably be on 4.3 by now I should think :).

So 4.0-RC2 can be considered a production ActiveMQ 4 release - I'd urge anyone using 3.x to upgrade to 4.0-RC2 as soon as possible; it's way more stable, heavily tested and has large numbers of issues fixed such as dealing with large queues nicely.

If folks don't find any problems with it, we'll be shipping a full 4.0 release real soon.

> I appreciate the TCP timeout comment, that may address this problem. Is
> this an ActiveMQ setting or how do I change the TCP socket timeout?

There's lots in 4.x :)

http://activemq.org/TCP+Transport+Reference

--

James
-------
http://radio.weblogs.com/0112098/
|
|
Re: [activemq-user] Application Hanging when Sending JMS Messages

I am seeing a similar problem where my producer becomes unresponsive. I simulate this by not running a consumer and letting the broker get flooded. The producer application then goes unresponsive.
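The flooding scenario described above can be approximated without a broker. The Java sketch below is a hedged illustration, not ActiveMQ's implementation: a plain ArrayBlockingQueue stands in for a bounded message buffer that nothing drains, and the class and method names are hypothetical. The capacity of 50 in the usage line is borrowed from the maxOutstandingMessages value in the first post purely as an example number.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class FloodedQueueDemo {
    /**
     * Publishes into a bounded queue that has no consumer and returns how many
     * messages were accepted before an offer timed out.
     */
    public static int publishUntilStalled(int capacity, long offerTimeoutMillis) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        try {
            // Offers succeed until the queue is full; the next one waits, then gives up.
            while (queue.offer("message-" + accepted, offerTimeoutMillis, TimeUnit.MILLISECONDS)) {
                accepted++;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(publishUntilStalled(50, 100)); // prints 50
    }
}
```

If the producer used put() instead of a timed offer(), it would wait indefinitely once the queue filled, which is one way a producer can appear hung when no consumer is running.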
Did the TCP settings from the link below work for anyone?
http://activemq.apache.org/tcp-transport-reference.html
I want the producer's send() method to time out in, say, 1 second, so I am trying to use the settings below. Can anyone confirm?
"trace=true&connectionTimeout=5000&soTimeout=1000";
(connectionTimeout = 5 secs, soTimeout = 1 sec)
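One caveat on the settings above: in java.net, SO_TIMEOUT (Socket.setSoTimeout) bounds blocking reads, not writes. If the transport's soTimeout option maps onto it (an assumption, not confirmed in this thread), it may not bound a send() that is stuck in a socket write. A different technique is to bound the wait from the application side, sketched below with a worker thread and Future.get with a timeout. The blockingSend Runnable is a hypothetical stand-in for producer.send(message), and the helper creates an executor per call only to stay self-contained.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedSend {
    /**
     * Runs blockingSend on a worker thread and waits at most timeoutMillis.
     * Returns true if it completed in time, false on timeout or interruption.
     */
    public static boolean sendWithTimeout(Runnable blockingSend, long timeoutMillis) {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-send");
            t.setDaemon(true); // a stuck send must not keep the JVM alive
            return t;
        });
        Future<?> pending = worker.submit(blockingSend);
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Interrupts the worker; a thread blocked in a socket write may not react to this.
            pending.cancel(true);
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("send failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        Runnable fastSend = () -> { };
        Runnable stuckSend = () -> {
            try {
                Thread.sleep(60_000); // simulates a send blocked on a full socket buffer
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println(sendWithTimeout(fastSend, 1000));  // prints true
        System.out.println(sendWithTimeout(stuckSend, 1000)); // prints false
    }
}
```

This only frees the caller. The worker thread may stay blocked and keep holding the transport's send lock, so later sends on the same connection could still stall until the connection is closed or the socket fails.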
|