concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by Leos Bitto

During load tests of our application, all the threads that communicate with SwiftMQ sometimes appear to get stuck, and after a timeout (slightly over 1 minute) they all get "javax.transaction.xa.XAException: Request time out (60000) ms!".

Some of them have this stack trace:

javax.transaction.xa.XAException: Request time out (60000) ms!
        at com.swiftmq.jms.v610.XAResourceImpl.start(Unknown Source)
        at org.jboss.tm.TransactionImpl$Resource.startResource(TransactionImpl.java:2063)
        at org.jboss.tm.TransactionImpl.enlistResource(TransactionImpl.java:581)

Some of them have this stack trace:

javax.transaction.xa.XAException: Request time out (60000) ms!
        at com.swiftmq.jms.v610.XAResourceImpl.prepare(Unknown Source)
        at org.jboss.tm.TransactionImpl$Resource.prepare(TransactionImpl.java:2212)
        at org.jboss.tm.TransactionImpl.prepareResources(TransactionImpl.java:1660)
        at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:347)
        at org.jboss.tm.TxManager.commit(TxManager.java:240)

Some of them have this stack trace:

javax.transaction.xa.XAException: Request time out (60000) ms!
        at com.swiftmq.jms.v610.XAResourceImpl.end(Unknown Source)
        at org.jboss.tm.TransactionImpl$Resource.endResource(TransactionImpl.java:2143)
        at org.jboss.tm.TransactionImpl$Resource.endResource(TransactionImpl.java:2118)
        at org.jboss.tm.TransactionImpl.endResources(TransactionImpl.java:1462)
        at org.jboss.tm.TransactionImpl.beforePrepare(TransactionImpl.java:1116)
        at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:324)
        at org.jboss.tm.TxManager.commit(TxManager.java:240)

All threads share the same XAConnection, and each thread has its own XASession. We use SwiftMQ 6.2.1. Is there anything we could do to avoid this?

Re: concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by Leos Bitto

After the previous XAExceptions happen, the transaction manager (JBoss in our case) obviously tries to roll back the affected transactions, and after 40 seconds of inactivity some more interesting XAExceptions occur (each in a separate thread):

javax.transaction.xa.XAException: java.lang.ClassCastException: com.swiftmq.jms.smqp.v610.XAResStartReply
        at com.swiftmq.jms.v610.XAResourceImpl.rollback(Unknown Source)
        at org.jboss.tm.TransactionImpl$Resource.rollback(TransactionImpl.java:2277)
        at org.jboss.tm.TransactionImpl.rollbackResources(TransactionImpl.java:1837)
        at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:368)
        at org.jboss.tm.TxManager.commit(TxManager.java:240)

javax.transaction.xa.XAException: java.lang.ClassCastException: com.swiftmq.jms.smqp.v610.XAResPrepareReply
        at com.swiftmq.jms.v610.XAResourceImpl.rollback(Unknown Source)
        at org.jboss.tm.TransactionImpl$Resource.rollback(TransactionImpl.java:2277)
        at org.jboss.tm.TransactionImpl.rollbackResources(TransactionImpl.java:1837)
        at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:368)
        at org.jboss.tm.TxManager.commit(TxManager.java:240)

javax.transaction.xa.XAException: java.lang.ClassCastException: com.swiftmq.jms.smqp.v610.XAResEndReply
        at com.swiftmq.jms.v610.XAResourceImpl.rollback(Unknown Source)
        at org.jboss.tm.TransactionImpl$Resource.rollback(TransactionImpl.java:2277)
        at org.jboss.tm.TransactionImpl.rollbackResources(TransactionImpl.java:1837)
        at org.jboss.tm.TransactionImpl.commit(TransactionImpl.java:368)
        at org.jboss.tm.TxManager.commit(TxManager.java:240)

Am I guessing correctly that the replies the threads were initially waiting for have only now been returned? The names XAResStartReply, XAResPrepareReply and XAResEndReply suggest that they were supposed to be returned to the previous calls to XAResourceImpl.start, XAResourceImpl.prepare and XAResourceImpl.end, which got "javax.transaction.xa.XAException: Request time out (60000) ms!" instead...

Re: concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by IIT Software

It seems the old replies were delivered after the timeout. This indicates a problem on the client side. It is usually caused by the client-side session thread pool running out of threads, e.g. because all threads are stuck in onMessage delivery. If a request then times out while its reply is already in the session pool's input queue but cannot be delivered because the pool is waiting for onMessage to complete, the reply will be delivered when the next request is sent.
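The mechanism described above can be illustrated with a small, self-contained simulation. This is only a sketch of the general pattern (a late reply picked up by the next request and cast to the wrong type), not SwiftMQ's actual implementation. The classes StartReply, RollbackReply and StaleReplyDemo are hypothetical stand-ins, and the 50 ms timeout is arbitrary.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins for protocol replies; these are NOT SwiftMQ classes.
class StartReply {}
class RollbackReply {}

class StaleReplyDemo {
    // One reply queue shared by consecutive requests on the same session.
    private final BlockingQueue<Object> replies = new LinkedBlockingQueue<>();

    // Waits for the next reply and casts it to the type the caller expects.
    <T> T request(Class<T> expected, long timeoutMs) {
        Object reply;
        try {
            reply = replies.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        }
        if (reply == null) {
            throw new IllegalStateException("Request time out (" + timeoutMs + ") ms!");
        }
        return expected.cast(reply); // ClassCastException if the reply is a stale one
    }

    // Called when a reply finally reaches the session.
    void deliver(Object reply) {
        replies.add(reply);
    }

    static String run() {
        StaleReplyDemo session = new StaleReplyDemo();
        try {
            session.request(StartReply.class, 50); // the reply is late, so this times out
        } catch (IllegalStateException timeout) {
            // the caller gives up; the reply is still on its way
        }
        session.deliver(new StartReply()); // the late reply arrives after the timeout
        try {
            session.request(RollbackReply.class, 50); // next request picks up the old reply
            return "ok";
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch the second request fails with a ClassCastException because it receives the reply that belonged to the first, timed-out request, which matches the pattern in the rollback stack traces above.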

Increasing the max threads of the session pool should help. Look here. In a recent release we have increased the max threads to 50.

Re: concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by Leos Bitto

Thanks for the hint. Is there any way to calculate the necessary values for swiftmq.pool.session.threads.max and swiftmq.pool.connection.threads.max? As I wrote, I have an application which uses just 1 XAConnection, and each thread uses 1 XASession with multiple MessageProducers and MessageConsumers. I can control the number of threads started by the application.

Your documentation states "The default settings are optimal, so a change would be seldom." Because I seem to have hit the rare situation where the default values must be changed, how do I calculate them? Choosing values at random while experimenting with a system under heavy load seems problematic...

Re: concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by IIT Software

Since onMessage is the only thing which can block a session thread, the theoretical maximum is the maximum number of concurrent onMessage invocations.
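As a rough sketch of how that rule could be applied: the snippet below derives the value from the number of sessions that have a MessageListener and sets it before any connection is created. It assumes that swiftmq.pool.session.threads.max is read as a JVM system property by the client (the same could be done with -D on the command line); that assumption, and the PoolSizing class itself, are not confirmed anywhere in this thread.

```java
class PoolSizing {
    // Property name taken from this thread. Assumption: the client reads it
    // as a JVM system property when the first connection is created.
    static final String SESSION_MAX = "swiftmq.pool.session.threads.max";

    // Each session runs at most one onMessage at a time, so the number of
    // sessions with a MessageListener bounds the concurrent onMessage calls.
    static int sessionThreadsMax(int sessionsWithListeners) {
        return Math.max(1, sessionsWithListeners);
    }

    // Must run before the JMS connection is created to have any effect.
    static void apply(int sessionsWithListeners) {
        System.setProperty(SESSION_MAX,
                Integer.toString(sessionThreadsMax(sessionsWithListeners)));
    }

    public static void main(String[] args) {
        apply(20); // e.g. 20 sessions with listeners
        System.out.println(SESSION_MAX + "=" + System.getProperty(SESSION_MAX));
    }
}
```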

Re: concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by Leos Bitto

In that case my problems must be caused by something else, because the application does not use asynchronous message delivery via onMessage() at all. It uses synchronous delivery via MessageConsumer#receive instead.

Since you wrote about session threads, I understand that it is about the value swiftmq.pool.session.threads.max. Is there any information about the value swiftmq.pool.connection.threads.max, too?

Additionally please note that my threads got blocked when dealing with com.swiftmq.jms.v610.XAResourceImpl - is there any special parameter for tuning the XA transactions?

Re: concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by IIT Software

The connection pool will not block. The best approach would be to take a thread dump of the client when the problem occurs again, to see whether it is related to the session pool. The dump must be taken while the threads are stuck (before the request timeout).
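A thread dump can be taken externally with the JDK's jstack tool or by sending SIGQUIT to the JVM. If it is easier to trigger the dump from inside the client (for example from a watchdog that notices a slow XA call), the standard java.lang.management API can produce one. The following is a minimal sketch; the ThreadDumper class name and the output format are illustrative only.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumper {
    // Returns the name, state and full stack trace of every live thread.
    static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Only request lock details if this JVM supports reporting them.
        ThreadInfo[] infos = bean.dumpAllThreads(
                bean.isObjectMonitorUsageSupported(),
                bean.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("        at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Calling ThreadDumper.dump() while the threads are blocked, i.e. before the 60000 ms request timeout expires, shows which threads are waiting and where.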

Re: concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by IIT Software

The problem could also be related to network buffer resizing, because you use only 1 connection but many XA sessions sending over it. This requires network buffers (client output and router input). You might set them to a higher initial size and an extend size of 1 MB.

Re: concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by Leos Bitto

In this case it might be important to note that we use the Network NIO Swiftlet. Is there anything special to configure for this swiftlet? Or could it help to use the standard Network Swiftlet instead?

Re: concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by IIT Software

There isn't any special configuration for the Network NIO Swiftlet. However, since you use a 6.x release, you might want to upgrade to 7.5.4, because we made some important improvements concerning buffers in 7.5.0.

How many XA sessions do you use and what is the average message and transaction size?

Re: concurrency problem in com.swiftmq.jms.v610.XAResourceImpl?

by Leos Bitto

Strange thing: after we tuned various parameters in our application, we are no longer able to reproduce the problem with SwiftMQ XA commit, so I cannot provide the thread dump you wanted. We did not touch any SwiftMQ configuration parameters, though (still using the same version 6.2.1). The application usually uses about 20 XA sessions in 1 XA connection, the size of a transaction is about 50 TextMessages, and most TextMessages are 1-2 kB, though some are bigger (max. 100 kB).
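For a rough sense of scale, these figures can be turned into an estimate of the payload that may be in flight on the single connection. This is a back-of-the-envelope sketch only: it assumes all 20 sessions send a full transaction at the same moment, counts 1 kB as 1024 bytes, and ignores protocol overhead and the occasional 100 kB message. The InFlightEstimate class is purely illustrative.

```java
class InFlightEstimate {
    // Payload bytes if every session has one full transaction in flight.
    static long bytesInFlight(int sessions, int messagesPerTx, long bytesPerMessage) {
        return (long) sessions * messagesPerTx * bytesPerMessage;
    }

    public static void main(String[] args) {
        long low = bytesInFlight(20, 50, 1024);      // all messages 1 kB
        long high = bytesInFlight(20, 50, 2 * 1024); // all messages 2 kB
        System.out.println(low + " to " + high + " bytes");
        // prints "1024000 to 2048000 bytes"
    }
}
```

Under these assumptions the typical load is on the order of 1 to 2 MB across all sessions, which is the same order of magnitude as the 1 MB buffer extend size suggested earlier in the thread.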