« Return to Thread: Getting "bitronix.tm.internal.BitronixXAException: resource already started on XID" error

Re: Getting "bitronix.tm.internal.BitronixXAException: resource already started on XID" error

by Brett Wooldridge-2 :: Rate this Message:

| View in Thread

Eric,

Another thing which will be useful whether you're running  btm 2.1.2 or the development branch
is to enable logging.  Bitronix has extensive logging, and usually if we can see the logs from
the time of the error, we can figure out what went wrong (whether it was a bitronix bug or a bug
in some other component in the stack).

Because of the volume of logging from Bitronix, I recommend logging to a different file than
your primary application log.  Also, because of the volume of logging -- particularly under a 
load test -- I recommend rolling the logs so you don't lose the logs at the crucial time of
failure.

Here is a sample recommended configuration for log4j:

log4j.logger.bitronix.tm=debug, BTM
log4j.additivity.bitronix.tm=false

log4j.appender.BTM=org.apache.log4j.RollingFileAppender
log4j.appender.BTM.File=btm.log
log4j.appender.BTM.MaxFileSize=10MB
log4j.appender.BTM.MaxBackupIndex=50
log4j.appender.BTM.layout=org.apache.log4j.PatternLayout
log4j.appender.BTM.layout.ConversionPattern=%d{HH:mm:ss,SSS} [%-25.25c{1}] [%-25t] %-5p - %m%n

This is going to create a 10MB log file, which rolls 50 times (btm.log.1, btm.log.2, etc.).  The [%-25t] part
is also crucial because it logs the thread name.

What you want to do is run your load test, and if you can, stop the test as soon after the error is
observed as possible.  In your application log, find the exception that occurred during the run, and note
the time (HH:mm:ss,SSS).  Now, start opening btm.log files (btm.log, btm.log.1, btm.log.2, etc) and
find the one that contains the same approximate timestamp as the application error (+/- a few milliseconds).

Once you find the corresponding btm log, zip that btm.log AND THE TWO NEXT OLDER LOGS together
with the application log file with the error.  So, if you the error timestamp corresponded to logs in the
btm.log.18 file, zip that file along with btm.log.19 and btm.log.20 files.

You can open a bug and attach the zipped log files and we'll take a look at them.

Regards,
Brett


On Sat, Dec 10, 2011 at 6:55 PM, Brett Wooldridge <brett.wooldridge@...> wrote:
Hi Eric,

Thanks for using Bitronix.  I'm sure Ludovic will be responding soon regarding this issue, but
if you're willing to test the BTM-2.2 development branch, I would be interested to see if this
issue has been fixed.

With BTM-2.2 we're moving from a Java 1.4 codebase to Java 5.  And in the process
converting the code to use type-safe collections and more importantly taking advantage of
the newer Java 5 concurrent collections and synchronization primitives.  While going
through this process, I found at least one theoretical race condition, though not one that
had yet been reported.

I've been running the BTM-2.2 codebase on our pre-production servers for the past
two months, and have run several tens of millions of transactions through without 
issue.  Admittedly, we do not have multiple datasources and are not using LRC, so 
your environment is definitely different.

So, if you want to try the snapshot and give me your feedback regarding this issue
or whether any other errors occur in your testing.  I've made a snapshot available here:


Regards,
Brett

On Sat, Dec 10, 2011 at 12:32 AM, Eric Westfall <ewestfal@...> wrote:
Hello, we're having an issue that I'm hoping someone on this list may know something about.

We have an application configured which uses two different databases. It's using the transaction management support provided by the spring framework configured for JTA.  Things seems to work well most of the time, but when we run load tests against it we get errors similar to the following occasionally:

* bitronix.tm.internal.BitronixXAException: resource already started on XID a Bitronix XID [737072696E672D62746D0000013423394EA100000E41 : 737072696E672D62746D0000013423394F0100000E4F]
       at bitronix.tm.resource.jdbc.lrc.LrcXAResource.start(LrcXAResource.java:135)
       at bitronix.tm.internal.XAResourceHolderState.start(XAResourceHolderState.java:219)
       at bitronix.tm.internal.XAResourceManager.enlist(XAResourceManager.java:111)
       at bitronix.tm.BitronixTransaction.enlistResource(BitronixTransaction.java:93)
       at bitronix.tm.resource.common.TransactionContextHelper.enlistInCurrentTransaction(TransactionContextHelper.java:70)
       at bitronix.tm.resource.jdbc.JdbcConnectionHandle.enlistResource(JdbcConnectionHandle.java:84)
       at bitronix.tm.resource.jdbc.JdbcConnectionHandle.prepareStatement(JdbcConnectionHandle.java:289)
       at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
   ...

Full stack trace here: http://www.pastie.org/2991516

Any ideas on why this might be happening?  It appears that somehow a resource which is "started" is getting reused across multiple concurrent transactions which seems like it shouldn't happen.

Here is our bitronix transaction manager configuration:

bitronix.tm.serverId=spring-btm
bitronix.tm.disableJmx=true
bitronix.tm.timer.defaultTransactionTimeout=3600
bitronix.tm.journal=null
bitronix.tm.timer.gracefulShutdownInterval=0
bitronix.tm.journal.disk.logPart1Filename=target/btm1.tlog
bitronix.tm.journal.disk.logPart2Filename=target/btm2.tlog
bitronix.tm.2pc.warnAboutZeroResourceTransactions=false
bitronix.tm.allowMultipleLrc=true
bitronix.tm.setIgnoreRecoveryFailures=true

Of particular note here, we are using the "LRC" setup with two datasources which is why we have allowMultipleLrc=true

Relevant PoolingDataSource configuration from our spring configuration is as follows:

<property name="useTmJoin" value="true" />
<property name="allowLocalTransactions" value="true" />
<property name="shareTransactionConnections" value="true" />
<property name="enableJdbc4ConnectionTest" value="true" />
<property name="automaticEnlistingEnabled" value="true"/>

Thanks in advance for any help!
Eric



---------------------------------------------------------------------
To unsubscribe from this list, please visit:

  http://xircles.codehaus.org/manage_email




 « Return to Thread: Getting "bitronix.tm.internal.BitronixXAException: resource already started on XID" error