Suspend/resume not working as expected

by Adrian Miron

I discovered this issue while doing some work with Hibernate, but it boils down to the following scenario:

1. start a transaction
2. create a prepared statement (and keep a reference to it for later use)
3. execute the statement
4. suspend the transaction
5. resume the transaction
6. execute the prepared statement a second time

While debugging, I saw that suspend calls XAResource.end(TMSUCCESS) for all resources in the transaction, but resume doesn't start them again (probably because they are lazily started with TMJOIN later, when they are used via JdbcConnectionHandle).

But in this case, the second execution of the prepared statement is not "intercepted" by Bitronix, so it executes on a connection that was previously ended. From my observations, results vary: sometimes the statement execution just blocks, other times it succeeds but runs in a new transaction.

I made a wild guess :) and added the following line:
xaResourceHolderState.start(XAResource.TMJOIN);
in XAResourceManager#resume(), right before the end of the loop, and things seem to work, but I really don't know if it's the right thing to do.
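
To make the call sequence above easier to follow, here is a minimal sketch of what the resource sees during the six steps. It is not Bitronix code: RecordingXAResource, DemoXid and SuspendResumeSketch are hypothetical names, and the recording resource only logs the start/end calls it receives. The rejoinOnResume switch stands in for the extra start(TMJOIN) line.

```java
import java.util.ArrayList;
import java.util.List;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;

final class SuspendResumeSketch {
    // Replays the scenario; rejoinOnResume models the added start(TMJOIN) call.
    static RecordingXAResource run(boolean rejoinOnResume) {
        RecordingXAResource res = new RecordingXAResource();
        Xid xid = new DemoXid();
        res.start(xid, XAResource.TMNOFLAGS);  // steps 1-3: enlisted on first use
        res.end(xid, XAResource.TMSUCCESS);    // step 4: suspend ends the resource
        if (rejoinOnResume) {
            res.start(xid, XAResource.TMJOIN); // step 5: resume re-associates it
        }
        return res;                            // step 6 would execute the statement now
    }

    public static void main(String[] args) {
        RecordingXAResource plain = run(false);
        System.out.println(plain.calls + " associated=" + plain.associated);
        RecordingXAResource patched = run(true);
        System.out.println(patched.calls + " associated=" + patched.associated);
    }
}

// Hypothetical stand-in for a driver's XAResource: it only records calls.
final class RecordingXAResource implements XAResource {
    final List<String> calls = new ArrayList<>();
    boolean associated; // true between start() and end()

    @Override public void start(Xid xid, int flags) {
        calls.add("start(" + flagName(flags) + ")");
        associated = true;
    }

    @Override public void end(Xid xid, int flags) {
        calls.add("end(" + flagName(flags) + ")");
        associated = false;
    }

    static String flagName(int flags) {
        switch (flags) {
            case XAResource.TMNOFLAGS: return "TMNOFLAGS";
            case XAResource.TMJOIN:    return "TMJOIN";
            case XAResource.TMSUCCESS: return "TMSUCCESS";
            default:                   return Integer.toString(flags);
        }
    }

    // The remaining XAResource methods are not exercised by this sketch.
    @Override public void commit(Xid xid, boolean onePhase) {}
    @Override public void rollback(Xid xid) {}
    @Override public int prepare(Xid xid) { return XA_OK; }
    @Override public void forget(Xid xid) {}
    @Override public Xid[] recover(int flag) { return new Xid[0]; }
    @Override public boolean isSameRM(XAResource other) { return other == this; }
    @Override public int getTransactionTimeout() { return 0; }
    @Override public boolean setTransactionTimeout(int seconds) { return false; }
}

// Minimal Xid with fixed, made-up identifiers.
final class DemoXid implements Xid {
    @Override public int getFormatId() { return 1; }
    @Override public byte[] getGlobalTransactionId() { return new byte[] {1}; }
    @Override public byte[] getBranchQualifier() { return new byte[] {1}; }
}
```

Without the extra call, the resource is no longer associated with the transaction when step 6 runs, which matches the behaviour described above. Whether a real driver accepts start(TMJOIN) at that point depends on the database.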


I'm using:
Bitronix 1.3.2
MSSQL 2005
MSSQL JDBC driver v2.0

Re: Suspend/resume not working as expected

by Ludovic Orban

Hi,

Strictly speaking, this isn't a bug but a limitation of the current connection pool, but I agree the result is the same: it does not work as you expect it to.

Your patch will only work in this exact situation, and only if the underlying database properly supports transaction joining. I've searched for the best way to cope with this problem, and the only proper way is to wrap the driver's Statement / PreparedStatement / CallableStatement objects and trigger enlistment from there, which is quite a serious change.

A simpler (but much less elegant) fix would be for XAResourceManager#resume() to re-enlist the resource. This isn't as easy as it sounds, as some logic to check whether TMJOIN is disabled must be added and some internal safeguards must be relaxed.

I'll try to build a patched version with those changes over the weekend. I'd be glad if you could open an issue in JIRA in the meantime.

Thanks for the report,
Ludovic

Re: Suspend/resume not working as expected

by Adrian Miron

Thank you for your response.

I’ve added the following JIRA issue: http://jira.codehaus.org/browse/BTM-49


Thanks,
Adrian Miron

Re: Suspend/resume not working as expected

by Ludovic Orban

I've finally found the time to work on this issue. I've committed the fix in the SVN trunk and prepared a snapshot build (see link in the JIRA issue).

Could you please give it a try and let me know if it helped?

Thanks,
Ludovic

Re: Suspend/resume not working as expected

by Adrian Miron

Sorry for my delayed response. I was finally able to try the fix.

I did find a problem:

The XAResourceManager#resume method iterates over resources (type Scheduler), and at the same time the enlist method removes and adds objects in the resources collection (a case of concurrent modification).

In my particular case, I had two resources to be resumed and only one of them got re-enlisted. The iterator in the resume method returned the same object twice.
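
The effect can be reproduced outside Bitronix with a small sketch. Everything here is an assumption for illustration: ResumeLoopSketch and its methods are hypothetical, a plain list of names stands in for the Scheduler, and enlist() only mimics the remove-then-add behaviour described above. An index-based walk is used on purpose, because iterating an ArrayList with its own iterator would fail fast with ConcurrentModificationException instead of silently repeating an element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ResumeLoopSketch {
    // Hypothetical stand-in for enlist(): removes the resource and adds it back.
    static void enlist(List<String> resources, String resource) {
        resources.remove(resource);
        resources.add(resource);
    }

    // Re-enlists inside the loop, so the collection changes while it is walked.
    // Returns the resources in the order the loop actually visited them.
    static List<String> resumeInsideLoop(List<String> resources) {
        List<String> visited = new ArrayList<>();
        for (int i = 0; i < resources.size(); i++) {
            String resource = resources.get(i);
            visited.add(resource);
            enlist(resources, resource);
        }
        return visited;
    }

    // Takes a snapshot first and re-enlists only from the snapshot,
    // so the walk never sees the modifications.
    static List<String> resumeOutsideLoop(List<String> resources) {
        List<String> toEnlist = new ArrayList<>(resources);
        for (String resource : toEnlist) {
            enlist(resources, resource);
        }
        return toEnlist;
    }

    public static void main(String[] args) {
        System.out.println(resumeInsideLoop(new ArrayList<>(Arrays.asList("A", "B"))));
        System.out.println(resumeOutsideLoop(new ArrayList<>(Arrays.asList("A", "B"))));
    }
}
```

With two resources A and B, the first variant visits A twice and never reaches B, while the second visits each resource once. Separating the walk from the modification is one common way to avoid this kind of problem.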

Re: Suspend/resume not working as expected

by Ludovic Orban

Thanks for the feedback.

Would it be possible for you to collect debug logs while reproducing the problem and send me the file?

This would help tremendously in figuring out exactly what you did and what went wrong.

Thanks,
Ludovic

Re: Suspend/resume not working as expected

by Ludovic Orban

I found a problem in the code for the case where the resource supports join but does not want to join at this time. Maybe this is the problem you hit?

I've prepared a snapshot build with the fix: http://snapshots.repository.codehaus.org/org/codehaus/btm/btm/1.3.3-20090829/

Please try it and let me know if it helped. If not, please send me the debug logs you collected with that snapshot version (not the RC1).

Thanks,
Ludovic

Re: Suspend/resume not working as expected

by Adrian Miron

Yes, that was it. Moving the re-enlistment out of the loop did the trick.

Thanks,
Adrian Miron