[ https://issues.apache.org/jira/browse/DIRSERVER-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722995#action_12722995 ]
Emmanuel Lecharny commented on DIRSERVER-1377:
----------------------------------------------
Update:
After adding more logging and grepping/sedding through gigabytes of logs (no fun at all), I found some dubious code and results. Here is the log I get just before the first NPE:
---> pool-6-thread-7 - ---> Remove apacheOneLevel_forward = 1, 226034
pool-6-thread-7 - <--- Remove AVL apacheOneLevel_forward = 1, 226034
pool-6-thread-7 - ---> Remove apacheOneLevel_reverse = 226034
pool-6-thread-7 - <--- Remove AVL apacheOneLevel_reverse = 226034
pool-6-thread-7 - ---> Remove apacheSubLevel_forward = 226034, 226034
pool-6-thread-7 - <--- Remove AVL apacheSubLevel_forward = 226034, 226034
pool-6-thread-7 - ---> Remove apacheSubLevel_reverse = 226034
pool-6-thread-7 - <--- Remove AVL apacheSubLevel_reverse = 226034
---> pool-6-thread-7 - ---> Remove apacheOneLevel_forward = 1, 226034
pool-6-thread-7 - Error while removing 1, 226034 on table apacheOneLevel_forward
java.lang.NullPointerException
at org.apache.directory.server.core.avltree.AvlTreeMarshaller.deserialize(AvlTreeMarshaller.java:240)
As one can see, we remove the same entry from the oneLevelIndex twice. The code that does this is (the relevant lines are marked with a *):
    ndnIdx.drop( id );
    updnIdx.drop( id );
*   oneLevelIdx.drop( id );
    entryCsnIdx.drop( id );
    entryUuidIdx.drop( id );
    if( id != 1 )
    {
        subLevelIdx.drop( id );
    }
    // Remove parent's reference to entry only if entry is not the upSuffix
    if ( !parentId.equals( 0L ) )
    {
*       oneLevelIdx.drop( parentId, id );
    }
Sadly (?), the second removal has no effect (I checked), other than wasting CPU.
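To illustrate why the second removal does nothing useful, here is a minimal toy model, not the ApacheDS code. ToyOneLevelIndex and its methods are hypothetical names, and the sketch assumes (as the log above suggests) that drop( id ) already removes the ( parentId, id ) forward tuple along with the reverse entry:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical stand-in for a one-level index (not the real ApacheDS class):
 * forward maps parentId -> child ids, reverse maps child id -> parentId.
 */
class ToyOneLevelIndex
{
    private final Map<Long, Set<Long>> forward = new HashMap<Long, Set<Long>>();
    private final Map<Long, Long> reverse = new HashMap<Long, Long>();

    void add( long parentId, long id )
    {
        Set<Long> children = forward.get( parentId );

        if ( children == null )
        {
            children = new HashSet<Long>();
            forward.put( parentId, children );
        }

        children.add( id );
        reverse.put( id, parentId );
    }

    /** Removes the tuple that references id; returns true if something was removed. */
    boolean drop( long id )
    {
        Long parentId = reverse.get( id );

        return ( parentId != null ) && drop( parentId, id );
    }

    /** Removes the ( parentId, id ) tuple; returns true if it was present. */
    boolean drop( long parentId, long id )
    {
        Set<Long> children = forward.get( parentId );

        if ( ( children == null ) || !children.remove( id ) )
        {
            return false;
        }

        if ( children.isEmpty() )
        {
            forward.remove( parentId );
        }

        reverse.remove( id );

        return true;
    }
}
```

In this model, calling drop( 226034L ) followed by drop( 1L, 226034L ) returns true and then false: the second call only pays for the lookup. If the real index behaves the same way, the second call in the delete path could simply be removed.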
Apart from that, *all* the exceptions I get come from the same line of code, AvlTreeMarshaller.java:240 (marked with a *):
    for( int i = 0; i < nodes.length - 1; i++ )
    {
        nodes[ i ].setNext( nodes[ i + 1] );
*       nodes[ i + 1].setPrevious( nodes[ i ] );
    }
which means the index table is broken (nodes[i+1] is null). The table can be incorrect for one of two reasons:
- we save some bad table
- we recreate a bad table
I will double check that code.
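One possible way to tell the two cases apart is to validate the node array explicitly instead of waiting for the NPE. The sketch below is only an illustration under stated assumptions: ToyNode and ToyLinker are hypothetical names, not the actual AvlTreeMarshaller classes. Running the same check just before the table is serialized and just after it is deserialized would show on which side the null slot first appears.

```java
/** Hypothetical linked node, standing in for the real AVL tree node class. */
class ToyNode
{
    final long key;
    ToyNode previous;
    ToyNode next;

    ToyNode( long key )
    {
        this.key = key;
    }

    void setNext( ToyNode next )
    {
        this.next = next;
    }

    void setPrevious( ToyNode previous )
    {
        this.previous = previous;
    }
}

/** Hypothetical helper that links nodes the same way as the loop above, with a sanity check first. */
class ToyLinker
{
    /** Returns the index of the first null slot, or -1 if every slot is filled. */
    static int firstNullSlot( ToyNode[] nodes )
    {
        for ( int i = 0; i < nodes.length; i++ )
        {
            if ( nodes[i] == null )
            {
                return i;
            }
        }

        return -1;
    }

    /** Links each node to its neighbours, failing with a descriptive message on a broken table. */
    static void link( ToyNode[] nodes )
    {
        int broken = firstNullSlot( nodes );

        if ( broken >= 0 )
        {
            throw new IllegalStateException(
                "Broken index table: slot " + broken + " of " + nodes.length + " is null" );
        }

        for ( int i = 0; i < nodes.length - 1; i++ )
        {
            nodes[i].setNext( nodes[i + 1] );
            nodes[i + 1].setPrevious( nodes[i] );
        }
    }
}
```

The exception message carries the slot number and the array length, which should be easier to grep for in gigabytes of logs than a bare NullPointerException.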
Separately, Kiran and I discussed serialization/deserialization at length, and we agreed that it should be done *inside* jdbm. I will work on that too.
> Potential concurrency issue when adding/modifying/deleting entries at a high rate
> ---------------------------------------------------------------------------------
>
> Key: DIRSERVER-1377
> URL: https://issues.apache.org/jira/browse/DIRSERVER-1377
> Project: Directory ApacheDS
> Issue Type: Bug
> Affects Versions: 1.5.4
> Reporter: Emmanuel Lecharny
> Priority: Blocker
> Fix For: 1.5.5
>
>
> When adding/deleting entries with many clients (each client adds and deletes an entry many times), we may run into concurrency problems, as the indexes are updated without any protection against concurrent access.
> Synchronizing the classes where we update the index might help.