Lock problems: Lock obtain timed out

by Jérôme Etévé

Hi,

  I've got a few machines that post documents concurrently to a Solr
instance. They do not issue commits themselves; instead, I've got
autocommit set up on the Solr server side:
  <autoCommit>
    <maxDocs>50000</maxDocs> <!-- commit at least every 50000 docs -->
    <maxTime>60000</maxTime> <!-- stay at most 60s without a commit -->
  </autoCommit>

This usually works fine, but sometimes the server goes into a deadlocked
state. Here are the errors I get from the log (these go on forever
until I delete the index and restart everything from zero):

02-Nov-2009 10:35:27 org.apache.solr.update.SolrIndexWriter finalize
SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates
a bug -- POSSIBLE RESOURCE LEAK!!!
...
[ multiple messages like this ]
...
02-Nov-2009 10:35:27 org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain
timed out: NativeFSLock@/home/solrdata/jobs/index/lucene-703db99881e56205cb910a2e5fd816d3-write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:85)
        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1538)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1395)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:190)
        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
        at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:220)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)


I'm wondering what could be the reason for this (a commit taking
more than 60 seconds, for instance?), and whether I should use better
locking or autocommit options.

Here's the locking conf I've got at the moment:
  <writeLockTimeout>1000</writeLockTimeout>
  <commitLockTimeout>10000</commitLockTimeout>
  <lockType>native</lockType>

I'm using Solr trunk from 12 Oct 2009 within Tomcat.

Thanks for any help.

Jerome.

--
Jerome Eteve.
http://www.eteve.net
jerome@...

Re: Lock problems: Lock obtain timed out

by hossman


: 02-Nov-2009 10:35:27 org.apache.solr.update.SolrIndexWriter finalize
: SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates
: a bug -- POSSIBLE RESOURCE LEAK!!!

can you post some context showing what the logs look like just before
these errors?

I'm not sure what might be causing the lock collision, but your guess about
commits taking too long and overlapping is a good one -- what do the log
messages about the commits say around the time these errors start? each
commit logs when it finishes and how long it took, so this is easy to spot.

increasing your writeLockTimeout is probably a good idea, but i'm still
confused as to why the whole server would lock up until you delete the
index and restart. at worst i would expect the update/commit attempts that
time out getting the lock to complain loudly, but then the "slow" one
would eventually finish and subsequent attempts would work ok.
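
[Editor's note] The behaviour described above (a waiter times out, the lock holder eventually finishes, later attempts succeed) can be sketched with plain java.nio file locks. This is only an illustration under assumptions, not Lucene's or Solr's actual locking code: the class name WriteLockSketch, the obtain helper, the temp lock file, and the poll interval are all hypothetical.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteLockSketch {

    /**
     * Polls for an exclusive OS-level lock on the channel's file until timeoutMs elapses.
     * Returns the held lock, or null if the timeout expired first.
     * (Hypothetical helper for illustration; not part of Lucene or Solr.)
     */
    static FileLock obtain(FileChannel channel, long timeoutMs, long pollMs)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            try {
                FileLock lock = channel.tryLock(); // null if another process holds it
                if (lock != null) {
                    return lock;
                }
            } catch (OverlappingFileLockException e) {
                // Thrown when this same JVM already holds a lock on the file.
            }
            if (System.currentTimeMillis() >= deadline) {
                return null; // timed out
            }
            Thread.sleep(pollMs);
        }
    }

    public static void main(String[] args) throws Exception {
        Path lockFile = Files.createTempFile("sketch-write", ".lock");
        try (FileChannel first = FileChannel.open(lockFile, StandardOpenOption.WRITE);
             FileChannel second = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
            FileLock held = obtain(first, 1000, 100);      // the "slow" writer
            FileLock contended = obtain(second, 300, 100); // times out while the lock is held
            System.out.println("first=" + (held != null) + " second=" + (contended != null));

            held.release();                                // the slow writer finishes
            FileLock retry = obtain(second, 300, 100);     // a later attempt succeeds
            System.out.println("afterRelease=" + (retry != null));
            if (retry != null) {
                retry.release();
            }
        } finally {
            Files.deleteIfExists(lockFile);
        }
    }
}
```

If the first holder never releases the lock, every later call to obtain times out in the same way, which would match errors that repeat indefinitely.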

...very odd.

-Hoss


Re: Lock problems: Lock obtain timed out

by Jérôme Etévé

Hi,

It seems this situation is caused by some "No space left on device" exceptions:
SEVERE: java.io.IOException: No space left on device
        at java.io.RandomAccessFile.writeBytes(Native Method)
        at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:192)
        at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)


I'd better try to set my maxMergeDocs and mergeFactor to values more
adequate for my app. I'm indexing ~15 GB of data on a 20 GB device,
so I guess there's a problem when Solr tries to merge the index
bits being built.

At the moment, they are set to   <mergeFactor>100</mergeFactor> and
<maxMergeDocs>2147483647</maxMergeDocs>

Jerome.

--
Jerome Eteve.
http://www.eteve.net
jerome@...

Re: Lock problems: Lock obtain timed out

by Lance Norskog-2

This will never work reliably. You should have disk space of at least
2x the size of the index. Optimize, for one, requires this.
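
[Editor's note] The 2x rule of thumb above can be checked before indexing with a small standalone program. This is a minimal sketch under assumptions: the class IndexHeadroomCheck and its methods are hypothetical names, and it reads "2x" as "free space on the device should be at least the current index size".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class IndexHeadroomCheck {

    /** Sums the sizes of all regular files under dir (for example, an index directory). */
    static long directorySize(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(f -> f.toFile().length())
                        .sum();
        }
    }

    /**
     * Assumed reading of the 2x rule: the device should have at least as much
     * usable space as the index currently occupies.
     */
    static boolean hasHeadroom(long indexBytes, long usableBytes) {
        return usableBytes >= indexBytes;
    }

    public static void main(String[] args) throws IOException {
        // The index path is passed as the first argument; "." is only a placeholder default.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : ".");
        long indexBytes = directorySize(indexDir);
        long usableBytes = indexDir.toFile().getUsableSpace();
        System.out.println("index bytes:  " + indexBytes);
        System.out.println("usable bytes: " + usableBytes);
        System.out.println("headroom ok:  " + hasHeadroom(indexBytes, usableBytes));
    }
}
```

With the figures from this thread (about 15 GB of index on a 20 GB device, so roughly 5 GB free), hasHeadroom returns false.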

On Wed, Nov 4, 2009 at 6:37 AM, Jérôme Etévé <jerome.eteve@...> wrote:

> Hi,
>
> It seems this situation is caused by some "No space left on device" exceptions:
> SEVERE: java.io.IOException: No space left on device
>        at java.io.RandomAccessFile.writeBytes(Native Method)
>        at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
>        at org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:192)
>        at org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
>
>
> I'd better try to set my maxMergeDocs and mergeFactor to values more
> adequate for my app. I'm indexing ~15 GB of data on a 20 GB device,
> so I guess there's a problem when Solr tries to merge the index
> bits being built.
>
> At the moment, they are set to   <mergeFactor>100</mergeFactor> and
> <maxMergeDocs>2147483647</maxMergeDocs>
>
> Jerome.
>
> --
> Jerome Eteve.
> http://www.eteve.net
> jerome@...
>



--
Lance Norskog
goksron@...