Delete by docId in IndexWriter

by Shay Banon

Hi,

I have a case where deleting documents by doc id makes sense (I know beforehand which docs I want to delete, based on their doc ids). I am wondering why the API is not exposed in IndexWriter (as it is in IndexReader). I understand that this API is more "expert" than typical usage, but it allows for a certain optimization on my end (I have already performed the query for deletion and have the doc ids, so I don't want to perform it again). It looks like DocumentsWriter already has support for deleting by document id, so I was wondering whether it is possible to expose it in IndexWriter.

Thanks,
Shay

Re: Delete by docId in IndexWriter

by Jason Rutherglen-2

This requires tracking the genealogy of docids as they are merged inside
IndexWriter. It's doable, so if you're particularly interested feel free to
open a jira issue.
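
To picture why doc ids need tracking across merges, here is a minimal toy sketch. It is not Lucene code: the `Segment` class and the `mergeAndRemap` method are hypothetical names invented for illustration, and the model assumes only that a merge drops deleted documents and renumbers the survivors consecutively.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only; none of these names come from Lucene. */
public class DocIdRemapSketch {

    /** Hypothetical segment: documents in local-id order plus one deletion flag each. */
    static final class Segment {
        final String[] docs;
        final boolean[] deleted;

        Segment(String... docs) {
            this.docs = docs;
            this.deleted = new boolean[docs.length];
        }
    }

    /**
     * Merges the segments into mergedOut, dropping deleted documents.
     * Returns a map from old global doc id (segment base + local id)
     * to the new doc id, or -1 if the document was dropped.
     */
    static int[] mergeAndRemap(List<Segment> segments, List<String> mergedOut) {
        int total = 0;
        for (Segment s : segments) {
            total += s.docs.length;
        }
        int[] oldToNew = new int[total];
        int oldId = 0;
        for (Segment s : segments) {
            for (int local = 0; local < s.docs.length; local++, oldId++) {
                if (s.deleted[local]) {
                    oldToNew[oldId] = -1;               // squeezed out by the merge
                } else {
                    oldToNew[oldId] = mergedOut.size(); // next free id in the merged segment
                    mergedOut.add(s.docs[local]);
                }
            }
        }
        return oldToNew;
    }

    public static void main(String[] args) {
        Segment first = new Segment("a", "b", "c");
        Segment second = new Segment("d", "e");
        first.deleted[1] = true; // "b" was deleted before the merge

        List<String> merged = new ArrayList<>();
        int[] oldToNew = mergeAndRemap(Arrays.asList(first, second), merged);

        System.out.println(Arrays.toString(oldToNew)); // prints [0, -1, 1, 2, 3]
        System.out.println(merged);                    // prints [a, c, d, e]
    }
}
```

In this toy run, doc id 3 refers to "d" before the merge and to "e" after it, so a delete issued with a stale id would hit the wrong document unless the writer remaps it through something like `oldToNew`.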

On Sun, Jun 28, 2009 at 2:21 AM, Shay Banon <kimchy@...> wrote:

>
> Hi,
>
>   I have a case where deleting documents by doc id make sense (I know
> before hand the docs I want to delete based on the doc id). I am wondering
> why the API is not exposed in the IndexWriter (as it is in IndexReader). I
> understand that this API is more "expert" than typical usage, but it allows
> for certain optimization on my end (already performed the query for deletion
> and I have the doc ids, so I don't want to perform it again...). It looks
> like the DocumentsWriter already has support for deleting by document id, so
> I was wondering if it is possible to expose it in IndexWriter.
>
> Thanks,
> Shay
> --
> View this message in context:
> http://www.nabble.com/Delete-by-docId-in-IndexWriter-tp24239930p24239930.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@...
> For additional commands, e-mail: java-user-help@...
>

Re: Delete by docId in IndexWriter

by Shay Banon

Agreed, that's the tricky part. I will open a Jira issue. I am really hoping to find some time and maybe also provide a patch.

Thanks,
Shay


Re: Delete by docId in IndexWriter

by Ganesh - yahoo

This issue has been raised before. IndexWriter does not provide any way to delete by doc id. IndexReader does, but it requires write permission. In my case, both an IndexReader and an IndexWriter are always open, and the IndexReader is reopened frequently. To delete, I have no option other than to write a query and delete through the IndexWriter, even though I already know the doc id.

This feature is a required one.

Regards
Ganesh


----- Original Message -----
From: "Jason Rutherglen" <jason.rutherglen@...>
To: <java-user@...>
Sent: Monday, June 29, 2009 6:54 AM
Subject: Re: Delete by docId in IndexWriter


> This requires tracking the genealogy of docids as they are merged inside
> IndexWriter. It's doable, so if you're particularly interested feel free to
> open a jira issue.