deleteDocuments() does not work

Hi all,

I have a very simple method to delete a document that was indexed earlier:

    /**
     * @param id
     */
    public void deleteById(String id) throws IOException {
        IndexWriter writer = IndexWriterFactory.factory();

        try {
            writer.deleteDocuments(new Term(Configuration.Field.ID, String.valueOf(id)));
            writer.commit();
        } catch (ArrayIndexOutOfBoundsException e) {
            // CHECK ignore this. Can happen if index has not been built yet
        } catch (IOException e) {
            System.out.println(e);
        }
    }

The problem is that after this method runs without any exception, when I come back and run a search, the supposed-to-be-deleted record is still there. I have to restart my servlet engine before the record is really deleted. How can this happen?

Thanks
Dinh
|
|
Re: deleteDocuments() does not work

Hi Dinh,

Does your engine keep an IndexSearcher (and its IndexReader) open the whole time? For the deleted document to actually be reflected in the search (service), you'd need to reload the index searcher with the latest version of the index.

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The distinction is yours to draw............

On Wed, Oct 28, 2009 at 4:15 PM, Dinh <pcdinh@...> wrote:
> [...]
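To make the point above concrete: an open reader behaves like a frozen snapshot of the index, so a commit made afterwards is invisible to it until it is reopened. The following is only a toy model in plain Java, not Lucene code. `ToyIndex`, `ToyReader`, and all of their methods are hypothetical stand-ins invented for this illustration; the only behaviour borrowed from the thread is that `reopen()` hands back the same instance when nothing has changed.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Toy stand-in for an index (hypothetical, NOT a Lucene class). */
class ToyIndex {
    private Set<String> committed = Collections.emptySet();
    private long version = 0;

    /** Commits an addition by publishing a new immutable set of ids. */
    synchronized void add(String id) {
        Set<String> next = new HashSet<>(committed);
        if (next.add(id)) {
            publish(next);
        }
    }

    /** Commits a deletion by publishing a new immutable set of ids. */
    synchronized void delete(String id) {
        Set<String> next = new HashSet<>(committed);
        if (next.remove(id)) {
            publish(next);
        }
    }

    private void publish(Set<String> next) {
        committed = Collections.unmodifiableSet(next);
        version++;
    }

    /** Opens a reader frozen at the current committed state. */
    synchronized ToyReader open() {
        return new ToyReader(this, committed, version);
    }

    synchronized long version() {
        return version;
    }
}

/** Toy stand-in for a reader: a point-in-time view taken when it was opened. */
class ToyReader {
    private final ToyIndex index;
    private final Set<String> snapshot;
    private final long version;

    ToyReader(ToyIndex index, Set<String> snapshot, long version) {
        this.index = index;
        this.snapshot = snapshot;
        this.version = version;
    }

    boolean contains(String id) {
        return snapshot.contains(id);
    }

    /** Returns this reader if the index is unchanged, otherwise a fresh one. */
    ToyReader reopen() {
        return index.version() == version ? this : index.open();
    }
}

class SnapshotDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("42");
        ToyReader reader = index.open();
        index.delete("42");
        System.out.println(reader.contains("42"));          // true: stale view
        System.out.println(reader.reopen().contains("42")); // false: fresh view
    }
}
```

In this model, a long-lived singleton reader keeps answering from its old snapshot, which matches the symptom of deletions only showing up after a restart.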
|
|
Re: deleteDocuments() does not work

Hi Anshum,

> Is it that your engine keeps an IndexSearcher[Reader] open all through this while?

The answer is yes. I have been keeping a singleton instance of IndexSearcher open across web requests.

Following your advice, I have tried to re-open the IndexReader that is associated with that IndexSearcher:

    public void deleteById(String id) throws IOException {
        IndexWriter writer = IndexWriterFactory.factory();

        try {
            writer.deleteDocuments(new Term(Configuration.Field.ID, String.valueOf(id)));
            writer.commit();
            IndexSearcherFactory.reload();
        } catch (ArrayIndexOutOfBoundsException e) {
            // CHECK ignore this. Can happen if index has not been built yet
        } catch (IOException e) {
            System.out.println(e);
        }
    }

Here is how IndexSearcherFactory#reload is defined:

    public static void reload() throws CorruptIndexException, IOException {

        Set<Map.Entry<String, IndexReader>> set = readers.entrySet();
        for (Map.Entry<String, IndexReader> entry : set) {
            readers.put(entry.getKey(), entry.getValue().reopen(true));
        }
    }

However, it does not work either.

Is there any way to debug this situation?

Thanks,
Dinh

On Wed, Oct 28, 2009 at 5:49 PM, Anshum <anshumg@...> wrote:
> [...]

--
Spica Framework: http://code.google.com/p/spica
http://www.twitter.com/pcdinh
http://groups.google.com/group/phpvietnam
|
|
Re: deleteDocuments() does not work

Can you stop suppressing the AIOOBE (ArrayIndexOutOfBoundsException), just in case you're hitting that?

Also, you are failing to close the old reader after opening a new one. This shouldn't cause the issue you're seeing, but it will eventually lead to OOME or file descriptor exhaustion.

Can you verify you are in fact reopening the reader that's reading the same Directory the writer is writing to?

Finally, are you sure the iteration over the Map entries, which overwrites each entry, is safe?

Maybe, after writer.commit, try to simply [temporarily] open a new reader on that Dir and see if the doc is deleted.

Are you sure String.valueOf(id) is giving you the expected result? E.g., does id ever have leading zeros?

Mike

On Wed, Oct 28, 2009 at 7:17 AM, Dinh <pcdinh@...> wrote:
> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@...
For additional commands, e-mail: java-user-help@...
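The leading-zeros question matters because deleting by Term only affects documents whose indexed field value matches the term text exactly. The sketch below uses plain Java only. It assumes, purely for illustration, that the ID field was indexed from a numeric value while the delete request carries the raw string; `indexedFromInt` is a hypothetical helper, not something from the original code.

```java
class IdFormDemo {
    /** Hypothetical helper: the text an id would have if it was indexed from an int. */
    static String indexedFromInt(String rawId) {
        return String.valueOf(Integer.parseInt(rawId));
    }

    public static void main(String[] args) {
        String requestId = "007";

        // String.valueOf on a String returns it unchanged, zeros included.
        String asPassed = String.valueOf(requestId);

        // Going through an int drops the leading zeros.
        String asIndexed = indexedFromInt(requestId);

        System.out.println(asPassed);                   // 007
        System.out.println(asIndexed);                  // 7
        System.out.println(asPassed.equals(asIndexed)); // false
    }
}
```

If the two forms differ like this, the delete would silently match nothing, so it is worth logging the exact term text on both the indexing and the deleting side.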
|
|
Re: deleteDocuments() does not work

Hi Michael,

Thanks a lot for your advice.

> Can you verify you are in fact reopening the reader that's reading the
> same Directory the writer is writing to?

Yes. I have a single, configurable index path, so I cannot make a mistake here.

> Also, you are failing to close the old reader after opening a new one.
> This shouldn't cause the issue you're seeing, but, will lead
> eventually to OOME or file descriptor exhaustion.

I have rewritten the method as follows:

    /**
     * Reloads searchers after index is changed (added, deleted or updated).
     */
    public static synchronized void reload() {

        Set<Map.Entry<String, IndexSearcher>> set = searchers.entrySet();

        for (Map.Entry<String, IndexSearcher> entry : set) {
            try {
                IndexSearcher searcher = entry.getValue();
                IndexReader oldReader = searcher.getIndexReader();
                IndexReader newReader = oldReader.reopen(true);

                if (newReader != oldReader) {
                    oldReader.close();
                    searcher.close();
                    searchers.put(entry.getKey(), new IndexSearcher(newReader));
                }
            } catch (Exception e) {
                log.warn(e.getMessage(), e);
            }
        }
    }

And it works now.

> Finally, are you sure the iteration over the Map entries, that
> overwrites each entry, is safe?

Do you think that my iteration is safe now? At least I now close the previous searcher and oldReader before creating new ones. However, I don't know whether this is good practice.

Thanks
Dinh

On Wed, Oct 28, 2009 at 6:47 PM, Michael McCandless <lucene@...> wrote:
> [...]
|
|
Re: deleteDocuments() does not work

On Tue, Nov 3, 2009 at 4:24 AM, Dinh <pcdinh@...> wrote:
> Hi Michael,
>
> Thank a lot for your advice
>
>> Can you verify you are in fact reopening the reader that's reading the
>> same Directory the writer is writing to?
>
> Yes. I have a single and configurable index path. So I can not make a
> mistake here

OK.

>> Also, you are failing to close the old reader after opening a new one.
>> This shouldn't cause the issue you're seeing, but, will lead
>> eventually to OOME or file descriptor exhaustion.
>
> I have rewritten the method as follows
>
> [...]

Your reload method looks better now! (You are now closing the old reader.)

> And it works now.

Does that mean you no longer see the original problem (changes not being reflected)?

>> Finally, are you sure the iteration over the Map entries, that
>> overwrites each entry, is safe?
>
> Do you think that my iteration is safe now? At least I have closed the
> previous searcher and oldReader before creating new ones. However, I don't
> know if it is a good practice to do so.

You get the entrySet from the Map, you then iterate over its Map.Entry, then you replace in your original map some entries (the ones that are reopened). So, you are modifying a Java collection while iterating over elements from its Set view... I just don't know if that's safe (anyone?).

Would be good to instrument/debug and confirm that the precise IndexReader that's searching the Directory your IndexWriter just committed to is in fact reopened.

Mike
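On the open question of replacing map values while iterating: the `java.util.HashMap` Javadoc states that changing the value of a key the map already contains is not a structural modification, so it should not trip the fail-fast iterator, whereas adding or removing a key does. The sketch below assumes the map is a plain `HashMap` (the thread never says which implementation `searchers` uses) and uses strings as stand-ins for searchers; `MapReplaceDemo` and its methods are hypothetical names.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

class MapReplaceDemo {
    /** Mirrors the thread's pattern: put() on a key that already exists. */
    static void reloadWithPut(Map<String, String> searchers) {
        for (Map.Entry<String, String> entry : searchers.entrySet()) {
            searchers.put(entry.getKey(), entry.getValue() + "-reopened");
        }
    }

    /** The documented in-place alternative: replace through the entry itself. */
    static void reloadWithSetValue(Map<String, String> searchers) {
        for (Map.Entry<String, String> entry : searchers.entrySet()) {
            entry.setValue(entry.getValue() + "-reopened");
        }
    }

    /**
     * Returns true if adding a NEW key mid-iteration is caught by the iterator.
     * Needs at least two entries so the iterator advances after the change.
     */
    static boolean addingKeyFails(Map<String, String> searchers) {
        try {
            for (String key : searchers.keySet()) {
                searchers.put(key + "-extra", "new");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> searchers = new HashMap<>();
        searchers.put("a", "r1");
        searchers.put("b", "r2");

        reloadWithPut(searchers);
        reloadWithSetValue(searchers);
        System.out.println(searchers.get("a"));        // r1-reopened-reopened
        System.out.println(addingKeyFails(searchers)); // true
    }
}
```

`entry.setValue(...)` is arguably the clearer choice, since it does not depend on which Map implementation is behind the field. Fail-fast detection is best-effort by contract, so it should be treated as a debugging aid rather than a guarantee.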
|
|
Re: deleteDocuments() does not work

Hi Michael,

> Does that mean you no longer see the original problem (changes not
> being reflected)?

Yes. The deleted documents no longer appear in search results. I am not sure whether the deletions have been flushed to disk at that point, but at least there is a sign that the documents are "deleted". I have stopped and restarted the servlet engine to confirm that the deleted document is no longer there. I think that Lucene requires the previously opened IndexReader to be closed before changes can be reflected.

> You get the entrySet from the Map, you then iterate over its
> Map.Entry, then you replace in your original map some entries (the
> ones that are opened). So, you are modifying a Java collection while
> iterating over elements from its Set view... I just don't know if
> that's safe (anyone?)

I am a bit skeptical about my approach, because the IndexSearchers can be used by other threads (requests) at the same time, so when I close them some users can be affected. I will find a better way to do it.

Also, because reload() is synchronized, only a single thread can be inside it at a time. So I think there will be no ConcurrentModificationException.

> Would be good to instrument/debug and confirm
> that the precise IndexReader that's searching the Directory your
> IndexWriter just committed to, is in fact reopened.

Do you think this code is enough?

    IndexReader oldReader = searcher.getIndexReader();
    IndexReader newReader = oldReader.reopen(true);

    if (newReader != oldReader) {
        oldReader.close();
        searcher.close();
        searchers.put(entry.getKey(), new IndexSearcher(newReader));
    }

Thanks,
Dinh

On Tue, Nov 3, 2009 at 4:50 PM, Michael McCandless <lucene@...> wrote:
> [...]
|
|
Re: deleteDocuments() does not work

On Tue, Nov 3, 2009 at 5:21 AM, Dinh <pcdinh@...> wrote:
> Hi Michael,
>
>> Does that mean you no longer see the original problem (changes not
>> being reflected)?
>
> Yes. The deleted documents do not appear in search results any more. I am
> not sure that if they are flushed to disk
> at that time yet but at least there is a sign that they are "deleted". I
> have stopped and started the servlet engine to ensure
> that deleted document is no longer there. I think that Lucene requires the
> previously opened IndexReader be closed before changes
> can be reflected.

Hmmm: closing the old IndexReader shouldn't be necessary in order for a newly opened IndexReader to see changes. Something else must've been fixed at the same time (and I'm glad you got it fixed!).

>> You get the entrySet from the Map, you then iterate over its
>> Map.Entry, then you replace in your original map some entries (the
>> ones that are opened). So, you are modifying a Java collection while
>> iterating over elements from its Set view... I just don't know if
>> that's safe (anyone?)
>
> I am a bit skeptical about my approach. Because the IndexSearchers can be
> used by other threads (requests) at the same time.

That's definitely a problem. [Shameless plug:] the next rev of Lucene in Action has a class (SearcherManager) which gracefully handles reopening in the presence of multiple threads still using the old IndexSearcher. It uses the reference counting already built in to IndexReader to keep track of how many queries are still using the old reader.

> Also, because reload() is synchronized so there is a single thread accessing
> it only. So I think that there will be no ConcurrentModificationException

Right, but your one thread is both iterating over and modifying the Map, at once. That's what concerns me (but it could very well be safe).

>> Would be good to instrument/debug and confirm
>> that the precise IndexReader that's searching the Directory your
>> IndexWriter just committed to, is in fact reopened.
>
> Do you think that these code are enough
>
>     IndexReader oldReader = searcher.getIndexReader();
>     IndexReader newReader = oldReader.reopen(true);
>
>     if (newReader != oldReader) {
>         oldReader.close();
>         searcher.close();
>         searchers.put(entry.getKey(), new IndexSearcher(newReader));
>     }

Yes, this code looks fine!

Mike
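The reference-counting idea mentioned above can be sketched in plain Java. This is not the SearcherManager class from the book and it does not touch Lucene; `RefCountedSearcher` and `SearcherHolder` are hypothetical stand-ins that only show the shape of the technique: each search borrows the current searcher, and a swapped-out searcher is closed only once the last borrower has returned it.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a searcher that is closed once nobody uses it. */
class RefCountedSearcher {
    private final String name;
    // Starts at 1: the holder's own reference.
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean closed = false;

    RefCountedSearcher(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    boolean isClosed() {
        return closed;
    }

    void incRef() {
        refs.incrementAndGet();
    }

    /** Drops one reference; the last one out closes the resource. */
    void decRef() {
        if (refs.decrementAndGet() == 0) {
            closed = true; // a real implementation would close the reader here
        }
    }
}

/** Hands out the current searcher and swaps in new ones safely. */
class SearcherHolder {
    private RefCountedSearcher current;

    SearcherHolder(RefCountedSearcher initial) {
        current = initial;
    }

    /** Borrows the current searcher; must be paired with release(). */
    synchronized RefCountedSearcher acquire() {
        current.incRef();
        return current;
    }

    /** Returns a borrowed searcher. */
    void release(RefCountedSearcher searcher) {
        searcher.decRef();
    }

    /** Installs a new searcher; the old one closes after in-flight searches finish. */
    synchronized void swap(RefCountedSearcher fresh) {
        RefCountedSearcher old = current;
        current = fresh;
        old.decRef(); // drop the holder's reference to the old searcher
    }
}
```

A request thread would call `acquire()`, run its query inside a `try` block, and call `release(...)` in `finally`. Because `acquire()` and `swap()` are both synchronized on the holder, a searcher can never reach zero references and then be handed out again.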