Directory.list() deprecation

Hi all.

I am trying to clean up some deprecated calls which are showing up on upgrading to 2.9.0 (from 2.3.2...), and I have just come across Directory.list(), which says this:

> Deprecated. For some Directory implementations (FSDirectory, and its subclasses), this method silently filters its results to include only index files. Please use listAll instead, which does no filtering.

* We have files in there which aren't Lucene's so obviously listAll() will not work.
* We can't use FSDirectory directly because our tests rely on the Directory abstraction so that they can use a RAMDirectory.

Given this, what is the suggested replacement for this method once it goes away?

I'm not sure I understand the motivation for the change in list(), but I do think it was inconsistent for Directory implementations to perform different filtering (they should have at least all used the same filter.)

Daniel

--
Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
http://nuix.com/                                and eDiscovery software

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@...
For additional commands, e-mail: java-user-help@...
|
|
Re: Directory.list() deprecation

Well... you can use oal.index.IndexFileNameFilter.getFilter() to filter for only the Lucene index files, or, you could filter for the additional files you know you've placed in the index directory?

The motivation for this change was that Directory is the wrong place to have "smarts" about what is & isn't an index file: it's too low-level (and, different Directory impls were inconsistent). Especially with flexible indexing coming, where a codec can write whatever files it wants, the Directory has no way to know.

Some details are in http://issues.apache.org/jira/browse/LUCENE-1468.

Mike

On Fri, Nov 6, 2009 at 12:39 AM, Daniel Noll <daniel@...> wrote:
> Hi all.
>
> I am trying to clean up some deprecated calls which are showing up on
> upgrading to 2.9.0 (from 2.3.2...), and I have just come across
> Directory.list(), which says this:
>
>> Deprecated. For some Directory implementations (FSDirectory, and its
>> subclasses), this method silently filters its results to include only
>> index files. Please use listAll instead, which does no filtering.
>
> * We have files in there which aren't Lucene's so obviously
>   listAll() will not work.
> * We can't use FSDirectory directly because our tests rely on the
>   Directory abstraction so that they can use a RAMDirectory.
>
> Given this, what is the suggested replacement for this method once it
> goes away?
>
> I'm not sure I understand the motivation for the change in list(), but
> I do think it was inconsistent for Directory implementations to
> perform different filtering (they should have at least all used the
> same filter.)
>
> Daniel
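A minimal sketch of the second option Mike mentions above, filtering out the additional files you know you've placed in the index directory. It assumes the application can enumerate its own file names up front. The class and method names (OwnFileExcluder, withoutOwnFiles) are hypothetical and not part of Lucene; the String[] argument stands in for whatever dir.listAll() returned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, not part of Lucene. */
final class OwnFileExcluder {
    private OwnFileExcluder() {}

    /**
     * Returns, in their original order, the names from allNames that are
     * not listed in ownFiles.
     *
     * allNames is assumed to be the unfiltered listing of the directory
     * (for example the result of Directory.listAll()); ownFiles is the set
     * of names the application itself wrote there.
     */
    static List<String> withoutOwnFiles(String[] allNames, Set<String> ownFiles) {
        List<String> result = new ArrayList<String>();
        for (String name : allNames) {
            if (!ownFiles.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

Because this only looks at names, it works the same way for a RAMDirectory as for an FSDirectory. The obvious limitation is that anything left out of ownFiles is treated as an index file.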
|
|
Re: Directory.list() deprecation

On Fri, Nov 6, 2009 at 20:26, Michael McCandless <lucene@...> wrote:
> Well... you can use oal.index.IndexFileNameFilter.getFilter() to
> filter for only the Lucene index files, or, you could filter for the
> additional files you know you've placed in the index directory?

This is the workaround we're currently using, but it's pretty obvious why it's less than ideal:

    FilenameFilter filter = IndexFileNameFilter.getFilter();
    List<String> results = new ArrayList<String>();
    for (String candidate : dir.listAll()) {
        if (filter.accept(null, candidate)) { // <--
            results.add(candidate);
        }
    }

The biggest issue here is that the FilenameFilter forces us to provide a File for the first parameter even though the index may not even be stored on disk. We can pass null and hope that the filter won't have an issue with that, which works ... *for now*.

> The motivation for this change was that Directory is the wrong place
> to have "smarts" about what is & isn't an index file: it's too
> low-level (and, different Directory impls were inconsistent).
> Especially with flexible indexing coming, where a codec can write
> whatever files it wants, the Directory has no way to know.

This seems reasonable, but it would have been nice to have a list method which accepted a filter so that there would at least be a replacement for the old behaviour. The way it is now, Lucene has deprecated a method people were using while providing no replacement except for "write it yourself", the same as what happened when Hits got canned.

Daniel
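The "list method which accepted a filter" that Daniel asks for can be approximated in application code. What follows is only a sketch of the same loop wrapped in a reusable helper; FilteredListing and listFiltered are hypothetical names and not a Lucene API. It takes the names as a plain String[] (for instance the result of dir.listAll()) and a java.io.FilenameFilter (for instance IndexFileNameFilter.getFilter()), and it passes null for the directory argument on the assumption that the filter only inspects the file name.

```java
import java.io.FilenameFilter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Lucene. */
final class FilteredListing {
    private FilteredListing() {}

    /**
     * Returns the names accepted by the filter, in their original order.
     *
     * The filter is called with a null directory, so this is only safe
     * for filters that decide purely on the file name.
     */
    static List<String> listFiltered(String[] allNames, FilenameFilter filter) {
        List<String> results = new ArrayList<String>();
        for (String candidate : allNames) {
            if (filter.accept(null, candidate)) {
                results.add(candidate);
            }
        }
        return results;
    }
}
```

Keeping the null in one place means that if a filter ever starts looking at the File argument, there is a single call site to revisit.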
|
|
Re: Directory.list() deprecation

On Sun, Nov 8, 2009 at 4:58 PM, Daniel Noll <daniel@...> wrote:
>> Well... you can use oal.index.IndexFileNameFilter.getFilter() to
>> filter for only the Lucene index files, or, you could filter for the
>> additional files you know you've placed in the index directory?
>
> This is the workaround we're currently using, but it's pretty obvious
> why it's less than ideal:
>
>     FilenameFilter filter = IndexFileNameFilter.getFilter();
>     List<String> results = new ArrayList<String>();
>     for (String candidate : dir.listAll()) {
>         if (filter.accept(null, candidate)) { // <--
>             results.add(candidate);
>         }
>     }
>
> The biggest issue here is that the FilenameFilter forces us to provide
> a File for the first parameter even though the index may not even be
> stored on disk. We can pass null and hope that the filter won't have
> an issue with that, which works ... *for now*.

I don't expect IndexFileNameFilter will ever look at that (File directory) argument. Lucene itself does the same thing (passes null), internally, whenever it uses IndexFileNameFilter.

>> The motivation for this change was that Directory is the wrong place
>> to have "smarts" about what is & isn't an index file: it's too
>> low-level (and, different Directory impls were inconsistent).
>> Especially with flexible indexing coming, where a codec can write
>> whatever files it wants, the Directory has no way to know.
>
> This seems reasonable, but it would have been nice to have a list
> method which accepted a filter so that there would at least be a
> replacement for the old behaviour. The way it is now, Lucene has
> deprecated a method people were using while providing no replacement
> except for "write it yourself", the same as what happened when Hits
> got canned.

Honestly I thought the effort to write the above code was trivial enough that preserving this inside Lucene was not necessary. But I guess it would have been good to include such a code fragment in the javadocs for list().

Stepping back, since presumably your app knows what it's storing in the directory, can't you filter for files you know you've created? What's the larger use case here?

Mike
|
|
Re: Directory.list() deprecation

On Tue, Nov 10, 2009 at 00:44, Michael McCandless <lucene@...> wrote:
> Stepping back, since presumably your app knows what it's storing in
> the directory, can't you filter for files you know you've created?
> What's the larger use case here?

The exact use case where we were using list() is to determine whether the index had data in it, without having to open it and do a docCount() (well, there were also calls to it in the unit tests, but those were entirely replaceable with listAll()).

This was previously a one-liner:

    boolean containsData = directory.list().length > 1;

Maybe there is another newer API which will return this to being a one-liner -- at the time it was written this seemed to be the best option.

By the way, when I say "there is no data in it", I mean the index exists but has 0 documents. Detecting that the index itself does not exist is somewhat simpler.

Daniel
|
|
Re: Directory.list() deprecation

On Mon, Nov 9, 2009 at 7:53 PM, Daniel Noll <daniel@...> wrote:
> The exact use case where we were using list() is to determine whether
> the index had data in it, without having to open it and do a
> docCount() (well, there were also calls to it in the unit tests, but
> those were entirely replaceable with listAll()).
>
> This was previously a one-liner:
>
>     boolean containsData = directory.list().length > 1
>
> Maybe there is another newer API which will return this to being a
> one-liner -- at the time it was written this seemed to be the best
> option.
>
> By the way, when I mean "there is no data in it", I mean the index
> exists but has 0 documents. Detecting that the index itself does not
> exist is somewhat simpler.

I see. There's IndexReader.indexExists(), but it sounds like that's not what you want because you want to check whether in fact it has > 0 docs in it.

Otherwise, I think something like this (requires 2.9, since prior to that SegmentInfos isn't public) should work:

    SegmentInfos sis = new SegmentInfos();
    try {
        sis.read(dir);
    } catch (IOException ioe) {
        // presumably no index exists
    }
    int totDocCount = 0;
    for (SegmentInfo info : sis) {
        totDocCount += info.docCount;
    }

It's not a one-liner, but it's fast to run since it just reads the segments file.

But remember that SegmentInfos has forward rights to break back-compat ("subject to change suddenly in the next release")!

Mike