<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
	<id>tag:old.nabble.com,2006:forum-46</id>
	<title>Nabble - Lucene - Java Developer</title>
	<updated>2009-12-24T04:00:30Z</updated>
	<link rel="self" type="application/atom+xml" href="http://old.nabble.com/Lucene---Java-Developer-f46.xml" />
	<link rel="alternate" type="text/html" href="http://old.nabble.com/Lucene---Java-Developer-f46.html" />
	<subtitle type="html"></subtitle>
	
<entry>
	<id>tag:old.nabble.com,2006:post-26913256</id>
	<title>[jira] Commented: (LUCENE-2034) Massive Code Duplication in Contrib Analyzers - unifly the analyzer ctors</title>
	<published>2009-12-24T04:00:30Z</published>
	<updated>2009-12-24T04:00:30Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794396#action_12794396&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794396#action_12794396&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Simon Willnauer commented on LUCENE-2034:
&lt;br&gt;-----------------------------------------
&lt;br&gt;&lt;br&gt;With LUCENE-2169 this patch becomes even more valuable: creating Analyzers that extend StopwordAnalyzerBase will be much faster than in previous versions. With this in mind, we should add a note to StopwordAnalyzerBase that recommends using CharArraySet for stopwords. As a next step, we should refactor WordlistLoader and deprecate all public HashSet * methods in favour of Set&amp;lt;?&amp;gt; with CharArraySet as the internal implementation. Unifying the way stopwords are created / loaded outside of a StopwordAnalyzerBase subclass would go the same way, I would guess; we need one way to do it, though. I will create corresponding issues within the next few days.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Massive Code Duplication in Contrib Analyzers - unifly the analyzer ctors
&lt;br&gt;&amp;gt; -------------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2034
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2034&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2034&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: contrib/analyzers
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 2.9
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Simon Willnauer
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Robert Muir
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2034,patch, LUCENE-2034,patch, LUCENE-2034.patch, LUCENE-2034.patch, LUCENE-2034.patch, LUCENE-2034.patch, LUCENE-2034.patch, LUCENE-2034.patch, LUCENE-2034.patch, LUCENE-2034.patch, LUCENE-2034.patch, LUCENE-2034.txt
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Due to the various TokenStream APIs we have had in Lucene, analyzer subclasses need to implement at least one of the methods returning a TokenStream. When you look at the code, it appears to be almost identical if both are implemented in the same analyzer. &amp;nbsp;Each analyzer defines the same inner class (SavedStreams), which is unnecessary.
&lt;br&gt;&amp;gt; In contrib, almost every analyzer uses stopwords, and each of them creates its own way of loading them or defines a large number of ctors to load stopwords from a file, set, arrays, etc. Those ctors should be deprecated and eventually removed.
&lt;/div&gt;&lt;br&gt;-- 
&lt;br&gt;This message is automatically generated by JIRA.
&lt;br&gt;-
&lt;br&gt;You can reply to this email to add a comment to the issue online.
&lt;br&gt;&lt;br&gt;&lt;br&gt;---------------------------------------------------------------------
&lt;br&gt;To unsubscribe, e-mail: &lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26913256&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;java-dev-unsubscribe@...&lt;/a&gt;
&lt;br&gt;For additional commands, e-mail: &lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26913256&amp;i=1&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;java-dev-help@...&lt;/a&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2034%29-Massive-Code-Duplication-in-Contrib-Analyzers---unifly-the-analyzer-ctors-tp26207039p26913256.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26913218</id>
	<title>[jira] Commented: (LUCENE-2162) ExtendableQueryParser should allow extensions to access the toplevel parser settings/ properties</title>
	<published>2009-12-24T03:54:29Z</published>
	<updated>2009-12-24T03:54:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794395#action_12794395&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794395#action_12794395&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Simon Willnauer commented on LUCENE-2162:
&lt;br&gt;-----------------------------------------
&lt;br&gt;&lt;br&gt;I plan to commit this by Dec. 28 if nobody objects.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; ExtendableQueryParser should allow extensions to access the toplevel parser settings/ properties
&lt;br&gt;&amp;gt; ------------------------------------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2162
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2162&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2162&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: contrib/*
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Simon Willnauer
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Simon Willnauer
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Trivial
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2162.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Based on the latest discussions in LUCENE-2039 this issue will expose the toplevel parser via the ExtensionQuery so that ExtensionParsers can access properties like getAllowLeadingWildcards() from the top level parser.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2162%29-ExtendableQueryParser-should-allow-extensions-to-access-the-toplevel-parser-settings--properties-tp26792863p26913218.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26908782</id>
	<title>[jira] Resolved: (LUCENE-2179) CharArraySet.clear()</title>
	<published>2009-12-23T15:27:29Z</published>
	<updated>2009-12-23T15:27:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;[ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&lt;/a&gt;&amp;nbsp;]
&lt;br&gt;&lt;br&gt;Uwe Schindler resolved LUCENE-2179.
&lt;br&gt;-----------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Resolution: Fixed
&lt;br&gt;&amp;nbsp; &amp;nbsp; Lucene Fields: [New, Patch Available] &amp;nbsp;(was: [New])
&lt;br&gt;&lt;br&gt;Committed revision: 893655
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; CharArraySet.clear()
&lt;br&gt;&amp;gt; --------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2179
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Robert Muir
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2179.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I needed CharArraySet.clear() for something I was working on in Solr in a tokenstream.
&lt;br&gt;&amp;gt; Instead I ended up using CharArrayMap&amp;lt;Boolean&amp;gt; because it supports .clear().
&lt;br&gt;&amp;gt; It would be better to use a set, though. Currently CharArraySet throws UOE (UnsupportedOperationException) for .clear(), because the clear() inherited from AbstractSet calls iterator.remove(), which throws UOE.
&lt;br&gt;&amp;gt; In Solr, the very similar CharArrayMap.clear() looks like this:
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; &amp;nbsp; @Override
&lt;br&gt;&amp;gt; &amp;nbsp; public void clear() {
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; count = 0;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(keys,null);
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(values,null);
&lt;br&gt;&amp;gt; &amp;nbsp; }
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; I think we can do a similar thing, as long as we throw UOE for the UnmodifiableCharArraySet.
&lt;br&gt;&amp;gt; I will submit a patch later tonight (unless someone is bored and has nothing better to do).
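&lt;br&gt;&lt;br&gt;Illustration only, not part of the original report: a minimal Java sketch of the idea described above, assuming a simplified array-backed set. ClearSketch, SimpleCharSet, and UnmodifiableSimpleCharSet are hypothetical names and are not Lucene classes. Here clear() resets the count and nulls the backing array, while the unmodifiable variant throws UnsupportedOperationException instead.

```java
import java.util.Arrays;

// Sketch only: ClearSketch and its nested classes are hypothetical, not Lucene's CharArraySet.
final class ClearSketch {

    // Simplified set of words backed by a growable array of char[] keys.
    static class SimpleCharSet {
        protected char[][] keys = new char[8][];
        protected int count = 0;

        // Adds the word unless it is already present; doubles the array when full.
        public boolean add(String text) {
            if (contains(text)) {
                return false;
            }
            if (count == keys.length) {
                keys = Arrays.copyOf(keys, keys.length * 2);
            }
            keys[count++] = text.toCharArray();
            return true;
        }

        // Linear scan; Arrays.equals returns false for the unused (null) slots.
        public boolean contains(String text) {
            char[] wanted = text.toCharArray();
            for (char[] key : keys) {
                if (Arrays.equals(key, wanted)) {
                    return true;
                }
            }
            return false;
        }

        public int size() {
            return count;
        }

        // Same shape as the quoted clear(): reset the count and null out the slots.
        public void clear() {
            count = 0;
            Arrays.fill(keys, null);
        }
    }

    // Read-only variant: every mutating method throws UnsupportedOperationException.
    static final class UnmodifiableSimpleCharSet extends SimpleCharSet {
        UnmodifiableSimpleCharSet(SimpleCharSet source) {
            keys = source.keys.clone();
            count = source.count;
        }

        @Override
        public boolean add(String text) {
            throw new UnsupportedOperationException();
        }

        @Override
        public void clear() {
            throw new UnsupportedOperationException();
        }
    }

    public static void main(String[] args) {
        SimpleCharSet stopwords = new SimpleCharSet();
        stopwords.add("the");
        stopwords.add("and");
        stopwords.clear();
        System.out.println(stopwords.size()); // prints 0

        try {
            new UnmodifiableSimpleCharSet(stopwords).clear();
        } catch (UnsupportedOperationException expected) {
            System.out.println("read-only set rejected clear()");
        }
    }
}
```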
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2179%29-CharArraySet.clear%28%29-tp26905482p26908782.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26908783</id>
	<title>[jira] Resolved: (LUCENE-2169) Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.</title>
	<published>2009-12-23T15:27:29Z</published>
	<updated>2009-12-23T15:27:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;[ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&lt;/a&gt;&amp;nbsp;]
&lt;br&gt;&lt;br&gt;Uwe Schindler resolved LUCENE-2169.
&lt;br&gt;-----------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Resolution: Fixed
&lt;br&gt;&amp;nbsp; &amp;nbsp; Lucene Fields: [New, Patch Available] &amp;nbsp;(was: [New])
&lt;br&gt;&lt;br&gt;Committed revision: 893655
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.
&lt;br&gt;&amp;gt; --------------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2169
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.1
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Simon Willnauer
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; The copy method should use the entries array itself to copy the set internally instead of iterating over all values. This would speed up copying even for small sets.
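&lt;br&gt;&lt;br&gt;Illustration only, not part of the original report: a minimal Java sketch contrasting the two copy strategies, assuming a simplified array-backed set. CopySketch, SimpleCharSet, copyByIteration, and copyByArray are hypothetical names, not the Lucene API or the committed patch. The slow path re-adds every entry; the fast path bulk-copies the backing array with System.arraycopy.

```java
import java.util.Arrays;

// Sketch only: CopySketch and its nested class are hypothetical, not Lucene's CharArraySet.
final class CopySketch {

    static final class SimpleCharSet {
        private char[][] keys;
        private int count;

        SimpleCharSet(int capacity) {
            keys = new char[Math.max(capacity, 1)][];
        }

        // Adds the word unless it is already present; doubles the array when full.
        boolean add(String text) {
            if (contains(text)) {
                return false;
            }
            if (count == keys.length) {
                keys = Arrays.copyOf(keys, keys.length * 2);
            }
            keys[count++] = text.toCharArray();
            return true;
        }

        // Linear scan; Arrays.equals returns false for the unused (null) slots.
        boolean contains(String text) {
            char[] wanted = text.toCharArray();
            for (char[] key : keys) {
                if (Arrays.equals(key, wanted)) {
                    return true;
                }
            }
            return false;
        }

        int size() {
            return count;
        }

        // Slow path: re-adds every entry, paying a membership check per element.
        static SimpleCharSet copyByIteration(SimpleCharSet source) {
            SimpleCharSet copy = new SimpleCharSet(source.keys.length);
            for (char[] key : source.keys) {
                if (key != null) {
                    copy.add(new String(key));
                }
            }
            return copy;
        }

        // Fast path: copies the backing array in one bulk call. The inner char[]
        // keys are shared, which is safe here because this sketch never mutates them.
        static SimpleCharSet copyByArray(SimpleCharSet source) {
            SimpleCharSet copy = new SimpleCharSet(source.keys.length);
            System.arraycopy(source.keys, 0, copy.keys, 0, source.keys.length);
            copy.count = source.count;
            return copy;
        }
    }

    public static void main(String[] args) {
        SimpleCharSet source = new SimpleCharSet(4);
        source.add("the");
        source.add("and");

        SimpleCharSet fast = SimpleCharSet.copyByArray(source);
        fast.add("or"); // modifies only the copy's backing array

        System.out.println(fast.contains("the"));  // prints true
        System.out.println(source.contains("or")); // prints false
    }
}
```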
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2169%29-Speedup-of-CharArraySet-copy-if-a-CharArraySet-instance-is-passed-to-copy.-tp26827428p26908783.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26907226</id>
	<title>[jira] Commented: (LUCENE-2179) CharArraySet.clear()</title>
	<published>2009-12-23T12:48:29Z</published>
	<updated>2009-12-23T12:48:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794218#action_12794218&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794218#action_12794218&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Uwe Schindler commented on LUCENE-2179:
&lt;br&gt;---------------------------------------
&lt;br&gt;&lt;br&gt;For this to work, the clear() test will be disabled in the backwards-compatibility (BW) branch.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; CharArraySet.clear()
&lt;br&gt;&amp;gt; --------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2179
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Robert Muir
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2179.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I needed CharArraySet.clear() for something I was working on in Solr in a tokenstream.
&lt;br&gt;&amp;gt; Instead I ended up using CharArrayMap&amp;lt;Boolean&amp;gt; because it supports .clear().
&lt;br&gt;&amp;gt; It would be better to use a set, though. Currently CharArraySet throws UOE (UnsupportedOperationException) for .clear(), because the clear() inherited from AbstractSet calls iterator.remove(), which throws UOE.
&lt;br&gt;&amp;gt; In Solr, the very similar CharArrayMap.clear() looks like this:
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; &amp;nbsp; @Override
&lt;br&gt;&amp;gt; &amp;nbsp; public void clear() {
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; count = 0;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(keys,null);
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(values,null);
&lt;br&gt;&amp;gt; &amp;nbsp; }
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; I think we can do a similar thing, as long as we throw UOE for the UnmodifiableCharArraySet.
&lt;br&gt;&amp;gt; I will submit a patch later tonight (unless someone is bored and has nothing better to do).
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2179%29-CharArraySet.clear%28%29-tp26905482p26907226.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26906999</id>
	<title>[jira] Commented: (LUCENE-2179) CharArraySet.clear()</title>
	<published>2009-12-23T12:26:29Z</published>
	<updated>2009-12-23T12:26:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794208#action_12794208&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794208#action_12794208&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Uwe Schindler commented on LUCENE-2179:
&lt;br&gt;---------------------------------------
&lt;br&gt;&lt;br&gt;I will commit this together with LUCENE-2169.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; CharArraySet.clear()
&lt;br&gt;&amp;gt; --------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2179
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Robert Muir
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2179.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I needed CharArraySet.clear() for something I was working on in Solr in a tokenstream.
&lt;br&gt;&amp;gt; Instead I ended up using CharArrayMap&amp;lt;Boolean&amp;gt; because it supports .clear().
&lt;br&gt;&amp;gt; It would be better to use a set, though. Currently CharArraySet throws UOE (UnsupportedOperationException) for .clear(), because the clear() inherited from AbstractSet calls iterator.remove(), which throws UOE.
&lt;br&gt;&amp;gt; In Solr, the very similar CharArrayMap.clear() looks like this:
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; &amp;nbsp; @Override
&lt;br&gt;&amp;gt; &amp;nbsp; public void clear() {
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; count = 0;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(keys,null);
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(values,null);
&lt;br&gt;&amp;gt; &amp;nbsp; }
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; I think we can do a similar thing, as long as we throw UOE for the UnmodifiableCharArraySet.
&lt;br&gt;&amp;gt; I will submit a patch later tonight (unless someone is bored and has nothing better to do).
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2179%29-CharArraySet.clear%28%29-tp26905482p26906999.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26906973</id>
	<title>[jira] Commented: (LUCENE-2169) Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.</title>
	<published>2009-12-23T12:24:29Z</published>
	<updated>2009-12-23T12:24:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794207#action_12794207&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794207#action_12794207&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Uwe Schindler commented on LUCENE-2169:
&lt;br&gt;---------------------------------------
&lt;br&gt;&lt;br&gt;OK.
&lt;br&gt;&lt;br&gt;I will commit this together with LUCENE-2179.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.
&lt;br&gt;&amp;gt; --------------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2169
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.1
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Simon Willnauer
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; The copy method should use the entries array itself to copy the set internally instead of iterating over all values. This would speed up copying even for small sets.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2169%29-Speedup-of-CharArraySet-copy-if-a-CharArraySet-instance-is-passed-to-copy.-tp26827428p26906973.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26906154</id>
	<title>[jira] Commented: (LUCENE-2179) CharArraySet.clear()</title>
	<published>2009-12-23T11:12:29Z</published>
	<updated>2009-12-23T11:12:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794162#action_12794162&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794162#action_12794162&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Simon Willnauer commented on LUCENE-2179:
&lt;br&gt;-----------------------------------------
&lt;br&gt;&lt;br&gt;Patch looks good, Uwe! +1 from my side.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; CharArraySet.clear()
&lt;br&gt;&amp;gt; --------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2179
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Robert Muir
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2179.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I needed CharArraySet.clear() for something I was working on in Solr in a tokenstream.
&lt;br&gt;&amp;gt; Instead I ended up using CharArrayMap&amp;lt;Boolean&amp;gt; because it supported .clear().
&lt;br&gt;&amp;gt; It would be better to use a set, though. Currently it will throw UOE for .clear() because AbstractSet will call iterator.remove(), which throws UOE.
&lt;br&gt;&amp;gt; In Solr, the very similar CharArrayMap.clear() looks like this:
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; &amp;nbsp; @Override
&lt;br&gt;&amp;gt; &amp;nbsp; public void clear() {
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; count = 0;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(keys,null);
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(values,null);
&lt;br&gt;&amp;gt; &amp;nbsp; }
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; I think we can do a similar thing as long as we throw UOE for the UnmodifiableCharArraySet.
&lt;br&gt;&amp;gt; I will submit a patch later tonight (unless someone is bored and has nothing better to do).
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2179%29-CharArraySet.clear%28%29-tp26905482p26906154.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26906108</id>
	<title>[jira] Commented: (LUCENE-2169) Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.</title>
	<published>2009-12-23T11:08:29Z</published>
	<updated>2009-12-23T11:08:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794160#action_12794160&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794160#action_12794160&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Simon Willnauer commented on LUCENE-2169:
&lt;br&gt;-----------------------------------------
&lt;br&gt;&lt;br&gt;bq. But you now changed the behaviour of copy(). Before the patch it changed the CharArraySet's matchVersion... That was what my patch was doing to preserve this behaviour. 
&lt;br&gt;&lt;br&gt;CharArraySet did not have a Version before 3.1, so this code has never been released. Changing this behavior is therefore fine and will not break anything.
&lt;br&gt;&lt;br&gt;simon
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.
&lt;br&gt;&amp;gt; --------------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2169
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.1
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Simon Willnauer
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; The copy method should use the entries array itself to copy the set internally instead of iterating over all values. This would speed up copying even small sets.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2169%29-Speedup-of-CharArraySet-copy-if-a-CharArraySet-instance-is-passed-to-copy.-tp26827428p26906108.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26906111</id>
	<title>[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter</title>
	<published>2009-12-23T11:08:29Z</published>
	<updated>2009-12-23T11:08:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794161#action_12794161&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794161#action_12794161&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Michael McCandless commented on LUCENE-2026:
&lt;br&gt;--------------------------------------------
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;For what it's worth, we haven't really solved that problem in Lucy either.
&lt;br&gt;The sliding window abstraction we wrapped around mmap/MapViewOfFile largely
&lt;br&gt;solved the problem of running out of address space on 32-bit operating
&lt;br&gt;systems. However, there's currently no way to invoke madvise through Lucy's
&lt;br&gt;IO abstraction layer - it's a little tricky with compound files.
&lt;br&gt;&lt;br&gt;Linux, at least, requires that the buffer supplied to madvise be page-aligned.
&lt;br&gt;So, say we're starting off on a posting list, and we want to communicate to
&lt;br&gt;the OS that it should treat the region we're about to read as MADV_SEQUENTIAL.
&lt;br&gt;If the start of the postings file is in the middle of a 4k page and the file
&lt;br&gt;right before it is a term dictionary, we don't want to indicate that that
&lt;br&gt;region should be treated as sequential.
&lt;br&gt;&lt;br&gt;I'm not sure how to solve that problem without violating the encapsulation of
&lt;br&gt;the compound file model. Hmm, maybe we could store metadata about the virtual
&lt;br&gt;files indicating usage patterns (sequential, random, etc.)? Since files are
&lt;br&gt;generally part of dedicated data structures whose usage patterns are known at 
&lt;br&gt;index time.
&lt;br&gt;&lt;br&gt;Or maybe we just punt on that use case and worry only about segment merging. 
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Storing metadata seems OK. &amp;nbsp;It'd be optional for codecs to declare that...
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;Hmm, wouldn't the act of deleting a file (and releasing all file descriptors) tell
&lt;br&gt;the OS that it's free to recycle any memory pages associated with it?
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;It better!
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. Actually why can't ord &amp; offset be one, for the string sort cache? Ie, if you write your string data in sort order, then the offsets are also in sort order? (I think we may have discussed this already?)
&lt;br&gt;&lt;br&gt;Right, we discussed this on lucy-dev last spring:
&lt;br&gt;&lt;br&gt;&lt;a href=&quot;http://markmail.org/message/epc56okapbgit5lw&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://markmail.org/message/epc56okapbgit5lw&lt;/a&gt;&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;OK I'll go try to catch up... but I'm about to drop [sort of]
&lt;br&gt;offline for a week and a half! &amp;nbsp;There's a lot of reading there! &amp;nbsp;Should
&lt;br&gt;be a prereq that we first go back and re-read what we said &amp;quot;the last
&lt;br&gt;time&amp;quot;... ;)
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;Incidentally, some of this thread replays our exchange at the top of
&lt;br&gt;LUCENE-1458 from a year ago. It was fun to go back and reread that: in the
&lt;br&gt;interim, we've implemented segment-centric search and memory mapped field
&lt;br&gt;caches and term dictionaries, both of which were first discussed back then.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Nice! &amp;nbsp;
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;Ords are great for low cardinality fields of all kinds, but become less
&lt;br&gt;efficient for high cardinality primitive numeric fields. For simplicity's
&lt;br&gt;sake, the prototype implementation of mmap'd field caches in KS always uses
&lt;br&gt;ords.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Right...
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. You don't want to have to create Lucy's equivalent of the JMM...
&lt;br&gt;&lt;br&gt;The more I think about making Lucy classes thread safe, the harder it seems.
&lt;br&gt;&amp;nbsp;I'd like to make it possible to share a Schema across threads, for
&lt;br&gt;instance, but that means all its Analyzers, etc have to be thread-safe as
&lt;br&gt;well, which isn't practical when you start getting into contributed
&lt;br&gt;subclasses.
&lt;br&gt;&lt;br&gt;Even if we succeed in getting Folders and FileHandles thread safe, it will be
&lt;br&gt;hard for the user to keep track of what they can and can't do across threads.
&lt;br&gt;&amp;quot;Don't share anything&amp;quot; is a lot easier to understand.
&lt;br&gt;&lt;br&gt;We reap a big benefit by making Lucy's metaclass infrastructure thread-safe.
&lt;br&gt;Beyond that, seems like there's a lot of pain for little gain.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Yeah. &amp;nbsp;Threads are not easy :(
&lt;br&gt;&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Refactoring of IndexWriter
&lt;br&gt;&amp;gt; --------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2026
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Index
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I've been thinking for a while about refactoring the IndexWriter into
&lt;br&gt;&amp;gt; two main components.
&lt;br&gt;&amp;gt; One could be called a SegmentWriter and as the
&lt;br&gt;&amp;gt; name says its job would be to write one particular index segment. The
&lt;br&gt;&amp;gt; default one just as today will provide methods to add documents and
&lt;br&gt;&amp;gt; flushes when its buffer is full.
&lt;br&gt;&amp;gt; Other SegmentWriter implementations would do things like e.g. appending or
&lt;br&gt;&amp;gt; copying external segments [what addIndexes*() currently does].
&lt;br&gt;&amp;gt; The second component's job would be to manage writing the segments
&lt;br&gt;&amp;gt; file and merging/deleting segments. It would know about
&lt;br&gt;&amp;gt; DeletionPolicy, MergePolicy and MergeScheduler. Ideally it would
&lt;br&gt;&amp;gt; provide hooks that allow users to manage external data structures and
&lt;br&gt;&amp;gt; keep them in sync with Lucene's data during segment merges.
&lt;br&gt;&amp;gt; API wise there are things we have to figure out, such as where the
&lt;br&gt;&amp;gt; updateDocument() method would fit in, because its deletion part
&lt;br&gt;&amp;gt; affects all segments, whereas the new document is only being added to
&lt;br&gt;&amp;gt; the new segment.
&lt;br&gt;&amp;gt; Of course these should be lower level APIs for things like parallel
&lt;br&gt;&amp;gt; indexing and related use cases. That's why we should still provide
&lt;br&gt;&amp;gt; easy to use APIs like today for people who don't need to care about
&lt;br&gt;&amp;gt; per-segment ops during indexing. So the current IndexWriter could
&lt;br&gt;&amp;gt; probably keep most of its APIs and delegate to the new classes.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2026%29-Refactoring-of-IndexWriter-tp26188404p26906111.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26906060</id>
	<title>[jira] Commented: (LUCENE-2169) Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.</title>
	<published>2009-12-23T11:04:29Z</published>
	<updated>2009-12-23T11:04:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794157#action_12794157&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794157#action_12794157&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Uwe Schindler commented on LUCENE-2169:
&lt;br&gt;---------------------------------------
&lt;br&gt;&lt;br&gt;But you now changed the behaviour of copy(). Before the patch it changed the CharArraySet's matchVersion... That was what my patch was doing to preserve this behaviour.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.
&lt;br&gt;&amp;gt; --------------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2169
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.1
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Simon Willnauer
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; The copy method should use the entries array itself to copy the set internally instead of iterating over all values. This would speed up copying even small sets.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2169%29-Speedup-of-CharArraySet-copy-if-a-CharArraySet-instance-is-passed-to-copy.-tp26827428p26906060.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26905927</id>
	<title>[jira] Updated: (LUCENE-2169) Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.</title>
	<published>2009-12-23T10:52:30Z</published>
	<updated>2009-12-23T10:52:30Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;[ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&lt;/a&gt;&amp;nbsp;]
&lt;br&gt;&lt;br&gt;Simon Willnauer updated LUCENE-2169:
&lt;br&gt;------------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; Attachment: LUCENE-2169.patch
&lt;br&gt;&lt;br&gt;Patch that incorporates Uwe's additions to the test cases and enforces the Version of the source set in the copy if it is an instance of CharArraySet.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.
&lt;br&gt;&amp;gt; --------------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2169
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.1
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Simon Willnauer
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; The copy method should use the entries array itself to copy the set internally instead of iterating over all values. This would speed up copying even small sets.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2169%29-Speedup-of-CharArraySet-copy-if-a-CharArraySet-instance-is-passed-to-copy.-tp26827428p26905927.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26905928</id>
	<title>[jira] Commented: (LUCENE-2169) Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.</title>
	<published>2009-12-23T10:52:30Z</published>
	<updated>2009-12-23T10:52:30Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794147#action_12794147&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794147#action_12794147&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Simon Willnauer commented on LUCENE-2169:
&lt;br&gt;-----------------------------------------
&lt;br&gt;&lt;br&gt;Uwe, when I first looked at your patch I thought it was a good idea, but once I looked at the use cases of CharArraySet, differentiating by matchVersion when the given set is an instance of CharArraySet is not ideal IMO. Imagine you create an analyzer with a CharArraySet: the analyzer will use its own given version together with the copy method internally. If the analyzer is created with a different version than the provided stop set (which is already a CharArraySet), copy could change the behavior because of the given version, which is actually the matchVersion for the analyzer, not for the set. I would leave it to the user to decide whether a copy with a different version is what they want. If the version should not be preserved and the set to copy is a CharArraySet, users should use the constructor directly. I will attach a patch shortly.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Speedup of CharArraySet#copy if a CharArraySet instance is passed to copy.
&lt;br&gt;&amp;gt; --------------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2169
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2169&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2169&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.1
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Simon Willnauer
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch, LUCENE-2169.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; The copy method should use the entries array itself to copy the set internally instead of iterating over all values. This would speed up copying even small sets.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2169%29-Speedup-of-CharArraySet-copy-if-a-CharArraySet-instance-is-passed-to-copy.-tp26827428p26905928.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26905886</id>
	<title>[jira] Updated: (LUCENE-2179) CharArraySet.clear()</title>
	<published>2009-12-23T10:48:30Z</published>
	<updated>2009-12-23T10:48:30Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;[ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&lt;/a&gt;&amp;nbsp;]
&lt;br&gt;&lt;br&gt;Uwe Schindler updated LUCENE-2179:
&lt;br&gt;----------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; Attachment: LUCENE-2179.patch
&lt;br&gt;&lt;br&gt;Patch for clear() and modified test. The test already checks that the unmodifiable set cannot be cleared.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; CharArraySet.clear()
&lt;br&gt;&amp;gt; --------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2179
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Robert Muir
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2179.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I needed CharArraySet.clear() for something I was working on in Solr in a tokenstream.
&lt;br&gt;&amp;gt; Instead I ended up using CharArrayMap&amp;lt;Boolean&amp;gt; because it supported .clear().
&lt;br&gt;&amp;gt; It would be better to use a set, though. Currently it will throw UOE for .clear() because AbstractSet will call iterator.remove(), which throws UOE.
&lt;br&gt;&amp;gt; In Solr, the very similar CharArrayMap.clear() looks like this:
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; &amp;nbsp; @Override
&lt;br&gt;&amp;gt; &amp;nbsp; public void clear() {
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; count = 0;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(keys,null);
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(values,null);
&lt;br&gt;&amp;gt; &amp;nbsp; }
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; I think we can do a similar thing as long as we throw UOE for the UnmodifiableCharArraySet.
&lt;br&gt;&amp;gt; I will submit a patch later tonight (unless someone is bored and has nothing better to do).
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2179%29-CharArraySet.clear%28%29-tp26905482p26905886.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26905887</id>
	<title>[jira] Assigned: (LUCENE-2179) CharArraySet.clear()</title>
	<published>2009-12-23T10:48:30Z</published>
	<updated>2009-12-23T10:48:30Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;[ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&lt;/a&gt;&amp;nbsp;]
&lt;br&gt;&lt;br&gt;Uwe Schindler reassigned LUCENE-2179:
&lt;br&gt;-------------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; Assignee: Uwe Schindler
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; CharArraySet.clear()
&lt;br&gt;&amp;gt; --------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2179
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Analysis
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Robert Muir
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2179.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I needed CharArraySet.clear() for something I was working on in Solr in a tokenstream.
&lt;br&gt;&amp;gt; Instead I ended up using CharArrayMap&amp;lt;Boolean&amp;gt; because it supported .clear().
&lt;br&gt;&amp;gt; It would be better to use a set, though. Currently CharArraySet throws UOE (UnsupportedOperationException) for .clear(), because the clear() it inherits from AbstractSet calls iterator.remove(), which throws UOE.
&lt;br&gt;&amp;gt; In Solr, the very similar CharArrayMap.clear() looks like this:
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; &amp;nbsp; @Override
&lt;br&gt;&amp;gt; &amp;nbsp; public void clear() {
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; count = 0;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(keys,null);
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; Arrays.fill(values,null);
&lt;br&gt;&amp;gt; &amp;nbsp; }
&lt;br&gt;&amp;gt; {code}
&lt;br&gt;&amp;gt; I think we can do a similar thing, as long as we throw UOE for the UnmodifiableCharArraySet.
&lt;br&gt;&amp;gt; I will submit a patch later tonight (unless someone is bored and has nothing better to do).
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2179%29-CharArraySet.clear%28%29-tp26905482p26905887.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26905655</id>
	<title>[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter</title>
	<published>2009-12-23T10:26:29Z</published>
	<updated>2009-12-23T10:26:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794137#action_12794137&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794137#action_12794137&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Marvin Humphrey commented on LUCENE-2026:
&lt;br&gt;-----------------------------------------
&lt;br&gt;&lt;br&gt;&amp;gt; we can't give hints to the OS to tell it not to cache certain reads/writes
&lt;br&gt;&amp;gt; (ie segment merging), 
&lt;br&gt;&lt;br&gt;For what it's worth, we haven't really solved that problem in Lucy either.
&lt;br&gt;The sliding window abstraction we wrapped around mmap/MapViewOfFile largely
&lt;br&gt;solved the problem of running out of address space on 32-bit operating
&lt;br&gt;systems. &amp;nbsp;However, there's currently no way to invoke madvise through Lucy's
&lt;br&gt;IO abstraction layer -- it's a little tricky with compound files. &amp;nbsp;
&lt;br&gt;&lt;br&gt;Linux, at least, requires that the buffer supplied to madvise be page-aligned.
&lt;br&gt;So, say we're starting off on a posting list, and we want to communicate to
&lt;br&gt;the OS that it should treat the region we're about to read as MADV_SEQUENTIAL.
&lt;br&gt;If the start of the postings file is in the middle of a 4k page and the file
&lt;br&gt;right before it is a term dictionary, we don't want to indicate that that
&lt;br&gt;region should be treated as sequential.
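&lt;br&gt;&lt;br&gt;To make the alignment constraint concrete, here is a minimal Java sketch of the arithmetic only. Java has no madvise call, and none of this is Lucy or Lucene code. It assumes a hypothetical virtual file occupying bytes [start, end) of a compound file and a 4096-byte page, and computes the largest page-aligned sub-range lying wholly inside that virtual file, so that advice applied to it could not spill onto a neighbouring file. The class and method names are made up for illustration.

```java
// Hypothetical helper: page-alignment arithmetic for a virtual file inside a compound file.
final class PageRange {
  final long offset;  // first byte of the aligned range (a multiple of pageSize)
  final long length;  // length in bytes (a multiple of pageSize, possibly 0)

  PageRange(long offset, long length) {
    this.offset = offset;
    this.length = length;
  }

  // Returns the largest page-aligned range contained in [start, end).
  static PageRange interior(long start, long end, long pageSize) {
    long alignedStart = ((start + pageSize - 1) / pageSize) * pageSize; // round up
    long alignedEnd = (end / pageSize) * pageSize;                      // round down
    long length = Math.max(0L, alignedEnd - alignedStart);              // 0 if no whole page fits
    return new PageRange(alignedStart, length);
  }

  public static void main(String[] args) {
    // Assumed layout: the postings file starts mid-page at byte 6000 and ends at byte 30000.
    PageRange r = interior(6000L, 30000L, 4096L);
    System.out.println(r.offset + " " + r.length); // prints "8192 20480"
  }
}
```

&lt;br&gt;&lt;br&gt;In this example the bytes in [6000, 8192) and [28672, 30000) share pages with neighbouring files and would simply get no advice. Whether that trade-off is acceptable is part of the open question.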
&lt;br&gt;&lt;br&gt;I'm not sure how to solve that problem without violating the encapsulation of
&lt;br&gt;the compound file model. &amp;nbsp;Hmm, maybe we could store metadata about the virtual
&lt;br&gt;files indicating usage patterns (sequential, random, etc.), since files are
&lt;br&gt;generally part of dedicated data structures whose usage patterns are known at
&lt;br&gt;index time?
&lt;br&gt;&lt;br&gt;Or maybe we just punt on that use case and worry only about segment merging. &amp;nbsp;
&lt;br&gt;Hmm, wouldn't the act of deleting a file (and releasing all file descriptors) tell
&lt;br&gt;the OS that it's free to recycle any memory pages associated with it?
&lt;br&gt;&lt;br&gt;&amp;gt; Actually why can't ord &amp; offset be one, for the string sort cache?
&lt;br&gt;&amp;gt; Ie, if you write your string data in sort order, then the offsets are
&lt;br&gt;&amp;gt; also in sort order? (I think we may have discussed this already?)
&lt;br&gt;&lt;br&gt;Right, we discussed this on lucy-dev last spring:
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; &lt;a href=&quot;http://markmail.org/message/epc56okapbgit5lw&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://markmail.org/message/epc56okapbgit5lw&lt;/a&gt;&lt;br&gt;&lt;br&gt;Incidentally, some of this thread replays our exchange at the top of
&lt;br&gt;LUCENE-1458 from a year ago. &amp;nbsp;It was fun to go back and reread that: in the
&lt;br&gt;interim, we've implemented segment-centric search and memory mapped field
&lt;br&gt;caches and term dictionaries, both of which were first discussed back then.
&lt;br&gt;:)
&lt;br&gt;&lt;br&gt;Ords are great for low cardinality fields of all kinds, but become less
&lt;br&gt;efficient for high cardinality primitive numeric fields. &amp;nbsp;For simplicity's
&lt;br&gt;sake, the prototype implementation of mmap'd field caches in KS always uses
&lt;br&gt;ords.
&lt;br&gt;&lt;br&gt;&amp;gt; You don't want to have to create Lucy's equivalent of the JMM...
&lt;br&gt;&lt;br&gt;The more I think about making Lucy classes thread safe, the harder it seems.
&lt;br&gt;:( &amp;nbsp;I'd like to make it possible to share a Schema across threads, for
&lt;br&gt;instance, but that means all its Analyzers, etc have to be thread-safe as
&lt;br&gt;well, which isn't practical when you start getting into contributed
&lt;br&gt;subclasses. &amp;nbsp;
&lt;br&gt;&lt;br&gt;Even if we succeed in getting Folders and FileHandles thread safe, it will be
&lt;br&gt;hard for the user to keep track of what they can and can't do across threads.
&lt;br&gt;&amp;quot;Don't share anything&amp;quot; is a lot easier to understand.
&lt;br&gt;&lt;br&gt;We reap a big benefit by making Lucy's metaclass infrastructure thread-safe.
&lt;br&gt;Beyond that, seems like there's a lot of pain for little gain.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Refactoring of IndexWriter
&lt;br&gt;&amp;gt; --------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2026
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Index
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I've been thinking for a while about refactoring the IndexWriter into
&lt;br&gt;&amp;gt; two main components.
&lt;br&gt;&amp;gt; One could be called a SegmentWriter and as the
&lt;br&gt;&amp;gt; name says its job would be to write one particular index segment. The
&lt;br&gt;&amp;gt; default one just as today will provide methods to add documents and
&lt;br&gt;&amp;gt; flush when its buffer is full.
&lt;br&gt;&amp;gt; Other SegmentWriter implementations would do things like e.g. appending or
&lt;br&gt;&amp;gt; copying external segments [what addIndexes*() currently does].
&lt;br&gt;&amp;gt; The second component's job would be to manage writing the segments
&lt;br&gt;&amp;gt; file and merging/deleting segments. It would know about
&lt;br&gt;&amp;gt; DeletionPolicy, MergePolicy and MergeScheduler. Ideally it would
&lt;br&gt;&amp;gt; provide hooks that allow users to manage external data structures and
&lt;br&gt;&amp;gt; keep them in sync with Lucene's data during segment merges.
&lt;br&gt;&amp;gt; API wise there are things we have to figure out, such as where the
&lt;br&gt;&amp;gt; updateDocument() method would fit in, because its deletion part
&lt;br&gt;&amp;gt; affects all segments, whereas the new document is only being added to
&lt;br&gt;&amp;gt; the new segment.
&lt;br&gt;&amp;gt; Of course these should be lower level APIs for things like parallel
&lt;br&gt;&amp;gt; indexing and related use cases. That's why we should still provide
&lt;br&gt;&amp;gt; easy to use APIs like today for people who don't need to care about
&lt;br&gt;&amp;gt; per-segment ops during indexing. So the current IndexWriter could
&lt;br&gt;&amp;gt; probably keep most of its APIs and delegate to the new classes.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2026%29-Refactoring-of-IndexWriter-tp26188404p26905655.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26905482</id>
	<title>[jira] Created: (LUCENE-2179) CharArraySet.clear()</title>
	<published>2009-12-23T10:10:29Z</published>
	<updated>2009-12-23T10:10:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">CharArraySet.clear()
&lt;br&gt;--------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Key: LUCENE-2179
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2179&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2179&lt;/a&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Project: Lucene - Java
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Issue Type: Improvement
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Components: Analysis
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Reporter: Robert Muir
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Priority: Minor
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Fix For: 3.1
&lt;br&gt;&lt;br&gt;&lt;br&gt;I needed CharArraySet.clear() for something I was working on in Solr in a tokenstream.
&lt;br&gt;&lt;br&gt;Instead I ended up using CharArrayMap&amp;lt;Boolean&amp;gt; because it supported .clear().
&lt;br&gt;&lt;br&gt;It would be better to use a set, though. Currently CharArraySet throws UOE (UnsupportedOperationException) for .clear(), because the clear() it inherits from AbstractSet calls iterator.remove(), which throws UOE.
&lt;br&gt;&lt;br&gt;In Solr, the very similar CharArrayMap.clear() looks like this:
&lt;br&gt;{code}
&lt;br&gt;&amp;nbsp; @Override
&lt;br&gt;&amp;nbsp; public void clear() {
&lt;br&gt;&amp;nbsp; &amp;nbsp; count = 0;
&lt;br&gt;&amp;nbsp; &amp;nbsp; Arrays.fill(keys,null);
&lt;br&gt;&amp;nbsp; &amp;nbsp; Arrays.fill(values,null);
&lt;br&gt;&amp;nbsp; }
&lt;br&gt;{code}
&lt;br&gt;&lt;br&gt;I think we can do a similar thing, as long as we throw UOE for the UnmodifiableCharArraySet.
&lt;br&gt;&lt;br&gt;I will submit a patch later tonight (unless someone is bored and has nothing better to do).
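&lt;br&gt;&lt;br&gt;For illustration, here is a minimal self-contained sketch of both the failure and the proposed fix. ArraySetSketch and ClearableArraySetSketch are hypothetical stand-ins, not Lucene's CharArraySet, and they use raw types and a fixed capacity to stay short. The base class inherits clear() from AbstractSet (via AbstractCollection), which walks the iterator and calls remove() on it. The subclass overrides clear() in the style of the CharArrayMap snippet.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical array-backed set whose iterator does not support remove().
class ArraySetSketch extends AbstractSet {
  protected final Object[] keys = new Object[16]; // fixed capacity: sketch only
  protected int count = 0;

  @Override
  public boolean add(Object o) {
    if (contains(o)) return false;
    if (count == keys.length) throw new IllegalStateException("sketch capacity exceeded");
    keys[count++] = o;
    return true;
  }

  @Override
  public int size() {
    return count;
  }

  @Override
  public Iterator iterator() {
    return new Iterator() {
      private int pos = 0;

      public boolean hasNext() {
        return pos != count;
      }

      public Object next() {
        if (pos == count) throw new NoSuchElementException();
        return keys[pos++];
      }

      public void remove() {
        // Mirrors the situation described above: removal through the iterator is unsupported.
        throw new UnsupportedOperationException();
      }
    };
  }
}

// Same set, but clear() resets the backing array directly instead of going through the iterator.
class ClearableArraySetSketch extends ArraySetSketch {
  @Override
  public void clear() {
    count = 0;
    Arrays.fill(keys, null);
  }
}
```

&lt;br&gt;&lt;br&gt;Calling clear() on a non-empty ArraySetSketch throws UnsupportedOperationException, while ClearableArraySetSketch empties itself in one pass. An unmodifiable variant would override clear() once more to throw UOE.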
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2179%29-CharArraySet.clear%28%29-tp26905482p26905482.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26904307</id>
	<title>[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter</title>
	<published>2009-12-23T08:28:30Z</published>
	<updated>2009-12-23T08:28:30Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794095#action_12794095&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12794095#action_12794095&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Michael McCandless commented on LUCENE-2026:
&lt;br&gt;--------------------------------------------
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. Very interesting - thanks. So it also factors in how much the page was used in the past, not just how long it's been since the page was &amp;nbsp;last used.
&lt;br&gt;&lt;br&gt;In theory, I think that means the term dictionary will tend to be
&lt;br&gt;favored over the posting lists. In practice... hard to say, it would
&lt;br&gt;be difficult to test.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Right... though, I think the top &amp;quot;trunks&amp;quot; frequently used by the
&lt;br&gt;binary search, will stay hot. &amp;nbsp;But as you get deeper into the terms
&lt;br&gt;index, it's not as clear.
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. And of course Java pretty much forces threads-as-concurrency (JVM startup time, hotspot compilation, are costly).
&lt;br&gt;&lt;br&gt;Yes. Java does a lot of stuff that most operating systems can also do, but of
&lt;br&gt;course provides a coherent platform-independent interface. In Lucy we're
&lt;br&gt;going to try to go back to the OS for some of the stuff that Java likes to
&lt;br&gt;take over - provided that we can develop a sane genericized interface using
&lt;br&gt;configuration probing and #ifdefs.
&lt;br&gt;&lt;br&gt;It's nice that as long as the box is up our OS-as-JVM is always running, so we
&lt;br&gt;don't have to worry about its (quite lengthy) startup time.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;OS as JVM is a nice analogy. &amp;nbsp;Java of course gets in the way, too:
&lt;br&gt;we cannot properly set IO priorities, we can't give hints to the
&lt;br&gt;OS to tell it not to cache certain reads/writes (ie segment merging),
&lt;br&gt;can't pin pages ;), etc.
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. Right, this is how Lucy would force warming.
&lt;br&gt;&lt;br&gt;I think slurp-instead-of-mmap is orthogonal to warming, because we can warm
&lt;br&gt;file-backed RAM structures by forcing them into the IO cache, using either the
&lt;br&gt;cat-to-dev-null trick or something more sophisticated. The
&lt;br&gt;slurp-instead-of-mmap setting would cause warming as a side effect, but the
&lt;br&gt;main point would be to attempt to persuade the virtual memory system that
&lt;br&gt;certain data structures should have a higher status and not be paged out as
&lt;br&gt;quickly.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Whoops, sorry, I misread -- now I understand. &amp;nbsp;You can easily make
&lt;br&gt;certain files RAM-resident, and then be like Lucene (except the data
&lt;br&gt;structures are more compact). &amp;nbsp;Nice.
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. But, even within that CFS file, these three sub-files will not be local? Ie you'll still have to hit three pages per &amp;quot;lookup&amp;quot; right?
&lt;br&gt;&lt;br&gt;They'll be next to each other in the compound file because CompoundFileWriter
&lt;br&gt;orders them alphabetically. For big segments, though, you're right that they
&lt;br&gt;won't be right next to each other, and you could possibly incur as many as
&lt;br&gt;three page faults when retrieving a sort cache value.
&lt;br&gt;&lt;br&gt;But what are the alternatives for variable width data like strings? You need
&lt;br&gt;the ords array anyway for efficient comparisons, so what's left are the
&lt;br&gt;offsets array and the character data.
&lt;br&gt;&lt;br&gt;An array of String objects isn't going to have better locality than one solid
&lt;br&gt;block of memory dedicated to offsets and another solid block of memory
&lt;br&gt;dedicated to file data, and it's no fewer derefs even if the string object
&lt;br&gt;stores its character data inline - more if it points to a separate allocation
&lt;br&gt;(like Lucy's CharBuf does, since it's mutable).
&lt;br&gt;&lt;br&gt;For each sort cache value lookup, you're going to need to access two blocks of
&lt;br&gt;memory.
&lt;br&gt;&lt;br&gt;With the array of String objects, the first is the memory block dedicated
&lt;br&gt;to the array, and the second is the memory block dedicated to the String
&lt;br&gt;object itself, which contains the character data.
&lt;br&gt;With the file-backed block sort cache, the first memory block is the
&lt;br&gt;offsets array, and the second is the character data array.
&lt;br&gt;I think the locality costs should be approximately the same... have I missed 
&lt;br&gt;anything?
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;You're right, Lucene risks 3 (ord array, String array, String object)
&lt;br&gt;page faults on each lookup as well.
&lt;br&gt;&lt;br&gt;Actually why can't ord &amp; offset be one, for the string sort cache?
&lt;br&gt;Ie, if you write your string data in sort order, then the offsets are
&lt;br&gt;also in sort order? &amp;nbsp;(I think we may have discussed this already?)
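&lt;br&gt;&lt;br&gt;A rough sketch of one possible reading of this idea, under stated assumptions: unique terms are written once, in sorted order, into a single byte block, and each document stores the byte offset of its term instead of a separate ord. The one-byte length prefix (to find where a term ends) and all names here are assumptions made for the example, not Lucene or Lucy code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sort cache where the per-document value is the byte offset of its term.
final class OffsetOrdCache {
  private final byte[] data;       // sorted unique terms, each stored as [1-byte length][UTF-8 bytes]
  private final int[] docOffsets;  // docOffsets[doc] = offset of that doc's term in data

  OffsetOrdCache(String[] valuePerDoc) {
    String[] sorted = valuePerDoc.clone();
    Arrays.sort(sorted);
    String[] unique = new String[sorted.length];
    int[] offsets = new int[sorted.length];
    int n = 0;
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (String s : sorted) {
      if (n == 0 || !unique[n - 1].equals(s)) { // write each distinct term once
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        if (bytes.length / 256 != 0) { // i.e. longer than 255 bytes
          throw new IllegalArgumentException("term too long for this sketch");
        }
        unique[n] = s;
        offsets[n] = out.size();
        out.write(bytes.length);            // one-byte length prefix
        out.write(bytes, 0, bytes.length);
        n++;
      }
    }
    data = out.toByteArray();
    docOffsets = new int[valuePerDoc.length];
    for (int doc = 0; doc != valuePerDoc.length; doc++) {
      int idx = Arrays.binarySearch(unique, 0, n, valuePerDoc[doc]);
      docOffsets[doc] = offsets[idx];
    }
  }

  // Terms were written in sorted order, so comparing offsets compares terms.
  int compare(int docA, int docB) {
    return Integer.compare(docOffsets[docA], docOffsets[docB]);
  }

  String value(int doc) {
    int off = docOffsets[doc];
    int len = Byte.toUnsignedInt(data[off]);
    return new String(data, off + 1, len, StandardCharsets.UTF_8);
  }
}
```

&lt;br&gt;&lt;br&gt;The cost in this sketch is that the per-document values are no longer dense ordinals 0..N-1, and each term needs a length prefix or terminator, since there is no offsets array to supply the end position.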
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. And it seems like Lucy would not need anything crazy-os-specific wrt threads?
&lt;br&gt;&lt;br&gt;It depends on how many classes we want to make thread-safe, and it's not just
&lt;br&gt;the OS, it's the host.
&lt;br&gt;&lt;br&gt;The bare minimum is simply to make Lucy thread-safe as a library. That's
&lt;br&gt;pretty close, because Lucy studiously avoided global variables whenever
&lt;br&gt;possible. The only problems that have to be addressed are the VTable_registry
&lt;br&gt;Hash, race conditions when creating new subclasses via dynamic VTable
&lt;br&gt;singletons, and refcounts on the VTable objects themselves.
&lt;br&gt;&lt;br&gt;Once those issues are taken care of, you'll be able to use Lucy objects in
&lt;br&gt;separate threads with no problem, e.g. one Searcher per thread.
&lt;br&gt;&lt;br&gt;However, if you want to share Lucy objects (other than VTables) across
&lt;br&gt;threads, all of a sudden we have to start thinking about &amp;quot;synchronized&amp;quot;,
&lt;br&gt;&amp;quot;volatile&amp;quot;, etc. Such constructs may not be efficient or even possible under
&lt;br&gt;some threading models.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;OK it is indeed hairy. &amp;nbsp;You don't want to have to create Lucy's
&lt;br&gt;equivalent of the JMM...
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. Hmm I'd guess that field cache is slowish; deleted docs &amp; norms are very fast; terms index is somewhere in between.
&lt;br&gt;&lt;br&gt;That jibes with my own experience. So maybe consider file-backed sort caches
&lt;br&gt;in Lucene, while keeping the status quo for everything else?
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Perhaps, but it'd still make me nervous ;) &amp;nbsp;When we get
&lt;br&gt;CSF (LUCENE-1231) online we should make it
&lt;br&gt;pluggable enough so that one could create an mmap impl.
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. You're right, you'd get two readers for seg_12 in that case. By &amp;quot;pool&amp;quot; I meant you're tapping into all the sub-readers that the existing reader has opened - the reader is your pool of sub-readers.
&lt;br&gt;&lt;br&gt;Each unique SegReader will also have dedicated &amp;quot;sub-reader&amp;quot; objects: two
&lt;br&gt;&amp;quot;seg_12&amp;quot; SegReaders means two &amp;quot;seg_12&amp;quot; DocReaders, two &amp;quot;seg_12&amp;quot;
&lt;br&gt;PostingsReaders, etc. However, all those sub-readers will share the same
&lt;br&gt;file-backed RAM data, so in that sense they're pooled.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;OK
&lt;br&gt;&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Refactoring of IndexWriter
&lt;br&gt;&amp;gt; --------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2026
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Index
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I've been thinking for a while about refactoring the IndexWriter into
&lt;br&gt;&amp;gt; two main components.
&lt;br&gt;&amp;gt; One could be called a SegmentWriter and as the
&lt;br&gt;&amp;gt; name says its job would be to write one particular index segment. The
&lt;br&gt;&amp;gt; default one just as today will provide methods to add documents and
&lt;br&gt;&amp;gt; flush when its buffer is full.
&lt;br&gt;&amp;gt; Other SegmentWriter implementations would do things like e.g. appending or
&lt;br&gt;&amp;gt; copying external segments [what addIndexes*() currently does].
&lt;br&gt;&amp;gt; The second component's job would be to manage writing the segments
&lt;br&gt;&amp;gt; file and merging/deleting segments. It would know about
&lt;br&gt;&amp;gt; DeletionPolicy, MergePolicy and MergeScheduler. Ideally it would
&lt;br&gt;&amp;gt; provide hooks that allow users to manage external data structures and
&lt;br&gt;&amp;gt; keep them in sync with Lucene's data during segment merges.
&lt;br&gt;&amp;gt; API wise there are things we have to figure out, such as where the
&lt;br&gt;&amp;gt; updateDocument() method would fit in, because its deletion part
&lt;br&gt;&amp;gt; affects all segments, whereas the new document is only being added to
&lt;br&gt;&amp;gt; the new segment.
&lt;br&gt;&amp;gt; Of course these should be lower level APIs for things like parallel
&lt;br&gt;&amp;gt; indexing and related use cases. That's why we should still provide
&lt;br&gt;&amp;gt; easy to use APIs like today for people who don't need to care about
&lt;br&gt;&amp;gt; per-segment ops during indexing. So the current IndexWriter could
&lt;br&gt;&amp;gt; probably keep most of its APIs and delegate to the new classes.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2026%29-Refactoring-of-IndexWriter-tp26188404p26904307.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26897840</id>
	<title>[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter</title>
	<published>2009-12-22T19:59:29Z</published>
	<updated>2009-12-22T19:59:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793918#action_12793918&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793918#action_12793918&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Marvin Humphrey commented on LUCENE-2026:
&lt;br&gt;-----------------------------------------
&lt;br&gt;&lt;br&gt;&amp;gt; Very interesting - thanks. So it also factors in how much the page
&lt;br&gt;&amp;gt; was used in the past, not just how long it's been since the page was
&lt;br&gt;&amp;gt; last used.
&lt;br&gt;&lt;br&gt;In theory, I think that means the term dictionary will tend to be favored over
&lt;br&gt;the posting lists. &amp;nbsp;In practice... hard to say, it would be difficult to test.
&lt;br&gt;:(
&lt;br&gt;&lt;br&gt;&amp;gt; Even smallish indexes can see the pages swapped out? 
&lt;br&gt;&lt;br&gt;Yes, you're right -- the wait time to get at a small term dictionary isn't
&lt;br&gt;necessarily small. &amp;nbsp;I've amended my previous post, thanks.
&lt;br&gt;&lt;br&gt;&amp;gt; And of course Java pretty much forces threads-as-concurrency (JVM
&lt;br&gt;&amp;gt; startup time, hotspot compilation, are costly).
&lt;br&gt;&lt;br&gt;Yes. &amp;nbsp;Java does a lot of stuff that most operating systems can also do, but of
&lt;br&gt;course provides a coherent platform-independent interface. &amp;nbsp;In Lucy we're
&lt;br&gt;going to try to go back to the OS for some of the stuff that Java likes to
&lt;br&gt;take over -- provided that we can develop a sane genericized interface using
&lt;br&gt;configuration probing and #ifdefs. &amp;nbsp;
&lt;br&gt;&lt;br&gt;It's nice that as long as the box is up our OS-as-JVM is always running, so we
&lt;br&gt;don't have to worry about its (quite lengthy) startup time. 
&lt;br&gt;&lt;br&gt;&amp;gt; Right, this is how Lucy would force warming.
&lt;br&gt;&lt;br&gt;I think slurp-instead-of-mmap is orthogonal to warming, because we can warm
&lt;br&gt;file-backed RAM structures by forcing them into the IO cache, using either the
&lt;br&gt;cat-to-dev-null trick or something more sophisticated. &amp;nbsp;The
&lt;br&gt;slurp-instead-of-mmap setting would cause warming as a side effect, but the
&lt;br&gt;main point would be to attempt to persuade the virtual memory system that
&lt;br&gt;certain data structures should have a higher status and not be paged out as
&lt;br&gt;quickly.
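&lt;br&gt;&lt;br&gt;As a point of reference, the cat-to-dev-null trick can be approximated in a few lines of Java. This is only a sketch of the idea (read the file front to back and throw the bytes away), with a made-up class name. Whether the pages then stay in the IO cache is entirely up to the operating system.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical warming helper: the programmatic equivalent of
// running cat on a file with its output redirected to /dev/null.
final class CacheWarmer {
  // Reads the whole file sequentially and returns the number of bytes read.
  static long warm(Path file) throws IOException {
    byte[] buf = new byte[64 * 1024];
    long total = 0;
    try (InputStream in = Files.newInputStream(file)) {
      int n;
      while ((n = in.read(buf)) != -1) {
        total += n; // the bytes are discarded; the OS may keep the pages cached
      }
    }
    return total;
  }
}
```

&lt;br&gt;&lt;br&gt;Note that this only nudges the IO cache. It does nothing to raise the status of the pages afterwards, which is the separate goal of the slurp-instead-of-mmap setting.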
&lt;br&gt;&lt;br&gt;&amp;gt; But, even within that CFS file, these three sub-files will not be
&lt;br&gt;&amp;gt; local? Ie you'll still have to hit three pages per &amp;quot;lookup&amp;quot; right?
&lt;br&gt;&lt;br&gt;They'll be next to each other in the compound file because CompoundFileWriter
&lt;br&gt;orders them alphabetically. &amp;nbsp;For big segments, though, you're right that they
&lt;br&gt;won't be right next to each other, and you could possibly incur as many as
&lt;br&gt;three page faults when retrieving a sort cache value.
&lt;br&gt;&lt;br&gt;But what are the alternatives for variable width data like strings? &amp;nbsp;You need
&lt;br&gt;the ords array anyway for efficient comparisons, so what's left are the
&lt;br&gt;offsets array and the character data.
&lt;br&gt;&lt;br&gt;An array of String objects isn't going to have better locality than one solid
&lt;br&gt;block of memory dedicated to offsets and another solid block of memory
&lt;br&gt;dedicated to file data, and it's no fewer derefs even if the string object
&lt;br&gt;stores its character data inline -- more if it points to a separate allocation
&lt;br&gt;(like Lucy's CharBuf does, since it's mutable). 
&lt;br&gt;&lt;br&gt;For each sort cache value lookup, you're going to need to access two blocks of
&lt;br&gt;memory. &amp;nbsp;
&lt;br&gt;&lt;br&gt;&amp;nbsp; * With the array of String objects, the first is the memory block dedicated
&lt;br&gt;&amp;nbsp; &amp;nbsp; to the array, and the second is the memory block dedicated to the String
&lt;br&gt;&amp;nbsp; &amp;nbsp; object itself, which contains the character data.
&lt;br&gt;&amp;nbsp; * With the file-backed block sort cache, the first memory block is the
&lt;br&gt;&amp;nbsp; &amp;nbsp; offsets array, and the second is the character data array.
&lt;br&gt;&lt;br&gt;I think the locality costs should be approximately the same... have I missed 
&lt;br&gt;anything?
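&lt;br&gt;&lt;br&gt;A rough Python sketch of the ords / offsets / character data layout described above. The function names and the exact packing are assumptions made for illustration, not Lucy's actual file format:
&lt;pre&gt;
```python
from array import array


def build_sort_cache(values):
    """Pack the sorted unique strings into one byte block plus an offsets array."""
    unique = sorted(set(values))
    char_data = bytearray()
    offsets = array("q", [0])
    for text in unique:
        char_data.extend(text.encode("utf-8"))
        offsets.append(len(char_data))  # end of this string, start of the next
    ord_of = {text: i for i, text in enumerate(unique)}
    ords = array("l", [ord_of[v] for v in values])  # one ord per document
    return ords, offsets, bytes(char_data)


def value_for_doc(doc_id, ords, offsets, char_data):
    """Find the ord, then touch two blocks: the offsets array and the character data."""
    term_ord = ords[doc_id]
    start, end = offsets[term_ord], offsets[term_ord + 1]
    return char_data[start:end].decode("utf-8")


docs = ["pear", "apple", "fig", "apple"]
ords, offsets, char_data = build_sort_cache(docs)
looked_up = value_for_doc(2, ords, offsets, char_data)
print(list(ords), list(offsets), looked_up)
```
&lt;/pre&gt;
&lt;br&gt;Sorting and equality checks only need the ords array; the offsets and character data are touched only when the string value itself is wanted.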
&lt;br&gt;&lt;br&gt;&amp;gt; Write-once is good for Lucene too.
&lt;br&gt;&lt;br&gt;Hellyeah.
&lt;br&gt;&amp;nbsp;
&lt;br&gt;&amp;gt; And it seems like Lucy would not need anything crazy-os-specific wrt
&lt;br&gt;&amp;gt; threads?
&lt;br&gt;&lt;br&gt;It depends on how many classes we want to make thread-safe, and it's not just
&lt;br&gt;the OS, it's the host.
&lt;br&gt;&lt;br&gt;The bare minimum is simply to make Lucy thread-safe as a library. &amp;nbsp;That's
&lt;br&gt;pretty close, because Lucy studiously avoided global variables whenever
&lt;br&gt;possible. &amp;nbsp;The only problems that have to be addressed are the VTable_registry
&lt;br&gt;Hash, race conditions when creating new subclasses via dynamic VTable
&lt;br&gt;singletons, and refcounts on the VTable objects themselves.
&lt;br&gt;&lt;br&gt;Once those issues are taken care of, you'll be able to use Lucy objects in
&lt;br&gt;separate threads with no problem, e.g. one Searcher per thread.
&lt;br&gt;&lt;br&gt;However, if you want to *share* Lucy objects (other than VTables) across
&lt;br&gt;threads, all of a sudden we have to start thinking about &amp;quot;synchronized&amp;quot;,
&lt;br&gt;&amp;quot;volatile&amp;quot;, etc. &amp;nbsp;Such constructs may not be efficient or even possible under
&lt;br&gt;some threading models.
&lt;br&gt;&lt;br&gt;&amp;gt; Hmm I'd guess that field cache is slowish; deleted docs &amp; norms are
&lt;br&gt;&amp;gt; very fast; terms index is somewhere in between.
&lt;br&gt;&lt;br&gt;That jibes with my own experience. &amp;nbsp;So maybe consider file-backed sort caches
&lt;br&gt;in Lucene, while keeping the status quo for everything else?
&lt;br&gt;&lt;br&gt;&amp;gt; You're right, you'd get two readers for seg_12 in that case. By
&lt;br&gt;&amp;gt; &amp;quot;pool&amp;quot; I meant you're tapping into all the sub-readers that the
&lt;br&gt;&amp;gt; existing reader has opened - the reader is your pool of sub-readers.
&lt;br&gt;&lt;br&gt;Each unique SegReader will also have dedicated &amp;quot;sub-reader&amp;quot; objects: two
&lt;br&gt;&amp;quot;seg_12&amp;quot; SegReaders means two &amp;quot;seg_12&amp;quot; DocReaders, two &amp;quot;seg_12&amp;quot;
&lt;br&gt;PostingsReaders, etc. &amp;nbsp;However, all those sub-readers will share the same
&lt;br&gt;file-backed RAM data, so in that sense they're pooled.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Refactoring of IndexWriter
&lt;br&gt;&amp;gt; --------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2026
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Index
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I've been thinking for a while about refactoring the IndexWriter into
&lt;br&gt;&amp;gt; two main components.
&lt;br&gt;&amp;gt; One could be called a SegmentWriter and as the
&lt;br&gt;&amp;gt; name says its job would be to write one particular index segment. The
&lt;br&gt;&amp;gt; default one just as today will provide methods to add documents and
&lt;br&gt;&amp;gt; flushes when its buffer is full.
&lt;br&gt;&amp;gt; Other SegmentWriter implementations would do things like e.g. appending or
&lt;br&gt;&amp;gt; copying external segments [what addIndexes*() currently does].
&lt;br&gt;&amp;gt; The second component's job would be to manage writing the segments
&lt;br&gt;&amp;gt; file and merging/deleting segments. It would know about
&lt;br&gt;&amp;gt; DeletionPolicy, MergePolicy and MergeScheduler. Ideally it would
&lt;br&gt;&amp;gt; provide hooks that allow users to manage external data structures and
&lt;br&gt;&amp;gt; keep them in sync with Lucene's data during segment merges.
&lt;br&gt;&amp;gt; API wise there are things we have to figure out, such as where the
&lt;br&gt;&amp;gt; updateDocument() method would fit in, because its deletion part
&lt;br&gt;&amp;gt; affects all segments, whereas the new document is only being added to
&lt;br&gt;&amp;gt; the new segment.
&lt;br&gt;&amp;gt; Of course these should be lower level APIs for things like parallel
&lt;br&gt;&amp;gt; indexing and related use cases. That's why we should still provide
&lt;br&gt;&amp;gt; easy to use APIs like today for people who don't need to care about
&lt;br&gt;&amp;gt; per-segment ops during indexing. So the current IndexWriter could
&lt;br&gt;&amp;gt; probably keep most of its APIs and delegate to the new classes.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2026%29-Refactoring-of-IndexWriter-tp26188404p26897840.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26897826</id>
	<title>[jira] Issue Comment Edited: (LUCENE-2026) Refactoring of IndexWriter</title>
	<published>2009-12-22T19:55:29Z</published>
	<updated>2009-12-22T19:55:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793431#action_12793431&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793431#action_12793431&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Marvin Humphrey edited comment on LUCENE-2026 at 12/23/09 3:54 AM:
&lt;br&gt;-------------------------------------------------------------------
&lt;br&gt;&lt;br&gt;&amp;gt; I guess my confusion is what are all the other benefits of using
&lt;br&gt;&amp;gt; file-backed RAM? You can efficiently use process only concurrency
&lt;br&gt;&amp;gt; (though shared memory is technically an option for this too), and you
&lt;br&gt;&amp;gt; have wicked fast open times (but, you still must warm, just like
&lt;br&gt;&amp;gt; Lucene). 
&lt;br&gt;&lt;br&gt;Processes are Lucy's primary concurrency model. &amp;nbsp;(&amp;quot;The OS is our JVM.&amp;quot;)
&lt;br&gt;Making process-only concurrency efficient isn't optional -- it's a *core*
&lt;br&gt;*concern*.
&lt;br&gt;&lt;br&gt;&amp;gt; What else? Oh maybe the ability to inform OS not to cache
&lt;br&gt;&amp;gt; eg the reads done when merging segments. That's one I sure wish
&lt;br&gt;&amp;gt; Lucene could use...
&lt;br&gt;&lt;br&gt;Lightweight searchers mean architectural freedom. &amp;nbsp;
&lt;br&gt;&lt;br&gt;Create 2, 10, 100, 1000 Searchers without a second thought -- as many as you
&lt;br&gt;need for whatever app architecture you just dreamed up -- then destroy them
&lt;br&gt;just as effortlessly. &amp;nbsp;Add another worker thread to your search server without
&lt;br&gt;having to consider the RAM requirements of a heavy searcher object. &amp;nbsp;Create a
&lt;br&gt;command-line app to search a documentation index without worrying about
&lt;br&gt;daemonizing it. &amp;nbsp;Etc.
&lt;br&gt;&lt;br&gt;If your normal development pattern is a single monolithic Java process, then
&lt;br&gt;that freedom might not mean much to you. &amp;nbsp;But with their low per-object RAM
&lt;br&gt;requirements and fast opens, lightweight searchers are easy to use within a
&lt;br&gt;lot of other development patterns. For example: lightweight searchers work 
&lt;br&gt;well for maxing out multiple CPU cores under process-only concurrency.
&lt;br&gt;&lt;br&gt;&amp;gt; In exchange you risk the OS making poor choices about what gets
&lt;br&gt;&amp;gt; swapped out (LRU policy is too simplistic... not all pages are created
&lt;br&gt;&amp;gt; equal), 
&lt;br&gt;&lt;br&gt;The Linux virtual memory system, at least, is not a pure LRU. &amp;nbsp;It utilizes a
&lt;br&gt;page aging algo which prioritizes pages that have historically been accessed
&lt;br&gt;frequently even when they have not been accessed recently:
&lt;br&gt;&lt;br&gt;{panel}
&lt;br&gt;&amp;nbsp; &amp;nbsp; &lt;a href=&quot;http://sunsite.nus.edu.sg/LDP/LDP/tlk/node40.html&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://sunsite.nus.edu.sg/LDP/LDP/tlk/node40.html&lt;/a&gt;&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; The default action when a page is first allocated, is to give it an
&lt;br&gt;&amp;nbsp; &amp;nbsp; initial age of 3. Each time it is touched (by the memory management
&lt;br&gt;&amp;nbsp; &amp;nbsp; subsystem) it's age is increased by 3 to a maximum of 20. Each time the
&lt;br&gt;&amp;nbsp; &amp;nbsp; Kernel swap daemon runs it ages pages, decrementing their age by 1.
&lt;br&gt;{panel}
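&lt;br&gt;&lt;br&gt;As a rough illustration of the aging rule quoted above, here is a minimal Python sketch. The constants come from the quoted passage; the function names and the floor at zero are assumptions, and the real kernel logic is more involved:
&lt;pre&gt;
```python
# Constants taken from the quoted passage; everything else is illustrative.
INITIAL_AGE = 3
TOUCH_BONUS = 3
MAX_AGE = 20


def touch(age):
    """Age of a page after it is touched, capped at MAX_AGE."""
    return min(age + TOUCH_BONUS, MAX_AGE)


def swap_daemon_pass(age):
    """Age of a page after one swap daemon run (floor at zero is an assumption)."""
    return max(age - 1, 0)


def simulate(events):
    """Replay 'touch' / 'tick' events for a freshly allocated page."""
    age = INITIAL_AGE
    for event in events:
        if event == "touch":
            age = touch(age)
        elif event == "tick":
            age = swap_daemon_pass(age)
        else:
            raise ValueError("unknown event: " + event)
    return age


hot_then_idle = ["touch"] * 10 + ["tick"] * 5  # heavily used, then left alone
touched_once = ["touch"] + ["tick"] * 5        # used once, then left alone
print(simulate(hot_then_idle), simulate(touched_once))
```
&lt;/pre&gt;
&lt;br&gt;After the same five idle passes, the historically hot page still has an age of 15 while the page touched once is down to 1, which is the sense in which this is not a pure LRU.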
&lt;br&gt;&lt;br&gt;And while that system may not be ideal from our standpoint, it's still pretty
&lt;br&gt;good. &amp;nbsp;In general, the operating system's virtual memory scheme is going to
&lt;br&gt;work fine as designed, for us and everyone else, and minimize memory
&lt;br&gt;availability wait times.
&lt;br&gt;&lt;br&gt;When will swapping out the term dictionary be a problem? &amp;nbsp;
&lt;br&gt;&lt;br&gt;&amp;nbsp; * For indexes where queries are made frequently, no problem. &amp;nbsp;
&lt;br&gt;&amp;nbsp; * For systems with plenty of RAM, no problem. &amp;nbsp;
&lt;br&gt;&amp;nbsp; * For systems that aren't very busy, no problem. &amp;nbsp;
&lt;br&gt;&amp;nbsp; * -For small indexes, no problem.- &amp;nbsp;
&lt;br&gt;&lt;br&gt;The only situation we're talking about is infrequent queries against -large-
&lt;br&gt;indexes on busy boxes where RAM isn't abundant. &amp;nbsp;Under those circumstances, it
&lt;br&gt;*might* be noticeable that Lucy's term dictionary gets paged out somewhat
&lt;br&gt;sooner than Lucene's.
&lt;br&gt;&lt;br&gt;But in general, if the term dictionary gets paged out, so what? &amp;nbsp;Nobody was
&lt;br&gt;using it. &amp;nbsp;Maybe nobody will make another query against that index until next
&lt;br&gt;week. &amp;nbsp;Maybe the OS made the right decision.
&lt;br&gt;&lt;br&gt;OK, so there's a vulnerable bubble where the query rate against 
&lt;br&gt;-a large index- an index is neither too fast nor too slow, on busy machines 
&lt;br&gt;where RAM isn't abundant. &amp;nbsp;I don't think that bubble ought to drive major 
&lt;br&gt;architectural decisions.
&lt;br&gt;&lt;br&gt;Let me turn your question on its head. &amp;nbsp;What does Lucene gain in return for
&lt;br&gt;the slow index opens and large process memory footprint of its heavy
&lt;br&gt;searchers?
&lt;br&gt;&lt;br&gt;&amp;gt; I do love how pure the file-backed RAM approach is, but I worry that
&lt;br&gt;&amp;gt; down the road it'll result in erratic search performance in certain
&lt;br&gt;&amp;gt; app profiles.
&lt;br&gt;&lt;br&gt;If necessary, there's a straightforward remedy: slurp the relevant files into
&lt;br&gt;RAM at object construction rather than mmap them. &amp;nbsp;The rest of the code won't 
&lt;br&gt;know the difference between malloc'd RAM and mmap'd RAM. &amp;nbsp;The slurped files 
&lt;br&gt;won't take up any more space than the analogous Lucene data structures; more 
&lt;br&gt;likely, they'll take up less.
&lt;br&gt;&lt;br&gt;That's the kind of setting we'd hide away in the IndexManager class rather
&lt;br&gt;than expose as prominent API, and it would be a hint to index components
&lt;br&gt;rather than an edict.
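&lt;br&gt;&lt;br&gt;A minimal Python sketch of the slurp-instead-of-mmap idea. The open_block and read_span helpers and the slurp flag are hypothetical names, not anything in Lucy or its IndexManager:
&lt;pre&gt;
```python
import mmap
import os
import tempfile


def open_block(path, slurp=False):
    """Return a bytes-like block: a heap copy if slurp is set, else a read-only mmap."""
    with open(path, "rb") as handle:
        if slurp:
            return handle.read()  # private copy, comparable to malloc'd RAM
        return mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)


def read_span(block, start, end):
    """Consumer code: identical whichever kind of block it is handed."""
    return bytes(block[start:end])


# Throwaway file standing in for a term dictionary file (hypothetical data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"termdictionary")
    demo_path = tmp.name

mapped = open_block(demo_path)
slurped = open_block(demo_path, slurp=True)
from_map = read_span(mapped, 0, 4)
from_heap = read_span(slurped, 0, 4)
print(from_map, from_heap)

mapped.close()
os.remove(demo_path)
```
&lt;/pre&gt;
&lt;br&gt;Only open_block knows which strategy was chosen, so a setting like this can remain a hint tucked away from the main API.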
&lt;br&gt;&lt;br&gt;&amp;gt; Yeah, that you need 3 files for the string sort cache is a little
&lt;br&gt;&amp;gt; spooky... that's 3X the chance of a page fault.
&lt;br&gt;&lt;br&gt;Not when using the compound format.
&lt;br&gt;&lt;br&gt;&amp;gt; But the CFS construction must also go through the filesystem (like
&lt;br&gt;&amp;gt; Lucene) right? So you still incur IO load of creating the small
&lt;br&gt;&amp;gt; files, then 2nd pass to consolidate.
&lt;br&gt;&lt;br&gt;Yes.
&lt;br&gt;&lt;br&gt;&amp;gt; I think we may need to largely take &amp;quot;time&amp;quot; out of our programming
&lt;br&gt;&amp;gt; languages, eg switch to much more declarative code, or
&lt;br&gt;&amp;gt; something... wanna port Lucy to Erlang?
&lt;br&gt;&amp;gt; 
&lt;br&gt;&amp;gt; But I'm not sure process only concurrency, sharing only via
&lt;br&gt;&amp;gt; file-backed memory, is the answer either
&lt;br&gt;&lt;br&gt;I think relying heavily on file-backed memory is particularly appropriate for
&lt;br&gt;Lucy because the write-once file format works well with MAP_SHARED memory
&lt;br&gt;segments. &amp;nbsp;If files were being modified and had to be protected with
&lt;br&gt;semaphores, it wouldn't be as sweet a match.
&lt;br&gt;&lt;br&gt;Focusing on process-only concurrency also works well for Lucy because host
&lt;br&gt;threading models differ substantially and so will only be accessible via a
&lt;br&gt;generalized interface from the Lucy C core. &amp;nbsp;It will be difficult to tune
&lt;br&gt;threading performance through that layer of indirection -- I'm guessing beyond
&lt;br&gt;the ability of most developers since few will be experts in multiple host
&lt;br&gt;threading models. &amp;nbsp;In contrast, expertise in process level concurrency will be
&lt;br&gt;easier to come by and to nourish.
&lt;br&gt;&lt;br&gt;&amp;gt; Using Zoie you can make reopen time insanely fast (much faster than I
&lt;br&gt;&amp;gt; think necessary for most apps), but at the expense of some expected
&lt;br&gt;&amp;gt; hit to searching/indexing throughput. I don't think that's the right
&lt;br&gt;&amp;gt; tradeoff for Lucene.
&lt;br&gt;&lt;br&gt;But as Jake pointed out early in the thread, Zoie achieves those insanely fast
&lt;br&gt;reopens without tight coupling to IndexWriter and its components. &amp;nbsp;The
&lt;br&gt;auxiliary RAM index approach is well proven.
&lt;br&gt;&lt;br&gt;&amp;gt; Do you have any hard numbers on how much time it takes Lucene to load
&lt;br&gt;&amp;gt; from a hot IO cache, populating its RAM resident data structures?
&lt;br&gt;&lt;br&gt;Hmm, I don't spend a lot of time working with Lucene directly, so I might not
&lt;br&gt;be the person most likely to have data like that at my fingertips. &amp;nbsp;Maybe that
&lt;br&gt;McCandless dude can help you out, he runs a lot of benchmarks. &amp;nbsp;;) 
&lt;br&gt;&lt;br&gt;Or maybe ask the Solr folks? &amp;nbsp;I see them on solr-user all the time talking 
&lt;br&gt;about &amp;quot;MaxWarmingSearchers&amp;quot;. ;)
&lt;br&gt;&lt;br&gt;&amp;gt; OK. Then, you are basically pooling your readers. &amp;nbsp;Ie, you do allow
&lt;br&gt;&amp;gt; in-process sharing, but only among readers.
&lt;br&gt;&lt;br&gt;Not sure about that. Lucy's IndexReader.reopen() would open new SegReaders for
&lt;br&gt;each new segment, but they would be private to each parent PolyReader. &amp;nbsp;So if
&lt;br&gt;you reopened two IndexReaders at the same time after e.g. &amp;nbsp;segment &amp;quot;seg_12&amp;quot;
&lt;br&gt;had been added, each would create a new, private SegReader for &amp;quot;seg_12&amp;quot;.
&lt;br&gt;&lt;br&gt;*Edit*: updated to correct assertions about virtual memory performance with
&lt;br&gt;small indexes.
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; was (Author: creamyg):
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;gt; I guess my confusion is what are all the other benefits of using
&lt;br&gt;&amp;gt; file-backed RAM? You can efficiently use process only concurrency
&lt;br&gt;&amp;gt; (though shared memory is technically an option for this too), and you
&lt;br&gt;&amp;gt; have wicked fast open times (but, you still must warm, just like
&lt;br&gt;&amp;gt; Lucene). 
&lt;br&gt;&lt;br&gt;Processes are Lucy's primary concurrency model. &amp;nbsp;(&amp;quot;The OS is our JVM.&amp;quot;)
&lt;br&gt;Making process-only concurrency efficient isn't optional -- it's a *core*
&lt;br&gt;*concern*.
&lt;br&gt;&lt;br&gt;&amp;gt; What else? Oh maybe the ability to inform OS not to cache
&lt;br&gt;&amp;gt; eg the reads done when merging segments. That's one I sure wish
&lt;br&gt;&amp;gt; Lucene could use...
&lt;br&gt;&lt;br&gt;Lightweight searchers mean architectural freedom. &amp;nbsp;
&lt;br&gt;&lt;br&gt;Create 2, 10, 100, 1000 Searchers without a second thought -- as many as you
&lt;br&gt;need for whatever app architecture you just dreamed up -- then destroy them
&lt;br&gt;just as effortlessly. &amp;nbsp;Add another worker thread to your search server without
&lt;br&gt;having to consider the RAM requirements of a heavy searcher object. &amp;nbsp;Create a
&lt;br&gt;command-line app to search a documentation index without worrying about
&lt;br&gt;daemonizing it. &amp;nbsp;Etc.
&lt;br&gt;&lt;br&gt;If your normal development pattern is a single monolithic Java process, then
&lt;br&gt;that freedom might not mean much to you. &amp;nbsp;But with their low per-object RAM
&lt;br&gt;requirements and fast opens, lightweight searchers are easy to use within a
&lt;br&gt;lot of other development patterns. For example: lightweight searchers work 
&lt;br&gt;well for maxing out multiple CPU cores under process-only concurrency.
&lt;br&gt;&lt;br&gt;&amp;gt; In exchange you risk the OS making poor choices about what gets
&lt;br&gt;&amp;gt; swapped out (LRU policy is too simplistic... not all pages are created
&lt;br&gt;&amp;gt; equal), 
&lt;br&gt;&lt;br&gt;The Linux virtual memory system, at least, is not a pure LRU. &amp;nbsp;It utilizes a
&lt;br&gt;page aging algo which prioritizes pages that have historically been accessed
&lt;br&gt;frequently even when they have not been accessed recently:
&lt;br&gt;&lt;br&gt;{panel}
&lt;br&gt;&amp;nbsp; &amp;nbsp; &lt;a href=&quot;http://sunsite.nus.edu.sg/LDP/LDP/tlk/node40.html&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://sunsite.nus.edu.sg/LDP/LDP/tlk/node40.html&lt;/a&gt;&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; The default action when a page is first allocated, is to give it an
&lt;br&gt;&amp;nbsp; &amp;nbsp; initial age of 3. Each time it is touched (by the memory management
&lt;br&gt;&amp;nbsp; &amp;nbsp; subsystem) it's age is increased by 3 to a maximum of 20. Each time the
&lt;br&gt;&amp;nbsp; &amp;nbsp; Kernel swap daemon runs it ages pages, decrementing their age by 1.
&lt;br&gt;{panel}
&lt;br&gt;&lt;br&gt;And while that system may not be ideal from our standpoint, it's still pretty
&lt;br&gt;good. &amp;nbsp;In general, the operating system's virtual memory scheme is going to
&lt;br&gt;work fine as designed, for us and everyone else, and minimize memory
&lt;br&gt;availability wait times.
&lt;br&gt;&lt;br&gt;When will swapping out the term dictionary be a problem? &amp;nbsp;
&lt;br&gt;&lt;br&gt;&amp;nbsp; * For indexes where queries are made frequently, no problem. &amp;nbsp;
&lt;br&gt;&amp;nbsp; * For systems with plenty of RAM, no problem. &amp;nbsp;
&lt;br&gt;&amp;nbsp; * For systems that aren't very busy, no problem. &amp;nbsp;
&lt;br&gt;&amp;nbsp; * For small indexes, no problem. &amp;nbsp;
&lt;br&gt;&lt;br&gt;The only situation we're talking about is infrequent queries against large
&lt;br&gt;indexes on busy boxes where RAM isn't abundant. &amp;nbsp;Under those circumstances, it
&lt;br&gt;*might* be noticeable that Lucy's term dictionary gets paged out somewhat
&lt;br&gt;sooner than Lucene's.
&lt;br&gt;&lt;br&gt;But in general, if the term dictionary gets paged out, so what? &amp;nbsp;Nobody was
&lt;br&gt;using it. &amp;nbsp;Maybe nobody will make another query against that index until next
&lt;br&gt;week. &amp;nbsp;Maybe the OS made the right decision.
&lt;br&gt;&lt;br&gt;OK, so there's a vulnerable bubble where the query rate against a large
&lt;br&gt;index is neither too fast nor too slow, on busy machines where RAM isn't
&lt;br&gt;abundant. &amp;nbsp;I don't think that bubble ought to drive major architectural
&lt;br&gt;decisions.
&lt;br&gt;&lt;br&gt;Let me turn your question on its head. &amp;nbsp;What does Lucene gain in return for
&lt;br&gt;the slow index opens and large process memory footprint of its heavy
&lt;br&gt;searchers?
&lt;br&gt;&lt;br&gt;&amp;gt; I do love how pure the file-backed RAM approach is, but I worry that
&lt;br&gt;&amp;gt; down the road it'll result in erratic search performance in certain
&lt;br&gt;&amp;gt; app profiles.
&lt;br&gt;&lt;br&gt;If necessary, there's a straightforward remedy: slurp the relevant files into
&lt;br&gt;RAM at object construction rather than mmap them. &amp;nbsp;The rest of the code won't 
&lt;br&gt;know the difference between malloc'd RAM and mmap'd RAM. &amp;nbsp;The slurped files 
&lt;br&gt;won't take up any more space than the analogous Lucene data structures; more 
&lt;br&gt;likely, they'll take up less.
&lt;br&gt;&lt;br&gt;That's the kind of setting we'd hide away in the IndexManager class rather
&lt;br&gt;than expose as prominent API, and it would be a hint to index components
&lt;br&gt;rather than an edict.
&lt;br&gt;&lt;br&gt;&amp;gt; Yeah, that you need 3 files for the string sort cache is a little
&lt;br&gt;&amp;gt; spooky... that's 3X the chance of a page fault.
&lt;br&gt;&lt;br&gt;Not when using the compound format.
&lt;br&gt;&lt;br&gt;&amp;gt; But the CFS construction must also go through the filesystem (like
&lt;br&gt;&amp;gt; Lucene) right? So you still incur IO load of creating the small
&lt;br&gt;&amp;gt; files, then 2nd pass to consolidate.
&lt;br&gt;&lt;br&gt;Yes.
&lt;br&gt;&lt;br&gt;&amp;gt; I think we may need to largely take &amp;quot;time&amp;quot; out of our programming
&lt;br&gt;&amp;gt; languages, eg switch to much more declarative code, or
&lt;br&gt;&amp;gt; something... wanna port Lucy to Erlang?
&lt;br&gt;&amp;gt; 
&lt;br&gt;&amp;gt; But I'm not sure process only concurrency, sharing only via
&lt;br&gt;&amp;gt; file-backed memory, is the answer either
&lt;br&gt;&lt;br&gt;I think relying heavily on file-backed memory is particularly appropriate for
&lt;br&gt;Lucy because the write-once file format works well with MAP_SHARED memory
&lt;br&gt;segments. &amp;nbsp;If files were being modified and had to be protected with
&lt;br&gt;semaphores, it wouldn't be as sweet a match.
&lt;br&gt;&lt;br&gt;Focusing on process-only concurrency also works well for Lucy because host
&lt;br&gt;threading models differ substantially and so will only be accessible via a
&lt;br&gt;generalized interface from the Lucy C core. &amp;nbsp;It will be difficult to tune
&lt;br&gt;threading performance through that layer of indirection -- I'm guessing beyond
&lt;br&gt;the ability of most developers since few will be experts in multiple host
&lt;br&gt;threading models. &amp;nbsp;In contrast, expertise in process level concurrency will be
&lt;br&gt;easier to come by and to nourish.
&lt;br&gt;&lt;br&gt;&amp;gt; Using Zoie you can make reopen time insanely fast (much faster than I
&lt;br&gt;&amp;gt; think necessary for most apps), but at the expense of some expected
&lt;br&gt;&amp;gt; hit to searching/indexing throughput. I don't think that's the right
&lt;br&gt;&amp;gt; tradeoff for Lucene.
&lt;br&gt;&lt;br&gt;But as Jake pointed out early in the thread, Zoie achieves those insanely fast
&lt;br&gt;reopens without tight coupling to IndexWriter and its components. &amp;nbsp;The
&lt;br&gt;auxiliary RAM index approach is well proven.
&lt;br&gt;&lt;br&gt;&amp;gt; Do you have any hard numbers on how much time it takes Lucene to load
&lt;br&gt;&amp;gt; from a hot IO cache, populating its RAM resident data structures?
&lt;br&gt;&lt;br&gt;Hmm, I don't spend a lot of time working with Lucene directly, so I might not
&lt;br&gt;be the person most likely to have data like that at my fingertips. &amp;nbsp;Maybe that
&lt;br&gt;McCandless dude can help you out, he runs a lot of benchmarks. &amp;nbsp;;) 
&lt;br&gt;&lt;br&gt;Or maybe ask the Solr folks? &amp;nbsp;I see them on solr-user all the time talking 
&lt;br&gt;about &amp;quot;MaxWarmingSearchers&amp;quot;. ;)
&lt;br&gt;&lt;br&gt;&amp;gt; OK. Then, you are basically pooling your readers. &amp;nbsp;Ie, you do allow
&lt;br&gt;&amp;gt; in-process sharing, but only among readers.
&lt;br&gt;&lt;br&gt;Not sure about that. Lucy's IndexReader.reopen() would open new SegReaders for
&lt;br&gt;each new segment, but they would be private to each parent PolyReader. &amp;nbsp;So if
&lt;br&gt;you reopened two IndexReaders at the same time after e.g. &amp;nbsp;segment &amp;quot;seg_12&amp;quot;
&lt;br&gt;had been added, each would create a new, private SegReader for &amp;quot;seg_12&amp;quot;.
&lt;br&gt;&amp;nbsp; 
&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Refactoring of IndexWriter
&lt;br&gt;&amp;gt; --------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2026
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Index
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I've been thinking for a while about refactoring the IndexWriter into
&lt;br&gt;&amp;gt; two main components.
&lt;br&gt;&amp;gt; One could be called a SegmentWriter and as the
&lt;br&gt;&amp;gt; name says its job would be to write one particular index segment. The
&lt;br&gt;&amp;gt; default one just as today will provide methods to add documents and
&lt;br&gt;&amp;gt; flushes when its buffer is full.
&lt;br&gt;&amp;gt; Other SegmentWriter implementations would do things like e.g. appending or
&lt;br&gt;&amp;gt; copying external segments [what addIndexes*() currently does].
&lt;br&gt;&amp;gt; The second component's job would be to manage writing the segments
&lt;br&gt;&amp;gt; file and merging/deleting segments. It would know about
&lt;br&gt;&amp;gt; DeletionPolicy, MergePolicy and MergeScheduler. Ideally it would
&lt;br&gt;&amp;gt; provide hooks that allow users to manage external data structures and
&lt;br&gt;&amp;gt; keep them in sync with Lucene's data during segment merges.
&lt;br&gt;&amp;gt; API wise there are things we have to figure out, such as where the
&lt;br&gt;&amp;gt; updateDocument() method would fit in, because its deletion part
&lt;br&gt;&amp;gt; affects all segments, whereas the new document is only being added to
&lt;br&gt;&amp;gt; the new segment.
&lt;br&gt;&amp;gt; Of course these should be lower level APIs for things like parallel
&lt;br&gt;&amp;gt; indexing and related use cases. That's why we should still provide
&lt;br&gt;&amp;gt; easy to use APIs like today for people who don't need to care about
&lt;br&gt;&amp;gt; per-segment ops during indexing. So the current IndexWriter could
&lt;br&gt;&amp;gt; probably keep most of its APIs and delegate to the new classes.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2026%29-Refactoring-of-IndexWriter-tp26188404p26897826.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26894903</id>
	<title>[jira] Commented: (LUCENE-2120) Possible file handle leak in near real-time reader</title>
	<published>2009-12-22T13:50:29Z</published>
	<updated>2009-12-22T13:50:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793803#action_12793803&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793803#action_12793803&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;John Wang commented on LUCENE-2120:
&lt;br&gt;-----------------------------------
&lt;br&gt;&lt;br&gt;Yes we have done perf tests.
&lt;br&gt;We see no indexing throughput improvement, but query throughput improved by 40%.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Possible file handle leak in near real-time reader
&lt;br&gt;&amp;gt; --------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2120
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2120&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2120&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Bug
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Index
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.1
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Michael McCandless
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael McCandless
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Spinoff of LUCENE-1526: Jake/John hit file descriptor exhaustion when testing NRT.
&lt;br&gt;&amp;gt; I've tried to repro this, stress testing NRT, saturating reopens, indexing, searching, but haven't found any issue.
&lt;br&gt;&amp;gt; Let's try to get to the bottom of it, here...
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2120%29-Possible-file-handle-leak-in-near-real-time-reader-tp26663474p26894903.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26893987</id>
	<title>[jira] Commented: (LUCENE-2120) Possible file handle leak in near real-time reader</title>
	<published>2009-12-22T12:36:29Z</published>
	<updated>2009-12-22T12:36:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793775#action_12793775&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793775#action_12793775&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Michael McCandless commented on LUCENE-2120:
&lt;br&gt;--------------------------------------------
&lt;br&gt;&lt;br&gt;I briefly diff'd the BR_DELETE_OPT branch in Zoie -- it looks like you switched (via custom TermDocs/Positions) to enforcing deleted docs using an iterator instead of double-check random-access, right? &amp;nbsp;I had tried the same thing with Lucene (a while back now), under LUCENE-1476/LUCENE-1536, and found that the simple BitVector gave much better performance than an iterator (which then led to applying filters random-access as well). &amp;nbsp;Have you tested performance of this switch? &amp;nbsp;Maybe it works out to be faster, net/net, than doing the double-deletions check?
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Possible file handle leak in near real-time reader
&lt;br&gt;&amp;gt; --------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2120
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2120&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2120&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Bug
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Index
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.1
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Michael McCandless
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael McCandless
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Spinoff of LUCENE-1526: Jake/John hit file descriptor exhaustion when testing NRT.
&lt;br&gt;&amp;gt; I've tried to repro this, stress testing NRT, saturating reopens, indexing, searching, but haven't found any issue.
&lt;br&gt;&amp;gt; Let's try to get to the bottom of it, here...
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2120%29-Possible-file-handle-leak-in-near-real-time-reader-tp26663474p26893987.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26892819</id>
	<title>[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter</title>
	<published>2009-12-22T11:10:29Z</published>
	<updated>2009-12-22T11:10:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793737#action_12793737&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793737#action_12793737&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Michael McCandless commented on LUCENE-2026:
&lt;br&gt;--------------------------------------------
&lt;br&gt;&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;Processes are Lucy's primary concurrency model. (&amp;quot;The OS is our JVM.&amp;quot;)
&lt;br&gt;Making process-only concurrency efficient isn't optional - it's a core
&lt;br&gt;concern.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;OK
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;Lightweight searchers mean architectural freedom.
&lt;br&gt;&lt;br&gt;Create 2, 10, 100, 1000 Searchers without a second thought - as many as you
&lt;br&gt;need for whatever app architecture you just dreamed up - then destroy them
&lt;br&gt;just as effortlessly. Add another worker thread to your search server without
&lt;br&gt;having to consider the RAM requirements of a heavy searcher object. Create a
&lt;br&gt;command-line app to search a documentation index without worrying about
&lt;br&gt;daemonizing it. Etc.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;This is definitely neat.
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;The Linux virtual memory system, at least, is not a pure LRU. It utilizes a
&lt;br&gt;page aging algo which prioritizes pages that have historically been accessed
&lt;br&gt;frequently even when they have not been accessed recently:
&lt;br&gt;&lt;br&gt;&lt;a href=&quot;http://sunsite.nus.edu.sg/LDP/LDP/tlk/node40.html&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://sunsite.nus.edu.sg/LDP/LDP/tlk/node40.html&lt;/a&gt;&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Very interesting -- thanks. &amp;nbsp;So it also factors in how much the page
&lt;br&gt;was used in the past, not just how long it's been since the page was
&lt;br&gt;last used.
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;When will swapping out the term dictionary be a problem?
&lt;br&gt;&lt;br&gt;For indexes where queries are made frequently, no problem.
&lt;br&gt;For systems with plenty of RAM, no problem.
&lt;br&gt;For systems that aren't very busy, no problem.
&lt;br&gt;For small indexes, no problem.
&lt;br&gt;The only situation we're talking about is infrequent queries against large
&lt;br&gt;indexes on busy boxes where RAM isn't abundant. Under those circumstances, it
&lt;br&gt;might be noticeable that Lucy's term dictionary gets paged out somewhat
&lt;br&gt;sooner than Lucene's.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Even smallish indexes can see the pages swapped out? &amp;nbsp;I'd think at
&lt;br&gt;low-to-moderate search traffic, any index could be at risk, depending
&lt;br&gt;on whether other stuff in the machine wanting RAM or IO cache is
&lt;br&gt;running.
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;But in general, if the term dictionary gets paged out, so what? Nobody was
&lt;br&gt;using it. Maybe nobody will make another query against that index until next
&lt;br&gt;week. Maybe the OS made the right decision.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;You can't afford many page faults before the latency becomes very
&lt;br&gt;apparent (until we're all on SSDs... at which point this may all be
&lt;br&gt;moot).
&lt;br&gt;&lt;br&gt;Right -- the metric that the swapper optimizes is overall efficient
&lt;br&gt;use of the machine's resources.
&lt;br&gt;&lt;br&gt;But I think that's often a poor metric for search apps... I think
&lt;br&gt;consistency on the search latency is more important, though I agree it
&lt;br&gt;depends very much on the app.
&lt;br&gt;&lt;br&gt;I don't like the same behavior on my desktop -- when I switch to my
&lt;br&gt;mail client, I don't want to wait 10 seconds for it to swap the pages
&lt;br&gt;back in.
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;Let me turn your question on its head. What does Lucene gain in return for
&lt;br&gt;the slow index opens and large process memory footprint of its heavy
&lt;br&gt;searchers?
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Consistency in the search time. &amp;nbsp;Assuming the OS doesn't swap our
&lt;br&gt;pages out...
&lt;br&gt;&lt;br&gt;And of course Java pretty much forces threads-as-concurrency (JVM
&lt;br&gt;startup time and hotspot compilation are costly).
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;If necessary, there's a straightforward remedy: slurp the relevant files into
&lt;br&gt;RAM at object construction rather than mmap them. The rest of the code won't 
&lt;br&gt;know the difference between malloc'd RAM and mmap'd RAM. The slurped files 
&lt;br&gt;won't take up any more space than the analogous Lucene data structures; more 
&lt;br&gt;likely, they'll take up less.
&lt;br&gt;&lt;br&gt;That's the kind of setting we'd hide away in the IndexManager class rather
&lt;br&gt;than expose as prominent API, and it would be a hint to index components
&lt;br&gt;rather than an edict.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Right, this is how Lucy would force warming.
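&lt;br&gt;&lt;br&gt;As a rough illustration of the mmap-versus-slurp choice, here is a minimal sketch in Java rather than Lucy's C. The class and method names (SlurpOrMap, mapFile, slurpFile, sum) are hypothetical and are not part of Lucy or Lucene; the sketch only assumes that readers go through a ByteBuffer, so they cannot tell which loading strategy was used.
&lt;pre&gt;
```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only.
public class SlurpOrMap {
    // Map the file read-only: pages are loaded lazily on first access,
    // and the OS may evict them again under memory pressure.
    static ByteBuffer mapFile(Path path) throws IOException {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    // Slurp the file: copy every byte into process memory up front.
    static ByteBuffer slurpFile(Path path) throws IOException {
        return ByteBuffer.wrap(Files.readAllBytes(path)).asReadOnlyBuffer();
    }

    // A consumer that works the same way on either kind of buffer.
    static int sum(ByteBuffer buf) {
        int total = 0;
        ByteBuffer view = buf.duplicate(); // leave the caller's position untouched
        while (view.hasRemaining()) {
            total += view.get();
        }
        return total;
    }
}
```
&lt;/pre&gt;
&lt;br&gt;Slurping pays the full read cost at construction time, while mapping defers it to first access and leaves eviction decisions to the OS.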
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. Yeah, that you need 3 files for the string sort cache is a little spooky... that's 3X the chance of a page fault.
&lt;br&gt;&lt;br&gt;Not when using the compound format.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;But, even within that CFS file, these three sub-files will not be
&lt;br&gt;local? &amp;nbsp;Ie you'll still have to hit three pages per &amp;quot;lookup&amp;quot; right?
&lt;br&gt;&lt;br&gt;{quote} &amp;nbsp;
&lt;br&gt;I think relying heavily on file-backed memory is particularly appropriate for
&lt;br&gt;Lucy because the write-once file format works well with MAP_SHARED memory
&lt;br&gt;segments. If files were being modified and had to be protected with
&lt;br&gt;semaphores, it wouldn't be as sweet a match.
&lt;br&gt;{quote} &amp;nbsp;
&lt;br&gt;&lt;br&gt;Write-once is good for Lucene too.
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;Focusing on process-only concurrency also works well for Lucy because host
&lt;br&gt;threading models differ substantially and so will only be accessible via a
&lt;br&gt;generalized interface from the Lucy C core. It will be difficult to tune
&lt;br&gt;threading performance through that layer of indirection - I'm guessing beyond
&lt;br&gt;the ability of most developers since few will be experts in multiple host
&lt;br&gt;threading models. In contrast, expertise in process level concurrency will be
&lt;br&gt;easier to come by and to nourish.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;I'm confused by this -- eg Python does a great job presenting a simple
&lt;br&gt;threads interface and implementing it on major OSs. &amp;nbsp;And it seems like
&lt;br&gt;Lucy would not need anything crazy-os-specific wrt threads?
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. Do you have any hard numbers on how much time it takes Lucene to load from a hot IO cache, populating its RAM resident data structures?
&lt;br&gt;&lt;br&gt;Hmm, I don't spend a lot of time working with Lucene directly, so I might not
&lt;br&gt;be the person most likely to have data like that at my fingertips. Maybe that
&lt;br&gt;McCandless dude can help you out, he runs a lot of benchmarks. &amp;nbsp;
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;Hmm ;) I'd guess that field cache is slowish; deleted docs &amp; norms are
&lt;br&gt;very fast; terms index is somewhere in between.
&lt;br&gt;&lt;br&gt;bq. Or maybe ask the Solr folks? I see them on solr-user all the time talking about &amp;quot;MaxWarmingSearchers&amp;quot;. 
&lt;br&gt;&lt;br&gt;Hmm -- not sure what's up with that. &amp;nbsp;Looks like maybe it's the
&lt;br&gt;auto-warming that might happen after a commit.
&lt;br&gt;&lt;br&gt;{quote}
&lt;br&gt;bq. OK. Then, you are basically pooling your readers. Ie, you do allow in-process sharing, but only among readers.
&lt;br&gt;&lt;br&gt;Not sure about that. Lucy's IndexReader.reopen() would open new SegReaders for
&lt;br&gt;each new segment, but they would be private to each parent PolyReader. So if
&lt;br&gt;you reopened two IndexReaders at the same time after e.g. segment &amp;quot;seg_12&amp;quot;
&lt;br&gt;had been added, each would create a new, private SegReader for &amp;quot;seg_12&amp;quot;.
&lt;br&gt;{quote}
&lt;br&gt;&lt;br&gt;You're right, you'd get two readers for seg_12 in that case. &amp;nbsp;By
&lt;br&gt;&amp;quot;pool&amp;quot; I meant you're tapping into all the sub-readers that the
&lt;br&gt;existing reader has opened -- the reader is your pool of sub-readers.
&lt;br&gt;&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Refactoring of IndexWriter
&lt;br&gt;&amp;gt; --------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2026
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Index
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I've been thinking for a while about refactoring the IndexWriter into
&lt;br&gt;&amp;gt; two main components.
&lt;br&gt;&amp;gt; One could be called a SegmentWriter and as the
&lt;br&gt;&amp;gt; name says its job would be to write one particular index segment. The
&lt;br&gt;&amp;gt; default one, just as today, will provide methods to add documents and
&lt;br&gt;&amp;gt; will flush when its buffer is full.
&lt;br&gt;&amp;gt; Other SegmentWriter implementations would do things like e.g. appending or
&lt;br&gt;&amp;gt; copying external segments [what addIndexes*() currently does].
&lt;br&gt;&amp;gt; The second component's job would be to manage writing the segments
&lt;br&gt;&amp;gt; file and merging/deleting segments. It would know about
&lt;br&gt;&amp;gt; DeletionPolicy, MergePolicy and MergeScheduler. Ideally it would
&lt;br&gt;&amp;gt; provide hooks that allow users to manage external data structures and
&lt;br&gt;&amp;gt; keep them in sync with Lucene's data during segment merges.
&lt;br&gt;&amp;gt; API wise there are things we have to figure out, such as where the
&lt;br&gt;&amp;gt; updateDocument() method would fit in, because its deletion part
&lt;br&gt;&amp;gt; affects all segments, whereas the new document is only being added to
&lt;br&gt;&amp;gt; the new segment.
&lt;br&gt;&amp;gt; Of course these should be lower level APIs for things like parallel
&lt;br&gt;&amp;gt; indexing and related use cases. That's why we should still provide
&lt;br&gt;&amp;gt; easy to use APIs like today for people who don't need to care about
&lt;br&gt;&amp;gt; per-segment ops during indexing. So the current IndexWriter could
&lt;br&gt;&amp;gt; probably keep most of its APIs and delegate to the new classes.
&lt;/div&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2026%29-Refactoring-of-IndexWriter-tp26188404p26892819.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26892020</id>
	<title>Re: 3.0 api change</title>
	<published>2009-12-22T10:09:53Z</published>
	<updated>2009-12-22T10:09:53Z</updated>
	<author>
		<name>John Wang-9</name>
	</author>
	<content type="html">Thanks Uwe for clearing this up!&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;div&gt;-John&lt;br&gt;&lt;br&gt;&lt;div class=&quot;gmail_quote&quot;&gt;On Mon, Dec 21, 2009 at 11:22 PM, Uwe Schindler &lt;span dir=&quot;ltr&quot;&gt;&amp;lt;&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26892020&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;uwe@...&lt;/a&gt;&amp;gt;&lt;/span&gt; wrote:&lt;br&gt;
&lt;blockquote class=&quot;gmail_quote&quot; style=&quot;margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;&quot;&gt;










&lt;div lang=&quot;DE&quot; link=&quot;blue&quot; vlink=&quot;blue&quot;&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;2&quot; color=&quot;navy&quot; face=&quot;Arial&quot;&gt;&lt;span lang=&quot;EN-GB&quot; style=&quot;font-size:10.0pt;font-family:Arial;color:navy&quot;&gt;This method was
accidentally removed because it had no javadocs and was “somehow invisible”.
This is the second method found that was accidentally removed (the other was IR.getTermIndexDivisor).&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;2&quot; color=&quot;navy&quot; face=&quot;Arial&quot;&gt;&lt;span lang=&quot;EN-GB&quot; style=&quot;font-size:10.0pt;font-family:Arial;color:navy&quot;&gt; &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;2&quot; color=&quot;navy&quot; face=&quot;Arial&quot;&gt;&lt;span lang=&quot;EN-GB&quot; style=&quot;font-size:10.0pt;font-family:Arial;color:navy&quot;&gt;As JIRA is not working at
the moment, I will add a comment to the Sort deprecations removal issue (LUCENE-1972)
later, but it is fixed now in the 3.0 branch, revision 893095 (and also in trunk).&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;2&quot; color=&quot;navy&quot; face=&quot;Arial&quot;&gt;&lt;span lang=&quot;EN-GB&quot; style=&quot;font-size:10.0pt;font-family:Arial;color:navy&quot;&gt; &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;2&quot; color=&quot;navy&quot; face=&quot;Arial&quot;&gt;&lt;span lang=&quot;EN-GB&quot; style=&quot;font-size:10.0pt;font-family:Arial;color:navy&quot;&gt;Thanks!&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;2&quot; color=&quot;navy&quot; face=&quot;Arial&quot;&gt;&lt;span lang=&quot;EN-GB&quot; style=&quot;font-size:10.0pt;font-family:Arial;color:navy&quot;&gt;Uwe&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;div&gt;

&lt;p style=&quot;margin-bottom:12.0pt&quot;&gt;&lt;font size=&quot;2&quot; color=&quot;navy&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:10.0pt;color:navy&quot;&gt;-----&lt;br&gt;
Uwe Schindler&lt;br&gt;
H.-H.-Meier-Allee 63, D-28213 Bremen&lt;br&gt;
&lt;a href=&quot;http://www.thetaphi.de&quot; target=&quot;_blank&quot; rel=&quot;nofollow&quot;&gt;http://www.thetaphi.de&lt;/a&gt;&lt;br&gt;
eMail: &lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26892020&amp;i=1&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;uwe@...&lt;/a&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div style=&quot;border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt&quot;&gt;

&lt;div&gt;

&lt;div class=&quot;MsoNormal&quot; align=&quot;center&quot; style=&quot;text-align:center&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt;

&lt;hr size=&quot;2&quot; width=&quot;100%&quot; align=&quot;center&quot;&gt;

&lt;/span&gt;&lt;/font&gt;&lt;/div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;b&gt;&lt;font size=&quot;2&quot; face=&quot;Tahoma&quot;&gt;&lt;span style=&quot;font-size:10.0pt;font-family:Tahoma;font-weight:bold&quot;&gt;From:&lt;/span&gt;&lt;/font&gt;&lt;/b&gt;&lt;font size=&quot;2&quot; face=&quot;Tahoma&quot;&gt;&lt;span style=&quot;font-size:10.0pt;font-family:Tahoma&quot;&gt; John Wang
[mailto:&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26892020&amp;i=2&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;john.wang@...&lt;/a&gt;] &lt;br&gt;
&lt;b&gt;&lt;span style=&quot;font-weight:bold&quot;&gt;Sent:&lt;/span&gt;&lt;/b&gt; Tuesday, December 22, 2009
3:16 AM&lt;br&gt;
&lt;b&gt;&lt;span style=&quot;font-weight:bold&quot;&gt;To:&lt;/span&gt;&lt;/b&gt; &lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26892020&amp;i=3&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;java-user@...&lt;/a&gt;;
&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26892020&amp;i=4&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;java-dev@...&lt;/a&gt;&lt;br&gt;
&lt;b&gt;&lt;span style=&quot;font-weight:bold&quot;&gt;Subject:&lt;/span&gt;&lt;/b&gt; Fwd: 3.0 api change&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;/div&gt;&lt;div class=&quot;h5&quot;&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt; &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt;Any comments?&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt;Did we just unintentionally remove getFieldComparatorSource in 3.0.0?&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt; &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot; style=&quot;margin-bottom:12.0pt&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt;-John&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt;---------- Forwarded message ----------&lt;br&gt;
From: &lt;b&gt;&lt;span style=&quot;font-weight:bold&quot;&gt;John Wang&lt;/span&gt;&lt;/b&gt; &amp;lt;&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26892020&amp;i=5&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;john.wang@...&lt;/a&gt;&amp;gt;&lt;br&gt;
Date: Mon, Dec 21, 2009 at 11:21 AM&lt;br&gt;
Subject: 3.0 api change&lt;br&gt;
To: Lucene Users List &amp;lt;&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26892020&amp;i=6&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;lucene-user@...&lt;/a&gt;&amp;gt;,
&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26892020&amp;i=7&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;lucene-dev@...&lt;/a&gt;&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
Hi guys:&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt; &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt;    I noticed SortField.&lt;/span&gt;&lt;/font&gt;&lt;u&gt;&lt;font size=&quot;1&quot; face=&quot;Monaco&quot;&gt;&lt;span style=&quot;font-size:8.5pt;font-family:Monaco&quot;&gt;getComparatorSource&lt;/span&gt;&lt;/font&gt;&lt;/u&gt; was
removed (replaced by getComparator), and this is not documented in CHANGES.TXT.
This API was introduced in 2.9.0. &lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt; &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt;    Is this intentional?&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt; &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt;Thanks&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt; &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; color=&quot;#888888&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt;color:#888888&quot;&gt;-John&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;/div&gt;

&lt;p class=&quot;MsoNormal&quot;&gt;&lt;font size=&quot;3&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style=&quot;font-size:12.0pt&quot;&gt; &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;

&lt;/div&gt;

&lt;/div&gt;


&lt;/blockquote&gt;&lt;/div&gt;&lt;br&gt;&lt;/div&gt;
</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/Fwd%3A-3.0-api-change-tp26883041p26892020.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26891405</id>
	<title>[jira] Resolved: (LUCENE-2178) Benchmark contrib should allow multiple locations in ext.classpath</title>
	<published>2009-12-22T09:08:30Z</published>
	<updated>2009-12-22T09:08:30Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;[ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&lt;/a&gt;&amp;nbsp;]
&lt;br&gt;&lt;br&gt;Michael McCandless resolved LUCENE-2178.
&lt;br&gt;----------------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Resolution: Fixed
&lt;br&gt;&amp;nbsp; &amp;nbsp; Fix Version/s: 3.1
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Benchmark contrib should allow multiple locations in ext.classpath
&lt;br&gt;&amp;gt; ------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2178
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: contrib/benchmark
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.0
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Steven Rowe
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael McCandless
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; When {{ant run-task}} is invoked with the &amp;nbsp;{{-Dbenchmark.ext.classpath=...}} option, only a single location may be specified. &amp;nbsp;If a classpath with more than one location is specified, none of the locations is put on the classpath for the invoked JVM.
&lt;/div&gt;&lt;br&gt;-- 
&lt;br&gt;This message is automatically generated by JIRA.
&lt;br&gt;-
&lt;br&gt;You can reply to this email to add a comment to the issue online.
&lt;br&gt;&lt;br&gt;&lt;br&gt;---------------------------------------------------------------------
&lt;br&gt;To unsubscribe, e-mail: &lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26891405&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;java-dev-unsubscribe@...&lt;/a&gt;
&lt;br&gt;For additional commands, e-mail: &lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26891405&amp;i=1&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;java-dev-help@...&lt;/a&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2178%29-Benchmark-contrib-should-allow-multiple-locations-in-ext.classpath-tp26889161p26891405.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26891399</id>
	<title>[jira] Assigned: (LUCENE-2178) Benchmark contrib should allow multiple locations in ext.classpath</title>
	<published>2009-12-22T09:04:29Z</published>
	<updated>2009-12-22T09:04:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;[ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&lt;/a&gt;&amp;nbsp;]
&lt;br&gt;&lt;br&gt;Michael McCandless reassigned LUCENE-2178:
&lt;br&gt;------------------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; Assignee: Michael McCandless
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Benchmark contrib should allow multiple locations in ext.classpath
&lt;br&gt;&amp;gt; ------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2178
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: contrib/benchmark
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.0
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Steven Rowe
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael McCandless
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; When {{ant run-task}} is invoked with the &amp;nbsp;{{-Dbenchmark.ext.classpath=...}} option, only a single location may be specified. &amp;nbsp;If a classpath with more than one location is specified, none of the locations is put on the classpath for the invoked JVM.
&lt;/div&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2178%29-Benchmark-contrib-should-allow-multiple-locations-in-ext.classpath-tp26889161p26891399.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26891417</id>
	<title>[jira] Commented: (LUCENE-2178) Benchmark contrib should allow multiple locations in ext.classpath</title>
	<published>2009-12-22T09:04:29Z</published>
	<updated>2009-12-22T09:04:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793673#action_12793673&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793673#action_12793673&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Michael McCandless commented on LUCENE-2178:
&lt;br&gt;--------------------------------------------
&lt;br&gt;&lt;br&gt;Looks good -- I'll commit. &amp;nbsp;Thanks Steven!
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Benchmark contrib should allow multiple locations in ext.classpath
&lt;br&gt;&amp;gt; ------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2178
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: contrib/benchmark
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.0
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Steven Rowe
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael McCandless
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; When {{ant run-task}} is invoked with the &amp;nbsp;{{-Dbenchmark.ext.classpath=...}} option, only a single location may be specified. &amp;nbsp;If a classpath with more than one location is specified, none of the locations is put on the classpath for the invoked JVM.
&lt;/div&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2178%29-Benchmark-contrib-should-allow-multiple-locations-in-ext.classpath-tp26889161p26891417.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26889213</id>
	<title>[jira] Commented: (LUCENE-2178) Benchmark contrib should allow multiple locations in ext.classpath</title>
	<published>2009-12-22T06:30:29Z</published>
	<updated>2009-12-22T06:30:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793627#action_12793627&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793627#action_12793627&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Steven Rowe commented on LUCENE-2178:
&lt;br&gt;-------------------------------------
&lt;br&gt;&lt;br&gt;Trivial patch to fix (works with single or multiple locations):
&lt;br&gt;&lt;br&gt;{code}
&lt;br&gt;Index: contrib/benchmark/build.xml
&lt;br&gt;===================================================================
&lt;br&gt;--- contrib/benchmark/build.xml (revision 892657)
&lt;br&gt;+++ contrib/benchmark/build.xml (working copy)
&lt;br&gt;@@ -114,7 +114,7 @@
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;lt;path id=&amp;quot;run.classpath&amp;quot;&amp;gt;
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;lt;path refid=&amp;quot;classpath&amp;quot;/&amp;gt;
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;lt;pathelement location=&amp;quot;${build.dir}/classes/java&amp;quot;/&amp;gt;
&lt;br&gt;- &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;lt;pathelement location=&amp;quot;${benchmark.ext.classpath}&amp;quot;/&amp;gt;
&lt;br&gt;+ &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;lt;pathelement path=&amp;quot;${benchmark.ext.classpath}&amp;quot;/&amp;gt;
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;lt;/path&amp;gt;
&lt;br&gt;&amp;nbsp;
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;lt;property name=&amp;quot;task.alg&amp;quot; location=&amp;quot;conf/micro-standard.alg&amp;quot;/&amp;gt;
&lt;br&gt;{code}
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Benchmark contrib should allow multiple locations in ext.classpath
&lt;br&gt;&amp;gt; ------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2178
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: contrib/benchmark
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.0
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Steven Rowe
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; When {{ant run-task}} is invoked with the &amp;nbsp;{{-Dbenchmark.ext.classpath=...}} option, only a single location may be specified. &amp;nbsp;If a classpath with more than one location is specified, none of the locations is put on the classpath for the invoked JVM.
&lt;/div&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2178%29-Benchmark-contrib-should-allow-multiple-locations-in-ext.classpath-tp26889161p26889213.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26889187</id>
	<title>[jira] Updated: (LUCENE-2178) Benchmark contrib should allow multiple locations in ext.classpath</title>
	<published>2009-12-22T06:28:29Z</published>
	<updated>2009-12-22T06:28:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;[ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&lt;/a&gt;&amp;nbsp;]
&lt;br&gt;&lt;br&gt;Steven Rowe updated LUCENE-2178:
&lt;br&gt;--------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; Description: When {{ant run-task}} is invoked with the &amp;nbsp;{{-Dbenchmark.ext.classpath=...}} option, only a single location may be specified. &amp;nbsp;If a classpath with more than one location is specified, none of the locations is put on the classpath for the invoked JVM. &amp;nbsp;(was: When {{ant run-task}} is invoked with the &amp;nbsp;{{-Dbenchmark.ext.classpath=...} option, only a single location may be specified. &amp;nbsp;If a classpath with more than one location is specified, none of the locations is put on the classpath for the invoked JVM.)
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Benchmark contrib should allow multiple locations in ext.classpath
&lt;br&gt;&amp;gt; ------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2178
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: contrib/benchmark
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp;Affects Versions: 3.0
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Steven Rowe
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; When {{ant run-task}} is invoked with the &amp;nbsp;{{-Dbenchmark.ext.classpath=...}} option, only a single location may be specified. &amp;nbsp;If a classpath with more than one location is specified, none of the locations is put on the classpath for the invoked JVM.
&lt;/div&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2178%29-Benchmark-contrib-should-allow-multiple-locations-in-ext.classpath-tp26889161p26889187.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26889164</id>
	<title>[jira] Resolved: (LUCENE-2177) The Field ctors that take byte[] shouldn't take Store, since it must be YES</title>
	<published>2009-12-22T06:26:29Z</published>
	<updated>2009-12-22T06:26:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;[ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&lt;/a&gt;&amp;nbsp;]
&lt;br&gt;&lt;br&gt;Michael McCandless resolved LUCENE-2177.
&lt;br&gt;----------------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; Resolution: Fixed
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; The Field ctors that take byte[] shouldn't take Store, since it must be YES
&lt;br&gt;&amp;gt; ---------------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2177
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2177&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2177&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Other
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Michael McCandless
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael McCandless
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Trivial
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-2177.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; API silliness. &amp;nbsp;Makes you think you can set Store.NO for binary fields. &amp;nbsp;This used to be meaningful when we also accepted COMPRESS, but now it's an orphan.
&lt;/div&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2177%29-The-Field-ctors-that-take-byte---shouldn%27t-take-Store%2C-since-it-must-be-YES-tp26872071p26889164.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26889161</id>
	<title>[jira] Created: (LUCENE-2178) Benchmark contrib should allow multiple locations in ext.classpath</title>
	<published>2009-12-22T06:26:29Z</published>
	<updated>2009-12-22T06:26:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">Benchmark contrib should allow multiple locations in ext.classpath
&lt;br&gt;------------------------------------------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Key: LUCENE-2178
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2178&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2178&lt;/a&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Project: Lucene - Java
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Issue Type: Improvement
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Components: contrib/benchmark
&lt;br&gt;&amp;nbsp; &amp;nbsp; Affects Versions: 3.0
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Reporter: Steven Rowe
&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Priority: Minor
&lt;br&gt;&lt;br&gt;&lt;br&gt;When {{ant run-task}} is invoked with the &amp;nbsp;{{-Dbenchmark.ext.classpath=...} option, only a single location may be specified. &amp;nbsp;If a classpath with more than one location is specified, none of the locations is put on the classpath for the invoked JVM.
</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2178%29-Benchmark-contrib-should-allow-multiple-locations-in-ext.classpath-tp26889161p26889161.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26885173</id>
	<title>[jira] Updated: (LUCENE-1972) Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic</title>
	<published>2009-12-22T00:24:29Z</published>
	<updated>2009-12-22T00:24:29Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;[ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel&lt;/a&gt;&amp;nbsp;]
&lt;br&gt;&lt;br&gt;Uwe Schindler updated LUCENE-1972:
&lt;br&gt;----------------------------------
&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; Attachment: LUCENE-1972-fix.patch
&lt;br&gt;&lt;br&gt;Attached is the patch, committed to the 3.0 branch and trunk (rev 893104), that fixes the accidental removal of SortField.getComparatorSource().
&lt;br&gt;&lt;br&gt;Thanks John Wang!
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and lot's of deprecated sort logic
&lt;br&gt;&amp;gt; ------------------------------------------------------------------------------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-1972
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-1972&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-1972&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Task
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Search
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Uwe Schindler
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.0
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Attachments: LUCENE-1972-2.patch, LUCENE-1972-bw.patch, LUCENE-1972-fix.patch, LUCENE-1972.patch
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; Remove (deprecated) ExtendedFieldCache and Auto/Custom caches and sort
&lt;/div&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-1972%29-Remove-%28deprecated%29-ExtendedFieldCache-and-Auto-Custom-caches-and-sort-tp25847231p26885173.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26884767</id>
	<title>RE: 3.0 api change</title>
	<published>2009-12-21T23:22:43Z</published>
	<updated>2009-12-21T23:22:43Z</updated>
	<author>
		<name>Uwe Schindler</name>
	</author>
	<content type="html">&lt;html xmlns:v=&quot;urn:schemas-microsoft-com:vml&quot; xmlns:o=&quot;urn:schemas-microsoft-com:office:office&quot; xmlns:w=&quot;urn:schemas-microsoft-com:office:word&quot; xmlns:st1=&quot;urn:schemas-microsoft-com:office:smarttags&quot; xmlns=&quot;http://www.w3.org/TR/REC-html40&quot;&gt;


&lt;body lang=DE link=blue vlink=blue&gt;

&lt;div class=Section1&gt;

&lt;p class=MsoNormal&gt;&lt;font size=2 color=navy face=Arial&gt;&lt;span lang=EN-GB style='font-size:10.0pt;font-family:Arial;color:navy'&gt;This method was
accidentally removed because it had no javadocs and was &amp;#8220;somehow invisible&amp;#8221;.
This is the second method found to have been accidentally removed (the other was IR.getTermIndexDivisor).&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=MsoNormal&gt;&lt;font size=2 color=navy face=Arial&gt;&lt;span lang=EN-GB style='font-size:10.0pt;font-family:Arial;color:navy'&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=MsoNormal&gt;&lt;font size=2 color=navy face=Arial&gt;&lt;span lang=EN-GB style='font-size:10.0pt;font-family:Arial;color:navy'&gt;As JIRA is not working at
the moment, I will add a comment to the Sort deprecations removal issue (LUCENE-1972)
later, but it is fixed now in 3.0 branch revision 893095 (and also in trunk).&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=MsoNormal&gt;&lt;font size=2 color=navy face=Arial&gt;&lt;span lang=EN-GB style='font-size:10.0pt;font-family:Arial;color:navy'&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=MsoNormal&gt;&lt;font size=2 color=navy face=Arial&gt;&lt;span lang=EN-GB style='font-size:10.0pt;font-family:Arial;color:navy'&gt;Thanks!&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=MsoNormal&gt;&lt;font size=2 color=navy face=Arial&gt;&lt;span lang=EN-GB style='font-size:10.0pt;font-family:Arial;color:navy'&gt;Uwe&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;div&gt;

&lt;p style='margin-bottom:12.0pt'&gt;&lt;font size=2 color=navy face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:10.0pt;color:navy'&gt;-----&lt;br&gt;
Uwe Schindler&lt;br&gt;
H.-H.-Meier-Allee 63, D-28213 Bremen&lt;br&gt;
&lt;a href=&quot;http://www.thetaphi.de&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://www.thetaphi.de&lt;/a&gt;&lt;br&gt;
eMail: &lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26884767&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;uwe@...&lt;/a&gt;&lt;/span&gt;&lt;/font&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div style='border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt'&gt;

&lt;div&gt;

&lt;div class=MsoNormal align=center style='text-align:center'&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:12.0pt'&gt;

&lt;hr size=2 width=&quot;100%&quot; align=center tabindex=-1&gt;

&lt;/span&gt;&lt;/font&gt;&lt;/div&gt;

&lt;p class=MsoNormal&gt;&lt;b&gt;&lt;font size=2 face=Tahoma&gt;&lt;span style='font-size:10.0pt;
font-family:Tahoma;font-weight:bold'&gt;From:&lt;/span&gt;&lt;/font&gt;&lt;/b&gt;&lt;font size=2 face=Tahoma&gt;&lt;span style='font-size:10.0pt;font-family:Tahoma'&gt; John Wang
[mailto:&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26884767&amp;i=1&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;john.wang@...&lt;/a&gt;] &lt;br&gt;
&lt;b&gt;&lt;span style='font-weight:bold'&gt;Sent:&lt;/span&gt;&lt;/b&gt; Tuesday, December 22, 2009
3:16 AM&lt;br&gt;
&lt;b&gt;&lt;span style='font-weight:bold'&gt;To:&lt;/span&gt;&lt;/b&gt; &lt;st1:PersonName w:st=&quot;on&quot;&gt;&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26884767&amp;i=2&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;java-user@...&lt;/a&gt;&lt;/st1:PersonName&gt;;
&lt;st1:PersonName w:st=&quot;on&quot;&gt;&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26884767&amp;i=3&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;java-dev@...&lt;/a&gt;&lt;/st1:PersonName&gt;&lt;br&gt;
&lt;b&gt;&lt;span style='font-weight:bold'&gt;Subject:&lt;/span&gt;&lt;/b&gt; Fwd: 3.0 api change&lt;/span&gt;&lt;/font&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;Any comments?&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;Did we just unintentionally remove getFieldComparatorSource in 3.0.0?&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=MsoNormal style='margin-bottom:12.0pt'&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:12.0pt'&gt;-John&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;---------- Forwarded message ----------&lt;br&gt;
From: &lt;b&gt;&lt;span style='font-weight:bold'&gt;John Wang&lt;/span&gt;&lt;/b&gt; &amp;lt;&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26884767&amp;i=4&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;john.wang@...&lt;/a&gt;&amp;gt;&lt;br&gt;
Date: Mon, Dec 21, 2009 at 11:21 AM&lt;br&gt;
Subject: 3.0 api change&lt;br&gt;
To: Lucene Users List &amp;lt;&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26884767&amp;i=5&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;lucene-user@...&lt;/a&gt;&amp;gt;,
&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26884767&amp;i=6&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;lucene-dev@...&lt;/a&gt;&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
Hi guys:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;I noticed SortField.&lt;/span&gt;&lt;/font&gt;&lt;u&gt;&lt;font size=1 face=Monaco&gt;&lt;span style='font-size:8.5pt;font-family:Monaco'&gt;getComparatorSource&lt;/span&gt;&lt;/font&gt;&lt;/u&gt;&amp;nbsp;was
removed (replaced by getComparator) and it is not documented in CHANGES.TXT.
This api was introduced in 2.9.0.&amp;nbsp;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Is this intentional?&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;Thanks&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 color=&quot;#888888&quot; face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:12.0pt;color:#888888'&gt;-John&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;/div&gt;

&lt;p class=MsoNormal&gt;&lt;font size=3 face=&quot;Times New Roman&quot;&gt;&lt;span style='font-size:
12.0pt'&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/p&gt;

&lt;/div&gt;

&lt;/div&gt;

&lt;/div&gt;

&lt;/body&gt;

&lt;/html&gt;
</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/Fwd%3A-3.0-api-change-tp26883041p26884767.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26883041</id>
	<title>Fwd: 3.0 api change</title>
	<published>2009-12-21T18:15:54Z</published>
	<updated>2009-12-21T18:15:54Z</updated>
	<author>
		<name>John Wang-9</name>
	</author>
	<content type="html">Any comments?&lt;div&gt;Did we just unintentionally remove getFieldComparatorSource in 3.0.0?&lt;/div&gt;&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;div&gt;-John&lt;br&gt;&lt;br&gt;&lt;div class=&quot;gmail_quote&quot;&gt;---------- Forwarded message ----------&lt;br&gt;From: &lt;b class=&quot;gmail_sendername&quot;&gt;John Wang&lt;/b&gt; &lt;span dir=&quot;ltr&quot;&gt;&amp;lt;&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26883041&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;john.wang@...&lt;/a&gt;&amp;gt;&lt;/span&gt;&lt;br&gt;
Date: Mon, Dec 21, 2009 at 11:21 AM&lt;br&gt;Subject: 3.0 api change&lt;br&gt;To: Lucene Users List &amp;lt;&lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26883041&amp;i=1&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;lucene-user@...&lt;/a&gt;&amp;gt;, &lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26883041&amp;i=2&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;lucene-dev@...&lt;/a&gt;&lt;br&gt;
&lt;br&gt;&lt;br&gt;Hi guys:&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;div&gt;    I noticed SortField.&lt;span style=&quot;font-family:Monaco;font-size:11px&quot;&gt;&lt;span style=&quot;text-decoration:underline&quot;&gt;getComparatorSource&lt;/span&gt;&lt;/span&gt; was removed (replaced by getComparator) and it is not documented in CHANGES.TXT. This api was introduced in 2.9.0. &lt;/div&gt;

&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;div&gt;    Is this intentional?&lt;/div&gt;&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;div&gt;Thanks&lt;/div&gt;&lt;div&gt;&lt;br&gt;&lt;/div&gt;&lt;font color=&quot;#888888&quot;&gt;&lt;div&gt;-John&lt;/div&gt;
&lt;/font&gt;&lt;/div&gt;&lt;br&gt;&lt;/div&gt;
</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/Fwd%3A-3.0-api-change-tp26883041p26883041.html" />
</entry>

<entry>
	<id>tag:old.nabble.com,2006:post-26882314</id>
	<title>[jira] Commented: (LUCENE-2026) Refactoring of IndexWriter</title>
	<published>2009-12-21T16:09:18Z</published>
	<updated>2009-12-21T16:09:18Z</updated>
	<author>
		<name>JIRA jira@apache.org</name>
	</author>
	<content type="html">&lt;br&gt;&amp;nbsp; &amp;nbsp; [ &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793431#action_12793431&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&amp;focusedCommentId=12793431#action_12793431&lt;/a&gt;&amp;nbsp;] 
&lt;br&gt;&lt;br&gt;Marvin Humphrey commented on LUCENE-2026:
&lt;br&gt;-----------------------------------------
&lt;br&gt;&lt;br&gt;&amp;gt; I guess my confusion is what are all the other benefits of using
&lt;br&gt;&amp;gt; file-backed RAM? You can efficiently use process only concurrency
&lt;br&gt;&amp;gt; (though shared memory is technically an option for this too), and you
&lt;br&gt;&amp;gt; have wicked fast open times (but, you still must warm, just like
&lt;br&gt;&amp;gt; Lucene). 
&lt;br&gt;&lt;br&gt;Processes are Lucy's primary concurrency model. &amp;nbsp;(&amp;quot;The OS is our JVM.&amp;quot;)
&lt;br&gt;Making process-only concurrency efficient isn't optional -- it's a *core*
&lt;br&gt;*concern*.
&lt;br&gt;&lt;br&gt;&amp;gt; What else? Oh maybe the ability to inform OS not to cache
&lt;br&gt;&amp;gt; eg the reads done when merging segments. That's one I sure wish
&lt;br&gt;&amp;gt; Lucene could use...
&lt;br&gt;&lt;br&gt;Lightweight searchers mean architectural freedom. &amp;nbsp;
&lt;br&gt;&lt;br&gt;Create 2, 10, 100, 1000 Searchers without a second thought -- as many as you
&lt;br&gt;need for whatever app architecture you just dreamed up -- then destroy them
&lt;br&gt;just as effortlessly. &amp;nbsp;Add another worker thread to your search server without
&lt;br&gt;having to consider the RAM requirements of a heavy searcher object. &amp;nbsp;Create a
&lt;br&gt;command-line app to search a documentation index without worrying about
&lt;br&gt;daemonizing it. &amp;nbsp;Etc.
&lt;br&gt;&lt;br&gt;If your normal development pattern is a single monolithic Java process, then
&lt;br&gt;that freedom might not mean much to you. &amp;nbsp;But with their low per-object RAM
&lt;br&gt;requirements and fast opens, lightweight searchers are easy to use within a
&lt;br&gt;lot of other development patterns. For example: lightweight searchers work 
&lt;br&gt;well for maxing out multiple CPU cores under process-only concurrency.
&lt;br&gt;&lt;br&gt;&amp;gt; In exchange you risk the OS making poor choices about what gets
&lt;br&gt;&amp;gt; swapped out (LRU policy is too simplistic... not all pages are created
&lt;br&gt;&amp;gt; equal), 
&lt;br&gt;&lt;br&gt;The Linux virtual memory system, at least, is not a pure LRU. &amp;nbsp;It utilizes a
&lt;br&gt;page aging algo which prioritizes pages that have historically been accessed
&lt;br&gt;frequently even when they have not been accessed recently:
&lt;br&gt;&lt;br&gt;{panel}
&lt;br&gt;&amp;nbsp; &amp;nbsp; &lt;a href=&quot;http://sunsite.nus.edu.sg/LDP/LDP/tlk/node40.html&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;http://sunsite.nus.edu.sg/LDP/LDP/tlk/node40.html&lt;/a&gt;&lt;br&gt;&lt;br&gt;&amp;nbsp; &amp;nbsp; The default action when a page is first allocated, is to give it an
&lt;br&gt;&amp;nbsp; &amp;nbsp; initial age of 3. Each time it is touched (by the memory management
&lt;br&gt;&amp;nbsp; &amp;nbsp; subsystem) it's age is increased by 3 to a maximum of 20. Each time the
&lt;br&gt;&amp;nbsp; &amp;nbsp; Kernel swap daemon runs it ages pages, decrementing their age by 1.
&lt;br&gt;{panel}
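&lt;br&gt;&lt;br&gt;A minimal sketch of the aging rule quoted in the panel above, using only the constants from that passage (initial age 3, +3 per touch, maximum 20, -1 per swap daemon pass). The names PageAge, touch, daemon_pass and simulate are hypothetical, the floor at zero is an assumption, and this is a toy model rather than kernel code:
&lt;br&gt;&lt;pre&gt;
```python
class PageAge:
    """Toy model of the page-aging counter described in the quoted passage."""

    INITIAL_AGE = 3  # age given to a freshly allocated page
    TOUCH_BONUS = 3  # added each time the page is touched
    MAX_AGE = 20     # cap on the age
    DECAY = 1        # subtracted on each swap daemon pass

    def __init__(self):
        self.age = self.INITIAL_AGE

    def touch(self):
        # A touch raises the age, but never past the cap.
        self.age = min(self.age + self.TOUCH_BONUS, self.MAX_AGE)

    def daemon_pass(self):
        # Assumption: the age bottoms out at zero rather than going negative.
        self.age = max(self.age - self.DECAY, 0)


def simulate(touches, idle_passes):
    """Touch a new page `touches` times, then let it sit idle for
    `idle_passes` swap daemon passes, and return the resulting age."""
    page = PageAge()
    for _ in range(touches):
        page.touch()
    for _ in range(idle_passes):
        page.daemon_pass()
    return page.age
```
&lt;/pre&gt;Under these assumptions, a page touched ten times sits at the cap of 20 and still has age 15 after five idle daemon passes, while a page that is allocated and never touched again drops to 0 after three passes.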
&lt;br&gt;&lt;br&gt;And while that system may not be ideal from our standpoint, it's still pretty
&lt;br&gt;good. &amp;nbsp;In general, the operating system's virtual memory scheme is going to
&lt;br&gt;work fine as designed, for us and everyone else, and minimize memory
&lt;br&gt;availability wait times.
&lt;br&gt;&lt;br&gt;When will swapping out the term dictionary be a problem? &amp;nbsp;
&lt;br&gt;&lt;br&gt;&amp;nbsp; * For indexes where queries are made frequently, no problem. &amp;nbsp;
&lt;br&gt;&amp;nbsp; * For systems with plenty of RAM, no problem. &amp;nbsp;
&lt;br&gt;&amp;nbsp; * For systems that aren't very busy, no problem. &amp;nbsp;
&lt;br&gt;&amp;nbsp; * For small indexes, no problem. &amp;nbsp;
&lt;br&gt;&lt;br&gt;The only situation we're talking about is infrequent queries against large
&lt;br&gt;indexes on busy boxes where RAM isn't abundant. &amp;nbsp;Under those circumstances, it
&lt;br&gt;*might* be noticeable that Lucy's term dictionary gets paged out somewhat
&lt;br&gt;sooner than Lucene's.
&lt;br&gt;&lt;br&gt;But in general, if the term dictionary gets paged out, so what? &amp;nbsp;Nobody was
&lt;br&gt;using it. &amp;nbsp;Maybe nobody will make another query against that index until next
&lt;br&gt;week. &amp;nbsp;Maybe the OS made the right decision.
&lt;br&gt;&lt;br&gt;OK, so there's a vulnerable bubble where the query rate against a large
&lt;br&gt;index is neither too fast nor too slow, on busy machines where RAM isn't
&lt;br&gt;abundant. &amp;nbsp;I don't think that bubble ought to drive major architectural
&lt;br&gt;decisions.
&lt;br&gt;&lt;br&gt;Let me turn your question on its head. &amp;nbsp;What does Lucene gain in return for
&lt;br&gt;the slow index opens and large process memory footprint of its heavy
&lt;br&gt;searchers?
&lt;br&gt;&lt;br&gt;&amp;gt; I do love how pure the file-backed RAM approach is, but I worry that
&lt;br&gt;&amp;gt; down the road it'll result in erratic search performance in certain
&lt;br&gt;&amp;gt; app profiles.
&lt;br&gt;&lt;br&gt;If necessary, there's a straightforward remedy: slurp the relevant files into
&lt;br&gt;RAM at object construction rather than mmap them. &amp;nbsp;The rest of the code won't 
&lt;br&gt;know the difference between malloc'd RAM and mmap'd RAM. &amp;nbsp;The slurped files 
&lt;br&gt;won't take up any more space than the analogous Lucene data structures; more 
&lt;br&gt;likely, they'll take up less.
&lt;br&gt;&lt;br&gt;That's the kind of setting we'd hide away in the IndexManager class rather
&lt;br&gt;than expose as prominent API, and it would be a hint to index components
&lt;br&gt;rather than an edict.
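&lt;br&gt;&lt;br&gt;The slurp-or-mmap choice described above can be sketched as follows. Lucy's core is C, so this Python version is only an illustration; load_readonly and its slurp flag are hypothetical names, not part of Lucy's or Lucene's API. Both branches return the same kind of read-only buffer, which is why the rest of the code would not need to know which one it was given:
&lt;br&gt;&lt;pre&gt;
```python
import mmap


def load_readonly(path, slurp=False):
    """Return a read-only, bytes-like view of the file at `path`.

    slurp=False: map the file, so pages live in the OS cache and can be
                 shared between processes or paged out by the OS.
    slurp=True:  copy the whole file into process-private memory up front.

    Assumes a non-empty file (an empty file cannot be memory-mapped).
    """
    with open(path, "rb") as f:
        if slurp:
            return memoryview(f.read())
        # The mapping stays valid after the file object is closed.
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        return memoryview(mapped)


def first_bytes(view, n):
    """Example consumer: works identically on either kind of view."""
    return bytes(view[:n])
```
&lt;/pre&gt;A caller would pass slurp=True only as a hint for indexes where paging latency matters; everything after the load treats the two views identically.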
&lt;br&gt;&lt;br&gt;&amp;gt; Yeah, that you need 3 files for the string sort cache is a little
&lt;br&gt;&amp;gt; spooky... that's 3X the chance of a page fault.
&lt;br&gt;&lt;br&gt;Not when using the compound format.
&lt;br&gt;&lt;br&gt;&amp;gt; But the CFS construction must also go through the filesystem (like
&lt;br&gt;&amp;gt; Lucene) right? So you still incur IO load of creating the small
&lt;br&gt;&amp;gt; files, then 2nd pass to consolidate.
&lt;br&gt;&lt;br&gt;Yes.
&lt;br&gt;&lt;br&gt;&amp;gt; I think we may need to largely take &amp;quot;time&amp;quot; out of our programming
&lt;br&gt;&amp;gt; languages, eg switch to much more declarative code, or
&lt;br&gt;&amp;gt; something... wanna port Lucy to Erlang?
&lt;br&gt;&amp;gt; 
&lt;br&gt;&amp;gt; But I'm not sure process only concurrency, sharing only via
&lt;br&gt;&amp;gt; file-backed memory, is the answer either
&lt;br&gt;&lt;br&gt;I think relying heavily on file-backed memory is particularly appropriate for
&lt;br&gt;Lucy because the write-once file format works well with MAP_SHARED memory
&lt;br&gt;segments. &amp;nbsp;If files were being modified and had to be protected with
&lt;br&gt;semaphores, it wouldn't be as sweet a match.
&lt;br&gt;&lt;br&gt;Focusing on process-only concurrency also works well for Lucy because host
&lt;br&gt;threading models differ substantially and so will only be accessible via a
&lt;br&gt;generalized interface from the Lucy C core. &amp;nbsp;It will be difficult to tune
&lt;br&gt;threading performance through that layer of indirection -- I'm guessing beyond
&lt;br&gt;the ability of most developers since few will be experts in multiple host
&lt;br&gt;threading models. &amp;nbsp;In contrast, expertise in process level concurrency will be
&lt;br&gt;easier to come by and to nourish.
&lt;br&gt;&lt;br&gt;&amp;gt; Using Zoie you can make reopen time insanely fast (much faster than I
&lt;br&gt;&amp;gt; think necessary for most apps), but at the expense of some expected
&lt;br&gt;&amp;gt; hit to searching/indexing throughput. I don't think that's the right
&lt;br&gt;&amp;gt; tradeoff for Lucene.
&lt;br&gt;&lt;br&gt;But as Jake pointed out early in the thread, Zoie achieves those insanely fast
&lt;br&gt;reopens without tight coupling to IndexWriter and its components. &amp;nbsp;The
&lt;br&gt;auxiliary RAM index approach is well proven.
&lt;br&gt;&lt;br&gt;&amp;gt; Do you have any hard numbers on how much time it takes Lucene to load
&lt;br&gt;&amp;gt; from a hot IO cache, populating its RAM resident data structures?
&lt;br&gt;&lt;br&gt;Hmm, I don't spend a lot of time working with Lucene directly, so I might not
&lt;br&gt;be the person most likely to have data like that at my fingertips. &amp;nbsp;Maybe that
&lt;br&gt;McCandless dude can help you out, he runs a lot of benchmarks. &amp;nbsp;;) 
&lt;br&gt;&lt;br&gt;Or maybe ask the Solr folks? &amp;nbsp;I see them on solr-user all the time talking 
&lt;br&gt;about &amp;quot;MaxWarmingSearchers&amp;quot;. ;)
&lt;br&gt;&lt;br&gt;&amp;gt; OK. Then, you are basically pooling your readers. &amp;nbsp;Ie, you do allow
&lt;br&gt;&amp;gt; in-process sharing, but only among readers.
&lt;br&gt;&lt;br&gt;Not sure about that. Lucy's IndexReader.reopen() would open new SegReaders for
&lt;br&gt;each new segment, but they would be private to each parent PolyReader. &amp;nbsp;So if
&lt;br&gt;you reopened two IndexReaders at the same time after e.g. &amp;nbsp;segment &amp;quot;seg_12&amp;quot;
&lt;br&gt;had been added, each would create a new, private SegReader for &amp;quot;seg_12&amp;quot;.
&lt;br&gt;&lt;div class='shrinkable-quote'&gt;&lt;br&gt;&amp;gt; Refactoring of IndexWriter
&lt;br&gt;&amp;gt; --------------------------
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Key: LUCENE-2026
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; URL: &lt;a href=&quot;https://issues.apache.org/jira/browse/LUCENE-2026&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;https://issues.apache.org/jira/browse/LUCENE-2026&lt;/a&gt;&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Project: Lucene - Java
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Issue Type: Improvement
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Components: Index
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Reporter: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Assignee: Michael Busch
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Priority: Minor
&lt;br&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Fix For: 3.1
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt;
&lt;br&gt;&amp;gt; I've been thinking for a while about refactoring the IndexWriter into
&lt;br&gt;&amp;gt; two main components.
&lt;br&gt;&amp;gt; One could be called a SegmentWriter and as the
&lt;br&gt;&amp;gt; name says its job would be to write one particular index segment. The
&lt;br&gt;&amp;gt; default one just as today will provide methods to add documents and
&lt;br&gt;&amp;gt; flushes when its buffer is full.
&lt;br&gt;&amp;gt; Other SegmentWriter implementations would do things like e.g. appending or
&lt;br&gt;&amp;gt; copying external segments [what addIndexes*() currently does].
&lt;br&gt;&amp;gt; The second component's job would be to manage writing the segments
&lt;br&gt;&amp;gt; file and merging/deleting segments. It would know about
&lt;br&gt;&amp;gt; DeletionPolicy, MergePolicy and MergeScheduler. Ideally it would
&lt;br&gt;&amp;gt; provide hooks that allow users to manage external data structures and
&lt;br&gt;&amp;gt; keep them in sync with Lucene's data during segment merges.
&lt;br&gt;&amp;gt; API wise there are things we have to figure out, such as where the
&lt;br&gt;&amp;gt; updateDocument() method would fit in, because its deletion part
&lt;br&gt;&amp;gt; affects all segments, whereas the new document is only being added to
&lt;br&gt;&amp;gt; the new segment.
&lt;br&gt;&amp;gt; Of course these should be lower level APIs for things like parallel
&lt;br&gt;&amp;gt; indexing and related use cases. That's why we should still provide
&lt;br&gt;&amp;gt; easy to use APIs like today for people who don't need to care about
&lt;br&gt;&amp;gt; per-segment ops during indexing. So the current IndexWriter could
&lt;br&gt;&amp;gt; probably keep most of its APIs and delegate to the new classes.
&lt;/div&gt;&lt;br&gt;-- 
&lt;br&gt;This message is automatically generated by JIRA.
&lt;br&gt;-
&lt;br&gt;You can reply to this email to add a comment to the issue online.
&lt;br&gt;&lt;br&gt;&lt;br&gt;---------------------------------------------------------------------
&lt;br&gt;To unsubscribe, e-mail: &lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26882314&amp;i=0&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;java-dev-unsubscribe@...&lt;/a&gt;
&lt;br&gt;For additional commands, e-mail: &lt;a href=&quot;http://old.nabble.com/user/SendEmail.jtp?type=post&amp;post=26882314&amp;i=1&quot; target=&quot;_top&quot; rel=&quot;nofollow&quot;&gt;java-dev-help@...&lt;/a&gt;
&lt;br&gt;&lt;br&gt;</content>
	<link rel="alternate" type="text/html" href="http://old.nabble.com/-jira--Created%3A-%28LUCENE-2026%29-Refactoring-of-IndexWriter-tp26188404p26882314.html" />
</entry>

</feed>
