synonym payload boosting

Hi,

I have a field and a weighted synonym map. I have indexed the synonyms with the weight as a payload. Here is the code snippet from my filter:

    public Token next(final Token reusableToken) throws IOException
    .
    .
    .
    Payload boostPayload;

    for (Synonym synonym : syns) {

        Token newTok = new Token(nToken.startOffset(), nToken.endOffset(), "SYNONYM");
        newTok.setTermBuffer(synonym.getToken().toCharArray(), 0, synonym.getToken().length());
        // set the position increment to zero
        // this tells lucene the synonym is
        // in the exact same location as the originating word
        newTok.setPositionIncrement(0);
        boostPayload = new Payload(PayloadHelper.encodeFloat(synonym.getWieght()));
        newTok.setPayload(boostPayload);

I have put it in the index-time analyzer. This is my field definition:

    <fieldType name="PersonName" class="solr.TextField" positionIncrementGap="100" >
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="com.digitaltrowel.solr.DTSynonymFactory" FreskoFunction="names_with_scoresPipe23Columns.txt" ignoreCase="true" expand="false"/>
        <!--<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>-->
        <!--<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>-->
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!--<filter class="com.digitaltrowel.solr.DTSynonymFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>-->
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <!--<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>-->
        <!--<filter class="solr.RemoveDuplicatesTokenFilterFactory"/ >-->
      </analyzer>
    </fieldType>

My similarity class is:

    public class BoostingSymilarity extends DefaultSimilarity {

        public BoostingSymilarity() {
            super();
        }

        @Override
        public float scorePayload(String field, byte[] payload, int offset, int length) {
            double weight = PayloadHelper.decodeFloat(payload, 0);
            return (float) weight;
        }

        @Override
        public float coord(int overlap, int maxoverlap) {
            return 1.0f;
        }

        @Override
        public float idf(int docFreq, int numDocs) {
            return 1.0f;
        }

        @Override
        public float lengthNorm(String fieldName, int numTerms) {
            return 1.0f;
        }

        @Override
        public float tf(float freq) {
            return 1.0f;
        }
    }

My problem is that the scorePayload method does not get called at search time like the other methods in my similarity class. I tested and verified this with breakpoints.

What am I doing wrong?

I used Solr 1.3 and am thinking of the payload boost support in Solr 1.4.
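A side note on the scorePayload body in the question: the method receives an offset argument, but the code decodes at index 0. Whether the caller ever passes a nonzero offset is not something this thread establishes, so the following is only a stand-alone sketch of why honoring the offset is the safer habit. The class PayloadOffsetDemo and its two helpers are hypothetical, and the 4-byte big-endian float layout is an assumption about how PayloadHelper encodes floats, not a copy of its code.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; it does not use Lucene.
class PayloadOffsetDemo {

    // Encode a float as 4 big-endian bytes (assumed layout, see lead-in).
    static byte[] encodeFloat(float value) {
        return ByteBuffer.allocate(4).putFloat(value).array();
    }

    // Decode a float that starts at the given offset in the buffer.
    static float decodeFloat(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4).getFloat();
    }

    public static void main(String[] args) {
        // Two payloads packed back to back in one shared buffer.
        byte[] shared = new byte[8];
        System.arraycopy(encodeFloat(0.25f), 0, shared, 0, 4);
        System.arraycopy(encodeFloat(0.75f), 0, shared, 4, 4);

        // Decoding at index 0 always yields the first weight.
        System.out.println(decodeFloat(shared, 0)); // prints 0.25
        // Only the offset-aware read reaches the second weight.
        System.out.println(decodeFloat(shared, 4)); // prints 0.75
    }
}
```

If the payload bytes always start at index 0 the two reads agree, so this does not explain why the method is never called; it only removes one possible source of wrong weights once it is.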

Re: synonym payload boosting

You might get an answer on the Solr list. This is the Lucene users list.

Simon

On Nov 8, 2009 2:24 PM, "David Ginzburg" <davidginzburg@...> wrote:
> [...]
> My problem is that scorePayload method does not get called at search time
> like the other methods in my similarity class.
> [...]

Re: synonym payload boosting

Additionally, you need to modify your query parser to return BoostingTermQuery, PayloadTermQuery, PayloadNearQuery, etc. The scorePayload method is invoked only with these types of queries.

Hope this helps.

--- On Sun, 11/8/09, David Ginzburg <davidginzburg@...> wrote:
> From: David Ginzburg <davidginzburg@...>
> Subject: synonym payload boosting
> To: java-user@...
> Date: Sunday, November 8, 2009, 3:23 PM
>
> [...]
> My problem is that scorePayload method does not get called
> at search time
> like the other methods in my similarity class.
> [...]
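
To make the reply's point concrete, here is a toy model of why a payload hook fires for some query types and not others. It does not use Lucene: PayloadScorer, CountingScorer, plainTermScore and payloadTermScore are all hypothetical names invented for this sketch, and the real classes named in the reply may combine the payload with the score differently.

```java
import java.nio.ByteBuffer;

// Hypothetical toy model; none of these types are Lucene classes.
class PayloadHookDemo {

    // Stand-in for the scoring hook the question overrides.
    interface PayloadScorer {
        float scorePayload(byte[] payload, int offset, int length);
    }

    // Counts invocations so we can see which query type consults the hook.
    static class CountingScorer implements PayloadScorer {
        int calls = 0;

        public float scorePayload(byte[] payload, int offset, int length) {
            calls++;
            // Assumes the payload is a 4-byte big-endian float.
            return ByteBuffer.wrap(payload, offset, length).getFloat();
        }
    }

    // Models an ordinary term query: it scores a match without reading the payload.
    static float plainTermScore(byte[] payload, PayloadScorer scorer) {
        return 1.0f;
    }

    // Models a payload-aware query: it multiplies the base score by the payload factor.
    static float payloadTermScore(byte[] payload, PayloadScorer scorer) {
        return 1.0f * scorer.scorePayload(payload, 0, payload.length);
    }

    public static void main(String[] args) {
        byte[] payload = ByteBuffer.allocate(4).putFloat(0.5f).array();
        CountingScorer scorer = new CountingScorer();

        System.out.println(plainTermScore(payload, scorer));   // prints 1.0
        System.out.println(scorer.calls);                      // prints 0
        System.out.println(payloadTermScore(payload, scorer)); // prints 0.5
        System.out.println(scorer.calls);                      // prints 1
    }
}
```

In the real setup, the corresponding step is the one the reply describes: have the query parser build one of the payload query classes for the PersonName field instead of a plain term query, so that the similarity's scorePayload is consulted at search time.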