|
|
[jira] Created: (SOLR-236) Field collapsing

Field collapsing
----------------

                 Key: SOLR-236
                 URL: https://issues.apache.org/jira/browse/SOLR-236
             Project: Solr
          Issue Type: New Feature
          Components: search
    Affects Versions: 1.2
            Reporter: Emmanuel Keller


This patch includes a new feature called "Field collapsing".

"Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site is collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection."
http://www.fastsearch.com/glossary.aspx?m=48&amid=299

The implementation adds 3 new query parameters (SolrParams):
"collapse" set to true to enable collapsing.
"collapse.field" to choose the field used to group results
"collapse.max" to select how many consecutive results are allowed before collapsing

TODO (in progress):
- More documentation (on source code)
- Test cases

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
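A minimal sketch of how a client might assemble a request with the three parameters described above. The base URL and the field name `manu` are placeholders for a local example setup, and `collapse_query_url` is a hypothetical helper, not part of the patch.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def collapse_query_url(base_url, query, field, max_per_group=1):
    """Build a select URL carrying the collapse parameters named in the issue.

    base_url and field are whatever the caller's Solr instance uses;
    nothing here talks to a server.
    """
    params = {
        "q": query,
        "collapse": "true",                  # enable collapsing
        "collapse.field": field,             # field used to group results
        "collapse.max": str(max_per_group),  # results allowed before collapsing
    }
    return base_url + "?" + urlencode(params)


# Placeholder host and field, for illustration only.
url = collapse_query_url("http://localhost:8983/solr/select/", "*:*", "manu")
decoded = parse_qs(urlsplit(url).query)
print(decoded["collapse.field"], decoded["collapse.max"])
```

`urlencode` percent-encodes the query string, so the helper can be given a raw query such as `*:*` without manual escaping.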
|
|
[jira] Updated: (SOLR-236) Field collapsing

     [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-236:
---------------------------------

    Attachment: collapse_field.patch

Field Collapsing
|
|
[jira] Updated: (SOLR-236) Field collapsing

     [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-236:
---------------------------------

    Attachment: collapse_field.patch

Replacing HashDocSet with BitDocSet for hasMoreResult, for better performance.
|
|
[jira] Commented: (SOLR-236) Field collapsing

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495334 ]

Ryan McKinley commented on SOLR-236:
------------------------------------

This looks good. Someone with better lucene chops should look at the IndexSearcher getDocListAndSet part...

A few comments/questions about the interface:

If you apply all the example docs and hit:
http://localhost:8983/solr/select/?q=*:*&collapse=true
you get a 500. We should use:
  params.required().get( "collapse.field" )
to get a nicer error.

With:
http://localhost:8983/solr/select/?q=*:*&collapse=true&collapse.field=manu&collapse.max=1

the collapse info at the bottom says:
<lst name="collapse_counts">
  <int name="has_more_results">3</int>
  <int name="has_more_results">5</int>
  <int name="has_more_results">9</int>
</lst>

What does that mean? How would you use it? How does it relate to the <result> docs?
|
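The point about `params.required().get( "collapse.field" )` is that a missing parameter should surface as a clear client error rather than a generic 500. The sketch below illustrates that idea only; `MissingRequiredParameter` and `required_param` are hypothetical names, and this is not Solr's implementation.

```python
class MissingRequiredParameter(Exception):
    """Hypothetical error a request handler could map to an HTTP 400."""

    status = 400


def required_param(params, name):
    """Return params[name], or fail with a message naming the parameter."""
    value = params.get(name)
    if value is None or value == "":
        raise MissingRequiredParameter(f"Missing required parameter: {name}")
    return value


# A request that enables collapsing but forgets the field.
request_params = {"q": "*:*", "collapse": "true"}
try:
    required_param(request_params, "collapse.field")
    error_message = None
except MissingRequiredParameter as exc:
    error_message = str(exc)

print(error_message)
```

Checking up front means the caller sees which parameter is missing instead of a failure from deeper in the search code.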
|
[jira] Commented: (SOLR-236) Field collapsing

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495356 ]

Emmanuel Keller commented on SOLR-236:
--------------------------------------

My turn to miss something ;)
You are right, we have to use params.required().get("collapse.field").

About the collapse info:
<int name="has_more_results">3</int> means that the third doc of the result has been collapsed, and that some consecutive results with the same field value have been removed.
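As a rough illustration of how a client could read this block, the sketch below pulls the `has_more_results` values out of the `collapse_counts` fragment quoted earlier in the thread. Treating each value as a position in the result list follows the explanation above; whether positions are counted from 0 or 1 is not stated, so the code only extracts the numbers.

```python
import xml.etree.ElementTree as ET

# The collapse_counts fragment from the earlier comment in this thread.
COLLAPSE_XML = """
<lst name="collapse_counts">
  <int name="has_more_results">3</int>
  <int name="has_more_results">5</int>
  <int name="has_more_results">9</int>
</lst>
"""


def collapsed_positions(xml_text):
    """Return the integer values of every has_more_results entry, in order."""
    root = ET.fromstring(xml_text.strip())
    return [
        int(el.text)
        for el in root.findall("int")
        if el.get("name") == "has_more_results"
    ]


positions = collapsed_positions(COLLAPSE_XML)
print(positions)
```

A UI could use these positions to decide which result entries get a "more results like this" link.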
|
|
[jira] Commented: (SOLR-236) Field collapsing

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495367 ]

Yonik Seeley commented on SOLR-236:
-----------------------------------

Thanks for looking into this, Emmanuel.
It appears as if this only collapses adjacent documents, correct?

We should really try to get everyone on the same page... hash out the exact semantics of "collapsing", and the most useful interface. An efficient implementation can follow.

A good starting point might be here:
|
|
[jira] Commented: (SOLR-236) Field collapsing

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495368 ]

Yonik Seeley commented on SOLR-236:
-----------------------------------

A good starting point might be here:
http://www.nabble.com/result-grouping--tf2910425.html#a8131895
|
|
[jira] Commented: (SOLR-236) Field collapsing

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12495376 ]

Emmanuel Keller commented on SOLR-236:
--------------------------------------

Yonik,

You are right, only adjacent documents are collapsed.
I work on a large index (2,000,000 documents) that grows every day. The first goal was to group results while preserving score ranking and achieving good performance. This "light" implementation meets our needs. I am currently working on a second implementation that takes care of the semantics.

P.S.: Congratulations on this great application.
|
|
[jira] Updated: (SOLR-236) Field collapsing

     [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-236:
---------------------------------

    Attachment: field_collapsing.patch

This release conforms better to the semantics of "field collapsing".

Parameters are:
collapse=true                     // enable collapsing
collapse.field=[field]            // indexed field used for collapsing
collapse.max=[integer]            // start collapsing after n documents
collapse.type=[normal|adjacent]   // default value is "normal"
- "adjacent" collapses only consecutive documents.
- "normal" collapses all documents having an equal collapsing field.
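To make the difference between the two `collapse.type` values concrete, here is a small in-memory sketch. It reflects one reading of the descriptions above and is not the patch's algorithm: `collapse`, the `site` field, and the sample documents are all made up for illustration, and rejecting an unknown type with `ValueError` is a choice made for this sketch.

```python
from collections import defaultdict


def collapse(docs, field, max_per_group=1, collapse_type="normal"):
    """Collapse a ranked list of documents (dicts) on one field.

    "normal":   keep at most max_per_group docs per field value overall.
    "adjacent": keep at most max_per_group docs per run of consecutive
                docs sharing the same field value.
    Returns (kept_docs, hidden_counts), preserving the original ranking.
    """
    if collapse_type not in ("normal", "adjacent"):
        raise ValueError(f"unknown collapse type: {collapse_type}")

    kept = []
    hidden = defaultdict(int)       # field value -> number of docs dropped
    seen = defaultdict(int)         # used by "normal"
    run_value, run_length = object(), 0  # used by "adjacent"

    for doc in docs:
        value = doc.get(field)
        if collapse_type == "normal":
            seen[value] += 1
            count = seen[value]
        else:
            if value == run_value:
                run_length += 1
            else:
                run_value, run_length = value, 1
            count = run_length
        if count <= max_per_group:
            kept.append(doc)
        else:
            hidden[value] += 1
    return kept, dict(hidden)


# Hypothetical ranked results; "site" is the collapsing field.
ranked = [
    {"id": "a", "site": "A"},
    {"id": "b", "site": "A"},
    {"id": "c", "site": "B"},
    {"id": "d", "site": "A"},
]
normal_kept, normal_hidden = collapse(ranked, "site", 1, "normal")
adjacent_kept, adjacent_hidden = collapse(ranked, "site", 1, "adjacent")
print([d["id"] for d in normal_kept], normal_hidden)
print([d["id"] for d in adjacent_kept], adjacent_hidden)
```

With these sample documents, "normal" hides both later `A` documents, while "adjacent" hides only `b`, because `d` starts a new run after `c`.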
|
|
[jira] Updated: (SOLR-236) Field collapsing

     [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-236:
---------------------------------

    Attachment: field_collapsing.patch

Corrects a bug in the previous version when using a value greater than 1 for the collapse.max parameter.
|
|
[jira] Updated: (SOLR-236) Field collapsing

     [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-236:
---------------------------------

    Description:
This patch includes a new feature called "Field collapsing".

"Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site is collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection."
http://www.fastsearch.com/glossary.aspx?m=48&amid=299

The implementation adds 4 new query parameters (SolrParams):
"collapse" set to true to enable collapsing.
"collapse.field" to choose the field used to group results
"collapse.type" normal (default value) or adjacent
"collapse.max" to select how many consecutive results are allowed before collapsing

TODO (in progress):
- More documentation (on source code)
- Test cases
|
|
[jira] Commented: (SOLR-236) Field collapsing

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496617 ]

Otis Gospodnetic commented on SOLR-236:
---------------------------------------

Question:
Do you need collapse=true when you can detect whether collapse.field has been specified or not?
|
|
[jira] Commented: (SOLR-236) Field collapsing

    [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496805 ]

Emmanuel Keller commented on SOLR-236:
--------------------------------------

You're right. As collapse.field is a required field, we don't need more information.
My first idea was to copy the behavior of facet.
|
|
[jira] Updated: (SOLR-236) Field collapsing

     [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-236:
---------------------------------

    Attachment: field_collapsing.patch

The latest version of the patch.

- Results are now cached using "CollapseCache" (a new instance of SolrCache added in solrconfig.xml)
- The parameter "collapse" has been removed.

This version has been fully tested. Feedback is welcome.
|
|
[jira] Updated: (SOLR-236) Field collapsing

     [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-236:
---------------------------------

    Attachment: field_collapsing_1.1.0.patch

I still maintain a version for release 1.1.0 (the version we used in our production environment).
|
|
[jira] Updated: (SOLR-236) Field collapsing

     [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-236:
---------------------------------

    Description:
This patch includes a new feature called "Field collapsing".

"Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site is collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection."
http://www.fastsearch.com/glossary.aspx?m=48&amid=299

The implementation adds 3 new query parameters (SolrParams):
"collapse.field" to choose the field used to group results
"collapse.type" normal (default value) or adjacent
"collapse.max" to select how many consecutive results are allowed before collapsing

TODO (in progress):
- More documentation (on source code)
- Test cases

Two patches:
- "field_collapsing.patch" for current development version (1.2)
- "field_collapsing_1.1.0.patch" for Solr-1.1.0

P.S.: Feedback and spelling corrections are welcome ;-)
|
|
[jira] Updated: (SOLR-236) Field collapsing

[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-236:
-------------------------------
Attachment: SOLR-236-FieldCollapsing.patch

I updated the patch so that it applies cleanly with trunk. While I was at it, I:
* fixed a few spelling errors
* made the "collapse.type" parameter parsing throw an error if the passed value is unknown (rather than quietly using 'normal')
* changed the patch name to include the issue number -- as we update the patch, use this same name again so it is easy to tell which is the most current.

I also made a wiki page so there are direct links to interesting queries:
http://wiki.apache.org/solr/FieldCollapsing

- - - - - - -

Again, I will leave any discussion about the Lucene implementation to others more qualified and will just focus on the response interface.

Currently, if you send the query:
http://localhost:8983/solr/select/?q=*:*&collapse.field=cat&collapse.max=1&collapse.type=normal

you get a response that looks like:

<lst name="collapse_counts">
  <int name="hard">1</int>
  <int name="electronics">2</int>
  <int name="memory">2</int>
  <int name="monitor">1</int>
  <int name="software">1</int>
</lst>

It looks like that says: for the field 'cat', there is one more result with cat=hard, 2 more results with cat=electronics, ...

How is a client supposed to know how to deal with that? "hard" is a token from "hard drive" -- unless it were a 'string' field, the client would need to know how to do that tokenization -- or the response needs to change.

From a client's perspective, it would be more useful to have output that looked something like:

<lst name="collapse_counts">
  <str name="field">cat</str>
  <lst name="doc">
    <int name="SP2514N">1</int>
    <int name="6H500F0">1</int>
    <int name="VS1GB400C3">2</int>
    <int name="VS1GB400C3">1</int>
  </lst>
  <lst name="count">
    <int name="hard">1</int>
    <int name="electronics">1</int>
    <int name="memory">2</int>
    <int name="monitor">1</int>
  </lst>
</lst>

"field" says what field was collapsed on,
"doc" is a map of doc id -> how many more collapsed on that field,
"count" is a map of 'token' -> how many more collapsed on that field.

This way, the client would know which collapse counts apply to which documents without knowing about the schema.

Thoughts? |
|
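A minimal sketch of how a client might read the proposed "collapse_counts" layout into plain maps, written in Python with only the standard library. The sample is abbreviated from the proposal above, and the function name parse_collapse_counts is hypothetical; nothing here is part of the patch itself.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the proposed response layout (sample data only).
PROPOSED = """
<lst name="collapse_counts">
  <str name="field">cat</str>
  <lst name="doc">
    <int name="SP2514N">1</int>
    <int name="6H500F0">1</int>
  </lst>
  <lst name="count">
    <int name="hard">1</int>
    <int name="electronics">1</int>
    <int name="memory">2</int>
    <int name="monitor">1</int>
  </lst>
</lst>
"""


def parse_collapse_counts(xml_text):
    """Return (field, doc_counts, value_counts) from a collapse_counts element."""
    root = ET.fromstring(xml_text.strip())
    field = root.find("str[@name='field']").text

    def to_map(list_name):
        # Turn <int name="key">n</int> children into {key: n}.
        node = root.find(f"lst[@name='{list_name}']")
        if node is None:
            return {}
        return {e.get("name"): int(e.text) for e in node.findall("int")}

    return field, to_map("doc"), to_map("count")


field, docs, counts = parse_collapse_counts(PROPOSED)
print(field, docs, counts)
```

With this shape the client only needs the document ids it already has from the result list; it never has to re-tokenize the field value to find the matching count.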
|
[jira] Updated: (SOLR-236) Field collapsing

[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-236:
---------------------------------
Attachment: SOLR-236-FieldCollapsing.patch

Right, it's more useful. This new version returns the result in the layout you expect.

You should add the following constraint to the wiki: the collapsing field must be un-tokenized. |
|
|
[jira] Commented: (SOLR-236) Field collapsing

[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501405 ]

Ryan McKinley commented on SOLR-236:
------------------------------------

I just took a look at this using the example data:
http://localhost:8983/solr/select/?q=*:*&collapse.field=cat&collapse.max=1&collapse.type=normal&rows=10

<lst name="collapse_counts">
  <str name="field">cat</str>
  <lst name="doc">
    <int>1</int>
    <int name="1">2</int>
    <int name="2">2</int>
    <int name="4">1</int>
    <int name="7">1</int>
  </lst>
  <lst name="count">
    <int>1</int>
    <int name="card">2</int>
    <int name="drive">2</int>
    <int name="hard">1</int>
    <int name="music">1</int>
  </lst>
</lst>

- - -

What is the "<int>1</int>" at the front of each list in the response?

Perhaps the 'doc' results should be renamed 'offset' or 'index', with another list named 'doc' that uses the uniqueKey as the index... this would be useful for building a Map.

- - -

Also, check:
http://localhost:8983/solr/select/?q=*:*&collapse.field=cat&collapse.max=1&collapse.type=adjacent&rows=50

which fails with: ArrayIndexOutOfBoundsException

- - -

> You should add the following constraint on the wiki: The collapsing field must be un-tokenized.

Anyone can edit the wiki (you just have to make an account) -- it would be great if you could help keep the page accurate / useful. JIRA discussion comment trails don't work so well at that...

Re: tokenized... what about it does not work? Are the limitations any different if it is multi-valued? Is it just that if any token matches within the field it will collapse, and that may or may not be what you expect?

- - -

Did you get a chance to look at the questions from the previous discussion? I just noticed Yonik posted something new there:
http://www.nabble.com/result-grouping--tf2910425.html#a10959848 |
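To make the unnamed "<int>1</int>" problem concrete, here is a small Python sketch (standard library only) that reads the response quoted in this comment and keeps unnamed entries apart from named ones, so a client-side map never ends up with a missing key. The helper name split_entries is hypothetical, and the sketch makes no claim about what the unnamed entry actually means.

```python
import xml.etree.ElementTree as ET

# Response as quoted in the comment above (example data).
ACTUAL = """
<lst name="collapse_counts">
  <str name="field">cat</str>
  <lst name="doc">
    <int>1</int>
    <int name="1">2</int>
    <int name="2">2</int>
    <int name="4">1</int>
    <int name="7">1</int>
  </lst>
  <lst name="count">
    <int>1</int>
    <int name="card">2</int>
    <int name="drive">2</int>
    <int name="hard">1</int>
    <int name="music">1</int>
  </lst>
</lst>
"""


def split_entries(parent):
    """Separate <int> children with a name attribute from those without one."""
    named, unnamed = {}, []
    for entry in parent.findall("int"):
        name = entry.get("name")
        if name is None:
            # No key to map this value to; keep it aside instead of guessing.
            unnamed.append(int(entry.text))
        else:
            named[name] = int(entry.text)
    return named, unnamed


root = ET.fromstring(ACTUAL.strip())
doc_named, doc_unnamed = split_entries(root.find("lst[@name='doc']"))
count_named, count_unnamed = split_entries(root.find("lst[@name='count']"))
print(doc_named, doc_unnamed)
print(count_named, count_unnamed)
```

Note that the keys in the 'doc' list here are result offsets rather than uniqueKey values, which is why a client cannot join them to documents without also tracking positions in the result list.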
|
|
[jira] Updated: (SOLR-236) Field collapsing

[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Emmanuel Keller updated SOLR-236:
---------------------------------
Attachment: SOLR-236-FieldCollapsing.patch

Sorry, my last post was buggy. Here is the correct one. The exception no longer occurs.

About tokens: if any token matches within the field, it will collapse. When I started implementing collapsing, my need was to group documents whose field values are exactly identical. I believe that faceting behaves the same way. Look at "graphic card" as an example:
http://localhost:8983/solr/select/?q=cat:graphic%20card&version=2.2&start=0&rows=10&indent=on&facet=true&facet.field=cat

I will try to maintain the wiki page. |
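The difference between collapsing on whole field values and collapsing on tokens can be illustrated with a toy Python sketch. It uses plain whitespace splitting as a stand-in for Solr's analysis chain, and the document ids and values are made up for illustration; it is not the patch's algorithm.

```python
from collections import defaultdict

# Hypothetical documents; "cat" holds the raw field value.
docs = {
    "A": {"cat": "graphics card"},
    "B": {"cat": "hard drive"},
    "C": {"cat": "sound card"},
}


def group_by_exact_value(documents, field):
    """Group doc ids by the whole field value (un-tokenized behavior)."""
    groups = defaultdict(list)
    for doc_id, fields in documents.items():
        groups[fields[field]].append(doc_id)
    return dict(groups)


def group_by_token(documents, field):
    """Group doc ids by each whitespace token (rough tokenized behavior)."""
    groups = defaultdict(list)
    for doc_id, fields in documents.items():
        for token in fields[field].lower().split():
            groups[token].append(doc_id)
    return dict(groups)


exact = group_by_exact_value(docs, "cat")
by_token = group_by_token(docs, "cat")
print(exact)
print(by_token)
```

On an un-tokenized field each document lands in its own group here, while on a tokenized field "A" and "C" share the "card" group even though their full values differ, which may or may not be what a user expects.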