[jira] Created: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

Dismax parser exceptions on trailing OPERATOR
---------------------------------------------

Key: SOLR-874
URL: https://issues.apache.org/jira/browse/SOLR-874
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 1.3
Reporter: Erik Hatcher

Dismax is supposed to be immune to parse exceptions, but alas it's not:

http://localhost:8983/solr/select?defType=dismax&qf=name&q=ipod+AND

kaboom!

Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 'ipod AND': Encountered "<EOF>" at line 1, column 8.
Was expecting one of:
    <NOT> ...
    "+" ...
    "-" ...
    "(" ...
    "*" ...
    <QUOTED> ...
    <TERM> ...
    <PREFIXTERM> ...
    <WILDTERM> ...
    "[" ...
    "{" ...
    <NUMBER> ...
    <TERM> ...
    "*" ...

    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:175)
    at org.apache.solr.search.DismaxQParser.parse(DisMaxQParserPlugin.java:138)
    at org.apache.solr.search.QParser.getQuery(QParser.java:88)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
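
A small aside on the report above: the URL carries `q=ipod+AND`, while the exception quotes 'ipod AND'. In a URL query string, `+` is the encoded form of a space, so the parser receives an 8-character string and hits end of input right after the operator. A minimal sketch of that decoding step, using only the JDK (the class and method names are made up for illustration and are not part of Solr):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, for illustration only; not Solr code.
class QueryParamSketch {

    // Decodes a raw query-string value the way a servlet container would
    // before the value reaches the query parser ('+' becomes a space).
    static String decode(String rawValue) {
        try {
            return URLDecoder.decode(rawValue, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String decoded = decode("ipod+AND");
        System.out.println("'" + decoded + "' has length " + decoded.length());
    }
}
```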

[jira] Commented: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

    [ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12649505#action_12649505 ]

Mark Miller commented on SOLR-874:
----------------------------------

Support for AND and OR escaping needed - only I hate to see a scan for AND and OR on every term for every query just to support this...but to quote Erik: "dismax is not to generate a parse error", so I guess it can't be helped?

My real dream would be to get those darn unprecedent working AND and OR oddities out of Lucene syntax...

[jira] Commented: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

    [ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730492#action_12730492 ]

Peter Wolanin commented on SOLR-874:
------------------------------------

I get the same sort of exception with a *leading* operator and the dismax handler.

Jul 13, 2009 1:47:06 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse 'OR vti OR bin OR vti OR aut OR author OR dll': Encountered " <OR> "OR "" at line 1, column 0.
Was expecting one of:
    <NOT> ...
    "+" ...
    "-" ...
    "(" ...
    "*" ...
    <QUOTED> ...
    <TERM> ...
    <PREFIXTERM> ...
    <WILDTERM> ...
    "[" ...
    "{" ...
    <NUMBER> ...
    <TERM> ...
    "*" ...

    at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:110)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)

[jira] Commented: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

    [ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730513#action_12730513 ]

Peter Wolanin commented on SOLR-874:
------------------------------------

Possibly a fix could be rolled into this existing method in SolrPluginUtils.java?

{code}
  /**
   * Strips operators that are used illegally, otherwise returns its
   * input.  Some examples of illegal user queries are: "chocolate +-
   * chip", "chocolate - - chip", and "chocolate chip -".
   */
  public static CharSequence stripIllegalOperators(CharSequence s) {
    String temp = CONSECUTIVE_OP_PATTERN.matcher( s ).replaceAll( " " );
    return DANGLING_OP_PATTERN.matcher( temp ).replaceAll( "" );
  }
{code}

This seems only to be called from:

org/apache/solr/search/DisMaxQParser.java:156: userQuery = SolrPluginUtils.stripIllegalOperators(userQuery).toString();
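
To make the suggestion above concrete, here is a rough, standalone sketch of what "rolling the fix into" such a method might look like. The two `+`/`-` patterns are illustrative only: the real `CONSECUTIVE_OP_PATTERN` and `DANGLING_OP_PATTERN` constants are not shown in this thread and may differ. The third pattern, which drops a bare AND/OR at the start of the query or a bare AND/OR/NOT at the end, is a hypothetical extension and not something taken from Solr.

```java
import java.util.regex.Pattern;

// Illustrative sketch only; names and patterns are assumptions, not Solr source.
class StripOperatorsSketch {

    // Two or more +/- operators in a row, optionally separated by whitespace.
    private static final Pattern CONSECUTIVE_OPS =
            Pattern.compile("\\s+[+-](?:\\s*[+-]+)+");

    // +/- operators (and whitespace) left dangling at the end of the query.
    private static final Pattern DANGLING_OPS =
            Pattern.compile("\\s+[-+\\s]+$");

    // Hypothetical addition: a bare AND/OR at the very start of the query,
    // or a bare AND/OR/NOT at the very end.
    private static final Pattern EDGE_WORD_OPS =
            Pattern.compile("^\\s*(?:AND|OR)\\s+|\\s+(?:AND|OR|NOT)\\s*$");

    static String stripIllegalOperators(String s) {
        String out = CONSECUTIVE_OPS.matcher(s).replaceAll(" ");
        out = DANGLING_OPS.matcher(out).replaceAll("");
        // Repeat until stable so that inputs like "ipod AND OR" are fully cleaned.
        String prev;
        do {
            prev = out;
            out = EDGE_WORD_OPS.matcher(out).replaceAll("");
        } while (!out.equals(prev));
        return out.trim();
    }

    public static void main(String[] args) {
        System.out.println(stripIllegalOperators("ipod AND"));
        System.out.println(stripIllegalOperators("OR vti OR bin"));
        System.out.println(stripIllegalOperators("chocolate chip -"));
    }
}
```

Note that this strips the offending operators rather than escaping them, so a user who really wants to search for the word "AND" at the end of a query would lose it. Operators between terms (as in "vti OR bin") are left for the parser.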

[jira] Updated: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

    [ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Wolanin updated SOLR-874:
-------------------------------

Attachment: SOLR-874.patch

Here's a simple patch that escapes with a \. It prevents the exception, however, this fails to match and/or/not (after removing those from the stopwords file) so it's clearly not quite right.
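
The attached patch itself is not reproduced in this thread, so the following is only a guess at the general idea of "escaping with a \": prefix any bare AND, OR or NOT token with a backslash so the Lucene parser reads it as an ordinary term. The class and method names are hypothetical, and the sketch ignores quoted phrases entirely.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringJoiner;

// Hypothetical sketch; this is NOT the SOLR-874.patch attachment.
class EscapeWordOperatorsSketch {

    private static final Set<String> WORD_OPERATORS =
            new HashSet<>(Arrays.asList("AND", "OR", "NOT"));

    // Splits on whitespace and backslash-escapes tokens that are exactly
    // AND, OR or NOT. Everything else is passed through unchanged.
    static String escapeWordOperators(String userQuery) {
        StringJoiner out = new StringJoiner(" ");
        for (String token : userQuery.trim().split("\\s+")) {
            out.add(WORD_OPERATORS.contains(token) ? "\\" + token : token);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeWordOperators("ipod AND"));
        System.out.println(escapeWordOperators("OR vti OR bin"));
    }
}
```

This only addresses the parse error. As noted in the comment above, the escaped words then failed to match in Peter's testing, and the sketch does nothing about that.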

[jira] Commented: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

    [ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731292#action_12731292 ]

Michael Haag commented on SOLR-874:
-----------------------------------

Peter, thanks for keeping our support group in the loop on this issue.

Just to make sure I understand: your patch below would work ok for Acquia hosted search since our dismax handler config doesn't make use of boolean expressions anyway. Correct?

-m

On Jul 14, 2009, at 5:27 PM, Peter Wolanin (JIRA) wrote:

> Attachment: SOLR-874.patch
>
> Here's a simple patch that escapes with a \. It prevents the exception, however, this fails to match and/or/not (after removing those from the stopwords file) so it's clearly not quite right.

[jira] Commented: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

    [ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736640#action_12736640 ]

Mark Miller commented on SOLR-874:
----------------------------------

There is also a problem with && and ||
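
For the `&&` and `||` case, one possible approach (an assumption, not something confirmed in this thread as the eventual fix) is to backslash-escape every `&` and `|` character before the query reaches the Lucene parser. A minimal sketch with a made-up class name:

```java
// Hypothetical sketch; not taken from Solr or from any attached patch.
class EscapeSymbolOperatorsSketch {

    // Prefixes each '&' and '|' with a backslash so that "&&" and "||"
    // are no longer read as boolean operators.
    static String escapeAmpersandAndPipe(String userQuery) {
        return userQuery.replaceAll("([&|])", "\\\\$1");
    }

    public static void main(String[] args) {
        System.out.println(escapeAmpersandAndPipe("ipod && nano"));
        System.out.println(escapeAmpersandAndPipe("ipod ||"));
    }
}
```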

[jira] Commented: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

    [ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12771932#action_12771932 ]

Peter Wolanin commented on SOLR-874:
------------------------------------

Anyone have an approach for this bug so we can get it fixed before 1.4 is done?

[jira] Commented: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

    [ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774484#action_12774484 ]

Jake Brownell commented on SOLR-874:
------------------------------------

I've also observed dismax blow up if the query starts with more than a single dash, e.g. --john grisham. It doesn't appear to mind multiple leading dashes elsewhere in the query string.

[jira] Updated: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

    [ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-874:
---------------------------------------

Fix Version/s: 1.5

[jira] Updated: (SOLR-874) Dismax parser exceptions on trailing OPERATOR

    [ https://issues.apache.org/jira/browse/SOLR-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Darroch updated SOLR-874:
-------------------------------

Attachment: SOLR-874-1.3.patch

Hi, I'm one of the httpd devs but I thought I'd throw in this patch for Solr 1.3 (I'll try to make one for trunk later) which handles a number of the issues raised in this report for us.

First, & and | are escaped, and the dismax logic is changed a little so that if the various query-munging methods return a blank string, we fall back to using the configured default query.

Next, consecutive + or - chars are flattened to a single char; this handles cases where a user might accidentally type --foo when they just mean -foo. Strings of mixed + and - chars are removed, since we have no way of knowing the user's intent with something like +-foo or similar. Together these two steps handle one of the reported cases where the query starts with multiple + or - operators.

Any remaining + or - chars which trail the last term, or which have whitespace on their right side, are removed. Our users found it puzzling in the extreme that a search on "questions 1 - 10" explicitly excluded results with "10" in them, because "- 10" is treated as -10. So we just remove any + or - operators which aren't right up against the following term.

Finally, we escape AND, OR, and NOT when they appear outside of quotes, and remove any trailing unmatched quote. This changes the previous behaviour which removes all quotes if they aren't perfectly balanced; we felt this was more in line with what users expect if they mistype and enter an extra quote char.

So far I haven't been able to generate any Lucene query parser exceptions with this code, but it doesn't mean it's perfect, obviously -- there may still be some way to slip an invalid Lucene query past it. But I'm cautiously optimistic that it covers all or most of the issues raised so far in the thread.
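
The `+`/`-`, quote and fallback steps described in the comment above can be sketched roughly as follows. This is not the SOLR-874-1.3.patch attachment, whose contents are not shown in this thread. The class name, the method names, the `defaultQuery` parameter and the regular expressions are all assumptions made for illustration. Escaping of `&`, `|` and AND/OR/NOT is left out to keep the sketch short, quoted phrases and backslash-escaped quotes are not treated specially, and the final whitespace collapse is a tidy-up that the comment does not mention.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the described cleanup order; not the actual patch.
class DismaxCleanupSketch {

    // Runs of the same operator, e.g. "--" or "+++".
    private static final Pattern REPEATED_OP = Pattern.compile("([+-])\\1+");

    // Runs of + and - that contain both characters, e.g. "+-" or "-+-".
    private static final Pattern MIXED_OPS =
            Pattern.compile("[+-]*(?:\\+-|-\\+)[+-]*");

    // A + or - that is followed by whitespace or by the end of the query.
    private static final Pattern DETACHED_OP = Pattern.compile("[+-](?=\\s|$)");

    // Removes the last double quote when the query has an odd number of them.
    static String dropTrailingUnmatchedQuote(String q) {
        long quotes = q.chars().filter(c -> c == '"').count();
        if (quotes % 2 == 1) {
            int last = q.lastIndexOf('"');
            return q.substring(0, last) + q.substring(last + 1);
        }
        return q;
    }

    static String clean(String userQuery, String defaultQuery) {
        String q = REPEATED_OP.matcher(userQuery).replaceAll("$1"); // --foo -> -foo
        q = MIXED_OPS.matcher(q).replaceAll("");                    // +-foo -> foo
        q = DETACHED_OP.matcher(q).replaceAll("");                  // "1 - 10" loses the "-"
        q = dropTrailingUnmatchedQuote(q);
        q = q.replaceAll("\\s+", " ").trim();                       // tidy-up, my addition
        return q.isEmpty() ? defaultQuery : q;                      // fall back when blank
    }

    public static void main(String[] args) {
        System.out.println(clean("--john grisham", "*:*"));
        System.out.println(clean("questions 1 - 10", "*:*"));
        System.out.println(clean("- +", "*:*"));
    }
}
```

Hyphenated words such as "wi-fi" pass through unchanged, because the `-` is followed directly by a letter. Whether that matches the behaviour of the real patch is not something this thread confirms.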