quote char appearing in string literal, how to parse?

I need a way to parse this string:

    "m>\AF\A4}"\B7p\28^\?"

where all of the double quote characters are present, that is, the string is delimited by the quotes at the beginning and end, and a double quote appears as part of the string. I had hoped for a simple TOKEN but cannot make that work. My current TOKEN looks like this:

    | <#CHARACTER: ["a"-"z","A"-"Z","0"-"9","/","'","`","="," ",",","(",")","*","-",";","|","&","\ \",".",":","$","!","@","#","%","^","_","+","?","<",">","~","{","}"] >
    | <STRING_LITERAL: ( "\"" <CHARACTER> (<CHARACTER>)* "\"" ) | "\"\"" >

Is there a way to have the string above matched by <STRING_LITERAL>, or am I off-track and need to do something else? If so, what?

Re: quote char appearing in string literal, how to parse?

Given that strings are delimited by "s, the only way to define them as containing a literal " is to escape it somehow.

E.g. in C etc., internal quotes are escaped by a \. Then your definition can become:

    <STRING_LITERAL: "\"" ("\\" "\"" | <NOT_QUOTE_CHAR>)* "\"">

or, if you escape it by doubling it as in, say, Pascal or Delphi, then it is even simpler, as you can leave out the escape branch above (so long as you allow in your grammar for a string to appear as a sequence of one or more <STRING_LITERAL>s, since "xxx""yyy" is the representation for the literal value xxx"yyy but will parse as 2 consecutive <STRING_LITERAL>s).

If you can't double it, you need to define some escape semantics (e.g. it's not a delimiter unless it has whitespace against it, or something like that).

Generally, however, follow the why-reinvent-the-wheel concept and realise that this problem has been around for ages and is generally solved in 3 standard ways:

- no escapes, but provide multiple delimiters, e.g. " and ', so you can make a string containing a " by delimiting it with 's instead, etc.
- some escape semantic, usually \, that makes the following character non-special
- doubling the delimiter to include it once in the value (as CSV files do), implemented by allowing in the grammar a sequence of string literal tokens, with the concatenation rule that a delimiter character is inserted into the value between each.

On Fri, May 8, 2009 at 9:32 AM, Terry Gardner <Terry.Gardner@...> wrote:
> I need a way to parse this string:

--
J.Chris Findlay (c:
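
For anyone who wants to try the backslash-escape and doubled-delimiter conventions from the reply above outside of a grammar file, here is a minimal sketch in plain Java using java.util.regex rather than JavaCC. The class and method names (StringLiteralSketch, valueOfEscaped, valueOfDoubled) are made up for illustration. One assumption differs from the token in the reply: the escape pattern here treats a backslash as escaping whatever character follows it, not only a quote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JavaCC: regex stand-ins for the two token styles.
final class StringLiteralSketch {

    // Backslash convention (assumption: backslash escapes ANY following char).
    // quote, then (backslash+char | char that is neither quote nor backslash)*, then quote.
    static final Pattern ESCAPED = Pattern.compile("\"((?:\\\\.|[^\"\\\\])*)\"");

    // Doubling convention: each token is a plain literal with no quote inside.
    static final Pattern SIMPLE = Pattern.compile("\"([^\"]*)\"");

    // Returns the value of a backslash-escaped literal, with the escapes removed.
    static String valueOfEscaped(String literal) {
        Matcher m = ESCAPED.matcher(literal);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a string literal: " + literal);
        }
        // Drop the backslash in front of each escaped character.
        return m.group(1).replaceAll("\\\\(.)", "$1");
    }

    // Reads a run of adjacent simple literals and joins their values with a
    // quote character, which is the concatenation rule described in the reply.
    static String valueOfDoubled(String text) {
        Matcher m = SIMPLE.matcher(text);
        StringBuilder value = new StringBuilder();
        boolean first = true;
        int pos = 0;
        while (pos < text.length()) {
            m.region(pos, text.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("no string literal at offset " + pos);
            }
            if (!first) {
                value.append('"'); // one delimiter between consecutive literals
            }
            value.append(m.group(1));
            first = false;
            pos = m.end();
        }
        if (first) {
            throw new IllegalArgumentException("empty input");
        }
        return value.toString();
    }

    public static void main(String[] args) {
        System.out.println(valueOfEscaped("\"a\\\"b\""));     // prints a"b
        System.out.println(valueOfDoubled("\"xxx\"\"yyy\"")); // prints xxx"yyy
    }
}
```

In a real grammar the joining step would live in the parser production that accepts one or more <STRING_LITERAL> tokens; the loop above only imitates that.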

Re: quote char appearing in string literal, how to parse?

Thanks for your comments, they caused me to look at the problem in a new way: I was trying to use the same <STRING_LITERAL> everywhere, but as it turns out, my problem string can only occur in a couple of places in the input, so creating a one-off set enabled me to use the same <STRING_LITERAL> everywhere except where my problem string occurs, thereby isolating the problem. Since isolation, I was able to create a one-off that solves the problem. Thanks.

On May 7, 2009, at 5:45 PM, J.Chris Findlay wrote:
> Given that strings are delimited by "s, the only way to define them as containing a literal " is to escape it somehow.