|
View:
New views
9 Messages
—
Rating Filter:
Alert me
|
|
|
Grammar for parsing regular expressionHello,
I'm working on grammar for ActionScript language. I learned a lot from java grammar include with javacc examples. Now I want to parse regular expression. My first idea was to create token similar to string token, but it looks that it isn't possible to decide was is regex and what is math expression on the token level. Since there are regular expression in java I was looking for examples in java 1.5 grammar. No luck. Since I don't want to reinvent wheel :-), anyone knows where I can find grammar for javacc with regular expression parsing? Thank you in advance Krzysztof |
|
|
RE: Grammar for parsing regular expressionI'm going to guess that the answer is to use lexical states. That's how the same characters can have different meanings in different situations. Date: Wed, 15 Apr 2009 14:04:05 +0200 From: klangner@... To: users@... Subject: [JavaCC] Grammar for parsing regular expression Hello, I'm working on grammar for ActionScript language. I learned a lot from java grammar include with javacc examples. Now I want to parse regular expression. My first idea was to create token similar to string token, but it looks that it isn't possible to decide was is regex and what is math expression on the token level. Since there are regular expression in java I was looking for examples in java 1.5 grammar. No luck. Since I don't want to reinvent wheel :-), anyone knows where I can find grammar for javacc with regular expression parsing? Thank you in advance Krzysztof Get news, entertainment and everything you care about at Live.com. Check it out! |
|
|
Re: Grammar for parsing regular expressionOn Wednesday April 15 2009, Krzysztof Langner wrote:
> Hello, > > I'm working on grammar for ActionScript language. ... > > Since there are regular expression in java I was looking > for examples in java 1.5 grammar. No luck. Java doesn't have a special RE syntax. They're encoded in and compiled from plain old strings. Also, since you mention ActionScript (of which I know nothing), I'll point out that there are a great many different surface syntaxes for notating regular expressions. > Since I don't want to reinvent wheel :-), anyone knows where I can > find grammar for javacc with regular expression parsing? Specifically the REs of ActionScript? > Thank you in advance > Krzysztof Randall Schulz --------------------------------------------------------------------- To unsubscribe, e-mail: users-unsubscribe@... For additional commands, e-mail: users-help@... |
|
|
Re: Grammar for parsing regular expressionYes you are right. I forgot that in java you put regex in strings. In action script RE syntax is:
myVariable = /abc/i So even this syntax is valid: myVariable = /"/ The problem is that it is similar to math expressions like 2/3/4. Because of it I can't define it as TOKEN (and have problems with strings too). I think the same syntax is in javascript and in ruby maybe in other languages too. Thats why I'm looking for same grammar similar to AS to learn how this problem can be solved. Thank you Krzysztof On Wed, Apr 15, 2009 at 16:00, Randall R Schulz <rschulz@...> wrote:
|
|
|
Re: Grammar for parsing regular expressionOn Apr 15, 2009, at 11:11 AM, Krzysztof Langner wrote: > Yes you are right. I forgot that in java you put regex in strings. > In action script RE syntax is: > > myVariable = /abc/i > > So even this syntax is valid: > myVariable = /"/ > > The problem is that it is similar to math expressions like 2/3/4. > Because of it I can't define it as TOKEN (and have problems with > strings too). > > I think the same syntax is in javascript and in ruby maybe in other > languages too. Thats why I'm looking for same grammar similar to AS > to learn how this problem can be solved. There's an ecmascript grammar that's got a lexical-state based tokenizer for regular expressions; I'll send it to you, Yours, Tom http://generatingparserswithjavacc.com/ --------------------------------------------------------------------- To unsubscribe, e-mail: users-unsubscribe@... For additional commands, e-mail: users-help@... |
|
|
Re: Grammar for parsing regular expressionOn Wednesday April 15 2009, Krzysztof Langner wrote:
> Yes you are right. I forgot that in java you put regex in strings. In > action script RE syntax is: > > myVariable = /abc/i > > So even this syntax is valid: > myVariable = /"/ As in Groovy. > The problem is that it is similar to math expressions like 2/3/4. > Because of it I can't define it as TOKEN (and have problems with > strings too). You will never, ever be able to write a lexeme / token definition (a regular expression) to recognize the syntax of regular expressions. Parsing regular expressions requires a push-down automata, owing to the balanced, nestable constructs such as parentheses and (in certain dialects) character classes and its use of infix operators (i.e., alternation). So you can save yourself time and energy by accepting the fundamental fact that REs cannot be parsed by REs. > ... > > Thank you > Krzysztof Randall Schulz --------------------------------------------------------------------- To unsubscribe, e-mail: users-unsubscribe@... For additional commands, e-mail: users-help@... |
|
|
Re: Grammar for parsing regular expressionOn Thu, Apr 16, 2009 at 3:48 AM, Randall R Schulz <rschulz@...> wrote:
> So you can save yourself time and energy by accepting the fundamental > fact that REs cannot be parsed by REs. At least, not all at once (c: (Though PCRE-style RE's can get close, and given a bounded RE text (i.e. extracted from everything else by some lexical state change or similar), are a great help at splitting it up.) When I built a PCRE parser, it was its own animal (hand-coded state machine) and only dealt with RE's. To parse RE's out of something else its much easier (read only possible) to do it as either a lexical state subset or a separate parser, depending on things like how complex this particular dialect of RE's is. -- - J.Chris Findlay (c: --------------------------------------------------------------------- To unsubscribe, e-mail: users-unsubscribe@... For additional commands, e-mail: users-help@... |
|
|
Re: Grammar for parsing regular expression
I know. I don't want to parse RE with lexer (tokens). It would be enough for me to just represent RE as token. It could almost be done with this simple definition: < REGEX_LITERAL: "/" ( (~["/","\n","\r"]) | ("\\" "/") )* "/" > And the problem is not with RE syntax. But with math expression. Because now int this expression: my_variable = 2/3/4; I'll get REGEX_LITERAL token. But Tom Copeland helped me with this problem by sending me ECMAScript grammar. It looks that the problem is solved there on the parser level. Thank you all for help :-) Krzysztof |
|
|
RE: Grammar for parsing regular expressionThis does mean that a lot of invalid assignments can be tokenized, but you can catch those later, some other way than with the tokenizer. Date: Thu, 16 Apr 2009 17:08:48 +0200 From: klangner@... To: users@... Subject: Re: [JavaCC] Grammar for parsing regular expression
I know. I don't want to parse RE with lexer (tokens). It would be enough for me to just represent RE as token. It could almost be done with this simple definition: < REGEX_LITERAL: "/" ( (~["/","\n","\r"]) | ("\\" "/") )* "/" > And the problem is not with RE syntax. But with math expression. Because now int this expression: my_variable = 2/3/4; I'll get REGEX_LITERAL token. But Tom Copeland helped me with this problem by sending me ECMAScript grammar. It looks that the problem is solved there on the parser level. Thank you all for help :-) Krzysztof Invite your mail contacts to join your friends list with Windows Live Spaces. It's easy! Try it! |
| Free embeddable forum powered by Nabble | Forum Help |