Changing to an ANTLR based lexer for named param parsing

by brianm

I just checked in a change (to be in 2.0pre4) which switches the
named parameter parsing to use ANTLR. It uses JarJar Links to
renamespace ANTLR and bundle it in the jdbi jar, which, sadly, has
made the jar 4x its previous size. I am going to work on trimming
out the parts of ANTLR we don't need.

I am really happy to make this change, though, as the old parser was a
regex-based nightmare which was basically unchangeable. The new one
is four rules:


header {
     package org.skife.jdbi.rewriter.colon;
}

class ColonStatementLexer extends Lexer;

options {
     charVocabulary='\u0000'..'\uFFFE';
     k=2;
}

LITERAL: ('a'..'z' | 'A'..'Z' | ' ' | '\t' | '0'..'9' | ',' | '*'
           | '=' | ';' | '(' | ')' | '[' | ']' | '+' | '-' | '/'
           | '>' | '<' )+;
NAMED_PARAM: ':' ('a'..'z' | 'A'..'Z' | '_')+;
POSITIONAL_PARAM: '?';
QUOTED_TEXT: ('\'' (~'\'')+ '\'');


I am not sure about the LITERAL matching yet, but it passes all the
current unit tests. What are the rules for UTF-8 names and whatnot in
most databases, and does anyone have pointers for improving the grammar?
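
For anyone who wants to poke at the rules without running ANTLR, here is a
rough hand-written Java approximation of the four rules above. It is only a
sketch: the class and method names (ColonLexerSketch, tokenize, rewrite) are
made up for illustration and are not part of jdbi, and replacing each named
parameter with '?' is an assumption about how the tokens would be consumed.
The generated ANTLR lexer may differ in edge cases.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical approximation of the four lexer rules; not jdbi code.
class ColonLexerSketch {
    enum Type { LITERAL, NAMED_PARAM, POSITIONAL_PARAM, QUOTED_TEXT }

    static final class Token {
        final Type type;
        final String text;
        Token(Type type, String text) { this.type = type; this.text = text; }
    }

    // Characters allowed after ':' in NAMED_PARAM: letters and underscore.
    static boolean isNameChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    // Characters allowed in LITERAL, copied from the grammar rule.
    static boolean isLiteralChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || " \t,*=;()[]+-/><".indexOf(c) >= 0;
    }

    static List<Token> tokenize(String sql) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < sql.length()) {
            char c = sql.charAt(i);
            int start = i;
            if (c == '\'') {
                // QUOTED_TEXT: quote, one or more non-quote chars, quote.
                int end = sql.indexOf('\'', i + 1);
                if (end <= i + 1) {
                    throw new IllegalArgumentException("bad quoted text at " + i);
                }
                i = end + 1;
                tokens.add(new Token(Type.QUOTED_TEXT, sql.substring(start, i)));
            } else if (c == ':') {
                // NAMED_PARAM: ':' followed by one or more name characters.
                i++;
                while (i < sql.length() && isNameChar(sql.charAt(i))) i++;
                if (i == start + 1) {
                    throw new IllegalArgumentException("empty parameter name at " + start);
                }
                tokens.add(new Token(Type.NAMED_PARAM, sql.substring(start, i)));
            } else if (c == '?') {
                i++;
                tokens.add(new Token(Type.POSITIONAL_PARAM, "?"));
            } else if (isLiteralChar(c)) {
                while (i < sql.length() && isLiteralChar(sql.charAt(i))) i++;
                tokens.add(new Token(Type.LITERAL, sql.substring(start, i)));
            } else {
                throw new IllegalArgumentException("unexpected '" + c + "' at " + i);
            }
        }
        return tokens;
    }

    // Assumed usage: swap each named parameter for '?' and record its name.
    static String rewrite(String sql, List<String> names) {
        StringBuilder out = new StringBuilder();
        for (Token t : tokenize(sql)) {
            if (t.type == Type.NAMED_PARAM) {
                names.add(t.text.substring(1)); // drop the leading ':'
                out.append('?');
            } else {
                out.append(t.text);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        System.out.println(rewrite("select id from something where name = :name", names));
        System.out.println(names);
    }
}
```

Note that, as in the grammar, LITERAL has no '_' or '.', so a statement such
as "select first_name from t" is rejected by this sketch, and an empty quoted
string ('') does not match QUOTED_TEXT. Those may be the first places to
look when widening the rules.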

Also, I'd really appreciate it if anyone who can would take a poke at 2.0. Thanks!

-Brian
