Bytecode Generation

by Matt Fowles

All~

Thank you all for your earlier suggestions.  Now that my project is
complete (well, complete enough to have been committed back to trunk), I
thought that I would post an update for others.  I am cross-posting
this to the Janino user email list too.

Based on your suggestions and my reviewing of various websites, I
decided to use Janino as my target AST.  Just as a preface to the rest
of this, I have found Arno, the creator and maintainer of Janino, to
be quite helpful and responsive.

The initial effort of converting our internal AST to the Janino AST
went very smoothly.  I was a little surprised to discover that Janino
requires its AST to actually be a tree.  Our internal representation
allows sub-expressions to be reused within the tree.  But emitting a
tree was a fairly trivial affair.
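
As an illustration only, here is a minimal sketch of turning a graph with shared sub-expressions into a strict tree by copying. The Node class and the method names are hypothetical; they are neither Janino's AST nor the internal AST described above.

```java
import java.util.ArrayList;
import java.util.List;

class DagToTreeSketch {
    // Hypothetical node type: a label plus child nodes. Not Janino's AST.
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }
    }

    // Deep-copies everything reachable from n. A node reachable through two
    // parents is copied once per path, so no object is shared in the result.
    static Node toTree(Node n) {
        Node copy = new Node(n.label);
        for (Node child : n.children) copy.children.add(toTree(child));
        return copy;
    }

    // Counts positions in the structure (a shared node counts once per path).
    static int countNodes(Node n) {
        int total = 1;
        for (Node c : n.children) total += countNodes(c);
        return total;
    }

    public static void main(String[] args) {
        Node shared = new Node("+", new Node("a"), new Node("b"));
        Node dag = new Node("*", shared, shared); // (a + b) * (a + b), one shared operand
        Node tree = toTree(dag);
        // The two operands are now distinct objects.
        System.out.println(tree.children.get(0) != tree.children.get(1));
    }
}
```

Plain copying can grow exponentially when sharing is deeply nested; hoisting a shared sub-expression into a local variable would be one alternative in that case.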


Next I discovered that the provided UnparseVisitor did not insert
parentheses where needed.  The fix is not so much hard as mildly
annoying, and it has been applied on Janino trunk.

http://jira.codehaus.org/browse/JANINO-111
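
To show the kind of logic involved, the following sketch parenthesizes by operator precedence while unparsing. It is not the JANINO-111 patch and does not use Janino's UnparseVisitor; the Expr, Name, and Binary classes are hypothetical, and only four left-associative operators are handled.

```java
class UnparseParenSketch {
    // Hypothetical expression nodes; Janino's real AST classes differ.
    interface Expr {}

    static final class Name implements Expr {
        final String id;
        Name(String id) { this.id = id; }
    }

    static final class Binary implements Expr {
        final String op;
        final Expr lhs, rhs;
        Binary(String op, Expr lhs, Expr rhs) { this.op = op; this.lhs = lhs; this.rhs = rhs; }
    }

    // Higher number binds tighter. Only the operators used in this sketch.
    static int precedence(String op) {
        switch (op) {
            case "*": case "/": return 2;
            case "+": case "-": return 1;
            default: throw new IllegalArgumentException("unknown operator " + op);
        }
    }

    static String unparse(Expr e) { return unparse(e, 0); }

    // minPrec is the weakest precedence that may appear here without parentheses.
    private static String unparse(Expr e, int minPrec) {
        if (e instanceof Name) return ((Name) e).id;
        Binary b = (Binary) e;
        int p = precedence(b.op);
        // Left-associative operators: the right operand needs parentheses
        // even at equal precedence, hence p + 1.
        String s = unparse(b.lhs, p) + " " + b.op + " " + unparse(b.rhs, p + 1);
        return p < minPrec ? "(" + s + ")" : s;
    }

    public static void main(String[] args) {
        Expr e = new Binary("*", new Binary("+", new Name("a"), new Name("b")), new Name("c"));
        System.out.println(unparse(e)); // (a + b) * c
    }
}
```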


At that point the lowest level of tests in our system was passing,
and it was time to move on to our next level of tests.  I quickly ran
into what was probably the largest hurdle in this project.  It turns
out that nested classes in Java are a bit of a hack.

http://jira.codehaus.org/browse/JANINO-112
http://jira.codehaus.org/browse/JANINO-113


It took a little bit of digging with javac and javap before I realized
that access to the protected member variables of an enclosing class's
superclass is mediated through synthetic accessor methods generated by
javac.  Even from Sun's perspective this is a bit of a hack, but I
guess we are stuck with it.

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4116802

At the moment, Janino is still debating whether to fix this or simply
mark it as a known limitation, as the fix adds a reasonably sized
chunk of complexity to the compiler.  Once the problem is identified,
the workaround (making the protected variables public or adding the
accessor methods manually) is trivial, and it is working reasonably
well for us.
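
The second workaround might look roughly like the sketch below. All names are hypothetical. In the problematic case the superclass (Base here) lives in a different package from Outer, which a single-file example cannot reproduce, so this only shows the shape of a hand-written accessor.

```java
class AccessorSketch {
    // Stand-in for a superclass that, in the real scenario, is assumed to
    // live in a DIFFERENT package from Outer.
    static class Base {
        protected int counter = 41;
    }

    static class Outer extends Base {
        // Hand-written counterpart of the synthetic accessor that javac
        // would otherwise generate for the inner class.
        int accessCounter() { return counter; }

        class Inner {
            int next() {
                // Goes through the accessor instead of reading the
                // inherited protected field directly.
                return accessCounter() + 1;
            }
        }
    }

    public static void main(String[] args) {
        Outer outer = new Outer();
        Outer.Inner inner = outer.new Inner();
        System.out.println(inner.next()); // 42
    }
}
```

The inner class compiles to its own class file and is not itself a subclass of Base, which is why a direct field access needs help once Base sits in another package.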

Some of our stress tests turned up limitations with offsets and sizing
restrictions in Janino, but these were all fairly easy to fix.

http://jira.codehaus.org/browse/JANINO-115
http://jira.codehaus.org/browse/JANINO-118
http://jira.codehaus.org/browse/JANINO-119


The only other shortcoming that I needed to fix was that Janino does
not support a statement list that does not introduce a new scope.  We
use these internally to lazily generate code based on later
requirements.

http://jira.codehaus.org/browse/JANINO-116
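
One guess at why such a construct helps lazy generation: a scope-less list can be placed early as a placeholder and filled in later, and whatever it declares stays visible to the statements that follow. The sketch below emits source text with hypothetical Stmt, Block, and StatementList classes; it is not the Janino API and not the JANINO-116 change.

```java
import java.util.ArrayList;
import java.util.List;

class ScopeSketch {
    interface Stmt { void emit(StringBuilder out); }

    // A single statement given as literal source text.
    static final class Raw implements Stmt {
        final String text;
        Raw(String text) { this.text = text; }
        public void emit(StringBuilder out) { out.append(text).append('\n'); }
    }

    // Braced block: declarations inside are not visible after the closing brace.
    static final class Block implements Stmt {
        final List<Stmt> body = new ArrayList<>();
        public void emit(StringBuilder out) {
            out.append("{\n");
            for (Stmt s : body) s.emit(out);
            out.append("}\n");
        }
    }

    // Scope-less list: its statements are spliced into the enclosing block.
    static final class StatementList implements Stmt {
        final List<Stmt> body = new ArrayList<>();
        public void emit(StringBuilder out) {
            for (Stmt s : body) s.emit(out);
        }
    }

    static String demo() {
        Block method = new Block();
        StatementList prologue = new StatementList(); // empty placeholder for now
        method.body.add(prologue);
        method.body.add(new Raw("return tmp * 2;"));
        // Only later does the generator learn that 'tmp' is required.
        prologue.body.add(new Raw("int tmp = x + 1;"));
        StringBuilder out = new StringBuilder();
        method.emit(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```

Had the placeholder been a Block, the declaration of tmp would have ended up inside its own braces and the return statement could not have referred to it.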


All told, Janino has been very reliable and easy to target, has a
license amenable to use in a commercial project, and the internal code
is straightforward enough to extend and fix bugs in without
difficulty.  While we turned up 10 or so bugs in the process of doing
this integration, only about 2 of them were major (taking more than a
few hours to fix), and even the major ones took only a day or two.


At the moment, with Janino handling our bytecode generation, all of
our more than 9000 nightly tests pass.  The speedup on small test cases
(where writing to disk, starting javac, and parsing the text dominated)
was between 2x and 6x.  There is currently a small slowdown of about
10% on extremely large test cases; however, I have not profiled Janino
heavily and am extremely confident that this can be optimized into a
20-50% speedup.

All told, I highly recommend Janino as an AST to target.

Matt

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



Re: Bytecode Generation

by Arno Unkrig

Matt Fowles wrote:
> All~
>
...
> went very smoothly.  I was a little surprised to discover that Janino
> requires its AST to actually be a tree.  Our internal stuff allows for
> sub-expression reuse within the tree.  But emitting a tree was a

That's because compile-time status information is stored in the AST.
Surely a design flaw, but really difficult to resolve without splitting
up all the AST classes and ending up with twice the number of classes,
and a lot of glue code.

...
> All told, Janino has been very reliable and easy to target, has a
> license amenable to use in a commercial project, and the internal code
> is straightforward enough to extend and fix bugs in without
> difficulty.

Thank you!


CU

Arno





Re: Bytecode Generation

by Matt Fowles

Arno~

On Sat, Jun 14, 2008 at 5:54 PM, Arno Unkrig <arno@...> wrote:

> Matt Fowles wrote:
>>
>> All~
>>
> ...
>>
>> went very smoothly.  I was a little surprised to discover that Janino
>> requires its AST to actually be a tree.  Our internal stuff allows for
>> sub-expression reuse within the tree.  But emitting a tree was a
>
> That's because compile-time status information is stored in the AST. Surely
> a design flaw, but really difficult to resolve without splitting up all the
> AST classes and ending up with twice the number of classes, and a lot of
> glue code.

Yes, I figured that out.  I agree with you that it might be a flaw,
but the flip side is the extra classes you mentioned.  I am not sure
either what the "correct" design is.  Although I have noticed that
compilers tend to be complicated beasts...

Matt




Re: Bytecode Generation

by Arno Unkrig

Matt Fowles wrote:

> Arno~
>
> On Sat, Jun 14, 2008 at 5:54 PM, Arno Unkrig <arno@...> wrote:
>> Matt Fowles wrote:
>>> All~
>>>
>> ...
>>> went very smoothly.  I was a little surprised to discover that Janino
>>> requires its AST to actually be a tree.  Our internal stuff allows for
>>> sub-expression reuse within the tree.  But emitting a tree was a
>> That's because compile-time status information is stored in the AST. Surely
>> a design flaw, but really difficult to resolve without splitting up all the
>> AST classes and ending up with twice the number of classes, and a lot of
>> glue code.
>
> Yes, I figured that out.  I agree with you that it might be a flaw,
> but the flip side is the extra classes you mentioned.  I am not sure
> either what the "correct" design is.  Although I have noticed that
> compilers tend to be complicated beasts...
>
> Matt

Yep, exactly. And one big design goal of JANINO is simplicity. I hate
structures being duplicated...

BTW, this issue is a bit of a shame: In ancient versions, the AST nodes
all had "compile()" and "getType()" methods. Then, one day, I decided to
switch to the VISITOR pattern to get the compilation logic out of the
AST classes. When I was finished, I couldn't find a reasonable way to
get the compile-time state information out of the AST nodes. In
retrospect, maybe I should have left ALL the compilation logic inside
the AST.

But the VISITOR pattern works nicely for UnparseVisitor and the like,
so the refactoring was not a complete mess.
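
For readers following the design question, here is a minimal sketch of the visitor pattern in which computed information is kept in a side table keyed by node identity rather than in fields on the nodes. It is a hypothetical alternative for illustration, not how Janino is structured, and all class names are invented.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class VisitorSideTableSketch {
    interface Visitor<R> {
        R visitIntLiteral(IntLiteral n);
        R visitAdd(Add n);
    }

    interface Node {
        <R> R accept(Visitor<R> v);
    }

    static final class IntLiteral implements Node {
        final int value;
        IntLiteral(int value) { this.value = value; }
        public <R> R accept(Visitor<R> v) { return v.visitIntLiteral(this); }
    }

    static final class Add implements Node {
        final Node lhs, rhs;
        Add(Node lhs, Node rhs) { this.lhs = lhs; this.rhs = rhs; }
        public <R> R accept(Visitor<R> v) { return v.visitAdd(this); }
    }

    // Unparsing needs no per-node state at all.
    static final class Unparser implements Visitor<String> {
        public String visitIntLiteral(IntLiteral n) { return Integer.toString(n.value); }
        public String visitAdd(Add n) { return n.lhs.accept(this) + " + " + n.rhs.accept(this); }
    }

    // Type computation records its results in a map owned by the visitor,
    // so the node classes themselves stay immutable.
    static final class TypeChecker implements Visitor<String> {
        final Map<Node, String> types = new IdentityHashMap<>();

        public String visitIntLiteral(IntLiteral n) { return remember(n, "int"); }

        public String visitAdd(Add n) {
            n.lhs.accept(this);
            n.rhs.accept(this);
            return remember(n, "int");
        }

        private String remember(Node n, String type) {
            types.put(n, type);
            return type;
        }
    }

    public static void main(String[] args) {
        Node sum = new Add(new IntLiteral(1), new IntLiteral(2));
        System.out.println(sum.accept(new Unparser())); // 1 + 2
        TypeChecker checker = new TypeChecker();
        sum.accept(checker);
        System.out.println(checker.types.get(sum)); // int
    }
}
```

A table like this only covers state that depends on the node alone. Anything that depends on where the node sits, such as its enclosing scope, would presumably still tie each node to a single position in the tree.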


CU

Arno




Re: Bytecode Generation

by Matt Fowles

Arno~

I definitely find the visitor pattern to be a mixed blessing.  It has
a lot of advantages, but it does introduce a great deal of complexity
into code.  I suppose all things have trade-offs...

Matt

On Wed, Jun 18, 2008 at 5:00 PM, Arno Unkrig <arno@...> wrote:

> Matt Fowles wrote:
>>
>> Arno~
>>
>> On Sat, Jun 14, 2008 at 5:54 PM, Arno Unkrig <arno@...> wrote:
>>>
>>> Matt Fowles wrote:
>>>>
>>>> All~
>>>>
>>> ...
>>>>
>>>> went very smoothly.  I was a little surprised to discover that Janino
>>>> requires its AST to actually be a tree.  Our internal stuff allows for
>>>> sub-expression reuse within the tree.  But emitting a tree was a
>>>
>>> That's because compile-time status information is stored in the AST.
>>> Surely a design flaw, but really difficult to resolve without splitting
>>> up all the AST classes and ending up with twice the number of classes,
>>> and a lot of glue code.
>>
>> Yes, I figured that out.  I agree with you that it might be a flaw,
>> but the flip side is the extra classes you mentioned.  I am not sure
>> either what the "correct" design is.  Although I have noticed that
>> compilers tend to be complicated beasts...
>>
>> Matt
>
> Yep, exactly. And one big design goal of JANINO is simplicity. I hate
> structures being duplicated...
>
> BTW, this issue is a bit of a shame: In ancient versions, the AST nodes all
> had "compile()" and "getType()" methods. Then, one day, I decided to switch
> to the VISITOR pattern to get the compilation logic out of the AST classes.
> When I was finished, I couldn't find a reasonable way to get the
> compile-time state information out of the AST nodes. In retrospect, maybe I
> should have left ALL the compilation logic inside the AST.
>
> But the VISITOR pattern works nicely for UnparseVisitor and the like, so
> the refactoring was not a complete mess.
>
>
> CU
>
> Arno
>
