adding files to the compiler through the compiler (architectural question)

by Jochen Theodorou

hi all,


I need some advice/assurance in this case... There is an issue that is
bothering me, and it goes like this:

A.groovy
println X
println Y

X.groovy
class X{}
class Y{}

Now, if I compile this using A.groovy and X.groovy, then all is fine.
If I compile this using only A.groovy, then the compiler will pick up
X.groovy, but fails to see that Y is a class and will then think Y is
a property. That is because X.groovy is parsed after ResolveVisitor is
done with A.groovy, while adding X.groovy happens while ResolveVisitor
is working.
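
The ordering problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: ResolveOrderDemo, SOURCE_PATH and resolve are hypothetical names, and the lookup rule ("a name is a class if it is already known or if a file with that name exists") is a simplification, not the actual ResolveVisitor logic.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the ordering problem; none of these names are real Groovy compiler API.
class ResolveOrderDemo {
    // Hypothetical source path: file name -> classes declared in that file.
    static final Map<String, List<String>> SOURCE_PATH =
            Map.of("X.groovy", List.of("X", "Y"));

    /** Classifies the names used by A.groovy as "class" or "property". */
    static Map<String, String> resolve(List<String> namesInA, List<String> initialSources) {
        Set<String> knownClasses = new HashSet<>();
        // Sources given up front are parsed before resolving starts.
        for (String file : initialSources) {
            knownClasses.addAll(SOURCE_PATH.getOrDefault(file, List.of()));
        }
        // Files discovered during resolving; in this model they are parsed too late.
        Deque<String> addedLater = new ArrayDeque<>();
        Map<String, String> result = new LinkedHashMap<>();
        for (String name : namesInA) {
            if (knownClasses.contains(name)) {
                result.put(name, "class");
            } else if (SOURCE_PATH.containsKey(name + ".groovy")) {
                // A matching file exists: queue it, but do not parse it yet.
                addedLater.add(name + ".groovy");
                result.put(name, "class");
            } else {
                // No file of that name and nothing parsed yet: looks like a property.
                result.put(name, "property");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Compiling A.groovy and X.groovy together.
        System.out.println(resolve(List.of("X", "Y"), List.of("A.groovy", "X.groovy")));
        // Compiling only A.groovy.
        System.out.println(resolve(List.of("X", "Y"), List.of("A.groovy")));
    }
}
```

In this model the first call classifies both X and Y as classes, while the second call classifies Y as a property, because X.groovy is only queued and its contents are unknown while A.groovy is being resolved.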

Now I see multiple ways of solving this issue.

(1)
Do an early resolving phase for the parts we know have to be classes.
That would be an early resolving process that splits up the work the
current ResolveVisitor has to do. This would align with my thoughts on
enabling local transforms in the conversion phase already, but on its own
it would not solve the issue at hand. To get a solution, something like an
"import X" would be required. But I wouldn't see that as the worst solution.
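
As a rough illustration of option (1), the sketch below runs an early pass over names that can only be classes (here modelled as imports) and parses their source files before the main resolve pass. EarlyResolveDemo, earlyResolve and classify are hypothetical names, and the file-per-name lookup is an assumption made for the example, not the real compiler behaviour.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of an early resolve pass; not Groovy compiler API.
class EarlyResolveDemo {
    // Hypothetical source path: file name -> classes declared in that file.
    static final Map<String, List<String>> SOURCE_PATH =
            Map.of("X.groovy", List.of("X", "Y"));

    /** Early pass: only names in class-only positions (here: imports) are looked up. */
    static Set<String> earlyResolve(List<String> imports) {
        Set<String> knownClasses = new HashSet<>();
        for (String name : imports) {
            List<String> declared = SOURCE_PATH.get(name + ".groovy");
            if (declared != null) {
                // The file is parsed now, so every class in it is known to the main pass.
                knownClasses.addAll(declared);
            }
        }
        return knownClasses;
    }

    /** Main pass: a bare name counts as a class only if it is already known. */
    static String classify(String name, Set<String> knownClasses) {
        return knownClasses.contains(name) ? "class" : "property";
    }

    public static void main(String[] args) {
        // With "import X" in A.groovy, X.groovy is parsed early and Y is known.
        System.out.println(classify("Y", earlyResolve(List.of("X"))));
        // Without the import nothing is parsed early and Y looks like a property.
        System.out.println(classify("Y", earlyResolve(List.of())));
    }
}
```

This mirrors the point made above: the early pass alone does not help with a bare "println Y", but an explicit "import X" gives it something to act on.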

(2)
I could end the ResolveVisitor and redo the phase. This brings problems
with other phase operations being applied twice, and it will probably slow
down the compilation process a lot, since looking for a class is one of the
most expensive things. On the other hand... if a global cache is used,
then this can be worked around... I get the feeling our current
ResolveVisitor does not use a global cache.
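
A global cache of the kind mentioned for option (2) might look roughly like the following. This is a minimal sketch, assuming a hypothetical ClassLookupCache that wraps some expensive lookup function; it is not an existing Groovy class. One detail worth noting is that cached misses have to be dropped when a new source file is added, otherwise the redone phase would still report Y as unknown.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical cache shared by all resolve passes; not an existing Groovy class.
class ClassLookupCache {
    private final Map<String, Optional<String>> cache = new HashMap<>();
    private final Function<String, Optional<String>> expensiveLookup;
    private int expensiveCalls = 0;

    ClassLookupCache(Function<String, Optional<String>> expensiveLookup) {
        this.expensiveLookup = expensiveLookup;
    }

    /** Returns the cached answer; the expensive lookup runs only on the first request. */
    Optional<String> find(String name) {
        Optional<String> answer = cache.get(name);
        if (answer == null) {
            expensiveCalls++;
            answer = expensiveLookup.apply(name);
            cache.put(name, answer);
        }
        return answer;
    }

    /** Cached misses may be stale once a new source file joins the compilation. */
    void forgetMisses() {
        cache.values().removeIf(answer -> !answer.isPresent());
    }

    int expensiveCalls() {
        return expensiveCalls;
    }

    public static void main(String[] args) {
        ClassLookupCache lookups = new ClassLookupCache(
                name -> name.equals("X") ? Optional.of("X.groovy") : Optional.<String>empty());
        lookups.find("X");
        lookups.find("Y");   // first resolve pass: two expensive lookups
        lookups.find("X");
        lookups.find("Y");   // redone pass: both answers come from the cache
        System.out.println(lookups.expensiveCalls()); // prints 2
    }
}
```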

(3)
I could "pause" the ResolveVisitor and continue with the added file until
it has reached the same phase as the paused part, and then continue normally.
To use that as a general mechanism I would have to implement it with threads,
and it would spawn at most as many threads as there are phases. That means
there wouldn't be many of them (below 10). Also, these threads would not
run concurrently. Apart from the fact that I would be using ugly threads,
the overall changes would be quite small here and not really visible from
the outside.
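
The "pause and continue" idea of option (3) can be sketched with two threads that hand control back and forth so that only one of them is ever active. PausedResolveDemo and the log messages are hypothetical, and the two semaphores merely stand in for whatever handoff mechanism an actual implementation would use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;

// Toy handoff between a paused resolver and the compiler driver; not Groovy API.
class PausedResolveDemo {
    static List<String> run() {
        // Safe without locking: the semaphores ensure only one thread runs at a time.
        List<String> log = new ArrayList<>();
        Semaphore driverMayRun = new Semaphore(0);
        Semaphore resolverMayRun = new Semaphore(0);

        Thread resolver = new Thread(() -> {
            log.add("resolver: found X.groovy while resolving A.groovy, pausing");
            driverMayRun.release();                  // hand control to the driver
            resolverMayRun.acquireUninterruptibly(); // wait until X.groovy has caught up
            log.add("resolver: resumed, Y is now known to be a class");
            driverMayRun.release();
        });
        resolver.start();

        driverMayRun.acquireUninterruptibly();
        log.add("driver: bringing X.groovy up to the resolve phase");
        resolverMayRun.release();                    // let the paused resolver continue
        driverMayRun.acquireUninterruptibly();
        try {
            resolver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The threads here exist only to keep the resolver's call stack alive while it is paused, which is the job a continuation would do.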

Which of these three solutions do you guys think I should go with?

--
Jochen "blackdrag" Theodorou
The Groovy Project Tech Lead (http://groovy.codehaus.org)
http://blackdragsview.blogspot.com/


---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email



RE: adding files to the compiler through the compiler (architectural question)

by Alexander Veit-5

> I need some advise/insurance in this case.... There is an
> issue that is
> bothering me and that goes like this:
>
> A.groovy
> println X
> println Y
>
> X.groovy
> class X{}
> class Y{}
>
> Now, if I compile this using A.groovy and X.groovy, then all is fine.
> If I compile this using only A.groovy, then the compiler will pickup
> X.groovy, but fails in seeing that Y is a class and will then
> think Y is a property. That is because X.groovy is parsed after
> ResolveVisitor is done with A.groovy, while adding X.groovy happens
> while ResolveVisitor is working.

Shouldn't class Y be considered private to the compilation unit X.groovy as
long as it doesn't have a compilation unit of its own?

--
Cheers,
Alex





Re: adding files to the compiler through the compiler (architectural question)

by Guillaume Laforge-2

Hi,

On Wed, Oct 7, 2009 at 21:53, Jochen Theodorou <blackdrag@...> wrote:

> hi all,
>
>
> I need some advise/insurance in this case.... There is an issue that is
> bothering me and that goes like this:
>
> A.groovy
> println X
> println Y
>
> X.groovy
> class X{}
> class Y{}
>
> Now, if I compile this using A.groovy and X.groovy, then all is fine. If I
> compile this using only A.groovy, then the compiler will pickup X.groovy,
> but fails in seeing that Y is a class and will then think Y is a property.
> That is because X.groovy is parsed after ResolveVisitor is done with
> A.groovy, while adding X.groovy happens while ResolveVisitor is working.

First, I'd like to ask a question.
In which cases, and why, wouldn't you compile both A and X together?
Why would X be compiled afterwards?
Perhaps we even need an "undecided" flag on some nodes when the
compiler doesn't yet know, so we can come back to it later on?
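
One way to picture such an "undecided" flag is a three-valued marker that a first pass leaves open and a later pass settles. This is only a sketch: UndecidedFlagDemo, Kind, firstPass and settle are hypothetical names, not part of the Groovy AST.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of an "undecided" marker on names; not part of the Groovy AST.
class UndecidedFlagDemo {
    enum Kind { CLASS, UNDECIDED, PROPERTY }

    /** First pass: unknown names stay UNDECIDED instead of being turned into properties. */
    static Map<String, Kind> firstPass(List<String> names, Set<String> knownClasses) {
        Map<String, Kind> kinds = new LinkedHashMap<>();
        for (String name : names) {
            kinds.put(name, knownClasses.contains(name) ? Kind.CLASS : Kind.UNDECIDED);
        }
        return kinds;
    }

    /** Later pass: with more classes known, settle every UNDECIDED entry. */
    static void settle(Map<String, Kind> kinds, Set<String> knownClasses) {
        kinds.replaceAll((name, kind) -> {
            if (kind != Kind.UNDECIDED) {
                return kind;
            }
            return knownClasses.contains(name) ? Kind.CLASS : Kind.PROPERTY;
        });
    }

    public static void main(String[] args) {
        Map<String, Kind> kinds = firstPass(List.of("X", "Y", "foo"), Set.of("X"));
        System.out.println(kinds); // {X=CLASS, Y=UNDECIDED, foo=UNDECIDED}
        settle(kinds, Set.of("X", "Y")); // X.groovy has been parsed in the meantime
        System.out.println(kinds); // {X=CLASS, Y=CLASS, foo=PROPERTY}
    }
}
```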

> now I see multiple ways of solving this issue.
>
> (1)
> Do an early resolving phase for parts we know that has to be a class. That
> would be an early resolving process and splits the work current
> ResolveVisitor has to do. This would align with my thoughts to enable local
> transforms in the conversion phase already, but it would not solve the issue
> at hand. To get a solution for example an "import X" is required. But I
> wouldn't see that as the worst solution.

I've got the impression that a first resolving pass would be a good
thing to resolve those "ambiguities" (is it a property, a class?)

> (2)
> I could end the ResolveVisitor and redo the phase. This brings problems with
> other phase operations applied twice and it will probably slow down the
> compilation process a lot, since looking for an class is one of the most
> expensive things. On the other hand... if a global cache is used, then this
> can be worked around... I get the feeling our current ResolveVisitor does
> not use a global cache.

With a cache, would it really slow down compilation that much?
Is it really problematic if some phases are applied twice? Especially
for AST transformations perhaps?

> (3)
> I could "pause" the ResolveVisitor and continue with the added file till it
> has the same phase as the paused part and then continue normally. Using that
> as general mechanism I would have to use threads to implement that and it
> would spawn up to the maximum of available number of phases threads. That
> means it wouldn't be many of them (below 10). Also all these Threads would
> not run concurrently. Besides that I use ugly Threads, the overall changes
> would be quite small here and not really seen to the outside.

I'm a bit worried about 3), in that, for instance, using threads on Google
App Engine is not possible.
That would make Groovy unable to run there anymore :-(

> What do you guys think which of these three solution I should do?

Perhaps a first resolve may be the best approach in the end?

--
Guillaume Laforge
Groovy Project Manager
Head of Groovy Development at SpringSource
http://www.springsource.com/g2one




Re: adding files to the compiler through the compiler (architectural question)

by Jochen Theodorou

Guillaume Laforge wrote:

> Hi,
>
> On Wed, Oct 7, 2009 at 21:53, Jochen Theodorou <blackdrag@...> wrote:
>> hi all,
>>
>>
>> I need some advise/insurance in this case.... There is an issue that is
>> bothering me and that goes like this:
>>
>> A.groovy
>> println X
>> println Y
>>
>> X.groovy
>> class X{}
>> class Y{}
>>
>> Now, if I compile this using A.groovy and X.groovy, then all is fine. If I
>> compile this using only A.groovy, then the compiler will pickup X.groovy,
>> but fails in seeing that Y is a class and will then think Y is a property.
>> That is because X.groovy is parsed after ResolveVisitor is done with
>> A.groovy, while adding X.groovy happens while ResolveVisitor is working.
>
> First, I'd like to ask a question.
> In which cases / why you wouldn't compiled both A and X together?
> Why would X be compiled afterwards?
> Perhaps we even need an "undecided" flag on some nodes when the
> compiler doesn't yet know, so we can come back to it later on?

Well, think of having an A.groovy and an X.groovy, and you want to start
it using "groovy A.groovy".

>> now I see multiple ways of solving this issue.
>>
>> (1)
>> Do an early resolving phase for parts we know that has to be a class. That
>> would be an early resolving process and splits the work current
>> ResolveVisitor has to do. This would align with my thoughts to enable local
>> transforms in the conversion phase already, but it would not solve the issue
>> at hand. To get a solution for example an "import X" is required. But I
>> wouldn't see that as the worst solution.
>
> I've got the impression that a first resolving pass would be a good
> thing to resolve those "ambiguities" (is it a property, a class?)

Properties and variables wouldn't be resolved, only ClassExpressions.
The really bad part is resolving what is not a ClassExpression. We cannot
resolve variables, for example, because scoping is not yet done.

>> (2)
>> I could end the ResolveVisitor and redo the phase. This brings problems with
>> other phase operations applied twice and it will probably slow down the
>> compilation process a lot, since looking for an class is one of the most
>> expensive things. On the other hand... if a global cache is used, then this
>> can be worked around... I get the feeling our current ResolveVisitor does
>> not use a global cache.
>
> With a cache, would it really slow down compilation that much?
> Is it really problematic of some phases are applied twice? Especially
> for AST transformations perhaps?

Some phase operations can be applied multiple times without problems.
ResolveVisitor is an example of that. Others do not work, as they would
duplicate methods and such. Duplicated methods would result either in a
compilation error or, at runtime, in a ClassFormatError.
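
The difference between the two kinds of phase operations can be shown with a small model. This is a sketch only: PhaseTwiceDemo, resolveOp and addGetterOp are hypothetical stand-ins, not actual Groovy phase operations.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy phase operations; hypothetical names, not Groovy compiler classes.
class PhaseTwiceDemo {
    /** Safe to repeat: recording a resolved name twice leaves the set unchanged. */
    static void resolveOp(Set<String> resolvedNames) {
        resolvedNames.add("X");
    }

    /** Not safe to repeat: blindly appends a generated method on every run. */
    static void addGetterOp(List<String> methods) {
        methods.add("getName()");
    }

    /** A duplicated method signature is what would later surface as an error. */
    static boolean hasDuplicates(List<String> methods) {
        return new HashSet<>(methods).size() != methods.size();
    }

    public static void main(String[] args) {
        Set<String> resolvedNames = new HashSet<>();
        List<String> methods = new ArrayList<>();
        for (int run = 0; run < 2; run++) { // apply the same phase twice
            resolveOp(resolvedNames);
            addGetterOp(methods);
        }
        System.out.println(resolvedNames.size());   // prints 1
        System.out.println(hasDuplicates(methods)); // prints true
    }
}
```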

>> (3)
>> I could "pause" the ResolveVisitor and continue with the added file till it
>> has the same phase as the paused part and then continue normally. Using that
>> as general mechanism I would have to use threads to implement that and it
>> would spawn up to the maximum of available number of phases threads. That
>> means it wouldn't be many of them (below 10). Also all these Threads would
>> not run concurrently. Besides that I use ugly Threads, the overall changes
>> would be quite small here and not really seen to the outside.
>
> I'm a bit worried with 3) in that for instance using threads on Google
> App Engine is not possible.
> Which would make Groovy unable to run there anymore :-(

True, I forgot about that part... Well, then let us forget about (3).
Too bad the VM does not really support continuations.

>> What do you guys think which of these three solution I should do?
>
> Perhaps a first resolve may be the best approach in the end?

Looks like it, yes.

bye blackdrag

--
Jochen "blackdrag" Theodorou
The Groovy Project Tech Lead (http://groovy.codehaus.org)
http://blackdragsview.blogspot.com/





Re: adding files to the compiler through the compiler (architectural question)

by Jochen Theodorou

Alexander Veit wrote:

>> I need some advise/insurance in this case.... There is an
>> issue that is
>> bothering me and that goes like this:
>>
>> A.groovy
>> println X
>> println Y
>>
>> X.groovy
>> class X{}
>> class Y{}
>>
>> Now, if I compile this using A.groovy and X.groovy, then all is fine.
>> If I compile this using only A.groovy, then the compiler will pickup
>> X.groovy, but fails in seeing that Y is a class and will then
>> think Y is a property. That is because X.groovy is parsed after
>> ResolveVisitor is done with A.groovy, while adding X.groovy happens
>> while ResolveVisitor is working.
>
> Shouldn't class Y be considered private for the compilation unit X.groovy as
> long as it doesn't have a compilation unit for its own?

Without making the class explicitly private? I think that leads to
unexpected results.

bye blackdrag

--
Jochen "blackdrag" Theodorou
The Groovy Project Tech Lead (http://groovy.codehaus.org)
http://blackdragsview.blogspot.com/





Re: adding files to the compiler through the compiler (architectural question)

by Alexander Veit-5

Jochen Theodorou wrote:
> Alexander Veit wrote:
> >
> > Shouldn't class Y be considered private for the compilation
> > unit X.groovy as long as it doesn't have a compilation unit
> > for its own?
>
> without making the class explicitly private? I think that leads to
> unexpected results.

It depends on the concept of privacy used :)

As far as I understand Groovy there are three basic scenarios for the Groovy
compiler:

1. Compiling a script.
2. Precompiling classes from groovy source files.
3. Compiling classes from groovy source files on demand (e.g. through GCL).

In the first case it is arguable if classes that are defined within the
script should be created in a public namespace. I would expect them to be
inner classes of the implicitly defined script class.

In the second case speed probably doesn't matter too much. The compiler is
free to apply the compilation phases to the compilation units in any
reasonable order (even parallelized). So it's probably not a big deal to
find all class definitions in all compilation units.

The third case is different. When compilation units might be modified at
runtime, it is extremely difficult (or let's say expensive) to keep track of
the classpath, when the content of the compilation units is not reflected in
the file system. Even the uniqueness of class names must be enforced by the
compiler in this case. Class Foo may be defined in several files within the
same directory at different times during the VM life cycle.

I wouldn't see it as a hard restriction if each Groovy class had to be
defined in its own file in future releases.

As a compromise additional class definitions could be treated as roughly
analogous to anonymous inner classes in Java. So, in your example, Y could
be freely used by X, X could create instances of Y, and X could pass these
instances to callers. But other classes would not be able to create
instances of Y. In this sense Y would be private to X.

--
Just my two cents.
Alex







Re: adding files to the compiler through the compiler (architectural question)

by Jochen Theodorou

Alexander Veit wrote:

> Jochen Theodorou wrote:
>> Alexander Veit wrote:
>>> Shouldn't class Y be considered private for the compilation
>>> unit X.groovy as long as it doesn't have a compilation unit
>>> for its own?
>> without making the class explicitly private? I think that leads to
>> unexpected results.
>
> It depends on the concept of privacy used :)
>
> As far as I understand Groovy there are three basic scenarios for the Groovy
> compiler:
>
> 1. Compiling a script.
> 2. Precompiling classes from groovy source files.
> 3. Compiling classes from groovy source files on demand (e.g. through GCL).

A script is just an implicit class definition, so 1 is not really
different from 2 or 3.

> In the first case it is arguable if classes that are defined within the
> script should be created in a public namespace. I would expect them to be
> inner classes of the implicitly defined script class.

They are in the namespace defined by the package statement in the
script. If there is no package statement, then the default package is
used. They are not inner classes.

> In the second case speed probably doesn't matter too much. The compiler if
> free to apply the compilation phases to the compilation units in any
> reasonable order (even parallelized). So it's probably not a big deal to
> find all class definitions in all compilation units.

Only if all compilation units are known at this point. It is quite possible
that during compilation the compiler has to pick up further source files.

> The third case is different. When compilation units might be modified at
> runtime, it is extremely difficult (or let's say expensive) to keep track of
> the classpath, when the content of the compilation units is not reflected in
> the file system. Even the uniqueness of class names must be enforced by the
> compiler in this case. Class Foo may be defined in several files within the
> same directory at different times during the VM life cycle.

We use the class loader to find those, so the order there defines the
order we use. Case 3 is not that different from case 2, because GCL
uses the same compiler, which means case 2 will use the class loader
infrastructure too.

> I wouldn't see it as a hard restriction if each Groovy class had to be
> defined in its own file in future releases.
>
> As a compromise additional class definitions could be treated as roughly
> analogous to anonymous inner classes in Java. So, in your example, Y could
> be freely used by X, X could create instances of Y, and X could pass these
> instances to callers. But other classes would not be able to create
> instances of Y. In this sense Y would be private to X.

That is more or less how we handle it now.

bye blackdrag

--
Jochen "blackdrag" Theodorou
The Groovy Project Tech Lead (http://groovy.codehaus.org)
http://blackdragsview.blogspot.com/

