Saturday, January 06, 2007

AST macros and mixins

Hi all,

I know my last article was written some time ago, but I was busy with Groovy 1.0 bug times. Anyway. This is another part of my "beyond Groovy 1.0" series, only that we are now officially in that era and I think I no longer need to prefix the title with that.

So my next wild idea is about a macro mechanism for Groovy. But not the kind of macros you know from C or such, no I think of macros rewriting parts of the AST used by our compiler. So the idea is not really something new, it is something the compiler does already provide. The new thing is to let the compiler do the integration steps for you automatically. No need to add a PhaseOperation to the compiler or such.

No, I have not really thought of a syntax yet, let us think about this as a case study and not as a final draft or something. Ok, back to title. let us say, there is a statement like the import statement and let us call it macro. So when we want to use a macro we simply tell the compiler this by:

import macro Foo

or by
import macro Foo as bar

and let us assume the compiler can use them like a method:
bar {
// a block containing code
}

looks much like the dynamic features we already provide, but in fact the compiler recognizes "bar" as macro and does not produces a method call with a closure instance for this, no, instead the compiler takes the logic provided by our Foo class and applies it to this part of the code.

Why should we do that? One thing I always disliked in Groovy is the huge amount of keywords. Do they really have to be keywords? One example for this is synchronized. Usually you do something like:
synchronized (object) {
// code using object
}

But isn't that almost our macro "bar" from before? It is, not only almost. In fact it is really alike, the compiler would transform that into
synchronized (object, {  ...t })

just like the bar example
bar ({...})

it is no different. So that would mean we could remove the keywords "for", "while", "assert" and "synchronized". Ok, not much... but still it is a step forward. That would still mean that the do-while loop is not supported, but maybe there is a solution for this too.

How would Foo possibly look? Well I guess something like:
class Foo {
static phase = Phases.CONVERSION
def visit(ASTNode[] contextPath, SourceUnit source) {
// ... transformation code here
}
}

The compiler would take a look at the Foo class ensure that there is a default constructor and a static field named phase. Then the compiler would create a new instance of the class and add it to the phase operations it already provides, plus a filter to ensure only the correct classes are affected by this. So if the compiler compiles the file it goes along the AST until it finds the macro and then does execute it.

That's pretty easy to implement, really no big deal, but the impact might be big. I think such a construct would be the first step to unify the Groovy grammar parts and to remove special constructs. Well, the thing really needed for this is a way to exchange/remove statements in the AST. That is currently not possible. It is only possible to transform expressions, but not statements. But well, that is no problem we can't solve, is it?

The mentioned keywords earlier could be seen as macros themself. That doesn't mean that we can remove these statements from the AST, but it would mean a programmer could define additional keywords as he likes. For example a macro transforming a closure into a SQL statement. We already have this, but wouldn't it be exciting to have that at compile time? I mean with different checks and without runtime impact. the macro could use any expression, for example
  sql {
select row1,row2
from tx
where id==26
}

I am also thinking about SODA queries or queries for GORM. The above can't be interpreted using a normal closure, because I used 3 statements that are interpreted as method calls. There is for one select(row1,row2), then from(tx) and where(id==26). So to correctly interpret this we have to use the getClassNode method in MetaClass, but that relies on runtime analyzes of the source file atm. Yes, we are going to change that, but it means to store the source inside the closure in compressed form or else we wouldn't be able to ensure the correct interpretation. With these AST makros we are able to compile that directly into the goal construction. I mean even different SQL dialects could be covered by this.

But SQL is just a example. I used it only to illustrate the abilities a little. I know that this article does only give a little idea about how it might work and writing this doesn't mean it will go into any version of Groovy. But I think the idea is quite simple and yet powerful.

But this is not all... AST handling is complicated and not really something that would make your day. so I think about going a step further, in fact I am planining to adding this pretty soon. I am talking about AST level mixins.

"Ok, what's that again?" you may ask. I think of defining a chunk of code as normal groovy script and a defined way ofr a phase operation to use that code to modify the AST. This chunk of code does itself not contain any AST handling code, it will just be used as archetype for the compiler to produce the real code. This might be a simple thing as
class X {
long id
}

causing the compiler to add a field named id to the currently processed classNode, or something more complex. Well we have a problem when it comes down to bytecode instruction. For example if we want to replace a for loop we would have to define jumps. So, as long as there is no mechanism to represent this, there is no way this transformation by example will work for all kinds of macros too. But well, this is like doing research. At the beginning you have only a vague idea of how things might work. you do experiments to see if your expectations are correct and you develop new ideas if they do not or if the experiments are uncovering no facts.

But still, the idea is exciting somehow. I am sure another language does already have a construct like that... well I am thinking about LISP here. I wouldn't dare to compare this to the macros in LISP (I am talking about Common LISP), but possibly they will become alike in the future after doing same practical work. LISP does have a more easy part here, because LISP does have a much simpler syntax than Groovy. Doing structural work in such a language is naturally more enjoyable than in a complex language like Groovy. But I still think it can be done.

Btw, this is no replacement for DSLs looking like natural language, but that is more a syntax issue, than an issue of AST macros.

4 comments:

Guillaume said...

I'd also be glad to have optional parentheses:

dsl {
    monster.move x:10.meters, y.5.meters
}

Jochen "blackdrag" Theodorou said...

you mean because of x:10? I think that is not related to the macros yet. If the compilers sees this, then it throws an error even before a AST is generated. the grammar must allow this, in order to be used by a macro. But once this is done, it can be used without parentheses

Guillaume said...

Another idea of macros could be a parallel macro:

parallel {
    foo()
    bar()
}

where the methods would be called in parallel threads.

Daniel.Sun said...

Wonderful idea!
With your unremitting effort, I believe Groovy will be more and more powerful ;)

BTW Groovy1.0 is very excellent!
Thank you, Jochen Theodorou :)