Friday, September 01, 2006

Non Local Transfers in Groovy: A 90% Solution

The discussion about supporting break/continue and return in closures is an old one.

I was always looking for a 100% solution that just works. Sadly, I still haven't found one. But OK, let us look at my new proposal, which is inspired by a blog entry from John Rose about the ongoing discussion of closure support in Java. I think my proposal should cover 99% of all cases.

First let me introduce the term "appended closure". By that I mean a closure appended to a method call. An example would be

list.each {println it}
or
if (atHome) {doHomework()}
OK, the last one could have been a closure, but it is not ;)

My solution covers only these appended closures. Closures passed around in variables are not covered. Appended closures are the ones that look most like the normal control structures we have, for example the while loop.

Now what happens when we declare a while loop? The compiler creates a label marking the entry point of the loop, a label marking the exit point of the loop, and some gotos for break/continue. In Groovy, closures are inner classes, so they can't jump to a label in the bytecode of the enclosing method. So we must transfer the exit state from the closure to the calling point and then do our goto there. There was much discussion about how to do that. We discussed return values, which won't work because we must also exit the loop method, which may be void. We discussed exceptions, which would work, but we were unsure whether we would really catch them all at the right places, and having to rewrite all the closure handling code out there is not nice either. We discussed additional fields, which have a problem when the same closure is used multiple times.
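The return-value problem can be observed in Groovy as it stands. The following is only a minimal sketch: the closure tries to signal "stop" by returning false, but each() ignores whatever the closure returns, so the loop method would have to be rewritten to respect it. The variable name visited is made up for the illustration.

```groovy
def visited = []
[1, 10, 100, 10, 1].each {
    visited << it
    return it <= 10   // "false" is meant as "stop", but each() ignores the result
}
// every element was visited; the return value did not end the iteration
assert visited == [1, 10, 100, 10, 1]
```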

If you only look at appended closures, then some of the problems go away. We get a defined point where we can catch the exception, look at the fields or whatever. But we still need the exception, because in the case of a break we have to exit the method calling the closure.

And if we limit break/continue to just this type of closure, we have no real problems. We can even do a labeled break/continue and jump not only out of the loop method, but out of a surrounding loop too. So what would it look like?
def list = [1,10,100,10,1]
list.each {
    if (it>10) break
    println it
}
The intention is to print the values 1 and 10 and then stop processing. My suggestion now is to transform this code into:
def list = [1,10,100,10,1]
int id = createLabelID()
Closure c = {
    if (it>10) throw new ClosureBreakException(id)
    println it
}
try {
    list.each(c)
} catch (ClosureBreakException ce) {
    if (ce.id != id) throw ce
}
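To try the transformation out in a plain Groovy script, the two helpers have to exist. The sketch below assumes a simple shape for them: ClosureBreakException and createLabelID are the names used in this proposal, not part of Groovy, and the LabelIds holder class and the printed list are invented for the illustration.

```groovy
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical exception type from the proposal; not part of Groovy.
class ClosureBreakException extends RuntimeException {
    final int id
    ClosureBreakException(int id) { this.id = id }
}

// Hypothetical id source: every call yields a fresh number.
class LabelIds {
    static final AtomicInteger COUNTER = new AtomicInteger()
}
int createLabelID() { LabelIds.COUNTER.incrementAndGet() }

def printed = []                  // collected instead of printed, so it can be checked
def list = [1, 10, 100, 10, 1]
int id = createLabelID()
Closure c = {
    if (it > 10) throw new ClosureBreakException(id)
    printed << it
}
try {
    list.each(c)
} catch (ClosureBreakException ce) {
    if (ce.id != id) throw ce     // not our loop: pass it on
}
assert printed == [1, 10]
```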
Yes, that looks like a lot of code, but the compiler does the work for us. What about continue?
def list = [1,10,100,10,1]
list.each {
    if (it>10) continue
    println it
}
This prints 1, 10, 10, 1. It would be transformed into something like
def list = [1,10,100,10,1]
list.each {
    if (it>10) return
    println it
}
Ehm.. yes.. easy ;) All code in the method calling the closure is then always executed. Compared with a for loop, this is the part where the increment happens and the comparison is done. If we want to jump to a different point, like a surrounding loop, we need exceptions too, because we then break the inner loop:
outer: while (foo) {
    list.each {
        if (it>10) continue outer
        println it
    }
}

is basically the same as
while (foo) {
    list.each {
        if (it>10) {doContinue = true; break}
        println it
    }
    if (doContinue) continue
}
But as we use exceptions here, we don't need that "doContinue"
int id = createLabelID()
Closure c = {
    if (it>10) throw new ClosureBreakException(id)
    println it
}
while (foo) {
    try {
        list.each(c)
    } catch (ClosureBreakException ce) {
        if (ce.id != id) throw ce
        continue
    }
}
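A runnable variant of the labeled continue could look like the sketch below. It makes a few assumptions for the sake of a terminating example: the rounds counter stands in for the open-ended "while (foo)", a fixed id stands in for createLabelID(), and the 'end of round' marker is only there to show that the rest of the outer loop body is skipped.

```groovy
// Hypothetical exception type from the proposal; not part of Groovy.
class ClosureBreakException extends RuntimeException {
    final int id
    ClosureBreakException(int id) { this.id = id }
}

def printed = []
def list = [1, 10, 100, 10, 1]
int id = 1                          // stands in for createLabelID()
Closure c = {
    if (it > 10) throw new ClosureBreakException(id)
    printed << it
}
int rounds = 0
while (rounds < 2) {                // stands in for "while (foo)"
    rounds++
    try {
        list.each(c)
    } catch (ClosureBreakException ce) {
        if (ce.id != id) throw ce
        continue                    // the "continue outer" of the original code
    }
    printed << 'end of round'       // never reached here, the closure always continues
}
assert printed == [1, 10, 1, 10]
```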
What about combinations of continue and break?
To enable this I suggest the usage of more than one id:
outer: while (foo) {
    list.each {
        if (it<0) continue outer
        if (it>10) break outer
    }
}
is transformed to
int idBreak = createLabelID()
int idContinue = createLabelID()
Closure c = {
    if (it<0) throw new ClosureBreakException(idContinue)
    if (it>10) throw new ClosureBreakException(idBreak)
}
outer: while (foo) {
    try {
        list.each(c)
    } catch (ClosureBreakException ce) {
        if (ce.id == idBreak) break outer
        if (ce.id == idContinue) continue outer
        throw ce
    }
}
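As a sketch of how the two ids work together, the following script runs the combined case over a few batches of numbers. The batches data, the seen list and the fixed id values are assumptions made for the example; in the proposal the ids would come from createLabelID().

```groovy
// Hypothetical exception type from the proposal; not part of Groovy.
class ClosureBreakException extends RuntimeException {
    final int id
    ClosureBreakException(int id) { this.id = id }
}

int idBreak = 1                     // stand-ins for two createLabelID() calls
int idContinue = 2
def seen = []
Closure c = {
    if (it < 0) throw new ClosureBreakException(idContinue)
    if (it > 10) throw new ClosureBreakException(idBreak)
    seen << it
}
def batches = [[1, 2], [3, -1, 4], [5, 20, 6], [7]].iterator()
outer: while (batches.hasNext()) {
    def list = batches.next()
    try {
        list.each(c)
    } catch (ClosureBreakException ce) {
        if (ce.id == idBreak) break outer
        if (ce.id == idContinue) continue outer
        throw ce
    }
}
// -1 skips the rest of the second batch, 20 ends the whole loop
assert seen == [1, 2, 3, 5]
```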
The implementation is really straightforward.

Speed Issues?
We can tell people that using break/continue might slow things down, since each one throws and catches an exception.

Danger of doing break/continue on the wrong loop?
I think we eliminated that problem with our createLabelID function, which produces a new unique id on every call.
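A small sketch of why the ids matter with nested loops: the inner handler sees an exception carrying the outer id, does not recognise it, and passes it on, so the break lands on the loop it was meant for. The helper implementations (ClosureBreakException, LabelIds, createLabelID) are assumed shapes for the names used in this proposal, not existing Groovy API.

```groovy
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helpers from the proposal; not part of Groovy.
class ClosureBreakException extends RuntimeException {
    final int id
    ClosureBreakException(int id) { this.id = id }
}
class LabelIds {
    static final AtomicInteger COUNTER = new AtomicInteger()
}
int createLabelID() { LabelIds.COUNTER.incrementAndGet() }

int outerId = createLabelID()
int innerId = createLabelID()
assert outerId != innerId

def seen = []
try {
    [1, 2, 3].each { a ->
        try {
            ['x', 'y'].each { b ->
                if (a == 2) throw new ClosureBreakException(outerId) // break the OUTER each
                seen << [a, b]
            }
        } catch (ClosureBreakException ce) {
            if (ce.id != innerId) throw ce   // not meant for the inner loop
        }
    }
} catch (ClosureBreakException ce) {
    if (ce.id != outerId) throw ce
}
assert seen == [[1, 'x'], [1, 'y']]
```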

Why is it only a 90% solution?
What I can't do is:
def c = {if(it>10) break; println it}
list.each(c)
and expect the break to work. That is because we have no defined point where we can catch the exception. A normal continue can work.

What is the return value in case of continue?
The same as with break: nothing. People will have to know this. We can't have both a return value and an exception, unless the exception does the job of transporting the return value. In the case of a "collect" I expect to get at least a partial list. To enable this, we must catch the exception and then store the return value of the loop method in it. The changes to the code I showed above are trivial. The changes to the loop method are trivial too, but they must be done. The case of a continue is a bit more difficult, and to tell the truth I don't know of a solution here. That is because

... I want to avoid people having to rewrite their closure handling code.
Respecting continue would mean catching an exception in the closure handling method. I don't think that is nice. If I don't do it in the case of a "break", my code still works, but might not return correct data. If I don't do it in the case of a "continue", my code doesn't behave right and doesn't return correct data. And while supporting break is not needed in every method, supporting continue would be.
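The partial-list idea for "collect" in the case of a break could be sketched roughly like this. Everything here is an assumption made for illustration: collectWithBreak is a made-up loop method rather than Groovy's own collect, and the partialResult property is one possible way to let the exception transport the data.

```groovy
// Hypothetical exception type from the proposal, extended with a result slot.
class ClosureBreakException extends RuntimeException {
    final int id
    Object partialResult            // filled in by the loop method
    ClosureBreakException(int id) { this.id = id }
}

// Hypothetical collect-like loop method that cooperates with break.
List collectWithBreak(List source, Closure body) {
    def result = []
    try {
        for (item in source) result << body(item)
    } catch (ClosureBreakException ce) {
        ce.partialResult = result   // hand over what was collected so far
        throw ce
    }
    result
}

int id = 1                          // stands in for createLabelID()
def doubled
try {
    doubled = collectWithBreak([1, 10, 100, 10, 1]) {
        if (it > 10) throw new ClosureBreakException(id)
        it * 2
    }
} catch (ClosureBreakException ce) {
    if (ce.id != id) throw ce
    doubled = ce.partialResult      // the partial list instead of nothing
}
assert doubled == [2, 20]
```

A loop method that does not do the catch-and-store step still works with break; the caller then simply gets no result.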

What about "return"?
We could support it the same way we support "break". No real problem. In fact that is the easiest version, since we don't have to handle incomplete data.
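A sketch of what that might look like: the exception carries the value to be returned, and the enclosing method catches it and returns that value. ClosureReturnException and firstAbove are hypothetical names chosen for this example, and the fixed id again stands in for createLabelID().

```groovy
// Hypothetical exception type carrying the value of a "return" inside a closure.
class ClosureReturnException extends RuntimeException {
    final int id
    final Object value
    ClosureReturnException(int id, Object value) { this.id = id; this.value = value }
}

// Roughly what a method containing
//     list.each { if (it > limit) return it }
// might be turned into if "return" meant "return from the method".
def firstAbove(List list, int limit) {
    int id = 1                      // stands in for createLabelID()
    try {
        list.each {
            if (it > limit) throw new ClosureReturnException(id, it)
        }
    } catch (ClosureReturnException cr) {
        if (cr.id != id) throw cr
        return cr.value
    }
    null
}

assert firstAbove([1, 10, 100, 10, 1], 10) == 100
assert firstAbove([1, 2], 10) == null
```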

My Suggestion:
Don't allow "continue" in closures. It might look strange, but it has semantic problems. People can always simulate it by returning from the closure with a special value or such and respecting that value in the loop. But to avoid getting new users into trouble here and confronting them with even more black magic than we already do, I suggest replacing "continue" with a "closure return". The advantage is that it returns a value and thus is no problem with regard to incomplete data. For Java, ^ is suggested as an additional return; I haven't thought about a name or pattern yet. break/return can be implemented as described above for appended closures. Other closures won't have break/return; the compiler would forbid them there. A bit of a problem is the new return, because in current Groovy "return" means returning from the closure if it is used in a closure, and returning from the method if it is not. When seeing a closure as a method this is OK, but we don't want to see closures as methods. So "return" would change its semantics compared to older Groovy versions.
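Simulating continue with a special return value, as mentioned above, might look like the following sketch. The Loop.SKIP sentinel and the collectSkipping loop method are invented for the example; the point is only that the loop method has to know about the special value.

```groovy
class Loop {
    static final Object SKIP = new Object()   // hypothetical sentinel value
}

// Hypothetical loop method that respects the sentinel.
List collectSkipping(List source, Closure body) {
    def result = []
    for (item in source) {
        def value = body(item)
        if (!value.is(Loop.SKIP)) result << value   // identity check against the sentinel
    }
    result
}

def doubled = collectSkipping([1, 10, 100, 10, 1]) {
    it > 10 ? Loop.SKIP : it * 2
}
assert doubled == [2, 20, 20, 2]
```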

All in all, this suggestion would allow people to use break and return in closures. It would allow using them in appended closures, the most used form of closure when doing loops. A "closure return", or I should better say "block return", would replace the old continue statement. If people really wish to skip a single step, they have to write their own loop.

The advantage is that I don't have to write additional closure handling code for break/return, only if I want to avoid incomplete data in the case where I return something. The next advantage is that I don't have to identify the closure somehow in the loop method; I don't care whether the current closure caused the problem or something deeper inside. I have a defined point where I can catch the exception and don't have to be afraid that it might interfere with other loop methods.

The other possibility I see is to forbid them all ;)
