Monday, September 11, 2006

Groovy on Speed - a "fast mode" for Groovy

fast mode? What's that?

The Problem:
Did you ever try to meassure the performance of groovy? It is a quite difficult task. Because of the high integration of Groovy and Java, you are contantly calling Java and Groovy. For example think of some time consuming operations in a Java library. The Groovy part would probably call the library and then return the result. The overhead through Groovy is minimal.

Calling a Groovy method doing some calculations on the other hand might result in bad performance. The solution is quite simple. Identify these parts and write them in Java, by delegating to a Java library or subclass the Groovy class in Java and overwrite the methods in Java.

Of course this is a possibility. But it means you have to use Java for these parts. It means you can't use Groovy syntax for the operations. That sure is no problem for people seeing Groovy as frontend... somewhere... occasionally used, but no essential part. It does indeed become a problem for groovy evangelists.

What exactly is the Problem?
What Goovy makes slower than Java is it's ability to add/intercept methods at runtime and it's slightly different type conversion model. A method invocation in Groovy consists usually of several normal method calls, where the arguments are stored in a array, the classes of the arguments must be retrieved, a key is generated out of them, a hashmap is used to lookup the method and if that fails, then we have to test the available methods for compatible methods, select one of the methods based on the runtime type, create a key for the hasmap and then in the end, do a reflection like call on the method.

Benchmarks usually use the same typs of arguments, so the hasmap kicks in and allows a "fast" method selection. Only counting the method calls, you get around 4 calls before you are able to look in the Hashmap. Then you create the key, do the lookup in the map... given all that I think it is quite a wonder Groovy has such a good performance!

How to make it even faster?
The way to make something faster is either to find an algorithm with a better runtime complexity, or to reduce the constants. If we would not have to create the Object[], we might be able to reduce the costs very much.. I made some meassurments and it seems that invoking a method taking 3 arguments is 3 times slower than a method taking no arguments. That's because we not only have to create the Object[] for the 3 arguments, but we also have three objects which types we need to get, the key becomes longer, and as the key is a String there is a coherence of the length of the string and the number of arguments.

Java does avoid that. Java does select the method at compile time, so there is no Object[], no type extraction and such. Of course the VM checks the types of the parameters, but in case of Groovy the VM does have to do that more than once. The end of the call in Groovy is the same as in Java, but there has much action taken place before. So if we were able to select a method at compile time in groovy, we could avoid that problem completly.

What does it mean?
It means Groovy would be as fast as Java! But that is no general solution. Dynamic typing doesn't allow you to choose the methods at runtime. And the MetaClass facilities won't help here too. so you are changing the semantics of a method call when chanign the typing from dynamic to static.

So what is the Fast Mode then?
The idea is not do dynamic typing every where. In fact there was already a proposal going into that direction. It was even in the codebase, but it was a global switch. I think that is the wrong solution. My solution is to let the programer decide where he wants to have that. For simplicity I will talk about an annotation here, but there was no decision on what to use. Anyway, I would use this annotation for a class, for a method, for a block of code or a single statement. Each so marked part would loose dynamic typing and letting the compiler use static typing to choose the method. This means that the MetaClass is ignored. It does not mean, that we loose any of the addons to the jdk we made. It will still be possible to use "each" or native syntax for BigDecimals, lists and maps. It doesn't mean that you get back checked exceptions or such. I am not sure yet if the compiler should forbid the mixed usage of static typed and dynamic typed arguments, or if it simply should fall back to the normal invocation mode.. Anyway, so instead of asking the MetaClass to tranform the type and make the call the compiler, we will add code to make the type transformation and the method call based on the static types. We loose multimethods then, and we loose the ability to catch unknown methods and properties! So we loose one of the most appealing feature of Groovy. And given the fact, that a builder depends on them we loose them then too. But as you can limit the usage of the fast mode very much it is no problem at all.

Of course this adds much to the complexity of the language.

What about ivokedynamic?
invokedynamic is the great hope of all dynamic languages on the JVM. If it helps us to not to create the Object[] and if it helps us to avoid to create the hasmap key and if it helps us to avoid the type tranformations, then it might be a much better solution than my fast mode. But even if it does all this, it won't be in Java6. It is Java7 at best. And it is not yet defined. The work for this is still in the early stages and while I hope the result will be very good, it is no solution for now.

The Goal
The idea of the "fast mode" is to provide a way to give the maximum performance at certain places. Then you can do calculations in Groovy. I am no fan of using one language for all uasages, but on the other hand I think that such a mode would be very good for Groovy. And it would tell all those benchmark writters out there not to meassure a language with calculations. That is a very stupid thing to do I would say.

The implementation is surely not done in a few minutes and this "fast mode" will not go into Groovy 1.0. But 2.0 or even 1.1 may have that.

Other ideas?
There were other ideas, yes. For exmaple not to simply make the call through the Metaclass, but to let the compiler guess a method and check with the MetaClass if the guess is right. Sadly that doesn't help, because the method selection has still to be done and is the cost intensive factor here. Writing a hotspot engine for Groovy might be a solution. Java is making HotSpot open, so it is theoretically an option. And possible a much better option than anything else. But there are many unclear details.... none has the knowlefge to write something like that, noone knows the API, how is the HotSpot engine added to the normal Java SDK?

We always tried to avoid the need of a special Groovy JVM, even it is jsut the JVM with additional bytecodes. if the HotSpot engine can't be added to any normal JDK then I don't see why this should be any different.

Reality!
I think the space for doing optimizations is big. For example in the "fast mode" we probably don't need to always create a closure. Instead we can inline the closure handling method and the closure itself. But I think only a concrete inplementation of all the possibiilites would be a fair way to tell if something is better. But even if the normal method invocation in Groovy gets a boost fomr a better HotSpot engine, or new bytecode, it only means that the "fast mode" is no longer needed. I would always recommend to use the normal Groovy method invocation and identify the cost intensive parts. Neither will the "fast mode" solve all problems nor will it free you from the usage of profilers.

2 comments:

Anonymous said...

You may want to look at the "sealing" declarations provided by Dylan (see here).
Unfortunately the chapter from the DRM is not particularly easy to understand, but the basic idea is that all information about sealed methods (or more generally sealed domains) is available at compile time, so that the compiler can perform many optimizations which are not normally possible in dynamic languages. Sealing does not change the semantics of the language(*), e.g., multiple dispatch is still possible for sealed methods, but in most cases the compiler can either determine that there is only a single applicable method for a call site or at least generate efficient static dispatch functions.

(*) Of course it changes the semantics in the sense that you cannot add new methods to a sealed domain, or new subclasses to a sealed class. In Groovy it would probably also disable most uses of the MOP.

Jochen "blackdrag" Theodorou said...

In Groovy it somes to the question of doing things static or dynamic. The case the compiler can produce fast code is when doing things static. As for "multiple dispatch", there are many definitions on that. In Groovy it means, that you can overload a method from a parent class and when doing method dispatching the method might be used even if the "static type" is the one of the parent class. Of course, if you forbid subclassing, something like that can not happen. So on final classes we could speed the dispatching up, if there were no MetaClass possibly adding or even intercepting method calls. But only if the method is not overlaoded for the same number of parameters. But when I compare that to the "fast-mode" I described, then this is exactly the case, where they both behave the same way.