Names and Labels
I compensate for the infrequency of my blog posts with their length. Or the other way around. Anyway, I have some good news to report, so let's do that first. The JIT branch of MoarVM (moar-jit), which has been my work for the last few months, was merged into the main branch just this morning, after we found that it doesn't crash and burn while building nqp and rakudo or running the spectests. This means that, with some fiddling, you too can now run the JIT for MoarVM, NQP, and Perl 6. World domination for camelia!
Well, unless we find new bugs, of course. We've had plenty of those in the last few weeks, and most of them had the same cause, which can be summarized simply: semantic use of the interpreter's program counter is not really compatible with JIT compilation. But maybe that requires some elaboration.
Simply put, while interpreting, MoarVM sometimes needs to know exactly where in the program the interpreter is. This happens, for example, when using handlers, which are the mechanism MoarVM uses for try-catch constructs. In the following frame, for example, lines 1 and 4 would be the start and end of the handler, and the block within CATCH would be invoked if do-something-dangerous() actually threw something. On the other hand, if another-dangerous-thing() were to throw something, the CATCH block should clearly not catch it.
0: sub a-handler-frame() {
1:     try {
2:         do-something-dangerous();
3:         CATCH { say($_); }
4:     }
5:     another-dangerous-thing();
6: }
To determine what should be done when either of these dangerous functions raises an error, MoarVM inspects the current program counter of the frame and determines whether or not the try block applies. This works very well in practice, so long as MoarVM is actually interpreting code. When the same code is compiled - as in, JIT compiled - the interpreter is merely used for entering the JIT code, and its instruction pointer never changes as we move through the frame. So the interpreter instruction pointer can no longer be used to tell where we are. As a result, exception handling didn't quite go smoothly.
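To make that lookup concrete, here is a minimal sketch in Python (not MoarVM's actual C code) of how a handler could be selected from a program counter. The names `Handler` and `find_handler` are hypothetical, and the offsets simply reuse the line numbers of the frame above as if they were bytecode offsets.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Handler:
    """A try-block range [start, end) in bytecode offsets, plus its CATCH target."""
    start: int
    end: int
    goto: int


def find_handler(handlers: Sequence[Handler], pc: int) -> Optional[Handler]:
    """Return the first handler whose range covers pc, or None if none applies."""
    for handler in handlers:
        if handler.start <= pc < handler.end:
            return handler
    return None


# Hypothetical offsets: the try block spans offsets 1 up to (not including) 4,
# and the CATCH block lives at offset 3.
handlers = [Handler(start=1, end=4, goto=3)]

in_try = find_handler(handlers, pc=2)       # do-something-dangerous()
outside_try = find_handler(handlers, pc=5)  # another-dangerous-thing()
```

If the program counter stays parked at the frame's entry point, as it does while JIT code is running, a lookup like this gives the same answer for both calls, which is exactly the bug described here.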
A similar problem existed with dynamic variable caches. These are used to make lexical lookup of dynamic variables cheaper by caching them locally in the frame. It is not normally necessary to know where the interpreter is in the frame, except when we're dealing with inlined frames. In short, before inlining, the inlined frames may have had different ideas of the contents of the dynamic lexical cache. Because of this, MoarVM needs to know in which of the inlined frames we're actually working to figure out which cache to use. (NB: I'm not totally sure this explanation is 100% correct. Please correct me if not.) So again, when running the JIT, MoarVM couldn't figure this out and would use the wrong cache. By the way, I should note that jnthn has been extremely helpful in finding the cause of this and several other bugs.
Because the third time is the charm (as I think the saying goes), another, very similar, version of the same bug appeared just a bit earlier with deoptimization. As I had never implemented any operation that caused a 'global deoptimization', I naively thought I wouldn't have to deal with it yet. After all, global deoptimization means that all frames in the call stack have to be deoptimized. And you may have guessed it, but to do that correctly, you have to know precisely where you are in the deoptimizing frame. This one was not only found, but also very helpfully fixed by jnthn.
All this implied that it became necessary for me to just solve this problem - where are we in the JIT code - once and for all. And in fact, parts of a solution to this problem already existed. After all, the JIT already used a special label to store the place we should return to after invoking another frame. So to determine where we are in the program, all we need to do is map those pointers back to the original structures that refer to them - that is to say, the inlined frames, the handlers, and the deoptimization structures. So it was done, just this week. I'd be lying if I said that this went without a hitch, because exception handling in particular presented some challenges, but I think this morning I've ironed out the last issue. And because today is - according to the GSoC 2014 timeline - the 'soft pencils-down date' - in other words, the deadline - we felt it was time to merge moar-jit into master and let you enjoy my work.
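As a rough illustration of that mapping, here is a sketch in Python rather than MoarVM's real C code. It assumes that handlers are described by JIT labels, that every label resolves to an address in the generated machine code, and that the frame remembers the label to return to before each invoke. All names (`JitHandler`, `find_jit_handler`, the label names) and all addresses are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class JitHandler:
    """A handler described by JIT labels instead of bytecode offsets."""
    start_label: str
    end_label: str
    goto_label: str


def find_jit_handler(
    handlers: Sequence[JitHandler],
    label_address: Dict[str, int],
    return_address: int,
) -> Optional[JitHandler]:
    """Return the first handler whose label range contains return_address.

    Assumption: a return address points just past the call, so a call inside
    the try block satisfies start < return_address <= end.
    """
    for handler in handlers:
        start = label_address[handler.start_label]
        end = label_address[handler.end_label]
        if start < return_address <= end:
            return handler
    return None


# Hypothetical addresses of labels in the generated machine code.
label_address = {
    "try_start": 0x10,
    "after_dangerous_call": 0x20,
    "try_end": 0x20,
    "catch": 0x40,
    "after_another_call": 0x58,
}
handlers = [JitHandler("try_start", "try_end", "catch")]

# Before each invoke, the JIT stores the label to return to in the frame;
# that stored label stands in for the interpreter's program counter.
inside = find_jit_handler(
    handlers, label_address, label_address["after_dangerous_call"]
)
outside = find_jit_handler(
    handlers, label_address, label_address["after_another_call"]
)
```

The same kind of table could, under the same assumptions, map a stored label back to an inlined frame or a deoptimization point; only the structure on the other side of the lookup changes.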
And people have! This gist shows the relative speedup caused by spesh and JIT compilation in an admittedly overly simple example. As a counterexample, the compilation of CORE.setting - the most time-intensive part of building rakudo - seems to take slightly longer with the JIT than without it. Still, tight and simple loops such as these do seem to occur, so I hope that in real-world programs the MoarVM JIT will give better performance. Quite possibly not as good as the JVM or other systems, certainly not as good as it could be, but better than it used to be.
There is still quite a lot to be done, of course. Many instructions are not readily compiled by the JIT. Fortunately, many of these can be compiled into function calls, because this is exactly what they are for the interpreter, too. Many people, including timotimo and jnthn, have already added instructions this way. Some instructions may have to be refactored a bit and I'm sure we'll encounter new bugs, but I do hope that my work can be a starting point.
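The idea of compiling an instruction into a function call can be sketched as follows. This is a toy model in Python, not MoarVM's JIT: the op names `concat` and `repeat`, the `OP_TABLE`, and both `interpret` and `jit_compile` are made up for illustration. The point is only that the interpreter and the compiled code end up calling the very same functions.

```python
from typing import Callable, Dict, List, Sequence, Tuple


# Hypothetical ops, each implemented as an ordinary function.
def op_concat(a: str, b: str) -> str:
    return a + b


def op_repeat(s: str, n: int) -> str:
    return s * n


OP_TABLE: Dict[str, Callable[..., str]] = {
    "concat": op_concat,
    "repeat": op_repeat,
}

Instruction = Tuple[str, tuple]


def interpret(program: Sequence[Instruction]) -> List[str]:
    """Interpreter: look up each op by name while running, then call it."""
    return [OP_TABLE[name](*args) for name, args in program]


def jit_compile(program: Sequence[Instruction]) -> Callable[[], List[str]]:
    """Toy 'JIT': resolve every op to its function once, before running."""
    calls = [(OP_TABLE[name], args) for name, args in program]

    def compiled() -> List[str]:
        # No dispatch on op names is left here, only direct function calls.
        return [fn(*args) for fn, args in calls]

    return compiled


program = [("concat", ("moar", "-jit")), ("repeat", ("ab", 2))]
interpreted = interpret(program)
compiled = jit_compile(program)()
```

Adding support for a new instruction in this model means adding one entry to the table, which is roughly why this route is attractive for ops that are already plain function calls in the interpreter.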