It's wrap-up time! This will be my final post in the GSoC 2014 program, because the GSoC 2014 progam ends 19:00 UTC today, so that's early enough that I won't be writing another post today. It's hard enough to write a post every week as it is. So what I'd like to do today is wrap-up what we've reached and what I've learned in the past three months.
Let's start with a summary of what has been reached. MoarVM has an optional JIT compiler based on spesh, which is the optimization framework I've mentioned often by now. The JIT compiled code cannot handle all instruction yet, but it can handle loops, invocations, perl6 container ops, some regular expressions, expressions and deoptimisation. Reported benefits are anywhere between nothing and 50% faster code, which is awesome. I assume very few people have a lot of truly computation-intensive code running on rakudo perl6 yet, but I suspect that is where it would help most.
The code-generator that we currently have is not awesome. It is in fact pretty naive. Values that are computed in a CPU register must be stored in the MoarVM register space before we can use it in another instruction. We make no attempt to detect loops and allocate registers, we don't do common sub-expression elimination, nor do we make any attempt to dynamically select different instructions.
Well, one aspect of it is awesome, and that is that by using DynASM, there is a direct correspondence between the code we write and the code that is executed. I cannot over-emphasize how user-friendly this really is. Writing a compiler as if I'm writing regular x64 assembly is a great benefit and allows for much faster development times.
Anyway, this discussion of the code generator brings me to another topic I'd like to discuss, and that's why we wrote our own code generator rather than borrowing the one from LuaJIT (or LLVM, or v8 for that matter). I think this is a very fair question and so it deserves some discussion. The first point I'd like to call out is the idea that LLVM (or any of the alternatives) are magic black boxes in which you enter intermediate-level code, wait, and out rolls totally optimal machine code that will run faster than C. I'm obviously totally biased but based on my limited experience I'd say that's not how this works.
First of all, code generation is hardly the most difficult part of doing a JIT compiler. By far most of the complexity in this project has come from integration with the existing MoarVM infrastructure, and that's where the most complex bugs have come from, too. (I've written at length about such bugs already). For example, MoarVM call frame mechanics involve a lot of bookkeeping that simply doesn't exist for C call frames. Likewise, the JIT compiler must make objects available and understandable for the garbage collector just as the interpreter does (or change the GC to understand how to read a JITted frame, not an easy task by itself). Exception throwing and handling form a similar challenge, one which I've talked about at length in another post.
I would also note that impressive as the results of the aforementioned projects may be, what they do isn't magic. Algorithms to do code generation are
really quite well-known and researched, and there is no reason why we
couldn't - in time - implement them. More important than code generation however seems to be optimizing memory access, as it is my suspicion that this is where most of the time is actually spent (Timotimo is actually looking into this as I'm writing this). Because none of these backends know anything about perl6, none of them can use any of the special properties of perl6 to optimize code, which we can (and do).
In conclusion, the choice of a code generation backend is a trade-off like any other. There is no magical silver bullet that will make your code run fast. You could argue against writing our own (even using the help of DynASM) and such an argument could be fair. I would argue as I have above that the trade-off was such that using DynASM was right, especially given the ease of use that it provides.
So what are the next steps for the MoarVM JIT? The first thing is to keep adding instructions while trying to remain bug-free. This isn't as easy as it looks because it seems like every third instruction we add enables the execution of poorly-tested branches that contain new bugs :-). After that, I'd still like to add full register selection for x64 by modifying DynASM. This won't be a very simple task for me since I know very little of x64 instruction encoding or of lua. With that, we can add proper instruction selection and low-level optimization, which should result in a speed-up. However, the most important work is probably to be done in spesh by transforming polymorphic code in simpler, monomorphic, efficient code. For instance, the instruction to test the truth-value of an iterator could be transformed into an iterator-specific instruction that would be much simpler, possibly eliminating a function call. Such transformations should also benefit users who do not use the JIT compiler yet. It seems to me spesh is the place where MoarVM can gain most.