Retrospective of the MoarVM JIT

Hi hackers! Today the MoarVM JIT project is nearly 9 years old. I was inspired by Jonathan's presentation reflecting on the development of MoarVM to do the same for the MoarVM JIT, for which I have been responsible. For those who are unfamiliar, what is commonly understood as 'JIT compilation' for virtual machines is performed by two components in MoarVM: a framework for runtime type specialization ('spesh'), and a native code generation backend for the specialized code (the 'JIT'). This post refers only to the native code generation backend component. It, too, is split into two mostly-independent systems: a backend that emits code directly from MoarVM instructions from machine code templates (the 'lego' JIT compiler), and another backend that transforms MoarVM instructions into an expression-based intermediate representation and compiles machine code based on that (the 'expression' compiler). Things that worked well Using DynASM for code generat

Why bother with Scripting?

Many years back, Larry Wall shared his thesis on the nature of scripting. Since recently even Java gained 'script' support I thought it would be fitting to revisit the topic, and hopefully relevant to the perl and raku language community. The weakness of Larry's treatment (which, to be fair to the author, I think is more intended to be enlightening than to be complete) is the contrast of scripting with programming. This contrast does not permit a clear separation because scripts are programs. That is to say, no matter how long or short, scripts are written commands for a machine to execute, and I think that's a pretty decent definition of a program in general. A more useful contrast - and, I think, the intended one - is between scripts and other sorts of programs, because that allows us to compare scripting (writing scripts) with 'programming' (writing non-script programs). And to do that we need to know what other sorts of programs there are. The short

Reverse Linear Scan Allocation is probably a good idea

Hi hackers! First of all, I want to thank everybody who gave such useful feedback on my last post. For instance, I found out that the similarity between the expression JIT IR and the Testarossa Trees IR is quite remarkable, and that they have a fix for the problem that is quite different from what I had in mind. Today I want to write something about register allocation, however. Register allocation is probably not my favorite problem, on account of being both messy and thankless. It is a messy problem because - aside from being NP-hard to solve optimally - hardware instruction sets and software ABIs introduce all sorts of annoying constraints. And it is a thankless problem because the cases in which a good register allocator is useful - for instance, when there are lots of intermediate values used over a long stretch of code - are fairly rare. Much more common are the cases in which either there are trivially sufficient registers, or ABI constraints force a spill to me

Something about IR optimization

Hi hackers! Today I want to write about optimizing IR in the MoarVM JIT, and also a little bit about IR design itself. One of the (major) design goals for the expression JIT was to have the ability to optimize code over the boundaries of individual MoarVM instructions. To enable this, the expression JIT first expands each VM instruction into a graph of lower-level operators. Optimization then means pattern-matching those graphs and replacing them with more efficient expressions. As a running example, consider the idx operator. This operator takes two inputs (base and element) and a constant parameter scale and computes base+element*scale. This represents one of the operands of an 'indexed load' instruction on x86, typically used to process arrays. Such instructions allow one instruction to be used for what would otherwise be two operations (computing an address and loading a value). However, if the element of the idx operator is a constant, we can replace it instead
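To make the running example concrete, here is a minimal sketch of that kind of rewrite in Python. It is not the MoarVM implementation: the node classes, the `Offset` node (a base plus a constant byte offset) and the `fold_idx` and `evaluate` helpers are all hypothetical names invented for illustration. The only thing taken from the text is that idx computes base+element*scale, and that a constant element lets the multiplication happen at compile time.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Const:
    """A compile-time constant (hypothetical node type)."""
    value: int


@dataclass(frozen=True)
class Reg:
    """A value only known at run time, e.g. a pointer in a register."""
    name: str


@dataclass(frozen=True)
class Idx:
    """The idx operator from the text: base + element * scale."""
    base: "Node"
    element: "Node"
    scale: int


@dataclass(frozen=True)
class Offset:
    """Hypothetical replacement node: base + a constant byte offset."""
    base: "Node"
    offset: int


Node = Union[Const, Reg, Idx, Offset]


def fold_idx(node: Node) -> Node:
    """Rewrite idx nodes whose element is constant into Offset nodes."""
    if isinstance(node, Idx):
        base = fold_idx(node.base)
        element = fold_idx(node.element)
        if isinstance(element, Const):
            # element * scale is known now, so no multiply is needed later.
            return Offset(base, element.value * node.scale)
        return Idx(base, element, node.scale)
    if isinstance(node, Offset):
        return Offset(fold_idx(node.base), node.offset)
    return node


def evaluate(node: Node, env: Dict[str, int]) -> int:
    """Reference interpreter, used to check that a rewrite keeps the value."""
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Reg):
        return env[node.name]
    if isinstance(node, Idx):
        return evaluate(node.base, env) + evaluate(node.element, env) * node.scale
    return evaluate(node.base, env) + node.offset


before = Idx(Reg("array"), Const(3), 8)
after = fold_idx(before)
```

Here `after` is an `Offset` node on the same base with the product of element and scale precomputed, and `evaluate` gives the same address for both trees. An idx whose element is a `Reg` is returned unchanged, since nothing is known about it at compile time.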

A short post about types and polymorphism

Hi all. I usually write somewhat long-winded posts, but today I'm going to try and make an exception. Today I want to talk about the expression template language used to map the high-level MoarVM instructions to low-level constructs that the JIT compiler can easily work with. This 'language' was designed back in 2015 subject to three constraints. It should make it easy to develop 'templates' for MoarVM instructions, so we can map the ~800 or so different instructions supported by the interpreter to something the JIT compiler can work with. It should be simple to process and analyze; specifically, it should be suitable as input to the instruction selection process (the tiler). It should be simple to implement, both from the frontend (meaning the perl program that compiles a template file to a C header) and the backend (meaning the C code that combines templates into the IR that is compiled). Recently I've been working on adding support for floating point

New years post

Hi everybody! I recently read jnthn's Perl 6 new years resolutions post, and I realized that this was an excellent example to emulate. So here I will attempt to share what I've been doing in 2018 and what I'll be doing in 2019. In 2018, aside from the usual refactoring, bugfixing and the like: I added support for the fork() system call in the MoarVM backend. I removed the 'invokish' control flow mechanism and replaced it with controlled return address modification. I requested a grant from the Perl Foundation aiming to complete the expression JIT compiler with floating point support, irregular registers and the like. So 2019 starts with me trying to complete the goals specified in that grant request. I've already partially completed one goal (as explained in the interim report) - ensuring that register encoding works correctly for SSE registers in DynASM. Next up is actually ensuring support for SSE (and floating point) registers in the JIT, which is su

A future for fork(2)

Hi hackers. Today I want to write about new functionality that I've been developing for MoarVM that has very little to do with the JIT compiler. But it is still about VM internals so I guess it will fit. Many months ago, jnthn wrote a blog post on the relation between perl 5 and perl 6. And as a longtime and enthusiastic perl 5 user - most of the JIT's compile time support software is written in perl 5 for a reason - I wholeheartedly agree with the 'sister language' narrative. There is plenty of room for all sorts of perls yet, I hope. Yet one thing kept itching me: Moreover, it’s very much the case that Perl 5 and Perl 6 make different trade-offs. To pick one concrete example, Perl 6 makes it easy to run code across multiple threads, and even uses multiple threads internally (for example, performing optimization and JIT compilation on a background thread). Which is great…except the only winning move in a game involving both threads and fork() is not to pl