This is something I was talking about with a coworker today, and I couldn’t quickly find a friendly writeup of it online, so I decided to write my own. (Yes, we do have a lot of people who enjoy conversations like this at foursquare, which is one of the many reasons that it’s such a great place to work. And we are hiring engineers! Feel free to get in touch with me if you’re interested, or just check our website!)
Anyway – what we were discussing was something called partial evaluation (PE). The idea of PE is almost like an eager form of currying. Basically, you have a two-parameter function, f(x, y). You know that you're going to invoke it a bunch of times with the same value of x, but different values of y. So what you want to do is create a new function, fx(y), which evaluates as much of f as possible using the known parameter.
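To make the contrast with currying concrete, here's a minimal sketch of my own (in the same Scala syntax I'll use below; the function f is hypothetical):

def f(x: Int, y: Int): Int = (x * x) + y

// Currying just delays application -- x * x is recomputed on every call:
val fCurried: Int => Int = y => f(3, y)

// Partial evaluation does that work up front: knowing x = 3, a partial
// evaluator can compute x * x = 9 once and emit the residual function:
val f3: Int => Int = y => 9 + y

fCurried(4) == f3(4)  // both 13

Both functions compute the same results; the difference is that the partially evaluated one never touches x again.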
The general mechanism is clearer in programmatic form. Since I pretty much live in Scala these days, I'll use Scala syntax. We can abstractly represent a program as an object which can be run with a list of arguments, and which returns some value as its result.
trait Program {
  def run(inputs: List[String]): String
}
If that’s a program, then a partial evaluator would be:
object PartialEvaluator {
  def specialize(prog: Program, knownInputs: List[String]): Program = {
    ...
  }
}
What it does is take a program and a partial input, and return a new program: the original program, specialized for the known inputs supplied to the partial evaluator. So, for example, imagine that you have a program like:
object Silly extends Program {
  def run(inputs: List[String]): String = {
    val x: Int = inputs(0).toInt
    val y: Int = inputs(1).toInt
    if (x % 2 == 0) {
      (y * y).toString
    } else {
      ((y + 1) * (y + 1)).toString
    }
  }
}
This is obviously a stupid program, but it's good for an example because it's so simple. If we're going to run Silly a bunch of times with 1 as the first parameter, but with different second parameters, then we could generate a specialized version of Silly:
val sillySpecialOne = PartialEvaluator.specialize(Silly, List("1"))
Here's where the interesting part comes in, and where it's really different from currying. A partial evaluator evaluates everything that it possibly can, given the inputs that it's supplied. So the value produced by the specializer would be:
object Silly_1 extends Program {
  def run(inputs: List[String]): String = {
    val y: Int = inputs(0).toInt
    ((y + 1) * (y + 1)).toString
  }
}
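A quick sanity check, using the names above: the specialized program has to agree with the original whenever the original's first input is "1".

assert(Silly.run(List("1", "3")) == sillySpecialOne.run(List("3")))  // both "16"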
This can be a really useful thing to do; it can turn into a huge optimization. (Which is, of course, why people are interested in it.) In compiler terms, it's sort of like an uber-advanced form of constant propagation.
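To make the constant-propagation analogy concrete, here's a toy sketch of my own (not from any real library): a miniature partial evaluator over a tiny expression tree, which folds away everything the known bindings let it fold and leaves the rest symbolic.

sealed trait Expr
case class Lit(n: Int) extends Expr
case class Var(name: String) extends Expr
case class Add(l: Expr, r: Expr) extends Expr
case class Mul(l: Expr, r: Expr) extends Expr

object ToyPE {
  // Rewrite the expression, folding every subexpression that the
  // known variable bindings make fully computable.
  def specialize(e: Expr, known: Map[String, Int]): Expr = e match {
    case Lit(_) => e
    case Var(n) => known.get(n).map(Lit).getOrElse(e)
    case Add(l, r) => (specialize(l, known), specialize(r, known)) match {
      case (Lit(a), Lit(b)) => Lit(a + b)
      case (sl, sr)         => Add(sl, sr)
    }
    case Mul(l, r) => (specialize(l, known), specialize(r, known)) match {
      case (Lit(a), Lit(b)) => Lit(a * b)
      case (sl, sr)         => Mul(sl, sr)
    }
  }
}

// ToyPE.specialize(Mul(Add(Var("x"), Lit(2)), Var("y")), Map("x" -> 4))
//   ==> Mul(Lit(6), Var("y"))   -- (x + 2) folded to 6; y left symbolic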
But the cool thing that we were talking about is going a bit meta. What happens if you run a partial evaluator on a programming language interpreter?
object MyInterpreter extends Program {
  def run(inputs: List[String]): String = {
    val code = inputs(0)
    val inputsToCode = inputs.tail
    interpretProgram(code, inputsToCode)
  }

  def interpretProgram(code: String, inputs: List[String]): String = {
    ...
  }
}
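To make that concrete, here's one possible body for interpretProgram, for a made-up micro-language M (entirely my own invention): the code is a string of single-character instructions applied to an integer accumulator seeded from the first input.

  def interpretProgram(code: String, inputs: List[String]): String = {
    var acc = inputs(0).toInt
    for (op <- code) op match {
      case 'd' => acc = acc * 2    // double
      case 'i' => acc = acc + 1    // increment
      case 's' => acc = acc * acc  // square
    }
    acc.toString
  }

Specializing MyInterpreter on the fixed code "dis" should, ideally, evaluate the dispatch loop away entirely, leaving a residual L program that just computes ((x * 2) + 1) squared.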
We can run our partial evaluator to generate a version of the interpreter that’s specialized for the code that the interpreter is supposed to execute:
- We have an interpreter for language M, written in language L.
- We have a partial evaluator for language L.
- We run the partial evaluator on the interpreter, with a program written in M that we want to interpret: PartialEvaluator.specialize(M_Interpreter, List(M_Code)).
- The partial evaluator produces a program written in L.
So the partial evaluator took a program written in M, and transformed it into a program written in L!
We can go a step further – and generate an M-to-L compiler. How? By running the partial evaluator on itself. That is, run the partial evaluator like this: PartialEvaluator.specialize(PartialEvaluator, List(source(M_Interpreter))). The result is a program which takes an M program as input, and generates an L program: that is, it's an M-to-L compiler!
We can go yet another step, and turn the partial evaluator into a compiler generator: PartialEvaluator.specialize(PartialEvaluator, List(source(PartialEvaluator))). What we get is a program which takes an interpreter written in L, and generates a compiler from it.
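Putting the three projections side by side, in the vocabulary of this post (a sketch, with one caveat the prose glosses over: for self-application, the partial evaluator itself has to be packaged as a Program whose first input is the source text of its subject program; I'll call that hypothetical packaging peAsProgram, and source(p), as above, stands for the L source text of p):

// 1st projection: specialize the interpreter to one M program.
// The result is an L program equivalent to M_Code -- compiled code.
val target = PartialEvaluator.specialize(M_Interpreter, List(M_Code))

// 2nd projection: specialize the partial evaluator to the interpreter.
// The result is an M-to-L compiler.
val compiler = PartialEvaluator.specialize(peAsProgram, List(source(M_Interpreter)))

// 3rd projection: specialize the partial evaluator to itself.
// The result is a compiler generator: feed it any interpreter
// written in L, and it hands back a compiler for that language.
val cogen = PartialEvaluator.specialize(peAsProgram, List(source(peAsProgram)))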
It's possible to generate genuinely useful tools this way: people have implemented Lisp compilers using exactly this technique. For example, you can see someone walk through a simple version here.
Neat trick, indeed! Here's what I think would qualify as a friendly writeup, complete with amusing drawings.
Oh, good old Futamura projections! Are they not part of normal university CS courses?
I am interested in how this concept applies to the PyPy project.
For those not familiar with it, PyPy started as a project to write a Python interpreter in a high-level language (instead of a low-level language like C). Specifically, they chose to write their new Python interpreter in the language Python. (Hence the name “PyPy”. The logo is, of course, an Ouroboros.) The project has evolved somewhat: they managed to leverage their interpreter into being a kind of JIT compiler. It can take frequently-executed code (inner loops), complete with actual execution data about data types and branches taken, and replace it with optimized assembly code. The PyPy project is now FASTER (for most situations) than the traditional C-based implementation of Python.
My understanding of PyPy is that the optimization it performs is extremely close to what you are describing with partial evaluators. What the PyPy team actually built (if I understand it correctly) is an optimizer that takes programs written in a limited subset of Python (“RPython”) and generates optimized code for them. It works particularly well on certain kinds of programs, namely language interpreters. They then run this tool on an implementation of Python which is written (mostly) in RPython, and out pops a Python interpreter that runs faster than the one written in C. Of course, you can ALSO use their tool to produce optimized, JIT-enabled interpreters for other languages, if you write an interpreter for those languages in RPython and run the tool on it.
Can anyone else who understands it better than me comment on whether I’ve got the right idea and if so, what the implications are?
You might be interested in ‘Synthesis’, a little operating system built in 1992 which does similar things.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.4871
You might also be interested in this paper:
K.M. Kahn and M. Carlsson, “The compilation of Prolog programs without the use of a Prolog compiler,” in Proceedings of the International Conference on Fifth Generation Computer Systems 1984, ICOT, 1984, pp. 348–355.
From my understanding, Michael Chermside is correct about PyPy.
If anyone has corrections to what I write below, I would greatly appreciate a reply with the correction.
PyPy is full of simplifications for the special cases that often arise in real-world code running on real-world input: e.g. special case of x == 0, x >= 1, empty data structures, etc.
PyPy simplifies your code _running through the interpreter_. PyPy doesn’t even bother simplifying code until the interpreter loop deals with a loop in your code, and doesn’t bother separating interpreter code from user code.
After applying the special-case simplifications to the interpreter code + user code + interpreter loop + user loop + real-world input, you (often enough) get code that can be implemented with fairly tight machine code (the machine code is generated by a C compiler; currently PyPy uses a subset of the C language as a portable machine code). Heuristics keep PyPy from trying to optimize anything hairy. There is a minor complication in allowing the tight machine code to fall back to the interpreter when the code recognizes that one of the assumptions backing the fast code has been violated.
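To illustrate that fall-back mechanism in the vocabulary of the original post (a Scala sketch of my own; PyPy's real guards operate on traced machine code, not pattern matches):

// The fast path is compiled under an assumption -- here, "both
// arguments are Ints". A guard checks the assumption on entry and
// bails out to the generic, fully dynamic path when it fails.
def genericAdd(x: Any, y: Any): Any = ???             // slow: full dynamic dispatch
def specializedAdd(x: Any, y: Any): Any = (x, y) match {
  case (a: Int, b: Int) => a + b                      // guard passed: tight code
  case _                => genericAdd(x, y)           // guard failed: deoptimize
}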
Counter-intuitively, this all works very well with dynamic languages running real-world tasks. (Dynamic languages are languages that provide ease of coding, but make very little type information available prior to execution.)
See http://speed.pypy.org/ and make your own judgement. The project has matured to the point where any production code that does not speed up is considered a bug! This shows that the project has progressed a great deal past simply being a highly-compliant Python implementation with a few special-case speedups!
The flexible approach described above lets you write PyPy in a high-level language (PyPy is written in Python mixed with critical pieces of RPython, a restricted subset of Python that can be effectively replaced with fairly tight machine code). Also, you can make radical changes to large chunks of PyPy and still get the benefit of the optimizations worked out beforehand. The typical approach to writing a high-quality optimizing JIT for a dynamic language means a lot of drudgery for any change in the system or the target machine; with PyPy, much of that is automated. This is being used to good effect to add transactions on mutable data structures shared across cores to the Python language; implementing such a thing with a high-quality optimizing JIT would have been madness in a less flexible approach.