In this post we'll write a Python to C compiler in Python. This is especially easy to do since Python has a builtin parser library and because a number of CPython internals are exposed for extension writers.
By the end of this post, in a few hundred lines of Python, we'll be able to compile and run the following program:
$ cat tests/recursive_fib.py
def fib(n):
if n == 0 or n == 1:
return n
return fib(n - 1) + fib(n - 2)
def main():
print(fib(40))
$ python3 pyc tests/recursive_fib.py
$ ./bin/a.out
102334155
This post implements an extremely small subset of Python and completely gives up on even trying to manage memory because I cannot fathom manual reference counting. Maybe some day I'll find a way to swap in an easy GC like Boehm.
Source code for this project is available on Github.
Dependencies
We'll need Python3, GCC, libpython3, and clang-format.
On Fedora-based systems:
$ sudo dnf install gcc python3-devel clang-format python3
And on Debian-based systems:
$ sudo apt install gcc python3-dev clang-format python3
This program will likely work as well on Windows, Mac, FreeBSD, etc. but I haven't gone through the trouble of testing this (or providing custom compiler directives). Pull requests welcome!
A hand-written first-pass
Before we get into the compiler, let's write the fibonacci program by hand in C using libpython.
As described in the Python embedding
guide
we'll need to include libpython and initialize it in
our main.c
:
#define PY_SSIZE_T_CLEAN
#include <Python.h>
int main(int argc, char *argv[]) {
Py_Initialize();
return 0;
}
To compile against libpython, we'll use
python3-config installed
as part of python3-devel
to tell us what should be linked
at each step during compilation.
$ gcc -c -o main.o $(python3-config --cflags) main.c
$ gcc $(python3-config --ldflags) main.o
$ ./a.out; echo $?
0
Cool! Now as we think about translating the fibonacci implementation,
we want to keep everything as Python objects for as long as
possible. This means passing and receiving
PyObject* to and from
all functions, and converting all C integers to
PyLong*, a "subtype" of
PyObject*
. You can imagine that everything in Python is
an object
until you operate on it.
For more information on objects in Python, check out the Data model page in Python docs.
To map a C integer to a PyLong*
we use
PyLong_FromLong. To
map in reverse, we use
PyLong_AsLong.
To compare two PyObject*
s we can use
PyObject_RichCompareBool
which will handle the comparison regardless of the type of the two
parameters. Without this helper we'd have to write complex checks to
make sure that the two sides are the same and if they are, unwrap them
into their underlying C value and compare the C value.
We can use PyNumber_Add and PyNumber_Subtract for basic arithmetic, and there are many similar helpers available to us for operations down the line.
Now we can write a translation:
#define PY_SSIZE_T_CLEAN
#include <Python.h>
PyObject* fib(PyObject* n) {
PyObject* zero = PyLong_FromLong(0);
PyObject* one = PyLong_FromLong(1);
if (PyObject_RichCompareBool(n, zero, Py_EQ) || PyObject_RichCompareBool(n, one, Py_EQ)) {
return n;
}
PyObject* left = fib(PyNumber_Subtract(n, one));
PyObject* two = PyLong_FromLong(2);
PyObject* right = fib(PyNumber_Subtract(n, two));
return PyNumber_Add(left, right);
}
int main(int argc, char *argv[]) {
Py_Initialize();
PyObject* res = fib(PyLong_FromLong(7)); // Should be 13
return PyLong_AsLong(res);
}
Compile and run it:
$ gcc -c -o main.o $(python3-config --cflags) main.c
$ gcc $(python3-config --ldflags) main.o
$ ./a.out; echo $?
13
That's great! But we cheated in one place. We assumed that the input
to the fib
function was an integer, and we propagated
that assumption everywhere we wrote PyNumber_*
operations. When we write the compiler, we'll need to check that both
arguments are an integer before we call a numeric helper, otherwise we
may need to call a string concatenation helper or something else
entirely.
Compiler Architecture
We'll break the code into four major parts:
libpyc.c
: helper functions for generated codepyc/context.py
: utilities for scope and writing code in memorypyc/codegen.py
: for generating C code from a Python ASTpyc/__main__.py
: the entrypoint
When I'm writing a new compiler using an existing parser I almost always start with the entrypoint and code generator so I can explore the AST. However, it's easiest to explain the code if we start with the utilities first.
We'll also want an empty pyc/__init__.py
.
libpyc.c
This C file will contain three helper functions for safely adding, subtracting, and printing. It will be concatenated to the top of the generated C file. We'll only support integers for now but this structure sets us up for supporting more types later on.
We'll use PyLong_Check before calling number-specific methods.
#define PY_SSIZE_T_CLEAN
#include <Python.h>
inline PyObject* PYC_Add(PyObject* l, PyObject* r) {
// TODO: allow __add__ override
// Includes ints and bools
if (PyLong_Check(l) && PyLong_Check(r)) {
return PyNumber_Add(l, r);
}
// TODO: handle str, etc.
// TODO: throw exception
return NULL;
}
inline PyObject* PYC_Sub(PyObject* l, PyObject* r) {
// TODO: allow __add__ override
// Includes ints and bools
if (PyLong_Check(l) && PyLong_Check(r)) {
return PyNumber_Subtract(l, r);
}
// TODO: handle str, etc.
// TODO: throw exception
return NULL;
}
inline PyObject* PYC_Print(PyObject* o) {
PyObject_Print(o, stdout, Py_PRINT_RAW);
printf("\n");
return Py_None;
}
That's it! We could generate these as strings in Python but it gets
hairy to do so. By using a dedicated C file, we can take advantage of
syntax highlighting since this file is only C code. And since we've
marked all functions as inline
, there's no runtime cost
to using not embedding these as strings in Python.
pyc/context.py
This file will contain a Context
class for managing
identifiers in scope and for proxying to a Writer
class
that contains helpers for writing lines of C code.
We'll have two instances of the Writer
class in
Context
so that we can write to a body (or
current/primary) region and an initialization region.
The initialization region is necessary in case there are any variables
declared at the top-level. We can't initialize these variables in C
outside of a function since every PyObject*
must be
created after calling Py_Initialize
. This section will be
written into our C main
function before we enter a
compiled Python main
function.
import copy
class Writer():
content = ""
def write(self, exp: str, indent: int = 0):
self.content += (" " * indent) + exp
def writeln(self, stmt: str, indent: int = 0):
self.write(stmt + "\n", indent)
def write_statement(self, stmt: str, indent: int = 0):
self.writeln(stmt + ";", indent)
class Context():
initializations = Writer()
body = Writer()
indentation = 0
scope = 0
ret = None
namings = {}
counter = -1
def __getattr__(self, name: str) -> object:
# Helpers to avoid passing in self.indentation every time
outputs = [initializations", "body"]
for output in outputs:
if name.startswith(output):
return lambda s, i=None: getattr(getattr(self, output), name[len(output)+1:])(s, i if i is not None else self.indentation)
return object.__getattr__(self, name)
def get_local(self, source_name: str) -> dict:
return self.namings[source_name]
def register_global(self, name: str, loc: str):
self.namings[name] = {
"name": loc,
"scope": 0,
}
def register_local(self, local: str = "tmp") -> str:
self.counter += 1
self.namings[local] = {
"name": f"{local}_{self.counter}",
# naming dictionary is copied, so we need to capture scope
# at declaration
"scope": self.scope,
}
return self.namings[local]["name"]
def copy(self):
new = copy.copy(self)
# For some reason copy.deepcopy doesn't do this
new.namings = dict(new.namings)
return new
def at_toplevel(self):
return self.scope == 0
This is all pretty boring boilerplate. Let's move on.
pyc/main.py
The entrypoint is responsible for reading source code, parsing it, calling the code generator, writing the source code to a C file, and compiling it.
First, we read and parse the source code:
import ast
import os
import subprocess
import shutil
import sys
from context import Context
from codegen import generate
BUILTINS = {
"print": "PYC_Print",
}
def main():
target = sys.argv[1]
with open(target) as f:
source = f.read()
tree = ast.parse(source, target)
Then we write libpyc.c
into the body, register builtins,
and run code generation:
...
def main()
...
ctx = Context()
with open("libpyc.c") as f:
ctx.body_write(f.read() + "\n")
for builtin, fn in BUILTINS.items():
ctx.register_global(builtin, fn)
generate(ctx, tree)
Next, we create a clean output directory and write
main.c
with the generated code and a main
function to initialization Python and any global variables:
...
def main():
...
# Create and move to working directory
outdir = "bin"
shutil.rmtree(outdir, ignore_errors=True)
os.mkdir(outdir)
os.chdir(outdir)
with open("main.c", "w") as f:
f.write(ctx.body.content)
main = ctx.namings.get("main")["name"]
f.write(f"""int main(int argc, char *argv[]) {{
Py_Initialize();
// Initialize globals, if any.
{ctx.initializations.content}
PyObject* r = {main}();
return PyLong_AsLong(r);
}}""")
Finally, we run clang-format
and gcc
against
the generated C code:
...
def main():
...
subprocess.run(["clang-format", "-i", "main.c"])
cflags_raw = subprocess.check_output(["python3-config", "--cflags"])
cflags = [f.strip() for f in cflags_raw.decode().split(" ") if f.strip()]
cmd = ["gcc", "-c", "-o", "main.o"] + cflags + ["main.c"]
subprocess.run(cmd)
ldflags_raw = subprocess.check_output(["python3-config", "--ldflags"])
ldflags = [f.strip() for f in ldflags_raw.decode().split(" ") if f.strip()]
cmd = ["gcc"] + ldflags + ["main.o"]
subprocess.run(cmd)
All together:
import ast
import os
import subprocess
import shutil
import sys
from context import Context
from codegen import generate
BUILTINS = {
"print": "PYC_Print",
}
def main():
target = sys.argv[1]
with open(target) as f:
source = f.read()
tree = ast.parse(source, target)
ctx = Context()
with open("libpyc.c") as f:
ctx.body_write(f.read() + "\n")
for builtin, fn in BUILTINS.items():
ctx.register_global(builtin, fn)
generate(ctx, tree)
# Create and move to working directory
outdir = "bin"
shutil.rmtree(outdir, ignore_errors=True)
os.mkdir(outdir)
os.chdir(outdir)
with open("main.c", "w") as f:
f.write(ctx.body.content)
main = ctx.namings.get("main")["name"]
f.write(f"""int main(int argc, char *argv[]) {{
Py_Initialize();
// Initialize globals, if any.
{ctx.initializations.content}
PyObject* r = {main}();
return PyLong_AsLong(r);
}}""")
subprocess.run(["clang-format", "-i", "main.c"])
cflags_raw = subprocess.check_output(["python3-config", "--cflags"])
cflags = [f.strip() for f in cflags_raw.decode().split(" ") if f.strip()]
cmd = ["gcc", "-c", "-o", "main.o"] + cflags + ["main.c"]
subprocess.run(cmd)
ldflags_raw = subprocess.check_output(["python3-config", "--ldflags"])
ldflags = [f.strip() for f in ldflags_raw.decode().split(" ") if f.strip()]
cmd = ["gcc"] + ldflags + ["main.o"]
subprocess.run(cmd)
main()
Done!
pyc/codegen.py
Lastly we write the translation layer from Python AST to C. We'll break this out into 10 helper functions. It is helpful to have the AST spec for reference.
1/10: generate
The entrypoint of the code generator is generate(ctx: Context,
exp)
. It generates code for any object with a body
attribute storing a list of statements. This function will generate
code for objects like modules, function bodies, if bodies, etc.
The statements we'll support to begin are:
ast.Assign
ast.FunctionDef
ast.Return
ast.If
- and
ast.Expr
For each statement, we'll simply pass on generation to an associated helper function. In the case of expression generation though, we'll also add a noop operation on the result of the expression otherwise the compiler will complain about an unused variable.
def generate(ctx: Context, module):
for stmt in module.body:
if isinstance(stmt, ast.Assign):
generate_assign(ctx, stmt)
elif isinstance(stmt, ast.FunctionDef):
generate_function_def(ctx, stmt)
elif isinstance(stmt, ast.Return):
generate_return(ctx, stmt)
elif isinstance(stmt, ast.If):
generate_if(ctx, stmt)
elif isinstance(stmt, ast.Expr):
r = generate_expression(ctx, stmt.value)
ctx.body_writeln("// noop to hide unused warning")
ctx.body_write_statement(f"{r} += 0")
else:
raise Exception(f"Unsupported statement type: {type(stmt)}")
Remember to throw exceptions aggressively otherwise you'll have a bad time debugging programs using new syntax.
Let's dig into these helpers.
2/10: generate_assign
To generate assignment code, we need to check if we're at the
top-level or not. If we're at the top-level we can declare the
variable but we can't initialize it yet. So we add the initialization
code to the initialization
section of the program.
If we're not at the top-level, we can declare and assign in one statement.
Before doing either though, we register the variable name so we can get a safe local name to use in generated code. Then we compile the right-hand side so we can assign it to the left-hand side.
import ast
from context import Context
def initialize_variable(ctx: Context, name: str, val: str):
if ctx.at_toplevel():
decl = f"PyObject* {name}"
ctx.body_write_statement(decl, 0)
init = f"{name} = {val}"
ctx.initializations_write_statement(init)
else:
ctx.body_write_statement(f"PyObject* {name} = {val}")
def generate_assign(ctx: Context, stmt: ast.Assign):
# TODO: support assigning to a tuple
local = ctx.register_local(stmt.targets[0].id)
val = generate_expression(ctx, stmt.value)
initialize_variable(ctx, local, val)
We're going to need to implement generate_expression
to
make this work.
3/10: generate_expression
Just like for statements in generate
, there are a few
kinds of expressions we need to implement:
ast.Num
ast.BinOp
ast.BoolOp
ast.Name
ast.Compare
- and
ast.Call
For ast.Num
, we just need to wrap the literal number as a
PyLong*
. And for ast.Name
we just need to
look up the local name in context. Otherwise we delegate to more
helper functions.
def generate_expression(ctx: Context, exp) -> str:
if isinstance(exp, ast.Num):
# TODO: deal with non-integers
tmp = ctx.register_local("num")
initialize_variable(ctx, tmp, f"PyLong_FromLong({exp.n})")
return tmp
elif isinstance(exp, ast.BinOp):
return generate_bin_op(ctx, exp)
elif isinstance(exp, ast.BoolOp):
return generate_bool_op(ctx, exp)
elif isinstance(exp, ast.Name):
return ctx.get_local(exp.id)["name"]
elif isinstance(exp, ast.Compare):
return generate_compare(ctx, exp)
elif isinstance(exp, ast.Call):
return generate_call(ctx, exp)
raise Exception(f"Unsupported expression: {type(exp)}")
For every code generation helper that is an expression, we store the expression in a local variable and return the variable's name so that parent nodes in the AST can refer to the child. This can result in inefficient code generation (useless assignment) but that's not really a big deal for a project like this and will likely be optimized away by GCC anyway. The more annoying aspect is that useless assignment just makes the generated code harder to read.
4/10: generate_bin_op
For binary operators we need to support addition and
subtraction. Other binary operators like equality or and/or are
represented in ast.Compare
and ast.BoolOp
.
This is easy to write because we already prepared helpers in
libpyc.c
: PYC_Sub
and PYC_Add
.
def generate_bin_op(ctx: Context, binop: ast.BinOp) -> str:
result = ctx.register_local("binop")
l = generate_expression(ctx, binop.left)
r = generate_expression(ctx, binop.right)
if isinstance(binop.op, ast.Add):
ctx.body_write_statement(f"PyObject* {result} = PYC_Add({l}, {r})")
elif isinstance(binop.op, ast.Sub):
ctx.body_write_statement(f"PyObject* {result} = PYC_Sub({l}, {r})")
else:
raise Exception(f"Unsupported binary operator: {type(binop.op)}")
return result
Easy enough.
5/10: generate_bool_op
We only need to support or
for the fibonacci program, but
or
in Python is more complicated than in C. In Python,
the first value to be truthy short-circuits the expression and the
result is its value, not True
.
We'll use goto
to short-circuit and we'll use
PyObject_IsTrue
to do the truthy check:
def generate_bool_op(ctx: Context, boolop: ast.BoolOp) -> str:
result = ctx.register_local("boolop")
ctx.body_write_statement(f"PyObject* {result}")
if isinstance(boolop.op, ast.Or):
done_or = ctx.register_local("done_or")
for exp in boolop.values:
v = generate_expression(ctx, exp)
ctx.body_write_statement(f"{result} = {v}")
ctx.body_writeln(f"if (PyObject_IsTrue({v})) {{")
ctx.body_write_statement(f"goto {done_or}", ctx.indentation+1)
ctx.body_writeln("}")
ctx.body_writeln(f"{done_or}:\n", 0)
return result
Now that I write this down I see we could probably move this
function into libpyc.c
if we used a loop. Maybe in
the next iteration.
We move on.
6/10: generate_compare
This function handles equality and inequality checks. We'll adapt the
PyObject_RichCompareBool
helper we used in the
hand-written translation.
The only additional thing to keep in mind is that the right-hand side is passed as an array. So we have to iterate through it and apply the equality/inequality check on all objects in the list.
def generate_compare(ctx: Context, exp: ast.Compare) -> str:
result = ctx.register_local("compare")
left = generate_expression(ctx, exp.left)
ctx.body_write_statement(f"PyObject* {result} = {left}")
for i, op in enumerate(exp.ops):
v = generate_expression(ctx, exp.comparators[i])
if isinstance(op, ast.Eq):
ctx.body_write_statement(f"{result} = PyObject_RichCompare({result}, {v}, Py_EQ)")
elif isinstance(op, ast.NotEq):
ctx.body_write_statement(f"{result} = PyObject_RichCompare({result}, {v}, Py_NE)")
else:
raise Exception(f"Unsupported comparison: {type(op)}")
return result
7/10: generate_call
The last expression is simple enough. We compile the call's arguments first, then the function itself, then we call the function with the arguments like any C function. Calling the C function directly will have ramifications for interacting with Python libraries (basically, we won't be able to interact with any) but it's the easiest way to get started.
def generate_call(ctx: Context, exp: ast.Call) -> str:
args = ', '.join([generate_expression(ctx, a) for a in exp.args])
fun = generate_expression(ctx, exp.func)
res = ctx.register_local("call_result")
# TODO: lambdas and closures need additional work
ctx.body_write_statement(
f"PyObject* {res} = {fun}({args})")
return res
And that's it for expressions! Just a few more statement helpers to support.
8/10: generate_function_def
This is a fun one. First we register the function name in scope. Then
we copy the context so variables within the function body are
contained within the function body. We increment scope
so
we know we've left the top-level. Finally, we compile the body.
def generate_function_def(ctx: Context, fd: ast.FunctionDef):
name = ctx.register_local(fd.name)
childCtx = ctx.copy()
args = ", ".join([f"PyObject* {childCtx.register_local(a.arg)}" for a in fd.args.args])
ctx.body_writeln(f"PyObject* {name}({args}) {{", 0)
childCtx.scope += 1
childCtx.indentation += 1
generate(childCtx, fd)
if not childCtx.ret:
childCtx.body_write_statement("return Py_None")
ctx.body_writeln("}\n", 0)
The check for childCtx.ret
isn't strictly necessary
because we could just emit a return even if there already was
one. Asking generate_return
to set this attribute and
having generate_function_def
check it just makes the
generate code a little prettier.
9/10: generate_return
Very straightforward, we just compile the value to be returned and
then we emit a return
statement.
We store the return value so that the function definition can know
whether to add a return PyNone
statement.
def generate_return(ctx: Context, r: ast.Return):
ctx.ret = generate_expression(ctx, r.value)
ctx.body_writeln("")
ctx.body_write_statement(f"return {ctx.ret}")
And we've got one last statement to support!
10/10: generate_if
You know the deal: compile the test and if the test is truthy, enter the compiled body. We'll deal with the else body another time.
def generate_if(ctx: Context, exp: ast.If):
test = generate_expression(ctx, exp.test)
ctx.body_writeln(f"if (PyObject_IsTrue({test})) {{")
ctx.indentation += 1
generate(ctx, exp)
# TODO: handle exp.orelse
ctx.indentation -= 1
ctx.body_writeln("}\n")
And we're done the compiler!
Trying it out
As promised:
$ cat tests/recursive_fib.py
def fib(n):
if n == 0 or n == 1:
return n
return fib(n - 1) + fib(n - 2)
def main():
print(fib(40))
$ python3 pyc tests/recursive_fib.py
$ ./bin/a.out
102334155
Microbenchmarking, or making compiler Twitter unhappy
Keep in mind this implementation does a small fraction of what CPython is doing.
If you time the generated code:
$ python3 pyc tests/recursive_fib.py
$ time ./bin/a.out
102334155
./bin/a.out 18.69s user 0.03s system 99% cpu 18.854 total
And CPython (with main()
append to the source):
time python3 tests/recursive_fib.py
102334155
python3 tests/recursive_fib.py 76.24s user 0.11s system 99% cpu 1:16.81 total
The only reason I mention this is because when I did a similar compiler project for JavaScript targeting C++/libV8, the generated code was about the same or a little slower in speed.
I haven't gotten that much better at writing these compilers.
Latest post up, on writing a simple Python to C compiler (in Python).https://t.co/4kkji0XXbp
— Phil Eaton (@phil_eaton) August 16, 2020