Macros and Simple Programming Languages
Today I started reading Martin Davis’ excellent introduction to computability and complexity. This stuff is the theoretical basis of all the programming we do day to day so the material is important, but it can get very heady, very fast! To make the concepts of the book stick, not just in the abstract, but intuitively it helps to breath some life into them and relate them back to reality.
So today we’re going to see how to apply these concepts to create a real programming language and understand its limits. In particular we’re going to write a virtual machine and a compiler for the simple ℓ programming language introduced in the book and insodoing we’ll internalize some of the book’s core themes:
- Software constructs can add functionality to a system, even when the underlying hardware doesn’t change
- Computability is the study of how and when we can translate mathematical functions into computer programs
Part 1: Building the Virtual Machine
The L programming language
First let’s understand the basic units of our programming language. Like most computing systems, our ℓ will maintain an internal state of variable/register values, and then modify that state as it reads and executes instructions in sequence. We have three (unbounded) classes of variables: 𝑋1,𝑋2,... used for inputs, 𝑌1,𝑌2,... used for outputs, and 𝑍1,𝑍2,... for internal program use. Each variable is restricted to a non-zero integer value (the set ℕ of natural numbers). Note we can omit the ordinal when it equals 1 so that (e.g.) 𝑋=𝑋1.
We further support four types of instructions
X ← X + 1 (increment a variable)
X ← X - 1 (decrement a variable)
X ← X (no-op)
IF Z ≠ 0 GOTO LABEL (conditional branch, specifically bnez)
The language is… really simple, but as we’ll see it can support some complex behavior with the addition of macros. Interestingly the existence of a conditional branch makes the system Turing complete (covered in a future post).
To get a feel for this language let’s write a program to realize the identity function 𝑓(𝑥)=𝑥
[A] IF X ≠ 0 GOTO B
Z ← Z + 1
IF Z ≠ 0 GOTO E
[B] X ← X - 1
Y ← Y + 1
Z ← Z + 1
IF Z ≠ 0 GOTO A
Here we basically increment the output variable 𝑌 and the internal variable 𝑍 as we decrement the input 𝑋. Once 𝑋=0 we’re done and can exit. Note that the special label 𝐸 is the “exit” label which the program will use to terminate. The use of the pair of lines
Z ← Z + 1
IF Z ≠ 0 GOTO A
here is a bit confusing, but really it’s just an unconditional jump (jmp) instruction which we improvise since our hardware does not support it out-of-the-box. Later we’ll see how we can use macros to add a new pseudo-instruction for this.
Encoding a Program
A program for ℓ is thus a sequence of instructions, each of which has one of the four forms above. To make things easy we’ll store programs as files. Each line in the file will represent a line of our program and have three to four space-delimited tokens
Instruction |
Token 1 |
Token 2 |
Token 3 |
Token 4 |
INC |
I |
𝑋,𝑌,𝑍 |
𝑛 |
|
DEC |
D |
𝑋,𝑌,𝑍 |
𝑛 |
|
NOOP |
N |
𝑋,𝑌,𝑍 |
𝑛 |
|
BNEZ |
B |
𝑋,𝑌,𝑍 |
𝑛 |
𝑖 |
where for ease of readability token 1 is a single ASCII character denoting the type of instruction, token 2 is a single ASCII character for the class of variable, token 3 is the ordinal of the variable, and token 4 is the ordinal of the instruction to which we’re jumping. Integers are unbounded in ℓ, but they’re also unbounded (in software) in Python a feature which we’ll take advantage of. So we’ll represent integers also as an ASCII sequence of decimals, and we’ll one-index all the ordinals.
Thus our 𝑓(𝑥)=𝑥 program above becomes
B X 1 4
I Z 1
B Z 1 0
D X 1
I Y 1
I Z 1
B Z 1 1
where we use the special 0 ordinal instruction to mean 𝐸 or “exit”. This format is quite flexible and easy for humans to read, but as a tradeoff we lose the guarantee that any given file represents a valid program. Worth it for our purposes.
Writing the Virtual Machine
Since our ℓ machine is a particular class of object that can perform well defined actions we can model it pretty nicely with the object-oriented paradigm. We’ll start by writing a class:
from collections import defaultdict
class LMachineState(object):
def __init__(self):
self.pc = 1
self.vals = dict(
X=defaultdict(int),
Y=defaultdict(int),
Z=defaultdict(int),
)
class LMachine(object):
def __init__(self):
self.state = LMachineState()
def inc(self, var_class, ordinal):
self.state.vals[var_class][ordinal] += 1
self.state.pc += 1
def dec(self, var_class, ordinal):
self.state.vals[var_class][ordinal] -= 1
self.state.pc += 1
def noop(self, var_class, ordinal):
self.state.pc += 1
def bnez(self, var_class, ordinal, target):
if self.state.vals[var_class][ordinal] != 0:
self.state.pc = target
else:
self.state.pc += 1
def run(self, program, *x_vals):
for i_minus_one, val in enumerate(x_vals):
i = i_minus_one + 1
self.state.vals['X'][i] = val
ix2method = {
'I': self.inc,
'D': self.dec,
'N': self.noop,
'B': self.bnez,
}
while self.state.pc != 0 and self.state.pc != len(program):
instruction, *args = program[self.state.pc - 1]
ix2method[instruction](*args)
where we use dicts (hash maps) to resolve both object attributes and ordinals, though lists would work just fine for both if we wanted to stick to simpler constructs. Next we’ll write a main procedure to parse inputs from and write outputs to the command-line
import sys
def load_program(f):
program = []
for line in f:
parts = line.strip().split(" ")
if len(parts) == 3:
program.append((parts[0], parts[1], int(parts[2])))
elif len(parts) == 4:
program.append((parts[0], parts[1], int(parts[2]), int(parts[3])))
else:
raise ValueError("Unparsable line: {}".format(" ".join(parts)))
return program
def main():
program_file, *x_vals = sys.argv[1:]
with open(program_file) as f:
program = load_program(f)
m = LMachine()
m.run(program, *map(int, x_vals))
output = m.state.vals['Y']
for key in sorted(output):
print("Y{}: {}".format(key, output[key]))
if __name__ == "__main__":
main()
Finally we’ll write our program to a file and run it to confirm that we get 𝑌1=20 when we have an input 𝑋1=20
$ cat identity.ell
B X 1 4
I Z 1
B Z 1 0
D X 1
I Y 1
I Z 1
B Z 1 1
$ python l_machine.py identity.ell 20
Y1: 20
Nifty. That should be a good enough sanity check to move on.
Part 2: Writing the Compiler
Resolving Labels
As we discussed in Hello World, Byte for Byte, a compiler will typically keep references to other code locations in symbolic form until we’re ready to assemble all our bits together and resolve the references. In our case we’re only working with one program, so no assembly required, and every line of soruce code will by fiat map to a single line of our program. So let’s start by tracking and extracting our labels which will make our later work easier.
import re
class Grammar(object):
lbl_group = "tgt_lbl"
cls_group = "var_cls"
ord_group = "var_ord"
lbl_expr = f'(?P<{lbl_group}>[a-zA-Z0-9]+)'
tgt_expr = f'(?P<{cls_group}>[XYZ])(?P<suffix>(_(?P<{ord_group}>[0-9]+))?)'
src_expr = f'(?P={cls_group})(?P=suffix)'
def parse_lines(source_lines, with_defaults=True):
labels = {}
if with_defaults:
labels.update(E=0)
pattern = f"\[{Grammar.lbl_expr}\] "
parsed = []
for i, source_line in enumerate(filter(lambda line: line, source_lines)):
line = re.sub("\s+", " ", source_line.strip())
if match := re.match(pattern, line):
line = line[match.span()[1]:]
labels[match.group(Grammar.lbl_group)] = i + 1
parsed.append(line)
return labels, parsed
We’ve first defined the grammar of the language as regular expressions. This will be convenient later when we’re parsing whole statements. Next we use one of these expressions to find any labeled statements in our program, remove the label, and fill a map from label to line number.
Define the Instruction Types
Next we’ll use our regular expressions to transform lines of our source code into lines for our virtual machine.
class Token(object):
var_cls = "{var_cls}"
var_ord = "{var_ord}"
ixn_loc = "{ixn_loc}"
class Line(object):
pattern = ""
tgt_ixn = ""
@classmethod
def match(cls, line):
return re.fullmatch(cls.pattern, line)
@classmethod
def to_ixn(cls, match, labels):
raw = lambda token: token[1:-1]
var_cls = match.group(Grammar.cls_group)
var_ord = int(match.group(Grammar.ord_group) or "1")
context = {
raw(Token.var_cls): var_cls,
raw(Token.var_ord): var_ord,
}
if Grammar.lbl_expr in cls.pattern:
location = labels[match.group(Grammar.lbl_group)]
context[raw(Token.ixn_loc)] = location
return cls.tgt_ixn.format(**context)
class IncrementLine(Line):
pattern = f'{Grammar.tgt_expr} ← {Grammar.src_expr} \+ 1'
tgt_ixn = f'I {Token.var_cls} {Token.var_ord}'
class DecrementLine(Line):
pattern = f'{Grammar.tgt_expr} ← {Grammar.src_expr} - 1'
tgt_ixn = f'D {Token.var_cls} {Token.var_ord}'
class NoopLine(Line):
pattern = f'{Grammar.tgt_expr} ← {Grammar.src_expr}'
tgt_ixn = f'N {Token.var_cls} {Token.var_ord}'
class BNEZLine(Line):
pattern = f'IF {Grammar.tgt_expr} ≠ 0 GOTO {Grammar.lbl_expr}'
tgt_ixn = f'B {Token.var_cls} {Token.var_ord} {Token.ixn_loc}'
Since all of our lines are quite similar in form to each other we can define our procedures once and use different attributes for each line type to differentiate the transformation. The basic idea here is that the pattern
attribute is a regular expression that matches a statement and captures its subexpressions as groups. The to_ixn
method will extract those groups and convert them into tokens which are then put in the right place by the tgt_ixn
formatted string.
Putting it All Together
Finally we’ll write a main method that reads the source code file, extracts the labels, and applies the transformations above to write a program file.
import sys
LINE_CLASSES = (
IncrementLine,
DecrementLine,
NoopLine,
BNEZLine,
)
def compile_lines(labels, parsed_lines, line_classes=LINE_CLASSES):
for line in parsed_lines:
for cls in line_classes:
match = cls.match(line)
if match:
yield cls.to_ixn(match, labels)
break
else:
raise ValueError(f"No matching class for line: {line}")
def main():
program_file, object_file = sys.argv[1:]
with open(program_file) as f:
source_lines = f.read().split("\n")
labels, parsed_lines = parse_lines(source_lines)
with open(object_file, 'w') as f:
f.write("\n".join(compile_lines(labels, parsed_lines)))
and if we run it we see
$ cat identity.l
[A] IF X ≠ 0 GOTO B
Z ← Z + 1
IF Z ≠ 0 GOTO E
[B] X ← X - 1
Y ← Y + 1
Z ← Z + 1
IF Z ≠ 0 GOTO A
$ python l_compile.py identity.l identity.ell
$ python l_machine.py identity.ell 40
Y1: 40
Superb!
Part 3: Add Macros and Watch Them Go
Now we get to the really fun stuff. No, seriously! Macros are what will let us build more and more complexity into our program without changing the machine that will run it. We’ll start by addressing some obvious gaps in our language and add basic features like branch-on-equal and GOTO instructions. Then we’ll use those to add a couple more sophisticated features. Finally we’ll build up to macros for adding variables (through repeated incrementing) and multiplying variables (through repeated addition).
Adding Language Features
Our macros will work very similarly to our complied lines. We’ll start by defining an interface that each macro has to support.
class Macro(object):
@classmethod
def match(cls, line):
return re.fullmatch(cls.pattern, line)
@classmethod
def expand(cls, match, z_start):
raise NotImplementedError()
Here we have a basic interface where each macro will return a regexp
iff it matches the line in question and then use that regexp to expand the line into more fundamental units. Importantly, there is a hierarchy to our macros, each being defined in terms of expressions below it. If this were not the case we could end up with infinite loops while compiling 😱
You might be wondering what’s with the z_start
parameter in expand
. Our textbook on the language says,
At this point we will not discuss how to ensure that the variables local to the macro definition are distinct from the variables used in the main program. Instead, we will manually replace any such duplicate variable uses with unused variables.
This turns out to be non-trivial, but this gist of it is that if we have two macros that use the same internal variables (the 𝑍0,𝑍1,... variables), we want to keep incrementing the indices in one of them so that the two macros no longer overlap. z_start
here basically tells us what is the first z index we’re allowed to use for this expansion.
Additionally we want to ensure that different macros don’t have labels overlap or else branching statements can become ambiguous. As a sort of hack we’re using teh z_start
as well as a label prefix M
so that labels like M1,M2,..
will be reserved for macro expansion.
Now let’s see this system in action to define two new instruction types for our ℓ that will make programming much easier.
class Goto(Macro):
pattern = f'GOTO {Grammar.lbl_expr}'
@classmethod
def expand(cls, match, z_start):
tmp_store = z_start
z_end = tmp_store + 1
tgt_label = match.group(Grammar.lbl_group)
label = match.group(f'{Grammar.lbl_group}')
lines = [
f'Z_{tmp_store} ← Z_{tmp_store} + 1',
f'IF Z_{tmp_store} ≠ 0 GOTO {label}'
]
return lines, z_end
class BEZ(Macro):
pattern = f'IF {Grammar.tgt_expr} = 0 GOTO {Grammar.lbl_expr}'
@classmethod
def expand(cls, match, z_start):
end_lbl = z_start
z_end = z_start + 1
lbl = match.group(Grammar.lbl_group)
var_cls = match.group(Grammar.cls_group)
var_ord = int(match.group(Grammar.ord_group) or "1")
lines = [
f'IF {var_cls}_{var_ord} ≠ 0 GOTO M{end_lbl}',
f'GOTO {lbl}',
f'[M{end_lbl}] Z ← Z',
]
return lines, z_end
Remember this kinda mystifying line combination before?
Z ← Z + 1
IF Z ≠ 0 GOTO A
It gave us an unconditional branch operation, but now you can get the same effect by just writing a GOTO _label_
line. I also found that in writing the macros for addition and multiplication (below) having a branch-on-equal instruction makes the code cleaner.
Next we’ll add some more advanced language features that will be useful for doing arithmetic.
class SetZero(Macro):
pattern = f'{Grammar.tgt_expr} ← 0'
@classmethod
def expand(cls, match, z_start):
lbl_start_ix = z_start
lbl_done_ix = lbl_start_ix + 1
z_end = lbl_done_ix + 1
lbl_start = f"M{lbl_start_ix}"
lbl_done = f"M{lbl_done_ix}"
var_cls = match.group(Grammar.cls_group)
var_ord = int(match.group(Grammar.ord_group) or "1")
var_exp = f"{var_cls}_{var_ord}"
lines = [
f'[{lbl_start}] IF {var_exp} = 0 GOTO {lbl_done}',
f'{var_exp} ← {var_exp} - 1',
f'GOTO {lbl_start}',
f'[{lbl_done}] Z ← Z',
]
return lines, z_end
class Copy(Macro):
_var_expr = f'(?P<{Grammar.cls_group}{{ix}}>[XYZ])(_(?P<{Grammar.ord_group}{{ix}}>[0-9]+))?'
var_expr_0 = _var_expr.format(ix=0)
var_expr_1 = _var_expr.format(ix=1)
pattern = f"{var_expr_0} ← {var_expr_1}'"
@classmethod
def expand(cls, match, z_start):
tmp_store_ix = z_start
lbl_cp2tmp_ix = tmp_store_ix + 1
lbl_cp_tmp_ix = lbl_cp2tmp_ix + 1
lbl_finish_ix = lbl_cp_tmp_ix + 1
z_end = lbl_finish_ix + 1
tmp_store = f"Z_{tmp_store_ix}"
lbl_cp2tmp = f"M{lbl_cp2tmp_ix}"
lbl_cp_tmp = f"M{lbl_cp_tmp_ix}"
lbl_finish = f"M{lbl_finish_ix}"
tgt_var = "{}_{}".format(
match.group(f'{Grammar.cls_group}0'),
match.group(f'{Grammar.ord_group}0') or "1",
)
src_var = "{}_{}".format(
match.group(f'{Grammar.cls_group}1'),
match.group(f'{Grammar.ord_group}1') or "1",
)
if tgt_var == src_var:
return [f"{tgt_var} ← {src_var}"], z_start
lines = [
f'{tgt_var} ← 0',
f'[{lbl_cp2tmp}] IF {src_var} = 0 GOTO {lbl_cp_tmp}',
f'{tmp_store} ← {tmp_store} + 1',
f'{src_var} ← {src_var} - 1',
f'GOTO {lbl_cp2tmp}',
f'[{lbl_cp_tmp}] IF {tmp_store} = 0 GOTO {lbl_finish}',
f'{tmp_store} ← {tmp_store} - 1',
f'{src_var} ← {src_var} + 1',
f'{tgt_var} ← {tgt_var} + 1',
f'GOTO {lbl_cp_tmp}',
f'[{lbl_finish}] Z ← Z',
]
return lines, z_end
Our SetZero
macro (taken from the book) will “reset” a variable for us. Imagine we want to do additions and such but all we can do is increment a variable. That’s no bueno as we have no idea the value with which our variable is starting. So here we’ve defined a fairly complex operation, decrement a variable until it is zero and given it the pattern X ← 0
.
Our Copy
macro is fairly complex as well so it’s worth diving in here. The motivation is that we will want to be able to move values from one variable to another without destroying the original value, the behavior we’re used to in a typical C-derived language.
Our logic is to:
- First use our nifty
SetZero
macro to reset the value of our target variable so we know where it’s starting
- Use a temporary 𝑍𝑖 variable, unique to this macro expansion, to store the value of our source variable
- This is useful because when we copy the value back we’ll increment both our original source var and the target var
- So in effect we do destroy the value of our source var when we move it, but then set that value back again
Finally we’re going to use all the primitives we have built so far to create macros for addition and multiplication.
class Add(Macro):
_var_expr = f'(?P<{Grammar.cls_group}{{ix}}>[XYZ])(_(?P<{Grammar.ord_group}{{ix}}>[0-9]+))?'
var_expr_0 = _var_expr.format(ix=0)
var_expr_1 = _var_expr.format(ix=1)
var_expr_2 = _var_expr.format(ix=2)
pattern = f'{var_expr_0} ← {var_expr_1} \+ {var_expr_2}'
@classmethod
def expand(cls, match, z_start):
tmp_store_ix = z_start
lbl_cp_tmp_ix = tmp_store_ix + 1
lbl_finish_ix = lbl_cp_tmp_ix + 1
z_end = lbl_finish_ix + 1
lbl_cp_tmp = f"M{lbl_cp_tmp_ix}"
lbl_finish = f"M{lbl_finish_ix}"
tmp_store = f"Z_{tmp_store_ix}"
tgt_var = "{}_{}".format(
match.group(f'{Grammar.cls_group}0'),
match.group(f'{Grammar.ord_group}0') or "1",
)
src_var_0 = "{}_{}".format(
match.group(f'{Grammar.cls_group}1'),
match.group(f'{Grammar.ord_group}1') or "1",
)
src_var_1 = "{}_{}".format(
match.group(f'{Grammar.cls_group}2'),
match.group(f'{Grammar.ord_group}2') or "1",
)
lines = [
f"{tgt_var} ← {src_var_0}'",
f"{tmp_store} ← {src_var_1}'",
f"[{lbl_cp_tmp}] IF {tmp_store} = 0 GOTO {lbl_finish}",
f"{tmp_store} ← {tmp_store} - 1",
f"{tgt_var} ← {tgt_var} + 1",
f"GOTO {lbl_cp_tmp}",
f"[{lbl_finish}] Z ← Z",
]
return lines, z_end
class Mult(Macro):
_var_expr = f'(?P<{Grammar.cls_group}{{ix}}>[XYZ])(_(?P<{Grammar.ord_group}{{ix}}>[0-9]+))?'
var_expr_0 = _var_expr.format(ix=0)
var_expr_1 = _var_expr.format(ix=1)
var_expr_2 = _var_expr.format(ix=2)
pattern = f'{var_expr_0} ← {var_expr_1} \* {var_expr_2}'
@classmethod
def expand(cls, match, z_start):
tmp_store_ix = z_start
lbl_loop_ix = tmp_store_ix + 1
lbl_finish_ix = lbl_loop_ix + 1
z_end = lbl_finish_ix + 1
lbl_loop = f"M{lbl_loop_ix}"
lbl_finish = f"M{lbl_finish_ix}"
tmp_store = f"Z_{tmp_store_ix}"
tgt_var = "{}_{}".format(
match.group(f'{Grammar.cls_group}0'),
match.group(f'{Grammar.ord_group}0') or "1",
)
src_var_0 = "{}_{}".format(
match.group(f'{Grammar.cls_group}1'),
match.group(f'{Grammar.ord_group}1') or "1",
)
src_var_1 = "{}_{}".format(
match.group(f'{Grammar.cls_group}2'),
match.group(f'{Grammar.ord_group}2') or "1",
)
lines = [
f"{tgt_var} ← 0",
f"{tmp_store} ← {src_var_0}'",
f"[{lbl_loop}] IF {tmp_store} = 0 GOTO {lbl_finish}",
f"{tgt_var} ← {tgt_var} + {src_var_1}",
f"{tmp_store} ← {tmp_store} - 1",
f"GOTO {lbl_loop}",
f"[{lbl_finish}] Z ← Z",
]
return lines, z_end
Again you’ll see these look fairly complex, but the basic logic is easy to understand. For addition we want to copy one of our inputs to the target value and then repeatedly increment the target for every unit in our second source. Multiplication works similarly except that instead of incrementing we add the whole value of our first argument for every unit in our second argument (multiplication = repeated addition).
Exercise for the reader here: see if you can write a macro to handle multiple additions in a single line like 𝑋0←𝑋1+𝑋2+𝑋3+𝑋4. (hint: try to reduce this expression to a series of two-argument additions). If you’re feeling very ambitious, try to do it with multiplication.
Compiling with Macros
Now we’ll have to glue these pieces together to actually make a compiler you can use at the command line. The pattern to do so should look familiar from our main routine above.
MACROS = (
Mult,
Add,
Copy,
SetZero,
BEZ,
Goto,
)
def max_z_ordinal_referenced(parsed_lines, line_classes=LINE_CLASSES):
max_z_ordinal = -1
for line in parsed_lines:
for cls in line_classes:
if match := cls.match(line):
ordinal = int(match.group(Grammar.ord_group) or "1")
if match.group(
Grammar.cls_group) == 'Z' and ordinal > max_z_ordinal:
max_z_ordinal = ordinal
return max_z_ordinal
def expand_macros(source_lines, macros=MACROS):
labels, lines = parse_lines(source_lines)
current_z = max_z_ordinal_referenced(lines) + 1
for macro in macros:
expanded_lines = []
for i, line in enumerate(lines):
if match := macro.match(line):
new_lines, current_z = macro.expand(match, current_z)
new_labels, new_lines = parse_lines(new_lines, False)
diff = len(new_lines) - 1
labels.update({
lbl: loc + diff
for lbl, loc in labels.items()
if loc > len(expanded_lines) + 1
})
labels.update({
lbl: loc + len(expanded_lines)
for lbl, loc in new_labels.items()
})
expanded_lines.extend(new_lines)
else:
expanded_lines.append(line)
lines = expanded_lines
return labels, lines
def main():
program_file, object_file = sys.argv[1:]
with open(program_file) as f:
source_lines = f.read().split("\n")
labels, expanded_lines = expand_macros(source_lines)
with open(object_file, 'w') as f:
f.write("\n".join(compile_lines(labels, expanded_lines)))
if __name__ == "__main__":
main()
We first need to compute the max index of Z that is referenced in our initial program so that our individual macro expansions know where to start without overlapping with existing variables (a typically programming language will solve this problem using variable scopes). Next we iterate through each macro in reverse order and apply it to our program. The tricky bit here is that by replacing one line with 𝑛 lines we effectively invalidate our labels. So to account for that we have to iterate through each label that references a location after our changed line and add the delta to it.
With all our pieces in place we can finally write a program to realize the function 𝑓(𝑥1,𝑥2)=𝑥1⋅𝑥2 that will multiply its two inputs with ease
$ cat mult.l
Y_1 ← X_1 * X_2
$ python l_compile.py mult.l mult.ell
$ python l_machine.py mult.ell 42 24
Y1: 1008
So, yay! We just wrote 400+ lines of Python to do what you can do in a single line of Python 😄 It’s worth meditating on what we just accomplished though. We reduced multiplication to the simplest of atomic operations a computer can perform and we did so using primitive recursion, a special subset of general recursion that is guaranteed to terminate. We like to use more simple operations when expanding macros so that we can be sure our compilation will finish in finite steps, but in this case even our program itself will always have finite steps which is what proves to us that the mathematical operation of multiplication is always computable.
When Turing first formulated his now famous model of a universal computer he was trying to get to exactly this sort of question, which mathematical functions can be realized as computer programs that will always terminate, and specifically whether or not the truth value of a mathematical proposition is always computable. Let’s let these topics simmer in our heads for a while until we can start to feel them in our bones and revisit this question in more depth in a future post. Until next time!
Macros and Simple Programming Languages
Today I started reading Martin Davis’ excellent introduction to computability and complexity. This stuff is the theoretical basis of all the programming we do day to day so the material is important, but it can get very heady, very fast! To make the concepts of the book stick, not just in the abstract, but intuitively it helps to breath some life into them and relate them back to reality.
So today we’re going to see how to apply these concepts to create a real programming language and understand its limits. In particular we’re going to write a virtual machine and a compiler for the simpleℓ programming language introduced in the book and insodoing we’ll internalize some of the book’s core themes:
Part 1: Building the Virtual Machine
The L programming language
First let’s understand the basic units of our programming language. Like most computing systems, ourℓ will maintain an internal state of variable/register values, and then modify that state as it reads and executes instructions in sequence. We have three (unbounded) classes of variables: 𝑋1,𝑋2,... used for inputs, 𝑌1,𝑌2,... used for outputs, and 𝑍1,𝑍2,... for internal program use. Each variable is restricted to a non-zero integer value (the set ℕ of natural numbers). Note we can omit the ordinal when it equals 1 so that (e.g.) 𝑋=𝑋1 .
We further support four types of instructions
The language is… really simple, but as we’ll see it can support some complex behavior with the addition of macros. Interestingly the existence of a conditional branch makes the system Turing complete (covered in a future post).
To get a feel for this language let’s write a program to realize the identity function𝑓(𝑥)=𝑥
[A] IF X ≠ 0 GOTO B Z ← Z + 1 IF Z ≠ 0 GOTO E [B] X ← X - 1 Y ← Y + 1 Z ← Z + 1 IF Z ≠ 0 GOTO A
Here we basically increment the output variable𝑌 and the internal variable 𝑍 as we decrement the input 𝑋 . Once 𝑋=0 we’re done and can exit. Note that the special label 𝐸 is the “exit” label which the program will use to terminate. The use of the pair of lines
here is a bit confusing, but really it’s just an unconditional jump (jmp) instruction which we improvise since our hardware does not support it out-of-the-box. Later we’ll see how we can use macros to add a new pseudo-instruction for this.
Encoding a Program
A program forℓ is thus a sequence of instructions, each of which has one of the four forms above. To make things easy we’ll store programs as files. Each line in the file will represent a line of our program and have three to four space-delimited tokens
I
D
N
B
where for ease of readability token 1 is a single ASCII character denoting the type of instruction, token 2 is a single ASCII character for the class of variable, token 3 is the ordinal of the variable, and token 4 is the ordinal of the instruction to which we’re jumping. Integers are unbounded inℓ , but they’re also unbounded (in software) in Python a feature which we’ll take advantage of. So we’ll represent integers also as an ASCII sequence of decimals, and we’ll one-index all the ordinals.
Thus our𝑓(𝑥)=𝑥 program above becomes
B X 1 4 I Z 1 B Z 1 0 D X 1 I Y 1 I Z 1 B Z 1 1
where we use the special 0 ordinal instruction to mean𝐸 or “exit”. This format is quite flexible and easy for humans to read, but as a tradeoff we lose the guarantee that any given file represents a valid program. Worth it for our purposes.
Writing the Virtual Machine
Since ourℓ machine is a particular class of object that can perform well defined actions we can model it pretty nicely with the object-oriented paradigm. We’ll start by writing a class:
from collections import defaultdict class LMachineState(object): def __init__(self): self.pc = 1 self.vals = dict( X=defaultdict(int), Y=defaultdict(int), Z=defaultdict(int), ) class LMachine(object): def __init__(self): self.state = LMachineState() def inc(self, var_class, ordinal): self.state.vals[var_class][ordinal] += 1 self.state.pc += 1 def dec(self, var_class, ordinal): self.state.vals[var_class][ordinal] -= 1 self.state.pc += 1 def noop(self, var_class, ordinal): self.state.pc += 1 def bnez(self, var_class, ordinal, target): if self.state.vals[var_class][ordinal] != 0: self.state.pc = target else: self.state.pc += 1 def run(self, program, *x_vals): # init X Values for i_minus_one, val in enumerate(x_vals): i = i_minus_one + 1 self.state.vals['X'][i] = val # for convenience, map instructions to their methods ix2method = { 'I': self.inc, 'D': self.dec, 'N': self.noop, 'B': self.bnez, } # execute until we exit while self.state.pc != 0 and self.state.pc != len(program): instruction, *args = program[self.state.pc - 1] ix2method[instruction](*args)
where we use dicts (hash maps) to resolve both object attributes and ordinals, though lists would work just fine for both if we wanted to stick to simpler constructs. Next we’ll write a main procedure to parse inputs from and write outputs to the command-line
import sys def load_program(f): # read in the program from the file program = [] for line in f: parts = line.strip().split(" ") if len(parts) == 3: program.append((parts[0], parts[1], int(parts[2]))) elif len(parts) == 4: program.append((parts[0], parts[1], int(parts[2]), int(parts[3]))) else: raise ValueError("Unparsable line: {}".format(" ".join(parts))) return program def main(): program_file, *x_vals = sys.argv[1:] with open(program_file) as f: program = load_program(f) # run the program on the virtual machine m = LMachine() m.run(program, *map(int, x_vals)) # print the outputs output = m.state.vals['Y'] for key in sorted(output): print("Y{}: {}".format(key, output[key])) if __name__ == "__main__": main()
Finally we’ll write our program to a file and run it to confirm that we get𝑌1=20 when we have an input 𝑋1=20
Nifty. That should be a good enough sanity check to move on.
Part 2: Writing the Compiler
Resolving Labels
As we discussed in Hello World, Byte for Byte, a compiler will typically keep references to other code locations in symbolic form until we’re ready to assemble all our bits together and resolve the references. In our case we’re only working with one program, so no assembly required, and every line of soruce code will by fiat map to a single line of our program. So let’s start by tracking and extracting our labels which will make our later work easier.
import re class Grammar(object): lbl_group = "tgt_lbl" cls_group = "var_cls" ord_group = "var_ord" lbl_expr = f'(?P<{lbl_group}>[a-zA-Z0-9]+)' tgt_expr = f'(?P<{cls_group}>[XYZ])(?P<suffix>(_(?P<{ord_group}>[0-9]+))?)' src_expr = f'(?P={cls_group})(?P=suffix)' def parse_lines(source_lines, with_defaults=True): labels = {} if with_defaults: labels.update(E=0) pattern = f"\[{Grammar.lbl_expr}\] " parsed = [] for i, source_line in enumerate(filter(lambda line: line, source_lines)): line = re.sub("\s+", " ", source_line.strip()) if match := re.match(pattern, line): line = line[match.span()[1]:] labels[match.group(Grammar.lbl_group)] = i + 1 parsed.append(line) return labels, parsed
We’ve first defined the grammar of the language as regular expressions. This will be convenient later when we’re parsing whole statements. Next we use one of these expressions to find any labeled statements in our program, remove the label, and fill a map from label to line number.
Define the Instruction Types
Next we’ll use our regular expressions to transform lines of our source code into lines for our virtual machine.
class Token(object): var_cls = "{var_cls}" var_ord = "{var_ord}" ixn_loc = "{ixn_loc}" class Line(object): pattern = "" tgt_ixn = "" @classmethod def match(cls, line): return re.fullmatch(cls.pattern, line) @classmethod def to_ixn(cls, match, labels): raw = lambda token: token[1:-1] var_cls = match.group(Grammar.cls_group) var_ord = int(match.group(Grammar.ord_group) or "1") context = { raw(Token.var_cls): var_cls, raw(Token.var_ord): var_ord, } if Grammar.lbl_expr in cls.pattern: location = labels[match.group(Grammar.lbl_group)] context[raw(Token.ixn_loc)] = location return cls.tgt_ixn.format(**context) class IncrementLine(Line): pattern = f'{Grammar.tgt_expr} ← {Grammar.src_expr} \+ 1' tgt_ixn = f'I {Token.var_cls} {Token.var_ord}' class DecrementLine(Line): pattern = f'{Grammar.tgt_expr} ← {Grammar.src_expr} - 1' tgt_ixn = f'D {Token.var_cls} {Token.var_ord}' class NoopLine(Line): pattern = f'{Grammar.tgt_expr} ← {Grammar.src_expr}' tgt_ixn = f'N {Token.var_cls} {Token.var_ord}' class BNEZLine(Line): pattern = f'IF {Grammar.tgt_expr} ≠ 0 GOTO {Grammar.lbl_expr}' tgt_ixn = f'B {Token.var_cls} {Token.var_ord} {Token.ixn_loc}'
Since all of our lines are quite similar in form to each other we can define our procedures once and use different attributes for each line type to differentiate the transformation. The basic idea here is that the
pattern
attribute is a regular expression that matches a statement and captures its subexpressions as groups. Theto_ixn
method will extract those groups and convert them into tokens which are then put in the right place by thetgt_ixn
formatted string.Putting it All Together
Finally we’ll write a main method that reads the source code file, extracts the labels, and applies the transformations above to write a program file.
import sys LINE_CLASSES = ( IncrementLine, DecrementLine, NoopLine, BNEZLine, ) def compile_lines(labels, parsed_lines, line_classes=LINE_CLASSES): for line in parsed_lines: for cls in line_classes: match = cls.match(line) if match: yield cls.to_ixn(match, labels) break else: raise ValueError(f"No matching class for line: {line}") def main(): program_file, object_file = sys.argv[1:] with open(program_file) as f: source_lines = f.read().split("\n") labels, parsed_lines = parse_lines(source_lines) with open(object_file, 'w') as f: f.write("\n".join(compile_lines(labels, parsed_lines)))
and if we run it we see
Superb!
Part 3: Add Macros and Watch Them Go
Now we get to the really fun stuff. No, seriously! Macros are what will let us build more and more complexity into our program without changing the machine that will run it. We’ll start by addressing some obvious gaps in our language and add basic features like branch-on-equal and GOTO instructions. Then we’ll use those to add a couple more sophisticated features. Finally we’ll build up to macros for adding variables (through repeated incrementing) and multiplying variables (through repeated addition).
Adding Language Features
Our macros will work very similarly to our complied lines. We’ll start by defining an interface that each macro has to support.
class Macro(object): @classmethod def match(cls, line): return re.fullmatch(cls.pattern, line) @classmethod def expand(cls, match, z_start): raise NotImplementedError()
Here we have a basic interface where each macro will return a
regexp
iff it matches the line in question and then use that regexp to expand the line into more fundamental units. Importantly, there is a hierarchy to our macros, each being defined in terms of expressions below it. If this were not the case we could end up with infinite loops while compiling 😱You might be wondering what’s with the
z_start
parameter inexpand
. Our textbook on the language says,This turns out to be non-trivial, but this gist of it is that if we have two macros that use the same internal variables (the𝑍0,𝑍1,... variables), we want to keep incrementing the indices in one of them so that the two macros no longer overlap.
z_start
here basically tells us what is the first z index we’re allowed to use for this expansion.Additionally we want to ensure that different macros don’t have labels overlap or else branching statements can become ambiguous. As a sort of hack we’re using teh
z_start
as well as a label prefixM
so that labels likeM1,M2,..
will be reserved for macro expansion.Now let’s see this system in action to define two new instruction types for ourℓ that will make programming much easier.
class Goto(Macro): pattern = f'GOTO {Grammar.lbl_expr}' @classmethod def expand(cls, match, z_start): tmp_store = z_start z_end = tmp_store + 1 tgt_label = match.group(Grammar.lbl_group) label = match.group(f'{Grammar.lbl_group}') lines = [ f'Z_{tmp_store} ← Z_{tmp_store} + 1', f'IF Z_{tmp_store} ≠ 0 GOTO {label}' ] return lines, z_end class BEZ(Macro): pattern = f'IF {Grammar.tgt_expr} = 0 GOTO {Grammar.lbl_expr}' @classmethod def expand(cls, match, z_start): end_lbl = z_start z_end = z_start + 1 lbl = match.group(Grammar.lbl_group) var_cls = match.group(Grammar.cls_group) var_ord = int(match.group(Grammar.ord_group) or "1") lines = [ f'IF {var_cls}_{var_ord} ≠ 0 GOTO M{end_lbl}', f'GOTO {lbl}', f'[M{end_lbl}] Z ← Z', ] return lines, z_end
Remember this kinda mystifying line combination before?
It gave us an unconditional branch operation, but now you can get the same effect by just writing a
GOTO _label_
line. I also found that in writing the macros for addition and multiplication (below) having a branch-on-equal instruction makes the code cleaner.Next we’ll add some more advanced language features that will be useful for doing arithmetic.
class SetZero(Macro): pattern = f'{Grammar.tgt_expr} ← 0' @classmethod def expand(cls, match, z_start): lbl_start_ix = z_start lbl_done_ix = lbl_start_ix + 1 z_end = lbl_done_ix + 1 lbl_start = f"M{lbl_start_ix}" lbl_done = f"M{lbl_done_ix}" var_cls = match.group(Grammar.cls_group) var_ord = int(match.group(Grammar.ord_group) or "1") var_exp = f"{var_cls}_{var_ord}" lines = [ f'[{lbl_start}] IF {var_exp} = 0 GOTO {lbl_done}', f'{var_exp} ← {var_exp} - 1', f'GOTO {lbl_start}', f'[{lbl_done}] Z ← Z', ] return lines, z_end class Copy(Macro): _var_expr = f'(?P<{Grammar.cls_group}{{ix}}>[XYZ])(_(?P<{Grammar.ord_group}{{ix}}>[0-9]+))?' var_expr_0 = _var_expr.format(ix=0) var_expr_1 = _var_expr.format(ix=1) pattern = f"{var_expr_0} ← {var_expr_1}'" @classmethod def expand(cls, match, z_start): tmp_store_ix = z_start lbl_cp2tmp_ix = tmp_store_ix + 1 lbl_cp_tmp_ix = lbl_cp2tmp_ix + 1 lbl_finish_ix = lbl_cp_tmp_ix + 1 z_end = lbl_finish_ix + 1 tmp_store = f"Z_{tmp_store_ix}" lbl_cp2tmp = f"M{lbl_cp2tmp_ix}" lbl_cp_tmp = f"M{lbl_cp_tmp_ix}" lbl_finish = f"M{lbl_finish_ix}" tgt_var = "{}_{}".format( match.group(f'{Grammar.cls_group}0'), match.group(f'{Grammar.ord_group}0') or "1", ) src_var = "{}_{}".format( match.group(f'{Grammar.cls_group}1'), match.group(f'{Grammar.ord_group}1') or "1", ) # short circuit to a no-op or else we will overwrite to 0 if tgt_var == src_var: return [f"{tgt_var} ← {src_var}"], z_start lines = [ # first set the target var to 0 f'{tgt_var} ← 0', # move the value out of the source var into tmp f'[{lbl_cp2tmp}] IF {src_var} = 0 GOTO {lbl_cp_tmp}', f'{tmp_store} ← {tmp_store} + 1', f'{src_var} ← {src_var} - 1', f'GOTO {lbl_cp2tmp}', # move the value out of tmp into the source var f'[{lbl_cp_tmp}] IF {tmp_store} = 0 GOTO {lbl_finish}', f'{tmp_store} ← {tmp_store} - 1', f'{src_var} ← {src_var} + 1', f'{tgt_var} ← {tgt_var} + 1', f'GOTO {lbl_cp_tmp}', # finish with a noop f'[{lbl_finish}] Z ← Z', ] return lines, z_end
Our
SetZero
macro (taken from the book) will “reset” a variable for us. Imagine we want to do additions and such but all we can do is increment a variable. That’s no bueno as we have no idea the value with which our variable is starting. So here we’ve defined a fairly complex operation, decrement a variable until it is zero and given it the patternX ← 0
.Our
Copy
macro is fairly complex as well so it’s worth diving in here. The motivation is that we will want to be able to move values from one variable to another without destroying the original value, the behavior we’re used to in a typical C-derived language.Our logic is to:
SetZero
macro to reset the value of our target variable so we know where it’s startingFinally we’re going to use all the primitives we have built so far to create macros for addition and multiplication.
class Add(Macro): _var_expr = f'(?P<{Grammar.cls_group}{{ix}}>[XYZ])(_(?P<{Grammar.ord_group}{{ix}}>[0-9]+))?' var_expr_0 = _var_expr.format(ix=0) var_expr_1 = _var_expr.format(ix=1) var_expr_2 = _var_expr.format(ix=2) pattern = f'{var_expr_0} ← {var_expr_1} \+ {var_expr_2}' @classmethod def expand(cls, match, z_start): tmp_store_ix = z_start lbl_cp_tmp_ix = tmp_store_ix + 1 lbl_finish_ix = lbl_cp_tmp_ix + 1 z_end = lbl_finish_ix + 1 lbl_cp_tmp = f"M{lbl_cp_tmp_ix}" lbl_finish = f"M{lbl_finish_ix}" tmp_store = f"Z_{tmp_store_ix}" tgt_var = "{}_{}".format( match.group(f'{Grammar.cls_group}0'), match.group(f'{Grammar.ord_group}0') or "1", ) src_var_0 = "{}_{}".format( match.group(f'{Grammar.cls_group}1'), match.group(f'{Grammar.ord_group}1') or "1", ) src_var_1 = "{}_{}".format( match.group(f'{Grammar.cls_group}2'), match.group(f'{Grammar.ord_group}2') or "1", ) lines = [ # copy the inputs to not overwrite them f"{tgt_var} ← {src_var_0}'", f"{tmp_store} ← {src_var_1}'", # increment the target using the second input f"[{lbl_cp_tmp}] IF {tmp_store} = 0 GOTO {lbl_finish}", f"{tmp_store} ← {tmp_store} - 1", f"{tgt_var} ← {tgt_var} + 1", f"GOTO {lbl_cp_tmp}", # finish with a no-op f"[{lbl_finish}] Z ← Z", ] return lines, z_end class Mult(Macro): _var_expr = f'(?P<{Grammar.cls_group}{{ix}}>[XYZ])(_(?P<{Grammar.ord_group}{{ix}}>[0-9]+))?' var_expr_0 = _var_expr.format(ix=0) var_expr_1 = _var_expr.format(ix=1) var_expr_2 = _var_expr.format(ix=2) pattern = f'{var_expr_0} ← {var_expr_1} \* {var_expr_2}' @classmethod def expand(cls, match, z_start): tmp_store_ix = z_start lbl_loop_ix = tmp_store_ix + 1 lbl_finish_ix = lbl_loop_ix + 1 z_end = lbl_finish_ix + 1 lbl_loop = f"M{lbl_loop_ix}" lbl_finish = f"M{lbl_finish_ix}" tmp_store = f"Z_{tmp_store_ix}" tgt_var = "{}_{}".format( match.group(f'{Grammar.cls_group}0'), match.group(f'{Grammar.ord_group}0') or "1", ) src_var_0 = "{}_{}".format( match.group(f'{Grammar.cls_group}1'), match.group(f'{Grammar.ord_group}1') or "1", ) src_var_1 = "{}_{}".format( match.group(f'{Grammar.cls_group}2'), match.group(f'{Grammar.ord_group}2') or "1", ) lines = [ # Set our target var to zero and copy one of the inputs f"{tgt_var} ← 0", f"{tmp_store} ← {src_var_0}'", # Repeatedly add the second input while we decrement the first f"[{lbl_loop}] IF {tmp_store} = 0 GOTO {lbl_finish}", f"{tgt_var} ← {tgt_var} + {src_var_1}", f"{tmp_store} ← {tmp_store} - 1", f"GOTO {lbl_loop}", # finish with a no-op f"[{lbl_finish}] Z ← Z", ] return lines, z_end
Again you’ll see these look fairly complex, but the basic logic is easy to understand. For addition we want to copy one of our inputs to the target value and then repeatedly increment the target for every unit in our second source. Multiplication works similarly except that instead of incrementing we add the whole value of our first argument for every unit in our second argument (multiplication = repeated addition).
Exercise for the reader here: see if you can write a macro to handle multiple additions in a single line like𝑋0←𝑋1+𝑋2+𝑋3+𝑋4 . (hint: try to reduce this expression to a series of two-argument additions). If you’re feeling very ambitious, try to do it with multiplication.
Compiling with Macros
Now we’ll have to glue these pieces together to actually make a compiler you can use at the command line. The pattern to do so should look familiar from our main routine above.
# Macros in the reverse order of their dependencies so that each # line reduces to elements that follow it MACROS = ( Mult, Add, Copy, SetZero, BEZ, Goto, ) def max_z_ordinal_referenced(parsed_lines, line_classes=LINE_CLASSES): max_z_ordinal = -1 for line in parsed_lines: for cls in line_classes: if match := cls.match(line): ordinal = int(match.group(Grammar.ord_group) or "1") if match.group( Grammar.cls_group) == 'Z' and ordinal > max_z_ordinal: max_z_ordinal = ordinal return max_z_ordinal def expand_macros(source_lines, macros=MACROS): labels, lines = parse_lines(source_lines) current_z = max_z_ordinal_referenced(lines) + 1 for macro in macros: expanded_lines = [] for i, line in enumerate(lines): if match := macro.match(line): new_lines, current_z = macro.expand(match, current_z) new_labels, new_lines = parse_lines(new_lines, False) # any existing labels that were after this line need to be # increased by the number of lines we're adding diff = len(new_lines) - 1 labels.update({ lbl: loc + diff for lbl, loc in labels.items() if loc > len(expanded_lines) + 1 }) # also merge in the labels from the expansion labels.update({ lbl: loc + len(expanded_lines) for lbl, loc in new_labels.items() }) expanded_lines.extend(new_lines) else: expanded_lines.append(line) lines = expanded_lines return labels, lines def main(): program_file, object_file = sys.argv[1:] with open(program_file) as f: source_lines = f.read().split("\n") labels, expanded_lines = expand_macros(source_lines) with open(object_file, 'w') as f: f.write("\n".join(compile_lines(labels, expanded_lines))) if __name__ == "__main__": main()
We first need to compute the max index of Z that is referenced in our initial program so that our individual macro expansions know where to start without overlapping with existing variables (a typically programming language will solve this problem using variable scopes). Next we iterate through each macro in reverse order and apply it to our program. The tricky bit here is that by replacing one line with𝑛 lines we effectively invalidate our labels. So to account for that we have to iterate through each label that references a location after our changed line and add the delta to it.
With all our pieces in place we can finally write a program to realize the function𝑓(𝑥1,𝑥2)=𝑥1⋅𝑥2 that will multiply its two inputs with ease
So, yay! We just wrote 400+ lines of Python to do what you can do in a single line of Python 😄 It’s worth meditating on what we just accomplished though. We reduced multiplication to the simplest of atomic operations a computer can perform and we did so using primitive recursion, a special subset of general recursion that is guaranteed to terminate. We like to use more simple operations when expanding macros so that we can be sure our compilation will finish in finite steps, but in this case even our program itself will always have finite steps which is what proves to us that the mathematical operation of multiplication is always computable.
When Turing first formulated his now famous model of a universal computer he was trying to get to exactly this sort of question, which mathematical functions can be realized as computer programs that will always terminate, and specifically whether or not the truth value of a mathematical proposition is always computable. Let’s let these topics simmer in our heads for a while until we can start to feel them in our bones and revisit this question in more depth in a future post. Until next time!