From Register Machines to Brainfuck, part 2
7 September 2010 (programming haskell brainfuck language)

This is a continuation of my previous post on register machines vs. Brainfuck programs. We left off at Brainfuck's supposed Turing-completeness.
Now, the most straightforward way to prove that a given language is Turing-complete is to write a compiler from a language already known to be Turing-complete: the compiler takes any program in the known language and produces a program in the language under scrutiny that simulates it. So an obvious way to prove that Brainfuck is a Turing-complete language is to compile register machine programs into Brainfuck. This has the added advantage that a programmer with some experience in real-world assembly programming can easily write register machine programs, which can then be compiled into (horribly inefficient and over-complicated, as we'll see) Brainfuck programs.
Important note: Of course, to really prove, in a mathematical sense, that Brainfuck is Turing-complete, we would first have to define formal operational semantics for register machines and Brainfuck programs, to even be able to argue about simulating one with the other. In this post, I will appeal to intuition instead.
So how does one simulate a register machine (RM for short) using Brainfuck? The first idea is that since a given RM program can only reference a finite number of registers, we can lay them out in the linear array of memory cells provided by the Brainfuck model. So we can assign, e.g., cell #0 to a, #1 to b and #2 to z; any operation working on z then first increments the pointer twice (i.e. >> in Brainfuck notation), does something, and finally decrements the pointer twice (<<) to return it to its initial position. So for the line
clr z

we can write
>>[-]<<
Similarly, we can compile a loop that, for example, keeps decrementing z and incrementing a while z is non-zero to

>>[-<<+>>]<<

where the compiled loop body sits between the brackets, and we make sure it leaves the pointer at z's cell before the closing bracket.
Enter Loop
In fact, to make further work simpler, we can devise an intermediate language that has constructs similar to Brainfuck's, but uses named registers instead of a linear array. This language, called Loop, has the following statements:
- inc r, dec r, clr r: Increment, decrement, and zero out the register r, respectively.
- while r stmts: Repeatedly execute the statements in the body as long as the value of r is non-zero.
- out r, in r: Write the value of r to the output, and read a value from the input into r, respectively.
Once all the registers are laid out in the linear memory, compiling this to Brainfuck is trivial.
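To make this concrete, here is a minimal Haskell sketch of that last step, assuming a fixed layout of the registers; the Loop data type and the compile function below are only an illustration of the idea, not code taken from an actual compiler:

import Data.List (elemIndex)
import Data.Maybe (fromJust)

type Reg = String

data Loop
  = Inc Reg | Dec Reg | Clr Reg
  | While Reg [Loop]
  | Out Reg | In Reg

-- Compile a Loop program, given the layout of the registers in the memory array.
compile :: [Reg] -> [Loop] -> String
compile layout = concatMap stmt
  where
    -- Move the pointer to a register's cell, emit some code there, then move back.
    at r code = replicate n '>' ++ code ++ replicate n '<'
      where n = fromJust (elemIndex r layout)

    stmt (Inc r)        = at r "+"
    stmt (Dec r)        = at r "-"
    stmt (Clr r)        = at r "[-]"
    stmt (Out r)        = at r "."
    stmt (In r)         = at r ","
    stmt (While r body) = at r "[" ++ concatMap stmt body ++ at r "]"

With the layout from before, compile ["a", "b", "z"] [Clr "z"] yields exactly the >>[-]<< snippet above. The While case works because every compiled statement leaves the pointer back at cell #0, so re-positioning to r before both brackets keeps the loop test on the right cell.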
From RM to Loop
As we've previously noted, the other major difference between RM and Brainfuck is that Brainfuck programs can't directly control their execution sequence. If the next characters of your Brainfuck program are <++, you can be 100% sure that the pointer will move left and the cell under it will then be incremented twice, and there is nothing you can do about it. Contrast this with RM's jmp and jz instructions, which can change which statement gets executed next.
To reconcile this difference, the key idea is to start thinking about RM programs in a different way. Instead of a sequential list of instructions with possible jumps between them, let's look at an RM program as an n-ary branch switching on some special register called a Program Counter. So for the following program that adds a to b (the instructions are numbered, since the jumps refer to these numbers, and tmp is used to preserve a's value):
- 1: clr tmp
- 2: jz a 7
- 3: dec a
- 4: inc tmp
- 5: inc b
- 6: jmp 2
- 7: jz tmp 11
- 8: dec tmp
- 9: inc a
- 10: jmp 7
- 11: Done.
we can also imagine it as the following program, written in some unspecified pseudo-code:
pc ← 1
while pc ≠ 11 loop
    switch pc
    case 1:  clr tmp; pc ← 2
    case 2:  if a = 0 then pc ← 7 else pc ← 3
    case 3:  dec a; pc ← 4
    case 4:  inc tmp; pc ← 5
    case 5:  inc b; pc ← 6
    case 6:  pc ← 2
    case 7:  if tmp = 0 then pc ← 11 else pc ← 8
    case 8:  dec tmp; pc ← 9
    case 9:  inc a; pc ← 10
    case 10: pc ← 7
end loop
At first glance, we don't seem to be any closer to our goal, since now we have to implement if and switch in Loop. First, let's observe that it makes no difference if several values of the pc register are handled in a single iteration of the outermost loop. Using this observation, and getting rid of some superfluous partitioning of statements, the above can be rewritten as the following:
pc ← 1
while pc ≠ 11 loop
    if pc = 1 then
        clr tmp
        pc ← 2
    if pc = 2 then
        if a = 0 then pc ← 7 else pc ← 3
    if pc = 3 then
        dec a
        inc tmp
        inc b
        pc ← 2
    if pc = 7 then
        if tmp = 0 then pc ← 11 else pc ← 8
    if pc = 8 then
        dec tmp
        inc a
        pc ← 7
end loop
We've eliminated the need for switch, and all our if branches fall into one of the following two categories:
- Testing whether the special register pc equals a given predetermined value
- Testing whether a given register is non-zero
The first kind of test can be simulated by using not just one pc register, but a separate register pci for each possible value of pc, each holding either 0 or 1. We enforce the invariant that pci is 1 iff the virtual pc register equals i. Then we can use while loops for branching: test pci and decrement it immediately inside the body, thus ensuring that the loop runs at most once. The above program thus becomes:
dec pc11
inc pc1
while pc11 loop
    while pc1 loop
        dec pc1
        clr tmp
        inc pc2
    end loop
    while pc2 loop
        dec pc2
        if a = 0 then inc pc7 else inc pc3
    end loop
    while pc3 loop
        dec pc3
        dec a
        inc tmp
        inc b
        inc pc2
    end loop
    while pc7 loop
        dec pc7
        if tmp = 0 then inc pc11 else inc pc8
    end loop
    while pc8 loop
        dec pc8
        dec tmp
        inc a
        inc pc7
    end loop
end loop

Note the special handling of pc11: it gets decremented first, to -1, so that incrementing it later (when the simulated program reaches its final state) makes it zero and exits the main loop.
We are inching closer and closer to our destination: we just need a way to increment one of two pc registers based on the value of some other, non-pc register. Solving this requires some trickery, because the only test we have is the while loop, which keeps running as long as its register is non-zero, so we have to zero that register out eventually unless we want to get into an infinite loop. The solution is similar to what we did in our original addition example: use a separate register as temporary storage. Suppose we want to translate the following piece of code:
if a = 0 then inc pc7 else inc pc3
Using a temporary buffer register, we can run two loops that, by the end, restore a's initial value while letting us change other registers in the process. We will use two special-purpose registers, Z and NZ, to signal whether the value of a is zero or non-zero. First, we set them up:
inc Z
inc NZ
while a loop
    dec a
    inc Buffer
    clr Z
end loop
while Buffer loop
    dec Buffer
    inc a
end loop
By that point, a has regained its original value, and Z is 1 iff a was zero to begin with. So now we can discriminate between the two cases using yet more loops:
while Z loop
    dec Z
    inc pc7
    dec NZ
end loop
while NZ loop
    dec NZ
    inc pc3
end loop
Note how the loop for Z decrements NZ, thereby preventing the other branch from running.
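Packaged up, the whole zero test can be expressed as a small helper. The sketch below reuses the Loop type from the earlier Haskell fragment; the name ifZero and its argument names are made up for illustration, and Z, NZ and Buffer are assumed to be dedicated scratch registers that start out (and end up) at zero:

-- Loop code for "if r = 0 then inc pcZero else inc pcNonZero".
ifZero :: Reg -> Reg -> Reg -> [Loop]
ifZero r pcZero pcNonZero =
  [ Inc "Z", Inc "NZ"
  , While r [Dec r, Inc "Buffer", Clr "Z"]      -- Z survives only if r was zero
  , While "Buffer" [Dec "Buffer", Inc r]        -- restore r from Buffer
  , While "Z"  [Dec "Z", Inc pcZero, Dec "NZ"]  -- the "r was zero" branch
  , While "NZ" [Dec "NZ", Inc pcNonZero]        -- the "r was non-zero" branch
  ]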
Wrapping it up
We've now arrived at a valid Loop program, which can readily be translated into a Brainfuck program. I've implemented an RM → Loop → Brainfuck compiler using the above scheme in my Brainfuck toolbox.
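For reference, the RM → Loop half of the scheme can be sketched in Haskell as well, reusing the Loop type and the ifZero helper from the fragments above. Again, this is only an illustration of the scheme described in this post, not the actual code of the toolbox; the RM type and the pc-register naming convention are made up:

-- A tiny RM instruction set, just enough for the examples in this post.
data RM
  = ClrI Reg       -- clr r
  | IncI Reg       -- inc r
  | DecI Reg       -- dec r
  | Jmp Int        -- unconditional jump
  | Jz Reg Int     -- jump if the register is zero

-- One flag register per possible value of the program counter.
pcReg :: Int -> Reg
pcReg i = "pc" ++ show i

rmToLoop :: [RM] -> [Loop]
rmToLoop prog =
  [ Dec (pcReg halt)   -- starts at -1, so the main loop keeps running until we halt
  , Inc (pcReg 1)      -- execution begins at instruction 1
  , While (pcReg halt) (concatMap step (zip [1 ..] prog))
  ]
  where
    halt = length prog + 1
    -- Each instruction becomes a while loop that runs at most once per iteration.
    step (i, instr) = [While (pcReg i) (Dec (pcReg i) : body instr (i + 1))]
    body (ClrI r) next = [Clr r, Inc (pcReg next)]
    body (IncI r) next = [Inc r, Inc (pcReg next)]
    body (DecI r) next = [Dec r, Inc (pcReg next)]
    body (Jmp t)  _    = [Inc (pcReg t)]
    body (Jz r t) next = ifZero r (pcReg t) (pcReg next)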
One surprising aspect of the above is that the resulting Brainfuck program, while hideously complicated and large, doesn't perform that badly. Maya was kind enough to write a register machine program solving the 9-digit problem (source here), and I compiled it into x86 assembly via the Brainfuck route, to compare it with his native Brainfuck solution. Let's look at program size first: the native one is 4,591 instructions long, while the one compiled from RM comes in at a whopping 480,466 instructions. However, the two implementations showed running times in the same order of magnitude.
Unfortunately, I don't have a corpus of algorithms implemented in both RM and Brainfuck lying around, so I can't do any real benchmarks. But compared to my initial expectations, the result on the 9-digit problem is promising: I figured this whole RM → Brainfuck compiler scheme would turn out to be a strictly theoretical result, producing Brainfuck programs so slow as to be completely impractical.
Epilogue: I wanted to write some Agda-checked proofs that the compiler actually generated equivalent programs. As it turned out, this is not so easy. I hope I'll have time to get back to this problem soon.