About
The Simple Programming language is a toy language that is easy to parse and easy to interpret, yet powerful enough to solve problems from number theory. Its purpose is to demonstrate the formal definition of languages using the Backus-Naur notation.
The language
SP has four statements:
READ var | Read an int from the user, and store it in var |
WRITE expr | Print the value of expr |
LET var = expr | Set value of var to expr |
GOTO label [IF expr] | Jump to the specified line. If expr is present, the jump is conditional on expr ≧ 0 |
Here's the complete BNF for SP:
<program> ::= <progline> | <program> <progline> <progline> ::= <number> : <stmt> ; | <stmt> ; <stmt> ::= READ <var_id> | WRITE <expr> | LET <var_id> = <expr> | GOTO <number> | GOTO <number> IF <expr> <expr> ::= <term> | <term> + <term> | <term> - <term> <term> ::= <factor> | <factor> * <factor> | <factor> / <factor> <factor> ::= ( <expr> ) | <number> | <var_id> <var_id> ::= LETTER | LETTER DIGIT <number> ::= DIGIT | <number> DIGIT
There are 33 variables available for programs, named X, Y, Z, X0, …, X9, Y0, …, Y9, Z0, …, Z9. The type of variables and expressions is uniformly signed integer.
Sample SP program
The following program calculates the greatest common denominator of two nonnegative numbers X and Y, using a somewhat naïve approach. You can download it to test your own parser.
READ X; READ Y; 100: GOTO 200 IF Y - X; LET X = X - Y; GOTO 100; 200: GOTO 300 IF X - Y; LET Y = Y - X; GOTO 100; 300: WRITE X;
Implementing SP
Python interpreter
This Python program parses and interprets SP programs. Parsing is done via the Yappy parser generator, and the program is internally represented as a list of statement objects. Expressions are represented as binary trees, taking advantage of the fact that the grammar definition of SP allows for no non-binary operations.
Yappy provides a very nice interface for writing parsers. Note how cleanly semantic actions are described in the grammar table. The single fault I can find with Yappy is that both its lexer and its parser generator require the usage of strings for id's. Allowing arbitrary identities would make for even nicer code.
Python compiler
This compiler, also written in Python and obviously based on the above code, compiles SP programs into IA-32 assembly. Expressions are compiled into assembly instructions putting results on the stack. IO is implemented via libc's printf and scanf functions.
Obviously, because of the very nature of the SP language, no fancy type calculus or memory layouting is done. It does, however, demonstrate how there's nothing magic about writing a compiler.
Yes, the implementation (unified with the interpreter's code) needs to be refactored into multidispatch methods.