Interpreters
Part of Programming Fundamentals
An interpreter executes a program by reading and performing each instruction directly, without first translating it to machine code — enabling immediate interactivity and portability at the cost of speed.
Why This Matters
The interpreter is the fastest path from “no software exists” to “programmers can write useful programs.” Writing a BASIC interpreter takes one skilled programmer a few weeks. Writing a compiler takes months. The interpreter gets your computing environment productive first; the compiler comes later when you need speed.
For a rebuilding civilization, the interpreter offers another critical advantage: immediacy. Type a line of BASIC, press Enter, and see it execute. Make a mistake, correct it, run again. This interactive feedback loop accelerates learning and debugging compared to the edit-compile-link-run cycle of compiled languages. For people learning to program for the first time, this immediacy makes the difference between an approachable tool and an intimidating one.
Understanding how interpreters work also demystifies the entire relationship between programs and machines. An interpreter is just a program that reads instructions and does what they say — and at some level, that is all a CPU is, too.
What an Interpreter Does
An interpreter reads source code (or a tokenized representation of it) and executes it directly. Unlike a compiler, which translates the entire program to machine code before any of it runs, an interpreter processes and executes each instruction as it encounters it.
The execution cycle:
- Read the next statement from the source
- Parse it (determine what operation to perform and what operands to use)
- Execute it (perform the operation, update variables, produce output)
- Advance to the next statement
- Repeat
This cycle is called the fetch-decode-execute loop, mirroring the hardware cycle of a CPU itself. The interpreter is, in a sense, a software CPU that executes a program language rather than machine code.
Architecture of a BASIC Interpreter
A typical BASIC interpreter has these components:
Line store: Program lines are stored sorted by line number. Common implementation: a linked list or sorted array of (line-number, text) pairs. When the user types a line, it is inserted or replaced. LIST traverses the store in order. RUN begins executing from the lowest line number.
Tokenizer: Converting text to tokens on each execution is slow. Most production BASIC interpreters tokenize on input: keywords (PRINT, IF, GOTO) are replaced by single-byte codes, strings and numbers are stored in compact form. The tokenized representation is what is stored and executed, not the original text. LIST converts back to readable form on demand.
Variable table: Stores the current value of every variable. Typically an associative array of (name, value) pairs. Simple implementations use a linear search; production interpreters use hash tables for faster lookup.
Expression evaluator: Evaluates arithmetic and logical expressions. Implements operator precedence (multiplication before addition), handles function calls (SIN, SQR, LEN), and resolves variable references.
Statement executor: A dispatch table or chain of if-else statements mapping each statement type to its implementation. Executing PRINT calls the PRINT handler; executing IF calls the IF handler, which evaluates the condition and either executes the THEN clause or jumps to the ELSE clause or NEXT line.
Line pointer: Tracks which line is currently executing. GOTO N sets this to line N. GOSUB N pushes the current line onto the return stack and jumps to N. RETURN pops the return stack and jumps back.
Expression Evaluation in Detail
Evaluating 3 + 4 * 2 must produce 11 (not 14), respecting multiplication’s higher precedence. The recursive descent approach makes this natural.
Define the grammar:
expression = term (('+' | '-') term)*
term = factor (('*' | '/') factor)*
factor = number | variable | '(' expression ')'
The expression function calls term, checks for + or -, calls term again if found, and adds or subtracts. The term function calls factor, checks for * or /, calls factor again if found, and multiplies or divides. Since term calls factor and expression calls term, multiplication naturally has higher precedence — it is evaluated deeper in the recursion, returning a complete result before the addition operator can act.
This recursive descent evaluator can be written in 50-100 lines of well-structured code and handles arbitrary complexity through recursion.
Handling Control Flow
Control flow in an interpreter is trickier than expression evaluation because statements affect the line pointer in non-local ways.
GOTO N: Search the line store for line N. Set the current line pointer to that entry. Continue execution there.
GOSUB N / RETURN: GOSUB pushes (current line number, position in that line) onto a call stack, then jumps to line N. RETURN pops the call stack and resumes there.
FOR I = start TO end: Push a FOR frame onto a loop stack: (variable name, current value, end value, step, line number of the FOR statement). Each subsequent NEXT I increments the variable, checks if it has exceeded the end value, and either continues with the line after NEXT (loop done) or jumps back to the FOR line.
IF condition THEN statement / GOTO: Evaluate the condition. If true, execute the THEN clause or perform the GOTO. If false, advance to the next line.
These operations make the interpreter substantially more complex than a simple expression evaluator but do not add conceptual difficulty — each is a well-defined state machine operation.
Performance Characteristics
Interpreters are slow compared to compiled code — typically 10 to 100 times slower, sometimes more. Every statement that executes must be parsed and dispatched, and variable accesses go through a lookup table.
This matters when: running compute-intensive numerical simulations, processing large datasets in tight loops, responding to real-time events with strict timing requirements.
It does not matter when: writing short programs that run occasionally, doing interactive data entry and display, controlling hardware at human timescales (tenths of a second or slower), learning and experimenting.
For a rebuilding civilization, the majority of initial software needs fall in the “does not matter” category. An interpreted BASIC that runs at 1% of compiled speed is fast enough for record-keeping, calculations, and interactive tools. Speed becomes a concern as applications mature — and by that point, a compiler is feasible to build.
Tokenizing Interpreters
Storing tokenized (compressed) program lines rather than raw text gives interpreters a significant speed advantage because parsing happens once at input time rather than on every execution. Microsoft’s BASIC interpreters for 8-bit computers all used this approach.
The token codes also serve as an efficient representation of the program. Common BASIC keywords fit in a single byte each. Numbers are stored as binary floating-point rather than ASCII digit strings. A typical BASIC program tokenizes to 30-50% of its source text size — important when total memory is 16 KB.
Threaded Interpreters
Forth and similar languages use a threaded interpretation model for near-native-code speed. Rather than storing the program as text or tokens, the program is stored as a list of addresses — pointers to the machine code implementations of each word (operation). The interpreter’s inner loop simply follows these pointers and calls each routine in sequence.
This eliminates parsing and dispatch lookup from the execution loop, leaving only the overhead of the indirect calls. Threaded code runs at 5-20x the speed of text interpretation, approaching compiled code for many tasks. Forth’s speed, combined with its tiny interpreter, makes it suitable for real-time embedded applications where BASIC would be too slow.
Practical Notes for Rebuilders
Implement the expression evaluator first — it is the heart of the interpreter and can be tested independently before any other component. Write test cases: simple literals, arithmetic expressions with various precedences, nested parentheses. Get this working completely before adding statement execution.
Make the variable table implementation replaceable. A simple linear search works for programs with 10 variables; a hash table is needed for programs with 100. Design the interface so you can swap implementations when performance demands it.
An interpreter that runs on 4 KB of ROM and 1 KB of RAM is a genuine achievement that enables a computing culture on severely constrained hardware. Optimize for compactness, not speed, in the first implementation.