The Stack
Part of Programming Fundamentals
The stack is a last-in-first-out memory structure maintained by the CPU for storing return addresses, saving registers, and managing local data across subroutine calls.
Why This Matters
The stack is what makes subroutines work. Without it, there is no way to call a routine from multiple places and have it return to the right place each time. Without nested subroutine calls, no program of significant complexity is possible. The stack is the hardware mechanism that enables function calls, recursion, and structured programming.
For rebuilders, understanding the stack is essential at three levels. First, you must know how it works to use subroutines correctly — pushing and popping values in the right order, not overflowing it, not corrupting it. Second, you must understand it to debug crashes, because most serious bugs ultimately involve the stack in some way — corruption, overflow, or mismatch between pushed and popped values. Third, for those designing operating systems or implementing compilers, the stack is the key data structure for process management and calling conventions.
The stack is also the simplest example of a last-in-first-out (LIFO) data structure, making it foundational for understanding more complex data structures built on the same principle.
Physical Structure
The stack is a region of RAM. One end is fixed (the stack’s base address), and the other end is dynamic (the current top, tracked by the stack pointer register SP). On most 8-bit CPUs, the stack grows downward: each push decrements SP and stores the value at the new SP address. Each pop reads the value at SP and increments SP.
On the Z80:
PUSH BC: SP decreases by 2, then the value of BC (16 bits) is written to the new SP address (B to SP+1, C to SP)POP BC: The 16-bit value at SP is loaded into BC (C from SP, B from SP+1), then SP increases by 2
On the 6502:
- Stack fixed at page 1 (01FF), 8-bit stack pointer (S)
PHA: Write A to address $0100+S, then decrement SPLA: Increment S, then read A from address $0100+S
The 6502’s stack is limited to 256 bytes — very tight for deeply nested calls with many pushed registers. The Z80’s stack can extend throughout RAM (configured by setting SP to an appropriate starting address).
Initialization
Before any subroutine call, the stack pointer must be initialized to a valid RAM address. Failing to do this, or initializing it to the wrong address, causes the first PUSH to write data to an arbitrary memory location, corrupting whatever was there.
At program startup:
; Z80: set SP to top of RAM (stack grows downward from here)
LD SP, 0xFFFF ; start at top of 64K RAM space
Or more specifically, at the boundary between stack space and data space:
STACK_TOP EQU 0x2000 ; stack occupies 0x1000 to 0x1FFF
LD SP, STACK_TOP
Choose the stack size based on your deepest expected call chain plus the registers each level saves. For a simple program with 5 levels of nesting and each level saving 4 registers (8 bytes), 40+ bytes of stack suffices. For complex programs, allocate 256 bytes to 1 KB.
Saving and Restoring Registers
Subroutines use registers for computation. This is a problem if the caller needs those registers to retain their values across the call. The solution: at the start of the subroutine, push the registers you will use (saving them); at the end, pop them in reverse order (restoring them).
MY_SUBROUTINE:
PUSH BC ; save registers this routine will use
PUSH DE
PUSH HL
; ... use BC, DE, HL freely ...
POP HL ; restore in reverse order
POP DE
POP BC
RET
The reverse-order restoration is mandatory. The stack is LIFO — you must pop in the opposite order from pushing. Popping in the wrong order loads the wrong values into the wrong registers.
Document which registers each subroutine modifies (does not restore). Callers that use those registers must save them before the call and restore them after. A subroutine that unconditionally preserves all registers it uses is “register-safe” — the caller never needs to worry about register corruption.
The Call Chain
The stack’s power comes from its ability to handle nested calls automatically:
MAIN:
CALL LEVEL_1 ; pushes MAIN's return address
LEVEL_1:
CALL LEVEL_2 ; pushes LEVEL_1's return address
LEVEL_2:
RET ; pops LEVEL_1's address → returns to LEVEL_1
; back in LEVEL_1:
RET ; pops MAIN's return address → returns to MAIN
Each CALL pushes one return address. Each RET pops one. As long as every CALL is matched by exactly one RET, the stack correctly unwinds the call chain regardless of depth.
Mismatched PUSH/POP within a subroutine corrupts this chain. If a subroutine pushes a register but returns early (via a conditional return) without popping, the next RET pops the register value instead of the return address and jumps to garbage. This is one of the most common stack corruption bugs.
Stack Frames and Local Variables
In languages beyond assembly, subroutines have local variables — values that exist only during a call and are destroyed on return. The standard mechanism is a stack frame: before the main body of the subroutine runs, reserve space on the stack for local variables by decrementing SP by the needed amount.
; Subroutine with 8 bytes of local variable space
MYSUB:
PUSH HL ; save frame pointer (optional)
LD HL, 0
ADD HL, SP ; HL = SP (to access locals via offset)
LD SP, SP - 8 ; reserve 8 bytes (conceptual; use DEC SP or LD SP, SP-8)
; access locals: (HL+0) through (HL+7)
; ...
ADD SP, 8 ; free local space
POP HL
RET
The compiler handles this automatically for higher-level languages. The key insight is that local variables live on the stack — they occupy stack space for the duration of the call, then vanish when the function returns and SP is restored.
Stack Overflow
If the program pushes more values than the stack has space for, the stack pointer advances into code or data memory and overwrites it. This is stack overflow — the stack has “overflowed” into adjacent memory.
Typical symptoms: data corruption that appears unrelated to what the code is doing, crashes in routines that worked before, program counter jumping to invalid addresses.
Prevent overflow by:
- Allocating sufficient stack space (know your maximum call depth and register usage)
- Avoiding unbounded recursion (recursive calls with no guaranteed termination or with very deep recursion)
- Monitoring SP periodically in debug builds: if SP falls below a low-water mark, log a warning
Fill the stack region with a known sentinel value (like 0xAA) before running. Periodically check how far into the sentinel region the stack has grown. The lowest address overwritten by actual stack data is the maximum stack depth — compare to your allocation to see how much headroom remains.
Practical Notes for Rebuilders
Make it a rule: every PUSH in a subroutine has exactly one corresponding POP. Verify this by hand before trusting any subroutine. A single mismatched PUSH/POP can silently corrupt execution long after the subroutine returns — the wrong return address is on the stack but not used until the calling function returns, making the bug appear far from its cause.
Add stack overflow checking to your monitor or diagnostic code. A simple check: is SP still within the allocated stack region? Run this check periodically (say, at the top of the main loop) and halt with a diagnostic if violated. Finding overflow early saves hours of confusion.
For the first programs on a new machine, keep call nesting shallow and register saving minimal. As experience with the hardware grows and stack space is confirmed adequate, deepen the call chain.