Assembly Language
Assembly language is the thin human-readable layer directly above machine code, giving programmers direct control over hardware with minimal abstraction.
Why This Matters
Assembly language is the lowest level at which humans can write programs without manually encoding binary. Every instruction maps directly to one CPU operation — no compiler, no runtime, no hidden layers. Understanding assembly means understanding what a computer actually does, instruction by instruction.
In a civilization-rebuilding context, assembly language may be the only programming tool available when reconstructing computing from scratch. High-level languages require compilers, which require working computers to run. A programmer who understands assembly can write software for any machine with nothing more than the processor’s instruction reference and a paper notebook. The first programs for the earliest computers were written in machine code or assembly, because no other tools existed yet.
Assembly also builds the mental model that makes all higher-level programming comprehensible. When a C programmer understands that a + b compiles to a LOAD, ADD, STORE sequence, they write better code. Assembly is the bridge between hardware and thought.
Structure of an Assembly Program
Assembly source code consists of lines, each specifying one CPU operation. A typical line has four fields:
LABEL: MNEMONIC OPERANDS ; comment
The label (optional) marks the current address for use as a jump target or data reference. The mnemonic is the human-readable name for the instruction (MOV, ADD, JMP). Operands specify registers, memory addresses, or immediate values. Comments document intent.
A minimal program on a hypothetical 8-bit CPU:
START: LDA #10 ; load immediate value 10 into accumulator
LDB #7 ; load immediate value 7 into register B
ADD B ; add B to accumulator (A = A + B = 17)
STA RESULT ; store accumulator to memory address RESULT
HLT ; halt execution
RESULT: DB 0 ; reserve 1 byte, initialized to 0
The assembler converts mnemonics to binary opcodes, resolves label addresses, and outputs machine code ready to load into memory.
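To make the listing's behavior concrete, here is a minimal Python sketch of an interpreter for it. The CPU is hypothetical, so the instruction semantics below (8-bit wraparound on ADD, a named memory cell for RESULT) are assumptions chosen for illustration, not a description of any real processor.

```python
# Hypothetical model of the 8-bit CPU used in the listing above.
# The instruction semantics are assumptions made for illustration.

def run(program, memory):
    """Execute a list of (mnemonic, operand) tuples until HLT."""
    regs = {"A": 0, "B": 0}
    pc = 0  # index of the next instruction to execute
    while True:
        mnemonic, operand = program[pc]
        pc += 1
        if mnemonic == "LDA":      # load immediate value into the accumulator
            regs["A"] = operand & 0xFF
        elif mnemonic == "LDB":    # load immediate value into register B
            regs["B"] = operand & 0xFF
        elif mnemonic == "ADD":    # add the named register to A, wrapping at 8 bits
            regs["A"] = (regs["A"] + regs[operand]) & 0xFF
        elif mnemonic == "STA":    # store the accumulator to a named memory cell
            memory[operand] = regs["A"]
        elif mnemonic == "HLT":    # stop and report the final register state
            return regs
        else:
            raise ValueError(f"unknown mnemonic: {mnemonic}")


program = [
    ("LDA", 10),
    ("LDB", 7),
    ("ADD", "B"),
    ("STA", "RESULT"),
    ("HLT", None),
]
memory = {"RESULT": 0}
final_regs = run(program, memory)
print(memory["RESULT"])  # 17
```

The interpreter works on symbolic tuples rather than binary opcodes, which keeps the fetch, decode, execute cycle visible without committing to a particular encoding.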
Registers and Addressing Modes
Registers are the CPU’s internal working storage — faster than memory, but few in number. A minimal CPU might have:
- Accumulator (A): primary arithmetic register
- Index register (X or Y): used for pointer arithmetic and array indexing
- Stack pointer (SP): points to the top of the call stack
- Program counter (PC): address of the next instruction to execute
- Status register (SR): holds condition flags (zero, carry, negative, overflow)
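As an illustration of how the condition flags in the status register are typically derived, the following Python sketch models an 8-bit addition. The function name add8 and the dictionary layout are hypothetical; the flag definitions follow the common convention for two's-complement arithmetic, and a specific CPU may differ in detail.

```python
def add8(a, b):
    """Add two 8-bit values; return (result, flags) as a typical ALU might."""
    total = a + b
    result = total & 0xFF  # keep only the low 8 bits
    flags = {
        "zero": result == 0,              # result is exactly zero
        "carry": total > 0xFF,            # unsigned result did not fit in 8 bits
        "negative": bool(result & 0x80),  # bit 7 set means negative when signed
        # Signed overflow: both inputs share a sign bit that differs from the result's.
        "overflow": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags


result, flags = add8(0x50, 0x50)  # 80 + 80 = 160, too large for a signed byte
print(hex(result), flags["overflow"], flags["carry"])  # 0xa0 True False
```

Carry and overflow answer different questions: carry reports that the unsigned sum exceeded 255, while overflow reports that the signed sum left the range -128 to 127.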
Addressing modes determine how operands are interpreted:
- Immediate: LDA #42 (the value 42 is encoded directly in the instruction)
- Direct/Absolute: LDA $1000 (load from memory address 0x1000)
- Register: ADD B (operand is register B)
- Indirect: LDA (PTR) (PTR contains the address to load from; a pointer dereference)
- Indexed: LDA $1000,X (load from address 0x1000 + X; array element access)
- Relative: BNE LOOP (branch to LOOP, encoded as a signed offset from the current PC)
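Understanding addressing modes is essential for writing efficient assembly. Immediate mode is typically the fastest way to supply a constant, because the value arrives with the instruction itself and needs no separate data fetch from memory. Indirect and indexed modes enable dynamic data structures and arrays.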
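The addressing modes above can be sketched in Python as different ways of turning an operand into a value. This is an illustrative model only: the helper names read_operand and branch_target are hypothetical, pointers are assumed to be stored as two little-endian bytes, and the branch offset is assumed to be relative to the address just after the branch instruction. Register mode is omitted because it involves no memory address.

```python
def read_operand(mode, operand, memory, x=0):
    """Return the value an instruction would load under the given addressing mode."""
    if mode == "immediate":
        return operand                      # the operand is the value itself
    if mode == "direct":
        return memory[operand]              # the operand is the address of the value
    if mode == "indirect":
        # Assumption: the pointer occupies two bytes, low byte first.
        pointer = memory[operand] | (memory[operand + 1] << 8)
        return memory[pointer]
    if mode == "indexed":
        return memory[(operand + x) & 0xFFFF]  # base address plus index register
    raise ValueError(f"unknown addressing mode: {mode}")


def branch_target(pc_after_branch, offset_byte):
    """Resolve a relative branch: interpret the byte as a signed 8-bit offset."""
    offset = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    return (pc_after_branch + offset) & 0xFFFF


memory = bytearray(0x10000)
memory[0x1000] = 0x11
memory[0x1003] = 0x33
memory[0x0020] = 0x03   # PTR low byte
memory[0x0021] = 0x10   # PTR high byte, so PTR holds 0x1003

print(read_operand("immediate", 42, memory))        # 42
print(read_operand("direct", 0x1000, memory))       # 17
print(read_operand("indirect", 0x0020, memory))     # 51
print(read_operand("indexed", 0x1000, memory, x=3)) # 51
print(hex(branch_target(0x0208, 0xFA)))             # 0x202
```

Indirect and indexed access reach the same byte here, which shows why both are useful for arrays: one keeps the base in memory, the other keeps the offset in a register.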
Common Instruction Patterns
Counting loop:
LDA #0 ; counter = 0
LOOP: ADD #1 ; counter += 1
CMP #10 ; compare with 10
BNE LOOP ; if not equal, repeat
Memory copy (N bytes from SRC to DST):
LDX #0 ; index = 0
COPY: LDA SRC,X ; load byte from source[X]
STA DST,X ; store to dest[X]
INX ; X++
CPX #N ; compare with count
BNE COPY ; repeat until done
Subroutine call and return:
JSR MYSUB ; push return address, jump to MYSUB
... ; execution continues here after RTS
MYSUB: LDA #42 ; subroutine body
RTS ; pop return address, jump back
The stack is crucial for subroutines. JSR pushes the return address; RTS pops it. Local variables can be pushed onto the stack and popped on return, enabling recursive procedures.
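The push and pop behavior of JSR and RTS can be modeled with a short Python sketch. The Stack class and the jsr and rts helpers are hypothetical names, the stack is assumed to grow downward, and each return address is stored as a single entry for simplicity (a real 8-bit CPU would usually push it as two bytes). The addresses are arbitrary examples.

```python
class Stack:
    """A fixed-size call stack that grows downward (an assumed convention)."""

    def __init__(self, size=256):
        self.memory = [0] * size
        self.sp = size  # stack pointer; equal to size means the stack is empty

    def push(self, value):
        if self.sp == 0:
            raise OverflowError("stack overflow")
        self.sp -= 1
        self.memory[self.sp] = value

    def pop(self):
        if self.sp == len(self.memory):
            raise IndexError("stack underflow")
        value = self.memory[self.sp]
        self.sp += 1
        return value


def jsr(stack, pc_after_call, target):
    """Push the return address, then continue execution at the target."""
    stack.push(pc_after_call)
    return target


def rts(stack):
    """Pop the most recent return address and continue execution there."""
    return stack.pop()


stack = Stack()
pc = jsr(stack, 0x0103, 0x0200)  # main program calls MYSUB
pc = jsr(stack, 0x0205, 0x0300)  # MYSUB calls a nested routine
inner_return = rts(stack)        # nested routine returns into MYSUB
outer_return = rts(stack)        # MYSUB returns into the main program
print(hex(inner_return), hex(outer_return))  # 0x205 0x103
```

Because the most recent return address is always popped first, nested and recursive calls unwind in the correct order without any extra bookkeeping.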
Writing an Assembler
An assembler is a simple program (or even a manual process) that translates assembly text to machine code. A two-pass assembler works as follows:
Pass 1: Read all lines, assign addresses to labels, build a symbol table:
START = 0x0000
RESULT = 0x000A
LOOP = 0x0004
Pass 2: Re-read each line, look up label addresses, encode each instruction:
- Look up the mnemonic’s opcode in a table
- Encode the addressing mode
- Resolve label references using the symbol table
- Output the bytes
A hand assembler (done on paper) follows the same process. The programmer maintains the symbol table manually and looks up opcodes in the instruction reference manual. Historical programmers routinely hand-assembled programs of hundreds of instructions before automated tools existed.
Practical Tips for Assembly Programming
Always comment liberally. Assembly has no self-documenting names — a comment every 3–5 instructions is not excessive. Describe intent, not mechanics: ; multiply by 10 is more useful than ; add A to A.
Draw the memory map before writing code. Know where code lives, where the stack is, where variables go. Stack and program data colliding causes spectacular crashes with no error message.
Use symbolic constants instead of magic numbers. Define MAX_COUNT EQU 64 at the top of the file rather than scattering literal 64s throughout the code.
Test incrementally. Assemble and test each subroutine before building the next. With assembly, bugs from multiple untested components interacting are nearly impossible to debug.
On real hardware, use a single-step mode (if available) or toggle switches to execute one instruction at a time and inspect register and memory contents after each step. This is slow but certain.