Computer Architecture

Computer architecture defines how the major subsystems of a computer (control unit, ALU, memory, and I/O) are organized and how they interact to execute programs.

Why This Matters

Computer architecture is the blueprint that turns individual digital circuits into a functioning programmable machine. Without a coherent architectural design, you might have all the components — adders, registers, memory — but no way for them to work together to execute instructions. Architecture answers the fundamental questions: how does the machine know what to do next? How does it fetch data? How does it decide?

Understanding architecture enables building computers rather than just using them. When civilization-rebuilding efforts reach the computing tier, the engineers involved must make architectural decisions: how wide should the data bus be? How many registers? What instruction format? These choices determine what the machine can do and how efficiently it does it.

Most architectures are variations on a small number of fundamental patterns. Grasping those patterns makes it far easier to design a new computer or to understand an existing one.

The Von Neumann Model

Nearly all practical computers implement the Von Neumann architecture, proposed by John von Neumann in 1945. Its central insight: programs and data are both stored in the same memory, encoded as numbers. The processor reads program instructions the same way it reads data — this is the “stored program” concept.

Five components:

  1. Memory: stores both program instructions and data as binary numbers
  2. ALU (Arithmetic Logic Unit): performs arithmetic and logic operations
  3. Control Unit (CU): decodes instructions and generates control signals
  4. Input: receives data from the external world
  5. Output: sends data to the external world

The processor (CPU) combines the ALU and Control Unit, together with a small set of registers. Memory is separate. I/O devices connect through the same bus as memory or through dedicated I/O circuits.

The fetch-decode-execute cycle is the heart of Von Neumann operation:

  1. Fetch the instruction at the address in the Program Counter (PC)
  2. Decode the instruction to determine what operation to perform
  3. Execute the operation (which may involve fetching operands from memory)
  4. Update the PC (typically PC = PC + 1 for single-word instructions, or the branch target when a jump is taken)
  5. Repeat

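This cycle repeats millions or billions of times per second in modern processors. In a hand-built 4-bit CPU clocked at 1 MHz, it completes at most one million times per second, and fewer when each instruction needs several clock cycles for its fetch, decode, and execute steps.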
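
The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not a model of any real CPU: the four opcodes, the 16-bit word layout (opcode in the high byte, operand in the low byte), and the names make_word and run are all assumptions made up for the example. Note that the program lives in the same list as any data would, as plain numbers.

```python
# Hypothetical opcodes for a toy accumulator machine (not a real instruction set).
LOAD_IMM, ADD_IMM, JUMP_IF_NOT_ZERO, HALT = 0, 1, 2, 3


def make_word(opcode, operand=0):
    """Pack an instruction into one number: opcode in the high byte, operand in the low byte."""
    return (opcode << 8) | (operand & 0xFF)


def run(memory, max_steps=1000):
    """Run until HALT; return (accumulator, number of instructions executed)."""
    pc = 0   # Program Counter
    acc = 0  # Accumulator
    for step in range(1, max_steps + 1):
        word = memory[pc]                           # fetch
        pc += 1                                     # default PC update
        opcode, operand = word >> 8, word & 0xFF    # decode
        if opcode == LOAD_IMM:                      # execute
            acc = operand
        elif opcode == ADD_IMM:
            acc = (acc + operand) & 0xFF            # wrap to 8 bits
        elif opcode == JUMP_IF_NOT_ZERO:
            if acc != 0:
                pc = operand                        # branch overrides the default update
        elif opcode == HALT:
            return acc, step
        else:
            raise ValueError(f"unknown opcode {opcode}")
    raise RuntimeError("step limit reached without HALT")


# Count down from 3 to 0: adding 0xFF wraps around, which subtracts 1.
PROGRAM = [
    make_word(LOAD_IMM, 3),
    make_word(ADD_IMM, 0xFF),
    make_word(JUMP_IF_NOT_ZERO, 1),
    make_word(HALT),
]

print(run(PROGRAM))
```

The sketch increments the PC immediately after the fetch and lets a taken branch overwrite it, which is one common way to order steps 1 and 4.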

Register File and Internal Buses

Registers are high-speed storage inside the processor, directly accessible without memory latency. A minimal CPU might have:

  • Program Counter (PC): address of next instruction
  • Instruction Register (IR): holds the instruction currently being decoded
  • Stack Pointer (SP): address of the top of the call stack
  • Accumulator (A): primary arithmetic register
  • General-purpose registers (B, C, D, etc.): additional working storage
  • Status Register (SR): condition flags (zero, carry, negative, overflow)

Registers connect to the ALU and to memory through internal buses. A simple single-bus architecture routes all register outputs onto one shared bus; at any moment, only one register may place its value on the bus. Tri-state buffers (three-state logic: drive high, drive low, high-impedance/disconnected) gate each register’s connection.
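
The one-driver-at-a-time rule can be modeled as follows. This is a rough behavioral sketch, assuming a hypothetical SharedBus class in which each register has an output-enable line; real tri-state hardware has no error checking, and two enabled drivers would simply fight each other electrically.

```python
class SharedBus:
    """Behavioral model of a single internal bus shared by several registers."""

    def __init__(self, registers):
        self.registers = dict(registers)  # register name -> current value

    def read(self, enabled):
        """Return the bus value given the set of registers whose output is enabled."""
        if len(enabled) > 1:
            # Two tri-state buffers driving at once: bus contention.
            raise ValueError(f"bus contention between {sorted(enabled)}")
        if not enabled:
            return None  # every buffer is high-impedance; the bus floats
        (name,) = enabled
        return self.registers[name]

    def transfer(self, source, destination):
        """Enable one source register and latch the bus value into the destination."""
        self.registers[destination] = self.read({source})


bus = SharedBus({"A": 5, "B": 0})
bus.transfer("A", "B")
print(bus.registers)
```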

A two-bus architecture (separate address and data buses for internal paths) lets one register drive an address while another value moves on the data bus, so a memory access can begin while a register transfer is still in progress. This improves throughput.

Instruction Format Design

An instruction is a binary number that encodes what operation to perform, which registers to use, and (sometimes) an immediate value or memory address. The instruction format determines how many operations, registers, and operand values a single instruction can name.

For an 8-bit CPU with 8-bit instructions:

  • 3 bits for opcode (8 possible operations)
  • 2 bits for source register (4 registers)
  • 2 bits for destination register (4 registers)
  • 1 bit for immediate flag

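This format is too compact for many useful programs: it allows only 8 operations and leaves no room inside the instruction for an immediate value or a memory address. Real 8-bit CPUs (like the 6502) use variable-length instructions: a 1-byte opcode, optionally followed by 1 or 2 bytes of operand.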

For a 16-bit instruction word with 32 opcodes:

  • 5 bits opcode
  • 3 bits destination register (8 registers)
  • 3 bits source register A
  • 3 bits source register B
  • 2 bits addressing mode

The addressing mode field determines how to interpret the source register field: as a register number, as a memory address, as a pointer, etc.
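
Packing and unpacking the 16-bit format is a matter of shifts and masks. The sketch below assumes the fields are laid out from the most significant bit down in the order listed (opcode, destination, source A, source B, addressing mode); the text gives only the field widths, so that ordering and the function names are assumptions.

```python
def encode(opcode, dest, src_a, src_b, mode):
    """Pack the five fields into one 16-bit instruction word."""
    fields = (("opcode", opcode, 5), ("dest", dest, 3), ("src_a", src_a, 3),
              ("src_b", src_b, 3), ("mode", mode, 2))
    for name, value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return (opcode << 11) | (dest << 8) | (src_a << 5) | (src_b << 2) | mode


def decode(word):
    """Split a 16-bit instruction word back into its fields."""
    return {
        "opcode": (word >> 11) & 0x1F,  # bits 15..11
        "dest": (word >> 8) & 0x7,      # bits 10..8
        "src_a": (word >> 5) & 0x7,     # bits 7..5
        "src_b": (word >> 2) & 0x7,     # bits 4..2
        "mode": word & 0x3,             # bits 1..0
    }


word = encode(opcode=1, dest=2, src_a=3, src_b=4, mode=1)
print(f"{word:016b}", decode(word))
```

In hardware the decode step costs nothing: each field is just a group of wires taken from the instruction register.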

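Fixed-length vs. variable-length instructions: fixed-length simplifies decoding (always fetch N bytes per instruction); variable-length uses fewer bytes per program, but the start of each instruction depends on the length of the one before it, so instructions must be located sequentially and decoding several in parallel takes considerably more hardware.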
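
The sequential dependency in variable-length decoding shows up clearly when splitting a byte stream into instructions. The sketch below uses four 6502 opcodes and their lengths (NOP, LDA immediate, STA absolute, JMP absolute); the table is a tiny subset chosen for illustration, and the function name is an assumption.

```python
# Instruction lengths in bytes for a small subset of 6502 opcodes.
LENGTHS = {
    0xEA: 1,  # NOP
    0xA9: 2,  # LDA #immediate
    0x8D: 3,  # STA absolute
    0x4C: 3,  # JMP absolute
}


def split_instructions(code):
    """Walk the byte stream; each opcode tells us where the next instruction starts."""
    instructions = []
    offset = 0
    while offset < len(code):
        length = LENGTHS[code[offset]]  # KeyError for opcodes outside this subset
        instructions.append(bytes(code[offset:offset + length]))
        offset += length                # cannot be known before reading the opcode
    return instructions


# LDA #$05 ; STA $0200 ; NOP
program = bytes([0xA9, 0x05, 0x8D, 0x00, 0x02, 0xEA])
print([ins.hex() for ins in split_instructions(program)])
```

With a fixed-length format the loop body would reduce to a constant stride, and every instruction boundary would be known in advance.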

Control Unit Design

The control unit reads the instruction register and generates all the control signals that make the instruction happen. Two implementation approaches:

Hardwired control: Boolean logic circuits directly decode the instruction register and drive control signals. Fast and efficient but complex to design, and changing it after construction means rewiring the logic. Appropriate when the instruction set is fixed and the design is mature.

Microprogrammed control: a small ROM (read-only memory) stores microcode, a sequence of micro-operations for each instruction. The opcode in the instruction register, typically combined with a micro-step counter, forms the ROM address; each ROM row drives the control signals for one micro-step. Simpler to design and modify (change the ROM contents), but slightly slower because every micro-step waits on the ROM access time.

For a hand-built CPU, microprogrammed control with a small EEPROM is recommended. It separates the instruction set design from the hardware, allowing iteration. A bug in an instruction’s behavior means rewriting the microcode, not reworking the circuit.
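
A microcode ROM image of this kind can be generated by a short script and then burned into the EEPROM. The sketch below is one possible layout, not a prescribed one: the control-signal names, the two example instructions, the 4-step limit, and the addressing scheme (opcode in the upper address bits, step counter in the lower two) are all assumptions for illustration.

```python
# Hypothetical control signals, one bit each in the ROM output word.
PC_OUT = 1 << 0     # PC drives the address bus
MEM_OUT = 1 << 1    # memory drives the data bus
IR_IN = 1 << 2      # instruction register latches the data bus
PC_INC = 1 << 3     # increment the PC
A_IN = 1 << 4       # accumulator latches the data bus
STEP_RST = 1 << 5   # reset the micro-step counter (instruction finished)

STEPS_PER_INSTRUCTION = 4  # 2 address bits for the step counter

# The first two steps are identical for every opcode: the IR is not valid yet.
FETCH = [PC_OUT | MEM_OUT | IR_IN, PC_INC]

MICROCODE = {
    0x00: FETCH + [STEP_RST],                                   # NOP
    0x01: FETCH + [PC_OUT | MEM_OUT | A_IN, PC_INC | STEP_RST], # load A with the next word
}


def build_rom(microcode, num_opcodes=32):
    """Return the ROM contents as a flat list indexed by (opcode << 2) | step."""
    rom = []
    for opcode in range(num_opcodes):
        steps = list(microcode.get(opcode, microcode[0x00]))    # undefined opcodes act as NOP
        steps += [STEP_RST] * (STEPS_PER_INSTRUCTION - len(steps))
        rom.extend(steps)
    return rom


def control_word(rom, opcode, step):
    """Look up the control signals for one micro-step, as the hardware would."""
    return rom[(opcode << 2) | step]


rom = build_rom(MICROCODE)
print(len(rom), bin(control_word(rom, 0x01, 2)))
```

Fixing a faulty instruction then means editing its entry in the table and regenerating the image, which is the iteration benefit described above.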

Memory and Bus Architecture

The memory bus carries addresses (from CPU to memory, selecting which location to access) and data (bidirectional: the CPU writing to memory, or memory returning data to the CPU). A control signal selects read vs. write direction.

Data bus width determines how much data moves per transfer: an 8-bit bus transfers 1 byte per cycle; a 16-bit bus transfers 2 bytes. A wider bus costs more (more wires, more drivers) but moves data faster.

Address bus width determines memory capacity: 16-bit address = 2^16 = 65,536 addressable locations (64 KB when each location holds one byte). 20-bit = 2^20 = 1,048,576 locations (1 MB). For a first-generation hand-built system, 16-bit addresses and 8-bit data are entirely sufficient.
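
The bus-width arithmetic is easy to check with a short calculation. The helper names below are assumptions, and the transfer count assumes an ideal bus that moves one full bus width per transfer with no overhead.

```python
import math


def addressable_locations(address_bits):
    """Number of distinct addresses an address bus of this width can select."""
    return 1 << address_bits


def transfers_needed(num_bytes, data_bus_bits):
    """Bus transfers required to move num_bytes over a data bus of the given width."""
    bytes_per_transfer = data_bus_bits // 8
    return math.ceil(num_bytes / bytes_per_transfer)


for bits in (16, 20):
    print(bits, "address bits ->", addressable_locations(bits), "locations")

for width in (8, 16):
    print(width, "bit data bus ->", transfers_needed(256, width), "transfers for 256 bytes")
```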

Memory-mapped I/O: I/O devices are assigned addresses in the memory address space. Reading from address 0xFFFF might read a keyboard register; writing to 0xFFFE might write to an output display. This simplifies the CPU — only one bus access mechanism needed. Alternative: separate I/O address space with IN/OUT instructions (used by Intel x86).
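
Memory-mapped I/O amounts to address decoding in front of memory. The sketch below uses the two example addresses from the paragraph above (0xFFFF for the keyboard, 0xFFFE for the display); the class name, the buffer-based device models, and the choice to return 0 when no key is pending are assumptions.

```python
KEYBOARD_ADDR = 0xFFFF  # reads return the next pending key code
DISPLAY_ADDR = 0xFFFE   # writes send one byte to the display


class MemoryMappedBus:
    """16-bit address space with 8-bit data; two addresses are routed to devices."""

    def __init__(self):
        self.ram = bytearray(0x10000)
        self.keyboard_buffer = []  # key codes waiting to be read (hypothetical device)
        self.display_output = []   # bytes written to the display (hypothetical device)

    def read(self, address):
        address &= 0xFFFF
        if address == KEYBOARD_ADDR:
            return self.keyboard_buffer.pop(0) if self.keyboard_buffer else 0
        return self.ram[address]

    def write(self, address, value):
        address &= 0xFFFF
        value &= 0xFF
        if address == DISPLAY_ADDR:
            self.display_output.append(value)  # goes to the device, not to RAM
        else:
            self.ram[address] = value


bus = MemoryMappedBus()
bus.write(0x1234, 0x42)                 # ordinary memory write
bus.keyboard_buffer.append(ord("A"))    # simulate a key press
key = bus.read(KEYBOARD_ADDR)           # the CPU uses a normal load to read the key
bus.write(DISPLAY_ADDR, key)            # and a normal store to show it
print(bus.read(0x1234), key, bus.display_output)
```

From the CPU's side, the load and store instructions are identical for RAM and for devices; only the address differs.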

Start simple: one address space, 8-bit data, 16-bit addresses, 8 registers, 32 instructions. This architecture is fully buildable from discrete TTL logic on a large breadboard and produces a real working computer capable of running meaningful programs.