Machine Code

Machine code is the binary representation of a program that a processor directly executes — the lowest level of software, where instructions are numbers and programs are sequences of bytes.

Why This Matters

Machine code is the native language of every processor. Programs written in high-level languages (C, Python, Java), in assembly language, and even the operating system all ultimately run as machine code: either they are translated into it by a compiler or assembler, or they are executed by an interpreter or virtual machine that is itself machine code. Understanding machine code reveals exactly what a processor does: it fetches bytes from memory, decodes them as instructions, and performs the specified operations. Nothing more.

This understanding becomes essential when building computers from scratch. The first programs running on a hand-built CPU must be entered as machine code — either toggled in through front-panel switches, burned into ROM, or loaded from a bootstrap device. Before assemblers exist, assembly language code must be hand-translated to machine code using the instruction reference.

Machine code is also the answer to “but what does the computer actually do?” — a question that every serious engineer should be able to answer.

How the Processor Fetches and Decodes Instructions

The processor’s fetch-decode-execute cycle (described in Computer Architecture) begins by reading bytes from the address in the Program Counter. The first byte (or word, for 16-bit instructions) is the opcode field.

For a typical 8-bit CISC processor (like the Zilog Z80 or MOS 6502):

  • Fetch byte at PC address: this is the opcode
  • Decode: look up the opcode in the instruction decode logic (a decode ROM, a PLA, or hardwired logic)
  • Determine instruction length (1, 2, or 3 bytes on the 6502; the Z80 also has 4-byte instructions)
  • Fetch additional bytes if needed (operand bytes)
  • Execute

For the 6502, hex opcode 0xA9 means “LDA immediate” — load the accumulator with the next byte. If memory contains:

Address: 0x0200  0x0201
Content: 0xA9   0x42

The CPU fetches 0xA9 at PC=0x0200, decodes it as LDA immediate, fetches the operand 0x42 at PC=0x0201, and loads 0x42 into the accumulator. PC advances to 0x0202 for the next instruction.

Opcode 0x8D means “STA absolute” — store accumulator to a 16-bit address given in the next two bytes (little-endian). If the sequence 0x8D 0x00 0x02 follows, the accumulator is stored at address 0x0200.
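
The two instructions above can be modeled with a minimal fetch-decode-execute sketch. The names Cpu, fetch, and step are illustrative and do not come from any real emulator; status flags and every other opcode are left out, so this is an illustration of the cycle under those assumptions rather than a 6502 implementation.

```python
class Cpu:
    """Toy model of the fetch-decode-execute cycle for two 6502 opcodes."""

    def __init__(self, memory, pc):
        self.mem = memory   # bytearray covering the 64 KiB address space
        self.pc = pc        # program counter
        self.a = 0          # accumulator

    def fetch(self):
        """Read the byte at PC, then advance PC by one."""
        value = self.mem[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF
        return value

    def step(self):
        """Fetch one opcode, decode it, fetch its operand bytes, execute."""
        opcode = self.fetch()
        if opcode == 0xA9:            # LDA immediate: opcode + 1 operand byte
            self.a = self.fetch()
        elif opcode == 0x8D:          # STA absolute: opcode + 2 operand bytes
            low = self.fetch()        # little-endian: low byte comes first
            high = self.fetch()
            self.mem[(high << 8) | low] = self.a
        else:
            raise ValueError(f"unhandled opcode 0x{opcode:02X}")


mem = bytearray(0x10000)
# LDA #$42 followed by STA $0200, the byte sequence discussed above.
mem[0x0200:0x0205] = bytes([0xA9, 0x42, 0x8D, 0x00, 0x02])
cpu = Cpu(mem, pc=0x0200)
cpu.step()   # LDA #$42
cpu.step()   # STA $0200 (this overwrites the LDA opcode byte with 0x42)
```

After the two steps the accumulator holds 0x42, PC is 0x0205, and address 0x0200 contains 0x42 instead of the original opcode, which is a small reminder that code and data share the same memory.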

Hand Assembly: Converting Instructions to Bytes

Before an assembler exists, programs must be hand-assembled: for each instruction, look up its opcode in the instruction reference, then encode operands according to the addressing mode.

Example: hand-assembling a program to add two numbers on a 6502-compatible CPU.

Source intent:

Load first number (at address 0x0300) into A
Add second number (at address 0x0301)
Store result at address 0x0302

Looking up opcodes:

  • LDA absolute: 0xAD, followed by 16-bit address (low byte first)
  • ADC absolute: 0x6D, followed by 16-bit address
  • STA absolute: 0x8D, followed by 16-bit address
  • CLC (clear carry before add): 0x18 (1 byte, no operand)
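  • BRK (break, used here to end the program): 0x00. The 6502 has no HLT instruction; BRK raises a software interrupt instead.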
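
The "low byte first" rule for absolute addresses is easy to get backwards by hand. A small sketch of the encoding follows; the helper name encode_absolute is hypothetical.

```python
def encode_absolute(opcode, address):
    """Return the 3 bytes of an absolute-mode instruction: opcode, low, high."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    low = address & 0xFF    # least significant byte is stored first
    high = address >> 8     # most significant byte is stored second
    return bytes([opcode, low, high])


lda = encode_absolute(0xAD, 0x0300)
print(lda.hex(" "))  # ad 00 03
```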

Hand-assembled bytes:

Addr  Bytes      Instruction
0200: 18         CLC
0201: AD 00 03   LDA $0300
0204: 6D 01 03   ADC $0301
0207: 8D 02 03   STA $0302
020A: 00         BRK

This sequence of 11 bytes, placed at address 0x0200, executes the intended program when the CPU starts at PC=0x0200.
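
The hand-assembly procedure is a table lookup plus operand encoding, which can be sketched in a few lines. The table below covers only the five instructions used in this example, and the names OPCODES and assemble are illustrative assumptions, not part of any real assembler.

```python
# (mnemonic, addressing mode) -> opcode, for the instructions used above only.
OPCODES = {
    ("CLC", "implied"): 0x18,
    ("LDA", "absolute"): 0xAD,
    ("ADC", "absolute"): 0x6D,
    ("STA", "absolute"): 0x8D,
    ("BRK", "implied"): 0x00,
}


def assemble(program, origin):
    """Assemble (mnemonic, operand) pairs; operand is None for implied mode.

    Returns (listing, image): listing holds (address, encoded bytes) per
    instruction, image is the complete byte sequence.
    """
    image = bytearray()
    listing = []
    for mnemonic, operand in program:
        address = origin + len(image)
        if operand is None:
            encoded = bytes([OPCODES[(mnemonic, "implied")]])
        else:
            opcode = OPCODES[(mnemonic, "absolute")]
            encoded = bytes([opcode, operand & 0xFF, operand >> 8])
        image += encoded
        listing.append((address, encoded))
    return listing, bytes(image)


program = [
    ("CLC", None),
    ("LDA", 0x0300),
    ("ADC", 0x0301),
    ("STA", 0x0302),
    ("BRK", None),
]
listing, image = assemble(program, origin=0x0200)
for address, encoded in listing:
    print(f"{address:04X}: {encoded.hex(' ').upper()}")
```

Comparing the printed listing with the hand-assembled table is a quick way to catch a transposed address byte or a miscounted instruction length.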

ROM and Bootstrap Loading

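The first program a computer runs after power-on or reset comes from a ROM (Read-Only Memory) at an address fixed by the processor's reset behavior. The 6502 reads its reset vector from 0xFFFC and 0xFFFD, so its ROM usually occupies the highest addresses in the memory map; the Z80 begins executing at 0x0000, so its ROM sits at the bottom. This bootstrap program: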

  1. Initializes hardware (stack pointer, I/O devices)
  2. Establishes communication (serial port, front panel)
  3. Accepts machine code input from the operator
  4. Jumps to the loaded program

A minimal bootstrap (monitor program) can be just 256 bytes of machine code. It provides commands like:

  • E (Examine) — read memory at address: E 0200 → prints bytes
  • D (Deposit) — write byte to address: D 0200 42 → writes 0x42
  • G (Go) — start execution: G 0200 → jumps to address

With this monitor burned into ROM, the operator can type machine code bytes over the serial link, then run them. Much early microcomputer software was entered this way, or through front-panel switches on machines that had no monitor ROM.
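
The command set above is small enough to sketch as a single dispatch function. This is a Python model of the behavior, not monitor machine code: the function name monitor_command, the choice to show 8 bytes per Examine, and the "?" error reply are all assumptions made for illustration.

```python
def monitor_command(memory, line):
    """Handle one command line.

    Returns the text to print, or ("go", address) when the caller should
    start execution at that address. All numbers are hexadecimal.
    """
    parts = line.split()
    if not parts:
        return ""
    command = parts[0].upper()
    try:
        args = [int(p, 16) for p in parts[1:]]
    except ValueError:
        return "?"
    if command == "E" and len(args) == 1:
        address = args[0]
        chunk = memory[address:address + 8]   # assumed: 8 bytes per Examine
        return f"{address:04X}: " + chunk.hex(" ").upper()
    if command == "D" and len(args) == 2 and args[1] <= 0xFF:
        address, value = args
        memory[address] = value
        return ""
    if command == "G" and len(args) == 1:
        return ("go", args[0])
    return "?"


mem = bytearray(0x10000)
monitor_command(mem, "D 0200 42")
print(monitor_command(mem, "E 0200"))  # 0200: 42 00 00 00 00 00 00 00
```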

Machine Code Debugging

Without a high-level debugger, debugging machine code requires:

Memory examination: read the bytes around the failing instruction. Are they the expected opcodes? Wrong bytes mean the program was loaded incorrectly or self-modified.
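
Memory examination amounts to comparing what is in memory against what was supposed to be loaded. A minimal sketch, using a hypothetical helper find_mismatches and the 11-byte add program from the hand-assembly example with one deliberately mistyped byte:

```python
def find_mismatches(memory, origin, expected):
    """Return (address, expected_byte, actual_byte) for every differing byte."""
    return [
        (origin + i, want, memory[origin + i])
        for i, want in enumerate(expected)
        if memory[origin + i] != want
    ]


expected = bytes.fromhex("18AD00036D01038D020300")
mem = bytearray(0x10000)
mem[0x0200:0x020B] = expected
mem[0x0205] = 0x10   # simulated entry error: 01 typed as 10

for address, want, got in find_mismatches(mem, 0x0200, expected):
    print(f"{address:04X}: expected {want:02X}, found {got:02X}")
```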

Register inspection: the monitor or front panel shows register contents. After execution halts (breakpoint or error), examine PC (which instruction was executing?), accumulator, stack pointer.

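Breakpoints: overwrite the instruction at the suspected problem location with BRK (or the CPU's halt instruction, if it has one), saving the original byte first. On the 6502, BRK does not stop the CPU; it jumps through the IRQ/BRK vector at 0xFFFE, which a monitor points at its own handler so that control returns to the operator. Examine state, restore the original byte, and continue. This works only for code in RAM, since ROM cannot be patched.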
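
The bookkeeping behind a software breakpoint is just "remember the byte, patch in BRK, put the byte back". A sketch under that assumption, with a hypothetical Breakpoints class operating on a bytearray that stands in for RAM:

```python
BRK = 0x00  # 6502 BRK opcode


class Breakpoints:
    """Track bytes replaced by BRK so they can be restored later."""

    def __init__(self, memory):
        self.memory = memory
        self.saved = {}   # address -> original byte

    def set(self, address):
        """Patch BRK over the opcode at address, remembering the original."""
        if address not in self.saved:   # never save a BRK we planted ourselves
            self.saved[address] = self.memory[address]
            self.memory[address] = BRK

    def clear(self, address):
        """Restore the original byte at address."""
        self.memory[address] = self.saved.pop(address)


ram = bytearray(0x10000)
ram[0x0207] = 0x8D           # STA absolute opcode from the add program
bp = Breakpoints(ram)
bp.set(0x0207)               # ram[0x0207] is now 0x00 (BRK)
bp.clear(0x0207)             # ram[0x0207] is 0x8D again
```

The breakpoint must be placed on an opcode byte, not on an operand byte, or the CPU will treat the patched 0x00 as part of an address instead of executing it.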

Trace execution: single-step mode (if the hardware supports it) or simulate execution mentally by reading instructions in sequence and tracking register values on paper.
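
Tracing on paper means writing down the register values after each instruction. The same table can be produced by a small simulator; the sketch below handles only the five opcodes of the add program, models ADC in binary mode only, and tracks no flag other than carry. The function name trace is an assumption for illustration.

```python
def trace(memory, pc):
    """Execute from pc until BRK.

    Returns one (pc, opcode, accumulator, carry) tuple per instruction,
    with register values recorded after the instruction has executed.
    """
    a, carry = 0, 0
    log = []
    while True:
        start = pc
        opcode = memory[pc]
        if opcode == 0x00:                      # BRK: stop tracing
            log.append((start, opcode, a, carry))
            return log
        if opcode == 0x18:                      # CLC
            carry = 0
            pc += 1
        elif opcode in (0xAD, 0x6D, 0x8D):      # absolute-mode instructions
            address = memory[pc + 1] | (memory[pc + 2] << 8)
            if opcode == 0xAD:                  # LDA absolute
                a = memory[address]
            elif opcode == 0x6D:                # ADC absolute (binary mode)
                total = a + memory[address] + carry
                a, carry = total & 0xFF, total >> 8
            else:                               # STA absolute
                memory[address] = a
            pc += 3
        else:
            raise ValueError(f"unhandled opcode 0x{opcode:02X} at {pc:04X}")
        log.append((start, opcode, a, carry))


mem = bytearray(0x10000)
mem[0x0200:0x020B] = bytes.fromhex("18AD00036D01038D020300")
mem[0x0300] = 0x14   # first number
mem[0x0301] = 0x16   # second number
for pc, opcode, a, carry in trace(mem, 0x0200):
    print(f"PC={pc:04X} op={opcode:02X} A={a:02X} C={carry}")
```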

Machine code debugging is slow and tedious, which is why higher-level tools (assemblers, monitors, debuggers) were developed in the earliest days of computing. On a machine built from scratch, the first of these tools must itself be written in machine code; each tool then makes the next one easier to build, so the work proceeds up the tool stack one layer at a time.