Addressing Modes

How instructions specify where their operands are — the fundamental vocabulary of machine language.

Why This Matters

Every machine language instruction must specify where its operands are: where is the data to be read, and where should the result be written? Addressing modes are the different ways an instruction can specify an operand’s location. A processor might support a dozen or more addressing modes, each suited to different situations.

Understanding addressing modes is essential for writing assembly language, understanding disassembled machine code, reasoning about how compilers translate high-level constructs to machine instructions, and debugging programs at the binary level. When you look at a disassembled program or read a processor datasheet, addressing modes appear in every instruction — you cannot understand the code without understanding how each instruction specifies its data.

Addressing modes also directly affect performance. Different addressing modes have different latencies. A programmer or compiler that chooses addressing modes poorly produces code that is correct but unnecessarily slow. Understanding addressing modes enables understanding why some code patterns are faster than others.

Immediate Addressing

In immediate addressing, the operand is embedded directly in the instruction. There is no memory access needed to get the operand — it is part of the instruction itself.

Example in x86 assembly: MOV AX, 42 — move the value 42 into the AX register. The number 42 is an immediate value, encoded directly in the instruction encoding.

Immediate addressing is used for constants: loop counts, mask values, increment amounts. It is the fastest addressing mode because no additional memory access is needed — the value arrives as part of the instruction itself when the instruction is fetched.

Limitation: immediate values are limited in size by the instruction encoding. Most x86 instructions can include an 8-bit, 16-bit, or 32-bit immediate (x86-64 additionally allows a 64-bit immediate in one form of MOV); constants that do not fit must be built from smaller pieces or loaded from memory.

When you write in a high-level language: int x = 5; or if (count < 100), the constants 5 and 100 are typically compiled to immediate operands.

Register Addressing

The operand is in a CPU register. Like immediate addressing, this requires no memory access — registers are part of the processor.

Example: ADD AX, BX — add the value in register BX to the value in register AX, storing the result in AX. Both operands are registers.

Register addressing is the fastest form of non-immediate addressing. Registers are the fastest storage in a computer — faster than cache, much faster than main memory. Code that keeps frequently used values in registers rather than spilling them to memory runs significantly faster.

Modern processors have a limited number of named registers (x86-64 has 16 general-purpose registers; RISC processors typically have 32). Register allocation — deciding which values to keep in registers and which to spill to memory — is one of the most important optimizations a compiler performs.

Direct (Absolute) Addressing

The instruction specifies the actual memory address of the operand. The processor reads the operand directly from that address.

Example: MOV AX, [0x1000] — load the value at memory address 0x1000 into AX.

Direct addressing is simple and useful for accessing fixed memory locations: memory-mapped hardware registers, fixed data structures at known addresses. In early programming, global variables were accessed via direct addressing.

In modern programs, direct addressing for data is less common because programs are loaded at different addresses each run (address space layout randomization, position-independent code). Fixed addresses only work when you control exactly where things are loaded.

Indirect Addressing (Register Indirect)

A register holds the address of the operand. The instruction specifies the register; the processor reads the register value and uses it as a memory address.

Example: MOV AX, [BX] — load the value from the memory address currently in register BX into AX.

Indirect addressing is how pointers work at the machine level. When a high-level language dereferences a pointer (*ptr, ptr->field), the compiler generates register indirect addressing — the pointer value is in a register, and the instruction uses that register’s value as the memory address.

Indirect addressing is fundamental to dynamic data structures: linked lists, trees, and any data structure where the actual memory address is not known until runtime.

Indexed Addressing

The address is computed by adding a register value (the index) to a base address (either from another register or an immediate offset).

Example: MOV AX, [BX + SI] or MOV AX, [BX + 4] — load from the address formed by BX plus SI (another register) or BX plus the constant 4.

Indexed addressing is how arrays are accessed at the machine level. If BX holds the address of an array and SI holds the element index multiplied by the element size, [BX + SI] accesses the correct array element. This common pattern is so important that most processors provide hardware support for scaled indexed addressing.

Scaled indexed addressing (common in x86): MOV EAX, [EBX + ESI*4] — the index register ESI is multiplied by 4 (the element size) automatically by the hardware. (Hardware scaling is available with the 32-bit and 64-bit register forms; the older 16-bit registers such as BX and SI do not support it.) This enables direct array access with a single instruction rather than requiring explicit multiplication of the index.

When you write array[i] in C, the compiler generates indexed addressing. The array base address is one register, the index i is another register (possibly scaled by the element size), and the instruction accesses the element in one operation.

Base + Displacement Addressing

A register holds a base address, and the instruction includes a signed constant offset (displacement). The effective address is base + displacement.

Example: MOV AX, [BP + 8] — load from the address in register BP plus 8. In x86, this is the standard way to access function parameters and local variables on the stack. BP points to the current stack frame; the displacement locates specific parameters or variables within the frame.

Compilers use this pattern extensively for accessing struct fields, object members, and function-local variables. The struct base address is in a register; the field offset (known at compile time) is the displacement. person.age compiles to something like MOV AX, [BX + 12] where BX holds the struct base address and 12 is the offset of the age field within the struct.

PC-Relative Addressing

The address is computed relative to the current program counter (PC). This enables position-independent code that works regardless of where in memory it is loaded.

Example: MOV AX, [PC + 0x80] — load from the address 0x80 bytes past the current instruction. (On x86-64 this takes the concrete form of RIP-relative addressing, e.g. MOV EAX, [RIP + 0x80].)

PC-relative addressing is used by modern compilers to access global variables and constants in position-independent code. Because the offset from the instruction to the data is constant regardless of where the code is loaded, the instruction works correctly even when the program is loaded at different base addresses.

Jump and branch instructions use PC-relative offsets to specify targets: JMP +16 means jump to an instruction 16 bytes ahead (exactly which address the offset is measured from varies by architecture). This also works in position-independent code — the relative distance between instruction and target does not change when the code is relocated.

Choosing Addressing Modes

In assembly programming, choose addressing modes that minimize memory accesses: keep frequently used values in registers (register addressing), use immediate addressing for constants, and use indexed or base+displacement addressing for array and struct access.

In high-level programming, the compiler makes these choices. But understanding addressing modes explains why certain patterns are faster than others:

  • Sequential array access (i++) is faster than random access (i = hash(key)) because sequential access patterns map well to indexed addressing with incrementing indices.
  • Accessing fields of a struct is fast because it compiles to base+displacement addressing with the field offset known at compile time.
  • Dereferencing a chain of pointers (a->b->c->d) is slow because each dereference is an indirect memory access, and each must complete before the next can begin.

Understanding addressing modes bridges the gap between high-level code and machine execution, giving you the mental model needed to reason about performance and to read and understand machine-level code.