Binary Encoding

Binary encoding is how computers represent all information — numbers, text, images, instructions — as sequences of ones and zeros.

Why This Matters

Everything stored in a computer — every number, character, instruction, and image — is ultimately a pattern of bits. Understanding binary encoding is not optional for anyone who programs at the hardware level: it determines how you read and write data, how you interpret sensor inputs, how you pack multiple values into a single byte, and how you communicate between devices.

For rebuilders working with early microprocessors and constrained memory, binary encoding is a daily tool. When you read a byte from a hardware register and need to know which specific bits control which hardware functions, you are doing binary encoding. When you store two 4-bit values in a single 8-bit byte to conserve memory, you are doing binary encoding. When you transmit data over a serial line and need to verify it arrived correctly, you are doing binary encoding.

The rules are simple. The applications are endless.

Number Systems

Binary uses base 2: only digits 0 and 1. Each digit is called a bit. A group of 8 bits is a byte. A group of 16 bits is a word (on most early systems). A group of 4 bits is a nibble.

To convert a binary number to decimal, multiply each bit by its place value and sum:

Binary: 1 0 1 1 0 1 0 0
Place:  128 64 32 16 8 4 2 1

= 128 + 0 + 32 + 16 + 0 + 4 + 0 + 0 = 180
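The place-value sum can be sketched in C. This version walks the bit string left to right, doubling the accumulator at each step, which is equivalent to multiplying each bit by its place value (the function name is illustrative):

```c
/* Convert a string of '0'/'1' characters to its unsigned value.
   Doubling the accumulator before adding each bit is the same as
   summing bit * place-value from the most significant bit down. */
unsigned bin_to_uint(const char *bits)
{
    unsigned value = 0;
    while (*bits == '0' || *bits == '1') {
        value = value * 2 + (*bits - '0');  /* shift left, add next bit */
        bits++;
    }
    return value;
}
```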

To convert decimal to binary, repeatedly divide by 2 and record remainders:

180 / 2 = 90 remainder 0
 90 / 2 = 45 remainder 0
 45 / 2 = 22 remainder 1
 22 / 2 = 11 remainder 0
 11 / 2 =  5 remainder 1
  5 / 2 =  2 remainder 1
  2 / 2 =  1 remainder 0
  1 / 2 =  0 remainder 1

Reading remainders bottom to top: 10110100 = 180
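The divide-and-collect procedure above translates directly to C. Remainders come out least significant first, so they are buffered and emitted in reverse, just as the worked example reads bottom to top (the function name is illustrative):

```c
/* Write the binary digits of n into buf (NUL-terminated) by
   repeated division by 2. Remainders arrive least significant
   first, so collect them, then copy out in reverse order. */
void to_binary(unsigned n, char *buf)
{
    char tmp[33];
    int count = 0;
    do {
        tmp[count++] = '0' + (n % 2);  /* record remainder */
        n /= 2;
    } while (n > 0);
    int i = 0;
    while (count > 0)
        buf[i++] = tmp[--count];       /* read bottom to top */
    buf[i] = '\0';
}
```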

Hexadecimal (base 16) is a shorthand for binary. Each hex digit represents exactly 4 bits. Digits 0-9 represent values 0-9; letters A-F represent 10-15.

Binary:  1011 0100
Hex:     B    4      = 0xB4 = 180 decimal

Hexadecimal is the standard notation for machine code, memory addresses, and hardware registers. You will write and read it constantly. Memorize the 4-bit to hex digit mapping: 0000=0, 0001=1, …, 1001=9, 1010=A, 1011=B, 1100=C, 1101=D, 1110=E, 1111=F.
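The nibble-to-hex mapping is mechanical enough to write from scratch: mask off each 4-bit group and index into the digit range. A minimal sketch (function names are illustrative):

```c
/* Convert one 4-bit nibble (0-15) to its hex digit character. */
char nibble_to_hex(unsigned n)
{
    return (n < 10) ? ('0' + n) : ('A' + (n - 10));
}

/* Format an 8-bit byte as two hex digits: high nibble first. */
void byte_to_hex(unsigned char b, char out[3])
{
    out[0] = nibble_to_hex(b >> 4);    /* high nibble */
    out[1] = nibble_to_hex(b & 0x0F);  /* low nibble */
    out[2] = '\0';
}
```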

Encoding Integers

An 8-bit unsigned integer can hold values 0 to 255 (2⁸ - 1). A 16-bit unsigned integer holds 0 to 65535. A 32-bit unsigned integer holds 0 to 4,294,967,295 (about 4 billion).

Signed integers need to encode negative numbers. The standard method is two’s complement. In two’s complement, the highest bit is the sign bit: 0 means positive, 1 means negative.

For an 8-bit two’s complement integer:

  • Values 0x00 to 0x7F represent 0 to 127
  • Values 0x80 to 0xFF represent -128 to -1

To negate a value: flip all bits, then add 1.

To negate 5 (0x05 = 00000101):
Flip bits: 11111010
Add 1:     11111011 = 0xFB = -5

Verify: 5 + (-5) should equal 0.

  00000101
+ 11111011
----------
 100000000  (9 bits; discard the carry out, result = 00000000 = 0) ✓

Two’s complement makes addition and subtraction use the same hardware circuit regardless of sign, which is why all modern CPUs use it.
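The flip-and-add-one rule is one line of C, since unsigned arithmetic wraps modulo 256 exactly as two's complement requires (the function name is illustrative):

```c
/* Two's-complement negation of an 8-bit value: flip all bits, add 1.
   The cast discards the carry out, matching the 9-bit example above. */
unsigned char negate8(unsigned char v)
{
    return (unsigned char)(~v + 1);
}
```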

Character Encoding

Text must be encoded as numbers. The dominant encoding for early microcomputer work is ASCII (American Standard Code for Information Interchange). ASCII assigns a unique 7-bit code to 128 characters: uppercase letters, lowercase letters, digits 0-9, punctuation, and 32 control codes.

Essential ASCII values to memorize:

  • ‘0’ through ‘9’: 0x30 through 0x39 (digit value = ASCII code - 0x30)
  • ‘A’ through ‘Z’: 0x41 through 0x5A
  • ‘a’ through ‘z’: 0x61 through 0x7A (lowercase = uppercase + 0x20)
  • Space: 0x20, Newline: 0x0A, Carriage return: 0x0D, Null: 0x00

The relationship between digit characters and their values (subtract 0x30) lets you convert numeric strings to integers character by character. The relationship between uppercase and lowercase (add 0x20) lets you write case-insensitive comparisons cheaply.
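Both tricks can be sketched in a few lines of C (function names are illustrative; the digit loop assumes a plain unsigned decimal string):

```c
/* Parse a run of ASCII digit characters into an unsigned value,
   subtracting 0x30 ('0') from each character as described above. */
unsigned parse_digits(const char *s)
{
    unsigned value = 0;
    while (*s >= '0' && *s <= '9') {
        value = value * 10 + (*s - 0x30);
        s++;
    }
    return value;
}

/* Fold an uppercase ASCII letter to lowercase by adding 0x20
   (equivalently, setting bit 5); other characters pass through. */
char to_lower_ascii(char c)
{
    if (c >= 'A' && c <= 'Z')
        return c + 0x20;
    return c;
}
```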

An 8-bit byte can hold all ASCII values with one bit to spare. Extended ASCII uses the eighth bit (bit 7) to encode an additional 128 characters — national characters, box-drawing symbols, mathematical notation — in various incompatible code pages. For rebuilding purposes, plain 7-bit ASCII is sufficient for all technical work.

Bit Fields and Packing

When memory is scarce, you can pack multiple small values into a single byte using bit fields. A byte has 8 bits; you can use subsets of those bits to store separate values.

Example: a hardware status register where each bit has a specific meaning:

Bit 7: error flag
Bit 6: ready flag
Bits 4-5: mode (0-3)
Bits 0-3: channel number (0-15)

To read the ready flag (bit 6), mask and shift:

READY = (STATUS AND 0x40) >> 6

To read the mode (bits 4-5):

MODE = (STATUS AND 0x30) >> 4

To set bit 5 without disturbing other bits:

STATUS = STATUS OR 0x20

To clear bit 5:

STATUS = STATUS AND 0xDF    ; 0xDF = 11011111 in binary

This pattern of masking, shifting, ORing, and ANDing appears constantly when programming hardware. The bitwise operators AND, OR, XOR, and NOT are fundamental tools.
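Rendered in C, the pseudocode above becomes a set of one-line field accessors for the example register layout (names are illustrative):

```c
/* Accessors for the example status register:
   bit 7 error, bit 6 ready, bits 4-5 mode, bits 0-3 channel. */
unsigned ready_flag(unsigned char status)  { return (status & 0x40) >> 6; }
unsigned mode_bits(unsigned char status)   { return (status & 0x30) >> 4; }
unsigned channel(unsigned char status)     { return status & 0x0F; }

/* Set or clear bit 5 without disturbing the other bits. */
unsigned char set_bit5(unsigned char s)    { return s | 0x20; }
unsigned char clear_bit5(unsigned char s)  { return s & 0xDF; }
```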

Fixed-Point Numbers

Many early systems lacked floating-point hardware. The solution is fixed-point arithmetic: agree that the binary point sits at a fixed position within a word.

For example, in an 8-bit byte with Q4.4 format (4 integer bits, 4 fractional bits):

  • The value 0x18 = 0001 1000 represents 1.5 (binary 0001.1000 = 1 + 0.5)
  • The value 0x28 = 0010 1000 represents 2.5
  • The value 0xF0 = 1111 0000 represents 15 (all integer bits set, no fraction)

To multiply two Q4.4 numbers, multiply them as integers — the product of two 8-bit values needs 16 bits — and shift the result right 4 bits to restore the scale. This lets you do fractional arithmetic without floating-point hardware.
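A minimal Q4.4 multiply in C, widening to 16 bits for the intermediate product (the function name is illustrative):

```c
/* Multiply two Q4.4 fixed-point values. The operands are widened
   so the intermediate Q8.8 product does not overflow, then shifted
   right 4 bits to return to the Q4.4 scale. */
unsigned char q44_mul(unsigned char a, unsigned char b)
{
    unsigned product = (unsigned)a * (unsigned)b;  /* Q8.8 intermediate */
    return (unsigned char)(product >> 4);          /* back to Q4.4 */
}
```

For example, 1.5 × 2.5 = 3.75: 0x18 × 0x28 = 960, and 960 >> 4 = 60 = 0x3C = 0011.1100 = 3.75.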

Fixed-point is essential for sensor calculations, physical measurements, and financial arithmetic on constrained hardware. A temperature sensor might return values in hundredths of a degree as a 16-bit integer — this is fixed-point encoding with an implied scale factor of 100.

Checksums and Parity

When transmitting or storing data, errors can corrupt bits. Simple encoding schemes detect (and sometimes correct) errors.

Parity bit: Add one extra bit to a byte such that the total number of 1-bits is always even (even parity) or always odd (odd parity). If a single bit flips during transmission, the parity will be wrong, flagging the error. Parity catches single-bit errors but cannot correct them.
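Computing an even-parity bit is a matter of counting 1-bits; a sketch in C (the function name is illustrative):

```c
/* Even-parity bit for a byte: returns 1 when the count of 1-bits
   is odd, so appending the result makes the total count even. */
unsigned parity_bit_even(unsigned char b)
{
    unsigned ones = 0;
    for (int i = 0; i < 8; i++)
        ones += (b >> i) & 1;   /* count set bits */
    return ones & 1;
}
```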

Checksum: Sum all bytes in a block of data and append a check byte derived from the low byte of the result — commonly its two's complement, so that on reception the sum of all bytes including the checksum comes out to zero. (If the raw sum is stored instead, the receiver recomputes the sum and compares against the stored byte.) A simple checksum catches most random errors. Used in serial protocols, ROM validation, and file integrity checking.

XOR checksum: XOR all bytes together. Computationally cheaper than addition, catches different error patterns. Used in many embedded communication protocols.
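Both checksum variants are short in C. This sketch stores the two's complement of the additive sum, so summing a block together with its checksum yields zero (function names are illustrative):

```c
/* Additive checksum: two's complement of the low byte of the sum,
   so (sum of data bytes + checksum) mod 256 == 0 on reception. */
unsigned char checksum_add(const unsigned char *data, int len)
{
    unsigned char sum = 0;
    for (int i = 0; i < len; i++)
        sum += data[i];                 /* wraps modulo 256 */
    return (unsigned char)(-sum);
}

/* XOR checksum: XOR of all bytes; cheaper than addition and
   sensitive to a different set of error patterns. */
unsigned char checksum_xor(const unsigned char *data, int len)
{
    unsigned char x = 0;
    for (int i = 0; i < len; i++)
        x ^= data[i];
    return x;
}
```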

These are not cryptographic security — they do not protect against intentional tampering — but they catch the common case of hardware noise corrupting data during storage or transmission.

Practical Notes for Rebuilders

Print a binary-hex conversion table and keep it at your workstation until you can read hex fluently. Fluency means you see 0x4F and immediately know it is 79 decimal and the ASCII character ‘O’, without calculation.

When debugging, always display values in hex rather than decimal when examining memory contents or register values. Decimal hides the bit patterns that reveal what went wrong.

Learn the common bit masks: 0x0F (low nibble), 0xF0 (high nibble), 0x01 (bit 0), 0x80 (bit 7), 0x7F (all bits except bit 7). These appear so frequently that recognition becomes automatic.

Binary encoding is not an obstacle to computing — it is the foundation. Every abstraction above it (characters, floating-point, protocols) is simply an agreed convention for interpreting bit patterns. Understanding the layer beneath the convention gives you the power to work without the convention when you must.