Data Storage

Why This Matters

A computer without persistent storage must be reprogrammed from scratch every time it is powered on. All programs, all data, all accumulated knowledge --- gone the instant the power fails. Data storage is what transforms a computer from a calculator into a knowledge repository. It is how civilization preserves and transmits information at scale.

What You Need

For magnetic storage:

  • Ferrite ring cores (for core memory) or iron oxide coating material
  • Fine copper wire, 30-40 AWG (for core memory threading)
  • Electric motor and precision bearing assembly (for disk/drum)
  • Read/write heads: small electromagnets with very fine pole gaps
  • Flexible or rigid substrate (aluminum disc or plastic tape)
  • Iron oxide powder (Fe2O3) mixed with binder (lacquer, shellac)
  • Amplifier circuits (transistor-based, for reading weak magnetic signals)

For paper storage:

  • Heavy card stock (manila card, approximately 0.18 mm thick)
  • Paper tape (continuous roll, standard telegraph tape works)
  • Precision punch mechanism (steel dies)
  • Optical sensors (LED + phototransistor pairs) or mechanical brush contacts

For semiconductor memory:

  • Transistors and capacitors (for SRAM/DRAM)
  • Diodes (for ROM arrays)
  • Breadboard or PCB fabrication capability

Storage Fundamentals

Volatile vs Persistent

All computer storage falls into two categories:

Volatile (loses data when power is removed):

  • Registers (inside the processor)
  • SRAM (static RAM --- flip-flop based)
  • DRAM (dynamic RAM --- capacitor based, loses data even with power unless refreshed)

Persistent (retains data without power):

  • Magnetic tape, drum, disk
  • Punched cards and tape
  • ROM (read-only memory, data burned in at manufacture)
  • EPROM/EEPROM (erasable/electrically erasable programmable ROM)

The Memory Hierarchy

Speed and cost per byte rise together, while capacity falls. Fast memory is expensive and small. Cheap memory is slow and large.

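| Level | Type               | Speed           | Typical Size              | Cost/Byte |
|-------|--------------------|-----------------|---------------------------|-----------|
| 1     | CPU Registers      | 1 ns            | 8-64 bytes                | Highest   |
| 2     | SRAM (Cache)       | 5-20 ns         | 1-256 KB                  | Very high |
| 3     | DRAM (Main memory) | 50-100 ns       | 64 KB-several MB          | Moderate  |
| 4     | Magnetic disk      | 5-20 ms         | 1 MB-several GB           | Low       |
| 5     | Magnetic tape      | Seconds-minutes | Unlimited (add more tape) | Lowest    |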

A well-designed system uses each level as a staging area for the next: the processor works from registers, loads data from RAM, which is filled from disk, which is backed up to tape.

Capacity Units

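| Unit   | Equals           | Practical Reference            |
|--------|------------------|--------------------------------|
| 1 bit  | Single 0 or 1    | One transistor switch position |
| 1 byte | 8 bits           | One character of text (ASCII)  |
| 1 KB   | 1,024 bytes      | About half a page of text      |
| 1 MB   | 1,048,576 bytes  | A short novel                  |
| 1 GB   | ~1 billion bytes | An encyclopedia                |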

Paper-Based Storage

Paper storage is the easiest to build from scratch and the most durable for long-term archival. It was used from the 1890s through the 1970s and remains a viable technology for a post-collapse rebuild.

Punched Cards

The standard Hollerith punched card is 187 x 83 mm (7.375 x 3.25 inches), divided into 80 columns and 12 rows. Each column represents one character, encoded by the pattern of holes punched in that column.

Making punched cards:

  1. Cut heavy card stock to standard dimensions (precision is important for machine feeding)
  2. Mark a grid of 80 columns x 12 rows (column spacing: 2.21 mm, row spacing: 6.35 mm)
  3. Use a hand punch with a guide template, or build a mechanical punch with spring-loaded dies
  4. Each hole is a rectangular slot, approximately 3.2 x 1.3 mm

Encoding: Each column can have 0, 1, or more holes punched. The pattern of holes encodes a character using the Hollerith code:

| Character | Holes Punched (rows)    |
|-----------|-------------------------|
| 0-9       | Single hole in rows 0-9 |
| A-I       | Row 12 + rows 1-9       |
| J-R       | Row 11 + rows 1-9       |
| S-Z       | Row 0 + rows 2-9        |
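
The zone-plus-digit pattern in the table can be sketched in Python. This is only an illustration: the names `hollerith_rows` and `punch_card` are hypothetical, and treating a space as an unpunched column is an assumption rather than something the table states.

```python
def hollerith_rows(ch):
    """Return the tuple of card rows punched for one character."""
    ch = ch.upper()
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if ch == " ":
        return ()                                  # assumption: blank column
    if "0" <= ch <= "9":
        return (int(ch),)                          # single hole in rows 0-9
    if "A" <= ch <= "I":
        return (12, ord(ch) - ord("A") + 1)        # row 12 + rows 1-9
    if "J" <= ch <= "R":
        return (11, ord(ch) - ord("J") + 1)        # row 11 + rows 1-9
    if "S" <= ch <= "Z":
        return (0, ord(ch) - ord("S") + 2)         # row 0 + rows 2-9
    raise ValueError(f"no code in this table for {ch!r}")


def punch_card(text):
    """Encode up to 80 characters as a list of per-column row tuples."""
    if len(text) > 80:
        raise ValueError("a card has only 80 columns")
    return [hollerith_rows(ch) for ch in text]
```

For example, `punch_card("A1")` gives one column punched in rows 12 and 1, followed by one column punched in row 1 only.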

Capacity: 80 characters per card. A program or data set is a stack of cards (a "deck"). A box of 2,000 cards holds 160,000 characters (about 156 KB), which is roughly 80 pages of text at the half-page-per-KB rule of thumb.

Punched Tape

A continuous roll of paper tape, typically 25.4 mm (1 inch) wide, with 5 to 8 data holes per row plus a smaller sprocket hole for mechanical feeding.

Advantages over cards:

  • Continuous (no 80-character limit)
  • Cheaper to produce
  • Easier to feed through a reader
  • Can be spliced for editing (cut out bad section, glue in corrected tape)

Standard 8-level tape: Each row has 8 data positions (one byte) plus a sprocket hole. The sprocket hole is smaller and positioned between columns 3 and 4.

Row format (8-level):
O O O O O . O O O
| | | | | | | | |
8 7 6 5 4 S 3 2 1   (S = sprocket, smaller hole)

Tip

Punched tape is the easiest storage medium to fabricate and read. You can punch holes with a simple hand tool and read them optically (LED shining through hole onto phototransistor) or mechanically (spring-loaded pins drop through holes to make electrical contact). Start here for your first persistent storage system.

Building a Tape Reader

Optical reader:

  1. Mount 8 LEDs in a line matching the hole spacing, shining downward
  2. Mount 8 phototransistors directly below, facing up
  3. Pass the tape between LEDs and phototransistors
  4. Where there is a hole, light passes through and the phototransistor conducts (reading a 1)
  5. Where there is no hole, light is blocked (reading a 0)
  6. A sprocket hole sensor triggers the read for each row
  7. Connect phototransistor outputs to the computer's input port

Tape advance mechanism:

  • A sprocket wheel engages the sprocket holes
  • A stepper motor or hand crank advances one row per step
  • The computer reads each row, advances the tape, reads the next row
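
The read loop described above can be sketched as follows. It is a simulation rather than driver code: the sensor snapshots are plain Python tuples, the names `row_to_byte` and `read_tape` are hypothetical, and treating track 1 as the least significant bit is an assumption.

```python
def row_to_byte(holes):
    """Convert 8 hole sensors to a byte.

    holes[0] is track 1, assumed here to be the least significant bit.
    True means light passed through a hole (a 1).
    """
    value = 0
    for track, punched in enumerate(holes):
        if punched:
            value |= 1 << track
    return value


def read_tape(samples):
    """Decode a stream of (sprocket_seen, holes) sensor snapshots.

    A row is latched only when the sprocket sensor goes from dark to lit,
    so a row that stays under the reader is not counted twice.
    """
    data = []
    previous = False
    for sprocket_seen, holes in samples:
        if sprocket_seen and not previous:
            data.append(row_to_byte(holes))
        previous = sprocket_seen
    return bytes(data)
```

Latching on the sprocket edge rather than on a timer means the tape speed does not need to be constant, which suits a hand crank.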

Magnetic Storage

Magnetic storage records data by magnetizing small regions of a magnetic material in one of two directions, representing 0 and 1.

Magnetic Recording Principles

Key concepts:

Hysteresis: When you magnetize a material and remove the magnetizing force, the material retains some magnetization (remanence). This retained magnetization is your stored data.

Coercivity: The strength of the opposing magnetic field needed to demagnetize the material. High coercivity materials are harder to write but resist accidental erasure. Low coercivity materials are easy to write but risk data loss from stray magnetic fields.

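| Material                 | Coercivity  | Use                                                          |
|--------------------------|-------------|--------------------------------------------------------------|
| Gamma iron oxide (Fe2O3) | Medium      | Standard tape and disk coating                               |
| Chromium dioxide (CrO2)  | High        | High-density tape                                            |
| Ferrite ceramic          | Medium-high | Core memory rings                                            |
| Soft iron                | Very low    | Read/write head cores (must magnetize and demagnetize easily) |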

Magnetic Core Memory

Before semiconductor RAM existed, core memory was the standard random-access storage technology. It uses tiny ferrite rings (cores), each storing one bit.

How it works:

  1. Each core is a toroid (donut shape) of ferrite ceramic, 0.5-2 mm in diameter
  2. Three wires thread through each core: X select, Y select, and sense/inhibit
  3. To write a 1: send current through both X and Y wires simultaneously. Neither wire alone carries enough current to flip the core, but together they exceed the threshold. The core magnetizes clockwise.
  4. To write a 0: the inhibit wire cancels the Y current for cores that should store 0
  5. To read: send read current through X and Y. If the core was storing 1, it flips to 0 and induces a pulse on the sense wire. If it was already 0, no pulse.

Destructive read: Reading a core always resets it to 0. The controller must immediately rewrite the data after reading. This is handled automatically in hardware.
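
A toy model may help show why half-selected cores stay put and why every read needs a rewrite. The class below is a sketch under stated assumptions: the current values `HALF_SELECT` and `THRESHOLD` are made-up units, and the class and method names are hypothetical.

```python
class CorePlane:
    """Toy model of one coincident-current core plane (one bit per core)."""

    HALF_SELECT = 0.6   # assumed current on one selected X or Y line
    THRESHOLD = 1.0     # assumed current needed to flip a core

    def __init__(self, size):
        self.size = size
        self.cores = [[0] * size for _ in range(size)]

    def _drive(self, x, y, target, inhibit=False):
        """Pulse lines x and y toward `target`; return True if a core flipped."""
        flipped = False
        for cx in range(self.size):
            for cy in range(self.size):
                current = 0.0
                if cx == x:
                    current += self.HALF_SELECT
                if cy == y:
                    current += self.HALF_SELECT
                if inhibit:
                    current -= self.HALF_SELECT   # inhibit wire opposes the drive
                if current > self.THRESHOLD and self.cores[cx][cy] != target:
                    self.cores[cx][cy] = target
                    flipped = True
        return flipped

    def sense(self, x, y):
        """Destructive read: drive toward 0; a flip means the core held a 1."""
        return 1 if self._drive(x, y, target=0) else 0

    def write(self, x, y, bit):
        self._drive(x, y, target=0)                       # clear first
        self._drive(x, y, target=1, inhibit=(bit == 0))   # inhibit keeps it 0

    def read(self, x, y):
        bit = self.sense(x, y)
        self.write(x, y, bit)     # restore what the read destroyed
        return bit
```

Calling `sense` twice on a core that held a 1 returns 1 and then 0, while `read` returns 1 both times because it rewrites the bit.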

Building core memory:

  1. Obtain or make ferrite cores (small ferrite beads work, or press ferrite powder mixed with binder into toroids)
  2. Thread three wires through each core in a grid pattern
  3. An 8x8 grid gives 64 bits (8 bytes). A 32x32 grid gives 1,024 bits (128 bytes)
  4. Total wire threading for a 32x32 plane: 3 wires x 1,024 cores = 3,072 threading operations

Warning

Core memory construction is extraordinarily tedious. A 1 KB memory (8 planes of 32x32) requires threading over 24,000 wire passages through tiny ferrite rings. This was done by hand (usually by women with excellent fine motor skills) until automated machinery took over. Budget weeks of patient work for even a small core memory array.
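
The threading counts above follow from simple multiplication, sketched here so other array sizes can be budgeted. The function names are hypothetical, and the default of 3 wires per core is taken from the description of the X, Y, and sense/inhibit wires.

```python
def threading_passes(rows, cols, planes=1, wires_per_core=3):
    """Number of times a wire must pass through a core."""
    return rows * cols * planes * wires_per_core


def capacity_bytes(rows, cols, planes=1):
    """Storage in bytes, at one bit per core."""
    return rows * cols * planes // 8


print(threading_passes(32, 32))             # one 32x32 plane
print(threading_passes(32, 32, planes=8))   # eight planes
print(capacity_bytes(32, 32, planes=8))     # bytes stored by eight planes
```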

Magnetic Tape

Magnetic tape stores data sequentially on a long strip of plastic film coated with magnetic oxide.

Making magnetic tape:

  1. Start with a smooth, flexible plastic substrate (polyester film, Mylar, or even smooth paper in a pinch)
  2. Mix gamma iron oxide (Fe2O3) powder with a binder (lacquer, shellac, or acrylic)
  3. Apply a thin, uniform coating to one side of the substrate
  4. Allow to dry completely
  5. Slit to the desired width (standard: 12.7 mm / 0.5 inch)

Read/write head:

  1. Wind a small coil (100-500 turns of 40 AWG wire) around a C-shaped soft iron core
  2. The gap in the C faces the tape surface (gap width: 0.01-0.05 mm)
  3. To write: current through the coil magnetizes the tape as it passes the gap
  4. To read: magnetized tape passing the gap induces voltage in the coil
  5. The head must make firm, consistent contact with the tape surface

Tape drive mechanics:

  • Two reels (supply and take-up) with a motor-driven capstan between them
  • The capstan controls tape speed precisely
  • Tape speed: 10-100 cm/second for simple systems
  • A tension mechanism prevents tape slack and ensures constant head contact

Data format:

  • Data is recorded in blocks (records) separated by inter-record gaps (blank tape sections)
  • Each block starts with a preamble (synchronization pattern) and ends with a checksum
  • Between blocks, the tape can be stopped and started without losing data position
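
One possible block layout is sketched below. Everything specific in it is an assumption made for illustration: the `PREAMBLE` byte pattern, the one-byte length field, and the sum-modulo-256 checksum are not prescribed by the text, and the function names are hypothetical.

```python
PREAMBLE = bytes([0xAA] * 4) + bytes([0x7E])   # hypothetical sync pattern


def checksum(payload):
    """Sum of all payload bytes, kept to one byte."""
    return sum(payload) % 256


def frame_block(payload):
    """Wrap a payload as: preamble, length byte, data, checksum byte."""
    if len(payload) > 255:
        raise ValueError("payload too long for a one-byte length field")
    return PREAMBLE + bytes([len(payload)]) + payload + bytes([checksum(payload)])


def parse_block(raw):
    """Find the preamble in raw tape data and return the verified payload."""
    start = raw.find(PREAMBLE)
    if start < 0:
        raise ValueError("no preamble found")
    body = raw[start + len(PREAMBLE):]
    if not body or len(body) < body[0] + 2:
        raise ValueError("block is truncated")
    length = body[0]
    payload = body[1:1 + length]
    if checksum(payload) != body[1 + length]:
        raise ValueError("checksum mismatch")
    return payload
```

Zero bytes before and after a framed block stand in for the blank inter-record gap; `parse_block` skips them while searching for the preamble.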

Magnetic Disk

Magnetic disk provides random access: any piece of data can be read without scanning through all preceding data, unlike tape.

Basic construction:

  1. A rigid aluminum disc coated with magnetic oxide (like tape, but circular)
  2. The disc spins on a motor-driven spindle at constant speed (360-3,600 RPM for early systems)
  3. A read/write head floats above or rests on the disc surface
  4. Data is recorded in concentric circular tracks
  5. The head moves radially to access different tracks

Organization:

  • Tracks: Concentric circles on the disc surface (typically 40-200 per side)
  • Sectors: Each track is divided into equal arc segments (typically 8-32 per track)
  • Block: One sector holds a fixed amount of data (typically 128-512 bytes)
  • Address: Track number + sector number uniquely identifies any block
            Track 0 (outermost)
           /
    /-----------\
   |  /-------\  |
   | |  /---\  | |    Track 39 (innermost)
   | | |     | | |
   | |  \---/  | |
   |  \-------/  |
    \-----------/
         |
     Spindle/motor

Sector layout (top view of one track):
  [S0][S1][S2][S3][S4][S5][S6][S7]  <- 8 sectors per track
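
Track-and-sector addressing can be sketched as a pair of conversions between a linear block number and a (track, sector) pair. The geometry matches the diagram (40 tracks, 8 sectors per track); the 128-byte sector size is an assumed value from the low end of the typical range, and the function names are hypothetical.

```python
TRACKS = 40              # Track 0 (outermost) to Track 39 (innermost)
SECTORS_PER_TRACK = 8    # S0 to S7
BYTES_PER_SECTOR = 128   # assumed sector size


def block_address(block):
    """Convert a linear block number to (track, sector)."""
    if not 0 <= block < TRACKS * SECTORS_PER_TRACK:
        raise ValueError("block number out of range")
    return divmod(block, SECTORS_PER_TRACK)


def block_number(track, sector):
    """Convert (track, sector) back to a linear block number."""
    if not (0 <= track < TRACKS and 0 <= sector < SECTORS_PER_TRACK):
        raise ValueError("address out of range")
    return track * SECTORS_PER_TRACK + sector


def capacity_bytes():
    """Total capacity of one disc side."""
    return TRACKS * SECTORS_PER_TRACK * BYTES_PER_SECTOR
```

With these numbers one side holds 40 x 8 x 128 = 40,960 bytes, or 40 KB.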

Tip

For a first magnetic disc system, use a salvaged hard drive motor and platter if available. The precision of commercial platters far exceeds anything you can coat by hand. Even a dead hard drive's mechanical components (motor, bearings, platters) are invaluable. Build only the read/write head and electronics from scratch.


Semiconductor Memory

Static RAM (SRAM)

SRAM uses flip-flops (from Boolean Logic and Gates) to store bits. Each bit requires 4-6 transistors.

Advantages:

  • Very fast (access time 5-20 nanoseconds)
  • Does not need refresh --- holds data as long as power is on
  • Simple interface --- just address lines and data lines

Disadvantages:

  • Expensive (6 transistors per bit)
  • Low density (a 1 KB SRAM needs about 50,000 transistors)

Use: Processor cache, small high-speed buffers, register files.

Dynamic RAM (DRAM)

DRAM stores each bit as charge on a tiny capacitor, with one transistor as an access switch. Much denser than SRAM (1 transistor + 1 capacitor per bit vs 6 transistors).

The refresh problem: Capacitors leak charge. Within a few milliseconds, the stored data fades. A refresh circuit must periodically read each row and rewrite it, hundreds of times per second. This adds complexity but the density advantage is overwhelming.
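
The effect of refresh can be illustrated with a toy leakage model. The exponential decay, the 4 ms time constant, and the half-charge sense threshold are all assumptions chosen for the sketch, not measured values, and the class name is hypothetical.

```python
import math

LEAK_TIME_CONSTANT_MS = 4.0   # assumed leakage time constant
SENSE_THRESHOLD = 0.5         # assumed: above half charge reads as 1


class DramCell:
    """One bit stored as a fraction of full charge on a capacitor."""

    def __init__(self):
        self.charge = 0.0

    def write(self, bit):
        self.charge = 1.0 if bit else 0.0

    def leak(self, milliseconds):
        """Let the charge decay for the given time."""
        self.charge *= math.exp(-milliseconds / LEAK_TIME_CONSTANT_MS)

    def read(self):
        return 1 if self.charge > SENSE_THRESHOLD else 0

    def refresh(self):
        """Read the bit and write it back at full strength."""
        self.write(self.read())


cell = DramCell()
cell.write(1)
for _ in range(10):       # 20 ms total, refreshed every 2 ms
    cell.leak(2.0)
    cell.refresh()
print(cell.read())        # still reads as the stored bit

cell.leak(10.0)           # 10 ms with no refresh
print(cell.read())        # charge has fallen below the threshold
```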

Use: Main computer memory. A system that cannot fabricate DRAM will use core memory or static RAM instead.

Read-Only Memory (ROM)

ROM contains fixed data that survives power loss. Several types, in order of complexity:

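| Type     | Write Method           | Erase Method        | Rewrites |
|----------|------------------------|---------------------|----------|
| Mask ROM | Factory manufactured   | Cannot erase        | 0        |
| PROM     | One-time fuse blowing  | Cannot erase        | 0        |
| EPROM    | Electrical programming | UV light (20-30 min) | ~100     |
| EEPROM   | Electrical programming | Electrical erasure  | ~100,000 |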

Simple diode ROM: The easiest ROM to build. Create a grid of row and column wires. At each intersection where you want a 1, connect a diode. Where you want a 0, leave the intersection empty.

        Col0  Col1  Col2  Col3
Row0 ---[D]---[D]---------[D]---   Data: 1 1 0 1
Row1 ---------[D]---[D]---------   Data: 0 1 1 0
Row2 ---[D]---------[D]---[D]---   Data: 1 0 1 1

To read Row1: energize Row1 wire, check which columns conduct through diodes. Columns 1 and 2 have diodes, reading 0110.

A 256-byte diode ROM has 2,048 intersections (256 rows x 8 columns). With approximately half of them populated, it needs roughly 1,024 diodes. Tedious but entirely feasible with salvaged diodes.
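
The diode matrix can be modeled as a set of populated intersections. The sketch below uses the three example rows from the diagram; the names `DIODES`, `read_row`, and `diodes_needed` are hypothetical.

```python
# Each (row, column) pair marks an intersection where a diode is soldered in.
DIODES = {
    (0, 0), (0, 1), (0, 3),
    (1, 1), (1, 2),
    (2, 0), (2, 2), (2, 3),
}
COLUMNS = 4


def read_row(row, diodes=DIODES, columns=COLUMNS):
    """Energize one row wire and report which columns conduct."""
    return [1 if (row, col) in diodes else 0 for col in range(columns)]


def diodes_needed(image):
    """Count the diodes a ROM image needs: one per 1 bit."""
    return sum(bin(byte).count("1") for byte in image)
```

`read_row(1)` returns `[0, 1, 1, 0]`, matching the Row1 example. `diodes_needed` shows that the parts count depends on the data itself: an image with more 0 bits needs fewer diodes.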


Error Detection and Correction

Storage media are imperfect. Tape gets dirty, disk surfaces degrade, cosmic rays flip RAM bits. Error detection and correction are essential.

Parity Bits

The simplest error detection: add one extra bit to each byte. Even parity means the total number of 1-bits (including the parity bit) is always even.

Data:    1010110  (four 1-bits, even count)
Parity:  0        (already even, parity bit = 0)
Stored:  10101100

Data:    1010111  (five 1-bits, odd count)
Parity:  1        (make it even, parity bit = 1)
Stored:  10101111

When reading, count the 1-bits. If the count is odd, an error occurred somewhere. Parity detects single-bit errors but cannot locate them or detect double-bit errors.
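
The two worked examples above can be reproduced with a few lines of Python. Bits are kept as text strings to match the examples; the function names are hypothetical.

```python
def even_parity_bit(data):
    """Return 1 if the data has an odd number of 1-bits, else 0."""
    return data.count("1") % 2


def add_parity(data):
    """Append the even-parity bit to a string of data bits."""
    return data + str(even_parity_bit(data))


def parity_ok(stored):
    """True when the stored word, parity bit included, has an even 1-count."""
    return stored.count("1") % 2 == 0
```

`add_parity("1010110")` gives `"10101100"` and `add_parity("1010111")` gives `"10101111"`. Flipping any single bit of a stored word makes `parity_ok` return False, while flipping two bits leaves it True, which is the double-bit blind spot described above.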

Checksums

For blocks of data, sum all the bytes and store the sum at the end. When reading, recompute the sum and compare. If they differ, the block has errors.

Data bytes:  42, 17, 99, 0, 255, 8
Sum:         421
Checksum stored: 421

On read, recompute: 42 + 17 + 99 + 0 + 255 + 8 = 421  OK
If one byte corrupted: 42 + 17 + 100 + 0 + 255 + 8 = 422  ERROR
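
The same check in Python, as a minimal sketch. The optional `modulus` parameter is an addition for illustration: the example above stores the full sum (421), whereas a system that wants the checksum to fit in one byte would typically keep only the sum modulo 256. The function names are hypothetical.

```python
def checksum(data, modulus=None):
    """Sum the byte values; optionally wrap the sum to a fixed width."""
    total = sum(data)
    return total if modulus is None else total % modulus


def verify(data, stored, modulus=None):
    """Recompute the checksum and compare it with the stored value."""
    return checksum(data, modulus) == stored


block = [42, 17, 99, 0, 255, 8]
print(checksum(block))                              # full sum, as in the example
print(checksum(block, modulus=256))                 # one-byte variant
print(verify([42, 17, 100, 0, 255, 8], 421))        # corrupted byte is caught
```

A plain sum cannot notice two bytes swapping places, since addition ignores order.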

Hamming Codes

Hamming codes can not only detect but also correct single-bit errors. They embed check bits at power-of-2 positions (1, 2, 4, 8, ...) within the data.

How it works for a 7-bit codeword (4 data bits + 3 check bits):

| Position | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|----------|----|----|----|----|----|----|----|
| Type     | P1 | P2 | D1 | P4 | D2 | D3 | D4 |

  • P1 checks positions 1,3,5,7 (binary: bit 0 is set)
  • P2 checks positions 2,3,6,7 (binary: bit 1 is set)
  • P4 checks positions 4,5,6,7 (binary: bit 2 is set)

When reading, recompute each parity check. If all match, no error. If one or more fail, the binary pattern of the failing checks gives the position of the error bit. Flip that bit to correct it.
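
The 7-bit scheme above can be sketched directly from the position layout. Even parity is assumed for each check group, and the function names are hypothetical.

```python
def hamming_encode(d1, d2, d3, d4):
    """Build the codeword [P1, P2, D1, P4, D2, D3, D4] (positions 1-7)."""
    p1 = d1 ^ d2 ^ d4     # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4     # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4     # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming_decode(codeword):
    """Correct up to one flipped bit.

    Returns (data_bits, error_position); error_position is 0 when all
    three checks pass.
    """
    c = [0] + list(codeword)              # pad so indices match positions 1-7
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s4 = c[4] ^ c[5] ^ c[6] ^ c[7]
    error_position = s1 + 2 * s2 + 4 * s4
    if error_position:
        c[error_position] ^= 1            # flip the faulty bit back
    return [c[3], c[5], c[6], c[7]], error_position
```

For data bits 1, 0, 1, 1 the codeword is `[0, 1, 1, 0, 0, 1, 1]`. If position 5 is flipped in storage, checks P1 and P4 fail, giving 1 + 4 = 5, and the decoder flips that bit back.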

Tip

For magnetic tape and disk, always use at least checksums. For critical data (programs, irreplaceable records), use Hamming codes. The 7-bit code above costs 3 extra bits per 4 data bits for single-error correction; longer codewords need proportionally fewer check bits (for example, 4 check bits protect 11 data bits). The protection is worth every bit.


File Systems

Raw storage is an undifferentiated sequence of bytes. A file system imposes structure so you can organize, name, find, and manage files.

Flat File System

The simplest approach: a single directory listing all files.

Directory (stored at known location on disk):
| Name       | Start Sector | Length (sectors) |
|------------|-------------|-----------------|
| PROGRAM1   | 10          | 5               |
| DATA       | 15          | 3               |
| PROGRAM2   | 20          | 8               |
| (empty)    | 28          | 72 (free space) |

To read a file: look up its name in the directory, find its start sector and length, read those sectors.

To write a file: find enough contiguous free space, write the data, add an entry to the directory.

Limitation: Files must be contiguous (one unbroken sequence of sectors). If you delete files and create new ones, you get fragmentation --- scattered free sectors too small for new files, even though total free space is adequate.
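
A flat directory with contiguous first-fit allocation can be sketched as below, using the example directory. The 100-sector disk size follows from the free-space entry (28 + 72); reserving sectors 0-9 for the directory itself is an assumption, and the function names are hypothetical.

```python
DISK_SECTORS = 100        # from the example: free space runs from 28 for 72 sectors
FIRST_FILE_SECTOR = 10    # assumption: sectors 0-9 hold the directory

# Each entry is (name, start_sector, length_in_sectors).
directory = [("PROGRAM1", 10, 5), ("DATA", 15, 3), ("PROGRAM2", 20, 8)]


def find_file(directory, name):
    """Look up a file and return (start_sector, length)."""
    for entry_name, start, length in directory:
        if entry_name == name:
            return start, length
    raise KeyError(name)


def allocate(directory, name, length):
    """First-fit: place the file in the first gap that is large enough."""
    cursor = FIRST_FILE_SECTOR
    for _, start, size in sorted(directory, key=lambda entry: entry[1]):
        if start - cursor >= length:
            break                          # the gap before this file fits
        cursor = max(cursor, start + size)
    if cursor + length > DISK_SECTORS:
        raise ValueError("no contiguous gap large enough")
    directory.append((name, cursor, length))
    return cursor
```

In the example directory, DATA ends at sector 17 and PROGRAM2 starts at 20, so a 2-sector file lands in that gap at sector 18, while a 3-sector file cannot use it and goes to sector 28. That unusable sliver is fragmentation in miniature.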

Linked Allocation

Each sector contains a pointer to the next sector of the file. Files can be scattered anywhere on the disk.

Directory:
| Name    | First Sector |
|---------|-------------|
| MYFILE  | 10          |

Sector 10: [data...] [next: 25]
Sector 25: [data...] [next: 7]
Sector 7:  [data...] [next: 0]  (0 = end of file)

No fragmentation problem, but random access is slow (must follow the chain from the beginning).
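
Following the chain can be sketched with a dictionary standing in for the disk. The sector contents are placeholder bytes, and the name `read_file` and the loop guard are additions for illustration.

```python
END_OF_FILE = 0

# sector number -> (data stored in the sector, pointer to the next sector)
disk = {
    10: (b"AAA", 25),
    25: (b"BBB", 7),
    7: (b"CCC", END_OF_FILE),
}


def read_file(disk, first_sector):
    """Follow next-pointers from the first sector; return (data, sectors visited)."""
    data = bytearray()
    visited = []
    sector = first_sector
    while sector != END_OF_FILE:
        if sector in visited:
            raise ValueError("pointer chain loops back on itself")
        visited.append(sector)
        payload, sector = disk[sector]     # one disk read per sector
        data += payload
    return bytes(data), visited
```

`read_file(disk, 10)` visits sectors 10, 25, and 7 in that order. Reaching the last sector of a file always costs one disk read per sector before it, which is why random access is slow.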

FAT (File Allocation Table)

A practical compromise used in early PCs: a table in memory maps every sector on the disk to either "free," "end of file," or "pointer to next sector."

This gives linked allocation's flexibility with much faster access, because the entire chain can be traversed in the in-memory table without reading each disk sector.
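
A minimal sketch of the idea, assuming a 32-sector disk and using a Python list as the in-memory table. The marker values `FREE` and `END_OF_FILE` and the function names are hypothetical; real FAT implementations use their own reserved values.

```python
FREE = -1
END_OF_FILE = 0

# fat[n] describes sector n: FREE, END_OF_FILE, or the next sector of the file.
fat = [FREE] * 32
fat[10] = 25
fat[25] = 7
fat[7] = END_OF_FILE


def chain(fat, first_sector):
    """List a file's sectors in order, using only the in-memory table."""
    sectors = [first_sector]
    while fat[sectors[-1]] != END_OF_FILE:
        if len(sectors) > len(fat):
            raise ValueError("table is corrupt: chain never ends")
        sectors.append(fat[sectors[-1]])
    return sectors


def nth_sector(fat, first_sector, n):
    """Find the disk sector holding the n-th block of a file (n starts at 0)."""
    return chain(fat, first_sector)[n]
```

`nth_sector(fat, 10, 2)` returns 7 without touching the disk, so only the one sector actually wanted has to be read.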


Common Mistakes

| Mistake | Why It's Dangerous | What to Do Instead |
|---------|--------------------|--------------------|
| No error checking on stored data | Single-bit corruption silently produces wrong results | Always use at least parity or checksums for every block |
| Magnetic media near strong magnets or heat | Partial or total data erasure | Store tapes and disks away from motors, speakers, magnets; keep below 50 C |
| No backup copies | Single media failure loses everything | Keep at least two copies of critical data on different media |
| Read/write head touching spinning disk | Surface damage, data loss in that track | Maintain precise head height, park head before stopping motor |
| Forgetting tape interblock gaps | Drive cannot stop and restart between blocks, data overruns | Always leave 1-2 cm of blank tape between records |
| Core memory: insufficient drive current | Core does not fully switch, unreliable read | Calibrate so the combined X and Y currents are well above the core's switching threshold while each line alone stays below it |
| No file system | Data is unorganized, impossible to manage as volume grows | Implement at least a flat directory from the start |
| DRAM without refresh circuit | Data fades within milliseconds | Refresh every row at least every 2-4 ms, or use SRAM/core instead |

What's Next

With persistent data storage, your computer system can accumulate knowledge, run complex programs, and serve as the foundation for networked systems:

  • Internet Infrastructure --- connect multiple computers with stored data into a network, enabling shared access to information across your entire community

Quick Reference Card

Data Storage --- At a Glance

Volatile: Registers, SRAM, DRAM --- lose data when power fails

Persistent: Magnetic tape/disk, punched cards/tape, ROM --- survive power loss

Easiest to build: Punched paper tape (punch holes, read with LED + phototransistor)

Best random access (buildable): Magnetic core memory (ferrite rings, 3 wires per core)

Best bulk storage: Magnetic tape (iron oxide on plastic film, sequential access)

Best random bulk storage: Magnetic disk (rotating platter, concentric tracks, sectors)

SRAM: 6 transistors per bit, fast, no refresh needed

DRAM: 1 transistor + 1 capacitor per bit, dense, requires refresh every 2-4 ms

Simple ROM: Diode matrix at row/column intersections

Error detection: Parity (1 extra bit, detects single errors), Checksum (sum of bytes)

Error correction: Hamming code (3 check bits per 4 data bits, corrects single-bit errors)

File system minimum: Directory listing file names, start locations, and lengths

Backup rule: Always maintain at least two copies of critical data on separate media