Data Storage
Why This Matters
A computer without persistent storage must be reprogrammed from scratch every time it is powered on. All programs, all data, all accumulated knowledge --- gone the instant the power fails. Data storage is what transforms a computer from a calculator into a knowledge repository. It is how civilization preserves and transmits information at scale.
What You Need
For magnetic storage:
- Ferrite ring cores (for core memory) or iron oxide coating material
- Fine copper wire, 30-40 AWG (for core memory threading)
- Electric motor and precision bearing assembly (for disk/drum)
- Read/write heads: small electromagnets with very fine pole gaps
- Flexible or rigid substrate (aluminum disc or plastic tape)
- Iron oxide powder (Fe2O3) mixed with binder (lacquer, shellac)
- Amplifier circuits (transistor-based, for reading weak magnetic signals)
For paper storage:
- Heavy card stock (manila card, approximately 0.18 mm thick)
- Paper tape (continuous roll, standard telegraph tape works)
- Precision punch mechanism (steel dies)
- Optical sensors (LED + phototransistor pairs) or mechanical brush contacts
For semiconductor memory:
- Transistors and capacitors (for SRAM/DRAM)
- Diodes (for ROM arrays)
- Breadboard or PCB fabrication capability
Storage Fundamentals
Volatile vs Persistent
All computer storage falls into two categories:
Volatile (loses data when power is removed):
- Registers (inside the processor)
- SRAM (static RAM --- flip-flop based)
- DRAM (dynamic RAM --- capacitor based, loses data even with power unless refreshed)
Persistent (retains data without power):
- Magnetic tape, drum, disk
- Punched cards and tape
- ROM (read-only memory, data burned in at manufacture)
- EPROM/EEPROM (erasable/electrically erasable programmable ROM)
The Memory Hierarchy
Speed and cost per byte rise together, so speed trades off against capacity: fast memory is expensive and small, while cheap memory is slow and large.
| Level | Type | Access Time | Typical Size | Cost/Byte |
|---|---|---|---|---|
| 1 | CPU Registers | 1 ns | 8-64 bytes | Highest |
| 2 | SRAM (Cache) | 5-20 ns | 1-256 KB | Very high |
| 3 | DRAM (Main memory) | 50-100 ns | 64 KB-several MB | Moderate |
| 4 | Magnetic disk | 5-20 ms | 1 MB-several GB | Low |
| 5 | Magnetic tape | Seconds-minutes | Unlimited (add more tape) | Lowest |
A well-designed system uses each level as a staging area for the next: the processor works from registers, loads data from RAM, which is filled from disk, which is backed up to tape.
Capacity Units
| Unit | Equals | Practical Reference |
|---|---|---|
| 1 bit | Single 0 or 1 | One transistor switch position |
| 1 byte | 8 bits | One character of text (ASCII) |
| 1 KB | 1,024 bytes | About half a page of text |
| 1 MB | 1,048,576 bytes | A short novel |
| 1 GB | ~1 billion bytes | An encyclopedia |
Paper-Based Storage
Paper storage is the easiest to build from scratch and the most durable for long-term archival. It was used from the 1890s through the 1970s and remains a viable technology for a post-collapse rebuild.
Punched Cards
The standard Hollerith punched card is 187 x 83 mm (7.375 x 3.25 inches), divided into 80 columns and 12 rows. Each column represents one character, encoded by the pattern of holes punched in that column.
Making punched cards:
- Cut heavy card stock to standard dimensions (precision is important for machine feeding)
- Mark a grid of 80 columns x 12 rows (column spacing: 2.21 mm / 0.087 inch, row spacing: 6.35 mm / 0.25 inch)
- Use a hand punch with a guide template, or build a mechanical punch with spring-loaded dies
- Each hole is a rectangular slot, approximately 3.2 x 1.3 mm
Encoding: Each column can have 0, 1, or more holes punched. The pattern of holes encodes a character using the Hollerith code:
| Character | Holes Punched (rows) |
|---|---|
| 0-9 | Single hole in rows 0-9 |
| A-I | Row 12 + rows 1-9 |
| J-R | Row 11 + rows 1-9 |
| S-Z | Row 0 + rows 2-9 |
Capacity: 80 characters per card. A program or data set is a stack of cards (a "deck"). A box of 2,000 cards holds 160,000 characters (about 156 KB), roughly 80 pages of text at the half-page-per-KB rate given in the capacity table.
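The encoding table above can be sketched in a few lines of Python. This is a minimal illustration, not a full card code: the function names are made up for this example, only digits and capital letters from the table are handled, and treating a space as an unpunched column is an added assumption.

```python
def hollerith_rows(ch):
    """Return the list of card rows punched for one character."""
    ch = ch.upper()
    if ch == " ":
        return []                                  # assumption: blank column = space
    if ch in "0123456789":
        return [int(ch)]                           # digits: single hole in rows 0-9
    if "A" <= ch <= "I":
        return [12, ord(ch) - ord("A") + 1]        # row 12 + rows 1-9
    if "J" <= ch <= "R":
        return [11, ord(ch) - ord("J") + 1]        # row 11 + rows 1-9
    if "S" <= ch <= "Z":
        return [0, ord(ch) - ord("S") + 2]         # row 0 + rows 2-9
    raise ValueError(f"no encoding in this table for {ch!r}")


def encode_card(text):
    """Encode up to 80 characters, one column per character."""
    if len(text) > 80:
        raise ValueError("a card holds at most 80 columns")
    return [hollerith_rows(c) for c in text]


print(encode_card("HI 5"))
```

Each inner list is one column, so a punch mechanism would step through the outer list and drive one die per row number.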
Punched Tape
A continuous roll of paper tape, typically 25.4 mm (1 inch) wide, with 5 to 8 data holes per row plus a smaller sprocket hole for mechanical feeding.
Advantages over cards:
- Continuous (no 80-character limit)
- Cheaper to produce
- Easier to feed through a reader
- Can be spliced for editing (cut out bad section, glue in corrected tape)
Standard 8-level tape: Each row has 8 data positions (one byte) plus a sprocket hole. The sprocket hole is smaller and positioned between columns 3 and 4.
Row format (8-level):
O O O O O . O O O
| | | | | | | | |
8 7 6 5 4 S 3 2 1 (S = sprocket, smaller hole, between channels 3 and 4)
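To make the row layout concrete, here is a small sketch that renders one byte as a tape row and parses it back, with the sprocket hole between channels 3 and 4 as described above. The function names are illustrative, and mapping channel 8 to the most significant bit is an assumption of this sketch.

```python
def tape_row(byte):
    """Render one byte as 9 characters: channels 8..4, sprocket, channels 3..1."""
    if not 0 <= byte <= 255:
        raise ValueError("one row holds exactly one byte")
    bits = format(byte, "08b")                 # bits[0] = channel 8 (assumed MSB)
    marks = ["O" if b == "1" else " " for b in bits]
    return "".join(marks[:5]) + "." + "".join(marks[5:])


def parse_row(row):
    """Inverse of tape_row: recover the byte from a rendered row."""
    if len(row) != 9 or row[5] != ".":
        raise ValueError("expected 9 characters with the sprocket mark at index 5")
    data = row[:5] + row[6:]                   # drop the sprocket position
    return int("".join("1" if c == "O" else "0" for c in data), 2)


for value in b"HI":
    print(repr(tape_row(value)))
```

Printing a whole program this way gives a punching template that can be laid alongside the physical tape.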
Tip
Punched tape is the easiest storage medium to fabricate and read. You can punch holes with a simple hand tool and read them optically (LED shining through hole onto phototransistor) or mechanically (spring-loaded pins drop through holes to make electrical contact). Start here for your first persistent storage system.
Building a Tape Reader
Optical reader:
- Mount 8 LEDs in a line matching the hole spacing, shining downward
- Mount 8 phototransistors directly below, facing up
- Pass the tape between LEDs and phototransistors
- Where there is a hole, light passes through and the phototransistor conducts (reading a 1)
- Where there is no hole, light is blocked (reading a 0)
- A sprocket hole sensor triggers the read for each row
- Connect phototransistor outputs to the computer's input port
Tape advance mechanism:
- A sprocket wheel engages the sprocket holes
- A stepper motor or hand crank advances one row per step
- The computer reads each row, advances the tape, reads the next row
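The read loop can be modeled in software before any hardware exists. The sketch below assumes the input port delivers a stream of samples, each a pair of (sprocket sensor state, 8-bit data pattern); that sample format and the function name are inventions for this example. The data sensors are latched once per sprocket hole, on the transition from dark to light.

```python
def read_tape(samples):
    """Latch the 8 data sensors once per sprocket hole.

    samples: iterable of (sprocket, data) pairs, where sprocket is True while
    the sprocket sensor sees light and data is the 8-bit sensor pattern.
    """
    rows = []
    previous = False
    for sprocket, data in samples:
        if sprocket and not previous:      # rising edge: a new row is aligned
            rows.append(data & 0xFF)
        previous = sprocket
    return bytes(rows)


# Two rows, each seen for one or two samples, with blank tape in between.
stream = [(False, 0), (True, 0x48), (True, 0x48), (False, 0), (True, 0x49), (False, 0)]
print(read_tape(stream))
```

Triggering on the edge rather than the level means a row is read once even if the tape pauses with a hole over the sensor.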
Magnetic Storage
Magnetic storage records data by magnetizing small regions of a magnetic material in one of two directions, representing 0 and 1.
Magnetic Recording Principles
Key concepts:
Hysteresis: When you magnetize a material and remove the magnetizing force, the material retains some magnetization (remanence). This retained magnetization is your stored data.
Coercivity: The strength of the opposing magnetic field needed to demagnetize the material. High coercivity materials are harder to write but resist accidental erasure. Low coercivity materials are easy to write but risk data loss from stray magnetic fields.
| Material | Coercivity | Use |
|---|---|---|
| Gamma iron oxide (Fe2O3) | Medium | Standard tape and disk coating |
| Chromium dioxide (CrO2) | High | High-density tape |
| Ferrite ceramic | Medium-high | Core memory rings |
| Soft iron | Very low | Read/write head cores (must magnetize and demagnetize easily) |
Magnetic Core Memory
Before semiconductor RAM existed, core memory was the standard random-access storage technology. It uses tiny ferrite rings (cores), each storing one bit.
How it works:
- Each core is a toroid (donut shape) of ferrite ceramic, 0.5-2 mm in diameter
- Three wires thread through each core: X select, Y select, and sense/inhibit
- To write a 1: send current through both X and Y wires simultaneously. Neither wire alone carries enough current to flip the core, but together they exceed the threshold. The core magnetizes clockwise.
- To write a 0: the inhibit wire cancels the Y current for cores that should store 0
- To read: send read current through X and Y. If the core was storing 1, it flips to 0 and induces a pulse on the sense wire. If it was already 0, no pulse.
Destructive read: Reading a core always resets it to 0. The controller must immediately rewrite the data after reading. This is handled automatically in hardware.
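The coincident-current scheme and the destructive read can be illustrated with a toy software model. This is only a sketch under stated assumptions: currents are expressed as fractions of the switching threshold, each select line carries exactly half, and a write is modeled as a clear followed by a set (the class and method names are illustrative).

```python
class CorePlane:
    """Toy model of one coincident-current core plane (one bit per core)."""

    def __init__(self, size):
        self.size = size
        self.cores = [[0] * size for _ in range(size)]

    def _pulse(self, x, y, target, inhibit=False):
        """Drive lines x and y; return True if the selected core flipped."""
        flipped = False
        for row in range(self.size):
            for col in range(self.size):
                current = (0.5 if col == x else 0.0) + (0.5 if row == y else 0.0)
                if inhibit:
                    current -= 0.5             # inhibit wire cancels one half-current
                # Only a core receiving the full threshold current can switch.
                if current >= 1.0 and self.cores[row][col] != target:
                    self.cores[row][col] = target
                    flipped = True
        return flipped

    def write(self, x, y, bit):
        self._pulse(x, y, target=0)                        # assumed clear phase
        self._pulse(x, y, target=1, inhibit=(bit == 0))    # set unless inhibited

    def read(self, x, y):
        pulse = self._pulse(x, y, target=0)    # destructive: core is forced to 0
        bit = 1 if pulse else 0                # a sense pulse means it held a 1
        self.write(x, y, bit)                  # rewrite what the read destroyed
        return bit


plane = CorePlane(4)
plane.write(1, 2, 1)
print(plane.read(1, 2), plane.read(1, 2))
```

Reading the same core twice returns the same value only because `read` performs the rewrite, which is the job the hardware controller does automatically.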
Building core memory:
- Obtain or make ferrite cores (small ferrite beads work, or wind ferrite powder with binder into toroids)
- Thread three wires through each core in a grid pattern
- An 8x8 grid gives 64 bits (8 bytes). A 32x32 grid gives 1,024 bits (128 bytes)
- Total wire threading for a 32x32 plane: 3 wires x 1,024 cores = 3,072 threading operations
Warning
Core memory construction is extraordinarily tedious. A 1 KB memory (8 planes of 32x32) requires threading over 24,000 wire passages through tiny ferrite rings. This was done by hand (usually by women with excellent fine motor skills) until automated machinery took over. Budget weeks of patient work for even a small core memory array.
Magnetic Tape
Magnetic tape stores data sequentially on a long strip of plastic film coated with magnetic oxide.
Making magnetic tape:
- Start with a smooth, flexible plastic substrate (polyester film, Mylar, or even smooth paper in a pinch)
- Mix gamma iron oxide (Fe2O3) powder with a binder (lacquer, shellac, or acrylic)
- Apply a thin, uniform coating to one side of the substrate
- Allow to dry completely
- Slit to the desired width (standard: 12.7 mm / 0.5 inch)
Read/write head:
- Wind a small coil (100-500 turns of 40 AWG wire) around a C-shaped soft iron core
- The gap in the C faces the tape surface (gap width: 0.01-0.05 mm)
- To write: current through the coil magnetizes the tape as it passes the gap
- To read: magnetized tape passing the gap induces voltage in the coil
- The head must make firm, consistent contact with the tape surface
Tape drive mechanics:
- Two reels (supply and take-up) with a motor-driven capstan between them
- The capstan controls tape speed precisely
- Tape speed: 10-100 cm/second for simple systems
- A tension mechanism prevents tape slack and ensures constant head contact
Data format:
- Data is recorded in blocks (records) separated by inter-record gaps (blank tape sections)
- Each block starts with a preamble (synchronization pattern) and ends with a checksum
- Between blocks, the tape can be stopped and started without losing data position
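A block layout along these lines might look like the sketch below. The specific choices are assumptions for illustration only: a two-byte `0xAA 0xAA` preamble, a one-byte length field (not mentioned above, but it lets the reader find the end of the block), and a one-byte checksum equal to the sum of the payload modulo 256.

```python
PREAMBLE = b"\xAA\xAA"     # hypothetical sync pattern (alternating bits)


def make_block(payload):
    """Wrap a payload as: preamble, length, payload, checksum."""
    if len(payload) > 255:
        raise ValueError("this sketch uses a one-byte length field")
    checksum = sum(payload) % 256
    return PREAMBLE + bytes([len(payload)]) + payload + bytes([checksum])


def read_block(block):
    """Validate a block and return its payload, or raise ValueError."""
    if len(block) < 4 or block[:2] != PREAMBLE:
        raise ValueError("preamble not found")
    length = block[2]
    if len(block) < 4 + length:
        raise ValueError("block truncated")
    payload = block[3:3 + length]
    if sum(payload) % 256 != block[3 + length]:
        raise ValueError("checksum mismatch")
    return payload


print(read_block(make_block(b"HELLO")))
```

On real tape the inter-record gap is simply the absence of signal between two such blocks; the reader resynchronizes on the next preamble.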
Magnetic Disk
Magnetic disk provides random access: any piece of data can be read without scanning through all preceding data, unlike tape.
Basic construction:
- A rigid aluminum disc coated with magnetic oxide (like tape, but circular)
- The disc spins on a motor-driven spindle at constant speed (360-3,600 RPM for early systems)
- A read/write head floats above or rests on the disc surface
- Data is recorded in concentric circular tracks
- The head moves radially to access different tracks
Organization:
- Tracks: Concentric circles on the disc surface (typically 40-200 per side)
- Sectors: Each track is divided into equal arc segments (typically 8-32 per track)
- Block: One sector holds a fixed amount of data (typically 128-512 bytes)
- Address: Track number + sector number uniquely identifies any block
Track 0 (outermost)
/
/-----------\
| /-------\ |
| | /---\ | | Track 39 (innermost)
| | | | | |
| | \---/ | |
| \-------/ |
\-----------/
|
Spindle/motor
Sector layout (top view of one track):
[S0][S1][S2][S3][S4][S5][S6][S7] <- 8 sectors per track
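The track-plus-sector addressing can be expressed as simple arithmetic. The sketch below uses the 40-track, 8-sector layout from the diagram; the 256-byte sector size is an assumed value inside the 128-512 byte range given above, and the function names are illustrative.

```python
TRACKS = 40
SECTORS_PER_TRACK = 8
BYTES_PER_SECTOR = 256     # assumed value within the 128-512 byte range


def block_to_address(block):
    """Convert a linear block number to (track, sector)."""
    if not 0 <= block < TRACKS * SECTORS_PER_TRACK:
        raise ValueError("block number outside the disk")
    return divmod(block, SECTORS_PER_TRACK)


def address_to_block(track, sector):
    """Convert (track, sector) back to a linear block number."""
    if not (0 <= track < TRACKS and 0 <= sector < SECTORS_PER_TRACK):
        raise ValueError("address outside the disk")
    return track * SECTORS_PER_TRACK + sector


def capacity_bytes():
    """Total capacity of one disk side."""
    return TRACKS * SECTORS_PER_TRACK * BYTES_PER_SECTOR


print(block_to_address(9), capacity_bytes())
```

Numbering blocks linearly keeps consecutive blocks on the same track where possible, so sequential reads need few head movements.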
Tip
For a first magnetic disc system, use a salvaged hard drive motor and platter if available. The precision of commercial platters far exceeds anything you can coat by hand. Even a dead hard drive's mechanical components (motor, bearings, platters) are invaluable. Build only the read/write head and electronics from scratch.
Semiconductor Memory
Static RAM (SRAM)
SRAM uses flip-flops (from Boolean Logic and Gates) to store bits. Each bit requires 4-6 transistors.
Advantages:
- Very fast (access time 5-20 nanoseconds)
- Does not need refresh --- holds data as long as power is on
- Simple interface --- just address lines and data lines
Disadvantages:
- Expensive (6 transistors per bit)
- Low density (a 1 KB SRAM needs about 50,000 transistors)
Use: Processor cache, small high-speed buffers, register files.
Dynamic RAM (DRAM)
DRAM stores each bit as charge on a tiny capacitor, with one transistor as an access switch. Much denser than SRAM (1 transistor + 1 capacitor per bit vs 6 transistors).
The refresh problem: Capacitors leak charge. Within a few milliseconds, the stored data fades. A refresh circuit must periodically read each row and rewrite it, hundreds of times per second. This adds complexity but the density advantage is overwhelming.
Use: Main computer memory. A system that cannot fabricate DRAM will use core memory or static RAM instead.
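The refresh timing budget is easy to estimate. The sketch below assumes every row must be refreshed once within a retention window and that refreshes are spread evenly across it; the 128-row array is a hypothetical example, not a figure from this article.

```python
def refresh_schedule(rows, retention_ms):
    """Return (microseconds between row refreshes, refreshes per second per row)."""
    row_interval_us = retention_ms * 1000 / rows   # spread rows evenly over the window
    per_second = 1000 / retention_ms               # how often each row is revisited
    return row_interval_us, per_second


# Hypothetical 128-row array with a 2 ms retention window.
print(refresh_schedule(128, 2))
```

With a 2 ms window the controller must touch a different row roughly every 15.6 microseconds, which is why refresh is normally done by dedicated hardware rather than by the program.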
Read-Only Memory (ROM)
ROM contains fixed data that survives power loss. Several types, in order of complexity:
| Type | Write Method | Erase Method | Rewrites |
|---|---|---|---|
| Mask ROM | Factory manufactured | Cannot erase | 0 |
| PROM | One-time fuse blowing | Cannot erase | 0 |
| EPROM | Electrical programming | UV light (20-30 min) | ~100 |
| EEPROM | Electrical programming | Electrical erasure | ~100,000 |
Simple diode ROM: The easiest ROM to build. Create a grid of row and column wires. At each intersection where you want a 1, connect a diode. Where you want a 0, leave the intersection empty.
Col0 Col1 Col2 Col3
Row0 ---[D]---[D]---------[D]--- Data: 1 1 0 1
Row1 ---------[D]---[D]--------- Data: 0 1 1 0
Row2 ---[D]---------[D]---[D]--- Data: 1 0 1 1
To read Row1: energize Row1 wire, check which columns conduct through diodes. Columns 1 and 2 have diodes, reading 0110.
A 256-byte diode ROM has 2,048 intersections (256 rows x 8 columns). With approximately half of them populated, that is roughly 1,000 diodes. Tedious but entirely feasible with salvaged diodes.
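The diode matrix maps directly onto a table of bits, which makes it easy to plan a ROM and count parts before soldering. In this sketch the names `ROM`, `read_row`, `rom_from_bytes`, and `diode_count` are illustrative, and the first table reproduces the three rows drawn above.

```python
ROM = [
    [1, 1, 0, 1],   # Row0: diodes at Col0, Col1, Col3
    [0, 1, 1, 0],   # Row1: diodes at Col1, Col2
    [1, 0, 1, 1],   # Row2: diodes at Col0, Col2, Col3
]


def read_row(rom, row):
    """Energize one row; a column reads 1 where a diode connects it to that row."""
    return "".join(str(bit) for bit in rom[row])


def rom_from_bytes(data):
    """Lay out one byte per row, 8 columns wide, most significant bit first."""
    return [[(byte >> (7 - col)) & 1 for col in range(8)] for byte in data]


def diode_count(rom):
    """Every 1 in the table is one diode to solder."""
    return sum(sum(row) for row in rom)


print(read_row(ROM, 1), diode_count(ROM))
print(diode_count(rom_from_bytes(bytes(range(256)))))
```

For a hypothetical 256-byte table that contains every byte value once, exactly half of the 2,048 intersections carry a diode.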
Error Detection and Correction
Storage media are imperfect. Tape gets dirty, disk surfaces degrade, cosmic rays flip RAM bits. Error detection and correction are essential.
Parity Bits
The simplest error detection: add one extra bit to each byte. Even parity means the total number of 1-bits (including the parity bit) is always even.
Data: 1010110 (four 1-bits, even count)
Parity: 0 (already even, parity bit = 0)
Stored: 10101100
Data: 1010111 (five 1-bits, odd count)
Parity: 1 (make it even, parity bit = 1)
Stored: 10101111
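When reading, count the 1-bits. If the count is odd, an error occurred somewhere. Parity detects any odd number of flipped bits, including every single-bit error, but it cannot say which bit is wrong, and an even number of flips (such as a double-bit error) goes unnoticed.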
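The two worked examples above can be reproduced with a short sketch. Bits are kept as strings of `0` and `1` for readability; the function names are illustrative.

```python
def add_even_parity(data_bits):
    """Append an even-parity bit to a string of '0'/'1' characters."""
    parity = data_bits.count("1") % 2      # 1 only if the count is currently odd
    return data_bits + str(parity)


def parity_ok(stored_bits):
    """True if the total number of 1-bits, parity included, is even."""
    return stored_bits.count("1") % 2 == 0


stored = add_even_parity("1010110")
print(stored, parity_ok(stored))
print(parity_ok("10101000"))   # one bit flipped: detected
print(parity_ok("10100000"))   # two bits flipped: not detected
```

The last line shows the limitation in practice: two flipped bits leave the count even, so the check passes.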
Checksums
For blocks of data, sum all the bytes and store the sum at the end. When reading, recompute the sum and compare. If they differ, the block has errors.
Data bytes: 42, 17, 99, 0, 255, 8
Sum: 421
Checksum stored: 421
On read, recompute: 42 + 17 + 99 + 0 + 255 + 8 = 421 OK
If one byte corrupted: 42 + 17 + 100 + 0 + 255 + 8 = 422 ERROR
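The same calculation in code, as a minimal sketch. The optional `modulus` parameter is an addition for this example: the sum 421 does not fit in one byte, so a system that stores a one-byte checksum would keep the sum modulo 256 instead.

```python
def checksum(data, modulus=None):
    """Sum all byte values; optionally reduce so the result fits a fixed width."""
    total = sum(data)
    return total if modulus is None else total % modulus


block = [42, 17, 99, 0, 255, 8]
stored = checksum(block)
print(stored, checksum(block, 256))

corrupted = [42, 17, 100, 0, 255, 8]
print(checksum(corrupted) == stored)          # False: error detected

reordered = [17, 42, 99, 0, 255, 8]
print(checksum(reordered) == stored)          # True: a swap is not detected
```

A plain sum catches a single changed byte but not bytes that trade places, which is one reason to pair it with record numbering or a stronger code for critical data.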
Hamming Codes
Hamming codes can not only detect but also correct single-bit errors. They embed check bits at power-of-2 positions (1, 2, 4, 8, ...) within the data.
How it works for a 7-bit codeword (4 data bits + 3 check bits):
| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| Type | P1 | P2 | D1 | P4 | D2 | D3 | D4 |
- P1 checks positions 1,3,5,7 (binary: bit 0 is set)
- P2 checks positions 2,3,6,7 (binary: bit 1 is set)
- P4 checks positions 4,5,6,7 (binary: bit 2 is set)
When reading, recompute each parity check. If all match, no error. If one or more fail, the binary pattern of the failing checks gives the position of the error bit. Flip that bit to correct it.
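The 7-bit layout above translates almost line for line into code. This sketch assumes even parity for each check group; the function names are illustrative.

```python
def hamming_encode(d1, d2, d3, d4):
    """Build the 7-bit codeword P1 P2 D1 P4 D2 D3 D4 using even parity."""
    p1 = d1 ^ d2 ^ d4          # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4          # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming_correct(code):
    """Return (corrected codeword, error position); position 0 means no error."""
    c = [0] + list(code)                       # shift to 1-based positions
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s4 = c[4] ^ c[5] ^ c[6] ^ c[7]
    position = s1 + 2 * s2 + 4 * s4            # failing checks spell the position
    if position:
        c[position] ^= 1                       # flip the bad bit back
    return c[1:], position


word = hamming_encode(1, 0, 1, 1)
damaged = list(word)
damaged[4] ^= 1                                # corrupt position 5 (D2)
print(word, hamming_correct(damaged))
```

The data bits are recovered from positions 3, 5, 6, and 7 of the corrected codeword. Two simultaneous errors will be "corrected" to the wrong word, so this code is only safe where single-bit errors dominate.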
Tip
For magnetic tape and disk, always use at least checksums. For critical data (programs, irreplaceable records), use Hamming codes. The 7-bit code shown here spends 3 check bits per 4 data bits; longer codewords reduce the relative overhead (4 check bits protect 11 data bits), and the protection is worth every bit.
File Systems
Raw storage is an undifferentiated sequence of bytes. A file system imposes structure so you can organize, name, find, and manage files.
Flat File System
The simplest approach: a single directory listing all files.
Directory (stored at known location on disk):
| Name | Start Sector | Length (sectors) |
|------------|-------------|-----------------|
| PROGRAM1 | 10 | 5 |
| DATA | 15 | 3 |
| PROGRAM2 | 20 | 8 |
| (empty) | 28 | 72 (free space) |
To read a file: look up its name in the directory, find its start sector and length, read those sectors.
To write a file: find enough contiguous free space, write the data, add an entry to the directory.
Limitation: Files must be contiguous (one unbroken sequence of sectors). If you delete files and create new ones, you get fragmentation --- scattered free sectors too small for new files, even though total free space is adequate.
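The fragmentation problem shows up as soon as allocation is written down. The sketch below does a first-fit search over the directory from the table above; it assumes a 100-sector disk whose first 10 sectors are reserved (for example, for the directory itself), and the function names are illustrative.

```python
def first_fit(directory, needed, first_sector=10, total_sectors=100):
    """Return the start of the first contiguous free run of `needed` sectors, or None."""
    cursor = first_sector
    for start, length in sorted(directory.values()):
        if start - cursor >= needed:           # gap before this file is big enough
            return cursor
        cursor = max(cursor, start + length)   # skip past the file
    return cursor if total_sectors - cursor >= needed else None


def free_sectors(directory, first_sector=10, total_sectors=100):
    """Total free space, ignoring whether it is contiguous."""
    used = sum(length for _, length in directory.values())
    return total_sectors - first_sector - used


directory = {"PROGRAM1": (10, 5), "DATA": (15, 3), "PROGRAM2": (20, 8)}
print(free_sectors(directory))     # 74 sectors free in total
print(first_fit(directory, 2))     # fits in the small gap at sector 18
print(first_fit(directory, 73))    # None: no single run is long enough
```

The last call fails even though 74 sectors are free, because the free space is split between a 2-sector gap and a 72-sector run.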
Linked Allocation
Each sector contains a pointer to the next sector of the file. Files can be scattered anywhere on the disk.
Directory:
| Name | First Sector |
|---------|-------------|
| MYFILE | 10 |
Sector 10: [data...] [next: 25]
Sector 25: [data...] [next: 7]
Sector 7: [data...] [next: 0] (0 = end of file)
No fragmentation problem, but random access is slow (must follow the chain from the beginning).
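Following the chain is a short loop. In this sketch the disk is modeled as a dictionary from sector number to a (data, next sector) pair, mirroring the example above; the three-byte payloads are placeholders.

```python
# Sector number -> (data, next sector); 0 marks end of file, as in the example.
DISK = {
    10: (b"AAA", 25),
    25: (b"BBB", 7),
    7: (b"CCC", 0),
}


def read_file(disk, first_sector):
    """Walk the chain from the first sector; return (data, sectors read)."""
    data = b""
    sectors_read = 0
    sector = first_sector
    while sector != 0:
        chunk, next_sector = disk[sector]      # one disk read per link
        data += chunk
        sectors_read += 1
        sector = next_sector
    return data, sectors_read


print(read_file(DISK, 10))
```

Reaching the last sector costs as many disk reads as the file has sectors, which is the slow random access noted above.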
FAT (File Allocation Table)
A practical compromise used in early PCs: a table in memory maps every sector on the disk to either "free," "end of file," or "pointer to next sector."
This gives linked allocation's flexibility with much faster access, because the entire chain can be traversed in the in-memory table without reading each disk sector.
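A table of this kind can be sketched as a plain list indexed by sector number. The marker values `FREE = 0` and `EOF = -1`, the 32-sector disk, and the function name are assumptions for this example and do not reflect the on-disk format of any real FAT variant.

```python
FREE = 0      # hypothetical marker: sector not in use
EOF = -1      # hypothetical marker: last sector of a file


def follow_chain(fat, first_sector):
    """List a file's sectors using only the in-memory table (no disk reads)."""
    sectors = []
    sector = first_sector
    while sector != EOF:
        if fat[sector] == FREE:
            raise ValueError(f"chain runs into free sector {sector}")
        if len(sectors) >= len(fat):
            raise ValueError("chain loops back on itself")
        sectors.append(sector)
        sector = fat[sector]
    return sectors


# Same file as the linked-allocation example: sectors 10 -> 25 -> 7.
fat = [FREE] * 32
fat[10], fat[25], fat[7] = 25, 7, EOF

chain = follow_chain(fat, 10)
print(chain, "third sector is", chain[2])
```

Once the chain is known, the drive can seek straight to the third sector instead of reading the first two to find it.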
Common Mistakes
| Mistake | Why It's Dangerous | What to Do Instead |
|---|---|---|
| No error checking on stored data | Single-bit corruption silently produces wrong results | Always use at least parity or checksums for every block |
| Magnetic media near strong magnets or heat | Partial or total data erasure | Store tapes and disks away from motors, speakers, magnets; keep below 50 C |
| No backup copies | Single media failure loses everything | Keep at least two copies of critical data on different media |
| Read/write head touching spinning disk | Surface damage, data loss in that track | Maintain precise head height, park head before stopping motor |
| Forgetting tape interblock gaps | Drive cannot stop and restart between blocks, data overruns | Always leave 1-2 cm of blank tape between records |
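| Core memory: miscalibrated drive current | Too little and the selected core does not fully switch, giving unreliable reads; too much and half-selected cores on the same line flip | Calibrate so each X or Y current alone stays below the core's switching threshold while the two together are well above it |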
| No file system | Data is unorganized, impossible to manage as volume grows | Implement at least a flat directory from the start |
| DRAM without refresh circuit | Data fades within milliseconds | Refresh every row at least every 2-4 ms, or use SRAM/core instead |
What's Next
With persistent data storage, your computer system can accumulate knowledge, run complex programs, and serve as the foundation for networked systems:
- Internet Infrastructure --- connect multiple computers with stored data into a network, enabling shared access to information across your entire community
Quick Reference Card
Data Storage --- At a Glance
Volatile: Registers, SRAM, DRAM --- lose data when power fails
Persistent: Magnetic tape/disk, punched cards/tape, ROM --- survive power loss
Easiest to build: Punched paper tape (punch holes, read with LED + phototransistor)
Best random access (buildable): Magnetic core memory (ferrite rings, 3 wires per core)
Best bulk storage: Magnetic tape (iron oxide on plastic film, sequential access)
Best random bulk storage: Magnetic disk (rotating platter, concentric tracks, sectors)
SRAM: 6 transistors per bit, fast, no refresh needed
DRAM: 1 transistor + 1 capacitor per bit, dense, requires refresh every 2-4 ms
Simple ROM: Diode matrix at row/column intersections
Error detection: Parity (1 extra bit, detects single errors), Checksum (sum of bytes)
Error correction: Hamming code (3 check bits per 4 data bits, corrects single-bit errors)
File system minimum: Directory listing file names, start locations, and lengths
Backup rule: Always maintain at least two copies of critical data on separate media