CD Technology
How 650 megabytes of data fit on a 12 cm polycarbonate disc — the encoding, error correction, and physical format that made CDs the dominant digital medium of the 1980s–2000s.
Why This Matters
The Compact Disc is a triumph of engineering precision and clever mathematics. Developed jointly by Philips and Sony in the early 1980s, it crammed far more data onto a small, easily manufactured disc than any previous consumer medium, with error correction robust enough to tolerate significant disc damage.
For rebuilders, CDs are perhaps the most valuable optical archive medium available. Billions were manufactured. The format is fully documented in the Red Book (audio) and Yellow Book (data CD-ROM) specifications. Any CD drive — including the billions salvageable from computers, game consoles, and music players — can read a standard CD. The discs are durable, compact, and contain enough capacity for entire libraries of reference material, programs, and databases.
Understanding the CD format helps you verify disc integrity, recover partially damaged discs, evaluate which drives can read which discs, and appreciate how the format’s engineering decisions translated into the capacity and reliability that made it successful.
Physical Format
A standard CD is 120 mm in diameter, 1.2 mm thick, made of clear polycarbonate with a very thin aluminium reflective layer on one side, protected by a lacquer coating and printed label.
Data layer: Immediately below the lacquer layer sits the aluminium reflective coating (~50 nm thick). Beneath the aluminium is the data layer: for pressed CDs, a pattern of pits and lands molded into the polycarbonate substrate during injection molding. The pit depth is about 110 nm. A quarter of the 780 nm laser wavelength inside polycarbonate (refractive index 1.55) would be 780 nm / 4 / 1.55 ≈ 126 nm; the actual depth is slightly less due to design tradeoffs.
The spiral track: One continuous spiral track runs from 25 mm radius (inner) to 58 mm radius (outer), covering roughly 5.4 km of track on a full disc. Track pitch is 1.6 μm.
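The track length follows from the geometry alone. As a rough sketch, the spiral can be treated as a set of closely spaced rings, so its length is approximately the area of the recorded band divided by the track pitch. The function name is illustrative, and the approximation ignores lead-in and lead-out areas.

```python
import math

def spiral_track_length_m(r_inner_mm: float, r_outer_mm: float, pitch_um: float) -> float:
    """Approximate spiral length as (annulus area) / (track pitch)."""
    area_mm2 = math.pi * (r_outer_mm ** 2 - r_inner_mm ** 2)
    pitch_mm = pitch_um / 1000.0
    return area_mm2 / pitch_mm / 1000.0  # convert mm to m

length_m = spiral_track_length_m(25.0, 58.0, 1.6)
turns = (58.0 - 25.0) / (1.6 / 1000.0)  # radial width divided by pitch
print(f"track length ≈ {length_m / 1000:.2f} km over about {turns:,.0f} turns")
```

With the radii and pitch given above, this lands between 5.3 and 5.5 km, spread over roughly twenty thousand turns of the spiral.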
Scanning velocity: CDs are read at constant linear velocity (CLV): 1.2–1.4 m/s for single-speed (1×) drives. Higher speed drives read at multiples of this: 8× = 9.6–11.2 m/s. At the inner track (25 mm radius), the disc spins at ~500 RPM for 1× speed; at the outer track (58 mm radius), ~200 RPM.
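The rotation speeds quoted above can be checked from the linear velocity and the radius. This is a minimal sketch; the function name is illustrative, and the 8× figures assume the drive keeps constant linear velocity at that speed.

```python
import math

def rpm_at_radius(linear_velocity_m_s: float, radius_mm: float) -> float:
    """Rotation rate needed to keep the given linear velocity at this radius."""
    circumference_m = 2 * math.pi * (radius_mm / 1000.0)
    return linear_velocity_m_s / circumference_m * 60.0

for multiplier in (1, 8):
    for base_velocity in (1.2, 1.4):
        v = base_velocity * multiplier
        inner = rpm_at_radius(v, 25.0)
        outer = rpm_at_radius(v, 58.0)
        print(f"{multiplier}x at {v:.1f} m/s: {inner:.0f} RPM inner, {outer:.0f} RPM outer")
```

At 1× the results bracket the ~500 RPM (inner) and ~200 RPM (outer) figures: the exact value depends on whether the disc was recorded at 1.2 or 1.4 m/s.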
Channel Bit Encoding: EFM
Raw data bytes cannot be directly recorded as bit patterns on a CD. Instead, each 8-bit byte is converted to a 14-channel-bit pattern using Eight-to-Fourteen Modulation (EFM).
EFM’s design constraint: in the channel bit stream, consecutive 1-bits must be at least 3 and at most 11 bit-cell positions apart (that is, separated by two to ten 0-bits). Each 1-bit marks an edge between a pit and a land, so every pit and land is 3 to 11 channel bits long. This ensures:
- Minimum pit/land length: prevents the spot size from having to resolve features too small to detect reliably
- Maximum run length: ensures regular signal transitions for clock recovery (the decoder must synchronize its clock to the incoming data)
After EFM encoding, 3 “merging bits” are added between each 14-bit code word (to satisfy run-length constraints across word boundaries). So each 8-bit byte becomes 14 + 3 = 17 channel bits, plus a 24-bit synchronization pattern at the start of each frame.
The 256 possible bytes each have a unique EFM code (and additional patterns exist for synchronization and control). The code table is fixed. Fourteen bits is the shortest word length that offers enough valid patterns: 267 of the 16,384 possible 14-bit words satisfy the run-length constraints, and 256 of them are used for data.
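The run-length rule and the per-frame bit budget can be expressed in a few lines. This sketch does not contain the real EFM table; it only checks whether an arbitrary channel bit string obeys the spacing rule, and it adds up the channel bits in one frame. The count of 33 bytes per frame (24 data, 8 parity, 1 subcode) and the merging bits after the sync pattern are assumptions not spelled out in the text above.

```python
SYNC_BITS = 24       # frame synchronization pattern
CODEWORD_BITS = 14   # one EFM code word per byte
MERGING_BITS = 3     # inserted after the sync pattern and after each code word
# Assumption (not stated in the text): a frame carries 33 bytes,
# i.e. 24 data + 8 parity + 1 subcode.
BYTES_PER_FRAME = 33

def channel_bits_per_frame() -> int:
    """Total channel bits recorded on disc for one frame."""
    return SYNC_BITS + MERGING_BITS + BYTES_PER_FRAME * (CODEWORD_BITS + MERGING_BITS)

def satisfies_run_lengths(channel_bits: str, min_gap: int = 3, max_gap: int = 11) -> bool:
    """True if every pair of consecutive 1-bits is min_gap..max_gap positions apart.

    Leading and trailing runs of 0s are not checked here, because their
    validity depends on the neighbouring code words and merging bits.
    """
    ones = [i for i, bit in enumerate(channel_bits) if bit == "1"]
    return all(min_gap <= b - a <= max_gap for a, b in zip(ones, ones[1:]))

print(channel_bits_per_frame())                     # 24 + 3 + 33 * 17
print(satisfies_run_lengths("100100001000"))        # gaps of 3 and 5
print(satisfies_run_lengths("1100"))                # gap of 1 is too short
print(satisfies_run_lengths("1" + "0" * 11 + "1"))  # gap of 12 is too long
```

Under these assumptions a frame occupies 588 channel bits on disc while carrying only 192 bits of user data, which shows how much of the raw disc capacity goes to modulation and error protection.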
Sector and Block Structure
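Data on a CD-ROM is organized in a hierarchy: frames are grouped into sectors, which CD-ROM documentation also calls blocks.
Frame: 24 bytes of user data plus 8 bytes of CIRC error correction parity (4 bytes from the C2 code and 4 from the C1 code), giving 32 bytes. One subcode byte is added, and the resulting 33 bytes are then EFM-encoded. This is the smallest unit the CD encoding system processes.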
Sector (Mode 1 CD-ROM): 98 frames = 2,352 bytes of raw sector data, consisting of:
- 12 bytes: synchronization pattern
- 3 bytes: sector address (minute:second:frame in BCD; here “frame” means one sector, 1/75 second of playing time, not the 24-byte frame described above)
- 1 byte: mode indicator
- 2,048 bytes: user data
- 4 bytes: error detection code (CRC-32)
- 8 bytes: reserved
- 276 bytes: additional error correction (Reed-Solomon Product-code, P and Q parity)
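So each sector stores 2,048 bytes of user data within 2,352 total bytes. The remaining 304 bytes of addressing and extra error protection amount to about 15% on top of the user data (about 13% of the raw sector), in addition to the CIRC parity already applied to every frame.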
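The field layout above maps directly onto byte offsets. The following is a minimal sketch of splitting one raw 2,352-byte Mode 1 sector into its fields. The names are illustrative; the 12-byte sync value (00, ten FF bytes, 00) is not given in the text above; and the sketch assumes the input has already been descrambled by the drive. It does not verify the error detection code or apply the P/Q parity.

```python
from dataclasses import dataclass

SECTOR_SIZE = 2352
# Assumption: the standard CD-ROM sector sync pattern, 00 FF x10 00.
SYNC = bytes([0x00] + [0xFF] * 10 + [0x00])

def bcd_to_int(value: int) -> int:
    """Decode one binary-coded-decimal byte, e.g. 0x16 -> 16."""
    return (value >> 4) * 10 + (value & 0x0F)

@dataclass
class Mode1Sector:
    minute: int
    second: int
    frame: int         # sector index within the second, 0..74
    user_data: bytes   # 2,048 bytes
    edc: bytes         # 4 bytes, error detection code
    ecc: bytes         # 276 bytes, P and Q parity

def parse_mode1_sector(raw: bytes) -> Mode1Sector:
    if len(raw) != SECTOR_SIZE:
        raise ValueError("a raw sector must be exactly 2352 bytes")
    if raw[:12] != SYNC:
        raise ValueError("sync pattern not found")
    if raw[15] != 1:
        raise ValueError("not a Mode 1 sector")
    minute, second, frame = (bcd_to_int(b) for b in raw[12:15])
    # Bytes 2068..2075 are the 8 reserved bytes and are skipped.
    return Mode1Sector(minute, second, frame,
                       user_data=raw[16:2064],
                       edc=raw[2064:2068],
                       ecc=raw[2076:2352])
```

A reader that only needs file contents can stop at `user_data`; the `edc` and `ecc` fields matter when checking or repairing a damaged sector.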
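Disc capacity: 74 minutes × 60 seconds/minute × 75 sectors/second × 2,048 bytes/sector = 681,984,000 bytes, which is about 682 MB in decimal units or 650 MiB, the familiar “650 MB” figure. Standard CDs are specified for 74 minutes; some manufacturers produce 80-minute discs (737,280,000 bytes, about 703 MiB, sold as “700 MB”).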
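The capacity arithmetic is easy to reproduce, and doing so shows where the different “megabyte” figures come from. This is a small sketch with illustrative names; it counts user data only and ignores file system overhead.

```python
SECTORS_PER_SECOND = 75
USER_BYTES_PER_SECTOR = 2048

def capacity_bytes(minutes: int, bytes_per_sector: int = USER_BYTES_PER_SECTOR) -> int:
    """User-data capacity of a disc with the given playing time."""
    return minutes * 60 * SECTORS_PER_SECOND * bytes_per_sector

for minutes in (74, 80):
    total = capacity_bytes(minutes)
    print(f"{minutes} min: {total:,} bytes = "
          f"{total / 1e6:.0f} MB (decimal) = {total / 2**20:.0f} MiB (binary)")
```

The same 74-minute disc is about 682 million bytes or about 650 binary megabytes, depending on which unit is meant.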
Error Correction: CIRC
The CD uses Cross-Interleave Reed-Solomon Coding (CIRC), which provides exceptional protection against burst errors (scratches).
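First level (C1 decoder): A (32,28) Reed-Solomon code (28 data bytes, 4 parity bytes) is decoded first on playback. Four parity bytes allow it to correct up to 2 byte errors at unknown positions, or up to 4 erasures (bad bytes whose positions are already known), per 32-byte block. In practice C1 fixes small random errors and flags any block it cannot repair, so that its bytes are passed on as erasures.
Interleaving: Between the two codes, the bytes are reordered by delay lines. The encoder applies C2 first, then interleaves, then applies C1; the player runs the steps in reverse. The interleave spreads the 28 bytes of one C2 codeword over 108 frames, roughly 18 to 20 mm of track. A scratch that destroys 1 mm of track wipes out about six consecutive frames, but after de-interleaving those bytes land in many different C2 codewords, so each codeword sees only one or two bad bytes.
Second level (C2 decoder): A (28,24) Reed-Solomon code with 4 more parity bytes. Like C1, it can correct up to 2 errors or up to 4 erasures per codeword, and the erasure flags from C1 tell it where the bad bytes are. Combined with the interleaving, burst errors up to about 2.5 mm of track length (roughly 4,000 bits) are fully correctable. Longer bursts, up to about 7.7 mm, can be concealed by interpolation (for audio) or are reported as uncorrectable errors (for data).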
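A toy model shows why interleaving turns a long burst into a few bad bytes per codeword. The sketch below is a simplification, not the full CIRC: it assumes byte position i of each 28-byte C2 codeword is delayed by 4 × i frames (so the last byte travels 108 frames later than the first), and it ignores the additional short delays and byte reordering used on real discs. It also assumes a codeword with 4 parity bytes can recover up to 4 bytes whose positions are known.

```python
DELAY_STEP = 4                 # frames of extra delay per byte position (assumed model)
WORD_LEN = 28                  # bytes in one C2 codeword
FRAMES_PER_SECOND = 75 * 98    # 75 sectors/s, 98 frames per sector
MM_PER_FRAME = 1.2 * 1000 / FRAMES_PER_SECOND   # track length of one frame at 1.2 m/s

def worst_case_erasures(burst_frames: int) -> int:
    """Largest number of bytes any single codeword loses to a burst
    that destroys `burst_frames` consecutive frames."""
    first_bad = DELAY_STEP * (WORD_LEN - 1)   # offset keeps codeword indices >= 0
    bad_frames = range(first_bad, first_bad + burst_frames)
    worst = 0
    for word in range(first_bad + burst_frames):
        # Byte i of this codeword was carried by frame word + DELAY_STEP * i.
        hits = sum(1 for i in range(WORD_LEN) if word + DELAY_STEP * i in bad_frames)
        worst = max(worst, hits)
    return worst

for frames in (1, 6, 16, 17):
    print(f"burst of {frames:2d} frames ≈ {frames * MM_PER_FRAME:.1f} mm: "
          f"worst codeword loses {worst_case_erasures(frames)} bytes")
```

In this model a burst of 16 frames (about 2.6 mm at 1.2 m/s) costs any one codeword at most 4 bytes, which is the erasure limit, while 17 frames pushes the worst codeword to 5. That matches the order of magnitude of the fully correctable burst length.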
This robust error correction is why a CD with a moderate scratch typically still plays perfectly — the error correction overhead handles the burst error without the listener or user noticing.
The CD File System: ISO 9660
CD-ROM discs use the ISO 9660 file system to organize sectors into files and directories. ISO 9660 is simple, widely supported, and designed for read-only access.
Primary Volume Descriptor (PVD): Located at sector 16, contains the disc name, creation date, total number of sectors, block size, and the location of the root directory record.
Directory records: Each directory entry contains the file name, starting sector number, file length in bytes, recording date/time, and flags (directory vs. file). In base ISO 9660 (Level 1), file names are limited to 8 characters plus a 3-character extension and directory names to 8 characters; extensions like Joliet and Rock Ridge lift this restriction.
Path table: A compact table listing all directories with their parent directories, allowing fast location of deeply nested files. A simple reader can ignore it and walk the directory records instead.
Data files: At the basic interchange levels, each file is stored as a single run of contiguous sectors, from its starting sector to its ending sector, so there is no fragmentation to track. This simplifies reading (no linked-list traversal). Because the layout is fixed when the disc is written (mastered), files cannot be changed in place afterward.
The simplicity of ISO 9660 makes it one of the easiest file systems to implement from scratch. A CD-ROM reader needs only the PVD location (always sector 16), the root directory location (from the PVD), and the ability to follow directory chains. Total implementation: perhaps 200 lines of code in a procedural language.
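As a starting point for such a reader, here is a minimal sketch of parsing the Primary Volume Descriptor from a disc image with 2,048-byte sectors. The function names and the returned dictionary keys are illustrative. The byte offsets follow the published ISO 9660 layout, which stores each number in both byte orders; only the little-endian copy is read here.

```python
import struct

SECTOR_SIZE = 2048
PVD_SECTOR = 16

def parse_pvd(sector: bytes) -> dict:
    """Extract a few fields from a Primary Volume Descriptor sector."""
    if len(sector) < 190 or sector[0] != 1 or sector[1:6] != b"CD001":
        raise ValueError("not an ISO 9660 Primary Volume Descriptor")
    root = sector[156:190]  # 34-byte directory record for the root directory
    return {
        "volume_id": sector[40:72].decode("ascii", errors="replace").rstrip(),
        "total_blocks": struct.unpack_from("<I", sector, 80)[0],
        "block_size": struct.unpack_from("<H", sector, 128)[0],
        "root_dir_block": struct.unpack_from("<I", root, 2)[0],
        "root_dir_size": struct.unpack_from("<I", root, 10)[0],
    }

def read_pvd(image_path: str) -> dict:
    """Read sector 16 of an image file (or device node) and parse it."""
    with open(image_path, "rb") as f:
        f.seek(PVD_SECTOR * SECTOR_SIZE)
        return parse_pvd(f.read(SECTOR_SIZE))
```

From `root_dir_block` and `root_dir_size` a reader can load the root directory and step through its records to find files. A fuller implementation would also check the descriptor version and handle images that do not use 2,048-byte sectors.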
CD-R and CD-RW Specifics
CD-R (recordable): Uses an organic dye layer instead of molded pits. A focused write laser heats spots in the dye, permanently darkening them so that they read like pits. CD-R discs work in most standard CD drives, but test each drive you intend to rely on: some early CD players cannot read CD-R because of its lower reflectivity.
Writing: CD-R must be written at the appropriate speed for the disc rating. Writing too fast causes incomplete heating and unreadable sectors. Writing too slow can over-heat the dye. Match write speed to disc specification, and use a quality disc from a reputable manufacturer.
CD-RW (rewritable): Phase-change material instead of dye. Significantly lower reflectivity than pressed CDs — many older CD players cannot read CD-RW. Only ~1,000 rewrite cycles before the phase-change layer degrades. Best for data that changes frequently over a short period, not for archival.
Verifying a burn: Always verify by reading back every sector immediately after writing, before removing the disc from the drive. If any sector shows CRC errors during verification, burn the data again on a fresh disc. Never trust a burn you have not verified.
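One way to do the read-back check, assuming the disc was written from an image file and the operating system exposes the drive as a readable path, is to compare hashes. This is a sketch: the function names are illustrative, and a device path such as `/dev/sr0` is a Linux-style assumption. Only as many bytes as the image contains are read from the disc, since a drive may return padding sectors after the end of the data.

```python
import hashlib
import os

def sha256_of(path: str, length=None, chunk_size: int = 2048 * 512) -> str:
    """Hash a file or device; if length is given, hash only that many bytes."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining is None or remaining > 0:
            want = chunk_size if remaining is None else min(chunk_size, remaining)
            chunk = f.read(want)
            if not chunk:
                break  # end of data (or a short read from the device)
            digest.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return digest.hexdigest()

def burn_matches_image(image_path: str, device_path: str) -> bool:
    """True if the disc's leading bytes hash the same as the source image."""
    size = os.path.getsize(image_path)
    return sha256_of(image_path) == sha256_of(device_path, length=size)

# Example (paths are hypothetical):
# burn_matches_image("library.iso", "/dev/sr0")
```

A read error on the device will typically surface as an exception from `f.read`, which should be treated the same as a mismatch.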