Capacity Units

Bits, bytes, kilobytes, megabytes — the standard units for quantifying digital storage capacity and why two competing definitions cause constant confusion.

Why This Matters

Storage capacity is measured in bits and bytes and their multiples. Understanding these units — and the critical difference between decimal and binary prefixes — is necessary for interpreting specifications, calculating memory requirements, and understanding why a “500 GB” hard drive shows up as less than 500 GB in your operating system.

The confusion is not accidental. Hard drive manufacturers use decimal prefixes (1 KB = 1,000 bytes) because they produce larger-sounding numbers. Operating systems historically used binary prefixes (1 KB = 1,024 bytes) because powers of 2 align with hardware. The gap widens at larger scales: a “1 TB” drive by SI definition has 10^12 bytes, but operating systems may report only ~931 GiB.

Anyone designing storage systems or writing firmware must understand both systems and know which one applies in each context.

The Bit and Byte

The bit (b) is the fundamental unit of binary information — a single 0 or 1. It corresponds to one digital signal: HIGH or LOW.

The byte (B) is 8 bits. A byte can represent 2^8 = 256 different values (0 to 255 unsigned, or -128 to +127 signed in two’s complement). One byte can hold one ASCII character, one pixel in 8-bit color, or one 8-bit I/O port value.
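The value ranges above follow directly from the bit width. A minimal Python sketch of that arithmetic; the helper name `int_ranges` is illustrative and not part of any library:

```python
def int_ranges(bits: int) -> dict:
    """Return the count of distinct values and the unsigned/signed
    (two's complement) ranges for an integer of the given bit width."""
    count = 2 ** bits
    return {
        "values": count,
        "unsigned": (0, count - 1),
        "signed": (-(count // 2), count // 2 - 1),
    }


print(int_ranges(8))

# The same 8-bit pattern reads differently depending on interpretation:
raw = bytes([0xFF])
print(int.from_bytes(raw, "big", signed=False))  # 255
print(int.from_bytes(raw, "big", signed=True))   # -1
```

The second half shows that signedness is a matter of interpretation: the stored bits are identical in both cases.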

Historical note: the term “byte” originated at IBM in the 1950s, where it meant a group of bits used to encode one character. Byte sizes varied across early machines (6-bit characters were common); the 8-bit byte became standard with the IBM System/360 and has dominated since the mid-1960s.

The nibble (4 bits, half a byte) is used informally. One nibble holds one hexadecimal digit (0–F).
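Because one nibble maps to one hexadecimal digit, a byte splits cleanly into two hex digits. A short sketch, assuming a hypothetical helper named `split_nibbles`:

```python
def split_nibbles(value: int) -> tuple:
    """Split one byte (0-255) into its (high, low) 4-bit nibbles."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    high = (value >> 4) & 0xF  # upper four bits
    low = value & 0xF          # lower four bits
    return high, low


high, low = split_nibbles(0xA7)
print(f"{high:X} {low:X}")  # prints "A 7"
```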

Decimal Prefixes (SI)

The International System of Units (SI) defines standard prefixes for powers of 10:

| Prefix | Symbol | Value |
|--------|--------|-------|
| kilo | k | 10^3 = 1,000 |
| mega | M | 10^6 = 1,000,000 |
| giga | G | 10^9 = 1,000,000,000 |
| tera | T | 10^12 |
| peta | P | 10^15 |
| exa | E | 10^18 |

In storage, hard drive and SSD manufacturers use SI prefixes:

  • 1 KB = 1,000 bytes
  • 1 MB = 1,000,000 bytes
  • 1 GB = 1,000,000,000 bytes
  • 1 TB = 1,000,000,000,000 bytes

A “500 GB” drive holds 500 × 10^9 = 500,000,000,000 bytes.

Binary Prefixes (IEC)

Computing traditionally uses powers of 2 for memory addressing. The IEC (International Electrotechnical Commission) defined binary prefixes in 1998 to avoid confusion:

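| Unit | Symbol | Value (bytes) | Nearest SI value |
|------|--------|---------------|------------------|
| kibibyte | KiB | 2^10 = 1,024 | ≈ 10^3 |
| mebibyte | MiB | 2^20 = 1,048,576 | ≈ 10^6 |
| gibibyte | GiB | 2^30 = 1,073,741,824 | ≈ 10^9 |
| tebibyte | TiB | 2^40 ≈ 1.1 × 10^12 | ≈ 10^12 |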

Operating systems often use these binary sizes but call them KB, MB, GB (mislabeling binary sizes with SI names). This is the primary source of confusion.
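To make the labeling problem concrete, the sketch below formats one byte count under both conventions. The function name `format_size` and the unit lists are assumptions for illustration; the SI list uses the same KB/MB/GB labels as this article.

```python
SI_UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]        # powers of 1,000
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]  # powers of 1,024


def format_size(n_bytes: int, binary: bool = False) -> str:
    """Format a byte count with SI (decimal) or IEC (binary) units."""
    base = 1024 if binary else 1000
    units = IEC_UNITS if binary else SI_UNITS
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value below the base,
        # or at the largest unit available.
        if value < base or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= base


drive = 500 * 10**9  # a "500 GB" drive as sold
print(format_size(drive))               # 500.00 GB
print(format_size(drive, binary=True))  # 465.66 GiB
```

An OS that computes the second figure but prints it with a "GB" label produces exactly the mismatch described above.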

Windows traditionally reports file sizes in binary units labeled as KB/MB/GB. On Linux, the GNU df and ls tools support both conventions: -h prints sizes in powers of 1,024, while --si prints them in powers of 1,000. macOS switched to decimal (SI) units in Mac OS X 10.6 (Snow Leopard), so Mac users see numbers closer to what drive manufacturers report.

The Capacity Gap

The gap between SI and binary units grows with scale:

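| Label | SI | Binary | Gap |
|-------|----|--------|-----|
| “1 KB” | 1,000 B | 1,024 B | 2.4% |
| “1 MB” | 1,000,000 B | 1,048,576 B | 4.9% |
| “1 GB” | 10^9 B | 2^30 B | 7.4% |
| “1 TB” | 10^12 B | 2^40 B | 9.95% |
| “1 PB” | 10^15 B | 2^50 B | 12.6% |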

A “1 TB” hard drive (SI) holds 10^12 bytes. An OS using binary units reports this as 10^12 / 2^30 ≈ 931 GiB — displayed as “931 GB” by Windows, hence the seeming discrepancy.
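The gap percentages and the 931 GiB figure can be reproduced with a few lines of arithmetic. This is a sketch; `gap_percent` is an illustrative name, and the gap is measured as how much larger the binary unit is than the SI unit at the same prefix level.

```python
PREFIXES = ["K", "M", "G", "T", "P"]


def gap_percent(level: int) -> float:
    """Percentage by which 2^(10*level) exceeds 10^(3*level).
    level 1 = kilo/kibi, 2 = mega/mebi, and so on."""
    return (2 ** (10 * level) / 10 ** (3 * level) - 1) * 100


for level, prefix in enumerate(PREFIXES, start=1):
    print(f"1 {prefix}B: {gap_percent(level):.2f}%")

# A 1 TB (SI) drive expressed in GiB:
TB_IN_GIB = 10**12 / 2**30
print(f"{TB_IN_GIB:.2f} GiB")  # 931.32 GiB
```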

Data Rates: Bits vs. Bytes

Network speeds and storage transfer rates are often quoted in bits per second (bps), not bytes per second. Be careful with the capitalization convention: lowercase ‘b’ = bits, uppercase ‘B’ = bytes.

  • 1 Gbps Ethernet = 10^9 bits/second = 125 MB/s (SI megabytes per second)
  • USB 3.0 = 5 Gbps raw signaling rate = 625 MB/s on the wire; 8b/10b line encoding caps payload at 500 MB/s, and ~400 MB/s is typical in practice
  • SATA III SSD: ~550 MB/s sequential read

A 1 Gbps network link and a storage device rated at 125 MB/s have the same raw throughput (125 × 8 = 1,000 Mbps). Failing to distinguish bits from bytes leads to factor-of-8 errors in capacity and bandwidth calculations.
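Keeping the factor of 8 explicit in code helps avoid this mistake. A minimal sketch with illustrative function names; it ignores protocol and encoding overhead, so real transfers will be slower:

```python
BITS_PER_BYTE = 8


def bits_to_bytes_per_s(bits_per_s: float) -> float:
    """Convert a line rate in bits/second to bytes/second."""
    return bits_per_s / BITS_PER_BYTE


def transfer_seconds(size_bytes: int, link_bits_per_s: float) -> float:
    """Ideal time to move size_bytes over a link, with no overhead."""
    return size_bytes * BITS_PER_BYTE / link_bits_per_s


GIGABIT = 1e9  # 1 Gbps, SI prefix

print(bits_to_bytes_per_s(GIGABIT) / 1e6)   # 125.0 (MB/s)
print(transfer_seconds(10**9, GIGABIT))     # 8.0 (seconds for 1 GB)
```

Naming the variables with their units (`bits_per_s`, `size_bytes`) is a cheap way to make a bits-versus-bytes mix-up visible at the call site.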

Memory vs. Storage Terminology

| Context | Units typically used | Notes |
|---------|----------------------|-------|
| RAM | Binary (MiB, GiB) | Must be powers of 2 for addressing |
| ROM/Flash | Binary (KiB, MiB) | Flash in powers of 2 |
| Hard drives | SI (GB, TB) | Manufacturer convention |
| SSDs | SI (GB, TB) | Follows HDD convention |
| Network bandwidth | SI (Mbps, Gbps) | Always bits, SI prefixes |
| File sizes (OS) | Varies | Windows: binary labeled as SI; macOS: SI |
| Firmware: addressing | Binary | Address spaces are 2^N |

When writing firmware or system code, always use powers of 2 for buffer sizes, array dimensions, and memory regions. Use 1024 (not 1000) for KiB in code, and document clearly whether your reported sizes are binary or decimal.
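One way to follow this advice is to define the binary units once as named constants and check sizes against them. The sketch below uses Python for consistency with the other examples; the same pattern carries over to C constants or macros. All names (`KIB`, `is_power_of_two`, `next_power_of_two`, `BUFFER_SIZE`) are illustrative.

```python
KIB = 1024        # binary kilobyte (KiB), not 1000
MIB = 1024 * KIB  # binary megabyte (MiB)


def is_power_of_two(n: int) -> bool:
    """True if n is a positive power of two (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


BUFFER_SIZE = 4 * KIB  # 4 KiB (binary) = 4096 bytes

print(is_power_of_two(BUFFER_SIZE))  # True
print(next_power_of_two(1000))       # 1024
```

The comment on `BUFFER_SIZE` states the unit system explicitly, which is the documentation habit recommended above.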