Capacity Units

Bits, bytes, kilobytes, megabytes — the standard units for quantifying digital storage capacity and why two competing definitions cause constant confusion.

Why This Matters

Storage capacity is measured in bits and bytes and their multiples. Understanding these units — and the critical difference between decimal and binary prefixes — is necessary for interpreting specifications, calculating memory requirements, and understanding why a “500 GB” hard drive shows up as less than 500 GB in your operating system.

The confusion is not accidental. Hard drive manufacturers use decimal prefixes (1 KB = 1,000 bytes) because they produce larger-sounding numbers. Operating systems historically used binary prefixes (1 KB = 1,024 bytes) because powers of 2 align with hardware. The gap widens at larger scales: a “1 TB” drive by SI definition has 10^12 bytes, but operating systems may report only ~931 GiB.

Anyone designing storage systems or writing firmware must understand both systems and know which one applies in each context.

The Bit and Byte

The bit (b) is the fundamental unit of binary information — a single 0 or 1. It corresponds to one digital signal: HIGH or LOW.

The byte (B) is 8 bits. A byte can represent 2^8 = 256 different values (0 to 255 unsigned, or -128 to +127 signed in two’s complement). One byte can hold one ASCII character, one pixel of 8-bit color, or the value of one 8-bit I/O port.
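The ranges above can be checked with a short sketch. The helper names `byte_ranges` and `as_signed` are illustrative and not part of any library.

```python
def byte_ranges(bits: int = 8) -> dict:
    """Return the value count and the unsigned/signed ranges for a field of `bits` bits."""
    count = 2 ** bits
    return {
        "values": count,
        "unsigned": (0, count - 1),
        "signed": (-(count // 2), count // 2 - 1),
    }


def as_signed(raw: int) -> int:
    """Reinterpret an unsigned byte value (0..255) as a two's complement number."""
    return int.from_bytes(bytes([raw]), byteorder="big", signed=True)


print(byte_ranges())    # {'values': 256, 'unsigned': (0, 255), 'signed': (-128, 127)}
print(as_signed(0xFF))  # -1
print(ord("A"))         # 65, the ASCII code for 'A' fits in one byte
```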

Historical note: the term “byte” was coined at IBM in the 1950s and originally meant a group of bits handled as a unit, not necessarily 8 of them. Early computers used other sizes, such as 6-bit or 7-bit bytes; the 8-bit byte became the de facto standard with the IBM System/360 in the mid-1960s.

The nibble (4 bits, half a byte) is used informally. One nibble holds one hexadecimal digit (0–F).
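As a small illustration of the nibble-to-hex-digit correspondence, the sketch below splits a byte into its two nibbles with a shift and a mask. The function name `split_nibbles` is an assumption made for this example.

```python
def split_nibbles(byte: int) -> tuple:
    """Split a byte (0..255) into its high and low 4-bit halves."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value between 0 and 255")
    return byte >> 4, byte & 0x0F


high, low = split_nibbles(0xA7)
print(f"{high:X} {low:X}")  # A 7, one hex digit per nibble
```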

Decimal Prefixes (SI)

The International System of Units (SI) defines standard prefixes for powers of 10:

Prefix	Symbol	Value
kilo	k	10^3 = 1,000
mega	M	10^6 = 1,000,000
giga	G	10^9 = 1,000,000,000
tera	T	10^12
peta	P	10^15
exa	E	10^18

Hard drive and SSD manufacturers use SI prefixes for capacity. The strict SI symbol for kilo is a lowercase k (kB), although KB is common on packaging:

1 kB = 1,000 bytes
1 MB = 1,000,000 bytes
1 GB = 1,000,000,000 bytes
1 TB = 1,000,000,000,000 bytes

A “500 GB” drive holds 500 × 10^9 = 500,000,000,000 bytes.

Binary Prefixes (IEC)

Computing traditionally uses powers of 2 for memory addressing. The IEC (International Electrotechnical Commission) defined binary prefixes in 1998 to avoid confusion:

Unit	Symbol	Value (bytes)	Nearest SI value
kibibyte	KiB	2^10 = 1,024	≈ 10^3
mebibyte	MiB	2^20 = 1,048,576	≈ 10^6
gibibyte	GiB	2^30 = 1,073,741,824	≈ 10^9
tebibyte	TiB	2^40 = 1,099,511,627,776	≈ 10^12

Operating systems often use these binary sizes but call them KB, MB, GB (mislabeling binary sizes with SI names). This is the primary source of confusion.

Windows traditionally reports file sizes in binary units labeled as KB/MB/GB. On Linux, the GNU df and ls commands support both conventions: the -h option prints powers of 1,024, while df -H and ls --si print powers of 1,000. macOS switched to decimal (SI) units in Mac OS X 10.6 (Snow Leopard), so Mac users see numbers closer to what drive manufacturers report.
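To make the two reporting conventions concrete, here is a minimal sketch of a size formatter that can print the same byte count either way. The function `format_size` and its `binary` parameter are hypothetical; it does not reproduce the exact rounding of any particular operating system or tool.

```python
SI_UNITS = ["B", "kB", "MB", "GB", "TB", "PB"]
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]


def format_size(num_bytes: int, binary: bool = True) -> str:
    """Format a byte count with IEC units (base 1024) or SI units (base 1000)."""
    base = 1024 if binary else 1000
    units = IEC_UNITS if binary else SI_UNITS
    value = float(num_bytes)
    for unit in units[:-1]:
        if value < base:
            return f"{value:.1f} {unit}"
        value /= base
    # Anything larger than the biggest listed unit is expressed in that unit.
    return f"{value:.1f} {units[-1]}"


drive = 500 * 10**9  # a "500 GB" drive as the manufacturer counts it
print(format_size(drive, binary=False))  # 500.0 GB
print(format_size(drive, binary=True))   # 465.7 GiB
```

The same byte count yields two different numbers, and labeling the second one "GB" is exactly the mislabeling described above.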

The Capacity Gap

The gap between SI and binary units grows with scale:

Scale	SI	Binary	Gap
“1 KB”	1,000 B	1,024 B	2.4%
“1 MB”	1,000,000 B	1,048,576 B	4.9%
“1 GB”	10^9 B	2^30 B	7.4%
“1 TB”	10^12 B	2^40 B	9.95%
“1 PB”	10^15 B	2^50 B	12.6%

A “1 TB” hard drive (SI) holds 10^12 bytes. An OS using binary units reports this as 10^12 / 2^30 ≈ 931 GiB — displayed as “931 GB” by Windows, hence the seeming discrepancy.
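The gap figures and the 931 GiB result can be reproduced with a few lines of arithmetic. This is a sketch; the names `gap_percent` and `reported_gib` are made up for the example.

```python
PREFIXES = ["K", "M", "G", "T", "P"]


def gap_percent(power: int) -> float:
    """Percent by which 2^(10*power) exceeds 10^(3*power)."""
    return (2 ** (10 * power) / 10 ** (3 * power) - 1) * 100


def reported_gib(si_bytes: int) -> float:
    """Convert a byte count to gibibytes, as an OS using binary units would."""
    return si_bytes / 2**30


for power, prefix in enumerate(PREFIXES, start=1):
    print(f"{prefix}: {gap_percent(power):.2f}%")
# K: 2.40%
# M: 4.86%
# G: 7.37%
# T: 9.95%
# P: 12.59%

print(f"{reported_gib(10**12):.1f} GiB")  # 931.3 GiB
```

Each step up in prefix multiplies the ratio by another 1.024, which is why the gap compounds rather than staying at 2.4%.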

Data Rates: Bits vs. Bytes

Network speeds and storage transfer rates are often quoted in bits per second (bps), not bytes per second. Be careful with the capitalization convention: lowercase ‘b’ = bits, uppercase ‘B’ = bytes.

1 Gbps Ethernet = 10^9 bits/second = 125 MB/s (SI megabytes per second)
USB 3.0 = 5 Gbps raw signaling rate = 625 MB/s; 8b/10b line encoding leaves 500 MB/s of usable bandwidth, and ~400 MB/s is typical in practice
SATA III SSD: ~550 MB/s sequential read

A 1 Gbps network link and a storage device rated at 125 MB/s have the same raw throughput (125 × 8 = 1,000 Mbps). Failing to distinguish bits from bytes leads to factor-of-8 errors in bandwidth and transfer-time calculations.
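The bit/byte conversion is simple enough to sketch directly. The helper names below are illustrative, and the calculation ignores protocol and encoding overhead, so real transfers take longer.

```python
def bits_to_bytes_per_second(bits_per_second: float) -> float:
    """Convert a line rate in bits/s to bytes/s (8 bits per byte)."""
    return bits_per_second / 8


def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


gigabit = 10**9  # 1 Gbps, SI prefix
print(bits_to_bytes_per_second(gigabit) / 10**6)  # 125.0 (MB/s)
print(transfer_seconds(10**9, gigabit))           # 8.0 seconds for 1 GB
```

Dropping the factor of 8 in `transfer_seconds` would predict 1 second instead of 8, which is the kind of error the capitalization convention is meant to prevent.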

Memory vs. Storage Terminology

Context	Units typically used	Notes
RAM	Binary (MiB, GiB)	Capacities follow powers of 2 because of addressing
ROM/Flash	Binary (KiB, MiB)	Chip capacities come in powers of 2
Hard drives	SI (GB, TB)	Manufacturer convention
SSDs	SI (GB, TB)	Follows HDD convention
Network bandwidth	SI (Mbps, Gbps)	Always bits, SI prefixes
File sizes (OS)	Varies	Windows: binary labeled as SI; macOS: SI
Firmware: addressing	Binary	Address spaces are 2^N

When writing firmware or system code, size buffers and memory regions in powers of 2 wherever the hardware addresses or aligns them that way. Use 1024 (not 1000) for KiB in code, and document clearly whether your reported sizes are binary or decimal.
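One way to follow this advice is to define the binary multiples once as named constants and to label reported sizes explicitly. The sketch below uses Python for consistency with the other examples; the same pattern carries over to C constants or macros. The names `KIB`, `MIB`, `RX_BUFFER_SIZE`, `is_power_of_two`, and `describe` are all assumptions for illustration.

```python
KIB = 1024        # 2**10 bytes
MIB = 1024 * KIB  # 2**20 bytes


def is_power_of_two(n: int) -> bool:
    """True when n is a positive power of 2 (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0


def describe(num_bytes: int) -> str:
    """Report a size in both conventions so it cannot be misread."""
    return f"{num_bytes / KIB:.1f} KiB ({num_bytes / 1000:.1f} kB)"


RX_BUFFER_SIZE = 4 * KIB  # hypothetical receive buffer
assert is_power_of_two(RX_BUFFER_SIZE)
print(describe(RX_BUFFER_SIZE))  # 4.0 KiB (4.1 kB)
```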
