Collision Handling

Detecting and recovering when two computers transmit simultaneously on a shared medium — the engineering problem that Ethernet solved.

Why This Matters

When two transmitters simultaneously drive a shared wire, their signals interfere. The result is garbage — neither message gets through. On a bus or half-duplex network, collisions are inevitable if multiple nodes try to transmit without coordination.

Collision handling is what makes a shared medium network usable. Without it, you either need a centralized coordinator (polled or token-passing networks) or you get unreliable communication where frames are lost silently. With proper collision detection and recovery, a network can serve many nodes fairly with no central coordinator, using simple decentralized rules that each node follows independently.

CSMA/CD (Carrier Sense Multiple Access with Collision Detection) is the protocol that made Ethernet work at scale. Understanding how it detects collisions, how it backs off, and when it gives up explains not just Ethernet but the general class of distributed access control problems that appears wherever a shared resource is contested.

The Problem: Simultaneous Transmission

On a bus network, any node that wants to transmit simply starts driving the wire. If two nodes start transmitting at nearly the same time (within one propagation delay of each other), both drive the wire simultaneously with different signals. The resulting voltage on the wire is neither node’s intended signal: it is the superposition (sum) of both.

Why this is hard to prevent: Nodes cannot know that another node is about to transmit, only that one is currently transmitting, and even that news takes up to one propagation delay to arrive. If nodes A and B both check the wire, both find it idle, and both start transmitting within one one-way propagation delay of each other, each believes it has the wire, but their frames collide.

The collision detection window (the longest a transmitter may have to wait, after it starts sending, before it can learn of a collision) is twice the one-way propagation delay from one end of the bus to the other: the signal must first reach the far end, and the colliding signal must then travel back. On a 10BASE2 Ethernet segment (185 meters, signal speed ~0.65c), the one-way delay is about 0.95 microseconds, so the window is approximately 1.9 microseconds. At 10 Mbps, 1.9 microseconds corresponds to about 19 bits. Ethernet’s minimum frame size (64 bytes = 512 bits) is sized for the largest permitted network, with several segments joined by repeaters, so every frame is longer than the collision detection window. If you are still transmitting when the collision is detected, you know the collision happened to your frame.
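The window arithmetic can be checked with a short sketch. The function name is illustrative, and the two velocity factors tried (0.65 and 0.77) are assumptions meant to bracket typical coaxial cable, not measured values.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s


def collision_window(length_m, velocity_factor, bit_rate_bps):
    """Return (round_trip_seconds, round_trip_bit_times) for one cable segment."""
    one_way_s = length_m / (velocity_factor * C)   # end-to-end propagation delay
    round_trip_s = 2.0 * one_way_s                 # out to the far end and back
    return round_trip_s, round_trip_s * bit_rate_bps


# 185 m segment at 10 Mbps, for two assumed velocity factors
for vf in (0.65, 0.77):
    rt, bits = collision_window(185.0, vf, 10e6)
    print(f"velocity factor {vf}: round trip {rt * 1e6:.2f} us, {bits:.1f} bit times")
```

With either assumed velocity factor, the round trip for a single 185 meter segment stays under 2 microseconds, a small fraction of the 512-bit minimum frame.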

CSMA/CD Step by Step

1. Carrier Sense: Before transmitting, listen to the medium. If it is busy (another node is transmitting), wait until it becomes idle. “Carrier sense” means detecting the presence of a carrier signal (any ongoing transmission).

On Ethernet: the transmit/receive section of the interface detects whether the bus is carrying signals. If busy, the node waits in a deferral loop: it continues listening until the medium is idle, then waits an additional brief period (the interframe gap, 9.6 microseconds on 10 Mbps Ethernet) before transmitting.

2. Transmit: Once the medium appears idle, start transmitting the frame.

3. Collision Detection: While transmitting, continuously monitor the wire for signs of a second transmitter. On coaxial Ethernet, each transceiver measures the average (DC) level of the bus voltage. Normal transmission: a single driver produces a known average level. Collision: two or more drivers together push the average past a threshold that a single driver cannot reach.

On RS-485 half-duplex networks without collision detection hardware: many nodes disable the receiver while transmitting (otherwise the node hears its own transmission), and even with the receiver enabled, the local driver usually dominates what the receiver sees. Without a reliable way to observe the bus while driving it, collision detection is not possible. This is why CSMA/CD depends on transceiver support and is not available on every bus protocol.

4. Jam Signal: When a collision is detected, immediately stop transmitting the data frame and transmit a special 32-bit jam pattern instead. The jam ensures that all nodes on the segment recognize that a collision occurred — even nodes that might have otherwise completed receiving before detecting the collision. After jamming, stop transmitting.

5. Backoff: After detecting a collision, wait a random amount of time before attempting to retransmit. The random wait is critical: if both colliding nodes retransmit immediately, they will collide again.

Truncated Binary Exponential Backoff:

  • After the first collision: choose a random wait of 0 or 1 slot times (one slot = 512 bit times on 10 Mbps Ethernet)
  • After the second collision with the same frame: choose randomly from 0, 1, 2, or 3 slot times
  • After the nth collision: choose randomly from 0 to 2^n - 1 slot times
  • This doubling of the window with each consecutive collision is “exponential backoff”
  • Truncated at n = 10 (maximum window = 1,023 slot times)
  • After 16 consecutive collisions: discard the frame and report error to the upper layer

The randomness in backoff means that even if two nodes collide, they will usually choose different wait times. The node that retries first then transmits uncontested, because the other senses its carrier and defers.
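The five steps and the backoff rule can be sketched together as follows. This is a minimal illustration, not a driver: FakeMedium is a hypothetical stand-in for transceiver hardware, the jam bits are arbitrary placeholders, and all function and constant names are assumptions made for the example.

```python
import random
from enum import Enum

IFG_BITS = 96        # interframe gap (9.6 us at 10 Mbps)
JAM_BITS = 32        # jam pattern length
SLOT_BITS = 512      # one slot time, in bit times
BACKOFF_LIMIT = 10   # the window stops doubling after this many collisions
ATTEMPT_LIMIT = 16   # the frame is discarded after this many collisions


class Outcome(Enum):
    SENT = "sent"
    COLLIDED = "collided"
    DISCARDED = "discarded"


class FakeMedium:
    """Hypothetical stand-in for the transceiver, for illustration only."""

    def __init__(self, collisions_to_inject=0):
        self.collisions_to_inject = collisions_to_inject
        self.waited_bits = 0   # bit times spent deferring or backing off
        self.driven_bits = 0   # bits placed on the wire, jam included

    def busy(self):
        return False           # this toy wire is always idle when sensed

    def wait(self, bit_times):
        self.waited_bits += bit_times

    def drive(self, bit):
        self.driven_bits += 1

    def collision(self):
        # Report one collision per injected collision, then stay quiet.
        if self.collisions_to_inject > 0:
            self.collisions_to_inject -= 1
            return True
        return False


def backoff_slots(collisions, rng=random):
    """Pick a wait, in slot times, after the given number of collisions."""
    exponent = min(collisions, BACKOFF_LIMIT)   # truncate the doubling at 10
    return rng.randrange(2 ** exponent)         # uniform over 0 .. 2^n - 1


def attempt_once(frame_bits, medium):
    """Run steps 1 to 4 for a single transmission attempt."""
    while medium.busy():             # 1. carrier sense: defer while busy
        medium.wait(1)
    medium.wait(IFG_BITS)            #    then honour the interframe gap
    for bit in frame_bits:           # 2. transmit
        medium.drive(bit)
        if medium.collision():       # 3. collision detected while sending
            for _ in range(JAM_BITS):
                medium.drive(1)      # 4. jam (placeholder bits), then stop
            return Outcome.COLLIDED
    return Outcome.SENT


def send_frame(frame_bits, medium, rng=random):
    """Retry with truncated binary exponential backoff (step 5)."""
    for collisions in range(1, ATTEMPT_LIMIT + 1):
        if attempt_once(frame_bits, medium) is Outcome.SENT:
            return Outcome.SENT
        if collisions == ATTEMPT_LIMIT:
            break                    # 16th collision: give up on this frame
        medium.wait(backoff_slots(collisions, rng) * SLOT_BITS)
    return Outcome.DISCARDED
```

Passing a seeded random.Random instance as rng makes a run repeatable, which is convenient when experimenting with the retry limits.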

Why Exponential Backoff Works

The key insight: under light load, few collisions occur and the small window (0 or 1 slot) is sufficient — retransmissions happen quickly with low probability of re-collision. Under heavy load (many nodes trying simultaneously), the window grows large enough that all contending nodes spread out over a long time interval, reducing the probability that any two of them pick the same slot.
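The effect of window size can be made concrete with a small calculation. The sketch below assumes a simplified model in which every contending node draws from the same window and no new traffic arrives. It computes the probability that exactly one node holds the earliest slot, so that it retries uncontested. The function name is illustrative.

```python
from fractions import Fraction


def p_unique_winner(nodes, window):
    """Probability that exactly one of `nodes` picks the smallest slot number.

    Each node draws independently and uniformly from 0 .. window - 1.
    """
    total = Fraction(0)
    for slot in range(window):
        # Chance that one particular other node picks a strictly later slot.
        later = Fraction(window - 1 - slot, window)
        # Any of the nodes may be the winner; all the others must be later.
        total += nodes * Fraction(1, window) * later ** (nodes - 1)
    return total


print(p_unique_winner(2, 2))             # two nodes, first-collision window
print(p_unique_winner(2, 4))             # two nodes, second-collision window
print(float(p_unique_winner(10, 2)))     # ten nodes, window far too small
print(float(p_unique_winner(10, 1024)))  # ten nodes, fully grown window
```

Under this model two nodes resolve a first collision half the time, while ten nodes sharing a two-slot window almost never do until the window has grown.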

Stability: A commonly quoted rule of thumb is that CSMA/CD with exponential backoff behaves well for loads up to about 50–60% of the theoretical channel capacity; treat this as an operating guideline rather than a proven bound. Above this utilization, collision rates rise, backoff times grow, and delivered throughput can fall sharply (a phenomenon often called “congestion collapse”). Well-designed networks keep utilization below 30–40% to maintain healthy headroom.

Channel efficiency: At low loads, CSMA/CD is very efficient — nodes transmit immediately without waiting. At high loads, efficiency drops due to collisions and backoff overhead. The maximum theoretical throughput of CSMA/CD is somewhat less than the raw channel bandwidth.

Alternatives to CSMA/CD

CSMA/CA (Collision Avoidance): Used in Wi-Fi (802.11). Because radio networks cannot detect collisions while transmitting (the transmitter overwhelms its own receiver), Wi-Fi nodes avoid collisions by waiting a random backoff period before transmitting, even if the channel appears idle. Since the sender cannot see a collision directly, it relies on link-layer acknowledgments to learn whether a frame got through. This reduces collision probability at the cost of higher average latency.

Token passing: A token circulates among all nodes. A node may only transmit while holding the token. After transmitting, it passes the token to the next node. No collisions possible. Used in token bus (IEEE 802.4), token ring (IEEE 802.5), and ARCNET. Fairer than CSMA/CD under heavy load but more complex to manage (what happens when the token holder crashes?).

Time-division multiple access (TDMA): A master clock divides time into slots and assigns specific slots to each node. Each node transmits only in its assigned slot. No collisions possible; deterministic latency (useful for real-time control). Requires global time synchronization.
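A fixed TDMA schedule reduces to simple arithmetic on the shared clock. The sketch below assumes equal-length slots assigned round-robin in list order; the function names and the microsecond units are illustrative choices, not part of any particular standard.

```python
def slot_owner(time_us, slot_us, node_ids):
    """Return the node that owns the slot containing time_us."""
    slot_index = (time_us // slot_us) % len(node_ids)  # wrap around each frame
    return node_ids[slot_index]


def may_transmit(node_id, time_us, slot_us, node_ids):
    """True if node_id is allowed to drive the bus at time_us."""
    return slot_owner(time_us, slot_us, node_ids) == node_id


nodes = ["A", "B", "C"]
for t in (0, 150, 250, 300):
    print(t, slot_owner(t, 100, nodes))
```

Every node must evaluate this against the same clock, which is why the scheme depends on global time synchronization.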

For a rebuilt network using an RS-485 bus (where collision detection is not available), polled access (master sends a poll to each node in turn; only the polled node may transmit) is the most reliable choice. It has higher latency than CSMA/CD but is completely deterministic and requires no collision recovery logic.
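A polling master can be sketched as a plain loop. Here exchange is a hypothetical callback, assumed to send a poll to one node and return its reply, or None if the node stays silent until a timeout. The serial I/O itself is left out.

```python
def poll_cycle(node_ids, exchange):
    """Poll each node once, in order, and collect the results.

    exchange(node_id) is a hypothetical callback that sends the poll and
    returns the reply bytes, or None if the node did not answer in time.
    """
    replies = {}
    silent = []
    for node_id in node_ids:
        reply = exchange(node_id)      # only this node may drive the bus now
        if reply is None:
            silent.append(node_id)     # record the timeout and move on
        else:
            replies[node_id] = reply
    return replies, silent


# Toy stand-in for the bus: node 3 never answers.
def fake_exchange(node_id):
    return None if node_id == 3 else bytes([node_id])


replies, silent = poll_cycle([1, 2, 3, 4], fake_exchange)
print(replies, silent)
```

Because the master alone decides who speaks, the worst-case cycle time is the number of nodes multiplied by the per-node timeout, which is what makes the scheme deterministic.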

Practical Collision Rate Monitoring

On a working Ethernet network, monitoring collision statistics reveals network health:

Low collision rate (<1% of frames): Normal. Individual collisions are resolved by retransmit and backoff.

Rising collision rate (1–10%): Network approaching saturation. Look for nodes that transmit very large volumes. Segment the network by adding a switch (which isolates collision domains).

High collision rate (>10%): Serious problem. Either the network is severely overloaded, or there is a faulty node (“jabber” — a node transmitting continuously or with corrupt framing), or a physical layer fault (bad terminator, damaged cable segment causing reflections that appear as collisions).
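These thresholds translate directly into a small health check. The sketch assumes the interface exposes a collision count and a transmitted frame count for the same interval; the function name and the returned labels are illustrative.

```python
def classify_collision_rate(collisions, frames):
    """Map collision counters onto the three bands described above."""
    if frames <= 0:
        raise ValueError("frames must be positive")
    rate = collisions / frames
    if rate < 0.01:
        return "normal"
    if rate <= 0.10:
        return "approaching saturation"
    return "serious problem"


for collisions in (5, 50, 200):
    print(collisions, classify_collision_rate(collisions, 1000))
```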

Diagnosing a faulty network: Connect an oscilloscope to the bus. During a collision, the voltage amplitude increases beyond the normal single-transmitter level. A jabbering node produces a near-continuous transmission that looks like carrier with no inter-frame gaps. A bad terminator produces ringing (oscillation) after each transmission. Each of these has a characteristic oscilloscope signature.

For an RS-485 polled network where collisions should never occur, any simultaneous bus driving (two nodes responding to the same poll, or a node transmitting outside its poll slot) indicates a firmware bug or a failed transceiver whose driver enable has stuck active. These can be diagnosed by measuring the A-B differential voltage: if it saturates or oscillates during what should be a quiet period, a driver is stuck enabled.
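The quiet-period check can be automated if differential samples are available. The sketch below assumes a list of A-B voltage readings taken while no node should be driving. The 1.5 V threshold and the 50% fraction are assumptions chosen for illustration and should be tuned to the actual transceivers and bus biasing.

```python
DRIVEN_THRESHOLD_V = 1.5   # assumed level that only an active driver produces
MIN_DRIVEN_FRACTION = 0.5  # assumed share of samples needed to raise a flag


def driver_stuck(quiet_samples_v,
                 threshold_v=DRIVEN_THRESHOLD_V,
                 min_fraction=MIN_DRIVEN_FRACTION):
    """Return True if the bus looks actively driven during a quiet period."""
    if not quiet_samples_v:
        raise ValueError("need at least one sample")
    # Count samples whose magnitude is too large for an idle, biased bus.
    driven = sum(1 for v in quiet_samples_v if abs(v) >= threshold_v)
    return driven / len(quiet_samples_v) >= min_fraction


print(driver_stuck([0.25, 0.30, 0.20, 0.28]))   # idle bus with bias only
print(driver_stuck([2.1, -2.0, 2.2, 2.1]))      # something is driving
```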