Code Structure
Code structure defines the rules and patterns of telegraph codes — how letters, numbers, and symbols are represented as sequences of electrical signals.
Why This Matters
A telegraph system is only as useful as the code it uses. The code determines how quickly an operator can send messages, how reliably those messages are received under difficult conditions, and how much traffic a single line can carry. A well-designed code minimizes the time needed to transmit common characters while remaining unambiguous under noise and operator variation. A poorly designed code is tedious to learn, slow to send, and prone to errors.
International Morse Code (the ITU standard) is the dominant telegraph code for good reasons. It was empirically optimized so that common letters in English have short representations: E is a single dot (·), T is a single dash (—), A is dot-dash (·—), and most other vowels and common consonants take two or three elements. Uncommon letters have longer sequences. This frequency matching noticeably reduces average message length compared with a code that gives every letter the same length.
Understanding code structure also reveals why different codes serve different purposes: the original American Morse code (railway code) uses different conventions than International Morse, telegraphic codes use numbers to represent whole phrases, and modern error-correcting codes use redundancy to detect and correct transmission errors. Each design reflects a different tradeoff between efficiency, reliability, and implementation complexity.
International Morse Code
International Morse Code (IMC, sometimes called Continental Morse) represents characters as sequences of dots (short marks) and dashes (long marks), with silence intervals between elements, between characters, and between words.
Timing ratios: a dot is the basic time unit (1 unit). A dash is 3 units. The gap between elements within a character is 1 unit. The gap between characters is 3 units. The gap between words is 7 units. These ratios, maintained consistently by a skilled operator, allow unambiguous decoding.
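The timing ratios above can be turned into a small calculator. This is a minimal sketch: the ASCII notation ('.' for dot, '-' for dash, one space between characters, ' / ' between words) and the name `duration_units` are assumptions made for the example, not part of any standard.

```python
# Lengths in timing units, taken from the ratios stated above.
MARK_UNITS = {".": 1, "-": 3}
ELEMENT_GAP = 1   # silence between marks inside one character
CHAR_GAP = 3      # silence between characters
WORD_GAP = 7      # silence between words


def duration_units(pattern: str) -> int:
    """Return the length of a Morse pattern in timing units.

    The gap that would follow the final character is not counted.
    An unknown symbol raises KeyError.
    """
    total = 0
    for w_index, word in enumerate(pattern.split(" / ")):
        if w_index > 0:
            total += WORD_GAP
        for c_index, char in enumerate(word.split()):
            if c_index > 0:
                total += CHAR_GAP
            marks = sum(MARK_UNITS[m] for m in char)
            total += marks + ELEMENT_GAP * (len(char) - 1)
    return total


paris = ".--. .- .-. .. ..."               # P A R I S
print(duration_units(paris))               # 43 units of keying
print(duration_units(paris) + WORD_GAP)    # 50 units with the trailing word gap
```

Counting the trailing word gap separately keeps the function usable for single characters as well as whole messages.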
The 26 letters, 10 digits, and common punctuation marks are each assigned unique Morse sequences. The structure assigns shorter sequences to more frequent characters:
- Single element: E (·), T (—)
- Two elements: A (·—), I (··), M (——), N (—·)
- Three elements: D (—··), G (——·), K (—·—), O (———), R (·—·), S (···), U (··—), W (·——)
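- Four elements: the remaining, mostly less common letters (B, C, F, H, J, L, P, Q, V, X, Y, Z)
- Five or six elements: digits (five each) and most punctuation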
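The grouping above can be reproduced from a lookup table. The sketch below covers letters only and uses ASCII '.' and '-' in place of the dot and dash symbols; the names `MORSE`, `encode`, and `letters_by_length` are chosen for this example.

```python
# International Morse letters ('.' = dot, '-' = dash). Digits and
# punctuation are left out to keep the sketch short.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}


def encode(text: str) -> str:
    """Encode letters-only text: spaces between characters, ' / ' between words."""
    words = text.upper().split()
    return " / ".join(" ".join(MORSE[ch] for ch in word) for word in words)


def letters_by_length() -> dict[int, list[str]]:
    """Group letters by the number of elements in their code."""
    groups: dict[int, list[str]] = {}
    for letter, code in MORSE.items():
        groups.setdefault(len(code), []).append(letter)
    return groups


print(encode("SOS"))                     # ... --- ...
for length, letters in sorted(letters_by_length().items()):
    print(length, "".join(letters))
```

Any character missing from the table raises `KeyError`, which is a deliberate choice here: silently dropping characters would hide mistakes in the input.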
The code can be memorized using mnemonics, but the most effective learning method is direct audio recognition — learning the sound of each character as a musical pattern rather than thinking in dots and dashes. “A sounds like dit-dah,” “B sounds like dah-dit-dit-dit,” and so on. Proficiency comes from thousands of repetitions until characters are recognized instantly.
Code Efficiency and Timing
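Morse code efficiency can be analyzed mathematically. The six most common English letters (E, T, A, O, I, N) make up roughly 50% of all text. In IMC these letters need only one to three elements each: E takes 1 timing unit, T and I take 3, A and N take 5, and O takes 11. The longest letters (J, Q, Y) take 13 units each. Weighted by English letter frequency, the average letter takes about 6 units of keying time, against about 8 units for a plain average over all 26 letters. The result: frequency matching shortens typical English text by roughly a fifth to a quarter, depending on whether inter-character gaps are counted, compared with the same sequences assigned without regard to frequency.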
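The comparison can be checked with a short calculation. This is a rough sketch: the frequency table holds approximate percentages for English (an assumption of the example, since published tables differ slightly between corpora), and the function names are illustrative.

```python
# International Morse letters ('.' = dot, '-' = dash).
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}

# Approximate English letter frequencies in percent (assumed values).
FREQ = {
    "E": 12.7, "T": 9.1, "A": 8.2, "O": 7.5, "I": 7.0, "N": 6.7, "S": 6.3,
    "H": 6.1, "R": 6.0, "D": 4.3, "L": 4.0, "C": 2.8, "U": 2.8, "M": 2.4,
    "W": 2.4, "F": 2.2, "G": 2.0, "Y": 2.0, "P": 1.9, "B": 1.5, "V": 1.0,
    "K": 0.8, "J": 0.15, "X": 0.15, "Q": 0.1, "Z": 0.07,
}


def units(code: str) -> int:
    """Keying time of one character: marks plus 1-unit gaps between them."""
    marks = sum(1 if mark == "." else 3 for mark in code)
    return marks + (len(code) - 1)


def weighted_average() -> float:
    """Mean units per letter, weighted by how often each letter occurs."""
    total = sum(FREQ.values())
    return sum(FREQ[ch] * units(MORSE[ch]) for ch in MORSE) / total


def plain_average() -> float:
    """Mean units per letter if all 26 letters were equally likely."""
    return sum(units(code) for code in MORSE.values()) / len(MORSE)


print(f"frequency-weighted: {weighted_average():.2f} units per letter")
print(f"unweighted:         {plain_average():.2f} units per letter")
```

Adding the 3-unit inter-character gap to both averages narrows the relative difference, which is why the saving depends on what is counted.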
Sending speed is measured in words per minute (WPM). A “word” in the standard measure is the five-character word “PARIS”, which is exactly 50 timing units long at standard spacing, including the 7-unit word gap that follows it. The dot duration in milliseconds is therefore 1200 divided by the speed in WPM: at 20 WPM the dot lasts 60 ms, and at 25 WPM, 48 ms. Skilled operators send 30–40 WPM; the fastest operators on record have copied at more than 75 WPM.
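The arithmetic behind these figures follows directly from the PARIS standard: one word is 50 units, so a speed in WPM fixes the length of one unit. A minimal sketch, with function names chosen for this example:

```python
UNITS_PER_WORD = 50  # "PARIS" plus the word gap that follows it


def dot_ms(wpm: float) -> float:
    """Duration of one timing unit (one dot) in milliseconds."""
    return 60_000 / (UNITS_PER_WORD * wpm)


def send_time_s(total_units: int, wpm: float) -> float:
    """Seconds needed to send a message of the given length in units."""
    return total_units * dot_ms(wpm) / 1000


print(dot_ms(20))             # 60.0
print(dot_ms(25))             # 48.0
print(send_time_s(500, 20))   # 30.0 (ten PARIS-length words at 20 WPM)
```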
Efficiency is also affected by the ratio of characters to message information. Radiogram procedures use three-letter Q codes as abbreviations. Sent with a question mark, a Q code asks a question: QTH? = “what is your location?”, QRM? = “are you being interfered with?”, QRN? = “are you troubled by static?”. Sent alone, the same code gives the answer or makes the statement (QTH = “my location is ...”). There are hundreds more. Q codes and prosigns (procedural signals) compress common phrases dramatically, increasing effective throughput without increasing sending speed.
Prosigns and Procedural Signals
Prosigns are special character combinations sent as single characters (without the inter-character space) that signal procedural meanings rather than text content.
- AR (·—·—·): “end of message”. Signals that the message text is complete.
- SK (···—·—): “end of contact”. Signals that the contact is complete and the station is signing off.
- KN (—·——·): “go ahead, specific station”. Only the called station should reply.
- K (—·—): “go ahead”. Any station may reply. This is the ordinary letter K used as a procedural signal.
- BT (—···—): “text separator”. Separates message header from text in formal traffic.
- AS (·—···): “wait”. Requests a pause.
- DE (—·· ·): “this is”. Precedes station identification. Unlike the prosigns above, it is sent as the two ordinary letters D and E with the normal gap between them.
In message handling, the format is: callsign DE callsign K (or KN). A complete contact might be: “W2ABC DE W5XYZ K” (W2ABC, this is W5XYZ, over) — recognized instantly by any trained operator anywhere in the world.
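Because a prosign is simply its component letters run together without the inter-character gap, its pattern can be derived from the letter table. The sketch below uses ASCII '.' and '-' and includes only the letters needed; `prosign` and `spelled` are names made up for this example.

```python
# Only the letters needed for the signals discussed above.
LETTERS = {
    "A": ".-", "B": "-...", "D": "-..", "E": ".", "K": "-.-",
    "N": "-.", "R": ".-.", "S": "...", "T": "-",
}


def prosign(letters: str) -> str:
    """Run the letters together with no inter-character gap."""
    return "".join(LETTERS[ch] for ch in letters)


def spelled(letters: str) -> str:
    """Send the letters separately, with a gap (shown as a space) between them."""
    return " ".join(LETTERS[ch] for ch in letters)


for name in ("AR", "SK", "KN", "BT", "AS"):
    print(name, prosign(name))
print("DE", spelled("DE"))   # two separate letters, not a run-together prosign
```

Deriving the patterns this way is a useful check on any written table, since a run-together pattern that matches an ordinary letter (for example, one equal to C or B) points to a transcription error.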
Cipher and Coded Messages
Plain language messages are limited in efficiency by the need to spell everything out. Commercial telegraph codes replaced common phrases, addresses, shipping terms, and entire contract clauses with five-letter or five-digit code groups. The 1903 ABC Code book, for example, contained 100,000 code groups each representing a complete sentence. A merchant could send a 10-group coded message that conveyed what would otherwise require 200 words.
For a post-collapse community, developing a local code book for common traffic makes sense. Standard phrases — “requesting assistance,” “situation is under control,” “supplies needed: [item list follows],” “casualties: [number follows]” — can be encoded as short code groups. The code book must be distributed to all stations in advance and kept secure from adversaries if operational security matters.
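A local code book is, at its core, a two-way lookup table. In the sketch below the phrases come from the paragraph above, but the five-letter groups are invented for illustration, as are the function names; a real book would be agreed and distributed by the stations involved.

```python
# Hypothetical code book: phrase -> five-letter code group (groups invented).
CODE_BOOK = {
    "REQUESTING ASSISTANCE": "ABEKO",
    "SITUATION IS UNDER CONTROL": "BADUL",
    "SUPPLIES NEEDED": "CODIM",
    "CASUALTIES": "DUFAN",
}

# Reverse table for the receiving station.
DECODE_BOOK = {group: phrase for phrase, group in CODE_BOOK.items()}
if len(DECODE_BOOK) != len(CODE_BOOK):
    raise ValueError("every code group must map to exactly one phrase")


def encode_message(phrases: list[str]) -> str:
    """Replace each phrase with its code group; unknown phrases raise KeyError."""
    return " ".join(CODE_BOOK[p.upper()] for p in phrases)


def decode_message(groups: str) -> list[str]:
    """Look up each received group; unknown groups raise KeyError."""
    return [DECODE_BOOK[g] for g in groups.split()]


sent = encode_message(["requesting assistance", "supplies needed"])
print(sent)                   # ABEKO CODIM
print(decode_message(sent))   # ['REQUESTING ASSISTANCE', 'SUPPLIES NEEDED']
```

Failing loudly on an unknown group is intentional: a garbled group should prompt a request for a repeat rather than a guess. Variable items such as numbers or item lists would follow their code group in plain text.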
Cipher (encrypted) messages use codes designed to conceal meaning from unauthorized readers. Simple substitution ciphers (replace each letter with another) are weak: letter-frequency analysis can break them in minutes given a modest amount of ciphertext. One-time pad systems (using a randomly generated key as long as the message, discarded after use) are provably unbreakable, but only if the key is truly random, kept secret, and never reused, and they require secure pre-distribution of key material. For field use, a shared code book serves both efficiency and modest security.
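A letters-only one-time pad can be sketched as addition modulo 26. This is an illustration of the principle rather than a field-ready system: it assumes messages contain only A to Z, it uses Python's `secrets` module as the random source, and the function names are made up for the example.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # A-Z only; spaces and digits are not handled


def make_pad(length: int) -> str:
    """Generate random key letters from a cryptographic random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def encrypt(plaintext: str, pad: str) -> str:
    """Add each key letter to the matching plaintext letter, modulo 26."""
    if len(pad) < len(plaintext):
        raise ValueError("pad must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, pad)
    )


def decrypt(ciphertext: str, pad: str) -> str:
    """Subtract each key letter from the matching ciphertext letter, modulo 26."""
    if len(pad) < len(ciphertext):
        raise ValueError("pad must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, pad)
    )


print(encrypt("HELLO", "XMCKL"))   # EQNVZ
print(decrypt("EQNVZ", "XMCKL"))   # HELLO
```

The same addition and subtraction can be done by hand with a letter table, which matters when no computer is available. Each pad sheet must be destroyed after one use.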
Error Detection and Correction
Morse code has no built-in error correction. An error produces a wrong letter that the receiver does not know is wrong — unless context reveals it. Standard practice for error correction: if the sender makes a mistake, send the error signal (eight dots: ········) to alert the receiver that the preceding character was wrong, then immediately resend the correct character. The receiver erases the wrong character and substitutes the correct one.
For formal message traffic, the receiver reads back the received text and the sender confirms accuracy (“Roger, your copy is correct”) or sends corrections. This confirmation procedure catches transcription errors and ensures both parties have identical copies.
Error detection by redundancy: in critical applications, messages can be sent twice. The receiver checks both copies for agreement; discrepancies reveal errors, although two copies alone cannot show which one is correct. Alternatively, a message “check” (the number of words in the text portion of a radiogram) is sent in the message header, and the receiver verifies that their copy has the same word count before confirming receipt.
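Both checks described above are simple comparisons. The sketch below assumes words are separated by whitespace, which is a simplification of real radiogram counting rules, and the function names are illustrative.

```python
from itertools import zip_longest


def check_ok(header_check: int, text: str) -> bool:
    """True if the received text has the word count given in the header."""
    return len(text.split()) == header_check


def disagreements(copy_a: str, copy_b: str) -> list[tuple]:
    """Compare two received copies word by word.

    Returns (position, word_a, word_b) for every position where the copies
    differ. A missing word in the shorter copy is reported as None.
    """
    pairs = zip_longest(copy_a.split(), copy_b.split())
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]


first = "SEND FOOD AND WATER NOW"
second = "SEND FOOT AND WATER NOW"
print(check_ok(5, first))             # True
print(disagreements(first, second))   # [(1, 'FOOD', 'FOOT')]
```

A word count catches dropped or added words but not a wrong word of the right length, while the two-copy comparison catches both. Neither says which version is right, so a flagged position still needs a repeat from the sender.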
Modern digital systems add systematic error control to the message stream. Plain RTTY and BPSK31 carry none, but later modes such as QPSK31, MFSK16, and Olivia use forward error correction: the transmitter sends additional check bits derived from the data, and the receiver uses these to detect and correct errors without retransmission. ARQ protocols such as PACTOR, widely used to carry Winlink email, instead detect damaged blocks with a checksum and request a repeat automatically. The underlying principle, that redundancy enables error detection and correction, is a foundational concept of information theory with applications far beyond telegraphy.
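The simplest way to see redundancy correcting errors is a repetition code: send every bit three times and let the receiver take a majority vote. This is a teaching sketch only; it is not the scheme used by any of the modes named above, which rely on far more efficient codes.

```python
def encode_repeat(bits: list[int], n: int = 3) -> list[int]:
    """Repeat every bit n times."""
    return [b for b in bits for _ in range(n)]


def decode_repeat(received: list[int], n: int = 3) -> list[int]:
    """Recover each bit by majority vote over its n copies."""
    if len(received) % n != 0:
        raise ValueError("received length must be a multiple of n")
    return [
        1 if sum(received[i:i + n]) > n // 2 else 0
        for i in range(0, len(received), n)
    ]


data = [1, 0, 1, 1]
sent = encode_repeat(data)    # [1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]

# Simulate noise: flip one copy in the first group and one in the second.
noisy = sent.copy()
noisy[1] ^= 1
noisy[4] ^= 1

print(decode_repeat(noisy))   # [1, 0, 1, 1]
```

With three copies, any single flipped bit per group is corrected, at the cost of tripling the transmission time. Two flips in the same group defeat the vote, which is the tradeoff between redundancy and throughput that practical codes are designed to improve.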