Historical Ciphers

1. Secure communication and secure storage

The classical setting is two parties, Alice and Bob, who want to communicate securely despite the presence of an eavesdropper. Historically this corresponds to wiretapping; today it covers digital communication systems, which are usually encrypted by default at some layer at least. The lecturer then extends the notion of a “channel” from space to time: storing data securely is essentially communicating with one’s future self. This is why storage encryption matters as much as communication encryption, especially in the presence of malware or theft of stored secrets such as cryptocurrency keys.

2. Early examples: steganography and the scytale

The lecture first revisits an ancient example involving wax tablets. A hidden message could be scratched into the wooden layer underneath the wax, and an ordinary visible message could be written on the wax surface. This is not encryption in the modern sense, but steganography: hiding the existence of the message itself. The weakness is obvious once the technique becomes known—if attackers know hidden writing may exist under the wax, the concealment loses its effectiveness.

Another historical device is the scytale from ancient Greece. A strip of leather is wound around a rod of a certain diameter, and the message is written across the wound strip. Once unwound, the letters appear scrambled. To read it, one needs a rod of the correct diameter. This is presented as an early example of a keyed encryption scheme, where the “key” is effectively the correct staff size. The lecturer also notes that this still has some steganographic flavor, since a belt with random-looking letters might not immediately look like a ciphertext.

3. Why modern cryptography needs a formal model

The lecture then shifts from historical anecdotes to formalization. An encryption scheme consists of an encryption algorithm that maps a message to a ciphertext and a decryption algorithm that maps a ciphertext back to the message. But if the scheme is just a fixed public pair of algorithms, then anyone who knows the method can decrypt messages themselves. Therefore, modern encryption schemes introduce a secret key used by both encryption and decryption. The algorithm may be public, but the key must remain secret. This idea is central and foreshadows a later discussion of Kerckhoffs’s principle.
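In symbols, the keyed model can be written as follows. The notation is an assumption chosen for illustration and is not taken from the lecture: \(\mathcal{K}\), \(\mathcal{M}\), and \(\mathcal{C}\) denote the key, message, and ciphertext spaces.

\[
\mathrm{Enc} : \mathcal{K} \times \mathcal{M} \to \mathcal{C}, \qquad
\mathrm{Dec} : \mathcal{K} \times \mathcal{C} \to \mathcal{M}, \qquad
\mathrm{Dec}(k, \mathrm{Enc}(k, m)) = m \quad \text{for all } k \in \mathcal{K},\ m \in \mathcal{M}.
\]

The last condition is correctness: decrypting with the same key always returns the original message.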

4. Substitution ciphers

The lecture next introduces substitution ciphers, among the oldest known ciphers. The key is a permutation of the alphabet, and encryption replaces each plaintext character by its image under that permutation. Decryption simply reverses the substitution. The lecturer briefly discusses the problem of choosing a random permutation uniformly and mentions the Fisher–Yates algorithm as the correct method for unbiased sampling, contrasting this with naive manual approaches.
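A minimal sketch of a substitution cipher with Fisher–Yates key generation might look like this. It assumes a lowercase 26-letter alphabet and leaves all other characters unchanged; the function names are hypothetical, and the use of Python's `secrets` module for randomness is a choice made here, not something stated in the lecture.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase


def random_permutation(alphabet: str = ALPHABET) -> str:
    """Fisher-Yates shuffle: every permutation of the alphabet is equally likely."""
    letters = list(alphabet)
    for i in range(len(letters) - 1, 0, -1):
        j = secrets.randbelow(i + 1)             # uniform index in 0..i (inclusive)
        letters[i], letters[j] = letters[j], letters[i]
    return "".join(letters)


def substitute_encrypt(plaintext: str, key: str) -> str:
    """Replace each letter by its image under the permutation `key`."""
    return plaintext.translate(str.maketrans(ALPHABET, key))


def substitute_decrypt(ciphertext: str, key: str) -> str:
    """Apply the inverse permutation."""
    return ciphertext.translate(str.maketrans(key, ALPHABET))


key = random_permutation()
ciphertext = substitute_encrypt("meet me at noon", key)
print(key, ciphertext, substitute_decrypt(ciphertext, key))
```

The important detail is that index `j` is drawn from the range `0..i`, not from the whole alphabet at every step; drawing from the whole range each time is a common way to end up with a biased shuffle.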

A substitution cipher seems strong at first glance because the key space is very large: with 26 letters, there are \(26!\) possible keys, roughly \(4 \times 10^{26}\) or about \(2^{88}\). The lecturer emphasizes that this is a huge number, especially in a pre-digital world where all computation must be done by hand. Still, despite the large key space, the cipher is insecure for deeper statistical reasons that will appear later.

The lecture also distinguishes monoalphabetic substitution from polyalphabetic substitution. In a monoalphabetic substitution cipher, plaintext and ciphertext are over the same alphabet and the same substitution is used everywhere; in a polyalphabetic cipher, the substitution changes with the position in the text. The instructor uses an XKCD comic to illustrate how weak monoalphabetic substitution becomes when the alphabet is too small: if the underlying “alphabet” is effectively binary, only two substitutions exist (the identity and the swap of the two symbols), so the cipher offers almost no security.

5. Caesar cipher and shift cipher

The Caesar cipher is introduced as a special case of substitution ciphers. Instead of an arbitrary permutation, each letter is shifted by a fixed amount in the alphabet, with wraparound modulo 26. In the classical Caesar cipher the shift is 3, though one can generalize this to a shift cipher where the shift value itself is the key. Encryption and decryption are simple cyclic shifts.

The problem is that the key space is tiny: there are only 26 distinct shifts (0 through 25), because shifting by 26 is identical to shifting by 0, and a shift of 0 leaves the text unchanged. This makes brute-force decryption trivial even by hand. The Caesar cipher is therefore a clear example of a scheme whose insecurity follows directly from the small size of the key space.
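The shift cipher and its brute-force attack can be sketched in a few lines. This is an illustrative sketch assuming lowercase letters, with other characters passed through unchanged; the function names are hypothetical.

```python
import string

ALPHABET = string.ascii_lowercase


def shift_encrypt(text: str, shift: int) -> str:
    """Shift every lowercase letter by `shift` positions, wrapping modulo 26."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)                       # leave spaces and punctuation alone
    return "".join(out)


def shift_decrypt(text: str, shift: int) -> str:
    """Decryption is encryption with the opposite shift."""
    return shift_encrypt(text, -shift)


def brute_force(ciphertext: str) -> list[tuple[int, str]]:
    """Try all 26 keys; a human (or a dictionary check) picks the readable one."""
    return [(shift, shift_decrypt(ciphertext, shift)) for shift in range(26)]


ciphertext = shift_encrypt("attack at dawn", 3)  # classical Caesar shift
for shift, candidate in brute_force(ciphertext):
    print(shift, candidate)
```

Only one of the 26 printed candidates is readable English, which is all the attacker needs.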

6. Vigenère cipher

The lecture then moves to the Vigenère cipher, presented as a more sophisticated variant of the shift cipher. Instead of one fixed shift applied to every character, the key is a string of letters, and each position of the plaintext is shifted according to the corresponding key character. If the plaintext is longer than the key, the key is repeated periodically. The lecturer translates characters into numbers 0–25, adds plaintext and key values modulo 26, and then converts the results back into letters. Encryption thus reduces to simple modular arithmetic: for a key of length \(n\), the \(i\)-th ciphertext value is \(c_i = (m_i + k_{i \bmod n}) \bmod 26\).
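The letter-to-number arithmetic described above can be sketched as follows. The sketch assumes that plaintext and key consist only of lowercase letters a to z; the function names are hypothetical.

```python
def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Add the (periodically repeated) key to the plaintext, letter by letter, mod 26."""
    a = ord("a")
    return "".join(
        chr((ord(p) - a + ord(key[i % len(key)]) - a) % 26 + a)
        for i, p in enumerate(plaintext)
    )


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Subtract the repeated key from the ciphertext, letter by letter, mod 26."""
    a = ord("a")
    return "".join(
        chr((ord(c) - ord(key[i % len(key)])) % 26 + a)
        for i, c in enumerate(ciphertext)
    )


ciphertext = vigenere_encrypt("attackatdawn", "lemon")
print(ciphertext)                                # lxfopvefrnhr
print(vigenere_decrypt(ciphertext, "lemon"))     # attackatdawn
```

A key of length 1 reduces this to the shift cipher, which is why each key position can later be attacked separately.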

An advantage of the Vigenère cipher is that key generation is easy: one simply picks a random string of length \(n\). This gives \(26^n\) possible keys, which can be quite large if \(n\) is even moderately big. Historically, this made the cipher appear very strong, and it was regarded for centuries as effectively “indecipherable.” The lecture notes that it was used widely by diplomats and militaries for roughly 350 to 400 years.

7. Cryptanalysis begins: why key size alone is not enough

At this point the lecture turns to cryptanalysis, stressing that cryptography is not just about inventing schemes but about critically evaluating whether they truly achieve their claimed guarantees. The scytale leaks physical clues such as the curvature of the leather strip. The Caesar and shift ciphers fail because the key space is too small. The more interesting question is why ciphers like substitution and Vigenère, with much larger key spaces, are still breakable.

This leads to the key idea that ciphertext can preserve statistical information about plaintext. Human languages are highly redundant: letters do not occur uniformly, and common words and patterns appear much more often than others. This redundancy makes communication robust for humans, but it also gives attackers structure to exploit. The lecture uses English as the main example, noting that E is by far the most frequent letter, and that ordinary English text samples display character frequencies close to the overall language distribution.

8. Frequency analysis of Vigenère

The lecturer explains how to attack the Vigenère cipher using letter-frequency statistics. The main challenge is that the attacker does not initially know the key length, but they can guess it or try increasing values. In practice, very long keys are inconvenient for humans to use manually, so plausible lengths are limited. Once a key length is guessed, the ciphertext is split into several substrings, one for each position modulo the key length. Each substring is effectively a text encrypted under a simple shift cipher with one fixed shift.

The central heuristic is that if the original text is English, then each of these regularly spaced substrings still roughly preserves the character-frequency distribution of English. Therefore, for each substring, one can compute its frequency histogram and compare it with the English frequency histogram under all possible cyclic shifts. The shift that aligns the two best is likely the correct key character. Repeating this for all substrings yields all key shifts, after which the substrings can be decrypted and recombined into the plaintext. In the classroom example, this process successfully recovers the phrase “Star Wars” as the plaintext.
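The attack on a guessed key length can be sketched as below. Several things here are assumptions made for illustration: the frequency table holds approximate, commonly cited English percentages rather than values from the lecture, the "alignment" between histograms is measured with a simple dot product (other scores such as chi-squared are also common), and the ciphertext is assumed to contain only lowercase letters. The function names are hypothetical.

```python
from collections import Counter

# Approximate English letter frequencies in percent, for a..z.
# Assumption: commonly cited values, not taken from the lecture.
ENGLISH_PERCENT = [
    8.2, 1.5, 2.8, 4.3, 12.7, 2.2, 2.0, 6.1, 7.0, 0.15, 0.77, 4.0, 2.4,
    6.7, 7.5, 1.9, 0.095, 6.0, 6.3, 9.1, 2.8, 0.98, 2.4, 0.15, 2.0, 0.074,
]


def best_shift(column: str) -> int:
    """Return the shift under which the column's histogram best matches English."""
    counts = Counter(column)

    def score(shift: int) -> float:
        # Plaintext letter i shows up in the ciphertext as letter (i + shift) mod 26.
        return sum(
            ENGLISH_PERCENT[i] * counts[chr((i + shift) % 26 + ord("a"))]
            for i in range(26)
        )

    return max(range(26), key=score)


def recover_key(ciphertext: str, key_length: int) -> str:
    """Split the ciphertext by position modulo key_length and solve each shift."""
    return "".join(
        chr(best_shift(ciphertext[i::key_length]) + ord("a"))
        for i in range(key_length)
    )


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Subtract the repeated key from the ciphertext, letter by letter, mod 26."""
    return "".join(
        chr((ord(c) - ord(key[i % len(key)])) % 26 + ord("a"))
        for i, c in enumerate(ciphertext)
    )
```

On short ciphertexts the histograms are noisy, so a recovered key may be wrong in some positions. In practice one would try several key lengths and check whether the resulting plaintext is readable.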

The lecturer also answers a question about what happens if the key is as long as the message. In that case, each substring would contain essentially only one sample, so frequency analysis no longer works. However, if such a long key is reused, the situation becomes vulnerable again because reusing a long key across multiple messages recreates exploitable structure. This observation anticipates later discussion in the course.

9. Higher-order statistics: bigrams, trigrams, and n-grams

The lecture next generalizes from single-letter frequencies to higher-order statistics. Natural language has common bigrams like “th” and “he,” and common trigrams like “the.” These patterns can also leak through weak ciphers. The instructor gives a historical side remark about why old printed English sometimes used “ye,” relating it to printing technology, the loss of the older English thorn character, and substitution with letters available in continental printing systems. But the cryptographic point is that language has highly nonuniform local structure, and attackers can exploit not only single-character frequencies but also frequent short sequences.
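Counting n-grams is a small extension of counting single letters. The following sketch assumes that non-letters are dropped before sliding a window over the text, so n-grams may span word boundaries; the function name is hypothetical.

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count all length-n windows over the letters of `text`."""
    letters = "".join(ch for ch in text.lower() if ch.isalpha())
    return Counter(letters[i:i + n] for i in range(len(letters) - n + 1))


sample = "the theory that the weather there is rather thin"
print(ngram_counts(sample, 2).most_common(3))    # bigrams
print(ngram_counts(sample, 3).most_common(3))    # trigrams
```

Under a monoalphabetic substitution, these counts survive encryption with the letters relabeled, so the most frequent ciphertext bigrams and trigrams are good guesses for “th”, “he”, and “the”.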

10. Plaintext recovery versus message distinguishing

The lecture then discusses attack goals. Recovering the exact plaintext from a ciphertext is an ambitious goal, but an attacker often needs less. If the attacker already knows substantial context, they may only need to determine which of a few candidate messages was encrypted. An example is given where a military commander knows that an intercepted message is either “run for the hills” or “press the attack.” Under a substitution cipher, the identity of each letter changes, but the frequency profile of the message is preserved: the multiset of letter counts is the same before and after encryption. By comparing the ciphertext’s sorted letter counts with those of the candidate plaintexts, one can eliminate inconsistent candidates without fully decrypting the text. In the example, only one candidate matches, so the attacker learns the intended meaning.

This is an important conceptual step: a cipher may fail even when full plaintext recovery is hard, because leaking enough information to distinguish between plausible messages can already be disastrous.
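The distinguishing attack on the two candidate messages can be sketched as follows. The sketch assumes a monoalphabetic substitution over lowercase letters and ignores spaces; the function names and the example key are hypothetical.

```python
from collections import Counter
import string


def frequency_profile(text: str) -> list[int]:
    """Sorted letter counts; unchanged by any monoalphabetic substitution."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    return sorted(counts.values(), reverse=True)


def consistent_candidates(ciphertext: str, candidates: list[str]) -> list[str]:
    """Keep only the candidates whose profile matches the ciphertext's profile."""
    target = frequency_profile(ciphertext)
    return [m for m in candidates if frequency_profile(m) == target]


# Hypothetical substitution key, used only to produce a sample ciphertext.
key = "qwertyuiopasdfghjklzxcvbnm"
table = str.maketrans(string.ascii_lowercase, key)
intercepted = "press the attack".translate(table)

print(frequency_profile("run for the hills"))    # [2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1]
print(frequency_profile("press the attack"))     # [3, 2, 2, 2, 1, 1, 1, 1, 1]
print(consistent_candidates(intercepted, ["run for the hills", "press the attack"]))
```

Both candidates contain 14 letters, so the letter count alone does not separate them, but one has a letter that occurs three times and the other does not. The attacker never learns the key.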

11. Enigma machine

The lecture then briefly introduces the Enigma machine, used by Germany in World War II. Enigma is described as a more complex form of substitution cipher implemented mechanically and electrically. A pressed key sends an electrical signal through a configurable plugboard, through a set of rotating rotors, to a reflector, and then back to light up a ciphertext letter. Because the rotors move between keypresses, the substitution changes over time, making it a polyalphabetic system.

The lecturer explains that part of the secret lies in the rotor configuration and their initial positions, and that Allied operations even attempted to seize German machines and rotor sets because possessing those components meant compromising significant key material. A key structural weakness is also highlighted: because of the reflector-based wiring, a letter can never encrypt to itself. Pressing A cannot produce A as the visible output. This seemingly small structural property leaks information about plaintext and was one of the factors that enabled successful cryptanalysis when combined with knowledge of the machine’s design.
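The no-self-encryption property follows from the signal path alone and can be checked with a toy model. The sketch below is not the real Enigma wiring: it collapses the plugboard and rotors at one keypress into a single random permutation, and it assumes only that the reflector swaps letters in pairs with no letter mapped to itself. All names are hypothetical.

```python
import random
import string

ALPHABET = string.ascii_uppercase


def random_reflector(rng: random.Random) -> dict[str, str]:
    """Pair up the 26 letters; each letter maps to its partner, never to itself."""
    letters = list(ALPHABET)
    rng.shuffle(letters)
    mapping = {}
    for a, b in zip(letters[0::2], letters[1::2]):
        mapping[a], mapping[b] = b, a
    return mapping


def toy_keypress_substitution(rng: random.Random) -> dict[str, str]:
    """Signal path at one keypress: forward wiring, reflector, same wiring backward."""
    forward = dict(zip(ALPHABET, rng.sample(ALPHABET, 26)))
    backward = {out: inp for inp, out in forward.items()}
    reflector = random_reflector(rng)
    return {c: backward[reflector[forward[c]]] for c in ALPHABET}


substitution = toy_keypress_substitution(random.Random(2024))
print(any(substitution[c] == c for c in ALPHABET))        # False: no fixed points
print(all(substitution[substitution[c]] == c for c in ALPHABET))  # True: self-inverse
```

A letter could map to itself only if the reflector returned the signal on the wire it arrived on, which the pairing rules out. An attacker who guesses a likely plaintext word can therefore discard every alignment in which some guessed letter coincides with the ciphertext letter at the same position.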

12. Main lessons from historical ciphers

Toward the end, the instructor extracts several general lessons from these historical examples.

First, ciphers without keys or with very small key spaces are fundamentally insecure. A keyless scheme cannot recover from compromise once its method is known, and a small key space can simply be searched exhaustively.

Second, even ciphers with large key spaces can fail if the ciphertext preserves too much statistical structure from the plaintext, such as single-letter frequencies, n-gram patterns, or message profiles. These leaks are exactly what classical cryptanalysis exploits.

Third, some schemes are so weak that obtaining even a single plaintext–ciphertext pair can immediately reveal the key. The Vigenère cipher is mentioned here: if one knows both plaintext and ciphertext under the same repeated-key scheme, and the pair is at least as long as the key, then subtracting the plaintext from the ciphertext letter by letter modulo 26 yields the key.
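The known-plaintext attack on Vigenère amounts to one subtraction per letter. The sketch below assumes lowercase letters only and a pair at least as long as the key; the function names are hypothetical.

```python
def recover_keystream(plaintext: str, ciphertext: str) -> str:
    """Subtract plaintext from ciphertext mod 26 to expose the repeated key."""
    return "".join(
        chr((ord(c) - ord(p)) % 26 + ord("a"))
        for p, c in zip(plaintext, ciphertext)
    )


def shortest_period(stream: str) -> str:
    """Return the shortest prefix whose repetition reproduces the whole stream."""
    for n in range(1, len(stream) + 1):
        if all(stream[i] == stream[i % n] for i in range(len(stream))):
            return stream[:n]
    return stream


stream = recover_keystream("attackatdawn", "lxfopvefrnhr")
print(stream)                    # lemonlemonle
print(shortest_period(stream))   # lemon
```

Once the key is known, every other message encrypted under it can be read as well.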

13. Kerckhoffs’s principle

The lecture concludes with Kerckhoffs’s principle, a foundational design philosophy from the late nineteenth century: the security of a cipher must not depend on keeping the method secret; only the key should need to remain secret. The instructor quotes the principle in essence: a cipher should remain usable even if it falls into the enemy’s hands. Schemes that rely on secrecy of design are practicing security by obscurity, which is considered poor cryptographic design. The reason is practical and conceptual: it is always much easier to replace a compromised key than to replace a compromised design.