Bartosz Naskręcki
Elements of Cryptanalysis — Chapters 1–9
Foundations • Polyalphabetic Ciphers • Polygraphic Ciphers
Formal definitions • Substitution ciphers • Frequency analysis
"The enemy knows the system being used."
— Claude Shannon (1949)
| Attack Model | Abbr. | Adversary's Capabilities |
|---|---|---|
| Ciphertext-only | COA | Observes ciphertexts \(c_1, c_2, \ldots\) |
| Known-plaintext | KPA | Knows some pairs \((m_i, c_i)\) |
| Chosen-plaintext | CPA | Can choose \(m_i\), obtains \(c_i = E_k(m_i)\) |
| Chosen-ciphertext | CCA | Can choose \(c_i\), obtains \(m_i = D_k(c_i)\) |
| Related-key | RKA | Encryptions under keys related to \(k\) |
COA \(\subset\) KPA \(\subset\) CPA \(\subset\) CCA (increasing adversary power)
Proof: closure \(\sigma_j \circ \sigma_k = \sigma_{(j+k) \bmod n}\), identity \(\sigma_0\), inverse \(\sigma_k^{-1} = \sigma_{n-k}\), generator \(\sigma_1\). Hence the shifts form a cyclic group of order \(n\).
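The four facts in the proof can be checked mechanically. The sketch below is one possible way to do so; the helper names `shift`, `compose`, and `check_cyclic_group` are illustrative and not part of the course material.

```python
from itertools import product

N = 26  # alphabet size (assumed; any n >= 1 works)


def shift(k, n=N):
    """Return sigma_k as a tuple: position x maps to (x + k) mod n."""
    return tuple((x + k) % n for x in range(n))


def compose(p, q):
    """Return the permutation 'p after q', i.e. x -> p[q[x]]."""
    return tuple(p[q[x]] for x in range(len(q)))


def check_cyclic_group(n=N):
    """Verify closure, identity, inverses, and generation by sigma_1."""
    identity = tuple(range(n))
    closure = all(
        compose(shift(j, n), shift(k, n)) == shift((j + k) % n, n)
        for j, k in product(range(n), repeat=2)
    )
    inverses = all(
        compose(shift(k, n), shift(n - k, n)) == identity for k in range(n)
    )
    # Repeatedly composing with sigma_1 must visit all n shifts.
    generated, current = set(), identity
    for _ in range(n):
        generated.add(current)
        current = compose(shift(1, n), current)
    return closure and shift(0, n) == identity and inverses and len(generated) == n
```

Calling `check_cyclic_group(26)` runs all 676 compositions, which is cheap enough to serve as a sanity check.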
Although \(26! \approx 4 \times 10^{26}\) keys make brute force infeasible, substitution ciphers preserve letter frequencies:
\(\sigma\) relabels letters but does not alter their frequency distribution.
Al-Kindi's insight (c. 850 AD): compare ciphertext frequencies to known language frequencies to recover \(\sigma\).
This motivates Chapter 3: formal frequency analysis.
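A small experiment illustrates the relabelling claim. This is a minimal sketch: the random key, the seed, and the sample message are assumptions chosen for illustration only.

```python
import random
import string
from collections import Counter


def random_substitution(seed=0):
    """Return a random permutation sigma of A-Z as a dict (seed is illustrative)."""
    letters = list(string.ascii_uppercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(letters, shuffled))


def encrypt(plaintext, sigma):
    """Apply sigma letter by letter; characters outside A-Z are dropped."""
    return "".join(sigma[ch] for ch in plaintext.upper() if ch in sigma)


def frequency_profile(text):
    """Return the letter counts in descending order, with the labels discarded."""
    return sorted(Counter(text).values(), reverse=True)


message = "THE ENEMY KNOWS THE SYSTEM BEING USED"
sigma = random_substitution()
ciphertext = encrypt(message, sigma)
plain_letters = "".join(ch for ch in message if ch.isalpha())

# The counts agree once labels are ignored: sigma only renames the letters.
same_profile = frequency_profile(ciphertext) == frequency_profile(plain_letters)
```

The most frequent ciphertext letter is therefore a strong candidate for \(\sigma(\text{E})\), which is the starting point of the frequency attack.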
| E | T | A | O | I | N | S | H | R | D | L | U |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 12.7% | 9.1% | 8.2% | 7.5% | 7.0% | 6.7% | 6.3% | 6.1% | 6.0% | 4.3% | 4.0% | 2.8% |
Only 26 candidate keys need testing: an \(O(n)\) attack in the alphabet size \(n\), and the number of candidates does not grow with message length.
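The exhaustive test over 26 shifts can be combined with a \(\chi^2\) score against English frequencies. The sketch below assumes a commonly quoted frequency table (the values are approximate and not taken from the course); the function names are illustrative.

```python
from collections import Counter

# Approximate English letter frequencies in percent (assumed reference table).
ENGLISH_FREQ = {
    "A": 8.17, "B": 1.49, "C": 2.78, "D": 4.25, "E": 12.70, "F": 2.23,
    "G": 2.02, "H": 6.09, "I": 6.97, "J": 0.15, "K": 0.77, "L": 4.03,
    "M": 2.41, "N": 6.75, "O": 7.51, "P": 1.93, "Q": 0.10, "R": 5.99,
    "S": 6.33, "T": 9.06, "U": 2.76, "V": 0.98, "W": 2.36, "X": 0.15,
    "Y": 1.97, "Z": 0.07,
}


def shift_text(text, k):
    """Shift every letter A-Z forward by k positions; drop other characters."""
    return "".join(
        chr((ord(ch) - 65 + k) % 26 + 65) for ch in text.upper() if "A" <= ch <= "Z"
    )


def chi_squared(text):
    """Chi-squared distance between observed letter counts and English expectations."""
    counts = Counter(text)
    total = len(text)
    return sum(
        (counts.get(letter, 0) - total * freq / 100) ** 2 / (total * freq / 100)
        for letter, freq in ENGLISH_FREQ.items()
    )


def crack_shift(ciphertext):
    """Try all 26 keys and return (key, plaintext) with the lowest chi-squared score."""
    best = min(range(26), key=lambda k: chi_squared(shift_text(ciphertext, -k)))
    return best, shift_text(ciphertext, -best)
```

On very short ciphertexts the lowest score may not correspond to the true key, so the ranking of all 26 candidates is often worth inspecting by eye.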
Monoalphabetic cryptanalysis • Vigenère & Kasiski • Index of Coincidence
Top English digrams: TH (3.6%), HE (3.1%), IN (2.4%), ER (2.1%), AN (2.0%)
Effective above:
200–500 characters (letter matching alone)
100–200 characters (with digram refinement)
Key insight:
Despite \(26! \approx 4 \times 10^{26}\) keys, statistical structure reduces the effective search to manageable size.
First discovered by Babbage (~1846, unpublished) and independently by Kasiski (1863).
Stage 1: Key length \(L\)
Stage 2: Key recovery
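Both stages can be sketched in a few lines. This is a simplified illustration, not the course's implementation: it takes the plain gcd of all repeat distances, which accidental repeats can spoil, so a practical version would rather tally common factors. All function names are assumptions.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def repeated_ngram_distances(ciphertext, n=3):
    """Return distances between consecutive occurrences of each repeated n-gram."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    distances = []
    for occurrences in positions.values():
        distances.extend(b - a for a, b in zip(occurrences, occurrences[1:]))
    return distances


def kasiski_guess(ciphertext, n=3):
    """Stage 1: gcd of all repeat distances (0 if no n-gram repeats)."""
    return reduce(gcd, repeated_ngram_distances(ciphertext, n), 0)


def split_streams(ciphertext, key_length):
    """Stage 2 input: stream j holds positions j, j+L, j+2L, ... of the ciphertext."""
    return [ciphertext[j::key_length] for j in range(key_length)]
```

Each stream returned by `split_streams` was encrypted with a single key letter, so it can be attacked as an ordinary shift cipher.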
The gap \(\kappa_p - \kappa_r \approx 0.028\) is the engine of the IoC method.
Proof sketch: With probability \(1/L\), two positions share the same key letter (coincidence rate \(\kappa_p\)); with probability \((L{-}1)/L\), different key letters (coincidence rate \(\kappa_r\)).
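Weighting the two cases in the proof sketch gives \(\kappa_p/L + \kappa_r (L-1)/L = (\kappa_p - \kappa_r)/L + \kappa_r\). The sketch below computes the observed index and this expected value; the constant \(\kappa_p = 0.0667\) is an assumed value for English, and the function names are illustrative.

```python
from collections import Counter

KAPPA_P = 0.0667  # assumed coincidence rate of English plaintext
KAPPA_R = 1 / 26  # coincidence rate of uniformly random letters


def index_of_coincidence(text):
    """Probability that two distinct positions of text hold the same letter."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def expected_ic(key_length):
    """Expected IC of a Vigenere ciphertext with the given key length."""
    return (KAPPA_P - KAPPA_R) / key_length + KAPPA_R


def estimate_key_length(ic):
    """Invert expected_ic to estimate L from an observed IC above KAPPA_R."""
    return (KAPPA_P - KAPPA_R) / (ic - KAPPA_R)
```

Because the expected value flattens quickly as \(L\) grows, the estimate is most informative for short keys.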
In practice, Kasiski examination and the Index of Coincidence complement each other.
Hill cipher • Playfair cipher • Automated cryptanalysis
Since \(26 = 2 \times 13\), the determinant must be coprime to both 2 and 13.
Valid values: \(\det(K) \bmod 26 \in \{1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, 25\}\), i.e., 12 of the 26 residues.
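The list of admissible determinants can be reproduced with a gcd test. A minimal sketch, with illustrative function names and a key check restricted to the \(2 \times 2\) case for brevity:

```python
from math import gcd


def valid_determinants(modulus=26):
    """Residues d that are invertible modulo the given modulus."""
    return [d for d in range(modulus) if gcd(d, modulus) == 1]


def is_valid_hill_key(key, modulus=26):
    """Check a 2x2 key (nested lists) for invertibility modulo the modulus."""
    (a, b), (c, d) = key
    return gcd((a * d - b * c) % modulus, modulus) == 1
```

For example, the key `[[3, 3], [2, 5]]` has determinant 9 and is usable, while `[[2, 4], [6, 8]]` has determinant 18 modulo 26 and is not.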
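Completely broken with just \(n\) known plaintext–ciphertext block pairs for an \(n \times n\) key (e.g., \(n = 2\) for a \(2 \times 2\) key), provided the plaintext blocks form a matrix invertible modulo 26.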
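For the \(2 \times 2\) case the attack reduces to one matrix inversion: if the plaintext blocks form the columns of \(P\) and the ciphertext blocks the columns of \(C\), then \(C = KP\) and \(K = C P^{-1} \bmod 26\). The sketch below assumes the column-vector convention \(c = Kp\) and the encoding A = 0, ..., Z = 25; the helper names are illustrative, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
M = 26  # alphabet size


def mat_mul(a, b):
    """Multiply two 2x2 matrices modulo M."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) % M for j in range(2)]
        for i in range(2)
    ]


def mat_inv(m):
    """Invert a 2x2 matrix modulo M; raises ValueError if det is not a unit."""
    (a, b), (c, d) = m
    det_inv = pow((a * d - b * c) % M, -1, M)
    return [
        [(d * det_inv) % M, (-b * det_inv) % M],
        [(-c * det_inv) % M, (a * det_inv) % M],
    ]


def blocks_to_matrix(text):
    """Place two 2-letter blocks of uppercase text as the columns of a matrix."""
    v = [ord(ch) - 65 for ch in text]
    return [[v[0], v[2]], [v[1], v[3]]]


def recover_key(plain, cipher):
    """Known-plaintext attack: K = C * P^(-1) mod 26 from two block pairs."""
    return mat_mul(blocks_to_matrix(cipher), mat_inv(blocks_to_matrix(plain)))
```

With the key `[[3, 3], [2, 5]]`, the plaintext `HELP` encrypts to `HIAT`, and `recover_key("HELP", "HIAT")` returns the key again.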
MONARCHY:
Decryption reverses: shift left, shift up, same rectangle swap (self-inverse).
Consequence: the digram frequency distribution is preserved (merely relabelled).
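The three rules and their inverses can be sketched compactly. The code below assumes the common convention of merging J into I and expects already prepared text (uppercase, even length, no J, no digram of two equal letters); the function names are illustrative.

```python
def build_grid(keyword):
    """Return the 5x5 Playfair grid as a 25-letter string, row by row."""
    seen = []
    for ch in keyword.upper() + "ABCDEFGHIKLMNOPQRSTUVWXYZ":
        ch = "I" if ch == "J" else ch  # assumed convention: J is merged into I
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def transform(text, grid, step):
    """Apply the Playfair rules to prepared text; step=+1 encrypts, step=-1 decrypts."""
    out = []
    for a, b in zip(text[0::2], text[1::2]):
        ra, ca = divmod(grid.index(a), 5)
        rb, cb = divmod(grid.index(b), 5)
        if ra == rb:        # same row: shift horizontally
            ca, cb = (ca + step) % 5, (cb + step) % 5
        elif ca == cb:      # same column: shift vertically
            ra, rb = (ra + step) % 5, (rb + step) % 5
        else:               # rectangle: swap columns (self-inverse)
            ca, cb = cb, ca
        out.append(grid[ra * 5 + ca] + grid[rb * 5 + cb])
    return "".join(out)
```

With the keyword MONARCHY, `transform("INSTRUMENTSZ", build_grid("MONARCHY"), 1)` yields `GATLMZCLRQTX`, and applying `step=-1` to that ciphertext restores the input. Since each plaintext digram always maps to the same ciphertext digram, digram counts are carried over unchanged.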
| Model | Possible \(n\)-grams | Entropy rate | Discrimination |
|---|---|---|---|
| Unigram | 26 | ~4.2 bits/char | Low |
| Bigram | 676 | ~3.6 bits/char | Medium |
| Trigram | 17,576 | ~3.1 bits/char | Good |
| Quadgram | 456,976 | ~2.8 bits/char | Excellent |
For substitution ciphers with quadgram scoring on 200+ chars: \(p \approx 0.05\text{--}0.15\), so ~7–20 restarts suffice.
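The restart estimate follows from a geometric distribution, assuming \(p\) is the probability that a single hill-climbing run reaches the correct key and that runs are independent. A minimal sketch with illustrative function names:

```python
def expected_restarts(p):
    """Mean number of independent runs until the first success (geometric law)."""
    return 1 / p


def success_probability(p, restarts):
    """Probability that at least one of the given number of runs succeeds."""
    return 1 - (1 - p) ** restarts
```

For \(p = 0.15\) the mean is about 6.7 runs and for \(p = 0.05\) it is 20, which matches the quoted range. Note that the mean is not a guarantee: after exactly \(1/p\) runs the chance of at least one success is only roughly two in three, so a larger budget is prudent when reliability matters.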
| Chapter | Core Concept | Key Result |
|---|---|---|
| 1 | Cryptosystem formalism | 5-tuple definition, Kerckhoffs's principle, attack taxonomy |
| 2 | Substitution = permutation | \(|S_{26}| = 26! \approx 4 \times 10^{26}\), shifts form \(\mathbb{Z}_n \leq S_n\) |
| 3 | Frequency analysis | Shift invariance of distributions, \(\chi^2\) key recovery |
| 4 | Digram analysis | Theorem 4.1: monoalphabetic ciphers preserve digram structure |
| 5 | Vigenère & Kasiski | Repeated trigrams reveal key length; streams are Caesar ciphers |
| 6 | Index of Coincidence | Friedman: \(\mathrm{IC} \approx (\kappa_p - \kappa_r)/L + \kappa_r\) |
| 7 | Hill cipher | Matrix encryption; broken by \(n\) known-plaintext pairs |
| 8 | Playfair cipher | Digram permutation on 600 pairs; digram frequencies preserved |
| 9 | Automated cryptanalysis | Quadgram scoring + hill climbing: general-purpose attack |
Next: Parts IV–V — Enigma & Information Theory (Chapters 10–15)