Parts I–III:
Classical Ciphers & Cryptanalysis

Bartosz Naskręcki

Elements of Cryptanalysis — Chapters 1–9
Foundations • Polyalphabetic Ciphers • Polygraphic Ciphers

Part I: Foundations

Chapters 1–3

Formal definitions • Substitution ciphers • Frequency analysis

Chapter 1

Introduction to Cryptanalysis

Historical Milestones

~850 AD — Al-Kindi invents frequency analysis in Baghdad, breaking monoalphabetic substitution ciphers
1883 — Kerckhoffs formulates six design principles for military ciphers: "security must reside in the key, not the algorithm"
1949 — Shannon publishes Communication Theory of Secrecy Systems, formalizing perfect secrecy and the one-time pad

"The enemy knows the system being used."
— Claude Shannon (1949)

Formal Definition: Cryptosystem

Definition 1.1 — Cryptosystem

A cryptosystem is a 5-tuple \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\) where:

\(\mathcal{P}\) — plaintext space, \(\mathcal{C}\) — ciphertext space, \(\mathcal{K}\) — key space
\(\mathcal{E} = \{E_k : \mathcal{P} \to \mathcal{C}\}\) — encryption functions
\(\mathcal{D} = \{D_k : \mathcal{C} \to \mathcal{P}\}\) — decryption functions

Correctness: \(D_k(E_k(m)) = m\) for all \(k \in \mathcal{K},\; m \in \mathcal{P}\).

Kerckhoffs's Principle

Security must hold even when \((\mathcal{P}, \mathcal{C}, \mathcal{K}, \mathcal{E}, \mathcal{D})\) are public. The only secret is the key \(k \in \mathcal{K}\).

Attack Taxonomy

Attack Model	Abbr.	Adversary's Capabilities
Ciphertext-only	COA	Observes ciphertexts \(c_1, c_2, \ldots\)
Known-plaintext	KPA	Knows some pairs \((m_i, c_i)\)
Chosen-plaintext	CPA	Can choose \(m_i\), obtains \(c_i = E_k(m_i)\)
Chosen-ciphertext	CCA	Can choose \(c_i\), obtains \(m_i = D_k(c_i)\)
Related-key	RKA	Encryptions under keys related to \(k\)

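COA \(\subset\) KPA \(\subset\) CPA \(\subset\) ACPA (increasing adversary power), where ACPA denotes the adaptive chosen-plaintext attack, in which each chosen \(m_i\) may depend on earlier responses.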

Chapter 2

Permutations and Substitution Ciphers

Permutations as Cipher Keys

Definition 2.1 — Permutation

A permutation of \(\mathcal{A}\) with \(|\mathcal{A}| = n\) is a bijection \(\sigma: \mathcal{A} \to \mathcal{A}\).

Definition 2.2 — Symmetric Group \(S_n\)

The set of all permutations of \(\{0, 1, \ldots, n{-}1\}\) under composition forms the symmetric group \(S_n\) with \(|S_n| = n!\).

Shift Cipher (Caesar)

\(\sigma_k(x) = (x + k) \bmod n\)
Key space: \(n = 26\) keys

General Substitution

Key \(\sigma \in S_n\), arbitrary permutation
Key space: \(26! \approx 4 \times 10^{26}\)
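
A minimal Python sketch of both ciphers as permutations of \(\{0, \ldots, 25\}\), checking the correctness condition \(D_k(E_k(m)) = m\) from Definition 1.1. The helper names (shift_key, random_key, invert, substitute) are illustrative assumptions, not part of the original material, and the text is assumed to consist of uppercase letters A-Z only.

```python
import random
import string

ALPHABET = string.ascii_uppercase  # A..Z, identified with 0..25


def shift_key(k):
    """Shift permutation sigma_k as a list: sigma[x] = (x + k) mod 26."""
    return [(x + k) % 26 for x in range(26)]


def random_key(rng):
    """A uniformly random permutation of 0..25 (general substitution key)."""
    sigma = list(range(26))
    rng.shuffle(sigma)
    return sigma


def invert(sigma):
    """Inverse permutation, so that inverse[sigma[x]] == x for every x."""
    inverse = [0] * len(sigma)
    for x, y in enumerate(sigma):
        inverse[y] = x
    return inverse


def substitute(text, sigma):
    """Apply the permutation sigma to every letter of an A-Z string."""
    return "".join(ALPHABET[sigma[ALPHABET.index(ch)]] for ch in text)


caesar = shift_key(3)
ciphertext = substitute("ATTACKATDAWN", caesar)      # "DWWDFNDWGDZQ"
recovered = substitute(ciphertext, invert(caesar))   # "ATTACKATDAWN"
```

Decryption is just substitution with the inverse permutation, so the same function serves as both \(E_k\) and \(D_k\).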

Key Theorems on Substitution Ciphers

Theorem 2.1 — Keyspace Size

The number of distinct substitution ciphers over an alphabet of size \(n\) is \(n!\).

Theorem 2.2 — Shift Ciphers Form a Cyclic Subgroup

\(\{\sigma_0, \sigma_1, \ldots, \sigma_{n-1}\}\) is a cyclic subgroup of \(S_n\) of order \(n\), isomorphic to \(\mathbb{Z}_n\).

Proof: \(\sigma_j \circ \sigma_k = \sigma_{j+k \bmod n}\), identity \(\sigma_0\), inverse \(\sigma_k^{-1} = \sigma_{n-k}\), generator \(\sigma_1\).

Theorem 2.3 — Involutions

A substitution cipher is self-invertible iff \(\sigma^2 = \mathrm{id}\) (involution).
For \(n = 26\): only \(k = 0\) and \(k = 13\) (ROT13) are self-invertible shifts.
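
Theorems 2.2 and 2.3 can be checked exhaustively for \(n = 26\). The sketch below is only a numerical sanity check, not a proof; the names sigma and compose are assumptions introduced for illustration.

```python
N = 26


def sigma(k):
    """The shift permutation sigma_k as a tuple of images."""
    return tuple((x + k) % N for x in range(N))


def compose(s, t):
    """Composition (s o t)(x) = s(t(x))."""
    return tuple(s[t[x]] for x in range(N))


identity = sigma(0)

# sigma_j o sigma_k = sigma_{(j + k) mod n} for all j, k
closure_holds = all(
    compose(sigma(j), sigma(k)) == sigma((j + k) % N)
    for j in range(N)
    for k in range(N)
)

# sigma_k^{-1} = sigma_{n - k}
inverses_hold = all(
    compose(sigma(k), sigma((N - k) % N)) == identity for k in range(N)
)

# Repeatedly composing with sigma_1 visits every shift: sigma_1 is a generator.
powers = set()
current = identity
for _ in range(N):
    powers.add(current)
    current = compose(sigma(1), current)
generated_all = len(powers) == N

# Self-invertible shifts satisfy sigma_k o sigma_k = id.
involutions = [k for k in range(N) if compose(sigma(k), sigma(k)) == identity]
# involutions == [0, 13]
```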

The Fundamental Weakness

Critical Vulnerability

Keyspace size \(\neq\) security.

Although \(26! \approx 4 \times 10^{26}\) keys make brute force infeasible, substitution ciphers preserve letter frequencies:

\(\sigma\) relabels letters but does not alter their frequency distribution.

Al-Kindi's insight (~850 AD): compare ciphertext frequencies to known language frequencies to recover \(\sigma\).

This motivates Chapter 3: formal frequency analysis.

Chapter 3

Frequency Analysis

Letter Frequencies & Distributions

Definition — Frequency Distribution

For text \(T\) of length \(N\), the frequency distribution is \[ \mathbf{F}(T) = \bigl(f(a_1), f(a_2), \ldots, f(a_n)\bigr) \in \Delta^{n-1} \] where \(f(c) = \mathrm{count}(c, T) / N\) and \(\Delta^{n-1}\) is the probability simplex.

English "ETAOINSHRDLU" frequencies:

E	T	A	O	I	N	S	H	R	D	L	U
12.7%	9.1%	8.2%	7.5%	7.0%	6.8%	6.3%	6.1%	6.0%	4.3%	4.0%	2.8%

Breaking the Shift Cipher

Theorem — Shift Invariance of Frequencies

If \(C = \mathrm{Shift}_k(T)\), then \[ \mathbf{F}(C) = (f_{(0-k) \bmod n},\; f_{(1-k) \bmod n},\; \ldots,\; f_{(n-1-k) \bmod n}) \] The ciphertext frequency distribution is a cyclic shift of the plaintext distribution.

Key Recovery via Chi-Squared

\[ \hat{k} = \arg\min_{s \in \{0,\ldots,25\}} \chi^2\!\left(\mathrm{Shift}_{-s}(\mathbf{F}(C)),\; \mathbf{F}_{\mathrm{eng}}\right) \] where \(\displaystyle\chi^2(\mathbf{p}, \mathbf{q}) = \sum_{i=0}^{25} \frac{(p_i - q_i)^2}{q_i}\).

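Only 26 candidates to test: the search is \(O(n)\) in the alphabet size \(n\), independent of the message length \(N\), which only affects the one-time cost of counting letters.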
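
The key-recovery formula can be sketched in Python as follows. The full 26-letter frequency table is an assumption: it uses commonly cited approximate values for English (the original lists only the top twelve), and the function names are illustrative.

```python
import string

ALPHABET = string.ascii_uppercase

# Assumed reference frequencies for A..Z in percent (commonly cited values).
ENGLISH_PERCENT = [
    8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
    0.153, 0.772, 4.025, 2.406, 6.749, 7.507, 1.929, 0.095, 5.987,
    6.327, 9.056, 2.758, 0.978, 2.360, 0.150, 1.974, 0.074,
]
F_ENG = [p / 100 for p in ENGLISH_PERCENT]


def frequencies(text):
    """Frequency distribution F(T) over A..Z, ignoring non-letters."""
    letters = [ch for ch in text.upper() if ch in ALPHABET]
    return [letters.count(ch) / len(letters) for ch in ALPHABET]


def chi_squared(p, q):
    """Chi-squared distance between distributions p and q."""
    return sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))


def shift(text, k):
    """Shift every letter by k positions; other characters pass through."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + k) % 26] if ch in ALPHABET else ch
        for ch in text.upper()
    )


def recover_shift(ciphertext):
    """Return the shift s whose removal best matches English frequencies."""
    observed = frequencies(ciphertext)

    def score(s):
        # Undoing a shift by s moves the mass of ciphertext letter i + s
        # back to plaintext letter i.
        unshifted = [observed[(i + s) % 26] for i in range(26)]
        return chi_squared(unshifted, F_ENG)

    return min(range(26), key=score)
```

A typical use is `shift(ciphertext, -recover_shift(ciphertext))`; on short ciphertexts the estimate may be wrong, in line with the reliability figures that follow.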

Text Length and Reliability

< 50 characters: frequency analysis unreliable (insufficient data)
~100 characters: success rate reaches ~80%
200+ characters: success rate > 95%
500+ characters: nearly guaranteed correct key recovery

Limitations

Frequency analysis is powerful against monoalphabetic ciphers but is defeated by polyalphabetic ciphers such as Vigenère, which flatten the frequency distribution by using multiple substitution alphabets.

Part II: Classical Polyalphabetic Ciphers

Chapters 4–6

Monoalphabetic cryptanalysis • Vigenère & Kasiski • Index of Coincidence

Chapter 4

Monoalphabetic Cryptanalysis

Digram Analysis

Definition — Digrams & Digram Frequency

A digram is a consecutive pair \(t_i t_{i+1}\) in a text \(t\) of length \(\ell\). The digram frequency is: \[ f_{ab}(t) = \frac{\#\{i : t_i = a,\; t_{i+1} = b\}}{\ell - 1} \]

Theorem 4.1 — Preservation of Digram Structure

For a monoalphabetic cipher with key \(\sigma\): \[ f_{\sigma(a)\sigma(b)}(c) = f_{ab}(m) \] The digram frequency matrix of the ciphertext is a permuted version of the plaintext digram matrix. This extends to \(n\)-grams of any order.

Top English digrams: TH (3.6%), HE (3.1%), IN (2.4%), ER (2.1%), AN (2.0%)
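
Theorem 4.1 can be illustrated with a short Python sketch that counts overlapping digrams and checks that a random substitution only relabels them. The sample plaintext and all names are assumptions chosen for illustration.

```python
from collections import Counter
import random
import string

ALPHABET = string.ascii_uppercase


def digram_frequencies(text):
    """Map each digram to its frequency among the len(text) - 1 pairs."""
    pairs = [text[i:i + 2] for i in range(len(text) - 1)]
    counts = Counter(pairs)
    return {pair: count / len(pairs) for pair, count in counts.items()}


# A random monoalphabetic key sigma, stored as a letter-to-letter mapping.
rng = random.Random(0)
shuffled = list(ALPHABET)
rng.shuffle(shuffled)
sigma = dict(zip(ALPHABET, shuffled))

plaintext = "THEWEATHERINTHENORTHISRATHERCOLD"
ciphertext = "".join(sigma[ch] for ch in plaintext)

plain_freq = digram_frequencies(plaintext)
cipher_freq = digram_frequencies(ciphertext)

# f_{sigma(a) sigma(b)}(c) = f_{ab}(m) for every plaintext digram ab.
preserved = all(
    cipher_freq[sigma[a] + sigma[b]] == freq
    for (a, b), freq in plain_freq.items()
)
```

In this sample TH is the most frequent plaintext digram, so its image under \(\sigma\) is the most frequent ciphertext digram.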

The Frequency Matching Attack

Algorithm 4.1 — Key Recovery

Single-letter matching: Sort ciphertext letters by frequency; map to sorted English frequencies (E, T, A, O, I, ...)
Digram refinement: Check the most frequent ciphertext digram — it should map to TH or HE. Resolve ambiguities using the TH/HT asymmetry.
Greedy swap improvement: Try all pairwise swaps in the candidate key; keep swaps that improve a scoring metric.

Effective above:
200–500 characters (letter matching alone)
100–200 characters (with digram refinement)

Key insight:
Despite \(26! \approx 4 \times 10^{26}\) keys, statistical structure reduces the effective search to manageable size.

Chapter 5

The Vigenère Cipher & Kasiski Examination

The Vigenère Cipher

Definition 5.1 — Vigenère Cipher

Key: \(\mathbf{k} = (k_0, k_1, \ldots, k_{L-1}) \in \mathbb{Z}_{26}^L\) of length \(L\). \[ c_i = (m_i + k_{i \bmod L}) \bmod 26 \] \[ m_i = (c_i - k_{i \bmod L}) \bmod 26 \]

Polyalphabetic: same plaintext letter maps to different ciphertext letters depending on position
Flattens frequencies: the ciphertext distribution approaches uniformity, defeating single-letter frequency analysis
Considered "le chiffre indéchiffrable" for ~300 years (1553–1863)
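
A minimal sketch of Definition 5.1 in Python, assuming uppercase A-Z text and a key written as letters (A = 0, ..., Z = 25). The function name and the decrypt flag are illustrative choices.

```python
import string

ALPHABET = string.ascii_uppercase


def vigenere(text, key, decrypt=False):
    """Compute c_i = (m_i + k_{i mod L}) mod 26, or the inverse if decrypt."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        m = ALPHABET.index(ch)
        k = ALPHABET.index(key[i % len(key)])
        out.append(ALPHABET[(m + sign * k) % 26])
    return "".join(out)


ciphertext = vigenere("ATTACKATDAWN", "LEMON")           # "LXFOPVEFRNHR"
plaintext = vigenere(ciphertext, "LEMON", decrypt=True)  # "ATTACKATDAWN"
```

The three plaintext T's encrypt to X, F and F: the same letter maps to different ciphertext letters depending on its position relative to the key.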

The Kasiski Examination

Theorem 5.1 — Kasiski's Observation

If a plaintext trigram repeats at positions \(i\) and \(j\) with \(L \mid (j - i)\), then the ciphertext trigrams are identical: \[ c_i c_{i+1} c_{i+2} = c_j c_{j+1} c_{j+2} \]

Algorithm — Kasiski Examination

Find all repeated trigrams in the ciphertext
Compute the distances between their occurrences
Take the GCD of all distances → estimate of key length \(L\)

First discovered by Babbage (~1846, unpublished) and independently by Kasiski (1863).
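
The three steps can be sketched in Python as below. The sample ciphertext is the Vigenère encryption of CRYPTOISSHORTFORCRYPTOGRAPHY under the key ABCD, a small illustrative example not taken from the original. Note that the GCD is only guaranteed to be a multiple of \(L\) when all repeats are genuine, and accidental repeats can spoil it; practical implementations often look at the most common factors instead.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def kasiski_distances(ciphertext, n=3):
    """Distances between consecutive occurrences of each repeated n-gram."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - n + 1):
        positions[ciphertext[i:i + n]].append(i)
    distances = []
    for occurrences in positions.values():
        distances.extend(b - a for a, b in zip(occurrences, occurrences[1:]))
    return distances


def kasiski_estimate(ciphertext, n=3):
    """GCD of all repeat distances, or None if nothing repeats."""
    distances = kasiski_distances(ciphertext, n)
    return reduce(gcd, distances) if distances else None


sample = "CSASTPKVSIQUTGQUCSASTPIUAQJB"
distances = kasiski_distances(sample)   # four repeats, all at distance 16
estimate = kasiski_estimate(sample)     # 16, a multiple of the key length 4
```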

Complete Vigenère Attack Pipeline

Two-Stage Ciphertext-Only Attack

Stage 1: Key length \(L\)

Kasiski: GCD of repeated trigram distances
Or: IoC column method (Chapter 6)

Stage 2: Key recovery

Split ciphertext into \(L\) streams
Stream \(j\): \(c_j, c_{j+L}, c_{j+2L}, \ldots\)
Each stream is a Caesar cipher!
Apply \(\chi^2\) to recover each \(k_j\)
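
Stage 2 can be sketched in Python as follows, assuming the key length \(L\) is already known. The English frequency table uses commonly cited approximate values (an assumption, since the original lists only the top twelve letters), and the sample passage and function names are illustrative.

```python
import string

ALPHABET = string.ascii_uppercase

# Assumed reference frequencies for A..Z in percent (commonly cited values).
ENGLISH_PERCENT = [
    8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
    0.153, 0.772, 4.025, 2.406, 6.749, 7.507, 1.929, 0.095, 5.987,
    6.327, 9.056, 2.758, 0.978, 2.360, 0.150, 1.974, 0.074,
]
F_ENG = [p / 100 for p in ENGLISH_PERCENT]


def split_streams(ciphertext, L):
    """Stream j holds c_j, c_{j+L}, c_{j+2L}, ..."""
    return [ciphertext[j::L] for j in range(L)]


def caesar_shift_estimate(stream):
    """Chi-squared estimate of the shift applied to one stream."""
    n = len(stream)
    counts = [stream.count(ch) for ch in ALPHABET]

    def chi2(s):
        return sum(
            (counts[(i + s) % 26] / n - F_ENG[i]) ** 2 / F_ENG[i]
            for i in range(26)
        )

    return min(range(26), key=chi2)


def recover_vigenere_key(ciphertext, L):
    """Recover each key letter k_j from its own Caesar-enciphered stream."""
    streams = split_streams(ciphertext, L)
    return "".join(ALPHABET[caesar_shift_estimate(s)] for s in streams)


def vigenere_encrypt(plaintext, key):
    """Reference encryption used only to build the demonstration below."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + ALPHABET.index(key[i % len(key)])) % 26]
        for i, ch in enumerate(plaintext)
    )


SAMPLE = (
    "THE MESSENGER LEFT THE CITY AT NIGHT AND RODE NORTH ALONG THE RIVER "
    "UNTIL HE REACHED THE CAMP OF THE GENERAL THERE HE HANDED OVER THE "
    "SEALED LETTER AND WAITED WHILE THE OFFICERS READ THE ORDERS IN THE "
    "MORNING THE ARMY CROSSED THE BRIDGE AND ATTACKED THE FORTRESS FROM "
    "THE EASTERN SIDE"
).replace(" ", "")

ciphertext = vigenere_encrypt(SAMPLE, "KEY")
recovered_key = recover_vigenere_key(ciphertext, 3)
```

Each stream here has only about 78 letters, so the per-stream estimate is statistical rather than guaranteed; longer ciphertexts make it more reliable.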

Limitation

Longer keys produce fewer repeated trigrams. As key length \(L \to N\), the cipher approaches a one-time pad, which achieves perfect secrecy only if the key is uniformly random and never reused.

Chapter 6

Index of Coincidence

The Index of Coincidence

Definition 6.1 — Index of Coincidence (Friedman, 1922)

For text \(x\) of length \(N\) with letter counts \(f_0, \ldots, f_{25}\): \[ \mathrm{IC}(x) = \frac{\displaystyle\sum_{i=0}^{25} f_i(f_i - 1)}{N(N-1)} \] This is the probability that two randomly chosen letters are identical.

English plaintext

\[\kappa_p = \sum_{i=0}^{25} p_i^2 \approx 0.0667\]

Random text (uniform)

\[\kappa_r = \frac{1}{26} \approx 0.0385\]

The gap \(\kappa_p - \kappa_r \approx 0.028\) is the engine of the IoC method.
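
Definition 6.1 translates directly into a few lines of Python. This is a sketch that assumes the input contains only the letters to be counted; the function name is illustrative.

```python
from collections import Counter


def index_of_coincidence(text):
    """IC(x) = sum_i f_i (f_i - 1) / (N (N - 1)) over the letter counts f_i."""
    n = len(text)
    if n < 2:
        raise ValueError("need at least two letters")
    counts = Counter(text)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


ic_small = index_of_coincidence("AABB")      # (2 + 2) / 12 = 1/3
ic_distinct = index_of_coincidence("ABCD")   # 0.0, no letter repeats
```

Because the formula only uses the multiset of counts, relabelling the letters leaves the value unchanged.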

Friedman's Theorem & Key Length Formula

Theorem 6.1 — Friedman (1922)

For a polyalphabetic cipher with key length \(L\): \[ \mathrm{IC}_{\text{expected}} \approx \frac{\kappa_p - \kappa_r}{L} + \kappa_r = \frac{1}{L}\,\kappa_p + \frac{L-1}{L}\,\kappa_r \]

Proof sketch: With probability \(1/L\), two positions share the same key letter (coincidence rate \(\kappa_p\)); with probability \((L{-}1)/L\), different key letters (coincidence rate \(\kappa_r\)).

Corollary — Key Length Estimation

\[ L \approx \frac{\kappa_p - \kappa_r}{\mathrm{IC}_{\text{obs}} - \kappa_r} = \frac{0.0282}{\mathrm{IC}_{\text{obs}} - 0.0385} \]
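
The theorem and its corollary are inverse to each other, which a short Python sketch makes explicit. The constants follow the values quoted above (\(\kappa_p \approx 0.0667\), \(\kappa_r = 1/26\)); the function names are assumptions.

```python
KAPPA_P = 0.0667    # English plaintext, as quoted above
KAPPA_R = 1 / 26    # uniformly random text


def expected_ic(L):
    """Friedman's approximation of the IC for key length L."""
    return (KAPPA_P - KAPPA_R) / L + KAPPA_R


def estimate_key_length(ic_obs):
    """Invert the approximation to estimate L from an observed IC."""
    if ic_obs <= KAPPA_R:
        raise ValueError("observed IC is at or below the random baseline")
    return (KAPPA_P - KAPPA_R) / (ic_obs - KAPPA_R)


ic_for_five = expected_ic(5)
estimate = estimate_key_length(ic_for_five)   # recovers 5 up to rounding
```

The estimate becomes very sensitive to noise as \(\mathrm{IC}_{\text{obs}}\) approaches \(\kappa_r\), so for long keys it is best treated as a rough guide.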

IoC: Key Properties & Comparisons

Invariance

The IoC is invariant under monoalphabetic substitution: any permutation of the alphabet merely relabels the counts \(\{f_i\}\), and the IoC formula depends only on the multiset of counts.

Kasiski Examination

Finds repeated \(n\)-grams
Exact divisor information
Needs longer ciphertext
Discrete (GCD-based)

Index of Coincidence

Global statistical measure
Works on shorter texts
No repeated \(n\)-grams needed
Continuous (numerical score)

In practice, the two methods complement each other.

Part III: Polygraphic Ciphers

Chapters 7–9

Hill cipher • Playfair cipher • Automated cryptanalysis

Chapter 7

The Hill Cipher

Matrix Encryption over \(\mathbb{Z}_{26}\)

Definition 7.1 — Hill Cipher

Block size \(n\), key: invertible \(K \in \mathrm{GL}(n, \mathbb{Z}_{26})\). \[ \mathbf{c} = K \cdot \mathbf{m} \pmod{26}, \qquad \mathbf{m} = K^{-1} \cdot \mathbf{c} \pmod{26} \]

Invertibility Criterion

\(K\) is invertible mod 26 \(\iff\) \(\gcd(\det(K), 26) = 1\).

Since \(26 = 2 \times 13\), the determinant must be coprime to both 2 and 13.
Valid values: \(\det(K) \bmod 26 \in \{1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, 25\}\) — 12 out of 26.

Modular Inverse

\(K^{-1} = (\det K)^{-1} \cdot \mathrm{adj}(K) \pmod{26}\), where the scalar inverse \((\det K)^{-1} \bmod 26\) is computed with the extended Euclidean algorithm.
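
For a \(2 \times 2\) key the inverse can be sketched in a few lines of Python. This assumes Python 3.8 or later, where the built-in pow(d, -1, 26) returns the modular inverse; the example matrix and function names are illustrative choices.

```python
from math import gcd


def inverse_2x2_mod26(K):
    """K^{-1} = det(K)^{-1} * adj(K) mod 26 for a 2x2 matrix K."""
    (a, b), (c, d) = K
    det = (a * d - b * c) % 26
    if gcd(det, 26) != 1:
        raise ValueError("matrix is not invertible mod 26")
    det_inv = pow(det, -1, 26)          # modular inverse of the determinant
    adjugate = [[d, -b], [-c, a]]
    return [[(det_inv * x) % 26 for x in row] for row in adjugate]


def apply(K, v):
    """Matrix-vector product K * v mod 26."""
    return [sum(k * x for k, x in zip(row, v)) % 26 for row in K]


K = [[3, 3], [2, 5]]                    # det = 9, coprime to 26
K_inv = inverse_2x2_mod26(K)            # [[15, 17], [20, 9]]
c = apply(K, [7, 4])                    # "HE" -> [7, 8] = "HI"
m = apply(K_inv, c)                     # back to [7, 4]

# The 12 residues that are valid determinants mod 26.
valid_determinants = [d for d in range(26) if gcd(d, 26) == 1]
```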

Strengths and Fatal Weakness

Theorem 7.3 — Frequency Analysis Resistance

The Hill cipher with \(n \geq 2\) destroys single-letter frequencies: the IoC of ciphertext approaches \(\kappa_r \approx 0.0385\).

Theorem 7.2 — Known-Plaintext Attack

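Given \(n\) known plaintext-ciphertext block pairs \((\mathbf{m}_i, \mathbf{c}_i)\), let \(M = [\mathbf{m}_1 | \cdots | \mathbf{m}_n]\) and \(C = [\mathbf{c}_1 | \cdots | \mathbf{c}_n]\). Since \(C = K \cdot M\), if \(M\) is invertible mod 26: \[ K = C \cdot M^{-1} \pmod{26} \]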

Completely broken with just \(n\) known pairs (e.g., \(n = 2\) for a \(2 \times 2\) key).
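
A worked \(2 \times 2\) instance of the attack in Python, assuming Python 3.8 or later for pow(d, -1, 26). The known pairs (HE, LP) and (HI, AT) are an illustrative example produced by the key \(K = \begin{pmatrix} 3 & 3 \\ 2 & 5 \end{pmatrix}\); the function names are assumptions.

```python
from math import gcd


def inverse_2x2(M):
    """Inverse of a 2x2 matrix mod 26 via determinant and adjugate."""
    (a, b), (c, d) = M
    det = (a * d - b * c) % 26
    if gcd(det, 26) != 1:
        raise ValueError("plaintext matrix is not invertible mod 26")
    det_inv = pow(det, -1, 26)
    return [
        [(det_inv * d) % 26, (det_inv * -b) % 26],
        [(det_inv * -c) % 26, (det_inv * a) % 26],
    ]


def multiply_2x2(A, B):
    """Matrix product A * B mod 26."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) % 26 for j in range(2)]
        for i in range(2)
    ]


def recover_hill_key(plain_blocks, cipher_blocks):
    """K = C * M^{-1} mod 26, with the blocks placed as matrix columns."""
    M = [[plain_blocks[0][0], plain_blocks[1][0]],
         [plain_blocks[0][1], plain_blocks[1][1]]]
    C = [[cipher_blocks[0][0], cipher_blocks[1][0]],
         [cipher_blocks[0][1], cipher_blocks[1][1]]]
    return multiply_2x2(C, inverse_2x2(M))


plain_blocks = [(7, 4), (11, 15)]    # "HE", "LP"
cipher_blocks = [(7, 8), (0, 19)]    # "HI", "AT"
recovered = recover_hill_key(plain_blocks, cipher_blocks)   # [[3, 3], [2, 5]]
```

If the chosen plaintext blocks give a non-invertible \(M\), the attacker simply needs a different pair of known blocks.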

Design Lesson

Algebraic structure that makes a cipher elegant also makes it breakable.
Linearity enables efficient encryption and efficient cryptanalysis.
Modern ciphers (AES) combine linear and nonlinear operations.

Chapter 8

The Playfair Cipher

The 5×5 Grid and Encryption Rules

Definition 8.1 — Playfair Grid

A 5×5 matrix of 25 letters (I/J merged), constructed from a keyword.
Example with keyword MONARCHY:

      M O N A R
      C H Y B D
      E F G I K
      L P Q S T
      U V W X Z

Three Encryption Rules (for digram \((a, b)\), \(a \neq b\))

Same row: each letter shifts one position right (wrap around)
Same column: each letter shifts one position down (wrap around)
Rectangle: swap columns — each letter moves to the other corner of its row

Decryption reverses: shift left, shift up, same rectangle swap (self-inverse).
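
The grid construction and the three rules can be sketched in Python as below. The sketch assumes the text has already been split into digrams with \(a \neq b\) (no padding or double-letter handling is attempted), and the function names are illustrative.

```python
def build_grid(keyword):
    """5x5 grid: keyword letters first, then the rest of A..Z without J."""
    seen = []
    for ch in keyword.upper().replace("J", "I") + "ABCDEFGHIKLMNOPQRSTUVWXYZ":
        if ch not in seen:
            seen.append(ch)
    return [seen[r * 5:(r + 1) * 5] for r in range(5)]


def transform_digram(grid, a, b, step):
    """Apply the Playfair rules; step = +1 encrypts, step = -1 decrypts."""
    pos = {grid[r][c]: (r, c) for r in range(5) for c in range(5)}
    ra, ca = pos[a]
    rb, cb = pos[b]
    if ra == rb:      # same row: shift horizontally with wrap-around
        return grid[ra][(ca + step) % 5] + grid[rb][(cb + step) % 5]
    if ca == cb:      # same column: shift vertically with wrap-around
        return grid[(ra + step) % 5][ca] + grid[(rb + step) % 5][cb]
    # rectangle: each letter keeps its row and takes the other's column
    return grid[ra][cb] + grid[rb][ca]


def transform_text(grid, text, step=1):
    """Process an even-length text digram by digram."""
    return "".join(
        transform_digram(grid, text[i], text[i + 1], step)
        for i in range(0, len(text), 2)
    )


grid = build_grid("MONARCHY")
ciphertext = transform_text(grid, "INSTRUMENTSZ")        # "GATLMZCLRQTX"
plaintext = transform_text(grid, ciphertext, step=-1)    # "INSTRUMENTSZ"
```

The rectangle rule ignores the step argument because swapping columns is its own inverse.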

Cryptanalysis of Playfair

Theorem 8.1 — Digram Permutation

For a fixed grid \(G\), the Playfair encryption \(E_G\) is a permutation on the 600 ordered digrams \(\{(a, b) \in \mathcal{A}^2 : a \neq b\}\).

Consequence: the digram frequency distribution is preserved (merely relabelled).

Vulnerability

Most frequent ciphertext digrams correspond to TH, HE, IN, ER, AN
Key space: \(25! \approx 1.55 \times 10^{25}\) (but keyword construction reduces it)

Hill-Climbing Attack

Start with random 5×5 grid
Decrypt, score with quadgram log-probabilities
Perturb grid (swap letters/rows/cols)
Keep improvements; repeat
Multiple restarts to escape local optima

Chapter 9

Automated Cryptanalysis:
Hill Climbing with N-grams

N-gram Scoring

Definition 9.1 — N-gram

An \(n\)-gram is a contiguous sequence of \(n\) characters: \(g_i = x_i x_{i+1} \cdots x_{i+n-1}\).

Definition 9.3 — Log-Likelihood Score

\[ \text{score}(x) = \sum_{i=1}^{N-n+1} \log_{10} F(x_i x_{i+1} \cdots x_{i+n-1}) \] where \(F(g)\) is the frequency of \(n\)-gram \(g\) in a reference corpus.
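Floor probability for unseen \(n\)-grams: \(F_{\text{floor}} = 0.01 / (M - n + 1)\), where \(M\) is the length of the reference corpus, so \(M - n + 1\) is the number of \(n\)-grams it contains.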

Model	Possible \(n\)-grams	Entropy rate	Discrimination
Unigram	26	~4.2 bits/char	Low
Bigram	676	~3.6 bits/char	Medium
Trigram	17,576	~3.1 bits/char	Good
Quadgram	456,976	~2.8 bits/char	Excellent
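
The scoring function of Definition 9.3 can be sketched in Python as follows. A real attack would build the model from a large English corpus; the tiny corpus in the usage lines is a stand-in, and the function names are assumptions.

```python
from collections import Counter
from math import log10


def build_model(corpus, n):
    """Return (log10 frequency per n-gram, log10 of the floor probability)."""
    total = len(corpus) - n + 1                 # number of n-grams in corpus
    counts = Counter(corpus[i:i + n] for i in range(total))
    log_freq = {gram: log10(count / total) for gram, count in counts.items()}
    floor = log10(0.01 / total)                 # used for unseen n-grams
    return log_freq, floor


def score(text, log_freq, floor, n):
    """Sum of log10 F(g) over all overlapping n-grams g of the text."""
    return sum(
        log_freq.get(text[i:i + n], floor)
        for i in range(len(text) - n + 1)
    )


# Toy bigram model: "THETHETHE" contains TH and HE three times, ET twice.
log_freq, floor = build_model("THETHETHE", 2)
likely = score("THE", log_freq, floor, 2)       # 2 * log10(3 / 8)
unlikely = score("EHT", log_freq, floor, 2)     # 2 * log10(0.01 / 8)
```

Scores are negative log-probabilities summed over the text, so higher (closer to zero) means the candidate looks more like the reference language.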

Hill Climbing Algorithm

Definition 9.4 — Hill Climbing for Cryptanalysis

1. Initialize: Choose a random key \(k_0 \in \mathcal{K}\)
2. Evaluate: Decrypt with \(k_0\), compute \(\text{score}(D_{k_0}(c))\)
3. Perturb: Generate neighbor \(k'\) via small random modification
4. Accept/Reject: If \(\text{score}(D_{k'}(c)) > \text{score}(D_{k_0}(c))\), set \(k_0 \leftarrow k'\)
5. Repeat steps 3–4 for many iterations
6. Restart from a new random key; keep the best result across restarts
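
The loop structure above can be sketched generically in Python, with the scoring, key generation and perturbation passed in as functions. To keep the example self-contained, toy_score below is a stand-in assumption that counts positions agreeing with a fixed target permutation; a real attack would instead score the decryption \(D_k(c)\) with n-gram log-probabilities. All names are illustrative.

```python
import random


def hill_climb(score, random_key, neighbor, iterations, restarts, rng):
    """Hill climbing with restarts; returns the best key and its score."""
    best_key, best_score = None, float("-inf")
    for _ in range(restarts):
        key = random_key(rng)                      # initialize
        current = score(key)                       # evaluate
        for _ in range(iterations):
            candidate = neighbor(key, rng)         # perturb
            candidate_score = score(candidate)
            if candidate_score > current:          # accept strict improvements
                key, current = candidate, candidate_score
        if current > best_score:                   # keep best across restarts
            best_key, best_score = key, current
    return best_key, best_score


def random_permutation(rng):
    """A random substitution key: a permutation of 0..25."""
    key = list(range(26))
    rng.shuffle(key)
    return key


def swap_two(key, rng):
    """Neighbor move: swap two entries of the key."""
    i, j = rng.sample(range(len(key)), 2)
    new_key = list(key)
    new_key[i], new_key[j] = new_key[j], new_key[i]
    return new_key


# Toy objective (assumption): number of positions matching a hidden target.
TARGET = list(range(25, -1, -1))


def toy_score(key):
    return sum(a == b for a, b in zip(key, TARGET))


best_key, best_score = hill_climb(
    toy_score, random_permutation, swap_two,
    iterations=20000, restarts=2, rng=random.Random(1),
)
```

The toy objective has no local optima, so a single run converges; n-gram scores on real ciphertext do have local optima, which is why the restart loop matters there.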

Theorem 9.1 — Expected Restarts

If \(p\) = probability a single run finds the global optimum, then: \[ E[\text{restarts}] = \frac{1}{p}, \qquad P(\text{found in } k \text{ runs}) = 1 - (1-p)^k \]

For substitution ciphers with quadgram scoring on 200+ chars: \(p \approx 0.05\text{--}0.15\), so ~7–20 restarts suffice.
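
The arithmetic behind these figures can be checked with a short Python sketch. The 95% confidence target in the usage lines is an illustrative assumption, and the function names are not from the original.

```python
import math


def expected_runs(p):
    """Expected number of independent runs until the first success."""
    return 1 / p


def success_within(p, k):
    """Probability that at least one of k independent runs succeeds."""
    return 1 - (1 - p) ** k


def runs_for_confidence(p, confidence):
    """Smallest k with success_within(p, k) >= confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


low = expected_runs(0.15)                 # about 6.7 runs on average
high = expected_runs(0.05)                # 20 runs on average
k_fast = runs_for_confidence(0.15, 0.95)  # 19
k_slow = runs_for_confidence(0.05, 0.95)  # 59
```

The expected count \(1/p\) is an average; reaching a high success probability takes noticeably more runs.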

Generality of Automated Cryptanalysis

Substitution Cipher

Key: permutation of 26 letters
Perturbation: swap two letters
Effective for 200+ characters

Playfair Cipher

Key: 5×5 grid (25 letters)
Perturbation: swap grid entries
Effective for 200+ characters

Key Insight

The same framework attacks different cipher types by changing only the key representation and perturbation strategy. The n-gram scoring function remains unchanged — it measures "how English" the candidate decryption looks.

Extension: Simulated Annealing

Accept worse solutions (\(\Delta < 0\)) with probability \(e^{\Delta / T}\), where \(\Delta = \text{score}_{\text{new}} - \text{score}_{\text{old}}\) and the temperature \(T\) decreases over time. This helps escape local optima and reduces the reliance on restarts.
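
The acceptance rule can be sketched in Python as below. The geometric cooling schedule is an assumption (the rule above only requires that \(T\) decreases), and the function names are illustrative.

```python
import math
import random


def acceptance_probability(delta, temperature):
    """Improvements are always accepted; worse moves with e^(delta / T)."""
    if delta >= 0:
        return 1.0
    return math.exp(delta / temperature)


def accept(delta, temperature, rng):
    """Randomly decide whether to move to the candidate key."""
    return rng.random() < acceptance_probability(delta, temperature)


def geometric_temperature(t0, cooling, step):
    """Assumed schedule: T_step = t0 * cooling^step with 0 < cooling < 1."""
    return t0 * cooling ** step


rng = random.Random(0)
hot = acceptance_probability(-1.0, 2.0)    # worse moves accepted fairly often
cold = acceptance_probability(-1.0, 0.5)   # the same move is now unlikely
always = accept(3.0, 1.0, rng)             # improvements are always taken
```

In a hill-climbing loop this replaces the strict comparison in the accept/reject step, with the temperature recomputed at every iteration.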

Parts I–III: Summary of Key Concepts

Chapter	Core Concept	Key Result
1	Cryptosystem formalism	5-tuple definition, Kerckhoffs's principle, attack taxonomy
2	Substitution = permutation	\(|S_{26}| = 26! \approx 4 \times 10^{26}\), shifts form \(\mathbb{Z}_n \leq S_n\)
3	Frequency analysis	Shift invariance of distributions, \(\chi^2\) key recovery
4	Digram analysis	Theorem 4.1: monoalphabetic ciphers preserve digram structure
5	Vigenère & Kasiski	Repeated trigrams reveal key length; streams are Caesar ciphers
6	Index of Coincidence	Friedman: \(\mathrm{IC} \approx (\kappa_p - \kappa_r)/L + \kappa_r\)
7	Hill cipher	Matrix encryption; broken by \(n\) known-plaintext pairs
8	Playfair cipher	Digram permutation on 600 pairs; digram frequencies preserved
9	Automated cryptanalysis	Quadgram scoring + hill climbing: general-purpose attack

Recurring Themes

Large keyspace ≠ security — statistical structure leaks through substitution (\(26!\) keys, but frequency analysis breaks it)
Polyalphabetic flattening — Vigenère defeats letter-frequency analysis, but period detection (Kasiski, IoC) reduces it to monoalphabetic subproblems
Polygraphic mixing — Hill and Playfair destroy single-letter statistics, but linearity (Hill) and digram preservation (Playfair) create new vulnerabilities
Linearity is dangerous — the Hill cipher's algebraic structure enables both efficient encryption and efficient cryptanalysis; modern ciphers require nonlinearity (S-boxes)
General-purpose attacks — n-gram scoring with hill climbing transcends cipher-specific methods, foreshadowing the computational approach to modern cryptanalysis

End of Parts I–III

Next: Parts IV–V — Enigma & Information Theory (Chapters 10–15)
