Chapter 1: Introduction to Cryptanalysis (SageMath)#

Python Version Available

This is the SageMath companion for Chapter 1: Introduction to Cryptanalysis. The Python/NumPy version contains identical theory with implementations using only NumPy and Matplotlib.

1.1 Introduction: The Eternal Arms Race#

The history of cryptanalysis is as old as the history of secret writing itself. For as long as humans have sought to conceal messages, others have devoted their intellect to revealing them. The earliest known treatise on cryptanalysis is the Risalah fi Istikhraj al-Mu’amma (“Manuscript on Deciphering Cryptographic Messages”), written by the Arab polymath Abu Yusuf Ya’qub ibn Ishaq al-Kindi around 850 AD in Baghdad. Al-Kindi introduced the technique of frequency analysis — the observation that in any sufficiently long text written in a natural language, certain letters appear with predictable regularity. By comparing the frequency distribution of symbols in a ciphertext against the known distribution of the underlying language, one can peel away the veil of substitution ciphers. This single insight rendered every monoalphabetic substitution cipher insecure and inaugurated the scientific study of code-breaking.

A millennium later, the Dutch-born linguist and cryptographer Auguste Kerckhoffs (1835–1903) formulated six design principles for military ciphers in his landmark 1883 paper La Cryptographie Militaire. The second of these principles has become the cornerstone of modern cryptographic design:

“Il faut qu’il n’exige pas le secret, et qu’il puisse sans inconvénient tomber entre les mains de l’ennemi.”

(“It must not require secrecy, and it must be able to fall into the enemy’s hands without inconvenience.”)

— Auguste Kerckhoffs, La Cryptographie Militaire (1883)

Kerckhoffs recognized a profound truth: in any large-scale deployment of a cipher — across an army, a government, or an entire network — the algorithm will inevitably leak. Captured codebooks, reverse-engineered devices, disloyal insiders, or simple espionage will eventually expose the mechanism. Security must therefore reside not in the obscurity of the method, but in the secrecy of a short, easily changeable key. This insight separated the study of cryptography from the mere art of concealment and elevated it toward a principled engineering discipline.

The mathematical formalization of these ideas arrived in 1949, when Claude Elwood Shannon — working at Bell Labs — published Communication Theory of Secrecy Systems. Shannon cast cryptography in the language of information theory, defining concepts such as perfect secrecy, unicity distance, and redundancy with mathematical precision. He restated Kerckhoffs’s principle in its most succinct modern form:

“The enemy knows the system being used.”

— Claude Shannon, Communication Theory of Secrecy Systems (1949)

Shannon proved that perfect secrecy — where the ciphertext reveals absolutely no information about the plaintext — requires a key at least as long as the message itself (the one-time pad). This result established an information-theoretic lower bound that no amount of computational ingenuity can circumvent, and it set the stage for the modern distinction between information-theoretic and computational security.

Tip

Interdisciplinary thinking is essential. Cryptanalysis draws on number theory, linear algebra, statistics, combinatorics, information theory, and computer science. The most powerful attacks in history have come from unexpected angles: linguists exploiting letter frequencies, mathematicians finding algebraic weaknesses, engineers exploiting timing variations. As you study this material, cultivate breadth alongside depth.

1.2 Historical Context: From Antiquity to the Digital Age#

The art of secret writing — cryptography — is among the oldest intellectual pursuits of civilization. And wherever there have been secrets, there have been those who seek to uncover them: the cryptanalysts.

The history of cryptanalysis stretches from the ancient world to the present day:

  • c. 850 AD: Abu Yusuf Ya’qub ibn Ishaq al-Kindi writes Risalah fi Istikhraj al-Mu’amma (“A Manuscript on Deciphering Cryptographic Messages”), introducing frequency analysis — the first known general technique for breaking substitution ciphers.

  • 1553: Giovan Battista Bellaso publishes the polyalphabetic cipher later misattributed to Blaise de Vigenère.

  • 1863: Friedrich Kasiski publishes a method for breaking polyalphabetic ciphers by detecting repeated patterns.

  • 1883: Auguste Kerckhoffs publishes his six principles of military cryptography, including the famous maxim that a cipher’s security must reside solely in the key.

  • 1918–1945: The Enigma machine and its breaking by Polish and British cryptanalysts reshape world history.

  • 1945: Claude Shannon lays the mathematical foundation for cryptography in his classified report “A Mathematical Theory of Cryptography”.

  • 1976–1978: Diffie–Hellman key exchange and the RSA cryptosystem launch the public-key revolution.

  • 1990–1993: Differential and linear cryptanalysis transform the design of block ciphers.

  • 1994: Peter Shor discovers quantum algorithms that threaten all public-key cryptography.

  • 2024: NIST standardizes the first post-quantum cryptographic algorithms.


1.3 Kerckhoffs’s Principles#

In 1883, the Dutch linguist and cryptographer Auguste Kerckhoffs published “La Cryptographie Militaire” in the Journal des Sciences Militaires. He articulated six principles for the design of military ciphers, of which the second has become the most famous:

Kerckhoffs’s Principle

“Il faut qu’il n’exige pas le secret, et qu’il puisse sans inconvénient tomber entre les mains de l’ennemi.”

(“It must not require secrecy, and it must be able to fall into the enemy’s hands without causing inconvenience.”)

In modern terms: the security of a cryptosystem must depend solely on the secrecy of the key, not on the secrecy of the algorithm.

This principle has profound implications:

  1. Algorithm publication: Cryptographic algorithms should be made public so that they can be analyzed by the broader community.

  2. Key management: All security reduces to the problem of protecting keys.

  3. Security through obscurity is insufficient: Relying on the secrecy of an algorithm is a fundamental design flaw.

Danger

Security through obscurity — hiding the algorithm — provides only the illusion of security. History is littered with proprietary ciphers that were quickly broken once their algorithms became known (GSM A5/1, CSS for DVDs, etc.).

1.4 Attack Taxonomies#

Definition 1.1 (Cryptographic Attack Model)

A cryptographic attack is an attempt to recover the plaintext \(m\) or the key \(k\) from available information. Attacks are classified by what information the attacker possesses:

Attack Type

Attacker’s Knowledge

Ciphertext-only

Only the ciphertext \(c\)

Known-plaintext

One or more \((m_i, c_i)\) pairs

Chosen-plaintext

Can choose \(m_i\) and obtain \(c_i = E_k(m_i)\)

Chosen-ciphertext

Can choose \(c_i\) and obtain \(m_i = D_k(c_i)\)

Related-key

Access to encryptions under related keys

Definition 1.2 (Computational Security)

A cryptosystem is computationally secure if the best known attack requires computational resources (time, memory, or data) that exceed the attacker’s capabilities. Formally, we say a cryptosystem with key length \(n\) provides \(b\)-bit security if the best known attack requires \(\Theta(2^b)\) operations.

1.5 The General Encryption Scheme#

Every symmetric cryptosystem consists of three algorithms:

Definition 1.3 (Symmetric Cryptosystem)

A symmetric cryptosystem is a tuple \((\mathcal{M}, \mathcal{C}, \mathcal{K}, E, D)\) where:

  • \(\mathcal{M}\) is the message space (set of possible plaintexts)

  • \(\mathcal{C}\) is the ciphertext space

  • \(\mathcal{K}\) is the key space

  • \(E: \mathcal{K} \times \mathcal{M} \to \mathcal{C}\) is the encryption function

  • \(D: \mathcal{K} \times \mathcal{C} \to \mathcal{M}\) is the decryption function

satisfying \(D_k(E_k(m)) = m\) for all \(k \in \mathcal{K}\) and \(m \in \mathcal{M}\).

Implementation in SageMath#

The following skeleton defines the general structure of a cryptosystem. We use SageMath’s built-in tools where helpful.

# General scheme of a symmetric cryptosystem
# SageMath provides built-in modular arithmetic, so many operations become simpler

def Encoding(message, encryption_key):
    """Encrypt a message using the given key."""
    pass

def Decoding(ciphertext, decryption_key):
    """Decrypt a ciphertext using the given key."""
    pass

def GenerateEncryptionKey(params):
    """Generate an encryption key from parameters."""
    pass

def GenerateDecryptionKey(params):
    """Generate a decryption key from parameters."""
    pass

def KeyGeneration(params):
    """Generate a (encryption_key, decryption_key) pair."""
    e = GenerateEncryptionKey(params)
    d = GenerateDecryptionKey(params)
    return (e, d)

1.6 The Substitution Cipher#

The simplest and oldest class of ciphers replaces each letter of the plaintext with another letter.

Definition 1.4 (Simple Substitution Cipher)

A simple substitution cipher over an alphabet \(\mathcal{A}\) of size \(n\) is defined by a permutation \(\sigma \in S_n\). Encryption replaces each letter \(a_i\) with \(a_{\sigma(i)}\). The key is the permutation \(\sigma\), and the key space has size \( |S_n| = n!\).

Definition 1.5 (Shift Cipher / Caesar Cipher)

The shift cipher (or Caesar cipher) is the special case where \(\sigma\) is a cyclic shift: \(\sigma(i) = (i + k) \bmod n\) for a fixed key \(k \in \{0, 1, \ldots, n-1\}\).

  • Encryption: \(E_k(x) = (x + k) \bmod n\)

  • Decryption: \(D_k(y) = (y - k) \bmod n\)

  • Key space: \( |\mathcal{K}| = n\) (only 26 keys for English)

Shift Cipher in SageMath#

# Helper: clean text to uppercase letters only
import re

def lclear(text):
    """Remove all non-alphabetic characters and convert to uppercase."""
    return re.sub(r'[^A-Z]', '', text.upper())

def lclearw(text):
    """Remove non-alphabetic characters except spaces, convert to uppercase."""
    return re.sub(r'[^A-Z ]', '', text.upper())
# Shift cipher implementation
def shift_encrypt(plaintext, key):
    """Encrypt using a shift cipher with the given key (0-25)."""
    plaintext = lclear(plaintext)
    return ''.join(chr((ord(c) - ord('A') + key) % 26 + ord('A')) for c in plaintext)

def shift_decrypt(ciphertext, key):
    """Decrypt a shift cipher."""
    return shift_encrypt(ciphertext, -key)

# Example: Caesar's original shift of 3
plaintext = "ATTACK AT DAWN"
key = 3
ciphertext = shift_encrypt(plaintext, key)
decrypted = shift_decrypt(ciphertext, key)
print(f"Plaintext:  {lclear(plaintext)}")
print(f"Key:        {key}")
print(f"Ciphertext: {ciphertext}")
print(f"Decrypted:  {decrypted}")
Plaintext:  ATTACKATDAWN
Key:        3
Ciphertext: DWWDFNDWGDZQ
Decrypted:  ATTACKATDAWN

Experiment 2: Visualizing the Shift Cipher#

# Visualize the shift cipher as a permutation
key = 7
alphabet = [chr(i + ord('A')) for i in range(26)]
shifted = [chr((i + key) % 26 + ord('A')) for i in range(26)]

# Display as a table
print(f"Key = {key}")
print("Plain:  " + " ".join(alphabet))
print("Cipher: " + " ".join(shifted))

# Visualize with SageMath bar chart
offsets = [(i + key) % 26 for i in range(26)]
show(bar_chart(offsets, color='steelblue'),
     axes_labels=['Letter index', 'Mapped to'],
     title=f'Shift Cipher (k={key}): Mapping')
Key = 7
Plain:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher: H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
../_images/465647e7cf9a54c9d83825641b3f33b1bfb3f9d118e4463ab738270ab499037e.png

1.7 The General Substitution Cipher#

A general substitution cipher uses an arbitrary permutation of the alphabet, not just a cyclic shift.

In SageMath, permutations are first-class objects with powerful built-in support via Permutation() and SymmetricGroup(). This makes the implementation much more natural than working with raw arrays.

# SageMath's built-in permutation support
G = SymmetricGroup(26)

# Generate a random permutation key
set_random_seed(42)
sigma = G.random_element()
print(f"Random permutation key:\n{sigma}")
print(f"\nCycle notation: {sigma.cycle_string()}")
print(f"Order of permutation: {sigma.order()}")
Random permutation key:
(1,16,21,18,25,3,11,26,5,4,23,6,7,15,9,22,12)(2,24,17,14,8,13,19,10,20)

Cycle notation: (1,16,21,18,25,3,11,26,5,4,23,6,7,15,9,22,12)(2,24,17,14,8,13,19,10,20)
Order of permutation: 153
# General substitution cipher using SageMath permutations
def substitution_encrypt(plaintext, perm):
    """Encrypt using a general substitution cipher."""
    plaintext = lclear(plaintext)
    # SageMath permutations are 1-indexed
    return ''.join(chr(perm(ord(c) - ord('A') + 1) - 1 + ord('A')) for c in plaintext)

def substitution_decrypt(ciphertext, perm):
    """Decrypt using the inverse permutation."""
    return substitution_encrypt(ciphertext, perm.inverse())

# Demo
msg = "MEET ME AT THE LIBRARY"
ct = substitution_encrypt(msg, sigma)
pt = substitution_decrypt(ct, sigma)
print(f"Plaintext:  {lclear(msg)}")
print(f"Ciphertext: {ct}")
print(f"Decrypted:  {pt}")
print(f"\nMatch: {lclear(msg) == pt}")
Plaintext:  MEETMEATTHELIBRARY
Ciphertext: SDDBSDPBBMDAVXYPYC
Decrypted:  MEETMEATTHELIBRARY

Match: True

Self-Invertible Permutations (Involutions)#

A particularly interesting class of substitution ciphers uses involutions — permutations \(\sigma\) such that \(\sigma^2 = \text{id}\), meaning \(\sigma = \sigma^{-1}\). With an involution, the same key encrypts and decrypts.

Definition 1.6 (Involution)

An involution is a permutation \(\sigma\) with \(\sigma \circ \sigma = \text{id}\). Equivalently, every cycle of \(\sigma\) has length 1 or 2. The number of involutions on \(n\) elements is given by:

\[ a(n) = a(n-1) + (n-1) \cdot a(n-2), \quad a(0) = a(1) = 1\]
# Counting and generating involutions in SageMath
def count_involutions(n):
    """Count involutions on n elements using the recurrence."""
    if n <= 1:
        return 1
    a = [0] * (n + 1)
    a[0] = a[1] = 1
    for k in range(2, n + 1):
        a[k] = a[k-1] + (k-1) * a[k-2]
    return a[n]

# Number of involutions for alphabet sizes
for n in [5, 10, 15, 20, 26]:
    inv = count_involutions(n)
    total = factorial(n)
    ratio = float(inv / total) * 100
    print(f"n={int(n):2d}: {inv:>15,} involutions out of {total:>30,} ({float(ratio):.4f}%)")
n= 5:              26 involutions out of                            120 (21.6667%)
n=10:           9,496 involutions out of                      3,628,800 (0.2617%)
n=15:      10,349,536 involutions out of              1,307,674,368,000 (0.0008%)
n=20:  23,758,664,096 involutions out of      2,432,902,008,176,640,000 (0.0000%)
n=26: 532,985,208,200,576 involutions out of 403,291,461,126,605,635,584,000,000 (0.0000%)
# Generate a random involution in SageMath
def random_involution(n):
    """Generate a random involution on {1,...,n}."""
    available = list(range(1, n+1))
    cycles = []
    while available:
        i = available.pop(0)
        if available and random() < 0.5:
            j = available.pop(ZZ.random_element(len(available)))
            cycles.append((i, j))
        else:
            cycles.append((i,))
    from sage.combinat.permutation import Permutation
    perm = list(range(1, n+1))
    for cyc in cycles:
        if len(cyc) == 2:
            perm[cyc[0]-1] = cyc[1]
            perm[cyc[1]-1] = cyc[0]
    return Permutation(perm)

set_random_seed(123)
inv = random_involution(26)
print(f"Random involution: {inv}")
print(f"Cycle type: {inv.cycle_type()}")
print(f"Is involution: {inv * inv == Permutation(list(range(1,27)))}")
Random involution: [15, 2, 8, 4, 24, 9, 7, 3, 6, 17, 11, 16, 14, 13, 1, 12, 10, 20, 26, 18, 22, 21, 25, 5, 23, 19]
Cycle type: [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1]
Is involution: True

1.8 Key Space Analysis#

Theorem 1.1 (Size of Substitution Key Spaces)

For substitution ciphers over an alphabet of size \(n\):

Cipher Type

Key Space Size

For \(n = 26\)

Shift cipher

\(n\)

26

Affine cipher

\(n \cdot \phi(n)\)

312

General substitution

\(n!\)

\(\approx 4.03 \times 10^{26}\)

Involution

\(\sum_{k=0}^{\lfloor n/2 \rfloor} \binom{n}{2k} (2k-1)!!\)

\(\approx 7.91 \times 10^{12}\)

Even though \(26! \approx 4 \times 10^{26}\) makes brute force infeasible, frequency analysis breaks general substitution ciphers efficiently — the large key space provides no protection against statistical attacks.

# Key space analysis in SageMath
n = 26
print(f"Alphabet size: {n}")
print(f"Shift cipher keys:        {n}")
print(f"Affine cipher keys:       {n * euler_phi(n)}")
print(f"General substitution:     {factorial(n)}")
print(f"Involutions:              {count_involutions(n)}")
print()

# Brute force time estimates (at 10^9 keys/sec)
rate = 10^9
for name, keys in [("Shift", n), ("Affine", n * euler_phi(n)),
                    ("Involution", count_involutions(n)),
                    ("General", factorial(n))]:
    seconds = float(keys / rate)
    if seconds < 1:
        print(f"  {name:20s}: {float(seconds*1e6):.2f} microseconds")
    elif seconds < 3600:
        print(f"  {name:20s}: {float(seconds):.2f} seconds")
    elif seconds < 86400*365:
        print(f"  {name:20s}: {float(seconds/3600):.1f} hours")
    else:
        print(f"  {name:20s}: {seconds/(86400*365.25):.2e} years")
Alphabet size: 26
Shift cipher keys:        26
Affine cipher keys:       312
General substitution:     403291461126605635584000000
Involutions:              532985208200576

  Shift               : 0.03 microseconds
  Affine              : 0.31 microseconds
  Involution          : 148.1 hours
  General             : 1.28e+10 years

Experiment 3: Why Key Space Size Doesn’t Equal Security#

Even with \(26! \approx 4 \times 10^{26}\) possible keys, a general substitution cipher is easily broken by frequency analysis.

Let’s preview why — we’ll study this in depth in Chapter 3.

# Frequency analysis preview
english_text = "TO BE OR NOT TO BE THAT IS THE QUESTION WHETHER TIS NOBLER IN THE MIND TO SUFFER THE SLINGS AND ARROWS OF OUTRAGEOUS FORTUNE OR TO TAKE ARMS AGAINST A SEA OF TROUBLES"
english_text = lclear(english_text)

# Encrypt with random substitution
ct = substitution_encrypt(english_text, sigma)

# Count frequencies
def letter_freq(text):
    counts = [0] * 26
    for c in text:
        counts[ord(c) - ord('A')] += 1
    total = len(text)
    return [c/total for c in counts]

plain_freq = letter_freq(english_text)
cipher_freq = letter_freq(ct)

# Plot side by side using SageMath
labels = [chr(i + ord('A')) for i in range(26)]
p1 = bar_chart(plain_freq, color='steelblue')
p2 = bar_chart(cipher_freq, color='crimson')
show(graphics_array([p1, p2]), figsize=[12, 4])
print("Left: Plaintext frequency | Right: Ciphertext frequency")
print("The pattern is the same — just shuffled! This is the key vulnerability.")
../_images/d5b6ffb7ae42076a9bade7c16972e613924461373b008388f08a2e20852e1150.png
Left: Plaintext frequency | Right: Ciphertext frequency
The pattern is the same — just shuffled! This is the key vulnerability.

1.9 Looking Ahead#

This chapter has introduced the fundamental concepts and vocabulary of cryptanalysis. In the chapters that follow, we will:

  • Chapter 2: Study permutations and substitution ciphers in depth, using SageMath’s SymmetricGroup and Permutation classes.

  • Chapter 3: Develop Al-Kindi’s frequency analysis into a complete attack on substitution ciphers.

  • Chapters 4–6: Tackle polyalphabetic ciphers (Vigenère) with Kasiski examination and the index of coincidence.

  • Chapters 7–9: Break polygraphic ciphers (Hill, Playfair) using matrix algebra and automated techniques.

  • Chapter 10–12: Explore the Enigma machine and its breaking.

  • Chapters 13–24: Enter the modern era: block ciphers, linear/differential cryptanalysis, AES, and finite fields.

  • Chapters 25–36: Public-key cryptography: RSA, Diffie–Hellman, elliptic curves, and algebraic attacks.

  • Chapters 37–45: The quantum threat and post-quantum cryptography.

1.10 Exercises#

Exercise 1.1 (Kerckhoffs’s Principle). Give three examples from the history of cryptography where “security through obscurity” failed — i.e., where a cipher was broken shortly after its algorithm became known.

Exercise 1.2 (Key Space). Compute the size of the key space for an affine cipher \(E(x) = ax + b \bmod n\) over alphabets of size \(n = 26\), \(n = 29\) (Polish alphabet), and \(n = 256\) (bytes). Why does \(a\) need to satisfy \(\gcd(a, n) = 1\)?

Exercise 1.3 (Involution Count). Use SageMath to verify the involution count formula for \(n = 5, 6, 7\) by exhaustively checking all permutations in \(S_n\).

Exercise 1.4 (Shift Cipher Statistics). Encrypt a passage from a book using a random shift. Then compute the letter frequency distribution of the ciphertext and compare it to the standard English distribution. How would you use this comparison to recover the key?

Exercise 1.5 (Challenge). Implement an affine cipher \(E(x) = ax + b \bmod 26\) in SageMath. How many keys does this cipher have? Write a brute-force attack that tries all valid keys and scores them using frequency analysis.

1.11 Summary#

Concept

Key Idea

Cryptanalysis

The science of breaking ciphers

Kerckhoffs’s principle

Security must reside in the key alone

Attack models

Ciphertext-only, known-plaintext, chosen-plaintext, chosen-ciphertext

Substitution cipher

Replace each letter via a permutation \(\sigma \in S_n\)

Shift (Caesar) cipher

Special case: \(\sigma(i) = (i + k) \bmod n\)

Involutions

Self-inverse permutations; \(\sigma^2 = \text{id}\)

Key space ≠ security

Large key space necessary but not sufficient

SageMath tools introduced:

  • SymmetricGroup(n), G.random_element() — permutation groups

  • Permutation(...), .cycle_string(), .order(), .inverse() — permutation objects

  • euler_phi(n), factorial(n), gcd(a, b) — number theory

  • set_random_seed(), ZZ.random_element() — reproducible randomness

  • bar_chart(), graphics_array(), show() — visualization

1.12 References#

  1. Al-Kindi (c. 850). Risalah fi Istikhraj al-Mu’amma (A Manuscript on Deciphering Cryptographic Messages). The first known work on cryptanalysis, introducing frequency analysis.

  2. Kerckhoffs, A. (1883). La Cryptographie Militaire. Journal des Sciences Militaires, 9, 5–38. Establishes the foundational principles of cryptographic design.

  3. Singh, S. (1999). The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography. Anchor Books. An accessible history of cryptography and cryptanalysis.

  4. Stinson, D. R. and Paterson, M. B. (2018). Cryptography: Theory and Practice, 4th ed. CRC Press. A rigorous mathematical treatment of classical and modern cryptography.

  5. Katz, J. and Lindell, Y. (2020). Introduction to Modern Cryptography, 3rd ed. CRC Press. The standard textbook for formal definitions of security.