Chapter 7: The Hill Cipher (SageMath)

Chapter 7: The Hill Cipher (SageMath)#

Python Version Available

This notebook contains the SageMath implementation. A pure Python/NumPy version is available in Chapter 7: The Hill Cipher.

7.1 Introduction: Linear Algebra Enters Cryptography#

In 1929, the American mathematician Lester S. Hill (1891–1961) published a remarkable paper in The American Mathematical Monthly titled “Cryptography in an Algebraic Alphabet.” For the first time in the history of cryptography, a cipher was designed using the full power of linear algebra — specifically, matrix multiplication over modular arithmetic.

“The problem of substitution-encipherment … leads naturally to the consideration of sets of linear substitutions in a modular arithmetic.” — Lester S. Hill, “Cryptography in an Algebraic Alphabet,” 1929

Unlike all previous ciphers, which operated on individual letters or at most pairs, the Hill cipher encrypts blocks of \(n\) letters simultaneously by treating them as vectors in \(\mathbb{Z}_{26}^n\) and multiplying by an invertible \(n \times n\) key matrix. This makes it the first polygraphic cipher based on linear transformations, and the first to completely destroy single-letter frequency statistics.

Hill even obtained a US patent (US1845947) for his invention in 1932. Though the cipher was never widely adopted for military use — it is vulnerable to known-plaintext attacks via simple linear algebra — the Hill cipher remains pedagogically important. It demonstrates both the power and the limitations of linearity in cryptography, foreshadowing the modern principle that nonlinearity (S-boxes) is essential for security.

Tip

The Hill cipher teaches a crucial lesson: algebraic structure that makes a cipher elegant also makes it breakable. The linearity that allows efficient encryption also allows efficient cryptanalysis. This tension between structure and security is a recurring theme throughout modern cryptography.

7.2 Formal Definitions#

Definition 7.1 (Hill Cipher)

Let \(n \geq 1\) be the block size and let \(K\) be an invertible \(n \times n\) matrix over \(\mathbb{Z}_{26}\). The Hill cipher encrypts a plaintext block \(\mathbf{m} = (m_1, \ldots, m_n)^T \in \mathbb{Z}_{26}^n\) as:

\[ \mathbf{c} = K \cdot \mathbf{m} \pmod{26}\]

Decryption is performed using the modular inverse:

\[ \mathbf{m} = K^{-1} \cdot \mathbf{c} \pmod{26}\]

Each letter is mapped to an integer: \(a \mapsto 0, b \mapsto 1, \ldots, z \mapsto 25\).

Definition 7.2 (Invertibility over \(\mathbb{Z}_{26}\))

A matrix \(K\) is invertible modulo 26 if and only if

\[ \gcd(\det(K), 26) = 1\]

Since \(26 = 2 \times 13\), the determinant must be coprime to both 2 and 13. Equivalently, \(\det(K) \pmod{26}\) must be one of \(\{1, 3, 5, 7, 9, 11, 15, 17, 19, 21, 23, 25\}\) — there are 12 valid values out of 26.

Definition 7.3 (Modular Matrix Inverse)

The inverse \(K^{-1} \pmod{26}\) is computed as:

\[ K^{-1} = (\det K)^{-1} \cdot \text{adj}(K) \pmod{26}\]

where \(\text{adj}(K)\) is the adjugate (transpose of the cofactor matrix) and \((\det K)^{-1}\) is the modular multiplicative inverse of \(\det(K)\) modulo 26, computed via the extended Euclidean algorithm.

Definition 7.4 (Known-Plaintext Attack)

Given \(n\) known plaintext-ciphertext block pairs \((\mathbf{m}_i, \mathbf{c}_i)\), form the matrices \(M = [\mathbf{m}_1 | \cdots | \mathbf{m}_n]\) and \(C = [\mathbf{c}_1 | \cdots | \mathbf{c}_n]\). If \(M\) is invertible mod 26, the key is recovered as:

\[ K = C \cdot M^{-1} \pmod{26}\]

Danger

The Hill cipher is completely broken by a known-plaintext attack.

An attacker who knows just \(n\) plaintext-ciphertext pairs (e.g., \(n = 2\) for a \(2 \times 2\) key) can recover the entire key matrix in polynomial time. This makes the Hill cipher unsuitable for any practical security application. Its vulnerability arises from the linearity of the encryption function.

SageMath Implementation#

SageMath provides native support for matrices over \(\mathbb{Z}_{26}\) via IntegerModRing and Matrix. This means we do not need to implement modular determinants, extended GCD, or adjugate computations by hand — SageMath handles all of this natively. Matrix inversion modulo 26 is simply K.inverse().

import re

def lclear(text):
    """Remove all non-lowercase-letter characters."""
    return re.sub(r'[^a-z]', '', text.lower())

def lclearw(text):
    """Clean text, keeping spaces between words."""
    t1 = re.sub(r'[^a-z]', ' ', text.lower())
    return re.sub(' {2,}', ' ', t1)

def Frequency(text, alphabet):
    """Compute letter frequency distribution."""
    s = lclear(text)
    return [float(s.count(let) / len(s)) for let in alphabet]

# Hill cipher using SageMath's native matrix arithmetic over Z_26

Z26 = IntegerModRing(26)
lalphabet = 'abcdefghijklmnopqrstuvwxyz'

def generate_key(key_string, block_size):
    """
    Generate a key matrix from a string.
    Each character maps to its position: a=0, b=1, ..., z=25.
    """
    key_values = [ord(c) - ord('a') for c in key_string.lower()]
    MS = MatrixSpace(Z26, block_size, block_size)
    return MS(key_values)

def text_to_blocks(text, block_size):
    """Convert cleaned text to a list of vectors over Z_26."""
    text = lclear(text)
    while len(text) % block_size != 0:
        text += 'x'
    blocks = []
    for i in range(0, len(text), block_size):
        v = vector(Z26, [ord(text[i+j]) - ord('a') for j in range(block_size)])
        blocks.append(v)
    return blocks

def blocks_to_text(blocks):
    """Convert blocks of Z_26 vectors back to text."""
    return ''.join(chr(int(x) + ord('a')) for block in blocks for x in block)

def hill_encrypt(plaintext, key):
    """Encrypt plaintext using the Hill cipher."""
    n = key.nrows()
    blocks = text_to_blocks(plaintext, n)
    encrypted = [key * b for b in blocks]
    return blocks_to_text(encrypted)

def hill_decrypt(ciphertext, key):
    """Decrypt ciphertext using the Hill cipher."""
    n = key.nrows()
    key_inv = key.inverse()
    blocks = text_to_blocks(ciphertext, n)
    decrypted = [key_inv * b for b in blocks]
    return blocks_to_text(decrypted)

# --- Demo: 3x3 Hill cipher ---
K = Matrix(Z26, [[6, 24, 1], [13, 16, 10], [20, 17, 15]])
print("Key matrix K (3x3 over Z_26):")
print(K)
print(f"\ndet(K) mod 26 = {K.det()}")
print(f"gcd(det(K), 26) = {gcd(int(K.det()), 26)}")
print(f"\nK^(-1) mod 26:")
print(K.inverse())

pt = "actfast"
ct = hill_encrypt(pt, K)
dt = hill_decrypt(ct, K)
print(f"\nPlaintext:  {pt}")
print(f"Encrypted:  {ct}")
print(f"Decrypted:  {dt}")
print(f"Correct:    {dt.startswith(pt)}")

Key matrix K (3x3 over Z_26):
[ 6 24  1]
[13 16 10]
[20 17 15]

det(K) mod 26 = 25
gcd(det(K), 26) = 1

K^(-1) mod 26:
[ 8  5 10]
[21  8 21]
[21 12  8]

Plaintext:  actfast
Encrypted:  pohwlgnny
Decrypted:  actfastxx
Correct:    True

7.3 Key Results#

Theorem 7.1 (General Linear Group)

The set of all invertible \(n \times n\) matrices over \(\mathbb{Z}_{26}\) forms a group under matrix multiplication modulo 26, denoted \(\mathrm{GL}(n, \mathbb{Z}_{26})\). The number of valid Hill cipher keys of block size \(n\) is \(|\mathrm{GL}(n, \mathbb{Z}_{26})|\).

Proof sketch

Closure: If \(A, B\) are invertible mod 26, then \(\det(AB) = \det(A)\det(B)\), which is coprime to 26 since the product of units in \(\mathbb{Z}_{26}\) is a unit. Identity: \(I_n\). Inverses exist by the adjugate formula. Associativity follows from matrix multiplication. \(\blacksquare\)

Theorem 7.2 (Known-Plaintext Key Recovery)

Given \(n\) known plaintext-ciphertext block pairs \((\mathbf{m}_i, \mathbf{c}_i)\) for \(i = 1, \ldots, n\), where \(M = [\mathbf{m}_1 | \cdots | \mathbf{m}_n]\) is invertible mod 26, the key is uniquely determined:

\[ K = C \cdot M^{-1} \pmod{26}\]

Proof

Since \(\mathbf{c}_i = K \cdot \mathbf{m}_i\) for each \(i\), we have \(C = K \cdot M\). Multiplying both sides on the right by \(M^{-1}\) gives \(K = C \cdot M^{-1}\). \(\blacksquare\)

Observation 7.3 (Frequency Analysis Resistance)

The Hill cipher with block size \(n \geq 2\) destroys the single-letter frequency distribution of plaintext. The index of coincidence of Hill-encrypted text approaches \(\kappa_r = 1/26 \approx 0.0385\) (the random baseline). This flattening is observed for typical keys but is not guaranteed for all invertible Hill cipher keys.

Experiment 1: Step-by-Step 2x2 Encryption#

Let us trace the encryption of “help” using a \(2 \times 2\) key matrix. SageMath handles all arithmetic over \(\mathbb{Z}_{26}\) natively.

# 2x2 Hill cipher step-by-step
K2 = Matrix(Z26, [[3, 3], [2, 5]])
print("Key K =")
print(K2)
print(f"\ndet(K) mod 26 = {K2.det()}")
K2_inv = K2.inverse()
print("\nK^(-1) mod 26 =")
print(K2_inv)
print()

# Encrypt "help" = [7, 4, 11, 15]
plaintext = "help"
blocks = text_to_blocks(plaintext, 2)
print(f"Plaintext: '{plaintext}'")
print(f"As numbers: {[ord(c)-ord('a') for c in plaintext]}")
print(f"Blocks: {[list(b) for b in blocks]}")
print()

for i, block in enumerate(blocks):
    result = K2 * block
    letters_in = ''.join(chr(int(v) + ord('a')) for v in block)
    letters_out = ''.join(chr(int(v) + ord('a')) for v in result)
    b0, b1 = int(block[0]), int(block[1])
    k00, k01, k10, k11 = int(K2[0,0]), int(K2[0,1]), int(K2[1,0]), int(K2[1,1])
    print(f"Block {i+1}: {[b0, b1]} ('{letters_in}')")
    print(f"  K @ [{b0}, {b1}]^T = [{k00}*{b0}+{k01}*{b1}, {k10}*{b0}+{k11}*{b1}] = [{k00*b0+k01*b1}, {k10*b0+k11*b1}]")
    print(f"  mod 26 = {[int(v) for v in result]} ('{letters_out}')")
    print()

ct = hill_encrypt(plaintext, K2)
dt = hill_decrypt(ct, K2)
print(f"Full encryption: '{plaintext}' -> '{ct}'")
print(f"Full decryption: '{ct}' -> '{dt}'")

Key K =
[3 3]
[2 5]

det(K) mod 26 = 9

K^(-1) mod 26 =
[15 17]
[20  9]

Plaintext: 'help'
As numbers: [7, 4, 11, 15]
Blocks: [[7, 4], [11, 15]]

Block 1: [7, 4] ('he')
  K @ [7, 4]^T = [3*7+3*4, 2*7+5*4] = [33, 34]
  mod 26 = [7, 8] ('hi')

Block 2: [11, 15] ('lp')
  K @ [11, 15]^T = [3*11+3*15, 2*11+5*15] = [78, 97]
  mod 26 = [0, 19] ('at')

Full encryption: 'help' -> 'hiat'
Full decryption: 'hiat' -> 'help'

Experiment 2: Frequency Analysis#

Failure

The Hill cipher completely flattens the letter frequency distribution. The linear structure of the Hill cipher destroys the statistical nature of the frequency analysis. However, as we will explore later, this same linear structure can be exploited by linear cryptanalysis. So our failure will become our next method of attack!

# Load a long English text (Ulysses by James Joyce)
counter = 0
longtxtw = ''
with open('../crypt_old_course/Course/Modules/week2/clean/Ulysses_4300-0.txt', encoding='utf8') as f:
    for line in f:
        longtxtw += lclearw(line.strip())
        counter += 1
        if counter > 10000:
            break

print(f"Loaded {len(lclear(longtxtw))} characters of cleaned text.")
print(f"Preview: {lclear(longtxtw[:200])}")

Loaded 10193 characters of cleaned text.
Preview: statelyplumpbuckmulligancamefromthestairheadbearingabowloflatheronwhichamirrorandarazorlaycrossedayellowdressinggownungirdledwassustainedgentlybehindhimonthemildmorning

# Frequency distribution of the original English text
bar_chart(Frequency(lclear(longtxtw[:10000]), lalphabet))

../_images/9e8b33ae4cbf79ab2cd5a2851fb25726146e10a1d5e84af1e24c13e106907a39.png

# Encrypt a long passage with the 2x2 Hill cipher
key_string = 'hill'
plaintext_long = lclear(longtxtw[:1003])
key_2x2 = generate_key(key_string, 2)
ciphertext_long = hill_encrypt(plaintext_long, key_2x2)
decrypted_check = hill_decrypt(ciphertext_long, key_2x2)
assert plaintext_long == decrypted_check, "Decryption failed!"
print(f"Key matrix (from '{key_string}'):")
print(key_2x2)
print(f"\nEncrypted {len(plaintext_long)} characters successfully.")
print(f"Decryption verified: {plaintext_long == decrypted_check}")

Key matrix (from 'hill'):
[ 7  8]
[11 11]

Encrypted 836 characters successfully.
Decryption verified: True

# Frequency distribution of the Hill cipher ciphertext (sorted)
bar_chart(sorted(Frequency(ciphertext_long[:1003], lalphabet)))

../_images/2fe260ef9618980cd599ee976177a815dd047e7c61c9a3a716a02338d2f7f8eb.png

# Compare: sorted frequency of the original plaintext
bar_chart(sorted(Frequency(lclear(longtxtw[:1003]), lalphabet)))

../_images/829b336e378cf8588487929b8d4c397ea94741205fac7aec4adfff919c8a3d2c.png

freq_plain = Frequency(plaintext_long, lalphabet)
freq_cipher = Frequency(ciphertext_long, lalphabet)

print(f"Plaintext std dev of frequencies:  {float(std(freq_plain)):.6f}")
print(f"Ciphertext std dev of frequencies: {float(std(freq_cipher)):.6f}")
print(f"Random baseline (1/26):            {float(1/26):.6f}")
print(f"\nThe Hill cipher flattens the frequency distribution toward uniform.")

Plaintext std dev of frequencies:  0.032475
Ciphertext std dev of frequencies: 0.011008
Random baseline (1/26):            0.038462

The Hill cipher flattens the frequency distribution toward uniform.

/var/folders/z7/wp7m8p7x1250jzvklw5z24mm0000gn/T/ipykernel_30187/1926442996.py:4: DeprecationWarning: sage.stats.basic_stats.std is deprecated; use numpy.std or numpy.nanstd instead
See https://github.com/sagemath/sage/issues/29662 for details.
  print(f"Plaintext std dev of frequencies:  {float(std(freq_plain)):.6f}")
/opt/homebrew/Caskroom/miniconda/base/envs/sage/lib/python3.11/site-packages/sage/stats/basic_stats.py:258: DeprecationWarning: sage.stats.basic_stats.variance is deprecated; use numpy.var or numpy.nanvar instead
See https://github.com/sagemath/sage/issues/29662 for details.
  return sqrt(variance(v, bias=bias))
/opt/homebrew/Caskroom/miniconda/base/envs/sage/lib/python3.11/site-packages/sage/stats/basic_stats.py:354: DeprecationWarning: sage.stats.basic_stats.mean is deprecated; use numpy.mean or numpy.nanmean instead
See https://github.com/sagemath/sage/issues/29662 for details.
  mu = mean(v)
/var/folders/z7/wp7m8p7x1250jzvklw5z24mm0000gn/T/ipykernel_30187/1926442996.py:5: DeprecationWarning: sage.stats.basic_stats.std is deprecated; use numpy.std or numpy.nanstd instead
See https://github.com/sagemath/sage/issues/29662 for details.
  print(f"Ciphertext std dev of frequencies: {float(std(freq_cipher)):.6f}")

Experiment 3: Index of Coincidence Comparison#

We compute the IoC for plaintext and Hill-encrypted text to confirm the destruction of statistical structure. The Hill cipher has no statistical structure similar to the original English.

def IC(sample, alphabet):
    """Compute the Index of Coincidence (scaled by alphabet size)."""
    s = lclear(sample)
    n = len(s)
    if n < 2:
        return 0.0
    return float(
        len(alphabet) * sum(
            s.count(x) * (s.count(x) - 1) for x in alphabet
        ) / (n * (n - 1))
    )

# IoC for plaintext
sample = plaintext_long[:2000]
gra1 = list_plot([IC(sample[:k], lalphabet) for k in range(5, len(lclear(sample)))])
gra1.show()

../_images/9ccdb03f313b704432b108eb36b59cea5beea9e088dab773bee4afac304ea69a.png

# IoC for ciphertext
sample2 = ciphertext_long[:2000]
gra2 = list_plot([IC(sample2[:k], lalphabet) for k in range(5, len(lclear(sample2)))], color='red')
gra2.show()

../_images/448d5c7059c478273062070caf00500c942acabe167f35f1d51ae99827e43ac7.png

# Combined plot: plaintext (blue) vs ciphertext (red)
multi_graphics([(gra1, (0.125, 0.65, 0.4, 0.4)), (gra2, (0.125, 0.11, 0.4, 0.4))])

../_images/036836e30105e13fde0c670d96cc0b0690748feba9e6c1d0b7f3a98fa3af16a3.png

print(f"Plaintext IoC (full):   {float(IC(plaintext_long, lalphabet)):.4f}")
print(f"Ciphertext IoC (full):  {float(IC(ciphertext_long, lalphabet)):.4f}")
print(f"English expected:       ~1.73 (scaled by 26)")
print(f"Random expected:        ~1.00 (scaled by 26)")

Plaintext IoC (full):   1.6564
Ciphertext IoC (full):  1.0489
English expected:       ~1.73 (scaled by 26)
Random expected:        ~1.00 (scaled by 26)

Experiment 4: Known-Plaintext Attack#

We demonstrate the devastating simplicity of breaking the Hill cipher with known plaintext. With just \(n\) known plaintext-ciphertext block pairs, we can recover the entire \(n \times n\) key matrix using SageMath’s built-in modular matrix inversion.

def hill_kpa_attack(known_plain_blocks, known_cipher_blocks):
    """
    Known-plaintext attack on the Hill cipher.

    Given n plaintext blocks and their corresponding ciphertext blocks,
    recover the n x n key matrix K such that c_i = K * m_i (mod 26).

    We form M = [m1 | m2 | ... | mn] and C = [c1 | c2 | ... | cn],
    then K = C * M^{-1} (mod 26).
    """
    n = len(known_plain_blocks)
    M = Matrix(Z26, known_plain_blocks).transpose()
    C = Matrix(Z26, known_cipher_blocks).transpose()
    K_recovered = C * M.inverse()
    return K_recovered

# Set up: encrypt with secret key
secret_key = Matrix(Z26, [[3, 3], [2, 5]])

# Attacker knows 2 plaintext-ciphertext pairs
plain1 = vector(Z26, [7, 4])    # "he"
plain2 = vector(Z26, [11, 15])  # "lp"
cipher1 = secret_key * plain1
cipher2 = secret_key * plain2

print("=== Known-Plaintext Attack ===")
print(f"Known pair 1: plaintext 'he' = {list(plain1)} -> ciphertext = {list(cipher1)} = '{chr(int(cipher1[0])+ord('a'))}{chr(int(cipher1[1])+ord('a'))}'")
print(f"Known pair 2: plaintext 'lp' = {list(plain2)} -> ciphertext = {list(cipher2)} = '{chr(int(cipher2[0])+ord('a'))}{chr(int(cipher2[1])+ord('a'))}'")
print()

K_recovered = hill_kpa_attack([list(plain1), list(plain2)], [list(cipher1), list(cipher2)])
print("True key:")
print(secret_key)
print("\nRecovered key:")
print(K_recovered)
print(f"\nKeys match: {K_recovered == secret_key}")
print()

# Verify by decrypting with recovered key
test_msg = "thehillcipherisbroken"
ct_test = hill_encrypt(test_msg, secret_key)
dt_test = hill_decrypt(ct_test, K_recovered)
print(f"Test message:    '{test_msg}'")
print(f"Encrypted:       '{ct_test}'")
print(f"Decrypted (rec): '{dt_test}'")
print(f"Attack success:  {dt_test.startswith(test_msg)}")

=== Known-Plaintext Attack ===
Known pair 1: plaintext 'he' = [7, 4] -> ciphertext = [7, 8] = 'hi'
Known pair 2: plaintext 'lp' = [11, 15] -> ciphertext = [0, 19] = 'at'

True key:
[3 3]
[2 5]

Recovered key:
[3 3]
[2 5]

Keys match: True

Test message:    'thehillcipherisbroken'
Encrypted:       'avhrftngrnhixwfppaqoel'
Decrypted (rec): 'thehillcipherisbrokenx'
Attack success:  True

Experiment 5: Keyspace Analysis#

We compute the fraction of \(2 \times 2\) matrices that are invertible for various moduli, and the size of \(\mathrm{GL}(2, \mathbb{Z}_m)\).

def count_invertible_2x2(m):
    """Count invertible 2x2 matrices over Z_m."""
    count = 0
    for a in range(m):
        for b in range(m):
            for c in range(m):
                for d in range(m):
                    det = (a * d - b * c) % m
                    if gcd(det, m) == 1:
                        count += 1
    return count

# Compute for small moduli
moduli = list(range(2, 20))
fractions = []
counts = []
for m in moduli:
    total = m^4
    inv = count_invertible_2x2(m)
    counts.append(inv)
    fractions.append(float(inv / total))

# Fraction of invertible matrices (green = prime modulus, blue = composite)
prime_colors = ['green' if is_prime(m) else 'steelblue' for m in moduli]
bar_chart(fractions, color=prime_colors)

../_images/6240adf37122765755d52556adc1108c6bf0586e1790ac9205c5d5e0e2ff3d0d.png

# Size of GL(2, Z_m) on log scale
log_counts = [float(log(c, 10)) if c > 0 else 0 for c in counts]
bar_chart(log_counts, color='darkorange')

../_images/8f1b6983f0efefd5c65cce24983510db0b40bce34696424979ef2f4f929a7bb2.png

# Analytical computation for m=26 using CRT: Z_26 = Z_2 x Z_13
# |GL(2, Z_p)| = (p^2 - 1)(p^2 - p) for prime p
gl2_2 = (2^2 - 1) * (2^2 - 2)    # = 3 * 2 = 6
gl2_13 = (13^2 - 1) * (13^2 - 13) # = 168 * 156 = 26208
gl2_26 = gl2_2 * gl2_13

print(f"For m=26 (standard Hill cipher):")
print(f"  Total 2x2 matrices: 26^4 = {26^4}")
print(f"  |GL(2, Z_26)| = |GL(2,Z_2)| x |GL(2,Z_13)| = {gl2_2} x {gl2_13} = {gl2_26}")
print(f"  Fraction invertible: {float(gl2_26 / 26^4):.4f}")
print(f"  Key space: {gl2_26} (about {float(log(gl2_26, 2)):.1f} bits)")
print()
# Verify with SageMath's built-in group order
print(f"  SageMath verification: |GL(2, GF(2))| = {GL(2, GF(2)).order()}")
print(f"  SageMath verification: |GL(2, GF(13))| = {GL(2, GF(13)).order()}")

For m=26 (standard Hill cipher):
  Total 2x2 matrices: 26^4 = 456976
  |GL(2, Z_26)| = |GL(2,Z_2)| x |GL(2,Z_13)| = 6 x 26208 = 157248
  Fraction invertible: 0.3441
  Key space: 157248 (about 17.3 bits)

  SageMath verification: |GL(2, GF(2))| = 6
  SageMath verification: |GL(2, GF(13))| = 26208

Experiment 6: SageMath Matrix Verification#

SageMath makes it easy to verify the algebraic properties of the Hill cipher. Let us confirm that \(K \cdot K^{-1} \equiv I \pmod{26}\) and explore the matrix structure.

# Verify algebraic properties using SageMath
K_test = Matrix(Z26, [[3, 3], [2, 5]])
K_inv = K_test.inverse()
I2 = identity_matrix(Z26, 2)

print("K =")
print(K_test)
print("\nK^(-1) =")
print(K_inv)
print("\nK * K^(-1) =")
print(K_test * K_inv)
print(f"\nK * K^(-1) == I? {K_test * K_inv == I2}")
print(f"\ndet(K) = {K_test.det()}")
print(f"det(K)^(-1) mod 26 = {K_test.det()^(-1)}")
print(f"det(K) * det(K)^(-1) = {K_test.det() * K_test.det()^(-1)}")

K =
[3 3]
[2 5]

K^(-1) =
[15 17]
[20  9]

K * K^(-1) =
[1 0]
[0 1]

K * K^(-1) == I? True

det(K) = 9
det(K)^(-1) mod 26 = 3
det(K) * det(K)^(-1) = 1

7.4 Exercises#

Exercise 7.1: Encrypt by Hand#

Encrypt the plaintext “CODE” using the \(2 \times 2\) key matrix \(K = \begin{pmatrix} 5 & 8 \\ 17 & 3 \end{pmatrix}\). Show all intermediate computations.

Hint

Convert C=2, O=14, D=3, E=4. Block 1: \(K \cdot (2, 14)^T = (5 \cdot 2 + 8 \cdot 14, 17 \cdot 2 + 3 \cdot 14)^T = (122, 76)^T \equiv (18, 24)^T\) mod 26, giving ‘sy’. Verify with: Matrix(Z26, [[5,8],[17,3]]) * vector(Z26, [2,14]).

Exercise 7.2: Find the Inverse#

Compute the inverse of \(K = \begin{pmatrix} 7 & 11 \\ 3 & 4 \end{pmatrix}\) modulo 26. Verify by multiplying \(K \cdot K^{-1} \equiv I \pmod{26}\).

Hint

\(\det(K) = 28 - 33 = -5 \equiv 21 \pmod{26}\). Since \(\gcd(21, 26) = 1\), the inverse exists. In SageMath: K = Matrix(Z26, [[7,11],[3,4]]); print(K.inverse()); print(K * K.inverse()).

Exercise 7.3: Proof of Invertibility Criterion#

Prove that a matrix \(K\) over \(\mathbb{Z}_m\) is invertible if and only if \(\gcd(\det(K), m) = 1\).

Hint

Use the adjugate formula: \(K^{-1} = \det(K)^{-1} \cdot \text{adj}(K)\). The scalar \(\det(K)^{-1}\) exists in \(\mathbb{Z}_m\) iff \(\gcd(\det(K), m) = 1\).

Exercise 7.4: Chosen-Plaintext Attack#

Design a chosen-plaintext attack on a \(3 \times 3\) Hill cipher. What 3 plaintexts would you choose to make the key recovery as simple as possible?

Hint

Choose the standard basis vectors: \((1,0,0)^T\), \((0,1,0)^T\), \((0,0,1)^T\) – corresponding to “baa”, “aba”, “aab”. Then \(M = I\), so \(K = C \cdot I^{-1} = C\). The ciphertext blocks directly reveal the columns of the key. In SageMath: identity_matrix(Z26, 3) is the plaintext matrix.

Exercise 7.5 (Challenge): Counting Valid Keys#

For the \(2 \times 2\) Hill cipher over \(\mathbb{Z}_{26}\), compute \(|\mathrm{GL}(2, \mathbb{Z}_{26})|\) analytically using the Chinese Remainder Theorem: \(\mathbb{Z}_{26} \cong \mathbb{Z}_2 \times \mathbb{Z}_{13}\).

Hint

By CRT, \(\mathrm{GL}(2, \mathbb{Z}_{26}) \cong \mathrm{GL}(2, \mathbb{Z}_2) \times \mathrm{GL}(2, \mathbb{Z}_{13})\). For a prime \(p\): \(|\mathrm{GL}(2, \mathbb{Z}_p)| = (p^2 - 1)(p^2 - p)\). Verify in SageMath: GL(2, GF(2)).order() and GL(2, GF(13)).order().

Exercise 7.6: Exploring n-gram Distributions#

Explore the properties of the \(n\)-grams in the Hill cipher. Verify that there is no reasonable language structure in the ciphertext, and conclude that the Hill cipher is frequency analysis resistant.

Hint

Compute digram and trigram frequencies for both plaintext and Hill-encrypted text. Compare visually using bar_chart(). The ciphertext n-gram distribution should appear nearly uniform.

7.5 Summary and Key Takeaways#

In this chapter, we have studied the Hill cipher, the first polygraphic cipher based on linear algebra:

Matrix encryption. The Hill cipher encrypts \(n\)-letter blocks via multiplication by an \(n \times n\) key matrix over \(\mathbb{Z}_{26}\), completely destroying single-letter frequency statistics.
Invertibility. A key matrix is valid iff its determinant is coprime to 26. The modular inverse is computed via the adjugate formula and the extended Euclidean algorithm.
Known-plaintext vulnerability. The cipher is completely broken by \(n\) known plaintext-ciphertext pairs – a consequence of the linearity of the encryption function.
Frequency resistance but linear weakness. While the Hill cipher resists frequency analysis, its linear structure makes it vulnerable to algebraic attacks.
SageMath advantages. SageMath’s native support for IntegerModRing(26), MatrixSpace, and .inverse() eliminates the need for manual modular arithmetic implementations.

Tip

The Hill cipher perfectly illustrates a fundamental principle of cryptographic design: algebraic structure that enables efficient encryption also enables efficient cryptanalysis. Modern ciphers (like AES) carefully combine linear operations (MixColumns) with nonlinear operations (SubBytes) to achieve both efficiency and security.

What comes next: In Chapter 8, we study the Playfair cipher, another polygraphic system that operates on digrams using a different geometric approach.

7.6 References#

Hill, L. S. (1929). “Cryptography in an Algebraic Alphabet.” The American Mathematical Monthly, 36(6), 306–312. The original paper introducing the Hill cipher – elegant and readable, showing how linear algebra naturally applies to cryptography.
Hill, L. S. (1931). “Concerning Certain Linear Transformation Apparatus of Cryptography.” The American Mathematical Monthly, 38(3), 135–154. Hill’s follow-up paper with more detailed analysis and larger block sizes.
Overbey, J., Traves, W., and Wojdylo, J. (2005). “On the Keyspace of the Hill Cipher.” Cryptologia, 29(1), 59–72. Detailed analysis of the keyspace size |GL(n, Z_m)| for various n and m.
Stinson, D. R. and Paterson, M. (2018). Cryptography: Theory and Practice, 4th ed. CRC Press. Chapter 2 provides a modern textbook treatment of the Hill cipher and its cryptanalysis.
Eisenberg, M. (1999). “Hill Ciphers and Modular Linear Algebra.” Unpublished lecture notes, University of British Columbia. Clear exposition of the linear algebra behind the Hill cipher, including the CRT decomposition of GL(n, Z_m).
Ciphertext-only attack on Hill cipher. IACR ePrint 2015/802.
Linear cryptanalysis of Hill cipher. NKU Course Notes.

Chapter 7: The Hill Cipher (SageMath)

Contents

Chapter 7: The Hill Cipher (SageMath)#

7.1 Introduction: Linear Algebra Enters Cryptography#

Further Readings#

7.2 Formal Definitions#

SageMath Implementation#

7.3 Key Results#

Experiment 1: Step-by-Step 2x2 Encryption#

Experiment 2: Frequency Analysis#

Experiment 3: Index of Coincidence Comparison#

Experiment 4: Known-Plaintext Attack#

Experiment 5: Keyspace Analysis#

Experiment 6: SageMath Matrix Verification#

7.4 Exercises#

Exercise 7.1: Encrypt by Hand#

Exercise 7.2: Find the Inverse#

Exercise 7.3: Proof of Invertibility Criterion#

Exercise 7.4: Chosen-Plaintext Attack#

Exercise 7.5 (Challenge): Counting Valid Keys#

Exercise 7.6: Exploring n-gram Distributions#

7.5 Summary and Key Takeaways#

7.6 References#