Chapter 25: The RSA Cryptosystem (SageMath)#
Python Version Available
This is the SageMath companion for Chapter 25: The RSA Cryptosystem. The Python/NumPy version contains identical theory with implementations using only NumPy.
25.1 Historical Context: The Public-Key Revolution#
In 1977, Ron Rivest, Adi Shamir, and Leonard Adleman at MIT published a paper that would transform cryptography forever: “A Method for Obtaining Digital Signatures and Public-Key Cryptosystems.” Their system, known as RSA, was the first practical implementation of the public-key concept introduced by Diffie and Hellman just one year earlier.
The RSA cryptosystem’s security rests on the integer factorization problem: while it is easy to multiply two large primes \(p\) and \(q\) to get \(n = pq\), it appears to be computationally infeasible to recover \(p\) and \(q\) from \(n\) when the primes are sufficiently large (typically 1024+ bits each).
RSA has become one of the most widely deployed cryptographic algorithms in history:
TLS/SSL: RSA key exchange and signatures secure most HTTPS connections.
PGP/GPG: RSA encryption is used for email security.
Code signing: Software updates are authenticated using RSA signatures.
Smart cards: RSA is used in banking and identity cards worldwide.
Tip
Public-key vs. symmetric cryptography. Unlike the substitution and block ciphers we studied in Parts I–VIII, RSA uses different keys for encryption and decryption. The encryption key is public — anyone can encrypt — while only the holder of the private decryption key can decrypt. This eliminates the key distribution problem that plagued symmetric cryptography for millennia.
25.2 Formal Definitions#
Definition 25.1 (RSA Key Generation)
Choose two distinct large primes \(p\) and \(q\).
Compute \(n = pq\) (the RSA modulus).
Compute \(\phi(n) = (p-1)(q-1)\) (Euler’s totient).
Choose \(e\) with \(1 < e < \phi(n)\) and \(\gcd(e, \phi(n)) = 1\) (the public exponent).
Compute \(d = e^{-1} \bmod \phi(n)\) (the private exponent).
Public key: \((n, e)\). Private key: \((n, d)\).
Definition 25.2 (RSA Encryption and Decryption)
For a message \(m \in \{0, 1, \ldots, n-1\}\):
Encryption: \(c = m^e \bmod n\)
Decryption: \(m = c^d \bmod n\)
Correctness follows from Euler’s theorem: since \(ed \equiv 1 \pmod{\phi(n)}\), we have \(c^d = m^{ed} = m^{1 + k\phi(n)} = m \cdot (m^{\phi(n)})^k \equiv m \pmod{n}\) whenever \(\gcd(m, n) = 1\).
Theorem 25.1 (RSA Correctness)
For any \(m \in \mathbb{Z}_n\) (including \(m\) that share a factor with \(n\)), the RSA decryption recovers the original message:
Proof
When \(\gcd(m, n) = 1\), this follows directly from Euler’s theorem. For the case \(\gcd(m, n) > 1\) (i.e., \(p | m\) or \(q | m\)), we use the Chinese Remainder Theorem. If \(p | m\), then \(m^{ed} \equiv 0 \equiv m \pmod{p}\). Also \(\gcd(m, q) = 1\), so \(m^{ed} \equiv m \pmod{q}\) by Fermat’s little theorem. By CRT, \(m^{ed} \equiv m \pmod{n}\).
```{danger}
**Textbook RSA is insecure!** The basic RSA described above is deterministic — the same plaintext always encrypts to the same ciphertext. This violates semantic security. In practice, RSA must be used with padding schemes like OAEP (Optimal Asymmetric Encryption Padding) to achieve CCA2 security.
25.3 Implementation in SageMath#
SageMath provides powerful built-in functions that make RSA implementation elegant:
random_prime(N)— generate a random prime up to \(N\)inverse_mod(e, phi)— compute modular inversepower_mod(m, e, n)— modular exponentiationeuler_phi(n)— Euler’s totient functionfactor(n)— integer factorizationgcd(a, b)— greatest common divisor
# RSA Key Generation in SageMath
def rsa_keygen(bits=512):
"""Generate an RSA key pair with the given bit length."""
p = random_prime(2^bits, lbound=2^(bits-1))
q = random_prime(2^bits, lbound=2^(bits-1))
while p == q:
q = random_prime(2^bits, lbound=2^(bits-1))
n = p * q
phi_n = (p - 1) * (q - 1)
# Standard public exponent
e = 65537
assert gcd(e, phi_n) == 1, "e and phi(n) must be coprime"
d = inverse_mod(e, phi_n)
return {
'public': (n, e),
'private': (n, d),
'p': p,
'q': q,
'phi_n': phi_n
}
# Generate a key pair
set_random_seed(42)
keys = rsa_keygen(bits=256) # Small for demonstration
n, e = keys['public']
_, d = keys['private']
print(f"p = {keys['p']}")
print(f"q = {keys['q']}")
print(f"n = {n}")
print(f"e = {e}")
print(f"d = {d}")
print(f"n has {n.nbits()} bits")
p = 88312755628949696921057815284243931625952258705415727922489452061668346578827
q = 112077575000807757776400520499314073774445401420501142644045698123153870166821
n = 9897879492531617143011769329237131985492735349086461903213596892112367810233417901867475608944642271717165285107375938988069040632095656309380355116498967
e = 65537
d = 2306640729502958460491306482833189142231572195654325535922933081056993661072248211753700149378304272926344555259295854901836753730104128468870918320611753
n has 512 bits
# RSA Encryption and Decryption
def rsa_encrypt(m, public_key):
"""Encrypt message m using public key (n, e)."""
n, e = public_key
return power_mod(m, e, n)
def rsa_decrypt(c, private_key):
"""Decrypt ciphertext c using private key (n, d)."""
n, d = private_key
return power_mod(c, d, n)
# Encrypt a message
message = 123456789
ciphertext = rsa_encrypt(message, keys['public'])
decrypted = rsa_decrypt(ciphertext, keys['private'])
print(f"Message: {message}")
print(f"Ciphertext: {ciphertext}")
print(f"Decrypted: {decrypted}")
print(f"Match: {message == decrypted}")
Message: 123456789
Ciphertext: 7764045828994215788209055946287661028944564657144108537609437591194216972618633830563630170439080271721555942267113609710823647337817699683686149602518399
Decrypted: 123456789
Match: True
Encrypting Text Messages#
# Encrypt a text message
def text_to_int(text):
"""Convert text to an integer."""
return int.from_bytes(text.encode('utf-8'), 'big')
def int_to_text(num):
"""Convert integer back to text."""
length = (num.bit_length() + 7) // 8
return int(num).to_bytes(length, 'big').decode('utf-8')
# Encrypt a short message
msg = "Hello, RSA!"
m = text_to_int(msg)
assert m < n, "Message must be smaller than modulus"
c = rsa_encrypt(m, keys['public'])
dec = rsa_decrypt(c, keys['private'])
recovered = int_to_text(dec)
print(f"Original: '{msg}'")
print(f"As integer: {m}")
print(f"Encrypted: {c}")
print(f"Decrypted: '{recovered}'")
Original: 'Hello, RSA!'
As integer: 87521618088895491219865889
Encrypted: 8411626285046101703422961620287752855896373842971788481537083746395586241376976653489906727334177770058395804913600071202438177765579766270224621301761202
Decrypted: 'Hello, RSA!'
25.4 CRT Decryption Speedup#
Theorem 25.2 (CRT Speedup for RSA Decryption)
By the Chinese Remainder Theorem, RSA decryption can be performed approximately 4x faster by computing:
and then combining: \(m = \text{CRT}(m_p, m_q, p, q)\).
# CRT-accelerated decryption
def rsa_decrypt_crt(c, p, q, d):
"""RSA decryption using CRT for ~4x speedup."""
dp = d % (p - 1)
dq = d % (q - 1)
mp = power_mod(c, dp, p)
mq = power_mod(c, dq, q)
return crt([mp, mq], [p, q])
# Verify CRT decryption gives the same result
dec_crt = rsa_decrypt_crt(ciphertext, keys['p'], keys['q'], d)
print(f"Standard decryption: {decrypted}")
print(f"CRT decryption: {dec_crt}")
print(f"Match: {decrypted == dec_crt}")
Standard decryption: 123456789
CRT decryption: 123456789
Match: True
# Timing comparison
import time
# Generate larger keys for meaningful timing
set_random_seed(100)
big_keys = rsa_keygen(bits=1024)
big_n, big_e = big_keys['public']
_, big_d = big_keys['private']
test_msg = ZZ.random_element(2^100, 2^200)
test_ct = rsa_encrypt(test_msg, big_keys['public'])
# Time standard decryption
trials = 100
t0 = time.time()
for _ in range(trials):
rsa_decrypt(test_ct, big_keys['private'])
std_time = (time.time() - t0) / trials
# Time CRT decryption
t0 = time.time()
for _ in range(trials):
rsa_decrypt_crt(test_ct, big_keys['p'], big_keys['q'], big_d)
crt_time = (time.time() - t0) / trials
print(f"Standard decryption: {float(std_time*1000):.3f} ms")
print(f"CRT decryption: {float(crt_time*1000):.3f} ms")
print(f"Speedup: {float(std_time/crt_time):.2f}x")
Standard decryption: 6.120 ms
CRT decryption: 2.508 ms
Speedup: 2.44x
25.5 Common Attacks on RSA#
Small Public Exponent Attack#
Definition 25.3 (Håstad’s Broadcast Attack)
If the same message \(m\) is encrypted with the same small exponent \(e\) to \(e\) different recipients (with different moduli \(n_1, \ldots, n_e\)), then \(m\) can be recovered using the Chinese Remainder Theorem without factoring any \(n_i\).
# Hastad's broadcast attack (e=3)
set_random_seed(2024)
# Three recipients with different moduli
keys1 = rsa_keygen(bits=256)
keys2 = rsa_keygen(bits=256)
keys3 = rsa_keygen(bits=256)
# Same message encrypted to all three with e=3
# (we'd need e=3 for this attack; using a contrived example)
secret_msg = ZZ.random_element(2^50)
n1, _ = keys1['public']
n2, _ = keys2['public']
n3, _ = keys3['public']
c1 = power_mod(secret_msg, 3, n1)
c2 = power_mod(secret_msg, 3, n2)
c3 = power_mod(secret_msg, 3, n3)
# CRT to recover m^3
m_cubed = crt([c1, c2, c3], [n1, n2, n3])
# Take integer cube root
recovered = ZZ(m_cubed).nth_root(3)
print(f"Original message: {secret_msg}")
print(f"Recovered message: {recovered}")
print(f"Match: {secret_msg == recovered}")
Original message: 768368873937730
Recovered message: 768368873937730
Match: True
Wiener’s Attack on Small Private Exponents#
Theorem 25.3 (Wiener’s Attack)
If the private exponent \(d < \frac{1}{3} n^{1/4}\), then \(d\) can be recovered efficiently from \((n, e)\) using the continued fraction expansion of \(e/n\).
# Wiener's attack using continued fractions
def wiener_attack(n, e):
"""Attempt to recover d using continued fraction expansion of e/n."""
# Compute continued fraction convergents of e/n
cf = continued_fraction(e / n)
convergents = cf.convergents()
for conv in convergents:
k = conv.numerator()
d_candidate = conv.denominator()
if k == 0:
continue
# Check if (ed - 1) / k is an integer (= phi(n))
if (e * d_candidate - 1) % k != 0:
continue
phi_candidate = (e * d_candidate - 1) // k
# phi(n) = n - p - q + 1, so p + q = n - phi + 1
s = n - phi_candidate + 1
# p and q are roots of x^2 - s*x + n = 0
discriminant = s^2 - 4*n
if discriminant < 0:
continue
sqrt_disc = isqrt(discriminant)
if sqrt_disc^2 == discriminant:
p = (s + sqrt_disc) // 2
q = (s - sqrt_disc) // 2
if p * q == n:
return d_candidate, p, q
return None
# Create a vulnerable RSA instance with small d
p = random_prime(2^128, lbound=2^127)
q = random_prime(2^128, lbound=2^127)
n_vuln = p * q
phi_vuln = (p-1) * (q-1)
# Choose a small d
d_small = random_prime(isqrt(isqrt(n_vuln)) // 4)
e_vuln = inverse_mod(d_small, phi_vuln)
print(f"n has {n_vuln.nbits()} bits")
print(f"True d = {d_small} ({d_small.nbits()} bits)")
result = wiener_attack(n_vuln, e_vuln)
if result:
d_recovered, p_recovered, q_recovered = result
print(f"Recovered d = {d_recovered}")
print(f"Attack successful: {d_recovered == d_small}")
else:
print("Attack failed (d may be too large)")
n has 256 bits
True d = 1269808153351150757 (61 bits)
Recovered d = 1269808153351150757
Attack successful: True
25.6 Factoring Algorithms#
The security of RSA ultimately rests on the difficulty of factoring \(n = pq\). SageMath has built-in factoring capabilities that make it easy to explore the limits of RSA security.
# Factoring in SageMath
# Trial division (works for small factors)
n_small = 1000000007 * 1000000009
print(f"n = {n_small}")
print(f"factor(n) = {factor(n_small)}")
# Fermat's factoring method
def fermat_factor(n):
"""Fermat's factoring method: works well when p and q are close."""
a = isqrt(n)
if a * a == n:
return a, a
a += 1
while True:
b2 = a*a - n
b = isqrt(b2)
if b * b == b2:
return a + b, a - b
a += 1
# Test with close primes (vulnerable!)
p_close = next_prime(10^30)
q_close = next_prime(p_close)
n_close = p_close * q_close
print(f"\nClose primes: p={p_close}, q={q_close}")
print(f"n = {n_close}")
import time
t0 = time.time()
p_found, q_found = fermat_factor(n_close)
t1 = time.time()
print(f"Fermat factoring: p={p_found}, q={q_found}")
print(f"Time: {float((t1-t0)*1000):.2f} ms")
n = 1000000016000000063
factor(n) = 1000000007 * 1000000009
Close primes: p=1000000000000000000000000000057, q=1000000000000000000000000000099
n = 1000000000000000000000000000156000000000000000000000000005643
Fermat factoring: p=1000000000000000000000000000099, q=1000000000000000000000000000057
Time: 0.03 ms
# Pollard's p-1 method
def pollard_p1(n, B=10^6):
"""Pollard's p-1 factoring: works when p-1 is B-smooth."""
a = ZZ(2)
for p in primes(B):
e = ZZ(floor(log(RR(B)) / log(RR(p)))) # floor(log_p(B))
a = power_mod(a, p^e, n)
d = gcd(a - 1, n)
if 1 < d < n:
return d, n // d
return None
# Test with a p-1 smooth prime
p_smooth = 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29 + 1
while not is_prime(p_smooth):
p_smooth += 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29
q_safe = random_prime(10^20, lbound=10^19)
n_smooth = p_smooth * q_safe
print(f"n = {n_smooth}")
print(f"p-1 = {factor(p_smooth - 1)}")
result = pollard_p1(n_smooth, B=100)
if result:
print(f"Pollard p-1 found: {result[0]} × {result[1]}")
else:
print("Pollard p-1 failed with B=100")
n = 5219436539619139192156027590631
p-1 = 2 * 3 * 5 * 7 * 11^2 * 13 * 17 * 19 * 23 * 29
Pollard p-1 failed with B=100
Experiment: Key Size vs. Factoring Time#
# How factoring difficulty scales with key size
import time
bit_sizes = [32, 40, 48, 56, 64, 72, 80]
times = []
for bits in bit_sizes:
p = random_prime(2^(bits//2), lbound=2^(bits//2 - 1))
q = random_prime(2^(bits//2), lbound=2^(bits//2 - 1))
n = p * q
t0 = time.time()
f = factor(n)
t1 = time.time()
times.append(t1 - t0)
print(f" {bits}-bit n: factored in {float(times[-1]*1000):.2f} ms -> {f}")
# Plot
show(list_plot(list(zip(bit_sizes, [max(t, 1e-6) for t in times])),
plotjoined=True, marker='o', scale='semilogy',
axes_labels=['Key size (bits)', 'Factoring time (seconds)'],
title='RSA Factoring Time vs Key Size'))
32-bit n: factored in 0.15 ms -> 34913 * 55339
40-bit n: factored in 0.17 ms -> 586009 * 909863
48-bit n: factored in 0.26 ms -> 12851093 * 13378961
56-bit n: factored in 0.63 ms -> 154693849 * 260449997
64-bit n: factored in 0.16 ms -> 2222586523 * 2355447373
72-bit n: factored in 13.08 ms -> 45587097553 * 63983094061
80-bit n: factored in 0.73 ms -> 551948169691 * 941541834269
RSA Digital Signatures#
Definition 25.4 (RSA Signature)
To sign a message \(m\): compute \(s = m^d \bmod n\) using the private key.
To verify a signature: check that \(s^e \bmod n = m\) using the public key.
This works because signing is essentially “decryption” and verification is “encryption.” The security of RSA signatures relies on the same hardness assumption as RSA encryption.
In practice, one signs a hash of the message rather than the message itself: \(s = H(m)^d \bmod n\). This prevents existential forgery attacks.
# RSA Signatures
def rsa_sign(m, private_key):
"""Sign a message using RSA."""
n, d = private_key
return power_mod(m, d, n)
def rsa_verify(m, signature, public_key):
"""Verify an RSA signature."""
n, e = public_key
return power_mod(signature, e, n) == m
# Demo
message = 42
sig = rsa_sign(message, keys['private'])
is_valid = rsa_verify(message, sig, keys['public'])
print(f"Message: {message}")
print(f"Signature: {sig}")
print(f"Valid: {is_valid}")
# Tampered message
is_valid_tampered = rsa_verify(message + 1, sig, keys['public'])
print(f"Tampered message valid: {is_valid_tampered}")
Message: 42
Signature: 9823044510811283207637496131386574239834560120160062083813986749317920141924008321073114184878169672877787237063444137913167514767358829804195269180964091
Valid: True
Tampered message valid: False
The Multiplicative Homomorphism#
Theorem 25.4 (RSA Homomorphism)
Textbook RSA is multiplicatively homomorphic: for any two messages \(m_1, m_2\):
This is because \((m_1^e)(m_2^e) = (m_1 m_2)^e \bmod n\).
This property is both a feature (useful for homomorphic computation) and a vulnerability (enables chosen-ciphertext attacks on textbook RSA).
# Demonstrate multiplicative homomorphism
m1 = 7
m2 = 11
n, e = keys['public']
c1 = rsa_encrypt(m1, keys['public'])
c2 = rsa_encrypt(m2, keys['public'])
# Product of ciphertexts
c_product = (c1 * c2) % n
# Decrypt the product
dec_product = rsa_decrypt(c_product, keys['private'])
print(f"m1 = {m1}, m2 = {m2}")
print(f"m1 * m2 = {m1 * m2}")
print(f"Dec(Enc(m1) * Enc(m2)) = {dec_product}")
print(f"Homomorphism holds: {dec_product == m1 * m2}")
m1 = 7, m2 = 11
m1 * m2 = 77
Dec(Enc(m1) * Enc(m2)) = 77
Homomorphism holds: True
25.5b Why Padding Matters#
Textbook RSA has several critical weaknesses that padding schemes address:
Deterministic encryption: The same plaintext always produces the same ciphertext, leaking information about message equality.
Multiplicative homomorphism: An attacker can manipulate ciphertexts without knowing the plaintexts.
Small message attack: If \(m < n^{1/e}\), then \(m^e < n\) and no modular reduction occurs — the attacker can simply take the \(e\)-th root.
Related message attack: Coppersmith showed that if two messages differ by a known polynomial relation, both can be recovered.
OAEP (Optimal Asymmetric Encryption Padding)
In practice, RSA is always used with OAEP (Bellare & Rogaway, 1994): the message is padded with randomness and hashed before encryption. This achieves:
IND-CCA2 security (indistinguishable under adaptive chosen-ciphertext attack)
Protection against all the above attacks
# Small message attack: m^e < n
n, e = keys['public']
small_msg = 42 # Very small message
c = rsa_encrypt(small_msg, keys['public'])
# If m^e < n, we can just take the e-th root
m_cubed = small_msg^e
if m_cubed < n:
# The ciphertext IS just m^e (no modular reduction happened)
recovered = ZZ(c).nth_root(e)
print(f"Small message attack: m^e < n? {m_cubed < n}")
print(f"Recovered: {recovered}")
else:
print(f"m^e >= n, so modular reduction occurred")
print(f"m^e has {m_cubed.nbits()} bits, n has {n.nbits()} bits")
# With a slightly larger message
big_msg = ZZ.random_element(n // 2, n - 1)
c_big = rsa_encrypt(big_msg, keys['public'])
print(f"\nLarger message: m has {big_msg.nbits()} bits")
print(f"m^e >= n: {big_msg^e >= n} (safe from root attack)")
m^e >= n, so modular reduction occurred
m^e has 353397 bits, n has 512 bits
Larger message: m has 511 bits
m^e >= n: True (safe from root attack)
25.6b RSA Key Size Recommendations#
The security of RSA depends on the difficulty of factoring the modulus \(n\). As factoring algorithms improve and computing power grows, minimum key sizes must increase:
Year |
Minimum RSA Key Size |
Equivalent Symmetric Security |
|---|---|---|
2000 |
1024 bits |
~80 bits |
2015 |
2048 bits |
~112 bits |
2024 |
3072 bits |
~128 bits |
2030+ |
4096+ bits |
~140+ bits |
Danger
Quantum threat: Shor’s algorithm (Chapter 38) can factor \(n\) in polynomial time on a quantum computer. This means RSA at any key size will be insecure once large-scale quantum computers exist. NIST recommends transitioning to post-quantum algorithms (Parts XIV–XV) by 2035.
Current Records
The largest RSA modulus factored publicly is RSA-250 (829 bits), achieved in 2020 using the General Number Field Sieve (GNFS). This required approximately 2,700 core-years of computation. RSA-2048 (the standard today) is estimated to require \(\sim 10^{20}\) core-years with current algorithms.
# Estimate GNFS factoring complexity
# GNFS heuristic complexity: exp(c * (ln n)^(1/3) * (ln ln n)^(2/3))
# where c ≈ 1.923 for the number field sieve
from sage.functions.log import log as ln
c_gnfs = 1.923
for bits in [512, 768, 1024, 2048, 3072, 4096]:
n_val = 2^bits
ln_n = float(ln(n_val))
ln_ln_n = float(ln(ln_n))
# Number of operations (log2)
log2_ops = float(c_gnfs * ln_n^(1/3) * ln_ln_n^(2/3) / ln(2))
print(f" RSA-{int(bits):4d}: ~2^{float(log2_ops):.1f} operations")
RSA- 512: ~2^63.9 operations
RSA- 768: ~2^76.5 operations
RSA-1024: ~2^inf operations
RSA-2048: ~2^inf operations
RSA-3072: ~2^inf operations
RSA-4096: ~2^inf operations
Experiment: RSA Encryption/Decryption Timing by Key Size#
# Timing RSA operations at different key sizes
import time
bit_sizes = [512, 1024, 2048]
enc_times = []
dec_times = []
for bits in bit_sizes:
set_random_seed(bits)
k = rsa_keygen(bits=bits//2)
test_n, test_e = k['public']
_, test_d = k['private']
test_m = ZZ.random_element(2, test_n - 1)
# Time encryption
trials = 50
t0 = time.time()
for _ in range(trials):
c = power_mod(test_m, test_e, test_n)
enc_times.append((time.time() - t0) / trials * 1000)
# Time decryption
t0 = time.time()
for _ in range(trials):
m = power_mod(c, test_d, test_n)
dec_times.append((time.time() - t0) / trials * 1000)
print(f"RSA-{bits}: encrypt={float(enc_times[-1]):.3f}ms, decrypt={float(dec_times[-1]):.3f}ms")
# Plot
enc_plot = list_plot(list(zip(bit_sizes, enc_times)), plotjoined=True,
marker='o', color='blue', legend_label='Encryption')
dec_plot = list_plot(list(zip(bit_sizes, dec_times)), plotjoined=True,
marker='s', color='red', legend_label='Decryption')
show(enc_plot + dec_plot, axes_labels=['Key size (bits)', 'Time (ms)'],
title='RSA Operation Timing', scale='semilogy')
RSA-512: encrypt=0.008ms, decrypt=0.316ms
RSA-1024: encrypt=0.020ms, decrypt=1.069ms
RSA-2048: encrypt=0.031ms, decrypt=5.660ms
25.7 Exercises#
Exercise 25.1 (Basic RSA). Choose \(p = 61\), \(q = 53\), and \(e = 17\). Compute \(n\), \(\phi(n)\), and \(d\) by hand (verify with SageMath). Encrypt and decrypt the message \(m = 65\).
Hint
\(n = 3233\), \(\phi(n) = 3120\). Use inverse_mod(17, 3120) to find \(d\).
Exercise 25.2 (CRT). Implement RSA decryption using CRT and verify it gives the same result as standard decryption. Measure the speedup for 2048-bit keys.
Hint
Use SageMath’s crt() function. The speedup should be approximately 4x.
Exercise 25.3 (Common Modulus Attack). If two users share the same \(n\) but use different public exponents \(e_1, e_2\) with \(\gcd(e_1, e_2) = 1\), and the same message \(m\) is encrypted to both, show how to recover \(m\) without factoring \(n\).
Hint
Use xgcd(e1, e2) to find \(a, b\) such that \(ae_1 + be_2 = 1\). Then \(c_1^a \cdot c_2^b \equiv m \pmod{n}\).
Exercise 25.4 (Wiener’s Attack). Generate RSA keys where \(d < n^{1/4}/3\) and verify that Wiener’s attack recovers the key. What is the largest \(d\) for which the attack succeeds?
Hint
Use continued_fraction(e/n).convergents() and test each convergent.
Exercise 25.5 (Challenge: Pollard’s Rho). Implement Pollard’s \(\rho\) factoring algorithm using the iteration \(x_{n+1} = x_n^2 + c \bmod n\) and Floyd’s cycle detection. Factor several 60-80 bit semiprimes.
Hint
Start with x = y = 2, c = 1. Update x once and y twice per step. Check gcd(abs(x-y), n) each iteration.
25.6c The Number Field Sieve#
The General Number Field Sieve (GNFS) is the fastest known algorithm for factoring large integers. Its heuristic running time is:
This is sub-exponential but super-polynomial — faster than trial division (\(O(\sqrt{n})\)) but much slower than polynomial time. The GNFS consists of several stages:
Polynomial selection: Choose two polynomials \(f(x)\) and \(g(x)\) with a common root modulo \(n\).
Sieving: Find many pairs \((a, b)\) such that both \(f(a/b) \cdot b^{\deg f}\) and \(g(a/b) \cdot b^{\deg g}\) are smooth (have only small prime factors).
Linear algebra: Build a large sparse matrix over \(\mathbb{F}_2\) from the smooth relations and find dependencies using the Block Lanczos or Block Wiedemann algorithm.
Square root: Compute square roots in the number field to obtain a congruence \(x^2 \equiv y^2 \pmod{n}\), yielding the factorization \(\gcd(x \pm y, n)\).
Factoring Records
Year |
Number |
Digits |
Bits |
Method |
|---|---|---|---|---|
1991 |
RSA-100 |
100 |
330 |
QS |
1999 |
RSA-155 |
155 |
512 |
GNFS |
2005 |
RSA-200 |
200 |
663 |
GNFS |
2009 |
RSA-768 |
232 |
768 |
GNFS |
2020 |
RSA-250 |
250 |
829 |
GNFS |
# Pollard's rho factoring method in SageMath
def pollard_rho(n, max_iter=10^6):
"""Pollard's rho factoring algorithm using Floyd's cycle detection."""
if n % 2 == 0:
return 2, n // 2
x = ZZ(2)
y = ZZ(2)
c = ZZ(1)
d = ZZ(1)
f = lambda x: (x^2 + c) % n
while d == 1:
x = f(x)
y = f(f(y))
d = gcd(abs(x - y), n)
if d != n:
return d, n // d
return None
# Test on moderate-size composites
import time
for bits in [40, 50, 60, 70]:
set_random_seed(bits)
p = random_prime(2^(bits//2), lbound=2^(bits//2 - 1))
q = random_prime(2^(bits//2), lbound=2^(bits//2 - 1))
n = p * q
t0 = time.time()
result = pollard_rho(n)
elapsed = time.time() - t0
if result:
print(f"{bits}-bit n: {result[0]} × {result[1]} ({float(elapsed*1000):.1f}ms)")
else:
print(f"{bits}-bit n: failed ({float(elapsed*1000):.1f}ms)")
40-bit n: 873527 × 699541 (0.3ms)
50-bit n: 31842127 × 17744647 (1.9ms)
60-bit n: 985137473 × 582900881 (2.8ms)
70-bit n: 20975848559 × 18087034963 (192.9ms)
SageMath’s Built-in Factoring#
SageMath wraps multiple factoring algorithms and automatically selects the best one based on the input size. The factor() function is remarkably powerful for educational exploration.
# SageMath's factor() automatically selects the best algorithm
import time
# Factor some famous numbers
famous = {
"Mersenne M67 (not prime!)": 2^67 - 1,
"Fermat F5": 2^(2^5) + 1,
"RSA-59 (first RSA challenge)": 71641520761751435455133616475667090434063332228247871795429,
}
for name, n in famous.items():
t0 = time.time()
f = factor(n)
elapsed = time.time() - t0
print(f"{name}:")
print(f" n = {n}")
print(f" = {f}")
print(f" ({float(elapsed*1000):.1f}ms)\n")
Mersenne M67 (not prime!):
n = 147573952589676412927
= 193707721 * 761838257287
(0.4ms)
Fermat F5:
n = 4294967297
= 641 * 6700417
(0.0ms)
RSA-59 (first RSA challenge):
n = 71641520761751435455133616475667090434063332228247871795429
= 200429218120815554269743635437 * 357440504101388365610785389017
(2055.1ms)
25.6d Mathematical Foundations#
The correctness and security of RSA rest on several fundamental results from number theory, all of which are built into SageMath:
Theorem 25.5 (Euler’s Theorem)
If \(\gcd(a, n) = 1\), then \(a^{\phi(n)} \equiv 1 \pmod{n}\).
This generalizes Fermat’s Little Theorem (\(a^{p-1} \equiv 1 \pmod{p}\) for prime \(p\)) and is the foundation of RSA correctness.
Theorem 25.6 (Chinese Remainder Theorem)
If \(n = pq\) with \(\gcd(p, q) = 1\), then:
This isomorphism is the basis for CRT decryption and for understanding the structure of \(\mathbb{Z}_n^*\).
Theorem 25.7 (Fundamental Theorem of Arithmetic)
Every integer \(n > 1\) has a unique prime factorization \(n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}\).
The assumed difficulty of finding this factorization for large semiprimes is the security assumption behind RSA.
# Verify Euler's theorem
p, q = keys['p'], keys['q']
n = p * q
phi_n = (p - 1) * (q - 1) # Avoid euler_phi(n) which requires factoring
# For a random a coprime to n
a = ZZ.random_element(2, n)
while gcd(a, n) != 1:
a = ZZ.random_element(2, n)
result = power_mod(a, phi_n, n)
print(f"a = {a}")
print(f"phi(n) = {phi_n}")
print(f"a^phi(n) mod n = {result}")
print(f"Euler's theorem holds: {result == 1}")
# Verify CRT isomorphism
print(f"\nCRT decomposition:")
print(f"n = {n}")
print(f"Zn* has order phi(n) = {phi_n}")
print(f"Zp* has order p-1 = {p-1}")
print(f"Zq* has order q-1 = {q-1}")
print(f"(p-1)(q-1) = {(p-1)*(q-1)} = phi(n)? {(p-1)*(q-1) == phi_n}")
a = 1960024513001890534185541621043399983485910448195501724639771013179433964921995779364617489326761514627244896593454810323967439775273049717673999129830238
phi(n) = 9897879492531617143011769329237131985492735349086461903213596892112367810233217511536845851489944813381381727101975541327943123761529121159195532899753320
a^phi(n) mod n = 1
Euler's theorem holds: True
CRT decomposition:
n = 9897879492531617143011769329237131985492735349086461903213596892112367810233417901867475608944642271717165285107375938988069040632095656309380355116498967
Zn* has order phi(n) = 9897879492531617143011769329237131985492735349086461903213596892112367810233217511536845851489944813381381727101975541327943123761529121159195532899753320
Zp* has order p-1 = 88312755628949696921057815284243931625952258705415727922489452061668346578826
Zq* has order q-1 = 112077575000807757776400520499314073774445401420501142644045698123153870166820
(p-1)(q-1) = 9897879492531617143011769329237131985492735349086461903213596892112367810233217511536845851489944813381381727101975541327943123761529121159195532899753320 = phi(n)? True
25.8 Summary#
Concept |
Key Idea |
|---|---|
RSA |
Public-key cryptosystem based on integer factorization |
Key generation |
Choose primes \(p, q\); compute \(n = pq\), \(\phi(n)\), \(e\), \(d\) |
Encryption |
\(c = m^e \bmod n\) |
Decryption |
\(m = c^d \bmod n\) |
CRT speedup |
~4x faster decryption using Chinese Remainder Theorem |
Håstad’s attack |
Broadcast with small \(e\) broken via CRT |
Wiener’s attack |
Small \(d\) recovered via continued fractions |
Textbook RSA |
Deterministic, requires padding (OAEP) for security |
SageMath tools introduced:
random_prime(),next_prime(),is_prime()— prime generationpower_mod(),inverse_mod(),euler_phi()— modular arithmeticfactor(),gcd(),xgcd()— factoring and GCDcrt()— Chinese Remainder Theoremcontinued_fraction(),.convergents()— continued fractionsisqrt()— integer square rootlist_plot(),bar_chart(),show()— visualization
25.9 References#
Rivest, R. L., Shamir, A., and Adleman, L. (1978). A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM, 21(2), 120–126. The original RSA paper.
Wiener, M. J. (1990). Cryptanalysis of Short RSA Secret Exponents. IEEE Transactions on Information Theory, 36(3), 553–558. The continued fraction attack on small private exponents.
Boneh, D. (1999). Twenty Years of Attacks on the RSA Cryptosystem. Notices of the AMS, 46(2), 203–213. Comprehensive survey of RSA attacks.
Lenstra, A. K. et al. (2012). Ron was wrong, Whit is right. Crypto 2012. Study finding that many real-world RSA keys share prime factors.
Pollard, J. M. (1974). Theorems on Factorization and Primality Testing. Proceedings of the Cambridge Philosophical Society, 76, 521–528. The p−1 factoring algorithm.