Chapter 25: The RSA Cryptosystem (SageMath)

Chapter 25: The RSA Cryptosystem (SageMath)#

Python Version Available

This is the SageMath companion for Chapter 25: The RSA Cryptosystem. The Python/NumPy version contains identical theory with implementations using only NumPy.

25.1 Historical Context: The Public-Key Revolution#

In 1977, Ron Rivest, Adi Shamir, and Leonard Adleman at MIT published a paper that would transform cryptography forever: “A Method for Obtaining Digital Signatures and Public-Key Cryptosystems.” Their system, known as RSA, was the first practical implementation of the public-key concept introduced by Diffie and Hellman just one year earlier.

The RSA cryptosystem’s security rests on the integer factorization problem: while it is easy to multiply two large primes \(p\) and \(q\) to get \(n = pq\), it appears to be computationally infeasible to recover \(p\) and \(q\) from \(n\) when the primes are sufficiently large (typically 1024+ bits each).

RSA has become one of the most widely deployed cryptographic algorithms in history:

TLS/SSL: RSA key exchange and signatures secure most HTTPS connections.
PGP/GPG: RSA encryption is used for email security.
Code signing: Software updates are authenticated using RSA signatures.
Smart cards: RSA is used in banking and identity cards worldwide.

Tip

Public-key vs. symmetric cryptography. Unlike the substitution and block ciphers we studied in Parts I–VIII, RSA uses different keys for encryption and decryption. The encryption key is public — anyone can encrypt — while only the holder of the private decryption key can decrypt. This eliminates the key distribution problem that plagued symmetric cryptography for millennia.

25.2 Formal Definitions#

Definition 25.1 (RSA Key Generation)

Choose two distinct large primes \(p\) and \(q\).
Compute \(n = pq\) (the RSA modulus).
Compute \(\phi(n) = (p-1)(q-1)\) (Euler’s totient).
Choose \(e\) with \(1 < e < \phi(n)\) and \(\gcd(e, \phi(n)) = 1\) (the public exponent).
Compute \(d = e^{-1} \bmod \phi(n)\) (the private exponent).
Public key: \((n, e)\). Private key: \((n, d)\).

Definition 25.2 (RSA Encryption and Decryption)

For a message \(m \in \{0, 1, \ldots, n-1\}\):

Encryption: \(c = m^e \bmod n\)
Decryption: \(m = c^d \bmod n\)

Correctness follows from Euler’s theorem: since \(ed \equiv 1 \pmod{\phi(n)}\), we have \(c^d = m^{ed} = m^{1 + k\phi(n)} = m \cdot (m^{\phi(n)})^k \equiv m \pmod{n}\) whenever \(\gcd(m, n) = 1\).

Theorem 25.1 (RSA Correctness)

For any \(m \in \mathbb{Z}_n\) (including \(m\) that share a factor with \(n\)), the RSA decryption recovers the original message:

\[ m^{ed} \equiv m \pmod{n}\]

Proof

When \(\gcd(m, n) = 1\), this follows directly from Euler’s theorem. For the case \(\gcd(m, n) > 1\) (i.e., \(p | m\) or \(q | m\)), we use the Chinese Remainder Theorem. If \(p | m\), then \(m^{ed} \equiv 0 \equiv m \pmod{p}\). Also \(\gcd(m, q) = 1\), so \(m^{ed} \equiv m \pmod{q}\) by Fermat’s little theorem. By CRT, \(m^{ed} \equiv m \pmod{n}\).

```{danger}
**Textbook RSA is insecure!** The basic RSA described above is deterministic — the same plaintext always encrypts to the same ciphertext. This violates semantic security. In practice, RSA must be used with padding schemes like OAEP (Optimal Asymmetric Encryption Padding) to achieve CCA2 security.

25.3 Implementation in SageMath#

SageMath provides powerful built-in functions that make RSA implementation elegant:

random_prime(N) — generate a random prime up to \(N\)
inverse_mod(e, phi) — compute modular inverse
power_mod(m, e, n) — modular exponentiation
euler_phi(n) — Euler’s totient function
factor(n) — integer factorization
gcd(a, b) — greatest common divisor

# RSA Key Generation in SageMath
def rsa_keygen(bits=512):
    """Generate an RSA key pair with the given bit length."""
    p = random_prime(2^bits, lbound=2^(bits-1))
    q = random_prime(2^bits, lbound=2^(bits-1))
    while p == q:
        q = random_prime(2^bits, lbound=2^(bits-1))
    
    n = p * q
    phi_n = (p - 1) * (q - 1)
    
    # Standard public exponent
    e = 65537
    assert gcd(e, phi_n) == 1, "e and phi(n) must be coprime"
    
    d = inverse_mod(e, phi_n)
    
    return {
        'public': (n, e),
        'private': (n, d),
        'p': p,
        'q': q,
        'phi_n': phi_n
    }

# Generate a key pair
set_random_seed(42)
keys = rsa_keygen(bits=256)  # Small for demonstration
n, e = keys['public']
_, d = keys['private']

print(f"p = {keys['p']}")
print(f"q = {keys['q']}")
print(f"n = {n}")
print(f"e = {e}")
print(f"d = {d}")
print(f"n has {n.nbits()} bits")

p = 88312755628949696921057815284243931625952258705415727922489452061668346578827
q = 112077575000807757776400520499314073774445401420501142644045698123153870166821
n = 9897879492531617143011769329237131985492735349086461903213596892112367810233417901867475608944642271717165285107375938988069040632095656309380355116498967
e = 65537
d = 2306640729502958460491306482833189142231572195654325535922933081056993661072248211753700149378304272926344555259295854901836753730104128468870918320611753
n has 512 bits

# RSA Encryption and Decryption
def rsa_encrypt(m, public_key):
    """Encrypt message m using public key (n, e)."""
    n, e = public_key
    return power_mod(m, e, n)

def rsa_decrypt(c, private_key):
    """Decrypt ciphertext c using private key (n, d)."""
    n, d = private_key
    return power_mod(c, d, n)

# Encrypt a message
message = 123456789
ciphertext = rsa_encrypt(message, keys['public'])
decrypted = rsa_decrypt(ciphertext, keys['private'])

print(f"Message:    {message}")
print(f"Ciphertext: {ciphertext}")
print(f"Decrypted:  {decrypted}")
print(f"Match: {message == decrypted}")

Message:    123456789
Ciphertext: 7764045828994215788209055946287661028944564657144108537609437591194216972618633830563630170439080271721555942267113609710823647337817699683686149602518399
Decrypted:  123456789
Match: True

Encrypting Text Messages#

# Encrypt a text message
def text_to_int(text):
    """Convert text to an integer."""
    return int.from_bytes(text.encode('utf-8'), 'big')

def int_to_text(num):
    """Convert integer back to text."""
    length = (num.bit_length() + 7) // 8
    return int(num).to_bytes(length, 'big').decode('utf-8')

# Encrypt a short message
msg = "Hello, RSA!"
m = text_to_int(msg)
assert m < n, "Message must be smaller than modulus"

c = rsa_encrypt(m, keys['public'])
dec = rsa_decrypt(c, keys['private'])
recovered = int_to_text(dec)

print(f"Original:   '{msg}'")
print(f"As integer: {m}")
print(f"Encrypted:  {c}")
print(f"Decrypted:  '{recovered}'")

Original:   'Hello, RSA!'
As integer: 87521618088895491219865889
Encrypted:  8411626285046101703422961620287752855896373842971788481537083746395586241376976653489906727334177770058395804913600071202438177765579766270224621301761202
Decrypted:  'Hello, RSA!'

25.4 CRT Decryption Speedup#

Theorem 25.2 (CRT Speedup for RSA Decryption)

By the Chinese Remainder Theorem, RSA decryption can be performed approximately 4x faster by computing:

\[ m_p = c^{d \bmod (p-1)} \bmod p \]

\[ m_q = c^{d \bmod (q-1)} \bmod q\]

and then combining: \(m = \text{CRT}(m_p, m_q, p, q)\).

# CRT-accelerated decryption
def rsa_decrypt_crt(c, p, q, d):
    """RSA decryption using CRT for ~4x speedup."""
    dp = d % (p - 1)
    dq = d % (q - 1)
    mp = power_mod(c, dp, p)
    mq = power_mod(c, dq, q)
    return crt([mp, mq], [p, q])

# Verify CRT decryption gives the same result
dec_crt = rsa_decrypt_crt(ciphertext, keys['p'], keys['q'], d)
print(f"Standard decryption: {decrypted}")
print(f"CRT decryption:      {dec_crt}")
print(f"Match: {decrypted == dec_crt}")

Standard decryption: 123456789
CRT decryption:      123456789
Match: True

# Timing comparison
import time

# Generate larger keys for meaningful timing
set_random_seed(100)
big_keys = rsa_keygen(bits=1024)
big_n, big_e = big_keys['public']
_, big_d = big_keys['private']

test_msg = ZZ.random_element(2^100, 2^200)
test_ct = rsa_encrypt(test_msg, big_keys['public'])

# Time standard decryption
trials = 100
t0 = time.time()
for _ in range(trials):
    rsa_decrypt(test_ct, big_keys['private'])
std_time = (time.time() - t0) / trials

# Time CRT decryption
t0 = time.time()
for _ in range(trials):
    rsa_decrypt_crt(test_ct, big_keys['p'], big_keys['q'], big_d)
crt_time = (time.time() - t0) / trials

print(f"Standard decryption: {float(std_time*1000):.3f} ms")
print(f"CRT decryption:      {float(crt_time*1000):.3f} ms")
print(f"Speedup:             {float(std_time/crt_time):.2f}x")

Standard decryption: 6.120 ms
CRT decryption:      2.508 ms
Speedup:             2.44x

25.5 Common Attacks on RSA#

Small Public Exponent Attack#

Definition 25.3 (Håstad’s Broadcast Attack)

If the same message \(m\) is encrypted with the same small exponent \(e\) to \(e\) different recipients (with different moduli \(n_1, \ldots, n_e\)), then \(m\) can be recovered using the Chinese Remainder Theorem without factoring any \(n_i\).

# Hastad's broadcast attack (e=3)
set_random_seed(2024)

# Three recipients with different moduli
keys1 = rsa_keygen(bits=256)
keys2 = rsa_keygen(bits=256)
keys3 = rsa_keygen(bits=256)

# Same message encrypted to all three with e=3
# (we'd need e=3 for this attack; using a contrived example)
secret_msg = ZZ.random_element(2^50)
n1, _ = keys1['public']
n2, _ = keys2['public']
n3, _ = keys3['public']

c1 = power_mod(secret_msg, 3, n1)
c2 = power_mod(secret_msg, 3, n2)
c3 = power_mod(secret_msg, 3, n3)

# CRT to recover m^3
m_cubed = crt([c1, c2, c3], [n1, n2, n3])

# Take integer cube root
recovered = ZZ(m_cubed).nth_root(3)
print(f"Original message:  {secret_msg}")
print(f"Recovered message: {recovered}")
print(f"Match: {secret_msg == recovered}")

Original message:  768368873937730
Recovered message: 768368873937730
Match: True

Wiener’s Attack on Small Private Exponents#

Theorem 25.3 (Wiener’s Attack)

If the private exponent \(d < \frac{1}{3} n^{1/4}\), then \(d\) can be recovered efficiently from \((n, e)\) using the continued fraction expansion of \(e/n\).

# Wiener's attack using continued fractions
def wiener_attack(n, e):
    """Attempt to recover d using continued fraction expansion of e/n."""
    # Compute continued fraction convergents of e/n
    cf = continued_fraction(e / n)
    convergents = cf.convergents()
    
    for conv in convergents:
        k = conv.numerator()
        d_candidate = conv.denominator()
        
        if k == 0:
            continue
        
        # Check if (ed - 1) / k is an integer (= phi(n))
        if (e * d_candidate - 1) % k != 0:
            continue
        
        phi_candidate = (e * d_candidate - 1) // k
        
        # phi(n) = n - p - q + 1, so p + q = n - phi + 1
        s = n - phi_candidate + 1
        # p and q are roots of x^2 - s*x + n = 0
        discriminant = s^2 - 4*n
        if discriminant < 0:
            continue
        
        sqrt_disc = isqrt(discriminant)
        if sqrt_disc^2 == discriminant:
            p = (s + sqrt_disc) // 2
            q = (s - sqrt_disc) // 2
            if p * q == n:
                return d_candidate, p, q
    
    return None

# Create a vulnerable RSA instance with small d
p = random_prime(2^128, lbound=2^127)
q = random_prime(2^128, lbound=2^127)
n_vuln = p * q
phi_vuln = (p-1) * (q-1)

# Choose a small d
d_small = random_prime(isqrt(isqrt(n_vuln)) // 4)
e_vuln = inverse_mod(d_small, phi_vuln)

print(f"n has {n_vuln.nbits()} bits")
print(f"True d = {d_small} ({d_small.nbits()} bits)")

result = wiener_attack(n_vuln, e_vuln)
if result:
    d_recovered, p_recovered, q_recovered = result
    print(f"Recovered d = {d_recovered}")
    print(f"Attack successful: {d_recovered == d_small}")
else:
    print("Attack failed (d may be too large)")

n has 256 bits
True d = 1269808153351150757 (61 bits)
Recovered d = 1269808153351150757
Attack successful: True

25.6 Factoring Algorithms#

The security of RSA ultimately rests on the difficulty of factoring \(n = pq\). SageMath has built-in factoring capabilities that make it easy to explore the limits of RSA security.

# Factoring in SageMath
# Trial division (works for small factors)
n_small = 1000000007 * 1000000009
print(f"n = {n_small}")
print(f"factor(n) = {factor(n_small)}")

# Fermat's factoring method
def fermat_factor(n):
    """Fermat's factoring method: works well when p and q are close."""
    a = isqrt(n)
    if a * a == n:
        return a, a
    a += 1
    while True:
        b2 = a*a - n
        b = isqrt(b2)
        if b * b == b2:
            return a + b, a - b
        a += 1

# Test with close primes (vulnerable!)
p_close = next_prime(10^30)
q_close = next_prime(p_close)
n_close = p_close * q_close
print(f"\nClose primes: p={p_close}, q={q_close}")
print(f"n = {n_close}")

import time
t0 = time.time()
p_found, q_found = fermat_factor(n_close)
t1 = time.time()
print(f"Fermat factoring: p={p_found}, q={q_found}")
print(f"Time: {float((t1-t0)*1000):.2f} ms")

n = 1000000016000000063
factor(n) = 1000000007 * 1000000009

Close primes: p=1000000000000000000000000000057, q=1000000000000000000000000000099
n = 1000000000000000000000000000156000000000000000000000000005643
Fermat factoring: p=1000000000000000000000000000099, q=1000000000000000000000000000057
Time: 0.03 ms

# Pollard's p-1 method
def pollard_p1(n, B=10^6):
    """Pollard's p-1 factoring: works when p-1 is B-smooth."""
    a = ZZ(2)
    for p in primes(B):
        e = ZZ(floor(log(RR(B)) / log(RR(p))))  # floor(log_p(B))
        a = power_mod(a, p^e, n)
    
    d = gcd(a - 1, n)
    if 1 < d < n:
        return d, n // d
    return None

# Test with a p-1 smooth prime
p_smooth = 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29 + 1
while not is_prime(p_smooth):
    p_smooth += 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23 * 29
q_safe = random_prime(10^20, lbound=10^19)
n_smooth = p_smooth * q_safe

print(f"n = {n_smooth}")
print(f"p-1 = {factor(p_smooth - 1)}")

result = pollard_p1(n_smooth, B=100)
if result:
    print(f"Pollard p-1 found: {result[0]} × {result[1]}")
else:
    print("Pollard p-1 failed with B=100")

n = 5219436539619139192156027590631
p-1 = 2 * 3 * 5 * 7 * 11^2 * 13 * 17 * 19 * 23 * 29
Pollard p-1 failed with B=100

Experiment: Key Size vs. Factoring Time#

# How factoring difficulty scales with key size
import time

bit_sizes = [32, 40, 48, 56, 64, 72, 80]
times = []

for bits in bit_sizes:
    p = random_prime(2^(bits//2), lbound=2^(bits//2 - 1))
    q = random_prime(2^(bits//2), lbound=2^(bits//2 - 1))
    n = p * q
    
    t0 = time.time()
    f = factor(n)
    t1 = time.time()
    times.append(t1 - t0)
    print(f"  {bits}-bit n: factored in {float(times[-1]*1000):.2f} ms -> {f}")

# Plot
show(list_plot(list(zip(bit_sizes, [max(t, 1e-6) for t in times])), 
               plotjoined=True, marker='o', scale='semilogy',
               axes_labels=['Key size (bits)', 'Factoring time (seconds)'],
               title='RSA Factoring Time vs Key Size'))

  32-bit n: factored in 0.15 ms -> 34913 * 55339
  40-bit n: factored in 0.17 ms -> 586009 * 909863
  48-bit n: factored in 0.26 ms -> 12851093 * 13378961
  56-bit n: factored in 0.63 ms -> 154693849 * 260449997
  64-bit n: factored in 0.16 ms -> 2222586523 * 2355447373

  72-bit n: factored in 13.08 ms -> 45587097553 * 63983094061
  80-bit n: factored in 0.73 ms -> 551948169691 * 941541834269

../_images/4ed41d31cf987256a809571c87103c19768d5cd995f596ee906f18ab72bf55f0.png

RSA Digital Signatures#

Definition 25.4 (RSA Signature)

To sign a message \(m\): compute \(s = m^d \bmod n\) using the private key.

To verify a signature: check that \(s^e \bmod n = m\) using the public key.

This works because signing is essentially “decryption” and verification is “encryption.” The security of RSA signatures relies on the same hardness assumption as RSA encryption.

In practice, one signs a hash of the message rather than the message itself: \(s = H(m)^d \bmod n\). This prevents existential forgery attacks.

# RSA Signatures
def rsa_sign(m, private_key):
    """Sign a message using RSA."""
    n, d = private_key
    return power_mod(m, d, n)

def rsa_verify(m, signature, public_key):
    """Verify an RSA signature."""
    n, e = public_key
    return power_mod(signature, e, n) == m

# Demo
message = 42
sig = rsa_sign(message, keys['private'])
is_valid = rsa_verify(message, sig, keys['public'])
print(f"Message: {message}")
print(f"Signature: {sig}")
print(f"Valid: {is_valid}")

# Tampered message
is_valid_tampered = rsa_verify(message + 1, sig, keys['public'])
print(f"Tampered message valid: {is_valid_tampered}")

Message: 42
Signature: 9823044510811283207637496131386574239834560120160062083813986749317920141924008321073114184878169672877787237063444137913167514767358829804195269180964091
Valid: True
Tampered message valid: False

The Multiplicative Homomorphism#

Theorem 25.4 (RSA Homomorphism)

Textbook RSA is multiplicatively homomorphic: for any two messages \(m_1, m_2\):

\[ E(m_1) \cdot E(m_2) \equiv E(m_1 \cdot m_2) \pmod{n}\]

This is because \((m_1^e)(m_2^e) = (m_1 m_2)^e \bmod n\).

This property is both a feature (useful for homomorphic computation) and a vulnerability (enables chosen-ciphertext attacks on textbook RSA).

# Demonstrate multiplicative homomorphism
m1 = 7
m2 = 11
n, e = keys['public']

c1 = rsa_encrypt(m1, keys['public'])
c2 = rsa_encrypt(m2, keys['public'])

# Product of ciphertexts
c_product = (c1 * c2) % n

# Decrypt the product
dec_product = rsa_decrypt(c_product, keys['private'])

print(f"m1 = {m1}, m2 = {m2}")
print(f"m1 * m2 = {m1 * m2}")
print(f"Dec(Enc(m1) * Enc(m2)) = {dec_product}")
print(f"Homomorphism holds: {dec_product == m1 * m2}")

m1 = 7, m2 = 11
m1 * m2 = 77
Dec(Enc(m1) * Enc(m2)) = 77
Homomorphism holds: True

25.5b Why Padding Matters#

Textbook RSA has several critical weaknesses that padding schemes address:

Deterministic encryption: The same plaintext always produces the same ciphertext, leaking information about message equality.
Multiplicative homomorphism: An attacker can manipulate ciphertexts without knowing the plaintexts.
Small message attack: If \(m < n^{1/e}\), then \(m^e < n\) and no modular reduction occurs — the attacker can simply take the \(e\)-th root.
Related message attack: Coppersmith showed that if two messages differ by a known polynomial relation, both can be recovered.

OAEP (Optimal Asymmetric Encryption Padding)

In practice, RSA is always used with OAEP (Bellare & Rogaway, 1994): the message is padded with randomness and hashed before encryption. This achieves:

IND-CCA2 security (indistinguishable under adaptive chosen-ciphertext attack)
Protection against all the above attacks

# Small message attack: m^e < n
n, e = keys['public']
small_msg = 42  # Very small message

c = rsa_encrypt(small_msg, keys['public'])

# If m^e < n, we can just take the e-th root
m_cubed = small_msg^e
if m_cubed < n:
    # The ciphertext IS just m^e (no modular reduction happened)
    recovered = ZZ(c).nth_root(e)
    print(f"Small message attack: m^e < n? {m_cubed < n}")
    print(f"Recovered: {recovered}")
else:
    print(f"m^e >= n, so modular reduction occurred")
    print(f"m^e has {m_cubed.nbits()} bits, n has {n.nbits()} bits")
    
# With a slightly larger message
big_msg = ZZ.random_element(n // 2, n - 1)
c_big = rsa_encrypt(big_msg, keys['public'])
print(f"\nLarger message: m has {big_msg.nbits()} bits")
print(f"m^e >= n: {big_msg^e >= n} (safe from root attack)")

m^e >= n, so modular reduction occurred
m^e has 353397 bits, n has 512 bits

Larger message: m has 511 bits
m^e >= n: True (safe from root attack)

25.6b RSA Key Size Recommendations#

The security of RSA depends on the difficulty of factoring the modulus \(n\). As factoring algorithms improve and computing power grows, minimum key sizes must increase:

Year	Minimum RSA Key Size	Equivalent Symmetric Security
2000	1024 bits	~80 bits
2015	2048 bits	~112 bits
2024	3072 bits	~128 bits
2030+	4096+ bits	~140+ bits

Danger

Quantum threat: Shor’s algorithm (Chapter 38) can factor \(n\) in polynomial time on a quantum computer. This means RSA at any key size will be insecure once large-scale quantum computers exist. NIST recommends transitioning to post-quantum algorithms (Parts XIV–XV) by 2035.

Current Records

The largest RSA modulus factored publicly is RSA-250 (829 bits), achieved in 2020 using the General Number Field Sieve (GNFS). This required approximately 2,700 core-years of computation. RSA-2048 (the standard today) is estimated to require \(\sim 10^{20}\) core-years with current algorithms.

# Estimate GNFS factoring complexity
# GNFS heuristic complexity: exp(c * (ln n)^(1/3) * (ln ln n)^(2/3))
# where c ≈ 1.923 for the number field sieve

from sage.functions.log import log as ln

c_gnfs = 1.923

for bits in [512, 768, 1024, 2048, 3072, 4096]:
    n_val = 2^bits
    ln_n = float(ln(n_val))
    ln_ln_n = float(ln(ln_n))
    
    # Number of operations (log2)
    log2_ops = float(c_gnfs * ln_n^(1/3) * ln_ln_n^(2/3) / ln(2))
    
    print(f"  RSA-{int(bits):4d}: ~2^{float(log2_ops):.1f} operations")

  RSA- 512: ~2^63.9 operations
  RSA- 768: ~2^76.5 operations
  RSA-1024: ~2^inf operations
  RSA-2048: ~2^inf operations
  RSA-3072: ~2^inf operations
  RSA-4096: ~2^inf operations

Experiment: RSA Encryption/Decryption Timing by Key Size#

# Timing RSA operations at different key sizes
import time

bit_sizes = [512, 1024, 2048]
enc_times = []
dec_times = []

for bits in bit_sizes:
    set_random_seed(bits)
    k = rsa_keygen(bits=bits//2)
    test_n, test_e = k['public']
    _, test_d = k['private']
    test_m = ZZ.random_element(2, test_n - 1)
    
    # Time encryption
    trials = 50
    t0 = time.time()
    for _ in range(trials):
        c = power_mod(test_m, test_e, test_n)
    enc_times.append((time.time() - t0) / trials * 1000)
    
    # Time decryption
    t0 = time.time()
    for _ in range(trials):
        m = power_mod(c, test_d, test_n)
    dec_times.append((time.time() - t0) / trials * 1000)
    
    print(f"RSA-{bits}: encrypt={float(enc_times[-1]):.3f}ms, decrypt={float(dec_times[-1]):.3f}ms")

# Plot
enc_plot = list_plot(list(zip(bit_sizes, enc_times)), plotjoined=True, 
                      marker='o', color='blue', legend_label='Encryption')
dec_plot = list_plot(list(zip(bit_sizes, dec_times)), plotjoined=True,
                      marker='s', color='red', legend_label='Decryption')
show(enc_plot + dec_plot, axes_labels=['Key size (bits)', 'Time (ms)'],
     title='RSA Operation Timing', scale='semilogy')

RSA-512: encrypt=0.008ms, decrypt=0.316ms

RSA-1024: encrypt=0.020ms, decrypt=1.069ms

RSA-2048: encrypt=0.031ms, decrypt=5.660ms

../_images/bfe93d2a6a09891c0dad78188c28485d76c0deb0d9ef96afbbcf2dbaee1f19ba.png

25.7 Exercises#

Exercise 25.1 (Basic RSA). Choose \(p = 61\), \(q = 53\), and \(e = 17\). Compute \(n\), \(\phi(n)\), and \(d\) by hand (verify with SageMath). Encrypt and decrypt the message \(m = 65\).

Hint

\(n = 3233\), \(\phi(n) = 3120\). Use inverse_mod(17, 3120) to find \(d\).

Exercise 25.2 (CRT). Implement RSA decryption using CRT and verify it gives the same result as standard decryption. Measure the speedup for 2048-bit keys.

Hint

Use SageMath’s crt() function. The speedup should be approximately 4x.

Exercise 25.3 (Common Modulus Attack). If two users share the same \(n\) but use different public exponents \(e_1, e_2\) with \(\gcd(e_1, e_2) = 1\), and the same message \(m\) is encrypted to both, show how to recover \(m\) without factoring \(n\).

Hint

Use xgcd(e1, e2) to find \(a, b\) such that \(ae_1 + be_2 = 1\). Then \(c_1^a \cdot c_2^b \equiv m \pmod{n}\).

Exercise 25.4 (Wiener’s Attack). Generate RSA keys where \(d < n^{1/4}/3\) and verify that Wiener’s attack recovers the key. What is the largest \(d\) for which the attack succeeds?

Hint

Use continued_fraction(e/n).convergents() and test each convergent.

Exercise 25.5 (Challenge: Pollard’s Rho). Implement Pollard’s \(\rho\) factoring algorithm using the iteration \(x_{n+1} = x_n^2 + c \bmod n\) and Floyd’s cycle detection. Factor several 60-80 bit semiprimes.

Hint

Start with x = y = 2, c = 1. Update x once and y twice per step. Check gcd(abs(x-y), n) each iteration.

25.6c The Number Field Sieve#

The General Number Field Sieve (GNFS) is the fastest known algorithm for factoring large integers. Its heuristic running time is:

\[ L_n\left[\frac{1}{3}, \left(\frac{64}{9}\right)^{1/3}\right] = \exp\left(\left(\frac{64}{9}\right)^{1/3} (\ln n)^{1/3} (\ln \ln n)^{2/3}\right)\]

This is sub-exponential but super-polynomial — faster than trial division (\(O(\sqrt{n})\)) but much slower than polynomial time. The GNFS consists of several stages:

Polynomial selection: Choose two polynomials \(f(x)\) and \(g(x)\) with a common root modulo \(n\).
Sieving: Find many pairs \((a, b)\) such that both \(f(a/b) \cdot b^{\deg f}\) and \(g(a/b) \cdot b^{\deg g}\) are smooth (have only small prime factors).
Linear algebra: Build a large sparse matrix over \(\mathbb{F}_2\) from the smooth relations and find dependencies using the Block Lanczos or Block Wiedemann algorithm.
Square root: Compute square roots in the number field to obtain a congruence \(x^2 \equiv y^2 \pmod{n}\), yielding the factorization \(\gcd(x \pm y, n)\).

Factoring Records

Year	Number	Digits	Bits	Method
1991	RSA-100	100	330	QS
1999	RSA-155	155	512	GNFS
2005	RSA-200	200	663	GNFS
2009	RSA-768	232	768	GNFS
2020	RSA-250	250	829	GNFS

# Pollard's rho factoring method in SageMath
def pollard_rho(n, max_iter=10^6):
    """Pollard's rho factoring algorithm using Floyd's cycle detection."""
    if n % 2 == 0:
        return 2, n // 2
    
    x = ZZ(2)
    y = ZZ(2)
    c = ZZ(1)
    d = ZZ(1)
    
    f = lambda x: (x^2 + c) % n
    
    while d == 1:
        x = f(x)
        y = f(f(y))
        d = gcd(abs(x - y), n)
    
    if d != n:
        return d, n // d
    return None

# Test on moderate-size composites
import time
for bits in [40, 50, 60, 70]:
    set_random_seed(bits)
    p = random_prime(2^(bits//2), lbound=2^(bits//2 - 1))
    q = random_prime(2^(bits//2), lbound=2^(bits//2 - 1))
    n = p * q
    
    t0 = time.time()
    result = pollard_rho(n)
    elapsed = time.time() - t0
    
    if result:
        print(f"{bits}-bit n: {result[0]} × {result[1]} ({float(elapsed*1000):.1f}ms)")
    else:
        print(f"{bits}-bit n: failed ({float(elapsed*1000):.1f}ms)")

40-bit n: 873527 × 699541 (0.3ms)
50-bit n: 31842127 × 17744647 (1.9ms)
60-bit n: 985137473 × 582900881 (2.8ms)

70-bit n: 20975848559 × 18087034963 (192.9ms)

SageMath’s Built-in Factoring#

SageMath wraps multiple factoring algorithms and automatically selects the best one based on the input size. The factor() function is remarkably powerful for educational exploration.

# SageMath's factor() automatically selects the best algorithm
import time

# Factor some famous numbers
famous = {
    "Mersenne M67 (not prime!)": 2^67 - 1,
    "Fermat F5": 2^(2^5) + 1,
    "RSA-59 (first RSA challenge)": 71641520761751435455133616475667090434063332228247871795429,
}

for name, n in famous.items():
    t0 = time.time()
    f = factor(n)
    elapsed = time.time() - t0
    print(f"{name}:")
    print(f"  n = {n}")
    print(f"  = {f}")
    print(f"  ({float(elapsed*1000):.1f}ms)\n")

Mersenne M67 (not prime!):
  n = 147573952589676412927
  = 193707721 * 761838257287
  (0.4ms)

Fermat F5:
  n = 4294967297
  = 641 * 6700417
  (0.0ms)

RSA-59 (first RSA challenge):
  n = 71641520761751435455133616475667090434063332228247871795429
  = 200429218120815554269743635437 * 357440504101388365610785389017
  (2055.1ms)

25.6d Mathematical Foundations#

The correctness and security of RSA rest on several fundamental results from number theory, all of which are built into SageMath:

Theorem 25.5 (Euler’s Theorem)

If \(\gcd(a, n) = 1\), then \(a^{\phi(n)} \equiv 1 \pmod{n}\).

This generalizes Fermat’s Little Theorem (\(a^{p-1} \equiv 1 \pmod{p}\) for prime \(p\)) and is the foundation of RSA correctness.

Theorem 25.6 (Chinese Remainder Theorem)

If \(n = pq\) with \(\gcd(p, q) = 1\), then:

\[ \mathbb{Z}_n \cong \mathbb{Z}_p \times \mathbb{Z}_q\]

This isomorphism is the basis for CRT decryption and for understanding the structure of \(\mathbb{Z}_n^*\).

Theorem 25.7 (Fundamental Theorem of Arithmetic)

Every integer \(n > 1\) has a unique prime factorization \(n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}\).

The assumed difficulty of finding this factorization for large semiprimes is the security assumption behind RSA.

# Verify Euler's theorem
p, q = keys['p'], keys['q']
n = p * q
phi_n = (p - 1) * (q - 1)  # Avoid euler_phi(n) which requires factoring

# For a random a coprime to n
a = ZZ.random_element(2, n)
while gcd(a, n) != 1:
    a = ZZ.random_element(2, n)

result = power_mod(a, phi_n, n)
print(f"a = {a}")
print(f"phi(n) = {phi_n}")
print(f"a^phi(n) mod n = {result}")
print(f"Euler's theorem holds: {result == 1}")

# Verify CRT isomorphism
print(f"\nCRT decomposition:")
print(f"n = {n}")
print(f"Zn* has order phi(n) = {phi_n}")
print(f"Zp* has order p-1 = {p-1}")
print(f"Zq* has order q-1 = {q-1}")
print(f"(p-1)(q-1) = {(p-1)*(q-1)} = phi(n)? {(p-1)*(q-1) == phi_n}")

a = 1960024513001890534185541621043399983485910448195501724639771013179433964921995779364617489326761514627244896593454810323967439775273049717673999129830238
phi(n) = 9897879492531617143011769329237131985492735349086461903213596892112367810233217511536845851489944813381381727101975541327943123761529121159195532899753320
a^phi(n) mod n = 1
Euler's theorem holds: True

CRT decomposition:
n = 9897879492531617143011769329237131985492735349086461903213596892112367810233417901867475608944642271717165285107375938988069040632095656309380355116498967
Zn* has order phi(n) = 9897879492531617143011769329237131985492735349086461903213596892112367810233217511536845851489944813381381727101975541327943123761529121159195532899753320
Zp* has order p-1 = 88312755628949696921057815284243931625952258705415727922489452061668346578826
Zq* has order q-1 = 112077575000807757776400520499314073774445401420501142644045698123153870166820
(p-1)(q-1) = 9897879492531617143011769329237131985492735349086461903213596892112367810233217511536845851489944813381381727101975541327943123761529121159195532899753320 = phi(n)? True

25.8 Summary#

Concept	Key Idea
RSA	Public-key cryptosystem based on integer factorization
Key generation	Choose primes \(p, q\); compute \(n = pq\), \(\phi(n)\), \(e\), \(d\)
Encryption	\(c = m^e \bmod n\)
Decryption	\(m = c^d \bmod n\)
CRT speedup	~4x faster decryption using Chinese Remainder Theorem
Håstad’s attack	Broadcast with small \(e\) broken via CRT
Wiener’s attack	Small \(d\) recovered via continued fractions
Textbook RSA	Deterministic, requires padding (OAEP) for security

SageMath tools introduced:

random_prime(), next_prime(), is_prime() — prime generation
power_mod(), inverse_mod(), euler_phi() — modular arithmetic
factor(), gcd(), xgcd() — factoring and GCD
crt() — Chinese Remainder Theorem
continued_fraction(), .convergents() — continued fractions
isqrt() — integer square root
list_plot(), bar_chart(), show() — visualization

25.9 References#

Rivest, R. L., Shamir, A., and Adleman, L. (1978). A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM, 21(2), 120–126. The original RSA paper.
Wiener, M. J. (1990). Cryptanalysis of Short RSA Secret Exponents. IEEE Transactions on Information Theory, 36(3), 553–558. The continued fraction attack on small private exponents.
Boneh, D. (1999). Twenty Years of Attacks on the RSA Cryptosystem. Notices of the AMS, 46(2), 203–213. Comprehensive survey of RSA attacks.
Lenstra, A. K. et al. (2012). Ron was wrong, Whit is right. Crypto 2012. Study finding that many real-world RSA keys share prime factors.
Pollard, J. M. (1974). Theorems on Factorization and Primality Testing. Proceedings of the Cambridge Philosophical Society, 76, 521–528. The p−1 factoring algorithm.

Chapter 25: The RSA Cryptosystem (SageMath)

Contents

Chapter 25: The RSA Cryptosystem (SageMath)#

25.1 Historical Context: The Public-Key Revolution#

25.2 Formal Definitions#

25.3 Implementation in SageMath#

Encrypting Text Messages#

25.4 CRT Decryption Speedup#

25.5 Common Attacks on RSA#

Small Public Exponent Attack#

Wiener’s Attack on Small Private Exponents#

25.6 Factoring Algorithms#

Experiment: Key Size vs. Factoring Time#

RSA Digital Signatures#

The Multiplicative Homomorphism#

25.5b Why Padding Matters#

25.6b RSA Key Size Recommendations#

Experiment: RSA Encryption/Decryption Timing by Key Size#

25.7 Exercises#

25.6c The Number Field Sieve#

SageMath’s Built-in Factoring#

25.6d Mathematical Foundations#

25.8 Summary#

25.9 References#