# Student Project Catalogue

This is a list of **ten project topics** suitable for the post-quantum
half of the course. Each is sized to be doable by one or two students
in a few weekends. **You are not expected to deliver a production
implementation** — partial code, a small demo, and a clear write-up
beat a "full" implementation that you do not understand.

Each entry has the same structure:

- **Difficulty** — ★ (one weekend) to ★★★★ (real research project).
- **Description** — what you will build and why it matters.
- **Literature** — the small reading list that should anchor your work.
- **Implement** — the concrete code pieces you should write yourself.
- **Present** — what to show on lab day (W8).

Pick **one** topic, send your choice to the instructor before W5, and
plan a 10-minute presentation with a 2-page write-up due W8.

```{admonition} Submission format
:class: tip
- **Code:** a single GitHub repository (or a zip), one `README.md`, one
  driver script that produces the demo output in under 60 seconds.
- **Write-up:** 2 pages PDF — problem, your approach, results, what you
  could not get to and why.
- **Slides:** 5–8 Reveal.js or PDF slides for the W8 presentation.
- **No production claims** — every piece of code should be labelled
  "toy" / "educational" in its docstring header.
```

---

## Project 1 — ★ Lamport one-time signatures with a CLI

**Description.** Build a Lamport one-time signature scheme using
SHA-256 and wrap it in a small command-line interface
(`lamport keygen`, `lamport sign`, `lamport verify`). The goal is to
internalise that hash-based signatures are *one-time* and to make the
state-reuse failure mode concretely observable.

**Literature.**

- [Chapter 47 — Hash-Based Signatures](../w6_hash_based/ch47_hash_signatures.ipynb), §47.2.
- [Exercise sheet 47](../w6_hash_based/ch47_exercises.ipynb), Ex 47.E1.
- Lamport, L. (1979). *Constructing Digital Signatures from a One-Way
  Function*, SRI CSL-98.

**Implement.**

- `lamport_keygen(n=32)` returning `(sk, pk)` where each is a
  list of $2 \cdot 8n$ random byte-strings.
- `lamport_sign(sk, msg)` and `lamport_verify(pk, msg, sig)`.
- A CLI driver (`argparse`) that writes keys / signatures to disk in
  JSON or hex.
- A **state-reuse demonstrator**: sign two different messages with the
  same key, then forge a third valid signature.

**Present.**

- Demo: keygen, sign two messages, then run the forge attack live.
- Size table: secret key, public key, signature in bytes for $n = 32$.
- Two sentences on why "stateless" hash signatures (SLH-DSA) exist.

---

## Project 2 — ★ Merkle tree multi-signature scheme

**Description.** Combine Project 1's Lamport keys into a binary Merkle
tree of height $h = 4$ (16 leaves). Publish only the root as the public
key. Each signature carries the OTS payload **plus an authentication
path** that the verifier uses to recompute the root.

**Literature.**

- [Chapter 47](../w6_hash_based/ch47_hash_signatures.ipynb), §47.4.
- [Exercise sheet 47](../w6_hash_based/ch47_exercises.ipynb), Ex 47.E2 and E4.
- Merkle, R. (1979). *Secrecy, Authentication and Public Key Systems.*
  Stanford PhD thesis.
- RFC 8391 — XMSS, §3 (tree construction).

**Implement.**

- `build_merkle(leaf_pks)` returning level-by-level node hashes.
- `auth_path(levels, j)` extracting the sibling hashes from leaf $j$
  to the root.
- `verify_merkle_path(root, leaf, j, path)`.
- A combined `mss_sign(state, msg)` / `mss_verify(root, sig)` that
  routes through your Lamport from Project 1.

**Present.**

- An SVG (or matplotlib) drawing of the height-4 tree with leaf
  $j = 5$ highlighted and the auth-path siblings coloured.
- A table: signature byte count as $h$ grows from 2 to 6.

---

## Project 3 — ★★ Number Theoretic Transform (NTT) for Kyber

**Description.** Implement the NTT and inverse NTT for Kyber's ring
$R_q = \mathbb{Z}_{3329}[x]/(x^{256}+1)$ in pure Python. Verify the
round-trip identity and time it against schoolbook multiplication.

**Literature.**

- Upstream cryptanalysis course, **Chapter 41 — ML-KEM (Kyber)**, §41.3.
- FIPS 203, §4 (NTT specification).
- Cooley & Tukey (1965). *An algorithm for the machine calculation of
  complex Fourier series.* (For historical context.)

**Implement.**

- Pre-computation of primitive 256-th roots of unity mod 3329.
- Cooley–Tukey forward NTT (bit-reversal permutation + butterflies).
- Gentleman–Sande inverse NTT.
- A pointwise multiplication in NTT domain.
- A timing comparison: schoolbook multiplication vs. NTT-based, at
  $n = 64, 128, 256, 512$.

**Present.**

- A log–log plot of multiplication time vs. polynomial degree.
- The 256-point twiddle table as a tiny figure.
- One paragraph: why $q = 3329$ and $\zeta = 17$ are the *right*
  numbers for $n = 256$.

---

## Project 4 — ★★ Toy Regev LWE encryption

**Description.** Implement the original (Regev 2005) LWE-based
public-key encryption scheme. Use $n = 64$, $q = 521$, and a discrete
Gaussian error. Measure the decryption failure rate as you vary the
error standard deviation $\sigma$.

**Literature.**

- Regev, O. (2005). *On Lattices, Learning with Errors, Random Linear
  Codes, and Cryptography.* STOC '05.
- Peikert, C. (2016). *A Decade of Lattice Cryptography*, §4.
- Upstream **Chapter 40 — Lattice Problems**, §40.8–40.12.

**Implement.**

- A small **discrete Gaussian sampler** $\mathrm{D}_{\mathbb{Z}, \sigma}$.
- `lwe_keygen(n, q, m, sigma)` returning $(A, \mathbf{b} = A \mathbf{s}
  + \mathbf{e} \bmod q)$ for secret $\mathbf{s}$.
- `lwe_encrypt(pk, bit)` and `lwe_decrypt(sk, ct)` — one bit per
  ciphertext.
- A Monte-Carlo decryption-failure curve: 1000 random
  encryptions at each $\sigma \in \{1, 2, 4, 8, 16, 32\}$.

**Present.**

- Plot DFR vs $\sigma$ with the predicted Gaussian-tail bound overlay.
- Comment on the trade-off: large $\sigma$ → more security against LWE,
  more decryption failures.

---

## Project 5 — ★★ Toy NTRU encryption (with decryption-failure study)

**Description.** Build NTRU with the small parameters
$(N, p, q) = (11, 3, 31)$. Implement polynomial multiplication in
$R = \mathbb{Z}[x]/(x^N - 1)$, polynomial inversion in $R_p$ and $R_q$,
keygen, encrypt, decrypt. Estimate the decryption-failure rate as a
function of $q$.

**Literature.**

- [Chapter 48 — NTRU, Multivariate, Isogeny](../w7_other_families/ch48_ntru_mv_isogeny.ipynb), §48.2.
- [Exercise sheet 48](../w7_other_families/ch48_exercises.ipynb), Ex 48.E3.
- Hoffstein, J., Pipher, J., Silverman, J. H. (1998). *NTRU: A
  Ring-Based Public Key Cryptosystem*, ANTS-III.

**Implement.**

- Cyclic-convolution polynomial multiplication in $R$.
- Polynomial extended Euclidean for inversion in $R_p$ and $R_q$
  (do **not** depend on SymPy — see Ch 48 for a pure-Python version).
- `ntru_keygen`, `ntru_encrypt`, `ntru_decrypt`.
- A Monte-Carlo DFR estimator over 500 random plaintexts at each
  $q \in \{17, 31, 61, 127\}$.

**Present.**

- DFR vs $q$ plot (semilog y-axis).
- The full set of polynomials $(f, g, F_p, F_q, h, m, r, c, a, b, m')$
  for one decryption traced end-to-end.

---

## Project 6 — ★★ McEliece on tiny linear codes

**Description.** Implement the McEliece cryptosystem on small linear
codes — start with Hamming$(7, 4)$ (trivial: corrects 1 error), then
try a small Goppa code if time permits. Show the brute-force
information-set-decoding attack on the public key.

**Literature.**

- Upstream **Chapter 43 — Code-Based Cryptography**.
- McEliece, R. J. (1978). *A Public-Key Cryptosystem Based on Algebraic
  Coding Theory*, DSN Progress Report 42-44.
- Prange, E. (1962). *The use of information sets in decoding cyclic
  codes.* IRE Trans. Information Theory.

**Implement.**

- `LinearCode` class with generator + parity-check matrices over
  $\mathrm{GF}(2)$, plus syndrome decoding for Hamming$(7, 4)$.
- McEliece `keygen` (scrambler $S$ + permutation $P$),
  `encrypt`, `decrypt`.
- Brute-force ISD attack: pick random information sets, attempt decode,
  measure time vs code parameters.

**Present.**

- One run of keygen, encrypt, decrypt with the matrices printed.
- ISD running time on Hamming$(7, 4)$ vs Hamming$(15, 11)$, in a log plot.
- Two sentences on why **structured** Goppa codes resist ISD where
  random codes do not.

---

## Project 7 — ★★★ WOTS+ with bitmasks (Hülsing 2013)

**Description.** Implement the full **WOTS+** scheme with the per-step
bitmask trick that defeats the multi-target attack on plain WOTS.
Measure signature size, verifier hash count, and total wall-clock time
vs. Lamport.

**Literature.**

- [Chapter 47](../w6_hash_based/ch47_hash_signatures.ipynb), §47.3.
- Hülsing, A. (2013). *W-OTS+ — Shorter Signatures for Hash-Based
  Signature Schemes.* AFRICACRYPT 2013.
- RFC 8391 (XMSS), §3.1 (WOTS+ subroutines).

**Implement.**

- Chain function $H_{r_i}(x) := H(x \oplus r_i)$ with public bitmasks $r_i$.
- WOTS+ keygen, sign, verify for $w = 16$ and $n = 32$.
- The Winternitz checksum (Ex 47.E3 is a warm-up).
- Benchmark harness comparing **WOTS+**, Lamport, and Ed25519 (for
  reference) on signature size and signing time.

**Present.**

- Bar chart: signature size in bytes for the three schemes.
- A one-sentence reason why removing the bitmasks reintroduces the
  multi-target attack (cite Hülsing §3 or the chapter).

---

## Project 8 — ★★★ Toy ML-KEM-512 without NTT

**Description.** Implement a *schoolbook* (no-NTT) version of
ML-KEM-512: keygen, encapsulate, decapsulate, plus the
Fujisaki–Okamoto transform. Use the same parameters as FIPS 203
(`n = 256, k = 2, q = 3329`) but with naive polynomial multiplication.

**Literature.**

- FIPS 203 (ML-KEM), the *standardised* spec.
- Upstream **Chapter 41 — ML-KEM (Kyber) Design and Implementation**.
- Bos et al. (2018). *CRYSTALS-Kyber: a CCA-secure module-lattice-based
  KEM.* EuroS&P.

**Implement.**

- Centered binomial distribution (CBD) sampler — only $\eta = 2$ or
  $3$ needed.
- Module operations $R_q^{k \times k} \times R_q^k$.
- K-PKE (CPA encryption); then FO-transform for IND-CCA security.
- KAT (Known Answer Test) cross-check: pick a fixed seed, derive a
  ciphertext, and verify it matches a reference (`pqcrypto` package on
  PyPI is a good comparison point).

**Present.**

- Sizes: pk = 800 B, sk = 1632 B, ct = 768 B — verify against FIPS 203.
- One paragraph: which step **needs** the FO transform and why.
- (Stretch) timing comparison vs. `pip install pqcrypto`'s ML-KEM-512.

---

## Project 9 — ★★★ LLL implementation + Merkle–Hellman break

**Description.** Implement the LLL algorithm in pure Python (size
reduction + Lovász condition + swap, working with **Python integer
arithmetic** to stay exact). Then use your implementation to break a
low-density Merkle–Hellman knapsack via the Lagarias–Odlyzko lattice.

**Literature.**

- [Chapter 49 — LLL Step-by-Step Tutorial](../w2_lattices/lll_tutorial.ipynb)
  — every building block you need, with a worked 2-D example and four
  applications.
- Lenstra, Lenstra, Lovász (1982). *Factoring polynomials with rational
  coefficients.* Math. Ann. 261.
- Lagarias & Odlyzko (1985). *Solving low-density subset sum problems.*
  J. ACM 32(1).

**Implement.**

- `gram_schmidt(B)` returning $(B^*, \mu)$.
- `size_reduce(B, mu)` and a complete `lll(B, delta)`.
- Lagarias–Odlyzko lattice constructor `lo_lattice(a, S)`.
- A driver that generates a low-density $n = 16$ Merkle–Hellman key,
  encrypts a plaintext, and runs LLL on the LO lattice to recover it.

**Present.**

- For $n = 8, 10, 12, 16$: density of the generated key vs LLL recovery
  rate (50 random instances per $n$).
- A 2-D visualisation of the basis before/after LLL (Hadamard ratio).
- One slide on **why the attack works only below density 0.94** (cite
  Coster et al. 1992).

---

## Project 10 — ★★★★ Hidden Number Problem → ECDSA secret-key recovery

**Description.** Reproduce the Boneh–Venkatesan attack on biased
ECDSA-style signatures. Generate signatures whose nonce $k_i$ has
its top $\ell$ bits zero (i.e. $k_i$ is small), then build the BV
lattice and recover the secret key with LLL.

```{admonition} Connect to real-world history
:class: tip
This is the same family of attack that compromised Sony's PlayStation 3
firmware-signing key (where the nonce was *reused*, a degenerate case of
"biased"). Partial-bias variants have been demonstrated against real
Bitcoin / Ethereum signing keys when the RNG had a flaw.
```

**Literature.**

- [Chapter 49 — LLL Tutorial](../w2_lattices/lll_tutorial.ipynb),
  **Application D** — has a working pure-Python implementation that you
  should treat as a reference, then *re-derive yourself*.
- Boneh, D. & Venkatesan, R. (1996). *Hardness of computing the most
  significant bits of secret keys in Diffie-Hellman and related
  schemes.* CRYPTO '96.
- Nguyen, P. Q. & Shparlinski, I. E. (2002). *The insecurity of the
  Digital Signature Algorithm with partially known nonces.*
  J. Cryptology 15(3).

**Implement.**

- A toy ECDSA setup (no actual EC needed — work directly with the
  modular equations).
- Biased-nonce signature generator with leak parameter $\ell$.
- The $(m + 1)$-dimensional Boneh–Venkatesan lattice.
- A reuse of your `lll` from Project 9 (or our reference one).
- A read-off step: scan reduced rows for a candidate $d$ such that
  every $d \cdot t_i \bmod n$ is below the bias bound $B$.

**Present.**

- A phase diagram: $(m, \ell)$ pairs at which the attack succeeds /
  fails, run on 50 random secret keys per cell.
- One paragraph: the connection to the full ECDSA case (the
  embedding row for non-zero $u_i$) — you do not have to implement it,
  but you should be able to sketch it on the whiteboard.

---

## Mapping projects to course weeks

| Project | Difficulty | W1 | W2 | W3 | W4 | W5 | W6 | W7 | W8 |
|---|---|---|---|---|---|---|---|---|---|
| 1 — Lamport CLI | ★ | | | | | | ● | | |
| 2 — Merkle MSS | ★ | | | | | | ● | | |
| 3 — NTT for Kyber | ★★ | | | ● | | | | | |
| 4 — Toy Regev LWE | ★★ | | ● | ● | | | | | |
| 5 — Toy NTRU + DFR | ★★ | | | | | | | ● | |
| 6 — McEliece on small codes | ★★ | | | | | ● | | | |
| 7 — WOTS+ with bitmasks | ★★★ | | | | | | ● | | |
| 8 — Toy ML-KEM-512 | ★★★ | | | ● | | | | | |
| 9 — LLL + knapsack break | ★★★ | ● | ● | | | | | | |
| 10 — HNP / biased ECDSA | ★★★★ | | ● | | ● | | | | |

A "●" means the project's prerequisites are taught in that week.
The four-week practical window (W5 → W8) is what you should plan
around: pick the topic by end of W4, code in W5–W7, present in W8.

---

## Things that are NOT good projects

(Listed so you don't waste a week on them.)

- **"Implement Falcon from FIPS 206 draft."** Falcon's Gaussian sampler
  uses fast Fourier sampling on NTRU lattices; the implementation is a
  multi-month undertaking and the security-critical constant-time
  requirements are unforgiving. Read the spec for fun, do not try to
  reimplement.
- **"Break RSA-2048 with Shor on a real quantum computer."** Even the
  largest current quantum devices cannot factor 21 reliably. Pick a
  classical project.
- **"Implement SLH-DSA-SHA2-256s end to end."** ~1500 lines of careful
  code, lots of stateless-deterministic-derivation traps. The
  size-accounting exercise (Ex 47.E6) is the right scope.
- **"Rederive the Castryck–Decru attack."** Months of supersingular-
  isogeny background required. Project 10 (HNP/ECDSA) is the more
  tractable lattice-cryptanalysis project at the same level of
  *cryptanalytic* drama.

---

## Office hours

If you are stuck on any of these, bring a minimal failing example to
office hours. "My LLL output is wrong" is hard to debug; "here is a
2-cell notebook where my `gram_schmidt(B_bad)` returns a non-orthogonal
result" is easy.

Good luck.