# Group Application Project — Topic Catalogue

> *"Make it work, make it right, make it fast."* — Kent Beck

## What you are asked to do

Form a team of **3–5 students** and pick a project from the catalogue below — or propose your own (see [§ Custom topics](#custom-topics)). Over **about one month** you will design, train, and demonstrate a working solution to a practical problem, drawing on material from any of Chapters 1–40 of this book.

The emphasis is on **feasibility** and **practical engineering**:

- The model must train end-to-end on a single CPU or modest GPU in **under one hour** for the final reported run.
- Use small, public datasets (≤ 1 GB raw, downloadable in minutes).
- Build something a non-specialist can run: a clear `README`, a `requirements.txt`, one training command (`python train.py`), and one demo command (`python demo.py`) or a small notebook.
- Be honest about what works and what does not — failure modes that you understand are worth more than glossy claims that fall apart on a held-out test.

You are **not** expected to discover something new. You are expected to **execute well** on a known problem and explain *why* every architectural and training choice you made was the right one for your problem.

## Deliverables

A single Git repository (or zip) containing:

1. **`README.md`** — one-page project overview: problem, approach, dataset, results, how to reproduce.
2. **Training code** — clean Python files (or a notebook) that take raw data → trained model. Reproducible from a fixed random seed.
3. **A trained model checkpoint** (≤ 100 MB) plus the demo script that loads it.
4. **A report** (~6–10 pages or 15–20 notebook cells of prose + figures) covering: motivation, related work, data, model architecture (with explicit references to course chapters), training details, results, error analysis, lessons learned.
5. **Presentation slides** (10–15 slides) that you will show in a 10-minute final talk + 5 min Q&A. The "wow" component is graded here.
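
For the fixed-seed requirement in item 2, a seeding helper could look roughly like the sketch below. The name `set_seed` and the default seed value are illustrative assumptions, and the NumPy and PyTorch branches are simply skipped when those libraries are not installed.

```python
import os
import random


def set_seed(seed: int = 1234) -> None:
    """Seed the random number generators a small project typically relies on."""
    random.seed(seed)
    # Hash randomisation is fixed at interpreter start, so this only
    # affects subprocesses launched after this point.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


# Two runs with the same seed should draw identical random numbers.
set_seed(1234)
first = [random.random() for _ in range(3)]
set_seed(1234)
second = [random.random() for _ in range(3)]
print(first == second)
```

Calling the helper once at the top of `train.py` is usually enough for CPU runs; some GPU kernels stay non-deterministic even with fixed seeds, so small run-to-run differences there are worth noting in the report.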

## Grading rubric

| Criterion | Weight |
|---|---|
| Quality of the code (clean, reproducible, modular) | 30% |
| Clarity of the explanation (report + README) | 30% |
| Presentation quality (slides, talk, demo) | 20% |
| Wow effect (originality, polish, demo magic) | 20% |

The "wow effect" rewards the team that goes one step beyond a working baseline — an unexpected demo, an interactive UI, an unusual dataset, a particularly clean failure analysis, a creative twist on a classical idea.

## Suggested timeline (4 weeks)

| Week | Focus |
|---|---|
| 1 | Pick topic · acquire and explore data · agree on labels/metrics · sketch architecture |
| 2 | Baseline (small simple model) trains end-to-end · first evaluation pass |
| 3 | Stronger model · ablations · error analysis · demo prototype |
| 4 | Polish the report · build slides · rehearse · final reproducible training run |

For a team of 4, plan on roughly 30 hours per member (about 120 person-hours in total): enough to do a real job, not so much that you should treat this like a research paper.

---

## Project catalogue

Each entry lists the **anchor chapters** in this book, a suggested **dataset**, the rough **model size** that fits the feasibility constraint, and a "wow angle" you can develop for the presentation.

### Vision

#### 1. Polish handwritten digits / characters
**Chapters:** 22–25 (CNN architecture, training, experiments) · 31 (PyTorch CNN)
**Data:** Build a small dataset by collecting handwritten Polish digits and accented letters (ą ć ę ł ń ó ś ź ż) from the team — 200–500 samples per class, 28×28 grayscale. Or use EMNIST + a synthetic accented-letter generator.
**Model:** A LeNet-style CNN (~50K params) — trains in <5 min CPU.
**Wow angle:** Live demo — point a phone camera at handwritten samples, classify in real time using the browser ML stack (TensorFlow.js or ONNX runtime web).

#### 2. Plant disease classifier
**Chapters:** 22–25 · 27 (optimizers, data augmentation)
**Data:** PlantVillage subset (~50K images, 38 classes) — pick 5–10 most relevant species.
**Model:** A small CNN (~500K params) or fine-tune a pretrained MobileNet-V2 with the head replaced.
**Wow angle:** Grad-CAM heatmaps that show *where on the leaf* the model is looking; compare with a botanist-friendly "what to look for" list.

#### 3. Chest X-ray triage
**Chapters:** 22–25 · 27 (regularisation, class-imbalance handling)
**Data:** Pneumonia subset of ChestX-ray8 / Kermany 2018 (~5K images, 2 classes). **Disclaimer:** for educational use only — must be stated in every deliverable.
**Model:** Small CNN trained from scratch *or* a torchvision ResNet18 with only the head fine-tuned.
**Wow angle:** Calibration curves, sensitivity/specificity at different thresholds, side-by-side with a radiologist-friendly explanation. *Never* claim diagnostic value — frame as triage prioritisation.
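
The sensitivity/specificity trade-off mentioned above can be tabulated with a few lines of plain Python. This is a minimal sketch: the labels and scores are made-up illustration data, and the function name is an assumption rather than part of any library.

```python
def sensitivity_specificity(y_true, scores, threshold):
    """Return (sensitivity, specificity), counting score >= threshold as positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Made-up labels (1 = pneumonia) and model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
for threshold in (0.25, 0.5, 0.75):
    sens, spec = sensitivity_specificity(y_true, scores, threshold)
    print(f"threshold={threshold:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

For triage, a low threshold that keeps sensitivity high is usually the more defensible operating point, at the price of more false positives.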

#### 4. Polish road-sign recogniser
**Chapters:** 22–25 · 27 (data augmentation pipeline)
**Data:** GTSRB (German signs are very similar to Polish): 43 classes, ~39K training images and ~12.6K test images (just over 50K in total).
**Model:** Spatial transformer network on top of a 4-conv-layer CNN (~200K params).
**Wow angle:** Live webcam demo — sliding-window detection on photos taken around campus; visualise the spatial-transformer alignment on rotated/skewed signs.

#### 5. Adversarial-example showcase
**Chapters:** 22–25 (CNN) · 17 (gradients) · 19 (universal approximation, but with limits)
**Data:** MNIST or CIFAR-10.
**Model:** Train a small CNN to ~99% test accuracy on MNIST or ~85% on CIFAR-10. Implement the FGSM (Goodfellow et al. 2015) and PGD (Madry et al. 2018) attacks.
**Wow angle:** Interactive web demo — user uploads an image, the team's model classifies it, then perturbs it imperceptibly until classification flips. Report the smallest $\epsilon$ that fools each architecture.
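
To make the attack concrete, here is a toy sketch of a single FGSM step on a logistic-regression model, where the input gradient has a closed form. The weights, input, and $\epsilon$ are hypothetical values picked so the prediction flips; in the project the gradient would come from automatic differentiation through your CNN instead.

```python
import math


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def fgsm(w, b, x, y, eps):
    """Single FGSM step: x_adv = clip(x + eps * sign(dLoss/dx), 0, 1)."""
    p = predict_proba(w, b, x)
    # For binary cross-entropy on a logistic model, dLoss/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    step = [eps * ((g > 0) - (g < 0)) for g in grad]
    return [min(1.0, max(0.0, xi + si)) for xi, si in zip(x, step)]


# Hypothetical weights and input, chosen only for illustration.
w, b = [2.0, -1.0, 0.5], -0.2
x, y = [0.6, 0.2, 0.5], 1
x_adv = fgsm(w, b, x, y, eps=0.4)
print(round(predict_proba(w, b, x), 3), round(predict_proba(w, b, x_adv), 3))
```

The $\epsilon$ here is far too large to be imperceptible; it only keeps the toy example short. PGD repeats this step with a smaller step size and projects back into the $\epsilon$-ball around the original input after each iteration.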

### Audio

#### 6. Bird-call classifier
**Chapters:** 22–25 (CNN on spectrograms) · 27 (data augmentation)
**Data:** xeno-canto subset, 10–20 European bird species, 30 s clips, ~200 per class. Convert to mel-spectrograms.
**Model:** A small VGG-style CNN over $128 \times 128$ spectrograms.
**Wow angle:** Live demo using a laptop microphone — record a short clip, predict the species. Add an "uncertainty" output so the model says "I don't know" on out-of-distribution sounds.
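
One simple way to get an "I don't know" output is to abstain whenever the largest softmax probability falls below a threshold. The sketch below assumes that approach; the species labels, logits, and the 0.7 threshold are hypothetical, and max-softmax is only a baseline, since networks can be confidently wrong on out-of-distribution sounds.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(logits, labels, min_confidence=0.7):
    """Return (label, confidence), or ("I don't know", confidence) below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return "I don't know", probs[best]
    return labels[best], probs[best]


# Hypothetical species list and logits, for illustration only.
labels = ["great tit", "blackbird", "robin"]
print(predict_or_abstain([4.0, 0.5, 0.2], labels))
print(predict_or_abstain([1.0, 0.9, 1.1], labels))
```

Tune the threshold on a validation set that includes recordings of non-bird sounds, and report how often the model abstains on in-distribution clips.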

#### 7. Polish wake-word detector
**Chapters:** 32–34 (RNN/LSTM) · 22–25 (CNN over spectrograms)
**Data:** Each team member records 100+ samples of a chosen Polish wake word (e.g. "Marek!") plus 30 minutes of negative-class audio (TV, music, Polish speech without the word).
**Model:** CNN over short mel-spectrogram windows + a streaming detection head.
**Wow angle:** Real-time terminal demo: a Python script that listens via the laptop mic and prints the timestamp every time the wake word fires; benchmark false-alarm rate on background audio.
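
The streaming head and the false-alarm benchmark can be prototyped on per-window scores before any audio code exists. The sketch below assumes the CNN emits one score per hop; the threshold, hop length, and refractory period are hypothetical values to tune.

```python
def detect_events(scores, threshold=0.8, hop_s=0.1, refractory_s=1.0):
    """Timestamps (seconds) where the score reaches the threshold.

    Detections closer than `refractory_s` to the previous one are suppressed,
    so a single utterance does not fire several times.
    """
    events = []
    last = float("-inf")
    for index, score in enumerate(scores):
        t = index * hop_s
        if score >= threshold and t - last >= refractory_s:
            events.append(t)
            last = t
    return events


def false_alarms_per_hour(events, duration_s):
    """Detections on negative-only audio, normalised to one hour."""
    return len(events) * 3600.0 / duration_s


# Made-up score stream: one burst (two adjacent frames) and one later spike.
scores = [0.1, 0.9, 0.95, 0.2] + [0.1] * 20 + [0.85]
events = detect_events(scores)
print(len(events), false_alarms_per_hour(events, duration_s=1800.0))
```

Running `detect_events` over the 30 minutes of negative-class audio gives the false-alarm figure directly, because every detection on that audio is by construction a false alarm.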

### Language

#### 8. Polish text auto-complete (char-RNN)
**Chapters:** 32–35 (RNN, LSTM, char-RNN)
**Data:** Combine a small Polish corpus (~5–20 MB) from Wolne Lektury — Mickiewicz, Sienkiewicz, Reymont. Filter for one author or genre.
**Model:** 2-layer LSTM (~1M params), char-level.
**Wow angle:** Web demo where the user types Polish prose and the model continues it, with a temperature slider and a side-by-side "what would a Transformer say" comparison (use `gpt-2-pl` from HuggingFace as a reference baseline).
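
The temperature slider boils down to dividing the logits by a temperature before sampling. A minimal sketch, assuming the model returns one logit per character; the logits below are made up.

```python
import math
import random


def sample_with_temperature(logits, temperature, rng):
    """Sample an index from softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [v / temperature for v in logits]
    peak = max(scaled)
    # Unnormalised softmax weights; rng.choices normalises them internally.
    weights = [math.exp(v - peak) for v in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


rng = random.Random(0)
logits = [2.0, 1.0, 0.1]  # hypothetical next-character logits
cold = [sample_with_temperature(logits, 0.05, rng) for _ in range(200)]
hot = [sample_with_temperature(logits, 5.0, rng) for _ in range(200)]
print(sorted(set(cold)), sorted(set(hot)))
```

Low temperatures concentrate the samples on the most likely character, while high temperatures flatten the distribution and make the text more varied and more error-prone.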

#### 9. Polish poem generator (Tiny Transformer)
**Chapters:** 37–40 (attention, Transformer)
**Data:** Polish poetry — e.g. Mickiewicz + Słowacki + Norwid (≈3 MB clean text).
**Model:** 2-layer decoder-only Transformer, ~500K params, char-level or BPE.
**Wow angle:** Conditioning — the user picks an author and the model imitates that author's style. Quantify with a held-out classifier that distinguishes the three poets.
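
A quick parameter count helps check that a configuration lands near the ~500K budget. The sketch below assumes a GPT-style block (four attention projections with biases, an MLP with 4× hidden width, two LayerNorms), learned positional embeddings, and an output head tied to the token embedding; the concrete sizes are hypothetical.

```python
def decoder_param_count(d_model, n_layers, vocab_size, context_len, tie_output=True):
    """Approximate parameter count of a small decoder-only Transformer."""
    attention = 4 * (d_model * d_model + d_model)           # Q, K, V, output projections
    mlp = 2 * (d_model * 4 * d_model) + 4 * d_model + d_model  # two linear layers with biases
    norms = 2 * 2 * d_model                                  # two LayerNorms per block
    per_layer = attention + mlp + norms
    embeddings = vocab_size * d_model + context_len * d_model  # token + positional
    final_norm = 2 * d_model
    head = 0 if tie_output else vocab_size * d_model
    return n_layers * per_layer + embeddings + final_norm + head


# Hypothetical char-level setup: 100 symbols, context of 256 characters.
print(decoder_param_count(d_model=128, n_layers=2, vocab_size=100, context_len=256))
```

With a BPE vocabulary of a few thousand tokens, the embedding table grows quickly and can dominate the budget, which is one argument for staying at character level on a 3 MB corpus.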

#### 10. Polish NER tagger
**Chapters:** 37–40 (attention) · 31 (PyTorch training loop)
**Data:** PolEval NER (~10K sentences, BIO tags for PER/LOC/ORG).
**Model:** A small bidirectional LSTM + linear head, *or* a HuggingFace `polish-roberta-base` with only the last layer fine-tuned.
**Wow angle:** Live web tool: paste a Polish news article, see entities highlighted with colour codes; report F1 on the held-out PolEval test set.

#### 11. Sentiment of Polish movie reviews
**Chapters:** 32–34 (RNN/LSTM) · 26 (loss functions, class imbalance)
**Data:** Filmweb / Allegro reviews scraped or from PolEmo2.0.
**Model:** Embedding + bidirectional LSTM + classifier head; or a tiny Transformer.
**Wow angle:** Two-axis output — the model predicts both polarity and *aspect* (acting / plot / cinematography). Visualise attention to show *which* phrases drove each prediction.

### Time series

#### 12. Polish electricity load forecaster
**Chapters:** 32–34 (RNN/LSTM) · 27 (regularisation)
**Data:** PSE (Polskie Sieci Elektroenergetyczne) public data — hourly national load for the last 5 years.
**Model:** LSTM with calendar features (weekday, hour, holiday). Forecast horizon: next 24 h.
**Wow angle:** Side-by-side with a strong classical baseline (SARIMA, or simply repeating last week's load for the same hour). Report MAPE for the team's model and the baseline; show where each is best.
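
Both the last-week baseline and the MAPE metric fit in a few lines, which makes them a good first commit. A minimal sketch, assuming an hourly series stored as a plain list; the synthetic load values are made up.

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent. Assumes no actual value is zero."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def last_week_baseline(history, horizon=24, season=168):
    """Forecast the next `horizon` hours by copying the same hours one week earlier."""
    if horizon > season or len(history) < season:
        raise ValueError("need at least one full season of history")
    start = len(history) - season
    return history[start:start + horizon]


# Synthetic series with a perfect weekly pattern, for illustration only.
series = [100 + (t % 168) for t in range(360)]
history, actual = series[:336], series[336:360]
print(mape(actual, last_week_baseline(history)))
```

On real PSE data the baseline will not be perfect, but it is often hard to beat on ordinary weekdays; holidays are where calendar features should help most.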

#### 13. Air-quality forecast for a Polish city
**Chapters:** 32–34
**Data:** GIOŚ (Główny Inspektorat Ochrony Środowiska) public API — PM2.5, PM10, NO₂, O₃ for one city, hourly.
**Model:** LSTM combining target city + neighbouring stations + weather features.
**Wow angle:** A Streamlit dashboard with a 48 h forecast + uncertainty bands; alert thresholds aligned with WHO guidelines.

### Recommendation / retrieval

#### 14. Mini search engine over the course book
**Chapters:** 39–40 (self-attention, Transformer)
**Data:** This book itself! All 40 chapters as Markdown/notebook text, split into ~500-token chunks.
**Model:** Use a small pretrained sentence-embedding model (e.g. `intfloat/multilingual-e5-small`) to index every chunk; the team writes the indexer, retriever, and evaluation harness.
**Wow angle:** A web search box where the user asks a question in EN or PL ("what is Bahdanau attention?") and gets the top 5 chapter snippets with chapter links. Compare against simple TF-IDF.
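
The TF-IDF comparison can be written without any dependencies. The sketch below assumes a smoothed IDF and cosine similarity over sparse dictionaries; the four chunks are invented stand-ins for real chapter text.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def build_index(chunks):
    """Return (idf, vectors): smoothed IDF weights and one sparse TF-IDF vector per chunk."""
    counts = [Counter(tokenize(c)) for c in chunks]
    df = Counter(term for c in counts for term in c)
    n = len(chunks)
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in c.items()} for c in counts]
    return idf, vectors


def cosine(u, v):
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query, idf, vectors, k=5):
    """Indices of the k most similar chunks; query terms unseen in the corpus get weight 0."""
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    order = sorted(range(len(vectors)), key=lambda i: cosine(q, vectors[i]), reverse=True)
    return order[:k]


# Invented chunks standing in for real chapter text.
chunks = [
    "Bahdanau attention scores each encoder state with a small feed-forward network",
    "Convolution layers share weights across image positions",
    "LSTM gates control which information the cell state keeps",
    "Self-attention compares every token with every other token",
]
idf, vectors = build_index(chunks)
print(search("what is Bahdanau attention?", idf, vectors, k=2))
```

Keeping the `search` signature identical for the embedding-based retriever lets the same evaluation harness score both systems.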

#### 15. Movie recommender from MovieLens
**Chapters:** 13 (Hebbian / matrix factorisation in disguise) · 17 (embeddings)
**Data:** MovieLens 100K: 100,000 ratings from 943 users on 1,682 movies.
**Model:** Two-tower neural network: user-embedding × movie-embedding → score.
**Wow angle:** Show "users who liked X also liked Y" via cosine similarity in the learnt movie-embedding space; visualise the movie embeddings with t-SNE coloured by genre.

### Reproduce-a-classic

#### 16. LeNet-5 on MNIST, faithful to LeCun 1998
**Chapters:** 22–25 · 27
**Data:** MNIST.
**Model:** Reproduce LeNet-5 with the *exact* layer sizes and activations from the 1998 paper.
**Wow angle:** Match the paper's reported error rate within 0.2 percentage points; write a 2-page "what we changed and why" comparing 1998 design to modern best practice. Train ResNet-18 in parallel as a "what we'd do today" baseline.

#### 17. ResNet on CIFAR-10
**Chapters:** 22–25 · 27 (residuals, batch norm)
**Data:** CIFAR-10.
**Model:** ResNet-20 (~270K params) from He et al. 2015, trained from scratch.
**Wow angle:** Strip the residual connections and re-train, then show how much accuracy drops. Build one figure that backs a claim of the form "the residual connection saves you 5+ percentage points and trains 3× more stably", using the numbers you actually measure.

### Creative

#### 18. Neural style transfer — Polish painting edition
**Chapters:** 22–25 (CNN) · 17 (gradients on inputs)
**Data:** Public-domain images of paintings by Matejko, Kossak, Wyspiański (Wikimedia Commons).
**Model:** Gatys et al. 2015 style-transfer using a pretrained VGG-16. Implement the optimisation loop yourself.
**Wow angle:** Web app where the user uploads a photo and picks a Polish painter's style; produce a stylised image in <30 s on CPU. Frame the gradients-on-inputs idea as the same trick as adversarial examples (Project 5), used for art instead of attack.

#### 19. Tiny GPT writes Pan Tadeusz
**Chapters:** 37–40 (Transformer)
**Data:** Mickiewicz's *Pan Tadeusz* (full text, ~370K tokens) from Wolne Lektury.
**Model:** 2-layer decoder-only Transformer, char- or BPE-level, ~500K params.
**Wow angle:** Side-by-side comparison with the LSTM char-RNN trained on the same text (cf. individual mini-project 26). Report perplexity and assess coherence qualitatively on a 200-character sample.
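
Perplexity is the exponential of the mean negative log-likelihood per token. The sketch below assumes you have collected the natural-log probability each model assigned to every true token of the held-out text; the function names and the uniform toy input are illustrative.

```python
import math


def perplexity(log_probs):
    """exp(mean negative log-likelihood); inputs are natural-log probabilities."""
    return math.exp(-sum(log_probs) / len(log_probs))


def bits_per_character(log_probs, n_characters):
    """Total negative log-likelihood in bits, divided by the characters covered."""
    return -sum(log_probs) / (math.log(2) * n_characters)


# Toy input: a model that spreads its probability uniformly over 4 symbols.
uniform = [math.log(0.25)] * 8
print(perplexity(uniform), bits_per_character(uniform, n_characters=8))
```

Per-token perplexities of a char-level and a BPE-level model are not directly comparable because the tokens differ in length; normalising by the number of characters, as in `bits_per_character`, puts both models on the same scale.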

---

## Custom topics

If your team has a domain you care about (medicine, sports, finance, music, your own hobby data) you may propose your own. The proposal should be **one page**, addressing:

- The problem and why it matters
- Which course chapters (1–40) it builds on — at least three
- Dataset source and approximate size
- Model size + estimated training time
- The deliverable a non-specialist could run
- The intended "wow" angle for the presentation

Send the proposal to the instructor by **end of week 1** for approval.

## What is NOT expected

- New theorems or novel architectures.
- State-of-the-art accuracy on a hard benchmark.
- Cloud GPUs, distributed training, or large pretrained models (the small pretrained backbones named in the catalogue are fine).

A clean, reproducible, well-explained baseline that *works* on a real problem is the goal. Sophistication for its own sake costs you points.

## Submission

A single Git repository URL (private or public, instructor invited) **or** a zip with no git history. Include the report + slides in the same archive. Final deadline as announced in class (early June). Each team gives a 10 min presentation + 5 min Q&A in the final lab session.