Group Application Project: Topic Catalogue

“Make it work, make it right, make it fast.” — Kent Beck

What you are asked to do

Form a team of 3–5 students and pick a project from the catalogue below — or propose your own (see § Custom topics). Over about one month you will design, train, and demonstrate a working solution to a practical problem, drawing on material from any of Chapters 1–40 of this book.

The emphasis is on feasibility and practical engineering:

  • The model must train end-to-end on a single CPU or modest GPU in under one hour for the final reported run.

  • Use small, public datasets (≤ 1 GB raw, downloadable in minutes).

  • Build something a non-specialist can run: a clear README, a requirements.txt, and single-command entry points, python train.py and python demo.py (or a small notebook).

  • Be honest about what works and what does not — failure modes that you understand are worth more than glossy claims that fall apart on a held-out test.

You are not expected to discover something new. You are expected to execute well on a known problem and explain why every architectural and training choice you made was the right one for your problem.

Deliverables

A single Git repository (or zip) containing:

  1. README.md — one-page project overview: problem, approach, dataset, results, how to reproduce.

  2. Training code — clean Python files (or a notebook) that take raw data → trained model. Reproducible from a fixed random seed.

  3. A trained model checkpoint (≤ 100 MB) plus the demo script that loads it.

  4. A report (~6–10 pages or 15–20 notebook cells of prose + figures) covering: motivation, related work, data, model architecture (with explicit references to course chapters), training details, results, error analysis, lessons learned.

  5. Presentation slides (10–15 slides) that you will show in a 10-minute final talk + 5 min Q&A. The “wow” component is graded here.
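Deliverable 2 asks for training that is reproducible from a fixed random seed. A minimal sketch of a seeding helper is given here; the name set_seed and the optional NumPy and PyTorch branches are assumptions about a typical project stack, not a course requirement.

```python
import random


def set_seed(seed: int = 0) -> None:
    """Seed the random number generators a small project is likely to use."""
    random.seed(seed)
    try:  # NumPy is optional in this sketch
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:  # PyTorch is optional in this sketch
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass


# Two runs with the same seed should draw the same numbers.
set_seed(42)
first = [random.random() for _ in range(3)]
set_seed(42)
second = [random.random() for _ in range(3)]
```

Calling the helper once at the top of train.py is usually enough for CPU runs; some GPU kernels may remain non-deterministic even with a fixed seed.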

Grading rubric

| Criterion | Weight |
| --- | --- |
| Quality of the code (clean, reproducible, modular) | 30% |
| Clarity of the explanation (report + README) | 30% |
| Presentation quality (slides, talk, demo) | 20% |
| Wow effect (originality, polish, demo magic) | 20% |

The “wow effect” rewards the team that goes one step beyond a working baseline — an unexpected demo, an interactive UI, an unusual dataset, a particularly clean failure analysis, a creative twist on a classical idea.

Suggested timeline (4 weeks)

| Week | Focus |
| --- | --- |
| 1 | Pick topic · acquire and explore data · agree on labels/metrics · sketch architecture |
| 2 | Baseline (small simple model) trains end-to-end · first evaluation pass |
| 3 | Stronger model · ablations · error analysis · demo prototype |
| 4 | Polish the report · build slides · rehearse · final reproducible training run |

Budget roughly 30 hours per member, i.e. about 120 person-hours for a team of 4: enough to do a real job, not so much that you should treat this like a research paper.


Project catalogue

Each entry lists the anchor chapters in this book, a suggested dataset, the rough model size that fits the feasibility constraint, and a “wow angle” you can develop for the presentation.

Vision

1. Polish handwritten digits / characters

Chapters: 22–25 (CNN architecture, training, experiments) · 31 (PyTorch CNN)

Data: Build a small dataset by collecting handwritten digits and Polish accented letters (ą ć ę ł ń ó ś ź ż) from the team: 200–500 samples per class, 28×28 grayscale. Alternatively, use EMNIST plus a synthetic accented-letter generator.

Model: A LeNet-style CNN (~50K params), which trains in under 5 minutes on a CPU.

Wow angle: Live demo. Point a phone camera at handwritten samples and classify them in real time in the browser (TensorFlow.js or ONNX Runtime Web).

2. Plant disease classifier

Chapters: 22–25 · 27 (optimizers, data augmentation)

Data: A subset of PlantVillage (the full set has ~50K images in 38 classes); pick the 5–10 most relevant species.

Model: A small CNN (~500K params), or fine-tune a pretrained MobileNet-V2 with the head replaced.

Wow angle: Grad-CAM heatmaps that show where on the leaf the model is looking; compare with a botanist-friendly “what to look for” list.

3. Chest X-ray triage

Chapters: 22–25 · 27 (regularisation, class-imbalance handling)

Data: A pneumonia subset of ChestX-ray8, or the Kermany et al. 2018 chest X-ray set (~5K images, 2 classes).

Disclaimer: For educational use only. This must be stated in every deliverable.

Model: A small CNN trained from scratch, or a torchvision ResNet-18 with only the head fine-tuned.

Wow angle: Calibration curves, sensitivity/specificity at different thresholds, side by side with a radiologist-friendly explanation. Never claim diagnostic value; frame the task as triage prioritisation.
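The "sensitivity/specificity at different thresholds" analysis in Project 3 can be sketched in plain Python. The labels and scores here are made-up toy values, and the function name is an assumption for illustration.

```python
def sensitivity_specificity(y_true, scores, threshold):
    """Return (sensitivity, specificity), predicting positive when score >= threshold."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data: 1 = pneumonia, 0 = normal; scores are model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
table = {t: sensitivity_specificity(y_true, scores, t) for t in (0.25, 0.5, 0.75)}
for threshold, (sens, spec) in table.items():
    print(f"threshold={threshold:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

Lowering the threshold trades specificity for sensitivity, which is the trade-off a triage framing has to argue about explicitly.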

4. Polish road-sign recogniser

Chapters: 22–25 · 27 (data augmentation pipeline)

Data: GTSRB (German signs are very similar to Polish ones): 43 classes, ~39K training images and ~12.6K test images (~50K in total).

Model: Spatial transformer network on top of a 4-conv-layer CNN (~200K params).

Wow angle: Live webcam demo with sliding-window detection on photos taken around campus; visualise the spatial-transformer alignment on rotated/skewed signs.

5. Adversarial-example showcase

Chapters: 22–25 (CNN) · 17 (gradients) · 19 (universal approximation, but with limits)

Data: MNIST or CIFAR-10.

Model: Train a small CNN to ~99% accuracy on MNIST or ~85% on CIFAR-10. Implement the FGSM (Goodfellow et al. 2015) and PGD (Madry et al. 2018) attacks.

Wow angle: Interactive web demo. The user uploads an image, the team’s model classifies it, and the demo then perturbs the image imperceptibly until the classification flips. Report the smallest \(\epsilon\) that fools each architecture.
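The FGSM step in Project 5 is a single signed-gradient move on the input: \(x_{adv} = x + \epsilon \cdot \mathrm{sign}(\nabla_x L)\). The sketch here applies it to a toy logistic-regression model rather than a CNN, so that the input gradient can be written by hand. The weights, input and \(\epsilon\) are made-up values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def bce_loss(w, b, x, y):
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def sign(g):
    return (g > 0) - (g < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move each input coordinate by eps in the direction
    that increases the loss."""
    p = predict(w, b, x)
    grad_x = [(p - y) * wi for wi in w]  # d(BCE)/dx for logistic regression
    return [xi + eps * sign(g) for xi, g in zip(x, grad_x)]


# Toy model and input (assumed values for illustration only).
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.5, 0.2, 0.8], 1
eps = 0.1
x_adv = fgsm(w, b, x, y, eps)
loss_clean = bce_loss(w, b, x, y)
loss_adv = bce_loss(w, b, x_adv, y)
```

For images, the perturbed input would normally also be clipped back to the valid pixel range, and PGD repeats this step several times with a projection onto the \(\epsilon\)-ball.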

Audio

6. Bird-call classifier

Chapters: 22–25 (CNN on spectrograms) · 27 (data augmentation)

Data: xeno-canto subset: 10–20 European bird species, 30 s clips, ~200 per class. Convert the clips to mel-spectrograms.

Model: A small VGG-style CNN over \(128 \times 128\) spectrograms.

Wow angle: Live demo using a laptop microphone. Record a short clip and predict the species. Add an “uncertainty” output so the model says “I don’t know” on out-of-distribution sounds.
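One simple way to get the "I don't know" output in Project 6 is to abstain when the highest softmax probability falls below a threshold. This is only a baseline sketch: softmax confidence is known to be overconfident on unfamiliar inputs, and the species names, logits and the 0.6 threshold are assumptions for illustration.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(logits, labels, min_confidence=0.6):
    """Return (label, confidence), or ("I don't know", confidence) when the
    top softmax probability is below min_confidence."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return "I don't know", probs[best]
    return labels[best], probs[best]


species = ["blackbird", "great tit", "robin"]  # hypothetical class list
confident = predict_or_abstain([4.0, 0.5, 0.2], species)
unsure = predict_or_abstain([1.0, 0.9, 1.1], species)
```

The threshold is best tuned on held-out clips that include sounds from none of the target species.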

7. Polish wake-word detector

Chapters: 32–34 (RNN/LSTM) · 22–25 (CNN over spectrograms)

Data: Each team member records 100+ samples of a chosen Polish wake word (e.g. “Marek!”) plus 30 minutes of negative-class audio (TV, music, Polish speech without the word).

Model: CNN over short mel-spectrogram windows + a streaming detection head.

Wow angle: Real-time terminal demo. A Python script listens via the laptop mic and prints the timestamp every time the wake word fires; benchmark the false-alarm rate on background audio.
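The streaming detection head and the false-alarm benchmark in Project 7 can be sketched as a threshold with a refractory period, so that one utterance does not fire several times. The frame length, threshold, refractory period and scores are all assumed values, not taken from the project description.

```python
def detect_events(frame_scores, frame_seconds, threshold=0.8, refractory_seconds=1.0):
    """Return timestamps (in seconds) where the score reaches the threshold,
    ignoring frames within refractory_seconds of the previous firing."""
    events = []
    last_fired = None
    for i, score in enumerate(frame_scores):
        t = i * frame_seconds
        if score >= threshold and (last_fired is None or t - last_fired >= refractory_seconds):
            events.append(t)
            last_fired = t
    return events


def false_alarms_per_hour(events, audio_seconds):
    """Firing rate on audio that is known to contain no wake word."""
    return len(events) * 3600.0 / audio_seconds


# Toy per-frame scores on 4 s of negative-class audio, one score every 0.5 s.
negative_scores = [0.1, 0.9, 0.95, 0.2, 0.1, 0.85, 0.1, 0.1]
events = detect_events(negative_scores, frame_seconds=0.5)
rate = false_alarms_per_hour(events, audio_seconds=4.0)
```

On the real 30 minutes of background audio the same two functions give a rate that can be reported alongside the detection rate on positive samples.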

Language

8. Polish text auto-complete (char-RNN)

Chapters: 32–35 (RNN, LSTM, char-RNN)

Data: Combine a small Polish corpus (~5–20 MB) from Wolne Lektury (Mickiewicz, Sienkiewicz, Reymont). Filter for one author or genre.

Model: 2-layer LSTM (~1M params), char-level.

Wow angle: Web demo where the user types Polish prose and the model continues it, with a temperature slider and a side-by-side “what would a Transformer say” comparison (use gpt-2-pl from HuggingFace as a reference baseline).
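The temperature slider in Project 8 divides the logits by a temperature before the softmax: low values make the output nearly greedy, high values make it close to uniform. A minimal sketch, with a made-up three-character vocabulary and logits:

```python
import math
import random


def temperature_probs(logits, temperature):
    """Softmax of logits / temperature (temperature must be > 0)."""
    scaled = [v / temperature for v in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_char(logits, vocab, temperature=1.0, rng=random):
    """Draw one character from the temperature-scaled distribution."""
    probs = temperature_probs(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


vocab = ["a", "ł", "z"]   # hypothetical vocabulary
logits = [2.0, 1.0, 0.1]  # hypothetical next-character logits
p_cold = temperature_probs(logits, 0.1)
p_base = temperature_probs(logits, 1.0)
p_hot = temperature_probs(logits, 10.0)
next_char = sample_char(logits, vocab, temperature=0.5, rng=random.Random(0))
```

Passing a seeded random.Random instance as rng keeps the demo reproducible.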

9. Polish poem generator (Tiny Transformer)

Chapters: 37–40 (attention, Transformer)

Data: Polish poetry, e.g. Mickiewicz + Słowacki + Norwid (≈3 MB clean text).

Model: 2-layer decoder-only Transformer, ~500K params, char-level or BPE.

Wow angle: Conditioning. The user picks an author and the model imitates that author’s style. Quantify with a held-out classifier that distinguishes the three poets.

10. Polish NER tagger

Chapters: 37–40 (attention) · 31 (PyTorch training loop)

Data: PolEval NER (~10K sentences, BIO tags for PER/LOC/ORG).

Model: A small bidirectional LSTM + linear head, or a HuggingFace polish-roberta-base with only the last layer fine-tuned.

Wow angle: Live web tool. Paste a Polish news article and see entities highlighted with colour codes; report F1 on the held-out PolEval test set.
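F1 for NER in Project 10 is usually computed on whole entities rather than on individual tags. A minimal sketch of exact-match entity-level F1 over BIO tags follows; the official PolEval scorer may use different matching rules, so treat this as an assumption-laden sanity check rather than the reference metric.

```python
def bio_to_spans(tags):
    """Convert a BIO tag sequence to a set of (start, end_exclusive, type) spans."""
    spans = set()
    start, kind = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        continues = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, kind))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return spans


def entity_f1(gold_tags, pred_tags):
    """Exact-match precision, recall and F1 over entity spans."""
    gold, pred = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


gold = ["B-PER", "I-PER", "O", "B-LOC", "O"]
pred = ["B-PER", "I-PER", "O", "B-ORG", "O"]
scores = entity_f1(gold, pred)
```

Here the person span matches exactly while the second entity has the wrong type, so one of two predicted entities counts as correct.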

11. Sentiment of Polish movie reviews

Chapters: 32–34 (RNN/LSTM) · 26 (loss functions, class imbalance)

Data: Filmweb / Allegro reviews (scraped), or PolEmo2.0.

Model: Embedding + bidirectional LSTM + classifier head; or a tiny Transformer.

Wow angle: Two-axis output. The model predicts both polarity and aspect (acting / plot / cinematography). Visualise attention to show which phrases drove each prediction.

Time series

12. Polish electricity load forecaster

Chapters: 32–34 (RNN/LSTM) · 27 (regularisation)

Data: PSE (Polskie Sieci Elektroenergetyczne) public data: hourly national load for the last 5 years.

Model: LSTM with calendar features (weekday, hour, holiday). Forecast horizon: next 24 h.

Wow angle: Side-by-side with a strong classical baseline (SARIMA or just last-week’s-load). Report MAPE for the team’s model and the baseline; show where each is best.
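The "last-week's-load" baseline and the MAPE metric in Project 12 fit in a few lines. The sketch uses a synthetic hourly series with a purely daily pattern, which is an assumption for illustration and not PSE data.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def seasonal_naive(history, horizon=24, season=168):
    """Forecast each of the next `horizon` hours with the value observed
    `season` hours earlier (168 h = one week). Requires horizon <= season."""
    return [history[len(history) - season + h] for h in range(horizon)]


# Synthetic hourly load with a daily cycle (toy data).
series = [100.0 + 10.0 * (t % 24) for t in range(15 * 24)]
history, actual = series[: 14 * 24], series[14 * 24 :]
baseline_forecast = seasonal_naive(history)
baseline_mape = mape(actual, baseline_forecast)
simple_check = mape([100.0, 200.0], [110.0, 180.0])
```

On this perfectly periodic toy series the baseline is exact; on real load data it is typically hard to beat on ordinary weekdays and weaker around holidays, which is where a comparison becomes interesting.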

13. Air-quality forecast for a Polish city

Chapters: 32–34

Data: GIOŚ (Główny Inspektorat Ochrony Środowiska) public API: PM2.5, PM10, NO₂, O₃ for one city, hourly.

Model: LSTM combining target city + neighbouring stations + weather features.

Wow angle: A Streamlit dashboard with a 48 h forecast + uncertainty bands; alert thresholds aligned with WHO guidelines.

Recommendation / retrieval

14. Mini search engine over the course book

Chapters: 39–40 (self-attention, Transformer)

Data: This book itself: all 40 chapters as Markdown/notebook text, split into ~500-token chunks.

Model: Use a small pretrained sentence-embedding model (e.g. intfloat/multilingual-e5-small) to index every chunk; the team writes the indexer, retriever, and evaluation harness.

Wow angle: A web search box where the user asks a question in EN or PL (“what is Bahdanau attention?”) and gets the top 5 chapter snippets with chapter links. Compare against simple TF-IDF.
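The TF-IDF comparison baseline in Project 14 needs no external library. A minimal sketch with cosine ranking follows; the three chunks are made-up stand-ins for book text, and the smoothed IDF formula is one common choice among several.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def build_index(chunks):
    """Return (idf, vectors): per-term IDF and one sparse TF-IDF dict per chunk."""
    docs = [Counter(tokenize(c)) for c in chunks]
    doc_freq = Counter(term for d in docs for term in d)
    n = len(chunks)
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in d.items()} for d in docs]
    return idf, vectors


def cosine(a, b):
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, chunks, idf, vectors, k=5):
    """Return the top-k (chunk, score) pairs by cosine similarity."""
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(tokenize(query)).items()}
    scored = [(chunk, cosine(q, vec)) for chunk, vec in zip(chunks, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


chunks = [  # hypothetical snippets, not actual book text
    "Bahdanau attention scores each encoder state with a small feed-forward network.",
    "A convolutional layer slides a small kernel over the image.",
    "The LSTM cell uses gates to control its memory.",
]
idf, vectors = build_index(chunks)
results = search("what is Bahdanau attention?", chunks, idf, vectors, k=2)
```

The same search signature can wrap the embedding model, which makes it straightforward to run both retrievers over one evaluation harness.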

15. Movie recommender from MovieLens

Chapters: 13 (Hebbian / matrix factorisation in disguise) · 17 (embeddings)

Data: MovieLens 100K: 100K ratings, 943 users, 1,682 movies.

Model: Two-tower neural network: user-embedding × movie-embedding → score.

Wow angle: Show “users who liked X also liked Y” via cosine similarity in the learnt movie-embedding space; visualise the movie embeddings with t-SNE coloured by genre.
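The simplest instance of the "user-embedding × movie-embedding → score" model in Project 15 is plain matrix factorisation trained by SGD on squared error, with no hidden layers in either tower. The sketch uses a tiny made-up rating list; the embedding size, learning rate and epoch count are assumptions.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mse(ratings, users, movies):
    return sum((dot(users[u], movies[m]) - r) ** 2 for u, m, r in ratings) / len(ratings)


def train(ratings, n_users, n_movies, dim=4, lr=0.02, epochs=300, seed=0):
    """Learn user and movie embeddings whose dot product approximates the rating."""
    rng = random.Random(seed)
    users = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    movies = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_movies)]
    for _ in range(epochs):
        for u, m, r in ratings:
            err = dot(users[u], movies[m]) - r
            for k in range(dim):
                u_k, m_k = users[u][k], movies[m][k]
                users[u][k] -= lr * err * m_k   # gradient step on the user vector
                movies[m][k] -= lr * err * u_k  # gradient step on the movie vector
    return users, movies


# (user id, movie id, rating) triples: toy data, not MovieLens.
toy_ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
               (2, 1, 5), (2, 2, 2), (3, 0, 1), (3, 2, 5)]
untrained = train(toy_ratings, n_users=4, n_movies=3, epochs=0)
trained = train(toy_ratings, n_users=4, n_movies=3)
mse_before = mse(toy_ratings, *untrained)
mse_after = mse(toy_ratings, *trained)
```

On MovieLens the error should be measured on held-out ratings, and bias terms or small MLP towers are natural next steps.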

Reproduce-a-classic

16. LeNet-5 on MNIST, faithful to LeCun 1998

Chapters: 22–25 · 27

Data: MNIST.

Model: Reproduce LeNet-5 with the exact layer sizes and activations from the 1998 paper.

Wow angle: Match the paper’s reported error rate within 0.2 percentage points; write a 2-page “what we changed and why” comparing the 1998 design to modern best practice. Train ResNet-18 in parallel as a “what we’d do today” baseline.

17. ResNet on CIFAR-10

Chapters: 22–25 · 27 (residuals, batch norm)

Data: CIFAR-10.

Model: ResNet-20 (~270K params) from He et al. 2015, trained from scratch.

Wow angle: Strip the residual connections and re-train, then show how accuracy and training stability change. Produce a figure that tests a claim such as “the residual connection saves you 5+ percentage points and trains 3× more stably”. He et al. report that the gap widens with depth, so a deeper variant may be needed to see a dramatic drop.
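The ~270K parameter figure for ResNet-20 in Project 17 can be checked by hand before any training. The sketch assumes the CIFAR variant from He et al. with three stages of widths 16, 32 and 64, 3×3 convolutions without bias, batch norm after every convolution, and parameter-free shortcuts; a projection-shortcut variant would count slightly more.

```python
def conv3x3_params(c_in, c_out):
    return c_in * c_out * 9  # 3x3 kernel, no bias (batch norm follows)


def bn_params(channels):
    return 2 * channels  # learnable scale and shift per channel


def resnet_cifar_params(blocks_per_stage=3, widths=(16, 32, 64), n_classes=10):
    """Count learnable parameters of a CIFAR-style ResNet with 6n+2 layers,
    where n = blocks_per_stage."""
    total = conv3x3_params(3, widths[0]) + bn_params(widths[0])  # stem
    c_in = widths[0]
    for width in widths:
        for _ in range(blocks_per_stage):
            total += conv3x3_params(c_in, width) + bn_params(width)   # first conv
            total += conv3x3_params(width, width) + bn_params(width)  # second conv
            c_in = width
    total += widths[-1] * n_classes + n_classes  # final linear layer
    return total


resnet20_params = resnet_cifar_params(blocks_per_stage=3)
print(f"ResNet-20: {resnet20_params:,} parameters")
```

Comparing this count with the one reported by the team's actual model is a cheap way to catch a mis-sized layer.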

Creative

18. Neural style transfer: Polish painting edition

Chapters: 22–25 (CNN) · 17 (gradients on inputs)

Data: Public-domain images of paintings by Matejko, Kossak, Wyspiański (Wikimedia Commons).

Model: Gatys et al. 2015 style transfer with a pretrained VGG-16 (the original paper used VGG-19). Implement the optimisation loop yourself.

Wow angle: Web app where the user uploads a photo and picks a Polish painter’s style; produce a stylised image in <30 s on CPU. Frame the gradients-on-inputs idea as the same trick as adversarial examples (Project 5), used for art instead of attack.
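The style term in Gatys-style transfer (Project 18) compares Gram matrices of feature maps, i.e. channel-by-channel correlations with the spatial layout summed out. A pure-Python sketch on tiny made-up feature maps follows; the normalisation constants are an assumption, since implementations differ on them.

```python
def gram_matrix(features):
    """features: list of C channels, each a flat list of N activations.
    Returns the C x C matrix of channel inner products divided by N."""
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def style_loss(feat_generated, feat_style):
    """Mean squared difference between the two Gram matrices."""
    g_gen, g_style = gram_matrix(feat_generated), gram_matrix(feat_style)
    c = len(g_gen)
    return sum((g_gen[i][j] - g_style[i][j]) ** 2
               for i in range(c) for j in range(c)) / (c * c)


# Two channels with two spatial positions each (toy values).
style_features = [[1.0, 2.0], [3.0, 4.0]]
shuffled = [[2.0, 1.0], [4.0, 3.0]]  # same activations, positions swapped
other = [[0.0, 1.0], [1.0, 0.0]]
gram = gram_matrix(style_features)
loss_same_texture = style_loss(shuffled, style_features)
loss_other = style_loss(other, style_features)
```

Swapping spatial positions leaves the Gram matrix unchanged, which is why this loss captures texture rather than layout. In the project the features would come from VGG layers, and the loss would be minimised with respect to the image pixels.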

19. Tiny GPT writes Pan Tadeusz

Chapters: 37–40 (Transformer)

Data: Mickiewicz’s Pan Tadeusz (full text, ~370K tokens) from Wolne Lektury.

Model: 2-layer decoder-only Transformer, char- or BPE-level, ~500K params.

Wow angle: Side-by-side comparison with the LSTM char-RNN trained on the same text (cf. individual mini-project 26). Quantify perplexity and qualitative coherence on a 200-character sample.
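Perplexity, as asked for in Project 19, is the exponential of the average negative log-likelihood per token. A minimal sketch follows, taking the probabilities the model assigned to the true next tokens; the input values are made up.

```python
import math


def perplexity(true_token_probs):
    """exp of the mean negative log-probability assigned to the correct tokens."""
    nll = -sum(math.log(p) for p in true_token_probs) / len(true_token_probs)
    return math.exp(nll)


# A model that always gives the correct token probability 1/4 is as
# uncertain as a uniform choice among four options.
uniform_four = perplexity([0.25] * 8)
mixed = perplexity([0.9, 0.5, 0.1, 0.7])
```

Perplexities are only comparable between models that share a tokenisation, so a char-level LSTM and a BPE-level Transformer are better compared after converting both to bits per character.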


Custom topics

If your team has a domain you care about (medicine, sports, finance, music, your own hobby data) you may propose your own. The proposal should be one page, addressing:

  • The problem and why it matters

  • Which course chapters (1–40) it builds on — at least three

  • Dataset source and approximate size

  • Model size + estimated training time

  • The deliverable a non-specialist could run

  • The intended “wow” angle for the presentation

Send the proposal to the instructor by end of week 1 for approval.

What is NOT expected

  • New theorems or novel architectures.

  • State-of-the-art accuracy on a hard benchmark.

  • Cloud GPUs, distributed training, or large pretrained models that the team did not pretrain themselves.

A clean, reproducible, well-explained baseline that works on a real problem is the goal. Sophistication for its own sake costs you points.

Submission

A single Git repository URL (private or public, instructor invited) or a zip with no git history. Include the report + slides in the same archive. Final deadline as announced in class (early June). Each team gives a 10 min presentation + 5 min Q&A in the final lab session.