Group Application Project — Topic Catalogue#
“Make it work, make it right, make it fast.” — Kent Beck
What you are asked to do#
Form a team of 3–5 students and pick a project from the catalogue below — or propose your own (see § Custom topics). Over about one month you will design, train, and demonstrate a working solution to a practical problem, drawing on material from any of Chapters 1–40 of this book.
The emphasis is on feasibility and practical engineering:
The model must train end-to-end on a single CPU or modest GPU in under one hour for the final reported run.
Use small, public datasets (≤ 1 GB raw, downloadable in minutes).
Build something a non-specialist can run: a clear README, a requirements.txt, a single python train.py, and a python demo.py (or a small notebook).
Be honest about what works and what does not: failure modes that you understand are worth more than glossy claims that fall apart on a held-out test.
You are not expected to discover something new. You are expected to execute well on a known problem and explain why every architectural and training choice you made was the right one for your problem.
Deliverables#
A single Git repository (or zip) containing:
README.md: a one-page project overview covering problem, approach, dataset, results, and how to reproduce.
Training code: clean Python files (or a notebook) that take raw data to a trained model, reproducible from a fixed random seed.
A trained model checkpoint (≤ 100 MB) plus the demo script that loads it.
A report (~6–10 pages or 15–20 notebook cells of prose + figures) covering: motivation, related work, data, model architecture (with explicit references to course chapters), training details, results, error analysis, lessons learned.
Presentation slides (10–15 slides) that you will show in a 10-minute final talk + 5 min Q&A. The “wow” component is graded here.
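The "reproducible from a fixed random seed" requirement is usually met by a small helper called at the top of train.py. The sketch below is one possible shape, not a prescribed interface: the name `set_seed` and the seed values are illustrative, and NumPy and PyTorch are seeded only if they happen to be installed.

```python
import os
import random


def set_seed(seed: int = 42) -> None:
    """Seed the random number generators a training script typically touches."""
    random.seed(seed)
    # Affects only subprocesses started later; to fix hash randomisation in the
    # current process, export PYTHONHASHSEED before launching Python.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np  # optional dependency
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional dependency
        torch.manual_seed(seed)
    except ImportError:
        pass


# Two runs with the same seed should draw the same numbers.
set_seed(123)
first = [random.random() for _ in range(3)]
set_seed(123)
second = [random.random() for _ in range(3)]
print(first == second)
```

Seeding alone does not guarantee bit-identical results on a GPU, so it is worth stating in the README which hardware the final reported run used.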
Grading rubric#
| Criterion | Weight |
|---|---|
| Quality of the code (clean, reproducible, modular) | 30% |
| Clarity of the explanation (report + README) | 30% |
| Presentation quality (slides, talk, demo) | 20% |
| Wow effect (originality, polish, demo magic) | 20% |
The “wow effect” rewards the team that goes one step beyond a working baseline — an unexpected demo, an interactive UI, an unusual dataset, a particularly clean failure analysis, a creative twist on a classical idea.
Suggested timeline (4 weeks)#
| Week | Focus |
|---|---|
| 1 | Pick topic · acquire and explore data · agree on labels/metrics · sketch architecture |
| 2 | Baseline (small simple model) trains end-to-end · first evaluation pass |
| 3 | Stronger model · ablations · error analysis · demo prototype |
| 4 | Polish the report · build slides · rehearse · final reproducible training run |
For a team of 4, budget roughly 30 person-hours per member (about 120 in total): enough to do a real job, not so much that you should treat this like a research paper.
Project catalogue#
Each entry lists the anchor chapters in this book, a suggested dataset, the rough model size that fits the feasibility constraint, and a “wow angle” you can develop for the presentation.
Vision#
1. Polish handwritten digits / characters#
Chapters: 22–25 (CNN architecture, training, experiments) · 31 (PyTorch CNN)
Data: Build a small dataset by collecting handwritten Polish digits and accented letters (ą ć ę ł ń ó ś ź ż) from the team: 200–500 samples per class, 28×28 grayscale. Or use EMNIST + a synthetic accented-letter generator.
Model: A LeNet-style CNN (~50K params) that trains in under 5 minutes on CPU.
Wow angle: Live demo. Point a phone camera at handwritten samples and classify them in real time using a browser ML stack (TensorFlow.js or ONNX Runtime Web).
2. Plant disease classifier#
Chapters: 22–25 · 27 (optimizers, data augmentation)
Data: PlantVillage subset (~50K images, 38 classes); pick the 5–10 most relevant species.
Model: A small CNN (~500K params) or fine-tune a pretrained MobileNet-V2 with the head replaced.
Wow angle: Grad-CAM heatmaps that show where on the leaf the model is looking; compare with a botanist-friendly “what to look for” list.
3. Chest X-ray triage#
Chapters: 22–25 · 27 (regularisation, class-imbalance handling)
Data: Pneumonia subset of ChestX-ray8 / Kermany 2018 (~5K images, 2 classes). Disclaimer: for educational use only. This must be stated in every deliverable.
Model: Small CNN trained from scratch, or fine-tune only the head of a torchvision ResNet-18.
Wow angle: Calibration curves, sensitivity/specificity at different thresholds, side-by-side with a radiologist-friendly explanation. Never claim diagnostic value; frame the task as triage prioritisation.
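Sensitivity and specificity at different thresholds can be computed directly from the model's predicted probabilities. The following is a minimal sketch; the function name and the toy labels and scores are made up for illustration and do not come from any real X-ray data.

```python
def sensitivity_specificity(y_true, scores, threshold):
    """Return (sensitivity, specificity) when score >= threshold means 'positive'."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if label == 1:
            tp += predicted_positive
            fn += not predicted_positive
        else:
            fp += predicted_positive
            tn += not predicted_positive
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data: 1 = pneumonia, 0 = normal (invented values).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

for threshold in (0.25, 0.5, 0.75):
    sens, spec = sensitivity_specificity(y_true, scores, threshold)
    print(f"threshold={threshold:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

Lowering the threshold trades specificity for sensitivity, which is the trade-off a triage framing needs to make explicit.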
4. Polish road-sign recogniser#
Chapters: 22–25 · 27 (data augmentation pipeline)
Data: GTSRB (German signs are very similar to Polish ones): 43 classes, ~39K training images (~50K including the test set).
Model: Spatial transformer network on top of a 4-conv-layer CNN (~200K params).
Wow angle: Live webcam demo with sliding-window detection on photos taken around campus; visualise the spatial-transformer alignment on rotated/skewed signs.
5. Adversarial-example showcase#
Chapters: 22–25 (CNN) · 17 (gradients) · 19 (universal approximation, but with limits)
Data: MNIST or CIFAR-10.
Model: Train a small CNN to roughly 99% test accuracy on MNIST or 85% on CIFAR-10. Implement FGSM (Goodfellow et al. 2015) and PGD (Madry et al. 2018) attacks.
Wow angle: Interactive web demo. The user uploads an image, the team’s model classifies it, then the demo perturbs it imperceptibly until the classification flips. Report the smallest \(\epsilon\) that fools each architecture.
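The FGSM idea fits in a few lines once the gradient of the loss with respect to the input is available. The sketch below swaps the CNN for a tiny logistic-regression model so that the gradient can be written by hand, \(\partial L / \partial x = (p - y)\,w\); the weights, the input, and the \(\epsilon\) grid are invented for illustration. In a real project the gradient would come from autograd.

```python
import math


def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def fgsm(w, b, x, y, eps):
    """One FGSM step: x + eps * sign(dLoss/dx), clipped to the pixel range [0, 1]."""
    p = predict_proba(w, b, x)
    grad_x = [(p - y) * wi for wi in w]          # cross-entropy gradient w.r.t. x
    step = [(g > 0) - (g < 0) for g in grad_x]   # elementwise sign
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, step)]


# Hypothetical model and input with true label 1.
w, b = [2.0, -3.0, 1.0], -0.5
x, y = [0.8, 0.2, 0.5], 1

# Scan a small grid for the first eps that flips the prediction.
smallest_eps = None
for eps in (0.05, 0.10, 0.15, 0.20, 0.25):
    x_adv = fgsm(w, b, x, y, eps)
    if (predict_proba(w, b, x_adv) >= 0.5) != (y == 1):
        smallest_eps = eps
        break
print(smallest_eps)
```

PGD repeats this step several times with a smaller step size and projects back into the \(\epsilon\)-ball around the original input after each step.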
Audio#
6. Bird-call classifier#
Chapters: 22–25 (CNN on spectrograms) · 27 (data augmentation)
Data: xeno-canto subset, 10–20 European bird species, 30 s clips, ~200 per class. Convert to mel-spectrograms.
Model: A small VGG-style CNN over \(128 \times 128\) spectrograms.
Wow angle: Live demo using a laptop microphone. Record a short clip and predict the species. Add an “uncertainty” output so the model says “I don’t know” on out-of-distribution sounds.
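One simple way to get an "I don't know" output is to abstain whenever the largest softmax probability falls below a cut-off. This is only a baseline sketch: the function names, the species labels, and the 0.6 cut-off are assumptions, and max-softmax confidence is known to be a crude out-of-distribution signal, so the threshold should be tuned on held-out data.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(logits, labels, min_confidence=0.6):
    """Return (label, confidence), or ("I don't know", confidence) below the cut-off."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < min_confidence:
        return "I don't know", probs[best]
    return labels[best], probs[best]


labels = ["blackbird", "robin", "great tit"]   # hypothetical classes
confident = predict_or_abstain([4.0, 1.0, 0.5], labels)
unsure = predict_or_abstain([1.0, 0.9, 1.1], labels)
print(confident[0], "|", unsure[0])
```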
7. Polish wake-word detector#
Chapters: 32–34 (RNN/LSTM) · 22–25 (CNN over spectrograms)
Data: Each team member records 100+ samples of a chosen Polish wake word (e.g. “Marek!”) plus 30 minutes of negative-class audio (TV, music, Polish speech without the word).
Model: CNN over short mel-spectrogram windows + a streaming detection head.
Wow angle: Real-time terminal demo: a Python script that listens via the laptop mic and prints the timestamp every time the wake word fires; benchmark the false-alarm rate on background audio.
Language#
8. Polish text auto-complete (char-RNN)#
Chapters: 32–35 (RNN, LSTM, char-RNN)
Data: Assemble a small Polish corpus (~5–20 MB) from Wolne Lektury (e.g. Mickiewicz, Sienkiewicz, Reymont), optionally filtered to one author or genre.
Model: 2-layer LSTM (~1M params), char-level.
Wow angle: Web demo where the user types Polish prose and the model continues it, with a temperature slider and a side-by-side “what would a Transformer say” comparison (use gpt-2-pl from HuggingFace as a reference baseline).
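The temperature slider amounts to dividing the model's logits by a temperature before the softmax: low values make the output nearly greedy, high values flatten the distribution. A minimal sketch, with a made-up three-character vocabulary and an illustrative function name:

```python
import math
import random


def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Sample an index from softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [v / temperature for v in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(v - peak) for v in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


rng = random.Random(0)
logits = [2.0, 1.0, 0.1]  # invented scores for three candidate characters
cold = [sample_with_temperature(logits, 0.01, rng) for _ in range(200)]
hot = [sample_with_temperature(logits, 5.0, rng) for _ in range(200)]
print(sorted(set(cold)), sorted(set(hot)))
```

At temperature 0.01 the sampler picks the top-scoring character essentially every time, while at 5.0 all three characters appear.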
9. Polish poem generator (Tiny Transformer)#
Chapters: 37–40 (attention, Transformer)
Data: Polish poetry, e.g. Mickiewicz + Słowacki + Norwid (≈3 MB clean text).
Model: 2-layer decoder-only Transformer, ~500K params, char-level or BPE.
Wow angle: Conditioning. The user picks an author and the model imitates that author’s style. Quantify with a held-out classifier that distinguishes the three poets.
10. Polish NER tagger#
Chapters: 37–40 (attention) · 31 (PyTorch training loop)
Data: PolEval NER (~10K sentences, BIO tags for PER/LOC/ORG).
Model: A small bidirectional LSTM + linear head, or fine-tune only the last layer of a HuggingFace polish-roberta-base.
Wow angle: Live web tool. Paste a Polish news article and see entities highlighted with colour codes; report F1 on the held-out PolEval test set.
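For BIO-tagged NER, F1 is normally computed over whole entity spans rather than individual tokens. The sketch below assumes exact-match scoring, in which a predicted entity counts only if its start, end, and type all agree with the gold annotation. The official PolEval scorer may use different matching rules, so check it before reporting numbers. Function names and the example sentence tags are illustrative.

```python
def extract_entities(tags):
    """Turn a BIO tag sequence into a set of (start, end, type) spans.

    Stray I- tags with no matching B- are ignored (one of several conventions).
    """
    spans, start, kind = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and kind == tag[2:]
        if not continues:
            if kind is not None:
                spans.add((start, i, kind))
            start, kind = (i, tag[2:]) if tag.startswith("B-") else (None, None)
    return spans


def entity_f1(gold_sentences, pred_sentences):
    """Exact-match, entity-level F1 over parallel lists of tag sequences."""
    tp = n_gold = n_pred = 0
    for gold, pred in zip(gold_sentences, pred_sentences):
        g, p = extract_entities(gold), extract_entities(pred)
        tp += len(g & p)
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


gold = [["B-PER", "I-PER", "O", "B-LOC"]]
pred = [["B-PER", "I-PER", "O", "O"]]  # the location was missed
f1 = entity_f1(gold, pred)
print(round(f1, 3))
```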
11. Sentiment of Polish movie reviews#
Chapters: 32–34 (RNN/LSTM) · 26 (loss functions, class imbalance)
Data: Filmweb / Allegro reviews scraped or from PolEmo2.0.
Model: Embedding + bidirectional LSTM + classifier head; or a tiny Transformer.
Wow angle: Two-axis output. The model predicts both polarity and aspect (acting / plot / cinematography). Visualise attention to show which phrases drove each prediction.
Time series#
12. Polish electricity load forecaster#
Chapters: 32–34 (RNN/LSTM) · 27 (regularisation)
Data: PSE (Polskie Sieci Elektroenergetyczne) public data: hourly national load for the last 5 years.
Model: LSTM with calendar features (weekday, hour, holiday). Forecast horizon: next 24 h.
Wow angle: Side-by-side with a strong classical baseline (SARIMA, or simply last week’s load at the same hour). Report MAPE for the team’s model and the baseline; show where each is best.
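The "last week's load" baseline and the MAPE metric are both a few lines of code, so it is worth writing them before any LSTM. A minimal sketch, assuming hourly data and strictly positive load values; the function names and the synthetic series are made up and are not PSE data.

```python
def mape(actual, forecast):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def seasonal_naive(history, horizon=24, season=168):
    """Forecast each of the next `horizon` hours with the value `season` hours earlier.

    Requires horizon <= season <= len(history); 168 h = one week.
    """
    offset = len(history) - season
    return [history[offset + h] for h in range(horizon)]


# Synthetic two-week history with a daily cycle (invented units).
history = [1000 + 10 * (h % 24) for h in range(336)]
forecast = seasonal_naive(history)

# Pretend the real next day came in 10% above the baseline forecast.
actual = [1.1 * v for v in forecast]
print(round(mape(actual, forecast), 2))
```

Reporting the LSTM's MAPE next to this number shows how much of the accuracy comes from the model rather than from the weekly seasonality.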
13. Air-quality forecast for a Polish city#
Chapters: 32–34
Data: GIOŚ (Główny Inspektorat Ochrony Środowiska) public API: PM2.5, PM10, NO₂, O₃ for one city, hourly.
Model: LSTM combining target city + neighbouring stations + weather features.
Wow angle: A Streamlit dashboard with a 48 h forecast + uncertainty bands; alert thresholds aligned with WHO guidelines.
Recommendation / retrieval#
14. Mini search engine over the course book#
Chapters: 39–40 (self-attention, Transformer)
Data: This book itself! All 40 chapters as Markdown/notebook text, split into ~500-token chunks.
Model: Use a small pretrained sentence-embedding model (e.g. intfloat/multilingual-e5-small) to index every chunk; team writes the indexer, retriever, and evaluation harness.
Wow angle: A web search box where the user asks a question in EN or PL (“what is Bahdanau attention?”) and gets the top 5 chapter snippets with chapter links. Compare against simple TF-IDF.
15. Movie recommender from MovieLens#
Chapters: 13 (Hebbian / matrix factorisation in disguise) · 17 (embeddings)
Data: MovieLens 100K: 100K ratings, 943 users, 1,682 movies.
Model: Two-tower neural network: user-embedding × movie-embedding → score.
Wow angle: Show “users who liked X also liked Y” via cosine similarity in the learnt movie-embedding space; visualise the movie embeddings with t-SNE coloured by genre.
Reproduce-a-classic#
16. LeNet-5 on MNIST, faithful to LeCun 1998#
Chapters: 22–25 · 27
Data: MNIST.
Model: Reproduce LeNet-5 with the exact layer sizes and activations from the 1998 paper.
Wow angle: Match the paper’s reported error rate within 0.2 percentage points; write a 2-page “what we changed and why” comparing the 1998 design to modern best practice. Train ResNet-18 in parallel as a “what we’d do today” baseline.
17. ResNet on CIFAR-10#
Chapters: 22–25 · 27 (residuals, batch norm)
Data: CIFAR-10.
Model: ResNet-20 (~270K params) from He et al. 2015, trained from scratch.
Wow angle: Strip the residual connections, retrain, and measure the accuracy drop. Back any headline claim, such as “the residual connection saves you 5+ percentage points and trains 3× more stably”, with a figure built from your own runs.
Creative#
18. Neural style transfer — Polish painting edition#
Chapters: 22–25 (CNN) · 17 (gradients on inputs)
Data: Public-domain images of paintings by Matejko, Kossak, Wyspiański (Wikimedia Commons).
Model: Gatys et al. 2015 style transfer using a pretrained VGG-16. Implement the optimisation loop yourself.
Wow angle: Web app where the user uploads a photo and picks a Polish painter’s style; produce a stylised image in <30 s on CPU. Frame the gradients-on-inputs idea as the same trick as adversarial examples (Project 5), used for art instead of attack.
19. Tiny GPT writes Pan Tadeusz#
Chapters: 37–40 (Transformer)
Data: Mickiewicz’s Pan Tadeusz (full text, ~370K tokens) from Wolne Lektury.
Model: 2-layer decoder-only Transformer, char- or BPE-level, ~500K params.
Wow angle: Side-by-side comparison with the LSTM char-RNN trained on the same text (cf. individual mini-project 26). Report perplexity on held-out text and assess coherence qualitatively on 200-character samples.
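Perplexity is the exponential of the mean negative log-likelihood per token, which means a char-level and a BPE-level model are not directly comparable: their tokens have different lengths. Normalising by the number of characters (bits per character) puts both on the same scale. A minimal sketch with invented probabilities; the function names are illustrative.

```python
import math


def perplexity(token_probs):
    """exp(mean negative log-likelihood) over the probabilities of the true tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


def bits_per_character(token_probs, n_chars):
    """Total negative log2-likelihood divided by the number of characters covered."""
    total_bits = -sum(math.log2(p) for p in token_probs)
    return total_bits / n_chars


# Ten tokens, each predicted with probability 0.25, covering 20 characters
# (i.e. a BPE-like model whose tokens average two characters each).
probs = [0.25] * 10
ppl = perplexity(probs)
bpc = bits_per_character(probs, n_chars=20)
print(round(ppl, 3), round(bpc, 3))
```

For a char-level model, `n_chars` equals the number of tokens, so bits per character is simply the base-2 logarithm of the perplexity.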
Custom topics#
If your team has a domain you care about (medicine, sports, finance, music, your own hobby data) you may propose your own. The proposal should be one page, addressing:
The problem and why it matters
Which course chapters (1–40) it builds on — at least three
Dataset source and approximate size
Model size + estimated training time
The deliverable a non-specialist could run
The intended “wow” angle for the presentation
Send the proposal to the instructor by end of week 1 for approval.
What is NOT expected#
New theorems or novel architectures.
State-of-the-art accuracy on a hard benchmark.
Cloud GPUs, distributed training, or large pretrained models that the team did not pretrain themselves.
A clean, reproducible, well-explained baseline that works on a real problem is the goal. Sophistication for its own sake costs you points.
Submission#
Submit a single Git repository URL (private or public, with the instructor invited) or a zip without git history. Include the report and slides in the same archive. The final deadline will be announced in class (early June). Each team gives a 10 min presentation + 5 min Q&A in the final lab session.