Part VI
Synthesis
The Complete Arc: 1943–1989
Chapter 20
The Grand Timeline
Six Stages of Development
| Year | Milestone | What Was Achieved |
|------|-----------|-------------------|
| 1943 | McCulloch-Pitts neuron | Formal neuron model; Boolean completeness |
| 1949 | Hebb's postulate | First learning rule (unsupervised) |
| 1958 | Perceptron | First learning machine with a convergence proof |
| 1969 | Minsky & Papert's *Perceptrons* | Proved single-layer limits; onset of the AI Winter |
| 1986 | Backpropagation | Multi-layer learning via the chain rule |
| 1989 | Universal Approximation Theorem (UAT) | One hidden layer suffices for any continuous function |
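The 1949 entry is worth making concrete. Below is a minimal sketch of Hebbian learning in its stabilized form, Oja's rule (covered in Part IV), assuming NumPy; the data, step size, and iteration count are illustrative choices. A single linear neuron's weight vector converges to the first principal component of its inputs, which is the sense in which Hebb's postulate yields unsupervised learning.

```python
import numpy as np

# Oja's rule: Hebbian learning with a decay term that keeps the weights
# bounded. The weight vector converges to the first principal component.
rng = np.random.default_rng(0)

# Synthetic 2-D data whose dominant variance lies along [1, 1].
C = np.array([[3.0, 2.0], [2.0, 3.0]])
X = rng.multivariate_normal([0.0, 0.0], C, size=2000)

w = rng.normal(size=2)   # random initial weights
lr = 0.01                # illustrative step size

for x in X:
    v = w @ x                    # neuron output (linear)
    w += lr * v * (x - v * w)    # Hebbian term v*x minus Oja's decay v^2*w

w /= np.linalg.norm(w)
print(w)  # typically close to +/-[0.707, 0.707], the top eigenvector of C
```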
Three Recurring Themes

Representation: what can networks compute?
- M-P: any Boolean function.
- Perceptron: only linearly separable functions.
- MLP: any continuous function.

Learning: how do networks acquire their computation?
- M-P: no learning.
- Hebb: unsupervised.
- Perceptron: supervised (one layer).
- Backprop: supervised (all layers).

Universality: are there fundamental limits?
- Perceptron: yes (linear separability).
- MLP + backprop: representationally universal (see the sketch after this list).
- In practice: depth matters.
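To make "representationally universal" concrete, here is a minimal sketch, assuming NumPy, of one hidden layer approximating a continuous function. The random tanh features with a least-squares readout are an illustrative shortcut, not the construction in the 1989 proof; the theorem only asserts that suitable weights exist.

```python
import numpy as np

# One hidden layer of tanh units approximating sin(x) on [-pi, pi].
rng = np.random.default_rng(0)

n_hidden = 50
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x).ravel()

# Random hidden-layer weights and biases (fixed, not trained).
W = rng.normal(scale=2.0, size=(1, n_hidden))
b = rng.normal(scale=2.0, size=n_hidden)
H = np.tanh(x @ W + b)                     # hidden activations, (200, 50)

# Fit the output weights by least squares.
v, *_ = np.linalg.lstsq(H, y, rcond=None)

print("max error:", np.max(np.abs(H @ v - y)))  # typically small
```

Widening the hidden layer drives the error down further, which is the UAT in miniature; the "depth matters" caveat is about efficiency, since a deeper network can represent some functions with far fewer units than a shallow one.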
Model Comparison
| Property | M-P (1943) | Perceptron (1958) | MLP + Backprop (1986) |
|----------|------------|-------------------|------------------------|
| Architecture | Single threshold unit | Single layer | Multiple layers |
| Learning | None (hand-set weights) | Perceptron rule | Backpropagation |
| Can solve XOR? | Yes (manual design) | No | Yes (learned) |
| Theory | Boolean completeness | Convergence theorem | Universal Approximation Theorem |
| Biological plausibility | High | Moderate | Low |
| Key limitation | No learning | Linear separability | Vanishing gradients |
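The XOR row is easy to verify by hand. A minimal sketch of the "Yes (manual design)" entry, assuming NumPy: McCulloch-Pitts-style threshold units with hand-set weights compute XOR as AND(OR, NAND), the two-layer trick a single threshold unit cannot replicate.

```python
import numpy as np

def step(z):
    # McCulloch-Pitts threshold unit: fire iff the weighted sum is >= 0.
    return int(z >= 0)

def xor(x1, x2):
    x = np.array([x1, x2])
    h1 = step(x @ np.array([1, 1]) - 0.5)     # OR: fires if x1 + x2 >= 0.5
    h2 = step(x @ np.array([-1, -1]) + 1.5)   # NAND: fires unless both are 1
    return step(h1 + h2 - 1.5)                # AND of the two hidden units

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))  # prints 0, 1, 1, 0
```

No learning happens here: every weight and threshold was chosen by a human, which is the M-P column's "None (hand-set weights)" in miniature.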
Three Key Breakthroughs
1943: Formal neuron — neural computation can be described mathematically (McCulloch & Pitts)
1958: Learning from data — machines can automatically find correct weights (Rosenblatt)
1986: Credit assignment — hidden layers can learn useful representations via backpropagation (Rumelhart, Hinton & Williams)
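To see credit assignment in action, here is a minimal backpropagation sketch, assuming NumPy; the 2-4-1 sigmoid architecture, learning rate, and epoch count are illustrative choices rather than the 1986 paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# XOR: the task a single layer provably cannot learn.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)   # hidden layer
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)   # output layer
lr = 1.0

for _ in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: the chain rule assigns each hidden unit its share
    # of the output error; this is the credit-assignment step.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

print(out.ravel().round(2))  # typically converges toward [0, 1, 1, 0]
```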
Three Key Obstacles
No learning mechanism (1943–1958): M-P neurons had to be hand-designed — solved by the perceptron learning rule (see the sketch after this list)
Linear separability barrier (1969): Single-layer networks cannot compute XOR — solved by multi-layer architecture
Credit assignment problem (1969–1986): No algorithm to train hidden layers — solved by backpropagation
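A minimal sketch of the first obstacle's solution, assuming NumPy: the perceptron learning rule finding weights for AND, a linearly separable function. Run the same loop on XOR targets and it cycles forever, which is precisely the 1969 barrier.

```python
import numpy as np

# Perceptron learning rule on AND (linearly separable, so the
# convergence theorem guarantees this loop terminates correctly).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])   # AND targets; swap in XOR and it never settles

w = np.zeros(2)
b = 0.0
lr = 0.1   # illustrative learning rate

for epoch in range(20):
    for xi, ti in zip(X, y):
        pred = int(xi @ w + b >= 0)
        # Update only on mistakes: w <- w + lr * (target - prediction) * x.
        w += lr * (ti - pred) * xi
        b += lr * (ti - pred)

print([int(xi @ w + b >= 0) for xi in X])  # [0, 0, 0, 1]
```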
AI Winter Lessons
- Valid criticism + institutional power = 17-year research shutdown
- Minsky and Papert proved limitations of single-layer networks, but the community concluded that all neural networks were a dead end
- Backpropagation already existed (Werbos, 1974) but was ignored throughout the winter
- Important ideas can survive decades of neglect
The gap between “can solve” and “can learn to solve” was 17 years and multiple independent rediscoveries.
Beyond 1989: What Comes Next
- Convolutional Neural Networks (LeCun, 1989 → Krizhevsky/AlexNet, 2012)
- Recurrent Networks & LSTM (Hochreiter & Schmidhuber, 1997)
- Deep Learning Revolution (Hinton et al., 2006–2012)
- Attention & Transformers (Vaswani et al., 2017)
- Large Language Models (GPT, Claude, 2020s)
Nearly all of modern AI rests on a single recipe: parameterized differentiable functions trained by gradient descent via backpropagation.
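That recipe fits in a few lines. A minimal sketch, assuming NumPy, with a synthetic dataset and hyperparameters chosen purely for illustration: a parameterized differentiable model, a loss, and gradient descent on the parameters.

```python
import numpy as np

# The core recipe in miniature: model y = w*x + b, squared-error loss,
# gradients from the chain rule, parameters updated by gradient descent.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=100)   # synthetic data

w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    pred = w * x + b
    grad_w = 2 * np.mean((pred - y) * x)   # d(MSE)/dw
    grad_b = 2 * np.mean(pred - y)         # d(MSE)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # close to the true parameters 3.0 and 0.5
```

Everything from AlexNet to a large language model scales this loop up: more parameters, a different architecture, and automatic differentiation in place of hand-derived gradients.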
Pause & Reflect: Questions for Reflection
If Minsky-Papert had been wrong, would multi-layer networks have been developed anyway?
Why was backpropagation discovered at least 4 times before it was widely adopted?
What parallels exist between the 1960s perceptron hype and the 2020s AI hype?
The Complete Arc
Part I: The McCulloch-Pitts neuron — computation as logic (1943)
Part II: The Perceptron — learning from data with a convergence proof (1958)
Part III: Limitations — XOR, Minsky-Papert, the AI Winter (1969)
Part IV: Learning Rules — Hebb, Oja, PCA, credit assignment (1949–1982)
Part V: Backpropagation — the four equations, activation functions, the UAT (1986)
Part VI: Synthesis — from McCulloch-Pitts to universal approximation (1943–1989)
Thank You