{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "8d7e5a7b",
   "metadata": {},
   "source": [
    "# Chapter 2: Building Logic from Neurons\n",
    "\n",
    "\n",
    "## 1. Introduction: From Gates to Circuits\n",
    "\n",
    "In Chapter 1, we saw that a single McCulloch-Pitts neuron can compute any **threshold function** --- a Boolean function that outputs 1 when a weighted sum of inputs exceeds a fixed threshold. The canonical examples are AND ($\\theta = 2$), OR ($\\theta = 1$), and NOT (with an inhibitory input and a bias).\n",
    "\n",
    "But we also saw that not every Boolean function is a threshold function. **XOR** --- the exclusive-or --- is the simplest counterexample. To compute XOR and, more generally, *arbitrary* Boolean functions, we need to compose neurons into **networks**.\n",
    "\n",
    "This chapter explores three progressively deeper results:\n",
    "\n",
    "1. **The DNF Construction Algorithm:** A systematic procedure to build a feedforward M-P network for any Boolean function, given its truth table.\n",
    "\n",
    "2. **XOR as a warning sign:** The impossibility of computing XOR with a single neuron is the first hint of the deeper **linear separability** constraint that will return to haunt the field in 1969.\n",
    "\n",
    "3. **Feedforward vs. Recurrent Networks:** When we allow cycles in the network, M-P neurons can implement **memory** and **state machines**. McCulloch and Pitts proved that recurrent networks are equivalent to finite automata (their Theorem II), and argued (informally) for a connection to Turing machines.\n",
    "\n",
    "4. **The key limitation:** Despite their computational power, M-P networks have no mechanism for **learning**. Every connection is hardwired. This limitation will motivate Hebb (1949), Rosenblatt (1958), and the entire learning-theoretic tradition.\n",
    "\n",
    "We begin by bringing forward the `McCullochPittsNeuron` and `McCullochPittsNetwork` classes from Chapter 1.\n",
    "\n",
    "```{tip}\n",
    "This chapter demonstrates a fundamental pattern in computer science: **simple components composed systematically can produce arbitrarily complex behavior**. A single M-P neuron is limited, but networks of M-P neurons are universal. This same principle applies to transistors (simple switches that compose into CPUs), to NAND gates (a single universal gate), and later to artificial neurons with learned weights.\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1ffcab31",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "from itertools import product\n",
    "%matplotlib inline\n",
    "\n",
    "\n",
    "class McCullochPittsNeuron:\n",
    "    \"\"\"A McCulloch-Pitts neuron (reproduced from Chapter 1).\"\"\"\n",
    "\n",
    "    def __init__(self, threshold, name=None):\n",
    "        if threshold < 1:\n",
    "            raise ValueError(\"Threshold must be a positive integer.\")\n",
    "        self.threshold = threshold\n",
    "        self.name = name or f\"MP(theta={threshold})\"\n",
    "\n",
    "    def activate(self, excitatory_inputs, inhibitory_inputs=None):\n",
    "        if inhibitory_inputs is not None and any(inhibitory_inputs):\n",
    "            return 0\n",
    "        excitatory_sum = sum(excitatory_inputs)\n",
    "        return 1 if excitatory_sum >= self.threshold else 0\n",
    "\n",
    "    def __repr__(self):\n",
    "        return f\"McCullochPittsNeuron(threshold={self.threshold}, name='{self.name}')\"\n",
    "\n",
    "\n",
    "class McCullochPittsNetwork:\n",
    "    \"\"\"A feedforward network of McCulloch-Pitts neurons (reproduced from Chapter 1).\"\"\"\n",
    "\n",
    "    def __init__(self, n_inputs):\n",
    "        self.n_inputs = n_inputs\n",
    "        self.layers = []\n",
    "\n",
    "    def add_layer(self, layer_spec):\n",
    "        self.layers.append(layer_spec)\n",
    "\n",
    "    def forward(self, inputs):\n",
    "        if len(inputs) != self.n_inputs:\n",
    "            raise ValueError(f\"Expected {self.n_inputs} inputs, got {len(inputs)}\")\n",
    "        current_signals = list(inputs)\n",
    "        for layer in self.layers:\n",
    "            next_signals = []\n",
    "            for spec in layer:\n",
    "                neuron = spec['neuron']\n",
    "                exc = [current_signals[i] for i in spec['excitatory']]\n",
    "                inh = [current_signals[i] for i in spec['inhibitory']]\n",
    "                output = neuron.activate(exc, inh)\n",
    "                next_signals.append(output)\n",
    "            current_signals = next_signals\n",
    "        return current_signals\n",
    "\n",
    "    def __repr__(self):\n",
    "        desc = f\"McCullochPittsNetwork(inputs={self.n_inputs}, layers={len(self.layers)})\"\n",
    "        for i, layer in enumerate(self.layers):\n",
    "            neurons = [s['neuron'].name for s in layer]\n",
    "            desc += f\"\\n  Layer {i}: {neurons}\"\n",
    "        return desc\n",
    "\n",
    "\n",
    "print(\"Classes loaded successfully.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "56829a6c",
   "metadata": {},
   "source": [
    "## 2. The DNF Construction Algorithm\n",
    "\n",
    "### Theory\n",
    "\n",
    "Recall from Chapter 1 that any Boolean function $f: \\{0,1\\}^n \\to \\{0,1\\}$ can be expressed in **Disjunctive Normal Form (DNF):**\n",
    "\n",
    "$$f(x_1, \\ldots, x_n) = \\bigvee_{\\mathbf{a} \\in f^{-1}(1)} \\left( \\bigwedge_{i=1}^{n} \\ell_i^{(\\mathbf{a})} \\right)$$\n",
    "\n",
    "where $\\ell_i^{(\\mathbf{a})} = x_i$ if $a_i = 1$ and $\\ell_i^{(\\mathbf{a})} = \\neg x_i$ if $a_i = 0$.\n",
    "\n",
    "```{admonition} Theorem (DNF Universality)\n",
    ":class: important\n",
    "\n",
    "**Every** Boolean function $f: \\{0,1\\}^n \\to \\{0,1\\}$ can be expressed in Disjunctive Normal Form, and therefore can be realized by a feedforward McCulloch-Pitts network with at most three layers (negation, conjunction, disjunction) using at most $n + 2^n + 1$ neurons.\n",
    "\n",
    "The **algorithm** to convert a truth table into an M-P network is:\n",
    "\n",
    "1. **Extract minterms:** Identify all input patterns $\\mathbf{a}$ where $f(\\mathbf{a}) = 1$.\n",
    "2. **Build negation layer:** Create NOT neurons (with bias) for each input variable.\n",
    "3. **Build minterm layer:** For each minterm, create an AND neuron (threshold $= n$) whose excitatory inputs are the appropriate literals ($x_i$ or $\\neg x_i$).\n",
    "4. **Build output layer:** Create an OR neuron (threshold $= 1$) that takes all minterm outputs as excitatory inputs.\n",
    "```\n",
    "\n",
    "```{warning}\n",
    "**Exponential blowup of the DNF construction.** The DNF construction is *universal* but can be *astronomically expensive*. For a Boolean function on $n$ inputs, the truth table has $2^n$ rows, and the DNF may require up to $2^n$ minterm neurons (one for each row where $f = 1$). For $n = 20$, that is over 1 million neurons; for $n = 30$, over 1 billion. In practice, most Boolean functions of interest have much more compact representations --- but the DNF construction does not exploit any such structure. Finding minimal networks is, in general, an NP-hard problem.\n",
    "```\n",
    "\n",
    "### Implementation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4407b5f2",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "def boolean_to_mp_network(truth_table, n_vars):\n",
    "    \"\"\"Convert a Boolean function (truth table) into a feedforward M-P network using DNF.\"\"\"\n",
    "    minterms = [inp for inp, out in truth_table.items() if out == 1]\n",
    "\n",
    "    if len(minterms) == 0:\n",
    "        net = McCullochPittsNetwork(n_vars + 1)\n",
    "        net.add_layer([\n",
    "            {'neuron': McCullochPittsNeuron(n_vars + 2, 'ZERO'),\n",
    "             'excitatory': list(range(n_vars)),\n",
    "             'inhibitory': []}\n",
    "        ])\n",
    "        return net\n",
    "\n",
    "    bias_idx = n_vars\n",
    "    net = McCullochPittsNetwork(n_vars + 1)\n",
    "\n",
    "    layer1 = []\n",
    "    for i in range(n_vars):\n",
    "        layer1.append({\n",
    "            'neuron': McCullochPittsNeuron(1, f'NOT_x{i}'),\n",
    "            'excitatory': [bias_idx],\n",
    "            'inhibitory': [i]\n",
    "        })\n",
    "    for i in range(n_vars):\n",
    "        layer1.append({\n",
    "            'neuron': McCullochPittsNeuron(1, f'PASS_x{i}'),\n",
    "            'excitatory': [i],\n",
    "            'inhibitory': []\n",
    "        })\n",
    "    net.add_layer(layer1)\n",
    "\n",
    "    layer2 = []\n",
    "    for m_idx, minterm in enumerate(minterms):\n",
    "        exc_indices = []\n",
    "        for i in range(n_vars):\n",
    "            if minterm[i] == 1:\n",
    "                exc_indices.append(n_vars + i)\n",
    "            else:\n",
    "                exc_indices.append(i)\n",
    "        layer2.append({\n",
    "            'neuron': McCullochPittsNeuron(n_vars, f'MINT_{minterm}'),\n",
    "            'excitatory': exc_indices,\n",
    "            'inhibitory': []\n",
    "        })\n",
    "    net.add_layer(layer2)\n",
    "\n",
    "    layer3 = [\n",
    "        {\n",
    "            'neuron': McCullochPittsNeuron(1, 'OR_out'),\n",
    "            'excitatory': list(range(len(minterms))),\n",
    "            'inhibitory': []\n",
    "        }\n",
    "    ]\n",
    "    net.add_layer(layer3)\n",
    "\n",
    "    return net\n",
    "\n",
    "\n",
    "def verify_network(net, truth_table, n_vars):\n",
    "    \"\"\"Verify that a network (with bias) computes the given truth table.\"\"\"\n",
    "    all_pass = True\n",
    "    header = \" \".join([f\"x{i}\" for i in range(n_vars)]) + \"  expected  got  status\"\n",
    "    print(header)\n",
    "    print(\"-\" * len(header))\n",
    "    for inputs, expected in sorted(truth_table.items()):\n",
    "        full_input = list(inputs) + [1]\n",
    "        result = net.forward(full_input)[0]\n",
    "        status = \"PASS\" if result == expected else \"FAIL\"\n",
    "        if result != expected:\n",
    "            all_pass = False\n",
    "        vals = \" \".join([f\"{v:>2}\" for v in inputs])\n",
    "        print(f\"{vals}  {expected:>8}  {result:>3}  {status}\")\n",
    "    return all_pass\n",
    "\n",
    "\n",
    "print(\"DNF construction algorithm defined.\")"
   ]
  },
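  {
   "cell_type": "markdown",
   "id": "c7d3a9e1",
   "metadata": {},
   "source": [
    "Before running the examples, it is worth quantifying the blowup warning above. A minimal sketch, using $n$-bit parity (which has $2^{n-1}$ minterms) and this implementation's layer sizes ($n$ NOT + $n$ PASS neurons in layer 1, one AND neuron per minterm, one OR neuron):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e2f8b6a4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Quantifying the exponential-blowup warning for n-bit parity:\n",
    "# 2^(n-1) minterms; boolean_to_mp_network emits 2n layer-1 neurons\n",
    "# (n NOT + n PASS), one AND per minterm, and one OR output neuron.\n",
    "for n in [2, 4, 8, 12, 16, 20]:\n",
    "    n_minterms = 2 ** (n - 1)\n",
    "    total = 2 * n + n_minterms + 1\n",
    "    print(f\"n = {n:>2}: {n_minterms:>7,} minterm neurons, {total:>7,} total neurons\")"
   ]
  },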
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b4dee6d3",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# --- Example 1: AND function via DNF ---\n",
    "print(\"=\" * 50)\n",
    "print(\"Example 1: AND(x0, x1) via DNF construction\")\n",
    "print(\"=\" * 50)\n",
    "\n",
    "and_tt = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}\n",
    "and_net = boolean_to_mp_network(and_tt, n_vars=2)\n",
    "print(and_net)\n",
    "print()\n",
    "for inputs, expected in sorted(and_tt.items()):\n",
    "    result = and_net.forward(list(inputs) + [1])[0]\n",
    "    status = \"PASS\" if result == expected else \"FAIL\"\n",
    "    print(f\"  AND{inputs} = {result}  (expected {expected})  [{status}]\")\n",
    "print()\n",
    "\n",
    "# --- Example 2: Majority-of-3 via DNF ---\n",
    "print(\"=\" * 50)\n",
    "print(\"Example 2: MAJORITY(x0, x1, x2) via DNF construction\")\n",
    "print(\"=\" * 50)\n",
    "\n",
    "maj_tt = {}\n",
    "for x0, x1, x2 in product([0, 1], repeat=3):\n",
    "    maj_tt[(x0, x1, x2)] = 1 if (x0 + x1 + x2) >= 2 else 0\n",
    "\n",
    "maj_net = boolean_to_mp_network(maj_tt, n_vars=3)\n",
    "print(maj_net)\n",
    "print()\n",
    "for inputs, expected in sorted(maj_tt.items()):\n",
    "    result = maj_net.forward(list(inputs) + [1])[0]\n",
    "    status = \"PASS\" if result == expected else \"FAIL\"\n",
    "    print(f\"  MAJ{inputs} = {result}  (expected {expected})  [{status}]\")\n",
    "print()\n",
    "\n",
    "# --- Example 3: Implication (x0 -> x1) ---\n",
    "print(\"=\" * 50)\n",
    "print(\"Example 3: IMPLIES(x0, x1) = NOT x0 OR x1\")\n",
    "print(\"=\" * 50)\n",
    "\n",
    "imp_tt = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 1}\n",
    "imp_net = boolean_to_mp_network(imp_tt, n_vars=2)\n",
    "print(imp_net)\n",
    "print()\n",
    "for inputs, expected in sorted(imp_tt.items()):\n",
    "    result = imp_net.forward(list(inputs) + [1])[0]\n",
    "    status = \"PASS\" if result == expected else \"FAIL\"\n",
    "    print(f\"  IMP{inputs} = {result}  (expected {expected})  [{status}]\")"
   ]
  },
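  {
   "cell_type": "markdown",
   "id": "f4a1c8d2",
   "metadata": {},
   "source": [
    "The `verify_network` helper defined above performs the same row-by-row check in tabular form. As a usage example, we re-verify the implication network with it:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a9b7e5c3",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Usage example: verify_network prints each truth-table row and returns\n",
    "# True only if the network output matches on every row.\n",
    "ok = verify_network(imp_net, imp_tt, n_vars=2)\n",
    "print(f\"\\nIMPLIES network verified: {ok}\")"
   ]
  },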
  {
   "cell_type": "markdown",
   "id": "64f5cdfd",
   "metadata": {},
   "source": [
    "## 3. XOR: The First Warning Sign\n",
    "\n",
    "### The Impossibility Result\n",
    "\n",
    "```{admonition} Definition (Exclusive-OR)\n",
    ":class: note\n",
    "\n",
    "The **exclusive-or** function is defined as:\n",
    "\n",
    "$$\\text{XOR}(x_1, x_2) = \\begin{cases} 1 & \\text{if } x_1 \\neq x_2 \\\\ 0 & \\text{if } x_1 = x_2 \\end{cases}$$\n",
    "```\n",
    "\n",
    "```{danger}\n",
    "**XOR cannot be computed by a single McCulloch-Pitts neuron.**\n",
    "\n",
    "This is a fundamental limitation. No matter how you assign inputs to excitatory/inhibitory roles, and no matter what threshold you choose, a single M-P neuron **cannot** produce the XOR truth table. This is because:\n",
    "\n",
    "- A single M-P neuron computes a **threshold function** (with absolute inhibition).\n",
    "- XOR is **not** a threshold function.\n",
    "- Equivalently, the positive examples $\\{(0,1), (1,0)\\}$ and negative examples $\\{(0,0), (1,1)\\}$ of XOR are **not linearly separable**.\n",
    "\n",
    "This seemingly minor observation foreshadows the crisis that nearly killed the field: Minsky and Papert's *Perceptrons* (1969) generalized this result to show that single-layer networks cannot compute many important functions (parity, connectedness, etc.).\n",
    "```\n",
    "\n",
    "**Theorem.** No single McCulloch-Pitts neuron can compute XOR.\n",
    "\n",
    "**Proof.** A single M-P neuron computes a function of the form:\n",
    "\n",
    "$$y = \\begin{cases} 1 & \\text{if } \\displaystyle\\sum_{j \\in E} x_j \\geq \\theta \\text{ and } \\sum_{k \\in I} x_k = 0 \\\\ 0 & \\text{otherwise} \\end{cases}$$\n",
    "\n",
    "The key property is **monotonicity in excitatory inputs**: if all inhibitory inputs are 0, then adding more active excitatory inputs can only increase (or maintain) the likelihood of firing.\n",
    "\n",
    "For XOR, we need $f(0,1) = f(1,0) = 1$ but $f(1,1) = 0$. If both inputs are excitatory, then $f(1,0) = 1$ implies $\\theta \\leq 1$. But then $f(1,1)$: the excitatory sum is $2 \\geq 1 = \\theta$, so $f(1,1) = 1$. Contradiction.\n",
    "\n",
    "If we try making one input inhibitory (say $x_2$), then $f(0, 1) = 0$ (since $x_2$ is inhibitory and active), but we need $f(0,1) = 1$. Contradiction.\n",
    "\n",
    "No assignment of roles (excitatory/inhibitory) to the two inputs, for any threshold, produces the XOR truth table. $\\blacksquare$\n",
    "\n",
    "### Geometric Interpretation\n",
    "\n",
    "```{admonition} Definition (Linear Separability)\n",
    ":class: note\n",
    "\n",
    "A Boolean function $f: \\{0,1\\}^n \\to \\{0,1\\}$ is **linearly separable** if there exists a hyperplane in $\\mathbb{R}^n$ that separates the points where $f = 1$ from the points where $f = 0$. Formally, there exist weights $w_1, \\ldots, w_n$ and a threshold $\\theta$ such that:\n",
    "\n",
    "$$f(\\mathbf{x}) = 1 \\iff \\sum_{i=1}^{n} w_i x_i \\geq \\theta$$\n",
    "```\n",
    "\n",
    "This result has a beautiful geometric explanation. A single threshold neuron partitions input space with a **hyperplane**. In 2D, a line. The XOR function's positive examples $\\{(0,1), (1,0)\\}$ and negative examples $\\{(0,0), (1,1)\\}$ are **not linearly separable** --- no single line can separate them."
   ]
  },
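  {
   "cell_type": "markdown",
   "id": "d8c4f2b9",
   "metadata": {},
   "source": [
    "The impossibility theorem can also be checked by brute force. The sketch below enumerates every assignment of excitatory/inhibitory roles to the two inputs and every threshold in $\\{1, 2, 3\\}$ (with at most two excitatory inputs the sum never exceeds 2, so $\\theta = 3$ already gives the constant-0 function and larger thresholds add nothing new). AND and OR are found immediately; XOR admits no realization. The helper `single_neuron_outputs` is ours, introduced just for this check:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b3e9d7f5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Brute-force check: no single M-P neuron computes XOR, while AND and OR\n",
    "# are realized immediately. Enumerate input roles ('E'/'I') and thresholds.\n",
    "\n",
    "def single_neuron_outputs(roles, theta):\n",
    "    \"\"\"Truth table of one M-P neuron with the given input roles and threshold.\"\"\"\n",
    "    table = {}\n",
    "    for x in product([0, 1], repeat=2):\n",
    "        exc = [x[i] for i in range(2) if roles[i] == 'E']\n",
    "        inh = [x[i] for i in range(2) if roles[i] == 'I']\n",
    "        table[x] = McCullochPittsNeuron(theta).activate(exc, inh)\n",
    "    return table\n",
    "\n",
    "targets = {\n",
    "    'AND': {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1},\n",
    "    'OR':  {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1},\n",
    "    'XOR': {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0},\n",
    "}\n",
    "\n",
    "for name, tt in targets.items():\n",
    "    solutions = [(roles, theta)\n",
    "                 for roles in product('EI', repeat=2)\n",
    "                 for theta in (1, 2, 3)  # theta >= 3 is constant 0 here\n",
    "                 if single_neuron_outputs(roles, theta) == tt]\n",
    "    print(f\"{name}: {len(solutions)} single-neuron realization(s) {solutions}\")"
   ]
  },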
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8a5c6c6e",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# --- Visualization: Why XOR is not linearly separable ---\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "fig, axes = plt.subplots(1, 3, figsize=(15, 5))\n",
    "\n",
    "for ax in axes:\n",
    "    ax.set_xlim(-0.5, 1.5)\n",
    "    ax.set_ylim(-0.5, 1.5)\n",
    "    ax.set_xlabel('$x_1$', fontsize=13)\n",
    "    ax.set_ylabel('$x_2$', fontsize=13)\n",
    "    ax.set_aspect('equal')\n",
    "    ax.grid(True, alpha=0.3)\n",
    "\n",
    "x_line = np.linspace(-0.5, 1.5, 100)\n",
    "\n",
    "# --- AND ---\n",
    "ax = axes[0]\n",
    "ax.set_title('AND (linearly separable)', fontsize=13, fontweight='bold')\n",
    "ax.scatter([0, 0, 1], [0, 1, 0], c='red', s=150, zorder=5,\n",
    "           edgecolors='black', linewidth=1.5, label='Output = 0')\n",
    "ax.scatter([1], [1], c='blue', s=150, zorder=5, marker='s',\n",
    "           edgecolors='black', linewidth=1.5, label='Output = 1')\n",
    "ax.plot(x_line, 1.5 - x_line, 'g--', linewidth=2, label='$x_1 + x_2 = 1.5$')\n",
    "ax.fill_between(x_line, 1.5 - x_line, 1.5, alpha=0.1, color='blue')\n",
    "ax.legend(loc='lower right', fontsize=9)\n",
    "\n",
    "# --- OR ---\n",
    "ax = axes[1]\n",
    "ax.set_title('OR (linearly separable)', fontsize=13, fontweight='bold')\n",
    "ax.scatter([0], [0], c='red', s=150, zorder=5,\n",
    "           edgecolors='black', linewidth=1.5, label='Output = 0')\n",
    "ax.scatter([0, 1, 1], [1, 0, 1], c='blue', s=150, zorder=5, marker='s',\n",
    "           edgecolors='black', linewidth=1.5, label='Output = 1')\n",
    "ax.plot(x_line, 0.5 - x_line, 'g--', linewidth=2, label='$x_1 + x_2 = 0.5$')\n",
    "ax.fill_between(x_line, 0.5 - x_line, 1.5, alpha=0.1, color='blue')\n",
    "ax.legend(loc='lower right', fontsize=9)\n",
    "\n",
    "# --- XOR ---\n",
    "ax = axes[2]\n",
    "ax.set_title('XOR (NOT linearly separable)', fontsize=13, fontweight='bold')\n",
    "ax.scatter([0, 1], [0, 1], c='red', s=150, zorder=5,\n",
    "           edgecolors='black', linewidth=1.5, label='Output = 0')\n",
    "ax.scatter([0, 1], [1, 0], c='blue', s=150, zorder=5, marker='s',\n",
    "           edgecolors='black', linewidth=1.5, label='Output = 1')\n",
    "ax.annotate('No single line\\ncan separate these!', xy=(0.5, 0.5),\n",
    "            fontsize=11, ha='center', va='center',\n",
    "            bbox=dict(boxstyle='round,pad=0.3', facecolor='yellow', alpha=0.7))\n",
    "ax.legend(loc='lower right', fontsize=9)\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c269258b",
   "metadata": {},
   "source": [
    "### Building XOR with a Network\n",
    "\n",
    "Although a single neuron cannot compute XOR, a **network** of M-P neurons can. The standard decomposition is:\n",
    "\n",
    "$$\\text{XOR}(x_1, x_2) = (x_1 \\wedge \\neg x_2) \\vee (\\neg x_1 \\wedge x_2)$$\n",
    "\n",
    "This requires:\n",
    "- **2 NOT neurons** (for $\\neg x_1$ and $\\neg x_2$)\n",
    "- **2 AND neurons** (for $x_1 \\wedge \\neg x_2$ and $\\neg x_1 \\wedge x_2$)\n",
    "- **1 OR neuron** (for the final disjunction)\n",
    "\n",
    "Total: **5 neurons** (plus a bias input for the NOT gates).\n",
    "\n",
    "```{tip}\n",
    "The XOR problem is historically one of the most important examples in neural network theory. It demonstrates that **depth matters**: a single layer cannot do what two layers can. This principle --- that deeper networks can compute functions that shallower networks cannot (or can only compute with exponentially more neurons) --- is a central theme in modern deep learning theory.\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cc2b8b1c",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# --- XOR Network: Manual Construction ---\n",
    "\n",
    "def build_xor_network():\n",
    "    \"\"\"Build XOR: XOR(x1, x2) = (x1 AND NOT x2) OR (NOT x1 AND x2).\"\"\"\n",
    "    net = McCullochPittsNetwork(n_inputs=3)  # x1, x2, bias\n",
    "\n",
    "    layer1 = [\n",
    "        {'neuron': McCullochPittsNeuron(1, 'NOT_x1'), 'excitatory': [2], 'inhibitory': [0]},\n",
    "        {'neuron': McCullochPittsNeuron(1, 'NOT_x2'), 'excitatory': [2], 'inhibitory': [1]},\n",
    "        {'neuron': McCullochPittsNeuron(1, 'PASS_x1'), 'excitatory': [0], 'inhibitory': []},\n",
    "        {'neuron': McCullochPittsNeuron(1, 'PASS_x2'), 'excitatory': [1], 'inhibitory': []},\n",
    "    ]\n",
    "    net.add_layer(layer1)\n",
    "\n",
    "    layer2 = [\n",
    "        {'neuron': McCullochPittsNeuron(2, 'x1_AND_NOTx2'), 'excitatory': [2, 1], 'inhibitory': []},\n",
    "        {'neuron': McCullochPittsNeuron(2, 'NOTx1_AND_x2'), 'excitatory': [0, 3], 'inhibitory': []},\n",
    "    ]\n",
    "    net.add_layer(layer2)\n",
    "\n",
    "    layer3 = [\n",
    "        {'neuron': McCullochPittsNeuron(1, 'OR_out'), 'excitatory': [0, 1], 'inhibitory': []}\n",
    "    ]\n",
    "    net.add_layer(layer3)\n",
    "    return net\n",
    "\n",
    "\n",
    "xor_net = build_xor_network()\n",
    "print(\"XOR Network Structure:\")\n",
    "print(xor_net)\n",
    "print()\n",
    "\n",
    "xor_truth = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}\n",
    "\n",
    "print(f\"{'x1':>4} {'x2':>4} {'Expected':>9} {'Got':>5} {'Status':>8}\")\n",
    "print(\"-\" * 32)\n",
    "all_pass = True\n",
    "for (x1, x2), expected in sorted(xor_truth.items()):\n",
    "    result = xor_net.forward([x1, x2, 1])[0]\n",
    "    status = \"PASS\" if result == expected else \"FAIL\"\n",
    "    if result != expected:\n",
    "        all_pass = False\n",
    "    print(f\"{x1:>4} {x2:>4} {expected:>9} {result:>5} {status:>8}\")\n",
    "\n",
    "print(f\"\\nAll tests passed: {all_pass}\")\n",
    "print(f\"Total neurons used: 5 (2 NOT + 2 AND + 1 OR)\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3c6dcd2f",
   "metadata": {},
   "source": [
    "### XOR Multi-Layer Network Diagram\n",
    "\n",
    "The following visualization shows the complete multi-layer M-P network that computes XOR, illustrating how depth overcomes the limitation of a single neuron."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "30416da5",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import matplotlib.patches as mpatches\n",
    "%matplotlib inline\n",
    "\n",
    "fig, ax = plt.subplots(figsize=(14, 8))\n",
    "ax.set_xlim(-1, 11)\n",
    "ax.set_ylim(-1, 7)\n",
    "ax.set_aspect('equal')\n",
    "ax.axis('off')\n",
    "ax.set_title('XOR Network: Multi-Layer McCulloch-Pitts Construction',\n",
    "             fontsize=15, fontweight='bold', pad=15)\n",
    "\n",
    "# Node positions\n",
    "nodes = {\n",
    "    'x1': (0.5, 5.5), 'x2': (0.5, 3.5), 'bias': (0.5, 1.5),\n",
    "    'NOT_x1': (3.0, 6.0), 'NOT_x2': (3.0, 4.0),\n",
    "    'PASS_x1': (3.0, 2.5), 'PASS_x2': (3.0, 1.0),\n",
    "    'AND1': (6.0, 5.0), 'AND2': (6.0, 2.0),\n",
    "    'OR': (9.0, 3.5),\n",
    "}\n",
    "\n",
    "node_labels = {\n",
    "    'x1': '$x_1$', 'x2': '$x_2$', 'bias': '1\\n(bias)',\n",
    "    'NOT_x1': 'NOT $x_1$\\n$\\\\theta=1$', 'NOT_x2': 'NOT $x_2$\\n$\\\\theta=1$',\n",
    "    'PASS_x1': 'pass $x_1$', 'PASS_x2': 'pass $x_2$',\n",
    "    'AND1': '$x_1 \\\\wedge \\\\neg x_2$\\n$\\\\theta=2$',\n",
    "    'AND2': '$\\\\neg x_1 \\\\wedge x_2$\\n$\\\\theta=2$',\n",
    "    'OR': 'OR\\n$\\\\theta=1$',\n",
    "}\n",
    "\n",
    "node_colors = {\n",
    "    'x1': ('#E8F5E9', '#2E7D32'), 'x2': ('#E8F5E9', '#2E7D32'),\n",
    "    'bias': ('#FFF9C4', '#F9A825'),\n",
    "    'NOT_x1': ('#E3F2FD', '#1565C0'), 'NOT_x2': ('#E3F2FD', '#1565C0'),\n",
    "    'PASS_x1': ('#F3E5F5', '#7B1FA2'), 'PASS_x2': ('#F3E5F5', '#7B1FA2'),\n",
    "    'AND1': ('#FFF3E0', '#E65100'), 'AND2': ('#FFF3E0', '#E65100'),\n",
    "    'OR': ('#FFEBEE', '#C62828'),\n",
    "}\n",
    "\n",
    "for name, (x, y) in nodes.items():\n",
    "    fc, ec = node_colors[name]\n",
    "    circle = plt.Circle((x, y), 0.45, facecolor=fc, edgecolor=ec, linewidth=2.5, zorder=5)\n",
    "    ax.add_patch(circle)\n",
    "    ax.text(x, y, node_labels[name], ha='center', va='center', fontsize=7,\n",
    "            fontweight='bold', zorder=6)\n",
    "\n",
    "# Excitatory edges\n",
    "exc_edges = [\n",
    "    ('bias', 'NOT_x1'), ('bias', 'NOT_x2'),\n",
    "    ('x1', 'PASS_x1'), ('x2', 'PASS_x2'),\n",
    "    ('PASS_x1', 'AND1'), ('NOT_x2', 'AND1'),\n",
    "    ('NOT_x1', 'AND2'), ('PASS_x2', 'AND2'),\n",
    "    ('AND1', 'OR'), ('AND2', 'OR'),\n",
    "]\n",
    "inh_edges = [('x1', 'NOT_x1'), ('x2', 'NOT_x2')]\n",
    "\n",
    "for (src, dst) in exc_edges:\n",
    "    sx, sy = nodes[src]\n",
    "    dx, dy = nodes[dst]\n",
    "    ax.annotate('', xy=(dx - 0.45 * np.sign(dx - sx + 0.001), dy - 0.45 * np.sign(dy - sy + 0.001)),\n",
    "                xytext=(sx + 0.45 * np.sign(dx - sx + 0.001), sy + 0.45 * np.sign(dy - sy + 0.001)),\n",
    "                arrowprops=dict(arrowstyle='->', color='#2E7D32', lw=1.8, alpha=0.7), zorder=3)\n",
    "\n",
    "for (src, dst) in inh_edges:\n",
    "    sx, sy = nodes[src]\n",
    "    dx, dy = nodes[dst]\n",
    "    ax.annotate('', xy=(dx - 0.45 * np.sign(dx - sx + 0.001), dy - 0.45 * np.sign(dy - sy + 0.001)),\n",
    "                xytext=(sx + 0.45 * np.sign(dx - sx + 0.001), sy + 0.45 * np.sign(dy - sy + 0.001)),\n",
    "                arrowprops=dict(arrowstyle='->', color='#C62828', lw=1.8, linestyle='dashed', alpha=0.7), zorder=3)\n",
    "\n",
    "# Output arrow\n",
    "ax.annotate('$y = \\\\mathrm{XOR}(x_1, x_2)$',\n",
    "            xy=(9.45, 3.5), xytext=(10.5, 3.5),\n",
    "            fontsize=12, fontweight='bold', ha='left', va='center',\n",
    "            arrowprops=dict(arrowstyle='->', color='black', lw=2))\n",
    "\n",
    "# Layer labels\n",
    "for lx, label in [(0.5, 'Inputs'), (3.0, 'Layer 1\\n(NOT + Pass)'),\n",
    "                   (6.0, 'Layer 2\\n(AND)'), (9.0, 'Layer 3\\n(OR)')]:\n",
    "    ax.text(lx, -0.5, label, ha='center', va='top', fontsize=10,\n",
    "            fontstyle='italic', color='gray')\n",
    "\n",
    "legend_elements = [\n",
    "    plt.Line2D([0], [0], color='#2E7D32', lw=2, label='Excitatory'),\n",
    "    plt.Line2D([0], [0], color='#C62828', lw=2, linestyle='dashed', label='Inhibitory'),\n",
    "]\n",
    "ax.legend(handles=legend_elements, loc='upper right', fontsize=10, framealpha=0.9)\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "da7a7a04",
   "metadata": {},
   "source": [
    "### Comparison of Logic Gate Implementations\n",
    "\n",
    "The following table summarizes how each common logic gate is implemented using McCulloch-Pitts neurons."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "781d8f26",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "fig, ax = plt.subplots(figsize=(14, 5))\n",
    "ax.axis('off')\n",
    "ax.set_title('McCulloch-Pitts Logic Gate Comparison', fontsize=15, fontweight='bold', pad=20)\n",
    "\n",
    "headers = ['Gate', 'Function', 'Threshold', 'Exc. Inputs', 'Inh. Inputs',\n",
    "           'Bias?', 'Neurons', 'Lin. Sep.?']\n",
    "\n",
    "rows = [\n",
    "    ['AND',  'x1 AND x2',       'theta=2', 'x1, x2',     'None',        'No',  '1', 'Yes'],\n",
    "    ['OR',   'x1 OR x2',        'theta=1', 'x1, x2',     'None',        'No',  '1', 'Yes'],\n",
    "    ['NOT',  'NOT x',            'theta=1', 'bias=1',      'x',           'Yes', '1', 'Yes'],\n",
    "    ['NAND', 'NOT(x1 AND x2)',   '--',      '--',          '--',          '--',  '2', 'Yes'],\n",
    "    ['NOR',  'NOT(x1 OR x2)',    'theta=1', 'bias=1',      'x1, x2',     'Yes', '1', 'Yes'],\n",
    "    ['XOR',  'x1 XOR x2',       '--',      '--',          '--',          '--',  '5*', 'No'],\n",
    "    ['MAJ-3','MAJ(x1,x2,x3)',   'theta=2', 'x1,x2,x3',   'None',        'No',  '1', 'Yes'],\n",
    "]\n",
    "\n",
    "table = ax.table(cellText=rows, colLabels=headers, loc='center', cellLoc='center')\n",
    "table.auto_set_font_size(False)\n",
    "table.set_fontsize(10)\n",
    "table.scale(1.0, 1.8)\n",
    "\n",
    "for j in range(len(headers)):\n",
    "    cell = table[0, j]\n",
    "    cell.set_facecolor('#1565C0')\n",
    "    cell.set_text_props(color='white', fontweight='bold', fontsize=9)\n",
    "    cell.set_edgecolor('white')\n",
    "\n",
    "for i in range(1, len(rows) + 1):\n",
    "    for j in range(len(headers)):\n",
    "        cell = table[i, j]\n",
    "        cell.set_edgecolor('#E0E0E0')\n",
    "        if rows[i-1][0] == 'XOR':\n",
    "            cell.set_facecolor('#FFF3E0')\n",
    "        elif i % 2 == 0:\n",
    "            cell.set_facecolor('#F5F5F5')\n",
    "\n",
    "ax.text(0.5, 0.02,\n",
    "        '* XOR uses 5 neurons via DNF (2 NOT + 2 AND + 1 OR); can be reduced with clever inhibition.',\n",
    "        transform=ax.transAxes, fontsize=9, ha='center', va='bottom',\n",
    "        fontstyle='italic', color='gray')\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
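  {
   "cell_type": "markdown",
   "id": "c5a8e1d7",
   "metadata": {},
   "source": [
    "As a quick sanity check on the NOR row above, the following cell builds the single-neuron NOR exactly as the table describes (an always-on excitatory bias, both inputs inhibitory, $\\theta = 1$) and verifies it against its truth table:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e7f2a4b8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Single-neuron NOR from the table: excitatory bias, x1 and x2 inhibitory.\n",
    "nor_neuron = McCullochPittsNeuron(threshold=1, name='NOR')\n",
    "\n",
    "for x1, x2 in product([0, 1], repeat=2):\n",
    "    # Bias contributes 1 excitatory unit; any active inhibitory input forces 0.\n",
    "    out = nor_neuron.activate([1], [x1, x2])\n",
    "    expected = 1 if (x1 == 0 and x2 == 0) else 0\n",
    "    status = \"PASS\" if out == expected else \"FAIL\"\n",
    "    print(f\"NOR({x1}, {x2}) = {out}  (expected {expected})  [{status}]\")"
   ]
  },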
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4bf43ac4",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# --- XOR via the automatic DNF construction ---\n",
    "\n",
    "xor_tt = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}\n",
    "\n",
    "xor_auto = boolean_to_mp_network(xor_tt, n_vars=2)\n",
    "print(\"Automatically constructed XOR network:\")\n",
    "print(xor_auto)\n",
    "print()\n",
    "\n",
    "print(f\"{'x1':>4} {'x2':>4} {'Expected':>9} {'Got':>5} {'Status':>8}\")\n",
    "print(\"-\" * 32)\n",
    "for (x1, x2), expected in sorted(xor_tt.items()):\n",
    "    result = xor_auto.forward([x1, x2, 1])[0]\n",
    "    status = \"PASS\" if result == expected else \"FAIL\"\n",
    "    print(f\"{x1:>4} {x2:>4} {expected:>9} {result:>5} {status:>8}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "31e2918f",
   "metadata": {},
   "source": [
    "## 4. Feedforward vs. Recurrent Networks\n",
    "\n",
    "### Acyclic (Feedforward) Networks\n",
    "\n",
    "```{admonition} Definition (Feedforward Network)\n",
    ":class: note\n",
    "\n",
    "A feedforward M-P network is one whose underlying directed graph is **acyclic** (has no directed cycles). Signals flow strictly from inputs through layers to outputs, with no feedback. A feedforward M-P network computes a **combinational** function --- its output at time $t + d$ (where $d$ is the depth) depends only on the input at time $t$. It has no memory of previous inputs.\n",
    "\n",
    "Formally, a feedforward M-P network with $n$ inputs and $m$ outputs computes a function:\n",
    "\n",
    "$$f: \\{0,1\\}^n \\to \\{0,1\\}^m$$\n",
    "```\n",
    "\n",
    "And we have seen (Theorem I) that every such function is realizable.\n",
    "\n",
    "### Cyclic (Recurrent) Networks\n",
    "\n",
    "```{admonition} Definition (Recurrent Network)\n",
    ":class: note\n",
    "\n",
    "A **recurrent** (cyclic) M-P network is one whose directed graph contains at least one directed **cycle**. A neuron's output can, after some number of time steps, feed back into its own input. This creates the possibility of **memory** --- the network's behavior depends not just on the current input, but on its **history**.\n",
    "```\n",
    "\n",
    "McCulloch and Pitts recognized this in their original paper. They stated:\n",
    "\n",
    "**Theorem II (McCulloch-Pitts, 1943).** Cyclic M-P networks can express *temporal propositional expressions* that are not bounded in time --- i.e., the output can depend on the entire history of inputs.\n",
    "\n",
    "The simplest example is a **latch** (also called a flip-flop or set-reset memory).\n",
    "\n",
    "### The Set-Reset Latch\n",
    "\n",
    "Consider a neuron $A$ with threshold $\\theta = 1$ that receives excitatory input from:\n",
    "1. An external \"set\" signal $S$\n",
    "2. Its own output from the previous time step (feedback)\n",
    "\n",
    "And an inhibitory input from:\n",
    "1. An external \"reset\" signal $R$\n",
    "\n",
    "The activation rule becomes:\n",
    "\n",
    "$$A(t+1) = (S(t) \\vee A(t)) \\wedge \\neg R(t)$$\n",
    "\n",
    "Once $S$ is pulsed (set to 1 for one time step), the neuron fires and its self-feedback keeps it firing. The only way to turn it off is to pulse $R$ (reset)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0d76aed8",
   "metadata": {},
   "outputs": [],
   "source": [
    "# --- Recurrent M-P Network: SR Latch ---\n",
    "\n",
    "class RecurrentMPNeuron:\n",
    "    \"\"\"A McCulloch-Pitts neuron with recurrent (self-feedback) capability.\"\"\"\n",
    "\n",
    "    def __init__(self, threshold, name=None):\n",
    "        self.threshold = threshold\n",
    "        self.name = name or f\"RMP(theta={threshold})\"\n",
    "        self.state = 0\n",
    "\n",
    "    def step(self, excitatory_inputs, inhibitory_inputs=None):\n",
    "        if inhibitory_inputs is not None and any(inhibitory_inputs):\n",
    "            self.state = 0\n",
    "        else:\n",
    "            exc_sum = sum(excitatory_inputs)\n",
    "            self.state = 1 if exc_sum >= self.threshold else 0\n",
    "        return self.state\n",
    "\n",
    "    def reset(self):\n",
    "        self.state = 0\n",
    "\n",
    "\n",
    "class SRLatch:\n",
    "    \"\"\"Set-Reset latch built from a single recurrent M-P neuron.\"\"\"\n",
    "\n",
    "    def __init__(self):\n",
    "        self.neuron = RecurrentMPNeuron(threshold=1, name='Latch')\n",
    "\n",
    "    def step(self, set_signal, reset_signal):\n",
    "        exc = [set_signal, self.neuron.state]\n",
    "        inh = [reset_signal]\n",
    "        return self.neuron.step(exc, inh)\n",
    "\n",
    "    def reset(self):\n",
    "        self.neuron.reset()\n",
    "\n",
    "    @property\n",
    "    def state(self):\n",
    "        return self.neuron.state\n",
    "\n",
    "\n",
    "latch = SRLatch()\n",
    "\n",
    "inputs_sequence = [\n",
    "    (0, 0), (0, 0), (1, 0), (0, 0), (0, 0), (0, 0),\n",
    "    (0, 1), (0, 0), (0, 0), (1, 0), (0, 0), (1, 1), (0, 0),\n",
    "]\n",
    "\n",
    "print(f\"{'t':>3}  {'S':>3}  {'R':>3}  {'Q':>3}  Notes\")\n",
    "print(\"-\" * 40)\n",
    "\n",
    "history = []\n",
    "for t, (s, r) in enumerate(inputs_sequence):\n",
    "    q = latch.step(s, r)\n",
    "    history.append((t, s, r, q))\n",
    "    notes = \"\"\n",
    "    if s == 1 and r == 0:\n",
    "        notes = \"<-- SET\"\n",
    "    elif s == 0 and r == 1:\n",
    "        notes = \"<-- RESET\"\n",
    "    elif s == 1 and r == 1:\n",
    "        notes = \"<-- S+R (inhibition wins)\"\n",
    "    print(f\"{t:>3}  {s:>3}  {r:>3}  {q:>3}  {notes}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c48b6012",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# --- Visualization of the SR Latch simulation ---\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "times = [h[0] for h in history]\n",
    "S_vals = [h[1] for h in history]\n",
    "R_vals = [h[2] for h in history]\n",
    "Q_vals = [h[3] for h in history]\n",
    "\n",
    "fig, axes = plt.subplots(3, 1, figsize=(12, 6), sharex=True)\n",
    "\n",
    "axes[0].step(times, S_vals, where='mid', color='blue', linewidth=2)\n",
    "axes[0].fill_between(times, S_vals, step='mid', alpha=0.2, color='blue')\n",
    "axes[0].set_ylabel('S (Set)', fontsize=12)\n",
    "axes[0].set_ylim(-0.1, 1.3)\n",
    "axes[0].set_yticks([0, 1])\n",
    "axes[0].grid(True, alpha=0.3)\n",
    "\n",
    "axes[1].step(times, R_vals, where='mid', color='red', linewidth=2)\n",
    "axes[1].fill_between(times, R_vals, step='mid', alpha=0.2, color='red')\n",
    "axes[1].set_ylabel('R (Reset)', fontsize=12)\n",
    "axes[1].set_ylim(-0.1, 1.3)\n",
    "axes[1].set_yticks([0, 1])\n",
    "axes[1].grid(True, alpha=0.3)\n",
    "\n",
    "axes[2].step(times, Q_vals, where='mid', color='green', linewidth=2)\n",
    "axes[2].fill_between(times, Q_vals, step='mid', alpha=0.2, color='green')\n",
    "axes[2].set_ylabel('Q (Output)', fontsize=12)\n",
    "axes[2].set_xlabel('Time step $t$', fontsize=12)\n",
    "axes[2].set_ylim(-0.1, 1.3)\n",
    "axes[2].set_yticks([0, 1])\n",
    "axes[2].grid(True, alpha=0.3)\n",
    "\n",
    "fig.suptitle('SR Latch: Recurrent McCulloch-Pitts Neuron', fontsize=14, fontweight='bold')\n",
    "plt.tight_layout()\n",
    "plt.show()\n",
    "print(\"Notice how Q stays latched at 1 after S is pulsed,\")\n",
    "print(\"and only returns to 0 after R is pulsed. This is memory.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f80296cf",
   "metadata": {},
   "source": [
    "## 5. Recurrent Network: Extended Example\n",
    "\n",
    "Let us build a slightly more complex recurrent example: a **toggle flip-flop**. This circuit changes state each time a trigger input $T$ is pulsed.\n",
    "\n",
    "The behavior we want:\n",
    "- $Q(t+1) = Q(t) \\oplus T(t)$\n",
    "- When $T = 0$: hold state ($Q$ stays the same)\n",
    "- When $T = 1$: toggle ($Q$ flips)\n",
    "\n",
    "This is more complex because XOR itself requires a multi-neuron network, and now we need feedback."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "983e57b9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# --- Toggle Flip-Flop using recurrent M-P neurons ---\n",
    "\n",
    "class ToggleFlipFlop:\n",
    "    \"\"\"A toggle flip-flop: Q(t+1) = Q(t) XOR T(t).\"\"\"\n",
    "\n",
    "    def __init__(self):\n",
    "        self.Q = 0\n",
    "        self.not_gate = McCullochPittsNeuron(threshold=1, name='NOT')\n",
    "        self.and_gate = McCullochPittsNeuron(threshold=2, name='AND')\n",
    "        self.or_gate = McCullochPittsNeuron(threshold=1, name='OR')\n",
    "\n",
    "    def step(self, T):\n",
    "        not_T = self.not_gate.activate([1], [T])\n",
    "        not_Q = self.not_gate.activate([1], [self.Q])\n",
    "        term1 = self.and_gate.activate([self.Q, not_T])\n",
    "        term2 = self.and_gate.activate([not_Q, T])\n",
    "        self.Q = self.or_gate.activate([term1, term2])\n",
    "        return self.Q\n",
    "\n",
    "    def reset(self):\n",
    "        self.Q = 0\n",
    "\n",
    "\n",
    "tff = ToggleFlipFlop()\n",
    "trigger_sequence = [0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1]\n",
    "\n",
    "print(f\"{'t':>3}  {'T':>3}  {'Q':>3}\")\n",
    "print(\"-\" * 15)\n",
    "\n",
    "tff_history = []\n",
    "for t, T in enumerate(trigger_sequence):\n",
    "    q = tff.step(T)\n",
    "    tff_history.append((t, T, q))\n",
    "    toggle_mark = \" <-- toggle\" if T == 1 else \"\"\n",
    "    print(f\"{t:>3}  {T:>3}  {q:>3}{toggle_mark}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d21a20b9",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "# --- Visualization of the Toggle Flip-Flop ---\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "t_vals = [h[0] for h in tff_history]\n",
    "T_vals = [h[1] for h in tff_history]\n",
    "Q_tff_vals = [h[2] for h in tff_history]\n",
    "\n",
    "fig, axes = plt.subplots(2, 1, figsize=(14, 5), sharex=True)\n",
    "\n",
    "axes[0].step(t_vals, T_vals, where='mid', color='orange', linewidth=2)\n",
    "axes[0].fill_between(t_vals, T_vals, step='mid', alpha=0.2, color='orange')\n",
    "axes[0].set_ylabel('T (Trigger)', fontsize=12)\n",
    "axes[0].set_ylim(-0.1, 1.3)\n",
    "axes[0].set_yticks([0, 1])\n",
    "axes[0].grid(True, alpha=0.3)\n",
    "axes[0].set_title('Toggle Flip-Flop: Q(t+1) = Q(t) XOR T(t)', fontsize=13, fontweight='bold')\n",
    "\n",
    "axes[1].step(t_vals, Q_tff_vals, where='mid', color='purple', linewidth=2)\n",
    "axes[1].fill_between(t_vals, Q_tff_vals, step='mid', alpha=0.2, color='purple')\n",
    "axes[1].set_ylabel('Q (Output)', fontsize=12)\n",
    "axes[1].set_xlabel('Time step $t$', fontsize=12)\n",
    "axes[1].set_ylim(-0.1, 1.3)\n",
    "axes[1].set_yticks([0, 1])\n",
    "axes[1].grid(True, alpha=0.3)\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()\n",
    "print(\"Each time T=1, the output Q toggles (flips between 0 and 1).\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "04566c86",
   "metadata": {},
   "source": [
    "## 6. Turing Equivalence\n",
    "\n",
    "### Theorem III (McCulloch-Pitts, 1943)\n",
    "\n",
    "McCulloch and Pitts claimed, with varying degrees of rigor, that their neural networks with cycles are equivalent in computational power to Turing machines.\n",
    "\n",
    "### Direction 1: M-P Networks can Simulate Finite Automata\n",
    "\n",
    "```{admonition} Definition (Finite Automaton)\n",
    ":class: note\n",
    "\n",
    "A **finite automaton** (FA) is defined by:\n",
    "- A finite set of states $Q = \\{q_0, q_1, \\ldots, q_m\\}$\n",
    "- An input alphabet $\\Sigma$\n",
    "- A transition function $\\delta: Q \\times \\Sigma \\to Q$\n",
    "- An initial state $q_0$\n",
    "- A set of accepting states $F \\subseteq Q$\n",
    "```\n",
    "\n",
    "**Claim.** Any finite automaton can be simulated by a cyclic M-P network.\n",
    "\n",
    "**Proof sketch.** Encode each state $q_i$ as a dedicated neuron that is active when the automaton is in that state. The transition function is a Boolean function, implementable by M-P neurons. The recurrence provides state memory. $\\blacksquare$\n",
    "\n",
    "### Direction 2: Turing Machines can Simulate M-P Networks\n",
    "\n",
    "This direction is straightforward: any M-P network with $n$ neurons can be simulated by a Turing machine that maintains an $n$-bit state vector.\n",
    "\n",
    "### The Crucial Nuance\n",
    "\n",
    "A **finite** M-P network has at most $2^n$ internal states, making it equivalent to a **finite automaton**, not a full Turing machine. The precise equivalence was clarified by **Kleene (1956)**:\n",
    "\n",
    "```{admonition} Theorem (Kleene-McCulloch-Pitts)\n",
    ":class: important\n",
    "\n",
    "$$\\text{Finite cyclic M-P networks} \\equiv \\text{Finite automata} \\equiv \\text{Regular expressions}$$\n",
    "\n",
    "The class of languages recognized by finite cyclic McCulloch-Pitts networks is precisely the class of **regular languages**.\n",
    "```\n",
    "\n",
    "```{tip}\n",
    "The gap between finite automata and Turing machines is a recurring theme in theoretical computer science. Finite M-P networks compute regular languages; adding an external memory (like a stack) would give context-free languages; adding an unbounded tape gives Turing-completeness. Modern recurrent neural networks (RNNs, LSTMs, Transformers) face analogous questions about their computational power.\n",
    "```"
   ]
  },
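  {
   "cell_type": "markdown",
   "id": "f9b3c6e2",
   "metadata": {},
   "source": [
    "To make Direction 1 concrete, here is a minimal sketch of the construction for a two-state automaton that accepts binary strings containing an even number of 1s (a regular language, as the Kleene theorem requires). Each state gets a dedicated one-hot value, and the transition function is wired from the AND/OR/NOT gates of this chapter. The class `ParityAutomaton` and the one-hot encoding are our illustrative choices, not part of the 1943 paper:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a2d6f8c4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Direction 1, concretely: a 2-state finite automaton (even/odd number of 1s)\n",
    "# simulated with M-P gates and a one-hot state encoding (q_even, q_odd).\n",
    "#\n",
    "# Transitions on input bit b:\n",
    "#   q_even' = (q_even AND NOT b) OR (q_odd  AND b)\n",
    "#   q_odd'  = (q_odd  AND NOT b) OR (q_even AND b)\n",
    "\n",
    "class ParityAutomaton:\n",
    "    \"\"\"Even-parity recognizer: stateless M-P gates plus one-hot state feedback.\"\"\"\n",
    "\n",
    "    def __init__(self):\n",
    "        self.not_gate = McCullochPittsNeuron(threshold=1, name='NOT')\n",
    "        self.and_gate = McCullochPittsNeuron(threshold=2, name='AND')\n",
    "        self.or_gate = McCullochPittsNeuron(threshold=1, name='OR')\n",
    "        self.q_even, self.q_odd = 1, 0  # start in the 'even' state\n",
    "\n",
    "    def step(self, b):\n",
    "        not_b = self.not_gate.activate([1], [b])  # NOT via bias + inhibition\n",
    "        # Compute all four transition terms BEFORE updating the state values.\n",
    "        stay_even = self.and_gate.activate([self.q_even, not_b])\n",
    "        to_even = self.and_gate.activate([self.q_odd, b])\n",
    "        stay_odd = self.and_gate.activate([self.q_odd, not_b])\n",
    "        to_odd = self.and_gate.activate([self.q_even, b])\n",
    "        self.q_even = self.or_gate.activate([stay_even, to_even])\n",
    "        self.q_odd = self.or_gate.activate([stay_odd, to_odd])\n",
    "\n",
    "    def accepts(self):\n",
    "        return self.q_even == 1  # accepting state: even number of 1s seen\n",
    "\n",
    "\n",
    "for bits in [[1, 0, 1], [1, 1, 1], [0, 0], [1, 0, 1, 1]]:\n",
    "    fa = ParityAutomaton()\n",
    "    for b in bits:\n",
    "        fa.step(b)\n",
    "    print(f\"input {bits}: accepted={fa.accepts()}  (number of 1s = {sum(bits)})\")"
   ]
  },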
  {
   "cell_type": "markdown",
   "id": "3baeb5c2",
   "metadata": {},
   "source": [
    "## 7. The Key Limitation: No Learning\n",
    "\n",
    "For all their computational power, McCulloch-Pitts networks have a fundamental limitation: **there is no mechanism for learning**.\n",
    "\n",
    "```{warning}\n",
    "The absence of learning in M-P networks is not just a technical inconvenience --- it is a **fundamental design limitation**. To build an M-P network, you must already know the complete truth table of the function you want to compute. But the whole point of intelligence is dealing with situations you have *not* seen before. The quest to add learning to neural networks drove the next 40 years of research, from Hebb (1949) through backpropagation (1986).\n",
    "```\n",
    "\n",
    "This means that to build an M-P network for a given task, you must:\n",
    "1. Know the desired input-output mapping **exactly** (the truth table).\n",
    "2. Manually design the network (or use the DNF construction).\n",
    "3. Accept that the network will never change its behavior.\n",
    "\n",
    "The recognition of this limitation drove the next major developments:\n",
    "\n",
    "- **Donald Hebb (1949):** the principle later summarized as \"neurons that fire together, wire together\" --- the first **learning rule**.\n",
    "- **Frank Rosenblatt (1958):** The **Perceptron** with adjustable weights and a **convergence theorem**.\n",
    "- **Widrow and Hoff (1960):** The **ADALINE** and the **delta rule** (least mean squares).\n",
    "- **Rumelhart, Hinton, and Williams (1986):** **Backpropagation** for multi-layer networks.\n",
    "\n",
    "The story from McCulloch-Pitts to backpropagation is the story of how the field learned to make neural networks **learn**."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d2136262",
   "metadata": {},
   "source": [
    "## 8. Exercises\n",
    "\n",
    "### Exercise 2.1: Full Adder\n",
    "\n",
    "A **full adder** takes three 1-bit inputs ($x_1$, $x_2$, and carry-in $c_{\\text{in}}$) and produces:\n",
    "- **Sum** $S = x_1 \\oplus x_2 \\oplus c_{\\text{in}}$\n",
    "- **Carry-out** $C_{\\text{out}} = (x_1 \\wedge x_2) \\vee (x_1 \\wedge c_{\\text{in}}) \\vee (x_2 \\wedge c_{\\text{in}})$\n",
    "\n",
    "Build a McCulloch-Pitts network that implements a full adder.\n",
    "\n",
    "| $x_1$ | $x_2$ | $c_{\\text{in}}$ | Sum | $C_{\\text{out}}$ |\n",
    "|:-----:|:-----:|:----------------:|:---:|:-----------------:|\n",
    "| 0     | 0     | 0                | 0   | 0                 |\n",
    "| 0     | 0     | 1                | 1   | 0                 |\n",
    "| 0     | 1     | 0                | 1   | 0                 |\n",
    "| 0     | 1     | 1                | 0   | 1                 |\n",
    "| 1     | 0     | 0                | 1   | 0                 |\n",
    "| 1     | 0     | 1                | 0   | 1                 |\n",
    "| 1     | 1     | 0                | 0   | 1                 |\n",
    "| 1     | 1     | 1                | 1   | 1                 |\n",
    "\n",
    "```{hint}\n",
    ":class: dropdown\n",
    "\n",
    "Use the DNF construction for each output separately, or observe that $C_{\\text{out}}$ is the majority function ($\\theta = 2$, three excitatory inputs) and $S$ is the 3-bit parity function. Parity requires a multi-layer network: chain two XOR computations.\n",
    "```\n",
    "\n",
    "### Exercise 2.2: 2-Bit Counter\n",
    "\n",
    "Implement a **2-bit binary counter** using recurrent M-P neurons. The counter should count up by 1 each time a clock input $C$ is pulsed: $00 \\to 01 \\to 10 \\to 11 \\to 00 \\to \\ldots$\n",
    "\n",
    "```{hint}\n",
    ":class: dropdown\n",
    "\n",
    "$Q_0$ is a toggle flip-flop driven by $C$. $Q_1$ is a toggle flip-flop driven by the carry from $Q_0$ (i.e., $Q_1$ toggles when $Q_0$ goes from 1 to 0).\n",
    "```\n",
    "\n",
    "### Exercise 2.3: $n$-Bit Parity\n",
    "\n",
    "The **parity function** on $n$ inputs outputs 1 iff the number of 1-inputs is odd:\n",
    "\n",
    "$$\\text{PAR}(x_1, \\ldots, x_n) = x_1 \\oplus x_2 \\oplus \\cdots \\oplus x_n$$\n",
    "\n",
    "How many M-P neurons are needed to compute $n$-bit parity?\n",
    "\n",
    "**Analysis:** The DNF construction uses $n + 2^{n-1} + 1$ neurons. A tree construction uses $O(n)$ neurons and $O(\\log n)$ depth.\n",
    "\n",
    "```{hint}\n",
    ":class: dropdown\n",
    "\n",
    "For 4-bit parity: compute $a = x_1 \\oplus x_2$ and $b = x_3 \\oplus x_4$ in parallel (each using 5 neurons), then compute $a \\oplus b$ (another 5 neurons). Total: 15 neurons, depth 6. Compare with DNF: $4 + 8 + 1 = 13$ neurons but only depth 3. There is a tradeoff between number of neurons and depth!\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "05f1adc1",
   "metadata": {},
   "source": [
    "## 9. Summary\n",
    "\n",
    "In this chapter, we have gone from individual McCulloch-Pitts neurons to **networks** that can compute arbitrary Boolean functions and implement stateful sequential logic.\n",
    "\n",
    "### Key results:\n",
    "\n",
    "1. **The DNF construction** provides a systematic algorithm to build an M-P network for any Boolean function. The cost is at most $n + 2^n + 1$ neurons (worst case).\n",
    "\n",
    "2. **XOR is not computable by a single M-P neuron.** This is because XOR is not a threshold function. This is a precursor to Minsky and Papert (1969).\n",
    "\n",
    "3. **Recurrent (cyclic) M-P networks can implement memory.** The SR latch is the simplest example.\n",
    "\n",
    "4. **Finite cyclic M-P networks are equivalent to finite automata** (Kleene-McCulloch-Pitts theorem).\n",
    "\n",
    "5. **M-P networks have no learning mechanism.** Everything is hardwired.\n",
    "\n",
    "```{tip}\n",
    "Looking back at this chapter, notice the pattern: **each limitation of a model motivates the next advance**. Single neurons cannot compute XOR, so we build networks. Feedforward networks have no memory, so we add cycles. Fixed networks cannot learn, so we introduce adjustable weights. This dialectic of limitation and innovation is the engine that drives the field forward.\n",
    "```\n",
    "\n",
    "### The road ahead:\n",
    "\n",
    "In the next chapter, we place these results in their historical context, tracing the intellectual lineages from McCulloch and Pitts through the AI winter and up to the backpropagation revolution of 1986."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}