
Transformer Visualizer

Browse cross-attention layer-by-layer, head-by-head — see what each head learned to align.

A trained Transformer encoder-decoder has many attention matrices: encoder self-attention (every input position attending to every other input), decoder self-attention (causal — each output looks only at earlier outputs), and cross-attention (each output looks at the input). This applet simulates a 2-layer × 4-head Transformer trained on string reversal — pick a layer and head and watch the alignment pattern that head specialised in.
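The three attention types differ only in where the queries, keys, and values come from and whether a causal mask is applied. A minimal numpy sketch (illustrative only, not the applet's actual 2-layer model):

```python
import numpy as np

def attention(Q, K, V, causal=False):
    """Scaled dot-product attention. With causal=True, position i
    may only attend to positions <= i (decoder self-attention)."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                 # (T_out, T_in) logits
    if causal:
        mask = np.tril(np.ones_like(scores, dtype=bool))
        scores = np.where(mask, scores, -np.inf)  # hide future positions
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)            # row-wise softmax
    return w @ V, w

rng = np.random.default_rng(0)
enc_states = rng.normal(size=(5, 8))   # 5 input tokens, d = 8
dec_states = rng.normal(size=(5, 8))   # 5 output tokens

_, enc_self = attention(enc_states, enc_states, enc_states)               # full
_, dec_self = attention(dec_states, dec_states, dec_states, causal=True)  # causal
_, cross    = attention(dec_states, enc_states, enc_states)               # out → in
```

Each returned weight matrix is row-stochastic; `dec_self` is strictly lower-triangular because of the mask, which is exactly the shape you will see when you switch the view to decoder self-attention.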

Input string

The model is trained to reverse the string. Output appears immediately below.

in  = transformer
out = remrofsnart

View

attention type
layer

Heads (click to focus)

H0 mirror · H1 diagonal · H2 first/last · H3 diffuse

Diagnostics

entropy
peak
mirror score
diag score
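The applet does not document its exact formulas, but the four diagnostics have natural definitions for a row-stochastic attention matrix; this sketch shows one plausible version (mean row entropy, mean peak weight, and mean mass on the main and anti-diagonals):

```python
import numpy as np

def diagnostics(A):
    """Summaries of one head's (T, T) row-stochastic attention matrix."""
    T = A.shape[0]
    eps = 1e-12                                   # avoid log(0)
    entropy = float(-(A * np.log(A + eps)).sum(axis=1).mean())  # mean row entropy, nats
    peak = float(A.max(axis=1).mean())            # mean strongest weight per row
    diag_score = float(A[np.arange(T), np.arange(T)].mean())    # main-diagonal mass
    mirror_score = float(A[np.arange(T), T - 1 - np.arange(T)].mean())  # anti-diagonal mass
    return {"entropy": entropy, "peak": peak,
            "diag score": diag_score, "mirror score": mirror_score}

T = 6
perfect_mirror = np.fliplr(np.eye(T))   # output i attends only to input T-1-i
uniform = np.full((T, T), 1.0 / T)      # maximally diffuse head
d_mirror = diagnostics(perfect_mirror)
d_diffuse = diagnostics(uniform)
```

A perfect mirror head scores entropy ≈ 0, peak = 1, mirror score = 1; a uniform head scores entropy = log T with peak = 1/T. Real trained heads fall between these extremes.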

Cross-attention · Layer 0 · Head 0

For string reversal, cross-attention should show an anti-diagonal stripe — output position $i$ aligns with input position $T-i$. Layer 0's heads typically discover this; Layer 1 refines it. Encoder self-attention and decoder self-attention show different patterns — explore them.
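The anti-diagonal stripe can be constructed directly: give each output position logits that fall off with distance from its mirrored input position. This is a stylised stand-in for what training produces, with a hypothetical `sharpness` parameter playing the role of how confident the head has become:

```python
import numpy as np

T = 8
i = np.arange(T)[:, None]        # output positions (rows)
j = np.arange(T)[None, :]        # input positions (columns)

# Logits fall off with distance from the mirrored position j = T-1-i;
# higher sharpness -> cleaner stripe (a stand-in for more training).
sharpness = 4.0
logits = -sharpness * np.abs(j - (T - 1 - i))
A = np.exp(logits)
A /= A.sum(axis=1, keepdims=True)   # row-wise softmax

# every row peaks exactly on the anti-diagonal
assert (A.argmax(axis=1) == (T - 1 - np.arange(T))).all()
```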

What you're looking at

A real Transformer has many attention matrices per layer (one per head). The shape of each matrix tells you what that head learned to do. The four heads in this applet are stylised approximations of the patterns a trained reversal Transformer typically discovers: H0, a clean anti-diagonal "mirror" head; H1, a less confident diagonal; H2, a head that fixates on the edges (BOS/EOS); and H3, a diffuse head that averages over many positions. The encoder self-attention is mostly identity-like (each token reads itself plus its neighbours); the decoder self-attention is strictly lower-triangular because of the causal mask. In Chapter 40 you trained a real version of this in PyTorch; this applet shows you what those trained matrices look like before you have to inspect them in matplotlib.
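The four head archetypes can each be written as a softmax over hand-picked logits. These constructions are illustrative reconstructions of the head labels, not the applet's actual parameters (the constants 4.0, 1.5, and 2.0 are arbitrary choices):

```python
import numpy as np

def softmax_rows(Z):
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

T = 8
i = np.arange(T)[:, None]   # output positions (rows)
j = np.arange(T)[None, :]   # input positions (columns)
ones = np.ones((T, 1))      # broadcast a row pattern to all rows

heads = {
    "H0 mirror":     softmax_rows(-4.0 * np.abs(j - (T - 1 - i))),  # anti-diagonal stripe
    "H1 diagonal":   softmax_rows(-1.5 * np.abs(j - i)),            # softer identity stripe
    "H2 first/last": softmax_rows(ones * np.where((j == 0) | (j == T - 1), 2.0, 0.0)),
    "H3 diffuse":    softmax_rows(np.zeros((T, T))),                # uniform over inputs
}
```

Plotting these four matrices side by side reproduces the qualitative gallery you see when you click through H0–H3 in the applet.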
