The Predicting Machine
The Cerebellum, the Forward Model, and the Limits of an Idea
The structure engineers love
There is a structure at the back of the brain, tucked beneath the great hemispheres like an afterthought, that has fascinated people with a certain cast of mind more than any other part of the nervous system. It is the cerebellum — the name means “little brain” — and almost everything about it invites the kind of person who thinks in circuits and control loops to lean forward.
Start with the arithmetic, which is genuinely startling. The cerebellum is a small thing: roughly a tenth of the brain by volume, a wrinkled appendage you could cup in one hand. Yet it contains somewhere around four-fifths of all the neurons in your brain — on the order of seventy billion cells, more than the entire cerebral cortex, the seat of everything we are proud of, packed into a structure a fraction of its size. If you were handed the neuron count and asked to guess where the cells were, you would guess wrong. The brain placed most of its neuron count — though, since most of those cells are tiny, not most of its mass — inside the little brain, and even that lopsided investment should tell us the cerebellum is doing something the rest of the brain found worth paying for.
But the number is not what draws the engineers. What draws them is the regularity. Cut into the cerebellum almost anywhere — the part that helps you walk, the part that helps you speak, the part that lights up when you do mental arithmetic — and you find the same basic circuit: the same cell types, wired in the same way, repeated across the whole sheet with a monotony you find nowhere else in the brain. The cerebral cortex varies from region to region; its visual areas do not look quite like its motor areas, which do not look quite like its frontal areas. The cerebellar cortex barely varies at all. This is not to say the structure is homogeneous — its inputs and outputs are organized into distinct loops, and it is divided into molecular zones and parasagittal stripes that matter for how it is wired into the rest of the brain. But the circuit is strikingly conserved; it is the traffic through it that differs. To a first approximation the cerebellum is a single computation stamped out millions of times, like printed transistors on a wafer, and laid over an enormous surface — for the cerebellar sheet, if you unfolded its tight pleats, would run something like a meter long.
A conserved circuit repeated across a vast array is, to an engineer, a provocation. It says: here is one operation, applied over and over, to whatever you feed in. It says there is a single thing the cerebellum does — some basic computation — performed on motor signals in one place and on something else in another, the same way throughout. The promise is almost irresistible: that if you could just work out what the one operation is, you would understand the whole structure at a stroke, because every part is the same machine running on different inputs.
This is why the cerebellum, more than any other piece of the brain, became a playground for people with theories. We know its wiring in exquisite detail — its inputs, its outputs, every cell in the canonical circuit and nearly every connection. It has the cleanest, most legible architecture in the central nervous system, and into that legible architecture, generations of theorists — many trained in engineering and control theory rather than biology — have poured ideas about what such a machine could compute. The remarkable thing, the thing this chapter has to be honest about from the start, is that after more than a century of study and decades of beautiful theory, we are still arguing about what it does. We understand the cerebellum’s hardware better than that of any other brain structure, and we understand its function less well than we would like to admit. That gap — between a perfectly mapped circuit and a contested function — is the tension that organizes everything ahead.
I will give you the leading answer now, because this is not a mystery to be strung out. The best current idea is that the cerebellum is the brain’s predicting machine: that it builds and continuously updates an internal model of the body and its world, a model that can forecast the sensory consequences of a movement before the movement has produced any sensation to feel. If that is right, the cerebellum is the organ that makes the feedforward control of the last two chapters possible — the structure standing behind the calf muscle that braced before the arm lifted, behind the hand that learned the weight while perception stayed fooled. It is, in the language of this book, the organ of prediction, and prediction, we keep finding, is what the brain is really for. But “if that is right” is carrying real weight in that sentence, and part of our job is to see exactly how much.
What the little brain is for: the lesion tells us
How do you discover the function of a structure that does not, itself, move anything? For the cerebellum sends no axons to muscle. Like the basal ganglia we will come to next, it has no direct line to the alpha motor neurons of the final common path; whatever it does, it does by shaping the activity of other structures that do reach muscle. So we cannot read its function off its outputs the way we might for the motor cortex. We have to discover it the way neurology has always discovered things: by watching what happens when it breaks.
The first person to do this properly was the French physiologist Pierre Flourens, in the 1820s, working — as people did then — by making lesions in animals and observing what they could no longer do. His experimental subjects were pigeons, and what he found, when he removed the cerebellum, was not what you might expect. The birds were not paralyzed. They could still move every part of their bodies; the movements were all there. What had vanished was something subtler and, in the vocabulary of the time, almost new. Flourens wrote that after the lesion all movements persisted, but that they were no longer regular and coordinated. The animal could move; it could not move well.
We could almost stop the chapter here, because Flourens, two centuries ago, named the thing. The cerebellum is the organ of coordination. It is not where movements are generated and not where they are commanded; it is what makes them smooth, accurate, and properly timed. The concept of coordination — of movement as something that can be intact in its parts and yet broken in its organization — was barely part of how anyone thought about the brain before this. Flourens put it there, and it has never left.
The modern name for what cerebellar patients lose is ataxia — literally “without order,” a failure of coordination — and it is the hallmark, the signature that tells a neurologist the trouble is in the cerebellum. It is worth being concrete about what ataxia actually looks like, because the specific failures are not a random list; each one is a clue to the underlying computation, and we will collect them as evidence.
Watch a patient with cerebellar damage try to walk and the gait is wide-based, lurching, unsteady — the rolling, careful walk of someone crossing the deck of a moving ship, or, as every clinician notes, of someone who has had too much to drink. Ask them to walk heel-to-toe along a line and they cannot; they stagger and put a foot out to the side to keep from falling. Ask them to reach out and touch their nose with a fingertip, the classic bedside test, and you see the most telling sign of all: as the finger approaches its target, it begins to overshoot and correct, overshoot the other way and correct again, oscillating back and forth in a widening wobble as it nears the nose — an intention tremor, a tremor that appears specifically during a guided movement and worsens as precision is demanded, the opposite of the tremor-at-rest we will see in Parkinson’s. The patient gets close and then cannot land. This overshooting and undershooting of a target has its own name, dysmetria — “wrong measure” — the inability to scale a movement correctly to the distance it must cover. Ask the patient to rapidly alternate a movement — flip the hand palm-up, palm-down, palm-up, as fast as they can — and the rhythm falls apart into a clumsy, irregular fumble; this is dysdiadochokinesia, the loss of smooth alternation. Movements that should be single smooth arcs decompose into a sequence of separate jerks, as though the joints were being moved one at a time rather than together. And the muscles of speech are not spared: cerebellar patients are often dysarthric, their speech slurred, halting, poorly metered — words run together or break apart, the fine timing of articulation gone.
Read that list again with the engineer’s eye and a pattern jumps out. Every single deficit is a failure of prediction and correction in time. The hand overshoots because the system did not anticipate when to start braking. The reach oscillates because each correction arrives too late and overcorrects, with no forecast to damp it. The rapid alternation fails because it depends on precisely timed switching between antagonist muscles. The smooth movement decomposes because smoothness is the seamless anticipatory blending of one joint’s motion into the next. The slurred speech is the same failure in the fastest, most finely timed muscles we own. Nothing here is weakness; the muscles are strong and the movements are all available. What is broken is the organization of movement in time — the anticipation that should keep each act smooth and on-target. The lesion is telling us, in every one of its signs, that the cerebellum’s job has to do with timing and prediction.
The next time you see a police officer run someone through a roadside sobriety check — walk a straight line heel-to-toe, stand on one leg, tip the head back and touch the nose with a fingertip, eyes closed — you are watching something close to a neurologist’s cerebellar examination, and not by coincidence. (A caution for the pedant: the formal standardized field sobriety battery used in the United States is a specific trio — testing for a particular eye jerk called nystagmus, plus walk-and-turn and one-leg-stand — and finger-to-nose and Romberg-type tasks, though common at the roadside, are not officially part of that battery. But the family resemblance to a cerebellar exam is real.) These tasks probe coordination, balance, and the visually unguided accuracy of movement — precisely the functions that fail in ataxia.
They detect drunkenness because the cerebellum is exquisitely sensitive to alcohol. Among all the brain’s structures, it is one of the first and hardest hit by acute intoxication, which is why the drunk and the cerebellar patient share a clinical picture so closely that the roadside tasks for one resemble the bedside tests for the other: the wide unsteady gait, the heel-to-toe stagger, the finger that misses the nose, the slurred speech. Alcohol does temporarily and reversibly, to a healthy cerebellum, much of what a lesion does permanently. The roadside check is, in effect, asking your cerebellum to prove it is sober by demonstrating that its predictions are still on time. Chronic heavy drinking, incidentally, can damage the cerebellum for good — there is a well-recognized alcoholic cerebellar degeneration, concentrated in exactly the region that governs the legs — so that the unsteady gait outlasts the drinking.
And here is the second great clinical fact, the one that anchors the diagnostic logic of the whole unit. Cerebellar damage does not cause paralysis. This is worth dwelling on because it is the cleanest possible demonstration that the cerebellum sits off to the side of the main pathway to muscle, not within it. A person with a destroyed cerebellum can still move every muscle. They are not weak. They have not lost the final common path, nor the descending commands that drive it. What they have lost is the quality of movement — its smoothness, its accuracy, its timing. Recall the diagnostic ladder from the overview: destroy the final common path and you get paralysis; damage the descending cortical control and you get weakness and spasticity; damage the cerebellum and you get ataxia — strong but clumsy, well-powered but poorly aimed and poorly timed. Each kind of failure points to a different level of the architecture, and ataxia points precisely here, to a structure that calibrates movement without commanding it.
There is a strange and instructive endpoint to this fact, which we will return to at the chapter’s close but should plant now. People are very occasionally born with little or no cerebellum — a rare developmental condition called cerebellar agenesis. You might expect this to be devastating, given that the missing structure holds most of the brain’s neurons. Often it is not. Such individuals are typically ataxic and dysarthric, clumsy and late to walk, frequently needing a cane — but many walk, talk, hold jobs, and live substantially ordinary lives. One well-known case came to medical attention only as an adult, for an unrelated complaint, and was found to have essentially no cerebellum despite a largely normal life. We will have to handle this evidence with real care later — true total agenesis is exceedingly rare, many such cases have some residual cerebellar tissue or other abnormalities of the brainstem and posterior brain, and a brain that develops without a cerebellum from the start has a lifetime to compensate in ways an adult who loses one suddenly does not. So congenital absence cannot be weighed in quite the same scale as an adult lesion. But hold the basic fact in reserve. It will tell us something important about how much, and how little, the predicting machine turns out to be for.
A map before the machinery
Before we go further it helps to have a rough map of the structure, because the cerebellum is not an undifferentiated blob and several threads we are about to pull — the steadying of gaze, the staggering gait, the cognitive story, the side on which signs appear — live in different parts of it. The internal circuit is much the same throughout, as we said; but what that circuit is wired to divides the cerebellum into three functional territories, oldest to newest, and it is worth holding them in mind.
The oldest part, tucked underneath against the brainstem, is the vestibulocerebellum, wired to the balance organs of the inner ear and to the systems that move the eyes. It governs balance, upright posture, and the stabilization of gaze; it is the part that will concern us when we come to the steadying of the eyes during head movement. Damage here makes people unsteady on their feet and disturbs their eye movements.
The middle territory, running down the midline and the strip just beside it, is the spinocerebellum — wired, as its name says, to the spinal cord, receiving the great proprioceptive stream from the body. It governs the coordination of the trunk, the gait, and the ongoing movements of the limbs, comparing the body’s actual state against what was intended. This is the part whose failure produces the lurching, wide-based, heel-to-toe-incapable walk that is the textbook picture of ataxia.
The newest and, in humans, by far the largest territory is the cerebrocerebellum — the great lateral hemispheres, wired not to the spinal cord or the inner ear but, through massive loops, to the cerebral cortex, including the association cortex of the frontal and parietal lobes. This is the part that has expanded so dramatically in primates, the part implicated in the planning and precise timing of movement and in the cognitive functions we will weigh at the chapter’s end. Its deep output nucleus, the dentate, is correspondingly enlarged in us.
Keep this trio in view. When we discuss the steadying of gaze, we are in the vestibulocerebellum; when we discuss ataxic gait, the spinocerebellum; when we discuss timing, planning, and cognition, the cerebrocerebellum. The same circuit, in each — but very different traffic, and very different consequences when it fails.
Two problems that make prediction necessary
Before we open up the machine, we should be clear about why a predicting machine had to evolve at all — what problem it solves that could not be solved otherwise. Because the cerebellum is, at bottom, a solution, and you cannot understand a solution without understanding its problem. There are two, and they are both consequences of a hard fact about controlling a body: feedback is too slow.
Recall the picture of control we built in the overview — the loop in which a command produces a movement, the movement produces sensory feedback, and the feedback is compared against what was intended so that errors can be corrected. This is the basic architecture of any controller, and the spinal cord runs versions of it locally and fast. But when the loop has to run through the brain, it runs into a wall: the sensory signals reporting what your body actually did take time to arrive. Proprioception from your limbs — the reports from muscle spindles and tendon organs about where your joints are and what forces they bear — must travel up the spinal cord, be processed, and come back as a correction, and the round trip costs tens to well over a hundred milliseconds. That does not sound like much. It is catastrophic.
The first problem, then, is delay. Imagine controlling any fast movement purely by feedback. You issue a command, you wait for the senses to tell you how it is going, and by the time they do, the world has moved on; your correction is a correction to a situation that no longer exists, and it arrives late, and because it arrives late it overshoots, and the overshoot generates a new error, and you correct that late, and you oscillate. You have, in fact, exactly reproduced the intention tremor of the cerebellar patient — which is precisely the point. A system forced to rely on delayed feedback wobbles, because every correction is stale. The faster and more precise the movement, the worse the problem, until for the quickest movements feedback control becomes useless: the movement is over before the first report arrives. You cannot catch a ball, or speak a fluent sentence, or play a run of notes, by waiting to feel each one.
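The wobble is easy to reproduce on paper. What follows is a minimal sketch in Python, not a model of any real limb: the names simulate_reach and worst_error, the gain of 0.35, and the five-step delay are all illustrative assumptions. A "hand" moves toward a target under simple proportional correction, but the controller is shown only where the hand was a few steps ago.

```python
def simulate_reach(delay_steps, gain=0.35, target=1.0, steps=200):
    """Move a 'hand' toward target, correcting against a delayed position report.

    All names and numbers here are illustrative assumptions, not measurements.
    """
    positions = [0.0]
    for t in range(steps):
        # The controller sees where the hand was delay_steps ago, not where it is now.
        seen = positions[max(0, t - delay_steps)]
        # Proportional correction toward the target, based on that stale report.
        positions.append(positions[-1] + gain * (target - seen))
    return positions


def worst_error(positions, target=1.0):
    """Largest distance from the target anywhere in a stretch of the movement."""
    return max(abs(target - p) for p in positions)


prompt = simulate_reach(delay_steps=0)  # feedback arrives instantly
stale = simulate_reach(delay_steps=5)   # feedback arrives five steps late

print("no delay:   peak position", round(max(prompt), 3))
print("with delay: peak position", round(max(stale), 3))
print("with delay: late swings larger than early ones:",
      worst_error(stale[-50:]) > worst_error(stale[:50]))
```

With no delay the hand glides up to the target and never passes it. With the delay, the very same correction rule overshoots, and at this gain the swings grow rather than settle: the simulated cousin of the finger that cannot land on the nose.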
The second problem is variability. The relationship between a motor command and its result is not fixed. The same command produces different movements depending on circumstances the command cannot know in advance. You reach to lift a box and it is heavier than you expected, so your planned force is wrong. You step onto what you took for solid ground and it gives. You are walking across that pitching ship’s deck and the floor rises to meet your foot a moment early. Your own body changes, too — a limb fatigues, grows over a lifetime, is loaded differently from moment to moment. A controller that assumed a fixed, known relationship between command and consequence would be wrong constantly, and would have to discover each error the slow way, through delayed feedback, after it had already thrown the movement off.
Put the two problems together and the predicament is sharp. The body must be controlled fast, faster than feedback allows; and the thing being controlled keeps changing, so that no fixed command will do. How can a brain control something quickly and accurately when it cannot trust its commands to have fixed effects and cannot wait to find out what actually happened?
The answer — the answer the cerebellum seems to embody — is to predict. If you cannot wait for the real sensory feedback, then generate fake feedback ahead of time: build a model of the body that takes your intended command as input and forecasts, instantly and internally, what the consequences will be — where the limb will go, what you will feel — without any need to actually move and wait. Then you can correct against the prediction, which is available now, instead of against the measurement, which is available too late. And if the model’s predictions turn out wrong — if the box was heavy, if the deck moved — you use that error not only to fix the current movement but to update the model, so that next time the prediction is better. A controller built this way is fast, because it does not wait; and it is adaptive, because it learns. This is the idea of an internal model, or forward model, and it is the single most important concept in this chapter.
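The learning half of that idea can be sketched in a few lines of Python. This is a caricature under stated assumptions: the whole body is reduced to a single number (a "gain" linking command to outcome), and the function name adapt_forward_model, the learning rate, and the trial sequence are all invented for illustration. The model predicts the outcome at once, the real outcome arrives afterward, and the mismatch nudges the model.

```python
def adapt_forward_model(true_gains, command=1.0, learning_rate=0.5):
    """Return the prediction error on each trial as a one-number model adapts.

    Hypothetical toy: the 'body' is outcome = true_gain * command.
    """
    estimated_gain = 1.0  # the model's starting belief about the body
    errors = []
    for true_gain in true_gains:
        predicted = estimated_gain * command  # available immediately, before moving
        actual = true_gain * command          # known only after the movement
        error = actual - predicted
        errors.append(error)
        # Use the mismatch to update the model so the next forecast is better.
        estimated_gain += learning_rate * error * command
    return errors


# Five ordinary lifts, then the box becomes heavier, so the same command
# produces a smaller movement (gain drops from 1.0 to 0.6) for ten lifts.
errors = adapt_forward_model([1.0] * 5 + [0.6] * 10)
print([round(e, 4) for e in errors])
```

While the body behaves as expected, the error stays at zero. On the first heavy lift the prediction is wrong by the full amount, and on each lift after that the error shrinks, because the model has been revised rather than merely overruled.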
Can a body move at all without feedback? Strikingly, yes — and the demonstration tells us how much the brain relies on prediction. There are rare patients who, through a large-fiber sensory neuronopathy — a condition that destroys the nerves carrying proprioception while sparing the motor nerves — have lost essentially all sense of where their bodies are: the limbs can still be commanded, but they send back no report of position or motion. The most famous such patient, Ian Waterman, lost proprioception from the neck down in his late teens and had to learn, painstakingly, to move again by watching his limbs and steering them with vision, an effort of constant conscious attention. Ask a deafferented person, in the dark so that vision cannot help either, to draw a shape in the air — a circle, a figure-eight — and they can still do it; the movement comes out, roughly, with no feedback at all. This is a fire-and-forget system: the command is issued and runs to completion unmonitored, exactly the way a feedforward controller works, and it proves that movement does not strictly need moment-to-moment feedback. But it also shows the limits — the movements are unstable, they drift, they degrade, and without vision to substitute they become profoundly effortful. We can move without feedback. We cannot move well without it — or, more precisely, without the internal prediction that normally stands in for it.
The problem the cerebellum solves is not unique to brains, and the engineering world solved it the same way — which is good evidence that the solution is the right one and not just a story we tell about neural tissue. Consider steering a supertanker. The ship is so massive that a turn of the wheel expresses itself in the hull’s motion only minutes later. Steer such a vessel by watching what it is currently doing and you will fail catastrophically: you turn the wheel, see no response, turn it further, and long after, the ship answers to all that accumulated input at once and swings wildly past your heading. Helmsmen lose large ships exactly this way, oscillating around their course — the supertanker’s version of intention tremor, the same wobble that delayed proprioception produces in a reaching arm.
The engineering fix is a predictive display. The ship carries a model of its own dynamics (how this hull, at this speed, in this wind and tide, answers the helm), and that model, fed the current command, computes where the ship will be in thirty seconds, a minute, five minutes, and shows the helmsman that predicted track now. The operator steers by the forecast rather than the present position, and the wobble vanishes, because the prediction supplies instantly what real feedback would supply far too late. You have met a humbler version in a car whose backup camera overlays your projected path on the screen. You are not steering by where the car is; you are steering by where the model says it is about to be. This, I will argue, is almost exactly what the cerebellum does for the body. And the fact that an engineer and evolution arrived at the same device is the strongest reason to take the forward-model idea seriously.
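The logic of steering by forecast can be sketched in Python. Everything specific here is an assumption made for illustration: the function name steer, the gain, the five-step delay, and above all the "ship," which is reduced to a heading that changes by exactly the amount commanded. The helmsman's reading is always several steps old. With the forecast switched on, an internal model adds the commands already issued but not yet visible in that reading.

```python
def steer(delay_steps, use_forecast, gain=0.35, target=1.0, steps=200):
    """Steer a toy 'ship' toward a target heading using a delayed reading.

    Hypothetical toy: each command changes the heading by exactly that amount.
    """
    headings = [0.0]
    commands = []
    for t in range(steps):
        seen_index = max(0, t - delay_steps)
        seen = headings[seen_index]  # stale reading of the heading
        if use_forecast:
            # Internal model: add the effect of every command issued since
            # the stale reading was taken, to estimate the heading right now.
            estimate = seen + sum(commands[seen_index:])
        else:
            estimate = seen  # steer by the stale reading itself
        command = gain * (target - estimate)
        commands.append(command)
        headings.append(headings[-1] + command)
    return headings


by_reading = steer(delay_steps=5, use_forecast=False)
by_forecast = steer(delay_steps=5, use_forecast=True)
print("steering by stale reading, peak heading:", round(max(by_reading), 3))
print("steering by forecast, peak heading:     ", round(max(by_forecast), 3))
```

The delay is identical in both runs. Steering by the stale reading swings far past the target; steering by the model's forecast settles onto it without overshoot. The sketch flatters the forecast, since its model of the ship is perfect, and a real model would need the error-driven updating described earlier to stay that good.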
A clue from an electric fish
The forward-model idea, stated abstractly, can feel like a just-so story — a tidy engineering metaphor draped over a brain structure because it fits. The reason to believe it is not merely a metaphor is that we can watch a cerebellum-like circuit actually build a forward model, neuron by neuron, in an animal strange enough that the whole computation lies open to view. The animal is an electric fish, and it is one of the most beautiful illustrations in all of neuroscience of what a predicting machine is for.
But first the anatomical fact that points us there. If you survey the vertebrates and ask which has the largest cerebellum relative to the rest of its brain, the answer is not a mammal. It is a fish — the mormyrid, or elephantnose fish, a denizen of murky African rivers — and its cerebellum is not merely large but enormous, a vast structure that balloons over and dwarfs the rest of the brain, far out of proportion to anything we see in ourselves. (Sharks, too, have strikingly large cerebella, for related reasons.) When one small lineage of animals invests in a structure to that wild degree, evolution is shouting that the structure is central to that animal’s way of life. So: what is the mormyrid’s way of life, and why does it need so much little brain? (A note in advance, so we do not overgeneralize: the computation we are about to follow is carried out in the fish in a cerebellum-like structure — a sheet of tissue built from much the same circuit as the true cerebellum, but not the cerebellum proper. The fish proves what this kind of circuit can do; whether the mammalian cerebellum does precisely the same thing is a further question we will keep separate.)
It is an electric fish. It generates a weak electric field around its body, by discharging a specialized organ, and that field spreads out into the surrounding water. The fish then senses the field with electroreceptors studded across its skin. Why? Because it lives in dark, muddy water where eyes are nearly useless, and the electric field is its way of seeing. Objects in the water distort the field — a rock, a plant, a smaller fish, anything whose electrical properties differ from water — and by reading the distortions across its skin, the fish detects and locates what it cannot see. A conducting object concentrates the field lines; a non-conducting one spreads them; and from the pattern of disturbance, the fish builds a picture of its surroundings. It is electrolocation, the electrical cousin of the bat’s echolocation, and the mormyrid is exquisitely good at it.
Now here is the problem, and it is precisely the problem of prediction. The fish is generating the very field it is trying to read. Every time it discharges its electric organ, it floods its own electroreceptors with a huge, self-caused signal — the field it just produced. Buried somewhere in that self-generated wash is the tiny distortion caused by a nearby object, the signal it actually cares about. How does it find the small meaningful disturbance hidden inside the enormous, self-inflicted, predictable signal? If it simply read its electroreceptors raw, it would be deafened by its own voice; the prey’s faint signature would be lost in the roar of the fish’s own discharge.
The solution is the forward model, made flesh. In a cerebellum-like structure, a great folded sheet of tissue with the same kind of circuit as the cerebellum proper, the fish builds a prediction of its own electrosensory input: given that I just produced this discharge, here is what my electroreceptors should report if nothing is out there. It then subtracts that prediction from what the receptors actually report. If the world is empty, the prediction matches the input exactly, the subtraction yields nothing, and the circuit’s output falls silent even though the receptors themselves are still reporting the discharge: the fish has cancelled out its own voice. But if an object is present, the real input no longer matches the prediction; the object’s distortion is exactly the part the model did not forecast, and so it survives the subtraction and stands out cleanly against the silence. By predicting its own self-caused sensations and removing them, the fish is left with precisely the part of the world it did not cause: the news, the prey, the thing worth knowing.
This is worth pausing on, because we have seen its shadow before and will see it again. The cerebellum-like circuit literally constructs what neuroscientists call a negative image — an internal copy of the expected self-caused input, opposite in sign, so that adding it to the real input cancels the self-caused part away. And the mechanism that builds the negative image is learning: the fish’s circuit gradually shapes its prediction, through experience, until it precisely matches and cancels the reafferent signal. Disrupt the field artificially and the negative image, over many discharges, re-forms to fit the new circumstance. Here, in a fish, is a cerebellum-type machine doing exactly what we proposed the cerebellum does — predicting the sensory consequences of its own action, comparing the prediction against reality, and learning from the mismatch — and doing it transparently enough that we can record the prediction itself. The forward model is not only an engineer’s metaphor. It is something a brain demonstrably builds.
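The arithmetic of the negative image is simple enough to sketch in Python. This is an illustration under loud assumptions, not the fish's actual circuit: the function names, the made-up signal values, and the plain error-driven update rule are all invented here, and the rule merely stands in for whatever synaptic learning the real tissue uses. Each list holds the input in a few successive moments after one discharge.

```python
def learn_negative_image(self_signal, n_discharges=200, rate=0.1):
    """Shape a negative image that cancels a repeated self-caused input.

    Hypothetical rule: after each discharge, shift the image a little in
    whatever direction pushes the circuit's output toward zero.
    """
    negative_image = [0.0] * len(self_signal)
    for _ in range(n_discharges):
        output = [s + n for s, n in zip(self_signal, negative_image)]
        negative_image = [n - rate * o for n, o in zip(negative_image, output)]
    return negative_image


def respond(sensory_input, negative_image):
    """Circuit output: the real input plus the learned negative image."""
    return [s + n for s, n in zip(sensory_input, negative_image)]


own_discharge = [5.0, 3.0, 1.0, 0.5]    # large, predictable, self-caused
object_distortion = [0.0, 0.2, 0.1, 0.0]  # small, unpredicted, caused by prey

image = learn_negative_image(own_discharge)
empty_water = respond(own_discharge, image)
with_prey = respond(
    [s + d for s, d in zip(own_discharge, object_distortion)], image
)
print("empty water:", [round(x, 6) for x in empty_water])
print("with prey:  ", [round(x, 6) for x in with_prey])
```

In empty water the output is essentially nothing; with an object present, what survives is the object's distortion and little else. Learning is frozen during the test here, which is a simplification. Call learn_negative_image again with a different self-caused signal and the image re-forms to fit it, as the text describes for an artificially disrupted field.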
The electric fish’s trick — predict your own self-caused sensations and subtract them — is a special case of one of the most general principles in all of neuroscience, and it has two names worth untangling, because you will meet both.
The broad principle is corollary discharge: whenever the brain issues a motor command, it also sends a copy of that command to its own sensory systems, warning them what is about to happen so they can distinguish the consequences of the body’s own actions from genuine changes in the world. The narrower, more specific term is efference copy — strictly, a copy of the outgoing motor command (“efference” meaning outgoing), used in the context of motor control to predict the movement’s sensory effects. The terms are often used interchangeably, and the historical distinction between them is finer than it is useful; for our purposes, treat corollary discharge as the general idea and efference copy as its concrete instance in the motor system. The point is the same: the brain tells itself what it is about to do.
You can feel this mechanism work, and feel it fail, with your own eyes. Look around the room, moving your eyes from object to object. The world holds still, even though the image is racing across your retina with every flick of your gaze — a torrent of retinal motion you do not perceive as motion at all. Corollary discharge is a major part of why: each time your brain commands an eye movement, it sends the visual system a signal saying, in effect, I am about to move the eyes this much; the coming smear of the image is my own doing; treat it as self-generated, not as the world moving. (It is not the whole story — the brain also briefly suppresses vision during the fastest eye movements and updates its spatial map around each one — but the corollary discharge is what lets the visual system interpret the retinal motion as caused by the eyes rather than by the world.) Now defeat it: close one eye and gently push the open eyeball from the side with a fingertip, through the lid. The image moves across your retina much as it does in a normal eye movement — but this time you did not command the movement, so there is no corollary discharge to account for it, and the world appears to jump and swim. Same retinal motion; opposite experience; the difference is whether a prediction of the movement was issued. The held-still world of normal vision is partly an achievement of prediction, kin to the achievement the electric fish performs on its electric field and the cerebellum performs on the moving body. It is also, in part, why you cannot tickle yourself: the cerebellum predicts the sensory consequences of your own touch and discounts them, so a self-generated tickle is sharply attenuated, while another person’s identical touch — unpredicted — lands at full force.
Opening the machine
We have circled the function long enough; let us open the machine and see how it is built, because the wiring turns out to map with uncanny directness onto the computation we have proposed. I will keep the cellular detail disciplined — the box diagrams accompanying this chapter carry the full picture — but a few features of the circuit are too important, and too revealing, to leave in the figures.
Begin with what goes in. Among the many inputs the cerebellum receives, two streams matter most for our purposes, and the distinction between them is the key to the whole idea.
The first stream carries a copy of the cortical command. As the cortex’s commands descend toward the muscles, they throw off branches (collaterals) along the way, and a major one synapses in a cluster of cells in the brainstem called the pontine nuclei, which relay it onward into the cerebellum. (The pontine relay actually carries a broad swath of cortical activity, not the motor command alone — but the motor-relevant part of it is what concerns us here.) This is efference copy arriving in the flesh: the cerebellum is told what the rest of the brain has just ordered, at the moment the order goes out. It learns the plan as the plan is issued.
The second stream carries the state of the body: chiefly the proprioceptive reports from the muscles and joints, the readings from the spindles and tendon organs that say where every limb is and what force it bears. This information ascends through dedicated pathways, the spinocerebellar tracts, running up the same side of the cord (the cerebellum, as we will see, is the great exception to the brain’s crossed organization) and pouring into the cerebellum a continuous account of the body’s actual configuration. Importantly, that account covers not only the body’s sensory state but also the ongoing activity of the spinal motor circuits themselves, so that the cerebellum is informed about what the cord is doing as well as what the limbs are feeling. So the cerebellum holds, at every instant, two things at once: what the brain intended (the efference copy) and what the body and cord are actually doing (the ascending state). These are precisely the two ingredients a forward model requires: the command whose consequences are to be predicted, and the actual state against which the prediction is to be checked. The anatomy delivers what the theory orders.
Both streams enter the cerebellar cortex as mossy fibers, and here we meet the cell that explains the staggering neuron count. The mossy fibers synapse onto granule cells — the most numerous neuron in the entire brain, tiny and packed in their billions, which is why so large a share of your neurons live in this small structure. Each granule cell sends its axon up toward the surface of the cerebellar sheet, where it splits and runs both ways as a long, straight parallel fiber — so called because these fibers run rigorously parallel to one another, like the wires of a vast loom, threading horizontally through the structure for millimeters.
And across those parallel fibers, at right angles, stand the Purkinje cells: the principal neurons of the cerebellar cortex, the cells that carry the cortex’s entire output, and among the most magnificent objects in the body. A Purkinje cell’s dendritic tree is enormous, bearing on the order of a hundred thousand synaptic inputs, which makes it one of the most heavily connected cells we possess. And it has a peculiar geometry that matters: the great fan of dendrites is nearly flat, splayed out in a single plane like a hand pressed against glass, almost two-dimensional. The parallel fibers run perpendicular to that plane, so that each fiber pierces the flat fan of a great many Purkinje cells in a row, like a wire threaded through a stack of cards, while each Purkinje cell is crossed by a vast number of parallel fibers, gathering input from an immense swath of granule cells. The architecture is built for convergence: huge numbers of inputs funnelling onto each output cell. And that output is compressed: the colossal dendritic tree feeds a single thin axon leaving the bottom, a great deal coming in and one channel going out. That single axon is not quiet; the Purkinje cell fires continuously, at high rates, and what it transmits is the moment-to-moment modulation of that firing (a steady signal turned up and down) rather than occasional sparse pulses. (One crucial fact, easy to forget: the Purkinje cell’s output is inhibitory. It releases GABA. The cerebellar cortex’s principal neuron does not excite its targets; it suppresses them, and exerts its control by modulating how much it suppresses, a theme it shares, as we will see, with the basal ganglia.)
Here is the point the careful reader should hold onto, because it is a common confusion: the Purkinje cell is the output of the cerebellar cortex, not of the cerebellum as a whole. The output of the whole structure comes from the deep cerebellar nuclei, buried in its core, and these nuclei are not merely relays. They receive the inhibitory output of the Purkinje cells, but they also receive excitatory collaterals directly from the incoming mossy and climbing fibers, and they combine the two. The Purkinje cells, in other words, sculpt by inhibition a signal that the deep nuclei are independently driven to produce; the cortex shapes the nuclear output rather than being it. (One territory is an exception worth noting: the oldest, vestibular part of the cerebellum sends much of its Purkinje output not to a deep nucleus but straight to the vestibular nuclei of the brainstem, which serve as its output stage instead.)
From the deep nuclei (and the vestibular nuclei), the cerebellum’s influence divides. A major stream ascends, by way of the motor thalamus, back up to the cerebral cortex — closing a loop, cortex to cerebellum to cortex, through which the cerebellum can reach forward and modify the motor plan before and during its execution. Another stream descends to the brainstem — to the vestibular and reticular nuclei (the medial, postural systems of two chapters ago) and the red nucleus — through which the cerebellum can adjust posture and movement on the fly. The loop is thus complete: the cerebellum takes in the plan and the body’s state, computes something, and returns the result to the structures that command movement, above and below — never touching muscle itself, always working through others. It is a co-processor, off to the side of the main line, consulted continuously, returning its corrections to the systems that act.
Now watch the computation fall out of the wiring — carefully, because here it is easy to overclaim. On the flat dendritic fan of each Purkinje cell, the parallel fibers deliver an enormous, high-dimensional description of the current situation: the command, the state of the body and cord, the recent context, all recoded through the granule-cell layer into a vast population of parallel-fiber signals. It would be too much to say that any single Purkinje cell’s firing simply is “the prediction,” read off in some transparent code. The honest statement is at the level of the circuit: the granule-cell/parallel-fiber system supplies a rich, high-dimensional representation of context-and-command, and the Purkinje cells learn a mapping from that representation to an inhibitory signal that, combined at the deep nuclei with the direct excitatory drive, helps generate the appropriate prediction or correction. The cerebellar circuit is, in this picture, a machine for taking a huge description of the present and computing from it something useful about the immediate future — which is what a forward model does. The flat-fan-and-parallel-fiber geometry is not an architectural curiosity; it is the physical form of an operation that combines a vast number of signals into a single learned output. As this book keeps insisting, the architecture does not implement the computation as a separate step laid on top of the tissue. The architecture is the computation.
But a forecasting machine is only useful if it can be corrected when it forecasts wrong — and that requires a second, entirely separate input, the one we have not yet met.
The teaching signal: how the machine learns
A predicting machine that could not learn from its mistakes would be worthless, because the body it models keeps changing — limbs grow, loads vary, the deck pitches — and any fixed model would drift into error. So there must be a way to tell the Purkinje cell you got it wrong, and to adjust it so that next time it gets it right. There is, and it is one of the most distinctive arrangements in the brain.
It comes from a small structure in the brainstem called the inferior olive, and its axons reach the cerebellum as climbing fibers — and the climbing fiber is unlike any other input in the nervous system. Where a Purkinje cell receives signals from a vast number of parallel fibers, each making a single modest synapse, it is dominated, in the mature circuit, by a single climbing fiber: one inferior olive cell’s axon wraps around the Purkinje cell’s soma and proximal dendrites and makes hundreds of powerful contacts, clutching the cell in a grip (the name is apt — it climbs the dendritic tree like ivy up a trellis). The relationship is not strictly one-to-one in the other direction: a single olivary axon branches to climb several Purkinje cells, a few up to around ten, so that a small cluster of cells shares one teacher. But each Purkinje cell answers to essentially one climbing fiber. When that fiber fires, it does not nudge the Purkinje cell; it seizes it, driving a massive, unmistakable electrical event — a complex spike. One parallel fiber is a whisper among thousands; the climbing fiber is a hand on the shoulder. And it fires rarely — around once a second, against the continuous high-rate firing the parallel fibers drive — so its occasional, overwhelming signal stands out as categorically different from the ordinary business of the cell.
What is that signal? The classical and still-central interpretation — the one to understand first — is that the climbing fiber carries an error signal, a teaching signal in the language of learning. In this account the inferior olive monitors whether movements are going wrong — whether the actual consequences are deviating from what was predicted — and when they do, it fires the climbing fiber to the responsible Purkinje cells, delivering a terse, powerful verdict: that was an error. And the climbing fiber’s job is not merely to announce the error but to change the cell so the error is not repeated — to teach.
Here the architecture pays off in the most elegant way — in the classical theory, at least. Recall that the situation at the moment of movement is represented across the Purkinje cell by the pattern of parallel fibers active at that instant — a particular subset of the loom lit up, encoding “this plan, this state.” If that combination led to an error, then those particular parallel-fiber synapses are the ones implicated in the wrong prediction, and they are the ones that should change, so that the same situation next time produces a different, corrected output. And that is what the climbing fiber is held to do. When it fires its error signal while a set of parallel-fiber synapses is active, those active synapses are weakened — a lasting reduction in their strength, the phenomenon called long-term depression (LTD), achieved in part by withdrawing receptors from the synapse so that the connection is literally pared back. The climbing fiber, in this picture, erases the part of the prediction that led to the error: it finds the active synapses and turns them down, so that the faulty forecast gives way to a better one. Over many repetitions the model is reshaped until its predictions match reality and the errors stop. This is the cerebellum learning — and it is the same logic the electric fish used to build its negative image, the same logic by which any forward model must update: drive the prediction, compare against reality, and adjust in the direction that reduces the mismatch.
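The classical rule just described can be written as a toy. The sketch below is illustrative only and rests on loud assumptions: parallel fibers are either active or silent, the climbing fiber is a binary signal, "error" is stood in for by the Purkinje output exceeding a target value, and the only plasticity is depression of the active synapses. The names `purkinje_output` and `train_with_ltd`, the learning rate, and the target are all hypothetical.

```python
def purkinje_output(weights, active):
    """Summed drive from the parallel fibers active in the current context."""
    return sum(w for w, a in zip(weights, active) if a)


def train_with_ltd(weights, active, target, rate=0.05, trials=200):
    """Weaken the active synapses each time the climbing fiber reports an error."""
    weights = list(weights)  # leave the caller's list untouched
    for _ in range(trials):
        # Assumption: the climbing fiber fires (binary) while output is above target.
        climbing_fiber_fires = purkinje_output(weights, active) > target
        if climbing_fiber_fires:
            # LTD: only synapses active together with the climbing fiber are depressed.
            weights = [w * (1 - rate) if a else w for w, a in zip(weights, active)]
    return weights


context = [1, 1, 0, 1, 0]   # which parallel fibers encode "this plan, this state"
before = [1.0] * 5
after = train_with_ltd(before, context, target=1.5)

print(purkinje_output(before, context))  # 3.0
print(after[2], after[4])                # 1.0 1.0 (inactive synapses untouched)
```

Once the output for this context falls to the target, the climbing fiber falls silent and the weights stop changing, while synapses that were not active during the error are never altered.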
Two cautions belong here in the main text, not buried below, because the clean story above is exactly the kind of tidiness this structure keeps inviting and then complicating. First, the learning is almost certainly not stored at that one synapse alone. Modern work keeps the climbing-fiber-driven mechanism but distributes the plasticity across many sites — other synapses within the cerebellar cortex, the inhibitory interneurons, and crucially the deep cerebellar nuclei and brainstem downstream — so that “cerebellar learning equals parallel-fiber LTD” is too small an answer, capturing one mechanism within a multi-site system. Second, the teaching signal itself is probably richer than a binary “you erred”: closer recording suggests the climbing fiber carries graded information about how much and even which way the error ran, more an instructor than an alarm bell. We develop both points in the fold; carry them forward now so that the elegant version does not harden into the whole truth.
You will recognize, with perhaps a groan, that we are back in the territory of synaptic plasticity from the learning unit — long-term depression here doing the work that long-term potentiation did elsewhere, weakening rather than strengthening, but to the same end: an experience leaves a lasting trace in the strength of a synapse, and behavior changes as a result. The cerebellum is a learning machine, and the climbing fiber is its teacher.
This account — command and state in by the mossy fibers and granule cells, a learned mapping computed across the Purkinje cells, error taught by the climbing fiber — is one of the genuine triumphs of theoretical neuroscience. It was sketched, with remarkable foresight, by David Marr and James Albus around 1970, before much of the supporting evidence existed, and developed by Masao Ito, who identified parallel-fiber LTD as a concrete mechanism; it is sometimes called the Marr–Albus–Ito theory. It is beautiful, mechanistically concrete, and the right thing to understand first. The fold below records, for the interested reader, exactly how it has been amended — the substance of which we have already carried up into the text above.
The main text has already flagged that the classical account is incomplete; here is the substance, for the interested reader, because the way a beautiful theory met a stubborn biology is itself a lesson.
What is solid. The architecture is not in doubt — the two principal input streams, the convergence onto Purkinje cells, the climbing fiber that dominates each Purkinje cell with a powerful, rare signal plausibly reporting error, and the reality of long-term depression at the parallel-fiber–Purkinje synapse driven by climbing-fiber activity. That the cerebellum is essential for the adaptive recalibration of movement is established beyond serious doubt by the paradigms in the next section.
Where it has needed repair. The strongest single piece of evidence against LTD-at-one-synapse being the whole story came from the laboratory of Chris De Zeeuw and colleagues: lines of genetically engineered mice in which parallel-fiber LTD is selectively blocked can nonetheless still learn motor tasks under a range of conditions. LTD at that synapse, however real, is therefore not strictly necessary for all cerebellar learning, and other forms of plasticity — including potentiation at the same synapse, and changes at the inhibitory interneurons and in the deep nuclei — must be contributing. The recalibration of the vestibulo-ocular reflex, discussed next, is now thought to begin in the cerebellar cortex and then be progressively transferred to and stored in the brainstem — learning that physically moves from one site to another over time, about as far from “a single synapse” as one could ask. None of this overturns Marr–Albus–Ito; it embeds that mechanism within a distributed, multi-site learning system.
And the teaching signal is richer than “error”. The clean story has the climbing fiber fire a binary verdict — erred, or did not. But closer recording shows its signal carrying graded information about the magnitude, and even the direction, of error, and reflecting predictions and expectations rather than only raw mistakes. The inferior olive looks less like a smoke alarm that simply goes off than like an instructor with something to say about how wrong, and which way — an active, unresolved area, and exactly the kind of place where the cerebellum’s legible hardware keeps tempting us toward clean theories the biology then complicates.
Watching the machine learn: two clean cases
Abstract claims about learning are worth little without cases where you can see the machine recalibrate itself, measure it, and break it. The cerebellum offers two especially clean ones, both outside the materials of the typical lecture but both standard and both decisive, and they happen to illustrate the two faces of what the cerebellum does.
The first is the recalibration of a reflex. The vestibulo-ocular reflex, or VOR, is the fast circuit that keeps your gaze steady when your head moves: turn your head left and your eyes roll right by exactly the compensating amount, so that the image of the world stays fixed on your retina. (You can watch its precision now: hold a finger up, fix your eyes on it, and shake your head; the finger stays sharp. Now hold your head still and move the finger back and forth at the same rate, following it with your eyes; it blurs. The reflex is faster and more accurate than voluntary tracking, because it is feedforward, driven directly by the head-motion signal from the inner ear.) But the reflex’s correct gain, meaning exactly how far the eyes must turn for a given head turn, is not fixed for life. It must be recalibrated as the eyes and their muscles grow, as the head changes size, and most dramatically when you put on a new pair of glasses, which magnify or shrink the visual world and so change how much the eyes must move to stabilize it. When the gain is wrong, the image slips on the retina during head movement, and that retinal slip is an error signal, carried (by the inferior olive, via climbing fibers) to the relevant cerebellar region, the flocculus of the old vestibulocerebellum, which uses it to adjust the reflex’s gain until the slip is gone. Wear magnifying lenses and over hours your VOR gain climbs to match; the cerebellum has been taught, by error, to recalibrate a reflex. Damage the relevant cerebellar region and this adaptation fails: the reflex is stuck at whatever gain it had, unable to learn. Here is the predicting-and-correcting machine caught in the act, recalibrating a sensorimotor relationship to keep it accurate as the body and world change, which is precisely the job we argued it evolved to do.
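The logic of gain recalibration can be sketched as a simple error-driven update. This is a minimal illustration under stated assumptions, not a model of the flocculus: velocities are in arbitrary units, the lenses are reduced to a single magnification factor, retinal slip is taken to be the difference between the eye velocity produced and the eye velocity required, and the update rule, learning rate, and the `flocculus_intact` switch are hypothetical conveniences.

```python
def adapt_vor_gain(gain, magnification, head_velocities, rate=0.1, flocculus_intact=True):
    """Nudge the reflex gain after each head movement in proportion to retinal slip."""
    for head_velocity in head_velocities:
        eye_velocity = -gain * head_velocity        # what the reflex produces
        required = -magnification * head_velocity   # what would hold the image still
        retinal_slip = eye_velocity - required      # equals (magnification - gain) * head_velocity
        if flocculus_intact:
            # Error-driven correction; with the flocculus damaged, no update occurs.
            gain += rate * retinal_slip * head_velocity
    return gain


head_turns = [1.0, -1.0] * 50   # alternating left and right head turns

adapted = adapt_vor_gain(1.0, magnification=2.0, head_velocities=head_turns)
lesioned = adapt_vor_gain(1.0, magnification=2.0, head_velocities=head_turns,
                          flocculus_intact=False)

print(round(adapted, 3))  # 2.0
print(lesioned)           # 1.0
```

With the update in place the gain climbs from 1.0 toward the value the lenses demand and the slip shrinks toward zero; with the update switched off the gain stays where it started, which is the pattern the lesion result describes.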
The second case reveals the other face: timing. In eyeblink conditioning, an animal hears a tone and then, a fixed fraction of a second later, gets a puff of air to the eye, which makes it blink. After many pairings the animal learns to blink to the tone alone — but the remarkable part is when it blinks. It does not blink as soon as it hears the tone. It blinks at precisely the moment the puff is due to arrive, timing the protective blink to peak exactly when it is needed, having learned the interval between tone and puff. This precisely timed learned response depends critically on the cerebellum: lesion the relevant cerebellar region and the animal cannot acquire the conditioned blink, or loses it if already learned. And the timing is the whole point — the cerebellum has learned not just that the tone predicts the puff but exactly how long until it lands, and has built a response calibrated to that interval. This is the cerebellum as a timing machine, learning the temporal structure of an association down to the fraction of a second.
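One way to see how a circuit of this kind could learn an interval is a toy in which different granule cells are active at different delays after the tone. Everything specific below is an assumption made for illustration: the Gaussian "delay-tuned" granule cells, the rule that the puff triggers one climbing-fiber event per trial, the depression-only plasticity, and the idea that the blink is driven by the learned drop in Purkinje inhibition. The function names and parameter values are hypothetical.

```python
import math


def granule_activity(center, t, width=5.0):
    """Hypothetical granule cell most active about `center` time steps after tone onset."""
    return math.exp(-((t - center) ** 2) / (2 * width ** 2))


def train_eyeblink(puff_time, n_steps=50, trials=100, rate=0.05):
    """Return the learned drop in Purkinje inhibition at each time step after the tone."""
    centers = range(0, n_steps, 5)
    weights = {c: 1.0 for c in centers}
    for _ in range(trials):
        # Assumption: the puff fires the climbing fiber once per trial, at puff_time,
        # and each synapse is depressed in proportion to its activity at that moment.
        for c in centers:
            weights[c] *= 1 - rate * granule_activity(c, puff_time)
    return [
        sum((1.0 - weights[c]) * granule_activity(c, t) for c in centers)
        for t in range(n_steps)
    ]


pause = train_eyeblink(puff_time=30)
blink_peak = max(range(len(pause)), key=pause.__getitem__)
print(blink_peak)  # 30
```

In this sketch the learned response peaks at the step where the puff arrived during training, not at tone onset, because the synapses weakened most are those that happened to be active when the error signal came. The example places the puff on one of the granule-cell centers; other intervals would be captured only approximately by so coarse a set of delays.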
These two cases together do something useful for us: they bracket the two interpretations of cerebellar function we have been circling, and force the question of how those interpretations relate.
Forward model or clock? — and a possible reconciliation
There are, broadly, two families of theory about what the cerebellum fundamentally computes, and we have now seen evidence for each. We should lay them out plainly, because the disagreement is real and the temptation to wave it away is strong.
The forward-model view is the one we have developed throughout: the cerebellum builds an internal model of the body and its world that predicts the sensory consequences of movements, so that control can run on prediction rather than delayed feedback, and the model is updated from error. The electric fish, the size–weight illusion, the deafferented patient, the VOR — all sit comfortably here. The cerebellum, on this view, is a simulator of the body.
The timing view holds that the cerebellum’s core competence is the precise representation of temporal intervals — that it is fundamentally a clock, learning and producing exactly-timed events, and that its role in movement is to supply the precise timing that coordination requires. Eyeblink conditioning is the flagship: a task that is about timing and nothing else, and that the cerebellum is essential for. Ataxia, on this view, is what happens when the timing fails — the overshoot because braking was mistimed, the decomposition because the seamless temporal blending of joints came apart, the dysarthria because the millisecond choreography of speech lost its clock.
These are often presented as rivals, and just as often the disagreement is dissolved with the phrase that they are “not mutually exclusive” — which is true but unsatisfying, a way of declining to think about how they fit. I want to suggest a more interesting possibility: that timing is not a separate function alongside prediction but is what prediction is made of, at least in the motor domain. Consider what it actually means to predict the sensory consequences of a movement. It does not mean only forecasting what you will feel; it means forecasting when you will feel it. A prediction that the box will reach a certain height, or that the finger will arrive at the nose, is useless unless it specifies the timing — the prediction and its clock are the same object, because a sensory consequence is an event at a moment. On this reading, a forward model that predicts the temporal course of a movement’s consequences is a timing device, and a timing device that learns exactly when an expected sensation should arrive is a minimal forward model. The eyeblink and the VOR are not evidence for two machines; they are the same predicting machine viewed through two tasks, one of which (the blink) isolates the when and the other of which (the VOR) isolates the what-and-how-much. I offer this as a hypothesis, not a settled reconciliation — but it seems to me more honest, and more in the spirit of this book, than filing the two views in separate drawers and calling them compatible. A machine that predicts the sensory future of a movement must, of necessity, be a clock.
Whichever framing one prefers, notice what does not change: the cerebellum is a device for getting the future of a movement right — its shape, its magnitude, its timing — and for learning to get it righter. Hold onto that, because we are now going to ask how far that single competence reaches.
The reach of one operation: the cognitive cerebellum
We come to the chapter’s hardest and most genuinely unsettled question, and it follows directly from the fact we started with — the monotony of the circuit. If the cerebellum is one operation stamped out across a vast uniform sheet, and if some of that sheet is wired to motor systems, then what is the rest of it wired to, and what is it doing there?
The answer to the first half is now clear, and it is striking. Large parts of the cerebellum — particularly the newest, most lateral regions, the cerebrocerebellum of our map, expanded substantially in primates — are connected not to motor structures at all but to the association cortex of the frontal and parietal lobes, the regions of highest cognition. The cerebellum’s loops with the cortex are not confined to motor areas; they reach into prefrontal cortex, into the parietal reaching regions, into language-related areas. And functional brain imaging confirms that these non-motor regions of the cerebellum activate during non-motor tasks: the cerebellum lights up during language, during spatial reasoning, during working memory, during tasks with no overt movement at all. The structure that Flourens’s pigeons showed us was for coordinating movement is, in its newest parts, wired into the machinery of thought.
This has led to one of the boldest ideas in modern systems neuroscience — and it is an idea that follows with real logic from the uniformity of the circuit. The proposal, associated especially with Jeremy Schmahmann, is that there is a universal cerebellar transform: a single operation the cerebellum performs everywhere, on whatever it is connected to, and that when this operation is applied to motor signals it smooths and calibrates movement, while when the very same operation is applied to cognitive signals it smooths and calibrates thought. The cerebellum, on this view, does for cognition exactly what it does for movement — it predicts, times, and corrects — so that just as it keeps a reaching arm from overshooting, it keeps a train of thought on track, well-timed, neither lagging nor racing, modulated around an appropriate baseline. And the prediction this makes is sharp and testable: damage the cognitive cerebellum and you should get, in the realm of thought, the analogue of ataxia — not an inability to think, but a clumsiness of thinking, a “dysmetria of thought” mirroring the dysmetria of movement. Schmahmann and colleagues argue that you do: cerebellar damage, particularly to the newer regions, can produce what they call the cerebellar cognitive affective syndrome — measurable difficulties with executive function and working memory, with visuospatial organization, with the smooth production of language, along with a flattening or dysregulation of emotion. Mental acts, on this account, lose their coordination the way physical acts do.
It is an elegant and seductive idea, and the logic from circuit-uniformity is genuinely compelling: if the machine is the same everywhere, it would be strange for it to do something utterly different in cognitive territory than in motor territory. I find the core of it persuasive. The cerebellum almost certainly applies its one operation to cognitive signals as it does to motor ones, and “dysmetria of thought” is a real and clinically documented phenomenon. The structure is not only motor, and the old view that confined it to movement was too small.
And yet — and this is where I want to be careful, because I think the idea is sometimes pushed harder than the evidence will bear — we have to weigh against the strong cognitive claim a fact we set aside earlier, and now must collect: the lesion evidence does not support the cerebellum being important to cognition in anything like the way the association cortex is important. This is the crux, and it deserves to be stated without hedging.
Consider the comparison directly. Destroy a person’s prefrontal cortex, or their parietal association areas, and cognition is devastated — the apraxias and agnosias and dysexecutive syndromes of the last chapters, profound and disabling failures of planning, recognition, and reasoning that no one could miss. Now damage a person’s cerebellum. The cognitive consequences are real: the cerebellar cognitive affective syndrome is a genuine clinical entity, and its impairments of executive control, language, visuospatial organization, and the regulation of affect can be meaningful and, especially with disease of the posterior lobe and vermis or in children after posterior-fossa tumor surgery, sometimes disabling. This is not nothing, and it would be wrong to wave it away. But set beside the wreckage of a large association-cortex lesion, the contrast in kind is unmistakable. Isolated cerebellar damage characteristically spares the basic capacity for thought in a way cortical damage does not: the patient still reasons, recognizes, plans, and understands, but does so less fluently, less precisely, less smoothly.
The agenesis cases sharpen the point, though they must be handled with care. People born with little or no cerebellum can be strikingly functional — ataxic and dysarthric, clumsy, but reasoning and conversing and living broadly ordinary lives. One must be cautious here: true total agenesis is exceedingly rare, many such cases carry residual cerebellar tissue or additional malformations of the brainstem and posterior brain, the documented cognitive and affective impairments are real and scale with the extent of tissue missing, and a brain that develops without a cerebellum has a lifetime to reorganize in ways an adult losing one suddenly does not. So congenital absence is not the same experiment as adult ablation, and it cannot be read as showing the cerebellum is unimportant to cognition. What it does show is subtler and, I think, more telling: that the cerebellum’s contribution is not the same kind of indispensable substrate that the cortex supplies, and that much of its function can, in development, be approximated or routed around. You can build a substantially working mind without a cerebellum. You cannot build one without association cortex.
That asymmetry is, I think, the single most important fact for calibrating the cognitive claim, and it points to a precise conclusion rather than a vague skepticism — and the conclusion is already written into the name of the syndrome. Dysmetria of thought: mis-measurement, imprecision, poor scaling — not abolition. The motor cerebellum does not generate movement; it refines movement that other structures generate, and its loss degrades the quality of movement without removing the capacity to move. The honest reading of the cognitive cerebellum is the exact parallel: it does not generate thought, it refines thought the forebrain generates, degrading the smoothness and timing of cognition without removing the capacity to think. It is a modulator of cognition, not its substrate — exactly what you would expect of a structure that, in its motor role, modulates rather than generates movement. The uniform circuit and the survivable lesion, which can look like opposing facts, are in truth a single answer: the uniformity tells us the cerebellum applies one operation everywhere, cognition included, which is why the cognitive claim is true at its core; the survivable lesion tells us what that operation is worth — a great deal for the polish of behavior, not so much that behavior cannot proceed without it. The predicting machine is precisely that: a device for prediction and calibration, applied with magnificent uniformity to movement and thought alike, indispensable for grace and dispensable for existence. To say it “does cognition” is the same category error as saying it “does movement.” It does neither. It makes both better. That is a more modest claim than “the cerebellum is an organ of cognition,” and a more defensible one.
A final fact keeps the question from being closed too comfortably in either direction. When we ask what makes the human brain special, the reflexive answer is the cerebral cortex and its great expansion. But this somewhat overlooks the cerebellum, which by comparative measures expanded disproportionately in the ape and human lineage, with its lateral and posterior regions — the very ones looped with association cortex — enlarging especially. Whatever the cerebellum does, recent evolution invested in more of it, in exactly the non-motor territory, which is part of why the cognitive question will not go away. And yet the agenesis cases sit beside that fact, stubbornly: if the cerebellum’s expansion mattered so to becoming human, how does a person born without one live a substantially normal cognitive life? I do not think anyone has fully reconciled the two. The most likely shape of the answer is that the cerebellum’s gift is fluency and efficiency rather than capacity — that a brain without it can still compute, but more laboriously, the way the deafferented patient can still move but must labor and end up clumsy — and that much of what it normally does can, given a developing lifetime, be approximated by the forebrain at a cost in smoothness we notice only on careful testing. Or perhaps we have simply not yet found the cognitive tasks that would reveal a deficit as florid as ataxia is for movement. The frontier is open. What I will commit to is the shape of whatever the answer turns out to be: in cognition as in movement, the cerebellum predicts, times, and refines; it does not generate the behavior. A textbook that told you the matter was settled, in either direction, would be selling a tidiness the field has not earned.
The exception that proves the architecture
Before we leave the cerebellum, one anatomical fact deserves to be lifted out, because it connects this chapter to a theme that has run through the whole unit and because it is the kind of detail a neurologist uses every day.
The motor system, we have said repeatedly, is crossed: the left hemisphere governs the right side of the body, the corticospinal tract decussates in the medulla, and a stroke on one side of the brain weakens the other side of the body. The cerebellum is the great exception. Its connections are arranged so that, on balance, each cerebellar hemisphere relates to the same side of the body — ipsilateral control, the mirror image of the rule for the cerebral cortex. (The wiring achieves this with a double crossing that need not detain us; the upshot is what matters.) The practical consequence is immediate and is exactly how the clinical signs present: damage to the left cerebellum produces ataxia of the left arm and leg — the patient’s left finger that cannot find the nose, the left-sided stagger — whereas damage to the left cerebral cortex produces weakness on the right. Side of the body, kind of deficit, and which structure: the three together let a clinician localize a lesion from the bedside. A patient who is weak on one side has a problem on the opposite side of the cerebrum; a patient who is ataxic on one side has a problem on the same side of the cerebellum. The crossed cortex and the uncrossed cerebellum, read off the side of the failing limb.
It is a small thing, but it is the architecture made diagnostically visible once more — the recurring promise of this unit that the kind and side of a deficit reveal the level and side of the damage, because the system is built in layers and the layers are wired in lawful ways. (One qualification keeps the rule from being over-applied: the lateralized limb pattern holds for the cerebellar hemispheres. Damage to the midline vermis is less cleanly left-or-right and tends instead to produce truncal and gait ataxia — an unsteadiness of the body’s core rather than of one arm — fitting the vermis’s role in the axial, postural part of the body map.) The cerebellum, even in being the exception, proves the rule.
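For readers who like a rule written out as a lookup, the bedside logic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb as stated in this section, not a clinical tool: the function name `localize` and the deficit labels are hypothetical, chosen here for the example, and real localization weighs far more than two inputs.

```python
from typing import Tuple

# Hypothetical helper: encodes only the rule of thumb from the text above.
OPPOSITE = {"left": "right", "right": "left"}

def localize(deficit: str, side: str = "") -> Tuple[str, str]:
    """Map a deficit and the side of the body it appears on to a likely lesion site.

    Returns (structure, side_of_lesion).
    """
    if deficit == "truncal ataxia":
        # Midline vermis: unsteadiness of the trunk and gait, not cleanly left or right.
        return ("cerebellar vermis", "midline")
    if side not in OPPOSITE:
        raise ValueError("side must be 'left' or 'right' for a limb deficit")
    if deficit == "weakness":
        # Crossed system: the lesion is on the opposite side of the cerebrum.
        return ("cerebrum", OPPOSITE[side])
    if deficit == "limb ataxia":
        # Uncrossed system: the lesion is in the cerebellar hemisphere on the same side.
        return ("cerebellar hemisphere", side)
    raise ValueError("unknown deficit label: " + deficit)

print(localize("limb ataxia", "left"))
print(localize("weakness", "right"))
```

The two calls at the end reproduce the contrast in the text: left-sided ataxia points to the left cerebellar hemisphere, while right-sided weakness points to the left cerebrum.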
Where we have come
We began with a structure that holds most of the brain’s neurons in a tenth of its volume, built from a single circuit stamped out with a uniformity found nowhere else — a structure that has drawn the engineers because its monotonous, perfectly mapped hardware seems to promise a single discoverable function. We end having found that function, as nearly as the field has, and having found also the limits of how far it reaches.
The cerebellum is the brain’s predicting machine. It receives a copy of the motor command as it is issued and the body’s state as it unfolds, and from these its circuit learns a mapping that forecasts the consequences of movement before feedback could report them — solving the two problems that make pure feedback control impossible, the slowness of the senses and the variability of the body. It corrects movements against that forecast, and it learns, taught by the climbing fiber’s error signal — across several sites, not one — to reshape its predictions until they match the world. We watched the same kind of machine, transparently, in an electric fish cancelling its own electric field to find the prey hidden inside it; we watched the cerebellum recalibrate a reflex to new glasses and time a learned blink to the millisecond. And we found, in its lesions, the signature of exactly this function and no other — not paralysis but ataxia, the overshooting, oscillating, mistimed, decomposing movement of a body whose predictions have failed, the same picture a few drinks will produce in any of us by dosing the very same machine.
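The loop just summarized (command copy in, forecast out, error back, prediction reshaped) can be made concrete with a toy sketch. What follows is a standard delta-rule (least-mean-squares) learner from engineering, offered as an analogy and not as a claim about what the cerebellar circuit actually computes. Everything in it is an assumption made for illustration: the one-dimensional "body", its constants `a_true` and `b_true`, the learning rate, and the function name `train_forward_model`.

```python
import random

def train_forward_model(steps=2000, lr=0.05, seed=0):
    """Learn to forecast the next body state from the current state and a command copy."""
    rng = random.Random(seed)
    a_true, b_true = 0.9, 0.5    # hypothetical body dynamics, unknown to the learner
    w_state, w_cmd = 0.0, 0.0    # the model's adjustable weights, initially naive
    x = 0.0                      # current body state
    errors = []
    for _ in range(steps):
        u = rng.uniform(-1.0, 1.0)            # copy of the outgoing motor command
        predicted = w_state * x + w_cmd * u   # forecast, made before any feedback
        x_next = a_true * x + b_true * u      # what the body actually does
        err = x_next - predicted              # error signal: outcome minus forecast
        w_state += lr * err * x               # delta rule: nudge each weight in
        w_cmd += lr * err * u                 # proportion to its input and the error
        errors.append(abs(err))
        x = x_next
    return (w_state, w_cmd), errors

(w_state, w_cmd), errors = train_forward_model()
print("learned weights:", round(w_state, 3), round(w_cmd, 3))
print("mean error, first 100 steps:", sum(errors[:100]) / 100)
print("mean error, last 100 steps:", sum(errors[-100:]) / 100)
```

In this noise-free toy the learned weights settle on the body's true constants and the forecast error shrinks toward zero, which is the sense in which a prediction is "reshaped until it matches the world." A real body is noisy and changes over time, so a real learner would never finish; that is part of why the recalibration described in this chapter has to be continuous.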
We found, too, that this one operation reaches beyond movement — that the newest and fastest-growing parts of the cerebellum are wired into the cortex of thought, and apply there, almost certainly, the same predicting-and-calibrating operation they apply to the limbs. But we insisted on the discipline the evidence demands: that a person can lose the entire structure and still walk, still speak, still think and live, marks the cerebellum as a refiner of behavior and not its source — indispensable for grace, dispensable for existence — and that the right reading of its role in cognition is the exact parallel of its role in movement. It does not generate thought any more than it generates movement. It predicts, it times, it corrects. It makes behavior smooth. That it does so for thought as well as for action is the deepest and least settled thing about it, and it is, fittingly, where the frontier now lies.
There remains one more great subcortical system alongside the motor pathway — not a copy of the cerebellum, and not built on its plan, but the cerebellum’s counterpart in a looser and more interesting sense: another stereotyped subcortical architecture that shapes cortical action from the side, through an inhibitory principal neuron and loops that run through the thalamus, never touching muscle directly. Its circuit and its computation are its own, quite unlike the cerebellum’s. And where the cerebellum predicts and calibrates the movement you are making — getting its shape and timing right — this other system addresses a different problem entirely: deciding which movement to make at all, releasing the action you want and suppressing the multitude you do not. Having learned how the brain makes a movement accurate, we turn to how it chooses the movement in the first place. We turn to the basal ganglia.
As at the end of each chapter, it is worth separating the settled from the frontier.
What is well established. The cerebellum contains most of the brain’s neurons, built from a strikingly conserved cortical microcircuit repeated across its surface (embedded, though, in organized input-output loops and molecular zones — the circuit is uniform, the traffic is not). It does not project to muscle, and its damage does not cause paralysis; instead it causes ataxia — incoordination, dysmetria, intention tremor, dysdiadochokinesia, dysarthria — a degradation of the smoothness, accuracy, and timing of movement that is strong but uncoordinated. Among its inputs, two streams are especially important: a copy of the cortical command via the pontine nuclei, and the body’s proprioceptive and spinal state via the spinocerebellar tracts. These converge, through the granule-cell and parallel-fiber system, onto the Purkinje cells — the inhibitory output of the cerebellar cortex, which sculpts the activity of the deep cerebellar nuclei (the output of the cerebellum as a whole), and thence reaches the motor thalamus and cortex and the brainstem motor systems. A distinct climbing-fiber input from the inferior olive drives learning. The cerebellum is essential for the adaptive recalibration of movement, demonstrated cleanly by VOR adaptation and eyeblink conditioning. Its hemispheric connections are predominantly ipsilateral, the exception to the crossed motor system. And it is functionally engaged in non-motor tasks, with its newest regions connected to association cortex.
What remains contested or unsettled. Whether the cerebellum’s core computation is best described as a forward model, a timing device, or some unification of the two is not resolved; the forward-model language, useful as it is, is best held as a powerful framework rather than an established fact, and the electric-fish demonstration proves a cerebellum-like circuit can build such a model, not that the mammalian cerebellum is one in the strong sense. The classical Marr–Albus–Ito learning theory — a climbing-fiber error signal driving LTD at the parallel-fiber–Purkinje synapse — captures one real mechanism but is incomplete: plasticity is distributed across multiple cerebellar and brainstem sites, parallel-fiber LTD is not strictly necessary for all cerebellar learning, and the climbing-fiber signal appears to carry graded, predictive information rather than a binary error. Most open of all is the cognitive cerebellum: that it applies its operation to cognitive as well as motor signals is plausible and supported by connectivity and imaging, and cerebellar damage can produce real, sometimes disabling cognitive and affective impairment (the cerebellar cognitive affective syndrome). But the strong claim that it is an organ of cognition comparable to association cortex sits uneasily against the comparatively preserved basic cognition after cerebellar loss — including substantially functional lives in cerebellar agenesis, with the caveats that true total agenesis is rare and congenital absence permits developmental compensation. The best-supported reading is that the cerebellum modulates and refines cognition rather than generating it, exactly as it refines rather than generates movement. How a structure that expanded disproportionately in human evolution can be so survivable in its absence is not understood.