The Nearest Agent

Modeling other minds, and one’s own

The shuffle on the path

Two people approach each other on a narrow path. Each sees that the other must pass. Each begins, slightly, to step aside — and they step to the same side, correct, step again to the same side, and stop, caught in the small foolish shuffle that everyone has performed and no one enjoys. They laugh, or they do not, and finally one of them holds still long enough for the other to commit.

Nothing went wrong with anyone’s eyes. Both people saw the other clearly the entire time. What produced the shuffle was not a failure to perceive a moving body. It was a collision between two ongoing attempts to predict — each person choosing a side by estimating which side the other would take, and revising as the other revised. There are two ways to read what happened, and the difference between them will organize this chapter. Perhaps each person was simply reacting to the other’s latest movement: two coupled controllers chasing each other a half-step out of phase, with no one representing anything about anyone’s mind. Or perhaps each was anticipating the other’s choice — modeling not just the other’s body but the other’s forecast, including the other’s forecast of them. The shuffle alone cannot tell us which. That ambiguity is the doorway.

The shuffle is trivial, which is why it is a good place to start: it raises the question cheaply. The question matters when the stakes are higher and mere reaction plainly will not do. A stranger’s hand drifts toward his pocket, and whether you tense or relax depends not on the hand — the hand has done nothing yet — but on whether he has noticed that you noticed him. You stay silent in a meeting because you can hear, in advance, how the one person whose opinion will decide the matter is going to take what you were about to say. A lie succeeds only if its teller can hold, accurately, a picture of what the listener will come to believe. In each of these the most important object in the situation is another brain, and what is needed from it is not its present state but its next state — and its next state may turn on the model it is building of you.

This is the destination toward which the whole book has been moving. We began by asking why an animal should have a brain at all, and answered with movement and control: a nervous system links sensation to action so a body can stay alive in a variable world. Each unit widened that loop. Brains regulate internal resources, select among competing actions, learn from consequences, and act before a disturbance becomes fatal. Then the loop ran outside the skin: bodies alter their environments, and the altered environments alter the bodies. Then the most consequential part of a social animal’s environment turned out to be the other animals — a niche made of agents. The previous chapter argued that language is the high-bandwidth channel through which such a niche moves knowledge, and it ended on a promissory note. To use language flexibly — above all, to tailor an utterance to a particular listener — a speaker must estimate that listener’s mind. This chapter takes up that estimation directly.

The hardest object in the human niche is another control system, and a brain that must survive among such objects has to model them.

That is the brains-first claim of this final chapter. Theory of mind, perspective-taking, empathy, the sense of self — these are not late faculties bolted onto a finished brain, decorations on top of the machinery for movement and regulation. They are what the control system looks like when it is aimed at the one kind of object in its world that is also a control system, predicting and acting in return. And the last of those objects, we will find, is the modeler itself.

A mind is a hidden cause

Start with the problem in its bare form, because the bare form connects it to everything the book has already said about brains.

A brain never has direct access to the world. It has access to the activity of its own receptors, from which it must infer the causes. The visual system does not receive a scene; it receives a changing two-dimensional pattern of light on the retina and reconstructs the three-dimensional arrangement of surfaces and objects most likely to have produced it. We have used this framing throughout: perception is the estimation of hidden causes from incomplete sensory evidence, and action is the control of those causes through the body. A nervous system is, in this sense, a machine for inferring what it cannot see from what it can.

A mind is the limiting case of a hidden cause. You cannot see a goal, a belief, an intention, or a state of knowledge. You can see only behavior — a reach, a glance, a word, a hesitation — and from that behavior you infer the unobservable state that produced it. The inference is exactly the same kind the visual system performs when it recovers a solid object from a flat image, except that the hidden cause is not a surface in space but a configuration of goals and beliefs in another head. Reading a mind is perceiving an object that is never directly given to the senses.

It is worth being precise about the computation, because two different directions are folded into it. To read a mind from behavior is to solve an inverse problem: given this reach, this glance, this hesitation, what configuration of goals and beliefs best explains it? To then predict what the person will do is to run the inference forward: given those inferred states, what behavior do they generate next? Vision has the same two directions — inferring the solid object that produced an image is not the same operation as predicting the image a known object would produce — and the distinction matters here for the same reason it matters there. Mentalizing is the pairing of the two: an inverse inference to the hidden state, and a generative prediction from it. Much of what follows concerns when each direction is needed and how good it can be.
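
The two directions can be made concrete with a toy sketch. It assumes, purely for illustration, that the hidden state is one of two goals and that the observer holds a small generative model of how each goal produces glances and movements. The goal names, the actions, and every probability below are invented for the example, not taken from any study. The inverse step is Bayes' rule over goals, and the forward step averages the generative model over the inferred goals.

```python
# Toy sketch only: all names and probabilities are hypothetical.

# Prior belief about the agent's hidden goal.
prior = {"wants_coffee": 0.5, "wants_exit": 0.5}

# Generative model of cues: P(observed cue | goal).
cue_model = {
    "wants_coffee": {"glance_at_counter": 0.7, "glance_at_door": 0.3},
    "wants_exit": {"glance_at_counter": 0.2, "glance_at_door": 0.8},
}

# Generative model of what comes next: P(next action | goal).
next_action_model = {
    "wants_coffee": {"walk_to_counter": 0.9, "walk_to_door": 0.1},
    "wants_exit": {"walk_to_counter": 0.1, "walk_to_door": 0.9},
}


def infer_goal(observed_cue):
    """Inverse step: posterior over hidden goals given one observed cue."""
    unnormalized = {g: prior[g] * cue_model[g][observed_cue] for g in prior}
    total = sum(unnormalized.values())
    return {g: p / total for g, p in unnormalized.items()}


def predict_next(posterior):
    """Forward step: distribution over the next action, weighted by the posterior."""
    actions = ["walk_to_counter", "walk_to_door"]
    return {
        a: sum(posterior[g] * next_action_model[g][a] for g in posterior)
        for a in actions
    }


posterior = infer_goal("glance_at_counter")
prediction = predict_next(posterior)
print(posterior)
print(prediction)
```

The design point is only that the same generative model is used twice: once inverted to reach the hidden state, and once run forward from it.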

Putting it this way dissolves a temptation that has distorted the field for decades: the temptation to treat “social cognition” as a separate magisterium, a special human power that needs its own kind of explanation. It needs nothing of the sort. It is the brain doing what brains do — estimating hidden causes — on a particularly demanding class of cause. What makes the class demanding is not that minds are immaterial. It is that minds are themselves predictors, and, in the cases that matter most, predictors of you.

That last property is where social inference stops resembling vision and acquires a structure of its own. A mountain does not change its shape because you are looking at it. Another mind changes what it will do because of what it expects you to do. The next two sections are about what that difference forces — and, just as important, about what it does not.

When predicting behavior is not enough

It is tempting to think that an animal can get everything it needs from predicting behavior, without ever representing a mind. For a great many cases, this is true, and it is worth seeing clearly why, because the exceptions are the whole story.

A gazelle predicts a charging lion. It does not need to know what the lion believes. The lion’s next position is written in its body — its heading, its speed, the line of its acceleration — and a good forward model of those physical variables yields a good forecast. The gazelle that extrapolates the trajectory and cuts the other way survives. Behavior-reading suffices here because the future the gazelle needs is a smooth continuation of the visible present: the relevant hidden cause, the lion’s motion a second from now, is tightly determined by its motion now. A simple pursuit controller need not represent the prey’s representation of the predator, and the prey need not represent the predator’s mind.
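
As a rough illustration of why no mind needs to be represented here, the forecast can be sketched as a constant-acceleration extrapolation from visible physical variables. The function name, the units, and the numbers are assumptions made up for the example, and a real pursuit is of course not this tidy.

```python
def extrapolate(position, velocity, acceleration, dt):
    """Constant-acceleration forward model: predicted position dt seconds ahead.

    Every input is a visible physical variable; nothing about belief appears.
    """
    return tuple(
        p + v * dt + 0.5 * a * dt ** 2
        for p, v, a in zip(position, velocity, acceleration)
    )


# Hypothetical values: metres, metres per second, metres per second squared.
lion_position = (0.0, 0.0)
lion_velocity = (10.0, 0.0)
lion_acceleration = (2.0, 0.0)

predicted = extrapolate(lion_position, lion_velocity, lion_acceleration, dt=1.0)
print(predicted)  # (11.0, 0.0)
```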

Two conditions push prediction past the reach of behavior-reading, and human social life is saturated with both.

The first is anticipation of an action that has not yet begun. When the body has not yet moved, a trajectory cannot be extrapolated, and other predictors must carry the forecast. Some of them are cheap: context and affordances (what the situation invites), the person’s habits and prior behavior, learned conventions, posture, the cached expectation of what people like this tend to do here. These often suffice, and an animal that exploited only them would do well much of the time. But when they run out — when the cheap predictors are equally compatible with several incompatible actions — what remains to break the tie is the agent’s own reading of the situation: not their motion, which does not yet exist, but their assessment, which already does. To get ahead of an action that the visible cues underdetermine, you can model the thing that will generate it — the agent’s take on the world.

The second condition is recurrence. Social encounters are not single shots; you meet the same individuals again and again, in overlapping games of cooperation and competition, and while you predict them, they predict you. This is the property the lion does not have. The stranger on the path, the rival across the table, the partner you are trying not to anger — each is forecasting your next move while you forecast theirs, and what each will do depends partly on what they expect you to do.

Recurrence does not, by itself, force anyone to model anyone’s mind, and it is important to resist the overstatement. Repeated interaction can settle into conventions, into learned routines, into reactive policies and stable equilibria that each party simply follows — this is how we pass on a path, this is what one does here — and two agents can coordinate through such cached solutions without either representing the other’s beliefs. A great deal of social life runs on exactly this, and it is far cheaper than mind-reading. The pressure toward modeling minds appears in the residue: the encounters that are novel, or strategic, or deceptive, where convention and habit underdetermine what the other will do, and where the other’s action turns on a reading of the situation — including a reading of you — that cannot be recovered from their behavior, because the relevant behavior has not happened and is itself waiting on yours. There, to keep predicting, you have to model not the behavior but its source: the other agent’s model of the situation, with a version of yourself inside it.

Here the oldest theme of this book bears on the matter. From the moment homeostasis became allostasis, the organism’s problem has been to act before a disturbance arrives — to get ahead of the world rather than merely react to it. That demand raises the value of modeling minds, because mind-modeling is what lets prediction run ahead of behavior: it estimates what an agent will do before the agent’s body has committed, by estimating how the agent reads the situation. The animal that can do this prepares its response earlier and, among other anticipators, prepares it against opponents who are preparing against it. Anticipation does not require mentalizing — the cheap predictors often reach far enough — but it steadily rewards it, and it rewards it most where social life is densest, most recurrent, and most strategic. That is exactly the niche this book has been describing.

When bodies, habits, and conventions stop predicting what an agent will do, the way to keep predicting is to model the mind that produces the behavior.

This is why mentalizing need not be imagined as a separate cognitive organ sitting beside the machinery of perception and control. It is that machinery, driven by its own anticipatory logic to its hardest case. And note what the framing implies, because a later section will lean on it: if mentalizing is favored by the structure of a niche rather than switched on by the mere fact of social life, then it should be graded across species — richer where interaction is more recurrent and more strategic — not a faculty that every social animal simply has or lacks.

One engine, two targets

The forward model that predicts another agent by simulating their assessment can be pointed somewhere much closer to home.

To decide what to do yourself, you can run the same kind of simulation on yourself: rehearse the action, project its consequence, evaluate the outcome, and commit the body only if the simulated result is acceptable. We have already met this capacity, several chapters ago, in the frontal lobe — the prefactual that tries an action in imagination before it is performed, the counterfactual that replays an action that was not taken, the use of memory not to relive the past but to furnish the next decision with material. We described those as the machinery of valuation and choice. We can now see what they have in common with social prediction.

The working hypothesis of this chapter is that these are not two unrelated systems but largely shared machinery addressed to two referents, though whether it is strictly one engine or several closely interacting ones is a question we will have to leave open. Point the simulator at another agent, and its product is social prediction, which is allostatic because it lets you prepare your response to their behavior before the behavior arrives. Point the simulator at yourself, and its product is the prefactual: the rehearsal of your own behavior by living its consequence in advance. The other agent is modeled to get ahead of the world. The self is modeled to get ahead of the self.

Stated this way, an apparent coincidence in the brain becomes a prediction rather than a surprise. If imagining another person’s perspective and imagining your own future draw on substantially the same machinery aimed at different objects, they ought to recruit substantially overlapping circuits. They do, and the convergence is one of the more striking findings in the field. We will come to it, and to its limits. But the conceptual point should land first, because it reframes what the rest of the chapter is describing. Memory for the future, the counterfactual, the empathic projection into another’s situation, the rehearsal of a plan — on this hypothesis these are not a list of separate talents but facets of a single capacity to simulate a state that is not the present, run on whichever agent the situation demands. And the agent it is run on most often, most automatically, and most consequentially is the one doing the running.

We will return to the self at the end, because it is best understood as the limiting case of everything in between. First, the other.

False belief is the stringent test

If you want to know whether a system is representing another’s mental states as states that can come apart from reality — rather than tracking behavior, however cleverly — there is a stringent test, and the recurrent-allostatic frame tells us in advance what it must be.

Behavior tracks the world only at one remove. What it tracks directly is the agent’s belief about the world — and belief can be wrong. A man hurries toward the cupboard not because his keys are in the cupboard but because he thinks they are; if they have been moved without his knowledge, he will go to the cupboard anyway, because his action follows his belief and not the fact. So the capacity that most sharply distinguishes representing a belief from reading behavior — and that a system tracking surface regularities cannot easily fake — is the representation of a belief that the modeler knows to be false. To predict the man’s action, you must hold his mistaken picture of where the keys are and let his behavior flow from that picture, even as you yourself can see the keys somewhere else. You must keep two models of the world at once: the true one, and his.
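
The requirement to keep two models at once can be sketched in a few lines. The sketch assumes a deliberately crude rule, that an agent's belief about an object updates only while the agent is present to witness a move. The class and function names are hypothetical and are meant as an illustration of decoupling, not as a model of how any brain implements it.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    present: bool = True
    # The agent's own picture of the world: object -> believed location.
    belief: dict = field(default_factory=dict)


class World:
    def __init__(self):
        self.location = {}  # the true state: object -> actual location
        self.agents = []

    def add_agent(self, agent):
        self.agents.append(agent)

    def move(self, obj, place):
        """Move an object; only agents who are present update their belief."""
        self.location[obj] = place
        for agent in self.agents:
            if agent.present:
                agent.belief[obj] = place


def predict_from_reality(world, agent, obj):
    """Reality-based prediction: assume the agent searches where the object is."""
    return world.location[obj]


def predict_from_belief(world, agent, obj):
    """Belief-based prediction: assume the agent searches where they think it is."""
    return agent.belief[obj]


world = World()
man = Agent("man")
world.add_agent(man)

world.move("keys", "cupboard")  # he sees the keys go into the cupboard
man.present = False
world.move("keys", "drawer")    # moved without his knowledge
man.present = True

reality_answer = predict_from_reality(world, man, "keys")
belief_answer = predict_from_belief(world, man, "keys")
print(reality_answer, belief_answer)  # drawer cupboard
```

The two predictors disagree only after the unseen move, which is the one situation in which they can be told apart from outside.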

It would be a mistake, though, to treat false belief as the line where mind-reading begins and below which there is none. Goals, intentions, attention, and knowledge are genuine mental states, and inferring them is genuine mental-state attribution, even when they happen to line up with the world. What false belief specifically tests is decoupling — the ability to predict an agent from a representation that diverges from reality — and decoupling is the demanding end of a graded series, not a switch. One can read it as a ladder: sensitivity to animacy, then to agency and goals, then to perception and attention (what an agent can and cannot see), then to knowledge and ignorance, then to belief, then to false belief, and finally to recursively embedded belief (what one agent believes another believes). Each rung is harder, and each is further decoupled from the immediately observable. False belief is an important rung — the one where representation and reality are experimentally pried apart — not the metaphysical threshold at which minds wink into existence.

This is why the false-belief task became the central experimental tool in the study of mental-state understanding. In its classic form, a child watches while one character places an object in a location and then leaves; in the character’s absence, the object is moved; the child is asked where the returning character will look. To answer correctly — the original location, where the character falsely believes the object to be — the child must predict an action from a belief that the child knows is false, suppressing what the child can plainly see in favor of what the character has reason to think. Younger children tend to answer with the object’s true location: they predict from reality. By around age four, in the standard versions, children predict from the false belief. Something has changed in what they can do.

What exactly has changed, and when, is more contested than the tidy version suggests, and the controversy is instructive rather than embarrassing.

The standard false-belief task is a wonderful instrument and a treacherous one. Passing it requires not only representing another’s false belief but also holding a goal in mind, inhibiting the prepotent urge to answer with what one knows to be true, parsing the language of the question, and tracking a short narrative. A child could fail for reasons that have nothing to do with mental-state representation, and an experimenter could mistake a memory or inhibition limit for a conceptual one.

This matters because of a genuine empirical puzzle. A body of work using looking-time and anticipatory-looking measures reported that infants well under four, far too young to pass the verbal task, already behave as though they track others’ false beliefs, looking longer when an agent acts inconsistently with what the agent should believe, or looking in anticipation toward the location an agent falsely takes to be correct [@onishibaillargeon2005]. If real, these results imply that something belief-like is in place long before children can answer the verbal question aloud, and that the verbal task measures the ability to deploy mental-state knowledge under task demands rather than the presence of the knowledge itself.

The trouble is that several of the infant findings have proven difficult to reproduce. Large, preregistered replication efforts have failed to recover some of the key anticipatory-looking effects, and the literature is now openly unsettled about which infant results are robust [@kulke2018]. The honest summary is that more than one capacity is probably being measured here. One proposal, worth taking seriously, is that humans operate two distinct systems: an early-developing, fast, automatic, but inflexible capacity for tracking what others register about the world, and a later-developing, slow, effortful, but flexible capacity for reasoning explicitly about beliefs [@apperlybutterfill2009]. On that view the contradiction partly dissolves, because the verbal task and the infant looking measures are probing different machinery. The field has not converged. A textbook that pretended otherwise would obscure the real lesson, which is that a single behavioral task rarely measures a single thing.

Does the chimpanzee have a theory of mind?

The phrase theory of mind entered the literature in 1978, in a paper by David Premack and Guy Woodruff that asked the question above of a chimpanzee [@premack1978]. Their evidence was that a chimpanzee shown a film of a human struggling with a problem — reaching for out-of-grasp bananas, or trapped behind a door — would, in a forced choice, select the photograph depicting the solution, the stick or the key. The chimpanzee, they argued, grasped what the human was trying to do.

Set against the framework of this chapter, what Premack and Woodruff demonstrated sits low on the ladder just described — which is not to say outside it. Reading the target of an action is genuine mental-state attribution: the chimpanzee that selects the key infers a goal, and a goal is a mental state. But goal inference is the rung closest to behavior, because a goal is largely readable from the structure of the behavior aimed at it, and it does not require holding a model of another’s belief, still less a belief the observer knows to be false. Goal-reading buys a great deal of useful prediction without ever decoupling the agent’s representation from reality.

The demanding question — the one higher on the ladder, where representing a mind comes apart from sophisticated behavior-prediction — is whether any nonhuman animal can represent a false belief in another. Here the evidence is genuinely contested, and the shape of the contest is exactly the seam this chapter has been tracing. Studies using anticipatory looking have reported that great apes look toward the location where an agent falsely believes an object to be, as though predicting the agent’s mistaken search — a nonverbal analogue of the false-belief task [@kano2019]. If that interpretation holds, it would place some belief-tracking outside our species. But there is a deflationary reading available, and it is the same deflationary reading that haunts the infant literature: the apes may be projecting their own past experience of being misled onto the agent, or responding to learned regularities in where agents tend to look, rather than representing the agent’s belief as a belief. The data underdetermine the two stories.

This is not a failure of the field so much as the place where the field’s central question actually lives, and the framework lets us say something more useful than “it is unclear.” The earlier section set up a prediction for exactly this case. If decoupled mentalizing is favored by niche structure rather than switched on by sociality as such, then it should be graded: richest where social life is most densely recurrent, most strategic, and most dependent on pre-empting others’ actions, that is, in species with stable, individualized, long-term relationships and intense cooperative and competitive interdependence. That is a prediction, and it is the right kind, because it could in principle turn out to be wrong. It says the decoupled capacity is not a binary human possession to be confirmed or denied but a graded adaptation that should track the structure of a species’ social niche. The comparative data are not yet good enough to test it cleanly. The chapter’s claim is not that the question is answered. It is that this is the question, and that the niche-first frame tells us where to look.

The mentalizing network, and the overlap

When the same question is asked of the human brain with functional imaging, a reasonably consistent set of regions appears. Contrasts designed to engage mental-state reasoning — short stories that can be understood only by inferring what a character must have thought, compared against matched stories that turn on physical or mechanical inference [@saxe2003]; cartoon sequences whose punchline depends on a character’s mistaken belief [@gallagher2000] — recruit a recognizable network: the temporoparietal junction, especially on the right; medial prefrontal cortex; the posterior cingulate and adjacent precuneus; and the ventromedial frontal cortex. Aggregating across the very large number of such studies now in the literature returns the same hubs, with the temporoparietal junction and medial prefrontal cortex the most reliable.

It would be easy to stop here, point at the temporoparietal junction, and call it the seat of theory of mind. The book has spent two chapters explaining why that move is a mistake, and the present case is among the clearest illustrations.

The same regions that light up for mental-state reasoning also light up for a striking range of other tasks. The posterior superior temporal sulcus, which sits within the loosely defined temporoparietal territory, responds vigorously to biological motion — the movement of animate agents, demonstrable with point-light displays in which a few dots attached to a walker convey a living body against a meaningless scatter of the same dots in random motion. The region cares about the context of that motion: an avatar whose eyes shift toward the observer as it approaches drives the response more than identical motion with the gaze averted [@pelphrey2004]. The temporoparietal junction is also a core node of the ventral attention network, the system we described earlier as a circuit breaker — the machinery that interrupts ongoing processing and reorients attention when something unexpected demands it [@corbettashulman2002]. And, as the next section will develop, much the same territory appears when people imagine the future, recall the personal past, navigate imagined space, and simply rest without a task.

The promiscuity of these regions across tasks is not noise to be subtracted away. It is a clue. When one patch of cortex is recruited by mental-state reasoning, by the perception of animate motion, by the reorienting of attention, and by self-projection into other times and places, the parsimonious hypothesis is not that the region does four unrelated jobs. It is that these tasks share something, and the shared something is what the region computes. Reading biological motion is, after all, the front end of inferring a goal: when you watch someone move, you are already estimating what they are doing and why. Switching attention from your own vantage to another’s, or from the present to an imagined situation, may be the same operation the attention system performs when it breaks from one focus to another. The overlap is pointing at a common function. The difficulty is saying what it is without overreaching, and overreach in both directions is the standing danger.

Two opposite errors stalk this literature.

The first is reverse inference: observing that a region active during theory-of-mind tasks is now active in some new task, and concluding that the new task therefore involves theory of mind. This does not follow. A region that participates in a function is not thereby exclusive to it. The temporoparietal junction’s appearance in an attention task does not show that the participant was mentalizing, any more than the heart’s involvement in running shows that everyone who runs is in love. Activation localizes where a manipulation changes blood flow. It does not, by itself, reveal what computation the tissue performs, and the same blob can be produced by different processes.
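
Why the inference fails can be put as a short worked application of Bayes’ rule. Let $M$ stand for “the participant was mentalizing” and $A$ for “the region was active.” The three probabilities plugged in below are invented solely to show the arithmetic and are not estimates from any dataset.

```latex
P(M \mid A)
  = \frac{P(A \mid M)\,P(M)}
         {P(A \mid M)\,P(M) + P(A \mid \neg M)\,P(\neg M)}

% Illustrative (made-up) values:
%   P(A | M) = 0.8,  P(A | not M) = 0.4,  P(M) = 0.2
P(M \mid A)
  = \frac{0.8 \times 0.2}{0.8 \times 0.2 + 0.4 \times 0.8}
  = \frac{0.16}{0.48}
  \approx 0.33
```

Under these assumed numbers, a region that responds to mentalizing four times out of five still leaves mentalizing the less likely explanation of its activity, because it also responds often when no mentalizing is going on. The strength of a reverse inference depends on how selective the region is, not on how reliably it responds.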

The second error is the mirror image: treating the broad overlap as proof that there is no specialization at all, that it is “all one network” and the regional labels are meaningless. That overshoots too. The “temporoparietal junction” is not one thing — it is an anatomically imprecise label spanning tissue with different connectivity and, on closer parcellation, different functional profiles. Connectivity-based parcellation places an anterior sector with the ventral-attention and salience circuitry — its connections run to ventral frontal cortex, anterior insula, and midcingulate cortex — and a posterior sector with the default and mentalizing network, its connections running to angular gyrus, precuneus, and posterior cingulate [@mars2012; @bzdok2013]. The borders shift with atlas and task, and the two functions overlap rather than separating cleanly: a meta-analysis found the region recruited by both attentional reorienting and false belief sitting anteriorly, while the posterior sector converged more specifically on false belief [@krall2015]. Some evidence further suggests that a portion of the right posterior temporoparietal junction is unusually selective for representing beliefs specifically, dissociable from general social or attentional demands.

The defensible position sits between the two errors. There is real regional specialization, and there is real sharing of machinery across social, attentional, and self-projective tasks. Neither the “dedicated mind-reading module” picture nor the “it’s all the same soup” picture survives contact with the parcellation data. Holding both facts at once is uncomfortable and correct.

No discussion of the neuroscience of social cognition can avoid the mirror neuron, and few topics have been so oversold. In the macaque, neurons in premotor area F5 discharge both when the animal performs a particular goal-directed action — grasping a raisin — and when it observes another individual performing a similar action [@rizzolatti2004]. The finding is real and important: here are cells whose activity is shared between doing and seeing.

The leap that followed was to declare that such cells constitute the understanding of others’ actions — that we grasp what another is doing by covertly running the same motor program, and, by extension, that mirror neurons are the basis of empathy, language, imitation, and the social deficits of autism. That extrapolation outran the evidence in several directions [@hickok2014]. A shared motor response is not the same as a representation of another’s goal or belief; it may be a consequence of understanding the action rather than its cause. People can understand actions they cannot perform, and damage to motor regions does not reliably abolish the comprehension of others’ movements. And nothing in a motor-matching mechanism explains the capacity that this chapter has identified as the stringent test of decoupled mentalizing — the representation of a belief the observer knows to be false — because mirroring an action gives you the action, not the possibly-mistaken model of the world behind it.

The sober residue is worth keeping. Sensorimotor systems are clearly recruited when we observe others act, the coupling between perceiving and producing action is genuine, and it plausibly contributes to imitation and to the front end of reading goals from movement. That is a real piece of the machinery. It is not the whole of social cognition, and it is not the part that does the hard work.

Self-projection

We can now collect the overlap into the idea that organizes it. Across tasks that look superficially unrelated — imagining a personal future, recollecting a personal past, navigating an imagined environment, and inferring another person’s mental state — a common core of regions recurs: medial prefrontal cortex, posterior cingulate and precuneus, the temporoparietal junction, and the medial temporal lobe. Buckner and Carroll proposed that the thread uniting these tasks is self-projection: the capacity to shift perspective from the immediate present to an alternative — another time, another place, another mind — and to construct a mental scene from that displaced vantage [@bucknercarroll2007].

It is the construct this chapter has been building toward, and it cashes out the working hypothesis about one engine and two targets. Remembering an episode, imagining a future one, and entering another person’s perspective are, on this account, variants of the same operation — the simulation of an experience that is not the one currently given — differing only in where the vantage is placed: in your own past, your own future, or someone else’s situation. On this view the future-rehearsal we located in the frontal lobe, the empathic projection into another’s plight, the counterfactual replay of the road not taken, and the navigational simulation of a route not yet walked are facets of a single capacity to leave the present and model a scene from elsewhere.

The same imaging that reveals these regions in tasks also reveals them at rest. When people lie in a scanner with nothing to do, activity does not fall silent; it settles into precisely these midline and lateral parietal regions — the default mode network [@raichle2001]. When the present makes no external demand, in other words, the brain does not idle; it tends to run the very operations self-projection comprises — rehearsing conversations, revisiting the past, planning the afternoon, drifting into others’ lives. How much of resting activity is well described as self-projection remains debated, but its overlap with these regions is among the most reproducible findings in human neuroimaging. A control system whose hardest problem is anticipating agents — including the one it inhabits — spends a good deal of its unoccupied time running operations of that kind.

There is an eerie anticipation of this idea in a place we have visited before. Freeman and Watts, attempting in the first half of the twentieth century to describe what the prefrontal cortex contributes, wrote that it is “concerned with the projection of the whole individual into the future,” with the capacity to “foresee, to see before, to forecast” the results of actions not yet taken, and to visualize their effects “upon himself and upon his environment” [@freemanwatts1942]. They were describing the destruction wrought by the lobotomy, in the wrong anatomical language and at a terrible human cost. But the phrase reaches the same place that the imaging would reach decades later. The frontal lobe and its midline partners project the whole individual into situations that are not the present — and among the most demanding of those situations is the interior of another person.

And here the chapter must refuse to close a question that the field has not closed.

The self-projection account is elegant, and elegance is not evidence. There is a real and unresolved dispute about whether mental-state reasoning is a special-purpose capacity or a special case of a general one, and the strongest versions of each side deserve a fair hearing.

The domain-specific case holds that representing beliefs is computationally distinctive and is served by dedicated machinery. Its best evidence is selectivity: a portion of the right temporoparietal junction appears to respond to information about a person’s beliefs far more than to other socially or emotionally relevant facts about that person, such as their appearance or even their bodily sensations — a degree of specificity that is hard to explain if the region is merely running a generic simulation. On this view, evolution built something particular for the particular problem of belief, and the overlap with memory and prospection reflects shared subcomponents, not shared identity.

The domain-general case is the self-projection account taken at full strength: there is one capacity to construct displaced scenes, and theory of mind is what it produces when the displacement is into another agent. Its best evidence is the convergence itself — the recurrence of overlapping regions across remembering, imagining, navigating, and mentalizing. But the convergence has to be stated carefully, because it is exactly where coarse and fine measurement disagree. Group-averaged maps make the network look unitary; individual-subject precision mapping resolves it into interdigitated subnetworks lying side by side in the same territory, one weighted toward episodic and mnemonic construction and another toward mentalizing [@bragabuckner2017]. And the dissociations cut against a single engine as often as the overlaps argue for one: standard theory-of-mind performance can survive devastating episodic amnesia, so the capacity to imagine remembered scenes and the capacity to reason about another’s belief are separable rather than identical.

The dispute is not settled, and a textbook should not pretend to settle it. But it is worth being explicit about why the outcome matters for the argument of this book. If the domain-general account is closer to right, then social cognition is not a separate human organ but the most demanding application of a capacity that evolved for control — for leaving the present to rehearse what is not yet, or not here, or not me. That reading fits the arc we have followed, in which the faculties that seem most distinctively human keep turning out to be old machinery driven to new uses by the structure of the niche. The book leans toward it, for that reason and because the broad convergence is hard to dismiss. But leaning is not knowing, and both the selectivity findings and the fine-grained parcellation are genuine costs the lean has to carry.

More than one route into another mind

Empathy is often treated as a faculty in its own right, with a taxonomy of subtypes, and the taxonomy is less illuminating than the bare distinction it rests on. The distinction worth keeping is between modeling another’s state and coming to share it. You can represent that someone is frightened — infer the state, predict its consequences — without the inference moving you; this is cognitive perspective-taking, and it is continuous with the mentalizing this chapter has described. Or the simulation can engage affective systems, so that modeling the other’s fear recruits some of what would be active were the fear your own, and you come to feel a version of what you model.

The two are related but not identical, and the evidence will not let us collapse them. They vary independently across people; they can be impaired separately by brain damage; and they draw on partly distinct circuitry. A person can read another’s state accurately while remaining unmoved by it, and a person can be flooded by another’s distress without clearly representing its source. Affective empathy is therefore not cognitive perspective-taking with a feeling switched on. It is an interacting but separable route into another’s state, and the chapter’s simulation framing should be read as covering the cognitive route, to which the affective route is coupled rather than identical.

Putting it this way makes a hard problem visible. If you model another person’s state partly by recreating it on your own hardware, what keeps the recreation from being mistaken for your own? The fear you simulate must be kept tagged as theirs, or perspective-taking collapses into contagion and you lose the boundary between modeling a state and being in it. This is not idle: there is evidence that the network engaged in suppressing automatic imitation — in stopping yourself from mirroring a movement you have just seen — overlaps the networks engaged in mental-state reasoning and in distinguishing self from other [@spengler2009]. The overlap fits a real computational requirement rather than naming a single labeling site: maintaining the self–other distinction while running another’s state on shared hardware is not a problem separate from mentalizing but part of the same work.

The most influential clinical application of theory of mind was the proposal that autism involves a specific impairment of mental-state reasoning. Early work reported that autistic children failed the standard false-belief task at rates far higher than children with other developmental differences, and the “theory of mind deficit” became, for a time, the dominant cognitive account of autism [@baroncohen1985].

The framing has not aged cleanly, and the reasons it has not are themselves instructive. Autistic performance on mental-state tasks is heterogeneous and strongly dependent on task demands, language, and the verbal nature of the standard measures; many autistic people pass the very tasks the deficit account predicts they should fail, and broad claims that autistic people categorically lack a theory of mind do not survive examination of the actual evidence [@gernsbacheryergeau2019]. An alternative, the double empathy problem, adds an important relational source of difficulty: prediction between autistic and non-autistic people can fail in both directions, because each is modeling a mind whose workings differ from their own, while autistic people have been found to predict and understand one another comparatively well [@milton2012]. On this view much of the breakdown is a mismatch between two kinds of mind rather than a deficit located wholly in one of them — though the relational account supplements rather than replaces the substantial individual variation, and the findings are not uniform across tasks.

The lesson for this chapter parallels the lesson the previous chapter drew from aphasia. Variation and breakdown illuminate the machinery, but they do not license a simple story in which a single faculty is present in some brains and absent in others. The mentalizing system is real; it varies; and when two systems built differently must predict each other, some of the failure arises at the level of the pairing rather than in either party alone.

The nearest agent is the self

The book’s coda promised that this chapter would arrive at how an organism comes to model other minds, and its own. Everything so far has prepared the inward turn, and the turn is short, because the self is not a new topic. It is the limiting case of the one we have been developing.

The word self, though, covers at least three things that come apart, and the chapter’s claim is strong for one of them and weak for the others. There is the embodied self — the felt sense of being this body, grounded in the continuous stream of interoceptive, proprioceptive, vestibular, and motor signals that report the state of the organism from the inside. There is the agentive self — the sense of being the author of one’s actions, tied to the motor system’s predictions of the consequences of its own commands. And there is the narrative self — the account one constructs of who one is and why one acted, extended across time into a remembered past and an anticipated future. The first two are not inferred the way another person is inferred. The brain has privileged, if noisy and incomplete, access to its own bodily and motor signals; it has nothing like that access to anyone else’s body. Whatever else is true, your relation to your own interoceptive state is not the relation you have to mine. This matters especially in a book built on the regulated body: the self does not float free of the homeostatic machinery that earlier chapters described. Its innermost layer is that machinery, sensed from within.

It is the narrative self — the explanatory account of why this organism did what it did — that is built by the same apparatus used to explain other agents, and built with the same fallibility. To explain an action you have already taken, you construct a story about the goals and beliefs that produced it: the inverse inference, run on yourself. And here the privileged access runs out. The interoceptive signal can tell you that your heart is racing; it does not tell you why you chose as you chose. The reasons you offer for your choices are inferences, constructed after the fact from whatever evidence is available, and they can be confidently wrong.

We saw, in the previous chapter, what that construction looks like when it is caught fabricating. The split-brain patient whose speaking hemisphere offered a confident, plausible, and incorrect reason for an action driven by information it could not access was not malfunctioning in some exotic way; the interpretive machinery was doing what it does — assembling a coherent account of the agent from the evidence at hand — in a situation arranged to deprive it of the real cause. Clinical neurology supplies a second, independent example in anosognosia, in which a patient denies an obvious deficit, such as a paralysis, and produces fluent reasons for not moving the limb, apparently unaware that the account is false. What both cases show, with unusual clarity, is that the system can manufacture a confident self-explanation from incomplete or absent evidence. They do not, by themselves, establish how often ordinary self-explanation is similarly reconstructed — the paradigms are selected precisely for the cases where report and cause come apart — but they motivate the possibility, and they make the folk-psychological picture, in which the felt reason is a transparent readout of its own cause, hard to sustain. We flagged the question of how far this reaches when we set up that earlier discussion.

The deepest form of the chapter’s opening claim follows. The most important objects in the human niche are control systems — agents that predict and act, and predict you in return. The machinery the brain evolved to model them is, in the end, turned on the one control system it can never step outside of and never fully see the workings of: the organism of which it is a part. The embodied self is given from within; the narrative self is inferred, by the same tools used on everyone else, and with no special access to its own causes. The self is the nearest agent — and the one the modeling system is least able to check its story against.

Coda: the system that models systems

We began this book with the simplest possible question about brains — why an animal should have one at all — and answered it with movement and control. A nervous system exists so that a body can act in a world that would otherwise kill it. Every unit since has been an enlargement of that answer, and they assemble, now, into a single line.

A brain regulates a body. To regulate a body well, it must act before disturbances arrive, so it learns to predict — to model the hidden causes behind its senses and to get ahead of them. To get ahead of the world, it does not merely react to the environment; it alters the environment, and the altered environment alters it in turn. The most consequential part of that environment, for a social animal, is the other animals: a niche built of agents. A human infant too costly for two parents to raise alone deepened the dependence on a wider circle of provisioning, protection, and care — and that circle, in turn, helped make so costly an infant affordable, the loop running in both directions across the generations. Cooperation among many brains and many years is the murder of brains on which a human life depends. Language is the channel that moves knowledge through that flock. And the agents themselves — the others whose calories, care, knowledge, and loyalties decide whether the expensive infant survives — become the hardest objects the control system must model. They are hard because they are control systems too: they predict, they act, they anticipate you. Among predictors that predict you, when bodies and habits and conventions stop being enough, a brain is pressed to climb from reading behavior to reading minds — and the same machinery, turned inward on the nearest agent of all, builds the model it calls the self.

That is the whole arc, and its shape is worth stating plainly, because it is the opposite of the story the book set out to avoid. We did not end with a tour of the special faculties that make humans unique — a module for this, a center for that, a uniquely human organ of mind. We ended somewhere stranger and more continuous. The capacities that feel most distinctively human — understanding others, projecting into other times and perspectives, the experience of a self — are, on the reading this book has pressed, what an old machine looks like when the niche that shaped it is made of other such machines. The control system that began by steering a body through a world arrived, by the same logic that drove it from the start, at the one feature of that world as restless as itself: the other minds it must predict, and, last of all, the one doing the predicting.

A honey bee is built to live in a hive that bees built. A human brain is built to model the other brains it must live among — and to model, with the same machinery and no better access, the one it cannot leave.

Animals create niches that modify animals. For the human animal, the most consequential thing in the niche is another mind — including its own.


This chapter has moved from the problem of other minds, through the logic of social prediction, to theory of mind, its neural systems, self-projection, empathy, and the self. The evidential foundations are not equally firm, and the chapter’s central argument is in places a proposal rather than an established result. The three lists below mark the difference.

We are confident that: humans routinely infer unobservable mental states from behavior; predicting other agents differs from predicting inanimate motion, because agents are themselves predictors and, in recurrent interaction, predict the predictor in return; false-belief understanding, in some form, is reliably in place by early childhood on standard tasks; a recognizable set of regions — temporoparietal junction, medial prefrontal cortex, posterior cingulate and precuneus — is engaged by mental-state reasoning; partially overlapping regions are engaged by remembering, imagining the future, navigating imagined space, perceiving biological motion, reorienting attention, and rest, though finer-grained mapping resolves neighboring subnetworks with different functional preferences within that overlap; and confident self-explanations can be constructed from incomplete or absent evidence, as the confabulation and anosognosia cases show.

As the chapter’s working synthesis, we propose that: the anticipatory, recurrent structure of social interaction favors the modeling of minds — not by switching it on wherever social life exists, but by rewarding it wherever cheaper predictors such as habit, convention, and cached policy run out, which in turn predicts that decoupled mentalizing should be graded across species rather than all-or-none; perspective-taking into others, projection into one’s own past and future, and the construction of imagined scenes draw on largely shared constructive machinery, whether or not it is strictly one system; cognitive and affective empathy are interacting but partly separable routes into another’s state, not one with a feeling switched on or off; the self–other distinction is part of the mentalizing machinery rather than separate from it; and the narrative self is built by the same inferential apparatus applied to one’s own behavior, while the embodied self is grounded in privileged bodily and motor signals.

We remain genuinely unsure about: whether mental-state reasoning depends on dedicated, belief-specific machinery or is a special case of a domain-general simulator, with real evidence on both sides; what the infant looking-time measures actually show, given serious replication difficulties; whether there are one or two distinct mentalizing systems; whether any nonhuman animal represents false beliefs as opposed to reading goals and learned regularities; how to draw the functional boundaries within the temporoparietal junction; how much of the convergence across self-projective tasks reflects shared computation versus adjacent, separable subnetworks; how often ordinary self-explanation is reconstructed rather than reported, which the confabulation paradigms cannot by themselves establish; and how literally to take the claim that the self is a model of the same kind we build of others.

The first list is secure. The second is the chapter’s argument, well-motivated but not proven, and labeled as such. The third marks where the questions this chapter raises are the questions the field is still working to answer — the right note on which to end a book that has tried, throughout, to keep the levels connected and the uncertainties visible.