THE ALGORITHMIC REVOLUTION

AI Can Now Read Minds (Almost)

written by Francesco D'Isa

In 1882, Frederic Myers, one of the founders of the Society for Psychical Research, coined the term “telepathy,” combining the Greek tele (distant) and pathos (feeling). For a few decades, psychical research enjoyed a certain degree of respectability: William James took it seriously, Nobel laureate Charles Richet devoted years to it, and in the 1970s even the CIA funded the Stargate project in the hope of training psychics capable of spying on the Soviets through the power of the mind. None of these attempts produced replicable results, and telepathy gradually slipped into folklore, but the underlying idea has never ceased to exert its fascination, from science fiction to the philosophy of mind.

What no one had foreseen was the form in which this possibility would partially materialize, namely through autoregressive models in the GPT family. The “mind reading” that has begun to work in neuroscience laboratories over the last three years resembles nothing that Myers or the CIA imagined: it works thanks to AI, and instead of capturing exact words it intercepts semantic regions, exploiting the recently discovered fact that the brain and language models organize meaning in spaces that turn out to be partially alignable.

The premises go back to 2016, when Alexander Huth and colleagues published a study in Nature in which several subjects listened to hours of stories while their brain activity was recorded through functional magnetic resonance imaging (fMRI). The researchers derived from this a kind of semantic atlas of the cortex, distributed in elaborate patterns that were surprisingly similar from one individual to another. Each area responded to specific conceptual domains—people, numbers, visual properties, places—and these domains were arranged by proximity according to a recognizable logic. The semantic system proved to be far more extensive and distributed than much of the previous literature had documented; in practice, the brain tiles the cortex with a semantic topographic map, where similar concepts are “next-door neighbors.”

A step forward comes from a subsequent study by Tang, LeBel, Jain, and Huth, published in 2023 in Nature Neuroscience. Their experiment works in two phases. In the first, three subjects listen to about sixteen hours of narrative podcasts while their brain activity is recorded with fMRI. The system thus learns to associate patterns of cortical activation with semantic representations extracted from a language model: it learns which constellation of brain activity corresponds to which region of the model’s semantic space. In the second phase, the subjects listen to new stories, never heard during training, and the decoder, using only brain activity, generates sequences of words that recover the meaning of what was heard. In practice, the brain is first “associated” with the model, and this association is then exploited on previously unseen words.
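The two-phase logic can be sketched with synthetic data: a ridge regression plays the role of the encoding model, learning a linear map from a language model’s embeddings to voxel responses, and decoding then amounts to scoring candidate phrases by how well their predicted brain activity matches the recorded one. Everything below is a toy stand-in (random vectors instead of real embeddings and real fMRI data), not the authors’ actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
d_embed, d_voxels, n_train = 16, 50, 500

# Ground truth: a hidden linear relation between semantic embeddings
# and simulated "brain" responses, plus measurement noise
W_true = rng.normal(size=(d_embed, d_voxels))
X_train = rng.normal(size=(n_train, d_embed))      # embeddings of heard words
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(n_train, d_voxels))

# Phase 1: fit an encoding model (ridge regression) embeddings -> voxels
lam = 1.0
W = np.linalg.solve(X_train.T @ X_train + lam * np.eye(d_embed),
                    X_train.T @ Y_train)

# Phase 2: given new brain activity, score candidate phrases by how well
# their *predicted* brain response matches the *observed* one
candidates = rng.normal(size=(20, d_embed))        # embeddings of candidates
true_idx = 7
y_observed = candidates[true_idx] @ W_true + 0.1 * rng.normal(size=d_voxels)

errors = np.linalg.norm(candidates @ W - y_observed, axis=1)
best = int(np.argmin(errors))                      # index of the best match
```

In the real system the candidate set is not fixed in advance: a language model proposes plausible continuations, and the encoding model ranks them against the incoming brain signal.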

The most revealing aspect is the kind of errors the system makes, or rather, the fact that they are not errors in the usual sense. Let me explain: when the subject hears a sentence equivalent to “she doesn’t have a driver’s license,” the decoder produces something like “she hasn’t yet learned how to drive.” The lexical string is different, but the region of meaning is the same. The system produces a paraphrase and converges toward the same semantic basin, even if by following a different trajectory. If it returned identical words, we would be facing a kind of recording; the fact that it returns paraphrases is, in my view, semiotically more significant, because it suggests that what is being decoded is the deep structure of meaning rather than its lexical surface.

Image courtesy of Francesco D’Isa.

The system can also decode imagined language, in which subjects imagine telling a story without uttering a word, and even the content of silent videos, which demonstrates that a single semantic decoder can operate across different perceptual tasks, provided they share the same representational level.

In parallel, Jean-Rémi King’s group at Meta AI developed a distinct approach, based on magnetoencephalography (MEG) and electroencephalography (EEG) rather than fMRI. fMRI has excellent spatial resolution but very low temporal resolution, about one image every two seconds; it is much slower than language. MEG and EEG capture neural activity with temporal resolution on the order of milliseconds, at the cost of lower spatial resolution.

The architecture of this system is inspired by CLIP, OpenAI’s model that aligns text and images in a shared space. Here, however, the elements to be aligned are different: on the one hand brain representations, on the other those of speech. The network learns to match the two spaces through contrastive learning, that is, by learning to distinguish correct pairs (brain signal and corresponding audio) from incorrect ones. The system was tested on four public datasets, for a total of 175 volunteers and more than 150 hours of recordings. The results show that, starting from 3 seconds of MEG signal, the model is able to identify the corresponding speech segment, among more than 1,300 candidates, with an accuracy of 41%, which means that in roughly four cases out of ten the system identifies exactly the right sentence among more than a thousand possible alternatives, where chance would give a probability of 0.08%.
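The contrastive matching can be illustrated in miniature: a similarity matrix is computed between every brain segment and every speech candidate, an InfoNCE-style loss (with a CLIP-like temperature) rewards putting each row’s mass on its true pair, and retrieval simply picks the most similar candidate. The “encoders” here are fixed toy transforms on random vectors, not the trained networks of the actual system:

```python
import numpy as np

rng = np.random.default_rng(1)
n_pairs, d = 8, 32

# Toy shared latent: each speech segment has an embedding; its matching
# "brain" segment is a noisy, near-identity linear transform of it
speech = rng.normal(size=(n_pairs, d))
A = np.eye(d) + 0.05 * rng.normal(size=(d, d))   # stand-in for a brain encoder
brain = speech @ A + 0.1 * rng.normal(size=(n_pairs, d))

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Cosine similarity between every brain segment and every speech candidate
S = normalize(brain) @ normalize(speech).T

# InfoNCE-style contrastive loss: each row should concentrate probability
# on its diagonal entry (the correct brain/audio pair)
logits = S / 0.07                                # temperature, as in CLIP
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.mean(np.diag(log_probs))

# Retrieval: for each brain segment, pick the most similar speech candidate
predicted = S.argmax(axis=1)
accuracy = np.mean(predicted == np.arange(n_pairs))
```

Training drives the loss down by pulling correct pairs together and pushing incorrect ones apart; retrieval accuracy on held-out segments is then the figure reported in the paper.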

The research program has also extended into the visual domain. Benchetrit, Banville, and King developed a system trained to align brain activity with the visual representations learned by a self-supervised computer vision model. Subjects observed images while their brain activity was recorded with MEG; the system then produced reconstructed images starting from the neural signal alone.

Here too, the pattern of errors is analogous. The generated images preserve high-level semantic characteristics such as the object category and the general composition of the scene, but fail on low-level details: position, orientation, exact color. The system recognizes that it is a dog, but does not place it in the right part of the image. This suggests that decoding takes place at the level of categorical and conceptual representation rather than sensory perception.

A further study by the Meta AI group, published in Nature Human Behaviour, provides a more explicit theoretical framework for understanding the relationship between the brain and language models. The authors analyzed the fMRI signals of 304 participants listening to short stories, comparing brain activations with those of the internal layers of GPT-2.

The first result confirms what other studies had already suggested: the activations of a language model map linearly onto the brain’s responses to speech, which is already remarkable in itself. But the second result is subtler. The brain, unlike current LLMs, operates according to a predictive hierarchy extending across multiple temporal scales: the temporal cortices predict short-range representations—the next word, as GPT does—while the frontoparietal cortices generate long-range predictions, up to eight words into the future, and at a more abstract, more contextual level. Brain and LLMs therefore share the fundamental organizational principle—prediction as the mechanism of linguistic understanding—but the brain implements a hierarchical and distributed version of it that current LLMs capture only partially.

Image courtesy of Francesco D’Isa.

An independent paper by Goldstein and colleagues, also published in Nature Neuroscience, strengthens this picture with data from electrocorticography (ECoG). The authors identify three computational principles shared by brain and autoregressive models: both are engaged in the continuous prediction of the next word before it is perceived; both compare the prediction with the actual word in order to calculate a surprise signal; both rely on contextual representations to encode words in their context of use. This study, coming from a different group (Princeton/NYU) and based on invasive recordings with high temporal resolution, reinforces this line of research and documents the sharing of computational principles with evidence that goes beyond simple statistical alignment.
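The “surprise signal” the authors describe is simply the negative log-probability of the incoming word under the predictor. A toy bigram model over a tiny corpus (not the models used in the study) makes the computation concrete:

```python
import math
from collections import Counter, defaultdict

# Toy next-word predictor: bigram counts over a miniature corpus, used to
# compute the "surprise" (negative log-probability) of each incoming word
corpus = "the dog chased the cat and the cat chased the dog".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def surprisal(prev, word):
    """Surprise in bits of seeing `word` after `prev`."""
    counts = bigrams[prev]
    total = sum(counts.values())
    p = counts[word] / total if total else 0.0
    return -math.log2(p) if p > 0 else float("inf")

# After "the", "cat" and "dog" are equally likely (2 of 4 each): 1 bit each.
# After "chased", "the" is certain: 0 bits. An unseen continuation is
# infinitely surprising under this toy model.
```

In the ECoG data, it is exactly this quantity, computed word by word from a language model, that correlates with the amplitude of the neural response to each incoming word.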

Still within the Meta AI group, and in an earlier paper in Communications Biology, Caucheteux and King had already shown that transformer neural networks trained on word prediction partially converge with brain representations, and that this convergence is all the greater the better the model’s predictive ability. GPT-2’s representations, moreover, predict the subjects’ degree of semantic comprehension: the brain-model mapping correlates significantly with narrative comprehension scores.

The assumption that makes the entire enterprise possible is that the brain and computational models represent meaning in spaces that, while physically heterogeneous (patterns of neural activation in the former case, high-dimensional vectors in the latter), turn out to be partially alignable. Meaning, in both substrates, appears to be organized by proximity: semantically related concepts occupy neighboring regions, and the relationships between concepts can be described as transformations in space. The linear mapping between brain activity and the model’s representations works precisely because the two organizations are sufficiently aligned to allow a partial but systematic translation.

Huth’s 2016 study had shown that the cortex organizes meaning in a topographic map where contiguous conceptual domains occupy adjacent regions. Tang et al. add another piece: the language model is not merely a decoding tool, but functions as a kind of semantic mediator because its representational space mirrors, at least partially, the brain’s. Brain and language model construct different representations of the same territory; the representations are not identical, but the fact that a linear mapping works indicates a significant structural compatibility, although it is worth stressing that “compatibility” does not mean “identity.”
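The claim that two physically different spaces are “partially alignable” has a simple geometric reading: if one space is, approximately, a linear transform of the other, then a map fitted on a subset of concepts will also place held-out concepts near their counterparts. A toy sketch with synthetic vectors, where space B is a rotated, noisy view of space A:

```python
import numpy as np

rng = np.random.default_rng(2)
n_concepts, d = 40, 10

# One shared "meaning" geometry, seen through two different substrates:
# B is a rotated copy of A plus noise (the two are not identical)
A_space = rng.normal(size=(n_concepts, d))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))        # an arbitrary rotation
B_space = A_space @ Q + 0.05 * rng.normal(size=(n_concepts, d))

# Learn a linear map A -> B from the first 30 concepts only
train, test = slice(0, 30), slice(30, None)
M, *_ = np.linalg.lstsq(A_space[train], B_space[train], rcond=None)

# The map generalizes: each held-out concept lands nearest to its own
# counterpart, so nearest-neighbor matching recovers the right pairing
pred = A_space[test] @ M
dists = np.linalg.norm(pred[:, None, :] - B_space[test][None, :, :], axis=2)
matches = dists.argmin(axis=1)
```

This is the sense in which a linear mapping can work between heterogeneous substrates: it requires shared relational structure (which concepts are near which), not identical representations.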

In 2025, Tang and Huth published a paper in Current Biology that directly addresses one of the most stringent limitations of their original system: the need for lengthy individual training. Using techniques of functional alignment across subjects, the authors show that it is possible to transfer semantic decoders from one participant to another, reducing the amount of linguistic data required from the subject on whom the system is used. In short, cross-subject transfer, largely impractical in 2023, has become viable, even though individual adaptation still retains an advantage.

The latest development in chronological order is TRIBE v2, a model presented by Meta AI at the end of March 2026 and trained on more than a thousand hours of fMRI data collected from 720 subjects. Unlike previous systems, TRIBE v2 is able to predict the brain responses of people never observed before with less recalibration. The model simultaneously integrates vision, audio, and language, and produces predictions that in some cases correlate with the average neural activity of a group more than the fMRI scans of individual subjects do, as if the system had learned a kind of canonical brain response purified of the noise of individual variation. If previous work showed that the brain and language models share a partially alignable representational space, TRIBE v2 suggests that this convergence extends to the perceptual domain in its full breadth.

That said, although from a journalistic point of view a headline like “AI reads minds” would work very well, these results must be read with all their limitations in mind. As we have said, the system does not read thoughts word by word. What is reconstructed is a probabilistic estimate of the semantic region, not the exact form of the utterance. In some cases the decoder produces sentences that are relevant but lexically very distant from the original; in others it generates gross errors. Linguistic decoding, moreover, still requires extensive individual training, even though recent studies have significantly reduced this limitation. Decoding is more accurate when the subject listens to continuous narrative content; with spontaneous thoughts, not anchored to controlled stimuli, performance degrades considerably: brain signals become more dispersed, noisier, and lack the contextual constraints that guide reconstruction. fMRI also has intrinsic temporal limits, and its resolution of about one image every two seconds makes real-time decoding impossible. Systems based on MEG and EEG partially solve the problem, but at the cost of lower decoding performance.

Finally, there is an important epistemological limit. The fact that the brain and LLMs have partially alignable representational spaces does not prove that they work in the same way. It proves that some linguistic representations of the models and some brain representations of language can be meaningfully put into correspondence. The brain is a biological system shaped by millions of years of evolutionary pressure; the LLM is a statistical system trained on the co-occurrence of textual tokens. One can say, however—and this is no small thing—that AI empirically demonstrates that meaning can be organized by proximity and density in a continuous space, and that a system organized in this way can produce semantically coherent output. And this finds significant, though partial, parallels in our brain—they are not the same, then, but neither are they so different.

Francesco D’Isa