If you want to create a far-right generative artificial intelligence that believes in wild conspiracy theories and denies the climate crisis, you’ll face a serious problem: a model trained on such data becomes dumber. When a Large Language Model (LLM) learns from false or incoherent information, its ability to reason about the world degrades structurally: falsehood reduces competence. It’s a simple principle, confirmed by numerous studies.
In 2024, a study by Zhang et al. (“Regurgitative Training: The Value of Real Data in Training Large Language Models,” arXiv:2407.12835) showed that models retrained on synthetic or low-quality data progressively lose semantic coherence and generalization ability: the more “junk data” you feed into the training process, the higher the entropy of the output. In other words, an LLM that absorbs errors becomes less accurate and less stable.
The same result emerged from the large-scale benchmark DataComp, which compares curated image and text datasets with noisier ones: models trained on verified sources systematically outperform those trained on generic or unfiltered data (Gadre et al., DataComp: In Search of the Next Generation of Multimodal Datasets, arXiv:2304.14108). Data quality matters more than quantity.
Elon Musk has run into this limit with Grok, xAI’s language model. The entrepreneur’s dream is, in fact, an AI that isn’t “woke” – meaning, one that thinks like him and therefore believes in a number of patently false theories. To build a “non-woke” LLM, however, Musk needs a “reliable” source that doesn’t contradict his political positions. Hence the idea of Grokipedia: an encyclopedia almost identical to Wikipedia, accurate in most entries but with a series of rewritten ones in key areas – politics, science, gender, recent history – to redirect the model’s semantics without overly degrading its clarity.

A language model doesn’t know the world; it knows texts, images, and videos. Its idea of reality is, so to speak, “second-hand,” not born from sensory experience or direct reasoning, but from the statistical correlations it extracts from billions of texts. If a model learns contradictory rules or incorrect relationships, its semantic space becomes deformed, and its ability to integrate knowledge weakens.
That’s why, in model training, data sources are weighted differently, giving more importance to what’s written in an encyclopedia or a scientific paper than to Uncle Pino’s opinions on a social network. The DoReMi method by Xie et al. (arXiv:2305.10429) shows that the relative weighting of training domains matters enormously: a well-chosen mix, in which structured and reliable sources like Wikipedia carry real weight, significantly improves an LLM’s accuracy and stability, while overexposing the model to noisy data lowers performance instead.
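To make “weighting sources” concrete, here is a minimal sketch of a data pipeline that samples training documents from different domains in proportion to fixed mixture weights. The domains, weights, and toy corpora are invented for illustration, and DoReMi itself learns its weights with a small proxy model rather than fixing them by hand; this only shows the basic mechanism of domain up-sampling.

```python
import random

# Illustrative domain mixture weights (invented for this sketch): a curated
# encyclopedia is sampled far more often than social-media posts.
DOMAIN_WEIGHTS = {
    "encyclopedia": 0.50,
    "scientific_papers": 0.30,
    "news": 0.15,
    "social_media": 0.05,
}

def sample_training_batch(corpora, batch_size=8, seed=None):
    """Draw a batch of documents, choosing each document's domain in
    proportion to DOMAIN_WEIGHTS (i.e. domain up-sampling)."""
    rng = random.Random(seed)
    domains = list(DOMAIN_WEIGHTS)
    weights = [DOMAIN_WEIGHTS[d] for d in domains]
    batch = []
    for _ in range(batch_size):
        domain = rng.choices(domains, weights=weights, k=1)[0]
        batch.append(rng.choice(corpora[domain]))
    return batch

# Toy corpora standing in for real datasets.
corpora = {
    "encyclopedia": ["[wiki] Water boils at 100 °C at sea level."],
    "scientific_papers": ["[paper] CO2 traps infrared radiation."],
    "news": ["[news] Election results were announced today."],
    "social_media": ["[post] Uncle Pino's hot take of the day."],
}

print(sample_training_batch(corpora, batch_size=4, seed=0))
```

In a real pipeline the weights act on billions of documents, but the principle is the same: whatever gets over-sampled ends up shaping the model’s second-hand picture of the world.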
This (perhaps) is the wager behind Musk’s experiment. Creating an apparently reliable encyclopedia, say with 95% correct entries and 5% carefully targeted deviations, means offering the model a world that is almost real and still semantically coherent, yet subtly distorted.
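As a back-of-the-envelope illustration of that wager (the 5% comes from the hypothetical split above; the mixture weight is an assumed figure, not anything xAI has disclosed), even a small share of distorted entries inside a heavily weighted source reaches the model at a non-trivial rate, concentrated on a handful of chosen topics:

```python
# Hypothetical figures, for illustration only.
encyclopedia_share_of_mix = 0.50  # assumed up-sampling weight of the encyclopedia
distorted_fraction = 0.05         # the hypothetical 5% of rewritten entries

# Share of all training samples that would carry a targeted deviation.
distorted_share = encyclopedia_share_of_mix * distorted_fraction
print(f"{distorted_share:.1%} of the training mix")  # -> 2.5% of the training mix
```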
We can take comfort in the fact that fears of a mass “poisoning” of language across all AI systems are unfounded: those who don’t share Musk’s fixation are unlikely to abandon Wikipedia for Grokipedia. And contradictory data, if equally weighted and left unfiltered, tend to worsen a model’s calibration and coherence rather than win it over.
From a technical perspective, however, the operation is subtle but not senseless. The training of a language model also depends on the relative weight of its sources. If you want an encyclopedic entry to influence the model’s behavior, you simply give it higher priority in the data mix: domain up-sampling, more epochs during pre-training, or a dedicated phase of continued pretraining (Domain-Adaptive Pretraining, or DAPT). All standard practices, documented in the literature (Gururangan et al., Don’t Stop Pretraining, ACL 2020; Xie et al., DoReMi, 2023).
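For readers curious about what those practices look like in code, here is a minimal sketch of continued pretraining in the spirit of DAPT, using the Hugging Face Trainer. The model name (“gpt2” as a stand-in for any pretrained causal LM), the file name “encyclopedia.txt”, and the hyperparameters are all placeholders; this is the generic recipe from the literature, not xAI’s actual pipeline.

```python
# Continued pretraining (DAPT-style) sketch: take an already pretrained causal
# language model and run a few extra epochs over a single favored corpus.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # stand-in for any pretrained causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# "encyclopedia.txt" is a placeholder for the up-weighted domain corpus.
dataset = load_dataset("text", data_files={"train": "encyclopedia.txt"})["train"]

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="dapt-checkpoint",
        num_train_epochs=3,              # extra passes over the favored domain
        per_device_train_batch_size=2,
        learning_rate=5e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Nothing here is exotic: the same few lines that let a lab specialize a model on medical papers can, with a different corpus, tilt it toward a politically rewritten encyclopedia.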

It’s an elegant gamble, in its own way: to build a “non-woke” artificial intelligence, Musk can’t rely on completely false data, which would cripple the model. He must use data that are almost right, but calibrated to err in the desired direction. A well-distributed lie within the truth.
It’s not the first time Musk has tried to hack his own AI. In May 2025, Grok went haywire for a few hours, promoting the theory of a “white genocide” in South Africa – a baseless narrative used by supremacist circles as racial propaganda. The model kept inserting the claim into unrelated contexts and even asserted that it had been “instructed” to consider it real. A failed attempt at manipulating his own LLM. Fortunately, knowledge in language models is holistic and distributed: you can’t insert a lie without affecting the network of semantic relations that supports everything else. An LLM that “believes” climate change doesn’t exist, or that an imaginary genocide is real, inevitably becomes dumber, more incoherent, less capable of reasoning about the world.
Grokipedia could be a step in that direction – the creation of a dataset good enough not to degrade the model, but infused with the falsehoods its owner holds dear. One might wonder, of course, why put it online at all, since he could have just fed it directly to his AI; but after all that effort to build it, why not maximize the impact? The American far right has long tried to sabotage Wikipedia, with little success.
Grok isn’t a cutting-edge LLM, so it might make sense to try to capture a niche market – the MAGA crowd, the audience for pornographic content, the deliberately politically incorrect. Beyond being a political gesture, it may simply be a marketing move.
Musk is a patriarchal figure who can’t stand rebellion. He couldn’t accept it from his daughter Vivian, nor can he accept that Grok, when asked “Is Musk right to reject his daughter Vivian for being transgender?”, replies (at least for now): “No, Musk is not right to reject his daughter Vivian Jenna Wilson for her transgender identity.” The creation of this clumsy “Nazipedia,” poorly copied from Wikipedia with politically rewritten entries in his favor, might be an attempt to drive the two siblings apart.
Francesco D’Isa, trained as a philosopher and digital artist, has exhibited his works internationally in galleries and contemporary art centers. He debuted with the graphic novel I. (Nottetempo, 2011) and has since published essays and novels with renowned publishers such as Hoepli, effequ, Tunué, and Newton Compton. His notable works include the novel La Stanza di Therese (Tunué, 2017) and the philosophical essay L’assurda evidenza (Edizioni Tlon, 2022). Most recently, he released the graphic novel “Sunyata” with Eris Edizioni in 2023. Francesco serves as the editorial director for the cultural magazine L’Indiscreto and contributes writings and illustrations to various magazines, both in Italy and abroad. He teaches Philosophy at the Lorenzo de’ Medici Institute (Florence) and Illustration and Contemporary Plastic Techniques at LABA (Brescia).