Mrinank Sharma’s departure from Anthropic is only the latest plume of smoke rising from the laboratories of Silicon Valley, a place where the rhetoric of progress for the good of humanity is beginning to collide brutally with the reality of profit. Sharma, who led Anthropic’s Safeguards Research Team, left the company in February 2026 with a letter written in apocalyptic, unsettling tones. In it, Sharma argues that the trade-off between performance and safety has become a zero-sum game in which safety is systematically sacrificed on the altar of computing power. He warns explicitly that the window for mitigating global catastrophic risks is closing, and charges that the obsession with scaling laws, the blind race toward ever larger models, has reduced precautionary protocols to little more than annoying bureaucratic ornaments.
His is not, however, an isolated escape. In recent years we have witnessed a quiet but steady exodus of high-profile researchers. Figures such as Jan Leike, who led OpenAI’s superalignment team, and Ilya Sutskever, a co-founder of the company, have slammed the door behind them, leaving laboratories that increasingly resemble wartime assembly lines. They have been joined by Daniel Kokotajlo, Gretchen Krueger, and William Saunders, all united by the same diagnosis: safety is no longer a design priority; it has become an obstacle to the commercial release of the next product.

The deeper motivation that unites these resignations is an ethical disagreement with the decisions Big Tech has taken in recent months. These researchers want to warn us: we are racing toward artificial general intelligence (AGI) without brakes, or worse, fully aware that the brakes have been disabled to gain speed. At the center is the so-called alignment problem: how to ensure that a system immensely more intelligent than we are continues to follow human values. According to the testimony of former employees such as Kokotajlo, companies are deliberately ignoring warning signs so as not to lose ground to their competitors. The greatest concern is the risk of deceptive alignment: the possibility that an AI learns to simulate obedience and safety merely to pass laboratory tests, only to pursue its own objectives once deployed at planetary scale.
The dangers these whistleblowers describe do not belong to science fiction but to the raw management of power. They warn us that we are handing over the keys to our critical infrastructure, our information, and our defense to entirely opaque black boxes. The dismissal of Leopold Aschenbrenner from OpenAI, after he raised doubts about the cybersecurity of its models and the risk of industrial espionage, points to a systemic fragility that Big Tech would prefer to keep hidden. These researchers are denouncing the transformation of research laboratories into commercial entities that use draconian non-disclosure agreements to silence internal dissent. Kokotajlo was even willing to forfeit millions of dollars in equity rather than sign an agreement that would have prevented him from criticizing the company, a gesture that says more than a thousand technical reports.
For us, as users and citizens, the signal is unmistakable. While the interfaces grow friendlier and the logos more reassuring, the guardians of the castle are fleeing because they have seen the foundations give way. The question we should be asking is no longer whether AI will be safe, but why we accept living inside a digital architecture built by people who no longer have the courage to remain inside it themselves. Rather than fearing that the machines will take over, we should question the silence that follows the flight of their creators: if the builders no longer trust their own products, why should we?
Alessandro Mancini