Anthropic’s safety warnings may have backfired, with the government shutting down its most powerful AI software

💥 Discover this trending post from TechCrunch 📖

📂 **Category**: AI

📌 **What You’ll Learn**:

The US government on Friday ordered Anthropic to immediately close access to two of its most powerful AI models – Claude Fable 5 and Claude Mythos 5 – due to national security concerns. Anthropic announced on X that it had complied, but made clear that it believed the government had gotten the order wrong.

The directive, which Anthropic said it received on Friday at 5:21 p.m. ET, forces the company to disable both models for all users worldwide — not just the foreign nationals the government’s export control order was nominally targeting. Access to other Anthropic models is not affected.

Why does any of this matter? Mythos is Anthropic’s most capable AI model, one that the company previewed in early April and has remained severely limited since then by what Anthropic called its exceptional ability to find security vulnerabilities in software. According to Anthropic, Mythos identified flaws in every major operating system and web browser it tested, so instead of releasing it broadly, the company launched a controlled program called Project Glasswing, and shared it with nearly 50 vetted organizations, including Amazon, Apple, Google, Microsoft, and CrowdStrike, for use in defensive cybersecurity work.

Fable 5, released just three days ago, was Anthropic’s answer to apparent commercial pressure: a version of Mythos fitted with a guardrail that prevented responses in high-risk areas like cybersecurity and biology, making it safe enough for general release, the company claimed. It was immediately the most capable AI model available to the public, according to benchmark tests conducted by Vals AI, a company that tracks the performance of AI technology.

The government’s guidance was put in place as an export control measure, restricting foreign nationals’ access to the forms. But in a lengthy blog post, Anthropic says its understanding is that the primary concern is the claims of jailbreaking Fable 5. The company says the government has so far provided only verbal evidence of a “potentially limited, non-universal jailbreak” — which, as Anthropic describes, amounts to prompting the model to read a specific code base and identify software flaws. Incidentally, the company adds, it’s a “level of capability” that’s already widely available in other publicly accessible models, including OpenAI’s GPT-5.5. Anthropic says it is also routinely used by cybersecurity professionals for defensive purposes.

Anthropic’s broader argument is that its strongest safeguards operate through independent rating systems that operate separately from the model itself, meaning that even if someone convinces Fable to continue talking after a rejection, basic protection against riskier outputs remains in place.

Clearly, none of this was enough to stop the government from acting, and Anthropic makes no secret of her frustration. “We disagree that finding a potentially limited jailbreak should be a reason to remind a widespread business model for hundreds of millions of people,” the company wrote. “If this standard were implemented across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.”

Anthropic is widely expected to seek an IPO this year, and has staked much of its public identity on being the safety-conscious alternative to its competitors. The irony is not lost on observers that the extreme caution Anthropic showed in restricting Mithos — which it promoted as a model too dangerous to be published publicly — now appears to have attracted exactly the kind of government scrutiny that could most disrupt its work.

OpenAI’s Sam Altman must be at least enjoying this. In April, he told broadcaster Ashley Vance that Anthropic’s handling of Mythos amounted to “fear-based marketing.” “It’s obviously an incredible marketing thing to say, ‘We made a bomb. We were about to drop it on your head. We’ll sell you a bomb shelter for $100 million,'” Altman said. Altman, whose company is widely expected to seek an IPO as soon as possible, did not predict a government shutdown, but he did identify something that has come back to bite Anthropic right now, which is that when you spend months telling the world that your AI is uniquely dangerous, the world — including the U.S. government — tends to listen.

When you buy through links in our articles, we may earn a small commission. This does not affect our editorial independence.

🔥 **What’s your take?**
Share your thoughts in the comments below!

#️⃣ **#Anthropics #safety #warnings #backfired #government #shutting #powerful #software**

🕒 **Posted on**: 1781321174

🌟 **Want more?** Click here for more info! 🌟

Anthropic’s safety warnings may have backfired, with the government shutting down its most powerful AI software

By

Leave a Reply Cancel reply