Cybersecurity researchers aren’t happy about Anthropic’s Fable guardrails

🚀 Discover this trending post from TechCrunch 📖

📂 **Category**: AI,Security,ai safety,Anthropic,cybersecurity,fable,Mythos

Anthropic released its latest Fable model on Tuesday, describing it as a generic, limited version of the powerful and controversial Mythos cybersecurity model.

But not everyone is happy with the restrictions, and a number of researchers and cybersecurity professionals have posted complaints online.

“[Fable] Any request that could be tangentially cyber-related is rejected. Even tasks as innocuous as reading a blog post,” said Valentina “Chompi” Palmiotti, a well-known security researcher who works at IBM X-Force.

When a message triggers its guardrails, Fable pauses the chat and says its “safety procedures have limited this message to cybersecurity or biology topics.”

The guardrails are in place to reduce the risk of Fable being used to develop malware or hacking software – a long-standing concern within Anthropic. The restrictions on biology come from similar concerns about the development of biological weapons.

When the AI giant released Mythos in April, it restricted the model to a limited number of companies and organizations in what it called Project Glasswing, an effort to deploy the model to secure critical software and infrastructure. Last week, Anthropic expanded access to Mythos to hundreds of organizations in 15 countries.

But despite the good intentions, many cybersecurity experts remain troubled by the arbitrary nature of the restrictions. “If you ask it to write secure code, it assumes this is cybersecurity work rather than software engineering best practices, and you get downgraded,” Matt Suiche, a veteran cybersecurity expert, told TechCrunch. Fable is programmed to revert to Claude Opus 4.8 if it hits a guardrail. “It seems to be keyword-based, so anything in the lexical domain of ‘cybersecurity’ triggers guardrails.”

Contact us

Do you have more information about how hackers use AI? Or how cybersecurity companies use artificial intelligence? We would love to hear from you. From a device and network outside of work, you can contact Lorenzo Franceschi-Bicchierai securely on Signal at +1 917 257 1382, via Telegram and Keybase @lorenzofb, or email.

“But it’s understandable because we’re still in the early days and they’re still adapting their guardrails,” said Suiche, a member of the technical staff at Tolmo, an AI cybersecurity startup. “I’m sure they will evolve over time as Anthropic and other model companies collaborate more with the current new generation of cybersecurity companies. It is better to catch more people than not catch enough when you do a release like this and relax the guardrails over time.”

Another researcher took to X, saying that “even requesting a code review” triggers Fable’s guardrails.

Anthropic did not immediately respond to a request for comment.

Aside from the guardrails within its models, Anthropic requires cybersecurity professionals to apply to its cyber verification program. If they receive approval, applicants will have fewer restrictions on using Claude for cybersecurity work. OpenAI has a similar program called Trusted Access for Cyber.


🕒 **Posted on**: 1781148340
