An initial update \ Anthropic

✨ Discover this insightful post from Hacker News 📖

📂 **Category**:

📌 **What You’ll Learn**:

Last month, we launched Project Glasswing, our collaborative effort to secure the world’s most critical software before increasingly capable AI models can be turned against it.

Since then, we and our approximately 50 partners have used Claude Mythos Preview to find more than ten thousand high- or critical-severity vulnerabilities across the most systemically important software in the world. Progress on software security used to be limited by how quickly we could find new vulnerabilities. Now it’s limited by how quickly we can verify, disclose, and patch the large numbers of vulnerabilities found by AI.

In this post, we discuss what we’ve learned about this critical challenge for cybersecurity in the first weeks of Project Glasswing. We focus on the early public evidence of Mythos Preview’s performance, on the initial results of our effort to scan thousands of open-source software projects, and on what this progress means for cyberdefenders today. We also cover what to expect next from Project Glasswing, and how we’re thinking about releasing Mythos-class models in the future.

Our early results

Our approach to discussing Mythos Preview’s findings

The software industry’s longstanding convention is to disclose new vulnerabilities 90 days after they’re discovered (or, if a patch is created before the 90 days is up, around 45 days after the patch becomes available). This allows time for end users to update their software before a vulnerability can be exploited by attackers. Our own Coordinated Vulnerability Disclosure policy takes this approach.

However, this means that disclosed vulnerabilities are a lagging indicator of the accelerating frontier of AI models’ cyber capabilities: we’re not yet at the point where we can fully detail our partners’ findings with Mythos Preview without putting end users at risk. Instead, we provide illustrative examples of the model’s performance, along with aggregate statistics on our progress to date. Once patches for the vulnerabilities that Mythos Preview has discovered are widely deployed, we’ll provide much more detail about what we’ve learned.

Evidence from our partners and external testers

Project Glasswing’s initial partners build and maintain software that is fundamental to the functioning of the internet and other essential infrastructure. Fixing flaws in their code reduces risk for the many other organizations that rely on it, and therefore reduces risk for billions of end users.

After one month, most partners have each found hundreds of critical- or high-severity vulnerabilities in their software. Collectively, they’ve found more than tens thousand. Several have told us that their rate of bug-finding has increased by more than a factor of ten. For instance, Cloudflare has found 2,000 bugs (400 of which are high- or critical-severity) across their critical-path systems, with a false positive rate that Cloudflare’s team considers better than human testers.

This tallies with external testers’ experience of Mythos Preview’s performance, and with recent additional evaluations of the model:

The UK’s AI Security Institute reports that Mythos Preview is the first model to solve both of their cyber ranges (simulations of multistep cyberattacks) end to end;
Mozilla found and fixed 271 vulnerabilities in Firefox 150 while testing Mythos Preview—over ten times more than they found in Firefox 148 with Claude Opus 4.6;
XBOW, an independent security platform, reports that Mythos Preview is a “significant step up over all existing models” on its web exploit benchmark, and provides “absolutely unprecedented precision” on a token-for-token basis;
ExploitBench and ExploitGym, two recently released academic benchmarks for measuring models’ exploit development capabilities, show Mythos Preview as the strongest performer. We discuss what these benchmarks tell us about the model in more detail on our Frontier Red Team blog.

More generally, we’re now seeing that patched software is being rolled out much more quickly. The latest Palo Alto Networks release included over five times as many patches as usual. Microsoft has reported that the number of new patches they’ll release will “continue trending larger for some time.” And Oracle is finding and fixing vulnerabilities across its products and cloud multiple times faster than before.

Mythos Preview has also proved useful for other kinds of security work. For example, at one of our Glasswing partner banks, Mythos Preview helped to detect and prevent a fraudulent $1.5 million wire transfer after a threat actor compromised a customer’s email account and made spoof phone calls.

Open-source software

For the last few months, Anthropic has used Mythos Preview to scan more than 1,000 open-source projects, which collectively underpin much of the internet—and much of our own infrastructure.

So far, Mythos Preview has found what it estimates are 6,202 high- or critical-severity vulnerabilities in these projects (out of 23,019 in total, including those it estimates as medium- or low-severity).

1,752 of those high- or critical-rated vulnerabilities have now been carefully assessed by one of six independent security research firms, or in a small number of cases by ourselves. Of these, 90.6% (1,587) have proved to be valid true positives, and 62.4% (1,094) were confirmed as either high- or critical-severity. That means that even if Mythos Preview finds no further vulnerabilities, at our current post-triage true-positive rates, it’s on track to have surfaced nearly 3,900 high- or critical-severity vulnerabilities in open-source code—in addition to those it has found for Project Glasswing’s partners. To be clear, we intend to continue scanning open-source code for some time, so we expect this number to rise.

One example of an open-source vulnerability that Mythos Preview detected was in wolfSSL, an open-source cryptography library that’s known for its security and is used by billions of devices worldwide. Mythos Preview constructed an exploit that would let an attacker forge certificates that would (for instance) allow them to host a fake website for a bank or email provider. The website would look perfectly legitimate to an end user, despite being controlled by the attacker. We’ll release our full technical analysis of this now-patched vulnerability (assigned CVE-2026-5194) in the coming weeks.

As we noted above, the bottleneck in fixing bugs like these is the human capacity to triage, report, and design and deploy patches for them. Finding them in the first place has become vastly more straightforward with Mythos Preview. We’ve created a dashboard of the open-source vulnerabilities we’ve scanned, below, which shows the different steps in our disclosure process and will track our progress over time. This shows vulnerabilities of all severity levels, rather than only the subset initially assessed as high- or critical-severity by Mythos Preview. Note the steep drop-off at each phase, reflecting the amount of human effort required to verify and fix each of the vulnerabilities.

Our dashboard of open-source vulnerabilities, showing vulnerabilities of all severities (rather than only those estimated high- or critical-severity by Mythos Preview).

Our process for triaging vulnerabilities is intensive. First, we or one of the external security firms we work with reproduce the issue that Mythos has found and re-assess its severity. Once we’ve confirmed that a vulnerability is real, we check for whether there are already fixes in place, and write a detailed report to the software’s maintainers. We take considerable care here: on top of the regular challenges of maintaining open-source software, maintainers have been facing a deluge of low-quality, AI-generated bug reports. Indeed, several maintainers have told us they’re currently severely capacity constrained, and some have even asked us to slow down our rate of our disclosures because they need more time to design patches. (On average, a high- or critical-severity bug found by Mythos Preview takes two weeks to patch.)

On maintainers’ request, we sometimes disclose bugs directly, without further assessment. We’ve now reported 1,129 such unvetted bugs, of which Mythos Preview estimated that 175 were high- or critical-severity.

We estimate that we’ve disclosed 530 high- or critical-severity bugs to maintainers so far. This is based on Claude’s assessment of severity in the case of direct disclosures, and maintainers’ or our security partners’ assessment where available. There are a further 827 confirmed vulnerabilities (estimated as high- or critical-severity in the same manner) that we’re aiming to disclose as quickly as possible.

75 of the 530 high- or critical-severity bugs we’ve reported have now been patched, and 65 of those have been given public advisories. The number of patches is still relatively low for three reasons. First, we’re still early in the 90-day window that’s set out in our Coordinated Vulnerability Disclosure policy: we expect many more patches to land soon. Second, we are likely to be undercounting patches because some vulnerabilities are patched without a public advisory: in those cases, we’re reliant on scanning for the patches ourselves using Claude. Third, the low volume of patches reflects a genuine problem: even at our relatively slow pace of disclosures, Mythos Preview is adding to an already-overloaded security ecosystem.

The relative ease of finding vulnerabilities compared with the difficulty of fixing them amounts to a major challenge for cybersecurity. Confronting this challenge successfully will make our software far safer than before. Below we discuss some ways that cyber defenders can adapt.

Adapting to a new phase of cybersecurity

Models with similar cybersecurity skills to Mythos Preview will soon be more broadly available. There is a clear need for a larger effort across the software industry to manage the volume of findings that these models will generate.

Currently, there’s often a long lag between the discovery of a vulnerability, the creation of a patch for it, and the time when the patch is widely deployed by end users. This leaves open a significant window for attackers to exploit critical software. Mythos-class models significantly shrink the time and cost required to find and exploit vulnerabilities, magnifying the risk associated with these time lags. Ultimately, Mythos-class models will enable developers to build far more secure software by catching bugs before they are deployed. But this interim period—while vulnerabilities are being rapidly discovered and slowly patched—presents new risks.

Software developers and users should act now to reduce their exposure to these risks. The advice below is not new, and many researchers (including at Anthropic) are currently working on better and more durable solutions. In the meantime, it’s important to get the basics right:

Software developers should shorten their patch cycles and make security fixes available as quickly as possible. The thoughtful use of publicly-available AI models can help here; we’re building tools and sharing our research to support this (more details below). Developers should also help their users stay up-to-date with their software by making it as easy as possible to install updates; to the extent feasible, they should be more persistent with users who are still running software with known vulnerabilities.
Network defenders should shorten their patch testing and deployment timelines. The critical controls laid out by organizations like the National Institute of Standards and Technology and the UK’s National Cyber Security Centre are now all the more important, since they improve security without depending on any single patch landing in time. These include steps like hardening networks’ default configurations, enforcing multi-factor authentication, and keeping comprehensive logs for detection and response.

Tools for cyberdefence with publicly available AI models

Many generally-available models can already find large numbers of software vulnerabilities, even if they can’t find the most sophisticated vulnerabilities or exploit them as effectively as Claude Mythos Preview. Project Glasswing has already spurred many other organizations to take action on their own codebases with these generally-available models; we’re working to make this much easier to do.

To begin, we’ve released Claude Security in public beta for Claude Enterprise customers. It’s a tool that helps teams scan their codebases for vulnerabilities, and which can generate proposed fixes for them. In the three weeks since launch, Claude Opus 4.7 has been used to patch over 2,100 vulnerabilities. (This is faster than the open-source patching described above in large part because enterprises are fixing their own code, whereas open-source fixes usually require volunteer maintainers who work through coordinated disclosure.)

We’ve also begun our Cyber Verification Program, which allows security professionals using our models for legitimate cybersecurity purposes (such as vulnerability research, penetration testing, and red-teaming) to do so without certain safeguards designed to prevent cyber misuse.

Now, we’re making the tools that we and our partners have used with Mythos Preview available to qualifying customers’ security teams on request. Our aim is to make it much easier to get the best performance out of highly capable public models without extensive setup. This release includes:

The skills (custom instructions for repeated work) that we and our partners have built and shared;
A harness that helps Claude map the codebase, spin up scanning subagents, triage its findings, and write reports;
A threat model builder, which maps a codebase to identify potential targets for attack and prioritizes the model’s work accordingly.

Cisco, one of our Project Glasswing partners, has also recently open-sourced its Foundry Security Spec to help other defenders build an evaluation system similar to the one they use themselves.

Supporting the ecosystem

We’ve formed a partnership with the Open Source Security Foundation’s Alpha-Omega project, which will support the foundation’s efforts to assist maintainers in processing and triaging bug reports. We’re also continuing to publish research into how frontier model capabilities can best support cyberdefenders.

We’ve also supported the development of ExploitBench and ExploitGym, the two new benchmarks that allow researchers to track frontier AI models’ exploit development capabilities over time, as we discuss here. We’re supporting the development of other high-quality quantitative benchmarks through our External Researcher Access Program. Finally, Claude for Open Source supports maintainers and contributors, and we’re committing to scan any open-source package that we adopt ourselves in the future.

What’s next for Project Glasswing

The speed of AI progress means that models as capable as Mythos Preview will soon be developed by many different AI companies. At present, no company—including Anthropic—has developed safeguards strong enough to prevent such models from being misused and potentially causing severe harm. That is why we have yet to release Mythos-class models to the public. But it’s also why we began Project Glasswing: if a similarly capable model is released without such safeguards, it will soon become dramatically cheaper and easier for almost anyone in the world to exploit flawed software.

Glasswing helps the most systemically important cyber defenders gain an asymmetric advantage. However, there is an urgent need for as many organizations as possible to shore up their cyber defences. We hope that our generally-available models, and the new tools, resources, and research we’re providing to accompany them, will support those organizations to improve their cybersecurity posture.

Next, we will work with critical partners—including US and allied governments—to expand Project Glasswing to additional partners. And in the near future, once we’ve developed the far stronger safeguards we need, we look forward to making Mythos-class models available through a general release.

On the far side of these risks, there’s an encouraging world available to us: one in which important code is hardened far better than it is today, and in which hacking is far less prevalent. There are many obstacles, but we’re nonetheless confident that Project Glasswing can help get us there.

🔥 **What’s your take?**
Share your thoughts in the comments below!

#️⃣ **#initial #update #Anthropic**

🕒 **Posted on**: 1779478976

🌟 **Want more?** Click here for more info! 🌟