AI Finds Open Source Bugs: Anthropic's Mythos Discovers 11


Anthropic’s latest AI tool, Claude Mythos Preview, has made a significant mark in the cybersecurity world. Released in April, this large language model (LLM) can autonomously find zero-day vulnerabilities and create working exploits. In a recent demonstration, the tool uncovered 11 bugs in open-source software, three of them critical, highlighting the growing potential of AI vulnerability discovery. These findings are part of Project Glasswing, a broader industry initiative bringing together a dozen major companies to use frontier AI platforms for cybersecurity defenses. The most notable discoveries are a 27-year-old vulnerability in OpenBSD, a 16-year-old bug in FFmpeg, and a complex exploit chain in the Linux kernel, showing that AI can find open-source security issues that have lingered for decades.


The 27-Year-Old OpenBSD Bug: A Legacy Vulnerability Unearthed

Imagine a security flaw quietly sitting in a piece of software for nearly three decades, unnoticed by human eyes. That is the scenario Anthropic’s Mythos uncovered when it found a critical vulnerability in OpenBSD that had existed for 27 years. OpenBSD is a free, Unix-like operating system known for its strong emphasis on security and code correctness. Because of that reputation, developers and security researchers have pored over its codebase for decades, yet this legacy bug remained hidden. With its ability to analyze code at scale, the AI system autonomously identified the flaw, work that would be exhausting for a human to replicate manually. The discovery highlights a sobering reality: even in open-source projects with thousands of contributors, old vulnerabilities can persist. When an AI finds security issues like this in open-source code, it adds a layer of protection that traditional code review struggles to provide. Why wasn’t such a bug caught earlier? The answer often comes down to the sheer volume of code and the subtlety of the flaw itself, which is easy to overlook unless you are looking for it. This particular OpenBSD vulnerability was a serious security risk, and it shows that even mature, well-maintained projects can harbor hidden dangers for decades.

The 16-Year-Old FFmpeg Flaw: A Media Library at Risk

That same pattern of long-hidden vulnerabilities shows up again in another critical open-source project. FFmpeg is a widely used multimedia framework that handles video, audio, and other media formats. You might not interact with it directly, but it powers countless applications and services you rely on every day. For 16 years, a serious flaw sat quietly inside its code, unnoticed through regular code reviews and updates. Claude Mythos Preview identified this FFmpeg vulnerability as one of the three critical bugs it found. The age of the bug is remarkable: it persisted for well over a decade despite frequent development activity. This is a clear case of AI finding open-source issues that human eyes have missed for years. For a library that processes untrusted media input, a long-standing bug can open the door to severe security risks, such as denial of service or memory corruption. The discovery underscores the value of automated, AI-driven analysis in revealing threats that have been silently embedded in foundational software.

The Linux Kernel Exploit Chain: A Complex Multi-Step Attack

Beyond the FFmpeg flaw, Mythos produced another serious finding: a sophisticated exploit chain within the Linux kernel. While many security bugs involve a single weak point, this discovery demonstrates that AI can find open-source vulnerabilities that require advanced reasoning. The exploit chain relied on several flaws working together, and uncovering them demanded a level of logical analysis that goes far beyond simple pattern matching. For you, this means understanding that the core of your operating system can be vulnerable to a multi-step attack in which each flaw on its own might seem harmless, but combined they create a serious threat.

Chaining these bugs together was not straightforward. It required Mythos to simulate how an attacker would progress from one vulnerability to the next. This kind of Linux kernel exploit scenario highlights how AI can find open-source threats that traditional scanning tools could easily miss. The discovery shows that modern AI can handle intricate security puzzles, piecing together steps that no single code check could flag. It is a powerful reminder that security is not just about individual mistakes, but about the complex interactions between them. The reasoning ability demonstrated here suggests a new frontier for protecting the foundational software you rely on every day.
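To make the idea of a multi-step chain concrete, here is a minimal toy sketch in Python. It is not how Mythos works and it does not describe the actual kernel findings, which the article does not detail. The bug names and attacker "states" are invented for illustration. Each hypothetical bug moves an attacker from one capability to the next, and a breadth-first search looks for a sequence that links an unprivileged start to a privileged goal.

```python
from collections import deque

# Hypothetical toy model: every name below is invented for illustration.
# Each bug maps (capability the attacker needs) -> (capability it grants).
BUGS = {
    "info_leak": ("unprivileged_user", "knows_kernel_addresses"),
    "bounds_error": ("knows_kernel_addresses", "kernel_memory_write"),
    "missing_check": ("kernel_memory_write", "root"),
}


def find_chain(start, goal, bugs):
    """Breadth-first search for the shortest list of bugs linking start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for name, (needs, grants) in bugs.items():
            if needs == state and grants not in seen:
                seen.add(grants)
                queue.append((grants, path + [name]))
    return None  # no sequence of bugs reaches the goal


chain = find_chain("unprivileged_user", "root", BUGS)
print(chain)  # ['info_leak', 'bounds_error', 'missing_check']

# Remove any single link and the chain no longer exists.
without_middle = {k: v for k, v in BUGS.items() if k != "bounds_error"}
print(find_chain("unprivileged_user", "root", without_middle))  # None
```

In this toy model no single bug reaches the goal on its own, which is why a check that examines one flaw at a time would not flag the combined risk.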

The Other 8 Bugs: Unspecified but Critical

So, you now know about the three high-profile vulnerabilities that Mythos exposed. But the title promises 11 bugs in total, which leaves eight more open-source bugs that remain undisclosed. These eight findings are not yet publicly detailed, meaning the full scope of this AI discovery is still unfolding. You might wonder why only three were described. The likely reason is that the other eight are still being coordinated with the respective project maintainers, a standard practice in responsible disclosure. It takes time for volunteer developers to review, patch, and release fixes without alerting malicious actors first.

What can you assume about these undisclosed vulnerabilities? Given the pattern of the three detailed bugs, it is reasonable to expect they affect other widely used open-source projects, such as libraries, tools, or operating systems. The fact that Mythos found them at all is a strong signal that AI can find open-source issues that human reviewers might overlook, especially in less popular but still critical projects. Until the details emerge, the practical takeaway is this: the era of relying solely on human code review for security is ending. Automated reasoning tools like Mythos are becoming a necessary part of the developer’s toolkit.

How Mythos Finds Bugs: Autonomous Zero-Day Discovery

Building on that shift, Mythos operates like a human expert but at machine speed. It autonomously scans code, identifies zero-day vulnerabilities, and generates working exploits — all without manual oversight. This autonomous vulnerability discovery process is key. According to Anthropic, their Claude Mythos Preview, launched in April, can match or surpass highly skilled human experts at finding and exploiting software flaws. That means it doesn’t just report bugs; it verifies them by creating functional exploits, a major step forward in zero-day detection.

The workflow is practical: you provide a codebase, and Mythos analyzes it using reasoning. It doesn’t rely on known signatures but looks for novel weaknesses, and by generating an exploit for each finding it confirms that the vulnerability is real and how severe it is. For the 11 open-source bugs discovered, Mythos demonstrated a deep understanding of complex software, suggesting that autonomous vulnerability discovery is now a viable complement to human review. This exploit-generation capability streamlines security testing, helping you catch critical flaws earlier.

Benchmark Performance: CyberGym and SWE-bench Results

These practical security gains are backed by strong numbers. When it comes to measuring how well an AI finds open-source vulnerabilities, standardized tests like CyberGym and SWE-bench offer a clear picture. On the CyberGym benchmark, which evaluates vulnerability discovery in realistic environments, Claude Mythos Preview scored 83.1 percent. That is a jump of 16.5 percentage points over the 66.6 percent achieved by Claude Opus 4.6. The difference indicates that Mythos doesn’t just understand security concepts; it can actively pinpoint weaknesses in code.

The SWE-bench results tell a similar story. This benchmark focuses on software engineering tasks, including bug identification and patch generation. Here, Mythos reached 93.9 percent, compared to 80.8 percent for its predecessor, a gain of 13.1 percentage points. These metrics suggest that the model’s ability to handle real-world coding challenges is not just theoretical. For you, this means a tool that can assist in finding and fixing flaws across complex projects. Whether you’re auditing your own code or reviewing third-party libraries, stronger benchmark results point toward more efficient and thorough security testing.
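For readers who want to check the comparison themselves, here is a short Python sketch that works out the gap between the reported scores. It uses only the four figures quoted above. The `compare` helper and the dictionary layout are illustrative choices, not anything published by Anthropic, and "predecessor" is kept as the article's own label for the SWE-bench baseline.

```python
# Scores in percent, exactly as reported in the article; nothing is measured here.
REPORTED = {
    "CyberGym": {"mythos": 83.1, "baseline": 66.6},   # baseline: Claude Opus 4.6
    "SWE-bench": {"mythos": 93.9, "baseline": 80.8},  # baseline: "its predecessor"
}


def compare(new, old):
    """Return (percentage-point gain, relative gain in percent), rounded to 0.1."""
    point_gain = round(new - old, 1)
    relative_gain = round((new - old) / old * 100, 1)
    return point_gain, relative_gain


for name, scores in REPORTED.items():
    points, relative = compare(scores["mythos"], scores["baseline"])
    print(f"{name}: +{points} points ({relative}% relative improvement)")
```

The two numbers answer different questions: the point gain is the raw difference between scores, while the relative gain expresses that difference as a share of the baseline score.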

Project Glasswing: A Controlled Testing Ground for AI Security

Building on the idea of structured testing, Project Glasswing takes AI security efforts to a new level of coordination. This broad industry initiative brings together a dozen major companies to explore how frontier AI platforms can bolster cybersecurity defenses. The controlled environment allows researchers and developers to test AI-driven security tools without risking real-world systems. Partners like Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, Microsoft, Nvidia, Palo Alto Networks, and The Linux Foundation contribute their expertise, creating a sandbox where AI can search for open-source vulnerabilities in a safe but realistic setting. For you, this means that the tools emerging from such collaborations are vetted before they reach your workflow.

What makes Project Glasswing especially valuable is its focus on practical outcomes. By simulating attack scenarios and defense mechanisms, it provides a testing ground that mirrors the complexity of modern software ecosystems. As AI models become more adept at identifying flaws in open source code, initiatives like this ensure that those capabilities are reliable and actionable. Whether you’re a developer or a security professional, the insights from Project Glasswing translate into better tools for finding and fixing vulnerabilities. It’s a clear example of how industry collaboration can accelerate secure software development without compromising safety.

The $12 Million Grant: Boosting Open-Source Security

That collaborative spirit extends beyond individual tools and into the financial backbone of open source. A new $12 million security grant has been awarded to Alpha-Omega and the Open Source Security Foundation (OpenSSF) specifically to address the fallout from automated vulnerability discovery. As AI tools for finding open-source bugs become more powerful, the volume of reported issues skyrockets. Open-source maintainers, often working on a volunteer basis, face an unprecedented influx of security findings generated by automated systems but lack the resources to remediate them effectively. This funding aims to bridge that gap, providing direct support for triaging and fixing critical vulnerabilities. For you, the end user, this means the open-source libraries and tools your favorite apps rely on can be updated faster and more reliably. The grant is a concrete acknowledgment that open-source funding needs to scale alongside the detection capabilities of modern security tools. It marks a move from simply finding bugs to actually fixing them, a crucial step in making the entire ecosystem safer for everyone.

Impact on Open-Source Maintainers: A Double-Edged Sword

But even with better funding for fixes, the sheer volume of bugs that AI finds in open-source code creates an entirely new bottleneck. Rapid advances in AI are increasing the speed and scale at which vulnerabilities in open-source software are discovered. The upside is enormous: as Greg Kroah-Hartman, a Linux kernel maintainer, noted, being able to find bugs and write patches faster is a huge positive development for open-source software. Yet the reality for many maintainers is a growing workload as they face an unprecedented influx of AI-reported bugs. Automated systems generate findings faster than human reviewers can triage, verify, and patch them. Without adequate patch management resources, those reported flaws risk sitting unaddressed. The result is a double-edged sword: better detection, but a critical need for scalable remediation efforts.

Ethical and Security Concerns: The Risks of AI-Generated Exploits

Creating functional exploits for discovered vulnerabilities goes beyond mere bug detection, and it raises concerns for good reason. When an AI finds flaws in open-source code and also generates working exploit code, it hands over powerful capabilities that can be used for both defense and offense. The line between security researcher and malicious actor becomes dangerously thin. If this technology falls into the wrong hands, the risks of AI-generated exploits become very real. A bad actor could use automated exploit generation to quickly weaponize vulnerabilities, skipping the ethical disclosure process entirely. The potential for misuse is alarming, which is why responsible development practices are non-negotiable. Anthropic has recognized this tension. It does not plan to make Claude Mythos Preview generally available due to these security risks. Instead, Project Glasswing will act as a controlled testing ground, limiting access to vetted security researchers. This cautious approach reflects a hard truth: the same AI that helps secure your software could just as easily be repurposed to break it. Keeping that power under lock and key is the only responsible path forward.

Responsible Disclosure and Future of AI in Security

That measured approach also applies to how vulnerabilities are reported. As AI finds open-source bugs, the question of who gets told and when becomes just as important as the detection itself. Right now, responsible disclosure processes for AI-discovered flaws aren’t fully defined. Natasha Woods of The Linux Foundation stated that the organization was still undergoing reviews and had no findings to report publicly, underscoring how early this territory still is. For you, that means the promise of AI finding bugs faster won’t matter unless a clear AI vulnerability reporting pipeline exists to get those fixes into your hands safely.

Anthropic isn’t rushing to release its tool widely. Because Claude Mythos Preview could be misused, the company does not plan to make it generally available. Instead, Project Glasswing will serve as a controlled testing ground where researchers can collaborate without handing dangerous capabilities to bad actors. The future of AI security depends on this kind of deliberate partnership — where discovery, disclosure, and patching happen in a trusted loop rather than a chaotic free-for-all. For everyday users, that means relying on open-source projects that embrace these practices will give you the best shot at secure software, now and down the road.

Frequently Asked Questions

How does Anthropic’s Claude Mythos Preview actually find and exploit zero-day vulnerabilities?

Claude Mythos Preview analyzes a codebase using reasoning rather than known signatures, which lets it identify novel weaknesses. It then generates a functional exploit to demonstrate each vulnerability and confirm its severity. This process helps developers understand the real-world risk before they release patches.

What are the 11 open-source bugs mentioned, and why are only 3 described?

The 11 bugs were found across open-source software. Only 3 are described in detail: the 27-year-old OpenBSD vulnerability, the 16-year-old FFmpeg flaw, and the Linux kernel exploit chain. The remaining 8 have not been publicly detailed, most likely because they are still being coordinated with the respective project maintainers under responsible disclosure.

Is it safe to let an AI create functional exploits—could this be misused?

Creating functional exploits through AI carries real risks, because the same capability can serve attackers as well as defenders. For that reason, Anthropic does not plan to make Claude Mythos Preview generally available, and Project Glasswing limits access to a controlled testing ground. This controlled process is meant to give developers time to fix issues before bad actors can use them.
