Mozilla Uses Claude AI to Find 271 Firefox Vulnerabilities

Key Takeaways

- Claude Mythos Preview found 271 previously unknown Firefox vulnerabilities, some of them up to 20 years old
- Mozilla resolved 423 security issues in April, up from the previous record of 76 in March
- The agentic pipeline writes and runs its own test cases to verify bugs before reporting them

From 76 to 423: A Record Month for Firefox Security
Mozilla resolved 423 security issues in Firefox during April. That's not a typo. The previous monthly record was 76 in March. The difference? An agentic AI pipeline running Anthropic's Claude Mythos Preview.
Three Firefox developers detailed the results in a Mozilla Hacks blog post. Claude Mythos Preview alone found 271 previously unknown vulnerabilities in Firefox 150. Some of these bugs had lurked in the codebase for up to 20 years.
The other 152 issues break down into 41 external reports and 111 internally found issues. Roughly one-third of those 111 also came from Mythos runs. The rest split between the same pipeline running other models and traditional testing methods like fuzzing. External reports, at 41 of 423, accounted for less than a tenth of the total.
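The breakdown is easy to check by hand. The short Python sketch below simply reruns the arithmetic on the figures reported above; the variable names are illustrative and not taken from Mozilla's post.

```python
# Figures as reported in the article; variable names are illustrative only.
total_resolved = 423        # security issues resolved in April
mythos_firefox_150 = 271    # found by Claude Mythos Preview in Firefox 150
external_reports = 41       # reported by people outside Mozilla
previous_record = 76        # March, the earlier monthly record

# Whatever is neither a Mythos finding nor an external report was found internally.
other_internal = total_resolved - mythos_firefox_150 - external_reports

# How much larger April was than the previous record month.
growth_factor = total_resolved / previous_record

# Share of the April total that came from external reporters.
external_share = external_reports / total_resolved

print(other_internal)                  # 111
print(round(growth_factor, 1))         # 5.6
print(round(external_share * 100, 1))  # 9.7
```

In other words, April saw about 5.6 times as many resolved issues as March, and external reports made up just under 10 percent of them.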
Why Previous AI Bug-Finding Failed
A few months ago, AI-generated bug reports had a reputation problem. Developers dismissed them as slop: findings that sounded plausible but turned out to be wrong. Verification ate up developer time with nothing to show for it.
Mozilla's earlier attempts using GPT-4 and Claude 3.5 Sonnet in read-only mode failed for exactly this reason: too many false positives. The models could spot patterns that looked like bugs but couldn't confirm whether they actually were.
According to Mozilla, two factors changed the equation: more capable models and better infrastructure for separating real findings from noise.
The Agentic Difference: AI That Tests Its Own Theories
The breakthrough came from giving the AI agency. Instead of just reading code and flagging suspicious patterns, the agentic pipeline lets Claude build and run its own test cases. If the AI suspects a bug exists, it writes code to prove it before reporting anything.
This self-verification step filters out speculation. The model can't just say "this looks wrong." It has to demonstrate that something actually breaks.
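The idea of "prove it before reporting it" can be illustrated with a toy filter. This is a minimal sketch, not Mozilla's pipeline: the `Candidate` class, the `confirmed` function, and the two example functions are all hypothetical, and a raised Python exception stands in for what would be a crash or similar failure in a real browser build.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """A suspected bug plus the test case written to prove it (hypothetical structure)."""
    title: str
    reproducer: Callable[[], None]  # expected to raise if the bug is real


def confirmed(candidates: List[Candidate]) -> List[str]:
    """Return only the candidates whose reproducer actually fails when run."""
    reports = []
    for candidate in candidates:
        try:
            candidate.reproducer()
        except Exception:
            # The test case demonstrated a failure, so the finding is reported.
            reports.append(candidate.title)
        # If nothing broke, the suspicion stays unreported.
    return reports


def buggy_get(items, index):
    return items[index]  # no bounds check


def safe_get(items, index):
    return items[index] if 0 <= index < len(items) else None


def repro_buggy():
    buggy_get([1, 2, 3], 5)  # raises IndexError


def repro_safe():
    safe_get([1, 2, 3], 5)  # returns None, nothing breaks


candidates = [
    Candidate("out-of-bounds read in buggy_get", repro_buggy),
    Candidate("suspected out-of-bounds read in safe_get", repro_safe),
]
reports = confirmed(candidates)
print(reports)  # ['out-of-bounds read in buggy_get']
```

Both functions "look" like they might read out of bounds, but only the one whose test case actually fails makes it into the report list. That is the filtering effect described above, in miniature.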
Mozilla started small. The team ran Claude Opus 4.6 in manually supervised sessions, then scaled the process across many virtual machines. Each VM checks a single file in parallel. The pipeline they built around this deduplicates reports, prioritizes findings, and tracks fixes through to release.
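The shape of that pipeline (one file per worker, then deduplication and prioritization) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Finding` fields, the `analyze_file` stand-in, the canned results, and the severity scale are hypothetical, and a thread pool stands in for the fleet of virtual machines.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    file: str
    signature: str  # hypothetical dedup key, e.g. a crash signature
    severity: int   # higher means more urgent (assumed scale)


# Canned results standing in for model runs; real findings would come from each VM.
FAKE_RESULTS = {
    "a.cpp": [Finding("a.cpp", "uaf-in-parse", 3)],
    "b.cpp": [Finding("b.cpp", "uaf-in-parse", 3), Finding("b.cpp", "int-overflow", 2)],
    "c.cpp": [],
}


def analyze_file(path):
    """Stand-in for one worker checking a single file."""
    return FAKE_RESULTS.get(path, [])


def run_pipeline(paths, workers=4):
    # Fan out: each file is checked independently and in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_file = list(pool.map(analyze_file, paths))

    # Deduplicate: keep the first report seen for each signature.
    unique = {}
    for findings in per_file:
        for finding in findings:
            unique.setdefault(finding.signature, finding)

    # Prioritize: most severe findings first.
    return sorted(unique.values(), key=lambda f: f.severity, reverse=True)


triaged = run_pipeline(["a.cpp", "b.cpp", "c.cpp"])
for finding in triaged:
    print(finding.severity, finding.signature, finding.file)
# 3 uaf-in-parse a.cpp
# 2 int-overflow b.cpp
```

The same issue surfacing from two different files collapses into a single report, which is the kind of noise reduction a deduplication stage is meant to provide. Tracking fixes through to release is left out of this sketch.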
Anthropic's Role in the Discovery
The collaboration between Mozilla and Anthropic started in February. Anthropic's Frontier Red Team reported an initial batch of vulnerabilities to Mozilla. That exchange led directly to the pipeline Mozilla is now using.
Claude Mythos Preview didn't just find new bugs. It also validated existing security defenses in Firefox, confirming that certain protective measures actually worked as intended.
What Happens Next
Mozilla plans to integrate this pipeline into its development workflow permanently. The goal: automatically check all new code before it gets committed. Every pull request would pass through the agentic security scanner before merging.
This represents a shift in how security testing might work at scale. Instead of periodic audits or relying on external bug bounty hunters, organizations could run continuous AI-powered security checks on every code change.
The Bigger Picture for Software Security
Firefox is a massive codebase with decades of history. The fact that 20-year-old bugs remained undiscovered until an AI found them suggests similar vulnerabilities exist in other large, long-lived software projects.
The jump from 76 resolved issues in March to 423 in April isn't just about finding more bugs. It demonstrates that agentic AI systems can now do useful security work at a pace human teams can't match.
Frequently Asked Questions
What is Claude Mythos Preview?
Claude Mythos Preview is an AI model from Anthropic that Mozilla used in an agentic pipeline to discover security vulnerabilities. It can write and run its own test cases to verify suspected bugs.
How many Firefox vulnerabilities did the AI find?
Claude Mythos Preview found 271 previously unknown vulnerabilities in Firefox 150. Mozilla resolved a total of 423 security issues in April 2026.
What makes agentic AI different from regular AI bug detection?
Agentic AI can build and run its own test cases to verify whether a suspected bug actually exists. Earlier approaches just read code and flagged patterns, producing many false positives.
Will Mozilla use this AI system for all future code?
That is the plan. Mozilla intends to integrate the agentic pipeline into its workflow so that all new code is checked automatically before it is committed to the Firefox codebase.
How old were some of the bugs the AI discovered?
Some vulnerabilities found by Claude Mythos Preview had existed in the Firefox codebase for up to 20 years without being detected.
Source: The Decoder / Maximilian Schreiner
Manaal Khan
Tech & Innovation Writer