Key Takeaways

- Claude Mythos Preview found 271 previously unknown Firefox vulnerabilities, some dating back 20 years
- Mozilla resolved 423 security issues in April, up from the previous record of 76 in March
- The agentic pipeline writes and runs its own test cases to verify bugs before reporting them
From 76 to 423: A Record Month for Firefox Security
Mozilla resolved 423 security issues in Firefox during April. That's not a typo. The previous monthly record was 76 in March. The difference? An agentic AI pipeline running Anthropic's Claude Mythos Preview.
Three Firefox developers detailed the results in a Mozilla Hacks blog post. Claude Mythos Preview alone found 271 previously unknown vulnerabilities in Firefox 150. Some of these bugs had lurked in the codebase for up to 20 years.
Beyond the 271 Mythos-discovered bugs, another 111 issues were found internally, and roughly one-third of those also came from Mythos runs. The rest of that group split between the same pipeline running other models and traditional testing methods like fuzzing. External reports accounted for just 41 of the 423 total (271 + 111 + 41 = 423).
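The breakdown can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in this article; the function and variable names are illustrative, and the "one-third" share is an approximation in the source, so the derived count is an estimate rather than a reported number.

```python
# Figures quoted in the article.
TOTAL_RESOLVED = 423      # security issues resolved in April
MYTHOS_FOUND = 271        # found by Claude Mythos Preview
EXTERNAL_REPORTS = 41     # reported from outside Mozilla
MARCH_RECORD = 76         # previous monthly record


def remaining_internal(total: int, mythos: int, external: int) -> int:
    """Issues that were neither Mythos-discovered nor externally reported."""
    return total - mythos - external


internal = remaining_internal(TOTAL_RESOLVED, MYTHOS_FOUND, EXTERNAL_REPORTS)

# "Roughly one-third" is approximate in the source, so this is an estimate.
approx_from_mythos_runs = round(internal / 3)

# How many times larger April was than the previous record.
growth_factor = TOTAL_RESOLVED / MARCH_RECORD

print(internal, approx_from_mythos_runs, round(growth_factor, 1))
```

The remainder matches the 111 figure in the text, and April comes out at a bit more than five and a half times the March record.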
Why Previous AI Bug-Finding Failed
A few months ago, AI-generated bug reports had a reputation problem. Developers dismissed them as slop: findings that sounded plausible but turned out to be wrong. Verification ate up developer time with nothing to show for it.
Mozilla's earlier attempts using GPT-4 and Claude 3.5 Sonnet in read-only mode failed for exactly this reason. Too many false positives. The models could spot patterns that looked like bugs but couldn't confirm whether they actually were.
According to Mozilla, two factors changed the equation: more capable models and better infrastructure for separating real findings from noise.
The Agentic Difference: AI That Tests Its Own Theories
The breakthrough came from giving the AI agency. Instead of just reading code and flagging suspicious patterns, the agentic pipeline lets Claude build and run its own test cases. If the AI suspects a bug exists, it writes code to prove it before reporting anything.
This self-verification step filters out speculation. The model can't just say "this looks wrong." It has to demonstrate that something actually breaks.
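The idea of a verification gate can be sketched in a few lines. This is a toy illustration, not Mozilla's implementation: `Candidate`, `clamp_index`, the two reproducers, and `confirmed_only` are all hypothetical names, and a real pipeline would presumably run test cases against an actual Firefox build in isolation rather than calling Python functions in-process.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """A suspected bug plus the test case meant to prove it (hypothetical shape)."""
    description: str
    reproducer: Callable[[], None]  # must raise if the bug is real


def clamp_index(i: int, length: int) -> int:
    """Toy target with a planted flaw: negative indices are not clamped."""
    return min(i, length - 1)


def repro_negative_index() -> None:
    # Demonstrates a real violation: the result falls outside [0, length).
    if not 0 <= clamp_index(-1, 4) < 4:
        raise RuntimeError("index escaped bounds")


def repro_large_index() -> None:
    # A plausible-sounding suspicion that does not hold up when executed.
    if not clamp_index(10, 4) < 4:
        raise RuntimeError("index escaped bounds")


def confirmed_only(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates whose reproducer actually fails when run."""
    kept = []
    for candidate in candidates:
        try:
            candidate.reproducer()
        except Exception:
            kept.append(candidate)  # the test case demonstrated a break
    return kept


candidates = [
    Candidate("negative index is not clamped", repro_negative_index),
    Candidate("large index overflows the buffer", repro_large_index),
]
reports = confirmed_only(candidates)
```

Only the first candidate survives, because its test case actually breaks something. The second "looks wrong" but is dropped before any developer sees it.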
Mozilla started small. The team ran Claude Opus 4.6 in manually supervised sessions, then scaled the process across many virtual machines. Each VM checks a single file in parallel. The pipeline they built around this deduplicates reports, prioritizes findings, and tracks fixes through to release.
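The fan-out, deduplicate, and prioritize flow described above might look roughly like the following. Everything here is an assumption made for illustration: a thread pool stands in for the virtual machines, `scan_file` returns canned data instead of running a model, and the `signature` and `severity` fields are invented, since the article does not say how Mozilla keys duplicates or ranks findings.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Finding:
    signature: str    # hypothetical dedup key, e.g. a normalized crash stack
    severity: int     # hypothetical scale: higher means more urgent
    source_file: str


# Canned results standing in for real per-file analysis output.
FAKE_RESULTS: Dict[str, List[Finding]] = {
    "a.cpp": [Finding("uaf-in-foo", 3, "a.cpp")],
    "b.cpp": [Finding("uaf-in-foo", 3, "b.cpp"), Finding("oob-in-bar", 5, "b.cpp")],
    "c.cpp": [],
}


def scan_file(path: str) -> List[Finding]:
    """Placeholder for one worker checking a single file."""
    return FAKE_RESULTS.get(path, [])


def run_pipeline(paths: List[str], max_workers: int = 4) -> List[Finding]:
    # One task per file; pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_file = list(pool.map(scan_file, paths))

    # Deduplicate: keep the first finding seen for each signature.
    unique: Dict[str, Finding] = {}
    for findings in per_file:
        for finding in findings:
            unique.setdefault(finding.signature, finding)

    # Prioritize: highest severity first, signature as a stable tie-breaker.
    return sorted(unique.values(), key=lambda f: (-f.severity, f.signature))


triaged = run_pipeline(["a.cpp", "b.cpp", "c.cpp"])
```

The same issue surfacing from two files collapses into one report, and the more severe finding is listed first. Tracking fixes through to release is left out of this sketch.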
Anthropic's Role in the Discovery
The collaboration between Mozilla and Anthropic started in February. Anthropic's Frontier Red Team reported an initial batch of vulnerabilities to Mozilla. That exchange led directly to the pipeline Mozilla is now using.
Claude Mythos Preview didn't just find new bugs. It also validated existing security defenses in Firefox, confirming that certain protective measures actually worked as intended.
What Happens Next
Mozilla plans to integrate this pipeline into its development workflow permanently. The goal: automatically check all new code before it gets committed. Every pull request would pass through the agentic security scanner before merging.
This represents a shift in how security testing might work at scale. Instead of periodic audits or relying on external bug bounty hunters, organizations could run continuous AI-powered security checks on every code change.
The Bigger Picture for Software Security
Firefox is a massive codebase with decades of history. The fact that 20-year-old bugs remained undiscovered until an AI found them suggests similar vulnerabilities exist in other large, long-lived software projects.
The jump from 76 resolved issues in March to 423 in April isn't just about finding more bugs. It suggests that agentic AI systems can now do useful security work at a pace human teams would struggle to match.
Frequently Asked Questions
What is Claude Mythos Preview?
Claude Mythos Preview is an AI model from Anthropic that Mozilla used in an agentic pipeline to discover security vulnerabilities. It can write and run its own test cases to verify suspected bugs.
How many Firefox vulnerabilities did the AI find?
Claude Mythos Preview found 271 previously unknown vulnerabilities in Firefox 150. Mozilla resolved a total of 423 security issues in April 2026.
What makes agentic AI different from regular AI bug detection?
Agentic AI can build and run its own test cases to verify whether a suspected bug actually exists. Earlier approaches just read code and flagged patterns, producing many false positives.
Will Mozilla use this AI system for all future code?
That is the stated plan. Mozilla intends to integrate the agentic pipeline so that all new code is checked automatically before it is committed to the Firefox codebase.
How old were some of the bugs the AI discovered?
Some vulnerabilities found by Claude Mythos Preview had existed in the Firefox codebase for up to 20 years without being detected.
Source: The Decoder / Maximilian Schreiner
Manaal Khan
Tech & Innovation Writer
Produced with AI assistance and reviewed by the Logicity editorial team.