All posts
AI & Machine Learning

Anthropic's Mythos Preview Chains Exploits Earlier AI Models Missed

Huma Shazia19 May 2026 at 5:18 pm4 min read
Anthropic's Mythos Preview Chains Exploits Earlier AI Models Missed

Key Takeaways

Anthropic's Mythos Preview Chains Exploits Earlier AI Models Missed
Source: The Decoder
  • Mythos Preview chains multiple small vulnerabilities into working exploits and writes proof-of-concept code autonomously
  • Cloudflare built a 50-agent harness with adversarial review to validate findings
  • The same AI capabilities that help defenders will soon be available to attackers

What Cloudflare Found Testing Mythos Preview

Cloudflare ran Anthropic's Mythos Preview through Project Glasswing, testing it against more than 50 of the company's own code repositories. The security-focused AI model did something earlier frontier models could not: chain multiple small vulnerabilities into working exploits.

Previous AI models found individual bugs. They sometimes delivered solid analysis of those bugs. But they stopped short of connecting the dots. They left exploit chains unfinished and could not prove whether a collection of small issues added up to something actually exploitable.

Mythos Preview closes that gap. The model writes, compiles, and runs proof-of-concept code on its own. It proves an exploit works rather than just suggesting it might.

Earlier frontier models found similar individual bugs and sometimes delivered solid analysis. But they fell short at stitching the pieces together, leaving chains unfinished and the question of actual exploitability open.

— Grant Bourzikas, Cloudflare CSO

Fewer False Positives, Faster Decisions

Security teams drown in speculative findings. A tool that flags 500 potential issues but cannot tell you which ones matter creates work, not value. Cloudflare reports that Mythos Preview produced fewer speculative findings than its predecessors.

The model also generated clearer steps to reproduce each issue. That matters because reproduction steps turn a vague warning into an actionable ticket. Teams need less human follow-up to reach a fix-or-dismiss decision.

This shifts the bottleneck. Instead of spending hours validating whether a finding is real, security engineers can spend that time fixing confirmed vulnerabilities.

Cloudflare's Multi-Agent Architecture

Cloudflare did not simply point Mythos Preview at code and wait for results. The company built a multi-stage harness that runs up to 50 parallel agents. This setup includes adversarial review: a second agent tries to disprove each finding.

The adversarial layer reduces false positives further. If one agent claims an exploit chain exists, another agent attempts to break that claim. Only findings that survive the challenge reach human reviewers.

This architecture reflects a broader pattern in AI deployment. Single agents are brittle. Multi-agent systems with built-in skepticism produce more reliable outputs.

The Attacker Side of the Equation

Cloudflare issued a warning alongside its findings. The capabilities that help defenders will be available to attackers too.

An AI that chains vulnerabilities and writes working exploit code is useful for red teams and security researchers. It is equally useful for malicious actors scanning for weaknesses in production systems.

This dual-use reality shapes how companies should think about AI security tools. Speed matters. Organizations that adopt these capabilities first gain a window to find and fix vulnerabilities before attackers weaponize the same technology against them.

ℹ️

Logicity's Take

What This Means for Security Teams

Security teams evaluating AI tools should look beyond simple vulnerability scanning. The question is no longer 'Can this AI find bugs?' Most frontier models can. The question is 'Can this AI prove exploitability and reduce my team's validation burden?'

Cloudflare's multi-agent approach also offers a template. Running adversarial review, where AI agents challenge each other's findings, may become standard practice for organizations deploying AI in security workflows.

Frequently Asked Questions

What is Anthropic's Mythos Preview?

Mythos Preview is a security-focused AI model from Anthropic that can chain multiple small vulnerabilities into working exploits and autonomously write, compile, and run proof-of-concept code to prove exploitability.

How does Mythos Preview differ from earlier AI security tools?

Earlier frontier models could find individual bugs but failed to connect them into working exploit chains. Mythos Preview completes these chains and proves they work with actual code, reducing speculative findings.

What is Cloudflare's Project Glasswing?

Project Glasswing is Cloudflare's testing initiative that evaluated Mythos Preview across more than 50 internal code repositories using a multi-agent architecture with up to 50 parallel agents and adversarial review.

Can attackers use the same AI capabilities?

Yes. Cloudflare warns that the same AI capabilities that help defenders find and chain vulnerabilities will be available to attackers, creating an arms race dynamic in cybersecurity.

What is adversarial review in AI security testing?

Adversarial review uses a second AI agent to attempt to disprove each finding from the first agent. Only findings that survive this challenge reach human reviewers, reducing false positives.

ℹ️

Need Help Implementing This?

Source: The Decoder / Maximilian Schreiner

H

Huma Shazia

Senior AI & Tech Writer