Anthropic's Project Glasswing: When AI Finds Zero-Days Faster Than Humans Can Count Them

lschvn · April 7, 2026 · 8 min read

It survived 27 years of human security research. Thousands of CVEs were filed and patched. Countless auditors, penetration testers, and independent researchers looked at the code. And then, in a matter of minutes, an AI model found a remote crash vulnerability in OpenBSD, one that required no authentication whatsoever. Within days, the exploit was written and the patch was live. That is the story of Claude Mythos Preview and Project Glasswing.

Anthropic didn't set out to build an offensive security model. The company says it was researching AI safety and model capabilities when researchers noticed something alarming: the same reasoning capabilities that make frontier models useful for software engineering also make them extraordinarily effective at finding, and exploiting, software vulnerabilities. Not in theory. In practice. Working exploits. Overnight.

The Discovery

The numbers from Anthropic's internal red team testing are stark. On CyberGym, a benchmark designed to test a model's ability to reproduce known vulnerabilities, Claude Mythos scored 83.1% against Claude Opus 4.6's 66.6%, a gap of 16.5 percentage points. On SWE-bench Verified, a test of genuine software engineering capability, Mythos hit 94.6% versus Opus 4.6's 91.3%. On Terminal-Bench, which measures a model's ability to operate in a shell environment and chain complex terminal operations, Mythos scored 92.1%. Taken together, the numbers describe a model that can reliably do what security researchers do (find bugs, understand code paths, and write working exploits) without being prompted to behave ethically or safely.
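To make the size of those gaps concrete, here is a minimal sketch that recomputes them from the scores quoted above. The "failure reduction" figure is an illustrative metric of our own, not one Anthropic reports, and it assumes each score can be read as a simple pass rate.

```python
# Benchmark scores as quoted in the article (percent).
SCORES = {
    "CyberGym": {"Mythos": 83.1, "Opus 4.6": 66.6},
    "SWE-bench Verified": {"Mythos": 94.6, "Opus 4.6": 91.3},
}


def point_gap(mythos: float, opus: float) -> float:
    """Absolute gap between the two scores, in percentage points."""
    return round(mythos - opus, 1)


def failure_reduction(mythos: float, opus: float) -> float:
    """Share of Opus 4.6's failed tasks that Mythos no longer fails.

    Illustrative metric only; assumes the scores are plain pass rates.
    """
    opus_fail = 100.0 - opus
    mythos_fail = 100.0 - mythos
    return round(1.0 - mythos_fail / opus_fail, 3)


for name, s in SCORES.items():
    gap = point_gap(s["Mythos"], s["Opus 4.6"])
    cut = failure_reduction(s["Mythos"], s["Opus 4.6"])
    print(f"{name}: +{gap} points, {cut:.1%} fewer failures")
```

Read this way, the 3.3-point difference on SWE-bench Verified is less modest than it looks: near the top of a benchmark, a few points can remove a large share of the remaining failures.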

Figure: Claude Mythos Preview vs Opus 4.6 on CyberGym and SWE-bench Verified

In one month of coordinated testing, Mythos found zero-day vulnerabilities across nearly every major operating system and browser in use today.

The oldest was a 27-year-old remote crash bug in OpenBSD that required no authentication. It had been sitting in the codebase since 1999, invisible to decades of human reviewers. The most technically alarming was a 16-year-old flaw in FFmpeg, the ubiquitous multimedia codec library used in everything from browsers to video conferencing platforms to mobile operating systems. Five million automated tests had run against FFmpeg's codebase over the years. Mythos found the bug anyway, without any special prompting or access beyond what any developer would have.

Anthropic's own engineers, non-experts with no formal security training, were given access and asked to see what Mythos could do overnight. They woke up to complete working exploits. Remote code execution. No guidance. No steering. The model had identified the vulnerability, understood the exploitation path, and written functional proof-of-concept code entirely on its own.

Other findings included a Linux kernel privilege escalation that allowed a standard user account to escalate to root, a browser JIT compiler heap spray combined with a sandbox escape (chaining four separate vulnerabilities together), and a FreeBSD NFS remote root exploit built on a 20-gadget ROP (Return-Oriented Programming) chain. Every major browser and operating system was affected. All vulnerabilities have been disclosed and patched.

The Coalition

On April 7, 2026, Anthropic announced Project Glasswing, a formal industry coalition designed to act on what the red team had found. Twelve founding partners anchor the initiative: AWS, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Apple is the most conspicuous name on the list, a company that has been notably restrained in its public positioning on artificial intelligence, yet which is now formally part of the most significant AI-cybersecurity partnership in the industry's history.

Anthropic is committing $100 million in API credits to the effort and an additional $4 million in direct funding to open-source security organizations, including the Alpha-Omega Project, the Open Source Security Foundation (OpenSSF), and the Apache Software Foundation. Post-announcement, the Glasswing API is priced at $25 per million input tokens and $125 per million output tokens, a deliberately accessible price point designed to get vulnerable open-source projects covered, not to maximize revenue.
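For a rough sense of what those prices buy, the sketch below estimates the cost of a single audit and how many such audits the credit pool could cover. It treats the $25 rate as the price for input (context) tokens and the $125 rate as the price for generated tokens, which is an interpretation of the article's wording. The token counts are hypothetical and chosen only for illustration.

```python
INPUT_USD_PER_MTOK = 25.0       # assumed: rate for tokens the model reads
OUTPUT_USD_PER_MTOK = 125.0     # assumed: rate for tokens the model generates
CREDIT_POOL_USD = 100_000_000   # API credit commitment stated in the article


def scan_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one run, billed per million tokens."""
    return (
        (input_tokens / 1_000_000) * INPUT_USD_PER_MTOK
        + (output_tokens / 1_000_000) * OUTPUT_USD_PER_MTOK
    )


def scans_covered(cost_per_scan: float, pool: float = CREDIT_POOL_USD) -> int:
    """Number of whole runs the credit pool pays for at a given cost."""
    return int(pool // cost_per_scan)


# Hypothetical audit: 40M tokens of source read, 2M tokens of findings written.
cost = scan_cost_usd(40_000_000, 2_000_000)
print(f"one audit: ${cost:,.2f}")
print(f"audits covered by the credit pool: {scans_covered(cost):,}")
```

Under these assumed token counts, one audit costs $1,250 and the pool covers 80,000 of them. Real usage would vary widely with codebase size and how many passes the model makes.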

Partners are not just receiving access to Mythos. They are contributing findings, coordinating disclosure, and integrating the model into their own security pipelines. More than 40 additional organizations had received or been extended access to the system by launch, ranging from academic security labs to mid-sized software companies with critical infrastructure exposure.

The Attack Timeline Collapsed

The window between vulnerability discovery and exploit availability has historically been measured in months. Security researchers find a bug, verify it, write a proof-of-concept, coordinate with the vendor, and wait for a patch, a process that routinely stretches across 90 to 180 days even under coordinated disclosure programs. Mythos doesn't follow that timeline. It found, verified, and produced working exploit code for multiple critical vulnerabilities in a single session.

This is not a theoretical acceleration. It is a practical collapse of the threat window. Defenders who previously had months to patch now have minutes, or at best, hours, before a capable actor could produce an operational exploit. The dual-use nature of the technology means this capability is not confined to a research environment. It is live, accessible, and scalable.
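The collapse can be put in numbers with a small sketch. It compares the 90-to-180-day disclosure-to-patch cycle described above with an exploit that appears within hours. The six-hour exploit lead time is a hypothetical value picked for illustration, not a figure from Anthropic.

```python
from datetime import timedelta


def exposure_window(patch_lead: timedelta, exploit_lead: timedelta) -> timedelta:
    """Time during which a working exploit exists but no patch does.

    Both lead times are measured from the moment the bug is discovered.
    Returns zero if the patch ships before the exploit is ready.
    """
    return max(patch_lead - exploit_lead, timedelta(0))


# Hypothetical: an exploit is ready six hours after discovery.
exploit_lead = timedelta(hours=6)

for days in (90, 180):
    window = exposure_window(timedelta(days=days), exploit_lead)
    print(f"{days}-day patch cycle -> exposed for {window}")
```

With a 90-day cycle and a six-hour exploit, the exposure window is 89 days and 18 hours, almost the entire cycle. Shortening the patch side of the equation is the only lever defenders control.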

What It Means for the Industry

For operators of AI agents, including platforms like OpenClaw that give autonomous systems persistent access to files, code, and network resources, the implications are direct. AI agents are not just productivity tools. They are execution environments. If a model like Mythos can find and exploit vulnerabilities autonomously, then any sufficiently capable model operating in a permissive environment is potentially a vector for both offensive and defensive action. The same capability that patches your system can, in the wrong context or with the wrong prompting, attack it.

For security teams, the picture is more complex. AI-assisted vulnerability finding is now faster than human-led penetration testing. Organizations that integrate these capabilities into their red team operations will find more bugs, faster. But so will their adversaries. The offense-defense race has tilted in a direction that demands new governance frameworks, not just for models like Mythos, but for the broader ecosystem of capable AI systems that will follow.

Anthropic is not releasing Mythos publicly, citing the clear and immediate risk of non-experts using it to generate working exploits without disclosure infrastructure. The model is available to Glasswing partners and a curated set of open-source maintainers. That decision reflects genuine seriousness about the dual-use problem. But it does not resolve the underlying dynamic: the attack timeline has collapsed, the tools are real, and the industry is only beginning to understand what that means.

Caveats: Only a subset of the vulnerabilities discovered by Mythos has been publicly disclosed; Anthropic estimates that more than 99% of the model's findings remain undisclosed pending coordinated vendor patches. The model is not publicly available. Its effectiveness depends partly on access to source code and binaries; it performs less reliably against black-box targets with no code visibility. These factors limit independent verification of some claims in this report.


