Anthropic ‘Mythos’ Model: Security Concerns & Responses

  • Claude Mythos Preview is the first AI model confirmed to complete a 32-step cyberattack simulation created by the UK’s AI Security Institute — succeeding 3 out of 10 attempts.
  • Anthropic has committed up to $100M in usage credits and $4M in direct donations to help secure critical software infrastructure through Project Glasswing.
  • Mythos can autonomously discover zero-day vulnerabilities, reverse-engineer exploits from closed-source binaries, and execute JIT heap spray attacks across major web browsers.
  • Major financial institutions including Goldman Sachs are already stress-testing their systems against Mythos-class capabilities — and the results are raising alarms.
  • The UK’s AISI has warned that future models will only surpass Mythos — meaning the window to build robust cyber defenses is narrowing fast.

AI just crossed a line that cybersecurity professionals have been dreading for years.

Claude Mythos Preview is a general-purpose, unreleased frontier model from Anthropic, and it represents something genuinely new: an AI system capable of performing sophisticated, multi-step cyberattacks at a level that surpasses all but the most elite human security researchers. This isn’t a theoretical risk — it’s a documented capability that has already triggered responses from governments, major banks, and the open-source security community. Anthropic itself has been unusually candid about what Mythos can do, which is itself a signal worth paying attention to.

Anthropic, the company behind the Claude family of AI models, launched Project Glasswing specifically in response to the capabilities observed in Mythos Preview — a rare instance of an AI lab publicly acknowledging that its own model poses serious dual-use risks. Understanding what those risks are, and what’s being done about them, matters for anyone who works with, builds on, or depends on software infrastructure.

Anthropic’s Mythos Model Is a Cybersecurity Watershed Moment

The cybersecurity community has long debated when — not if — an AI model would become capable enough to meaningfully lower the barrier to executing complex cyberattacks. With Mythos Preview, that moment has arrived. What makes this significant isn’t just the raw capability of the model, but how that capability was measured, confirmed, and then communicated publicly by Anthropic itself.

Tasks that would normally take a skilled human security professional several days to complete are now being executed by Mythos in a fraction of that time. The model completed a 32-step cyberattack simulation developed by the UK’s AI Security Institute (AISI), succeeding in 3 out of 10 attempts — a pass rate that sounds modest until you understand the complexity of the simulation involved. This is not a proof-of-concept demo. These are structured, adversarial evaluations designed to test real-world attack readiness.

AI Has Crossed a Critical Threshold in Exploit Development

The core issue is that Mythos doesn’t just assist with cyberattacks — it can plan and execute them autonomously across multiple steps. This is the threshold that separates a useful coding assistant from something that fundamentally changes the threat landscape. Previous frontier models could help an attacker write malicious code, but they couldn’t reliably chain together the reconnaissance, vulnerability identification, exploit development, and delivery stages that a real attack requires.

Mythos Preview changes that equation. According to Anthropic’s own documentation around Project Glasswing, the model has reached a level of coding capability where it can surpass all but the most advanced human security researchers in specific exploit development tasks. That’s a remarkable and sobering admission from the lab that built it.

Why Security Experts Call This a “New Equilibrium”

Before Mythos, the asymmetry in cybersecurity slightly favored defenders — attackers needed rare, expensive human expertise to develop truly sophisticated exploits, while defenders could rely on automated scanning tools and patch management to close most gaps. Mythos disrupts that balance. When an AI can autonomously work through the full attack chain, the cost of a sophisticated attack drops dramatically, while the cost of defense stays the same or increases. Security professionals are calling this a “new equilibrium” — a reset of baseline assumptions about what threat actors are capable of, regardless of their technical skill level.

What Makes Mythos Different From Previous AI Models

It’s worth being precise here, because not all AI security risks are the same. What distinguishes Mythos from earlier models isn’t just benchmark performance — it’s the specific categories of offensive security tasks it can now perform reliably and autonomously.

Zero-Day Vulnerability Discovery in Real Codebases

Mythos Preview has demonstrated the ability to identify previously unknown vulnerabilities — zero-days — in real production codebases. This is not fuzzing or pattern matching against known vulnerability signatures. The model can reason about code logic, identify edge cases that human developers missed, and assess whether those edge cases are exploitable. Applied defensively, this is extraordinarily powerful. Applied offensively, it’s a serious threat to any organization running software that hasn’t been subjected to this level of scrutiny.
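To make the distinction between pattern matching and reasoning concrete, here is a minimal, entirely hypothetical Python sketch of the class of flaw involved: a bounds check that looks correct, matches no known vulnerability signature, and still leaks data on an edge case a reviewer could easily miss.

```python
def extract_field(buf: bytes, offset: int, length: int) -> bytes:
    """Return `length` bytes of `buf` starting at `offset` (buggy version)."""
    # The check looks airtight, but a negative offset satisfies it
    # while silently re-aiming the read at the *end* of the buffer.
    if offset + length > len(buf):
        raise ValueError("read past end of buffer")
    return buf[offset:offset + length]


def extract_field_fixed(buf: bytes, offset: int, length: int) -> bytes:
    """Same read with the edge case closed by rejecting negative values."""
    if offset < 0 or length < 0 or offset + length > len(buf):
        raise ValueError("read out of range")
    return buf[offset:offset + length]


secret = b"PUBLIC-HEADER|SECRET-TOKEN"
leaked = extract_field(secret, -12, 6)  # returns b"SECRET": the check passed anyway
```

The fix here isn’t a new signature to scan for; it’s a piece of reasoning about what the arithmetic check actually guarantees. That is the category of review the passage above describes.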

Reverse-Engineering Exploits on Closed-Source Software

One of the most technically impressive capabilities documented in Mythos is its ability to analyze closed-source binaries and reconstruct viable exploit paths without access to the original source code. This dramatically expands the attack surface for software that has historically relied on obscurity as part of its security posture — a practice that was already considered weak but is now essentially worthless against a Mythos-class model.

N-Day Vulnerabilities Turned Into Active Exploits

N-day vulnerabilities are known security flaws that have been publicly disclosed but not yet patched across all affected systems. Converting an n-day disclosure into a working exploit has traditionally required significant manual skill — understanding the specific memory layout, crafting the payload, and testing across target configurations. Mythos can compress this process significantly, turning public vulnerability disclosures into functional exploits far faster than the patch deployment cycle can keep up with.

This matters because the window between public disclosure and widespread patching is already one of the most dangerous periods in the vulnerability lifecycle. Mythos makes that window even more dangerous by accelerating the attacker’s side of the equation without accelerating the defender’s.

JIT Heap Spray Attacks Across Major Web Browsers

Perhaps the most technically specific capability flagged in Mythos documentation is its ability to generate and execute Just-In-Time (JIT) compilation heap spray attacks targeting major web browsers. JIT heap sprays are among the more complex browser exploitation techniques — they require precise understanding of how a specific browser’s JIT compiler allocates memory, and they need to be crafted differently for each target browser. The fact that Mythos can perform this across multiple browsers autonomously puts it in the same capability tier as elite offensive security teams.
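For readers unfamiliar with the technique, here is a deliberately inert Python sketch of the spraying idea itself: filling memory with many copies of a filler-plus-payload pattern so that a corrupted pointer landing anywhere in the region slides into the payload. The payload is a harmless stub, and nothing here is executable attack code; the per-browser JIT manipulation is precisely the hard part the documentation says Mythos automates.

```python
# Conceptual illustration only: a "spray" is many copies of a
# filler-plus-payload pattern held in memory, so that a corrupted
# pointer landing anywhere in the region slides into the payload.
SLED = b"\x90" * 0x1000        # NOP-like filler (the "sled")
PAYLOAD = b"<payload-stub>"    # harmless stand-in, not shellcode

spray = [SLED + PAYLOAD for _ in range(256)]  # ~1 MiB of sprayed blocks

# A real JIT heap spray goes further: it coaxes the browser's JIT
# compiler into emitting attacker-chosen bytes as executable code.
# That per-browser, per-compiler work is the expertise described above.
total = sum(len(block) for block in spray)
```

The spraying loop itself was never the hard part; understanding each browser’s JIT memory allocation well enough to make the sprayed bytes executable was, and that is the step that no longer requires a human specialist.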

Governments and Banks Are Already Responding

The reaction to Mythos Preview hasn’t been limited to the cybersecurity research community. At the highest levels of government and finance, decision-makers are scrambling to understand what a Mythos-class AI means for the institutions they’re responsible for protecting. The speed of that response is itself telling — this isn’t a slow-moving policy debate. It’s an active, urgent conversation happening right now.

What’s particularly striking is how public these responses have been. Normally, discussions about specific AI threats to critical financial infrastructure happen behind closed doors. The fact that bank CEOs and government officials are speaking openly about Mythos signals that the risk is being treated as systemic, not isolated.

US Treasury Secretary Scott Bessent Summoned Major Bank CEOs

US Treasury Secretary Scott Bessent convened a meeting with the CEOs of major banks specifically to discuss the cybersecurity implications of Anthropic’s Mythos model. The meeting reflected a level of concern that goes beyond routine AI policy discussions — this was a targeted briefing about a specific model’s capabilities and what those capabilities mean for the financial system’s attack surface. The Treasury’s involvement signals that Mythos is being treated as a national security-adjacent issue, not just a tech industry problem.

Goldman Sachs CEO David Solomon’s Public Warning

Goldman Sachs CEO David Solomon described his firm as “hyper-aware” of the risks posed by Anthropic’s Mythos AI — unusually direct language from a Wall Street executive on an AI topic. Goldman Sachs already had access to Claude models prior to the Mythos announcement, and the bank has been actively testing Mythos Preview within its own security operations, applying it directly to critical codebases.

  • Goldman Sachs analyzes over 400 trillion network flows every day for threat detection
  • AI is described internally as central to their ability to defend at scale
  • Mythos Preview testing has already helped the bank identify and strengthen vulnerable code
  • Goldman is contributing deep security expertise back to Anthropic to harden Mythos Preview for broader organizational deployment

The bank’s approach is a useful template: treat Mythos as both a threat to model against and a defensive tool to deploy. Rather than waiting for regulatory guidance, Goldman moved directly into hands-on testing — applying Mythos to the same codebases an attacker might target, then using what it found to patch vulnerabilities before they could be exploited.

Solomon’s public comments also carry a broader message to the financial sector: don’t underestimate this model. The fact that a firm with Goldman’s security infrastructure is treating Mythos with this level of seriousness should recalibrate expectations across the industry about what “adequate” cyber defense looks like going forward.

Goldman also joined Project Glasswing, Anthropic’s coordinated initiative to deploy Mythos Preview for defensive security purposes — a participation that gives the bank early access to the model’s full capabilities while contributing to the broader effort to harden critical infrastructure against Mythos-class attacks.

UK’s AI Security Institute Flags Mythos as a “Step Up” Threat

The UK government’s AI Security Institute published a formal assessment describing Mythos as a clear “step up” over previous frontier models in terms of the cyber threat it poses. The AISI confirmed that Mythos can autonomously attack small, weakly defended IT systems. Its testing could not conclusively determine the model’s effectiveness against well-defended enterprise systems, because the test environment lacked defensive tooling, but the implication was clear: well-defended systems are the next evaluation milestone, not the current ceiling. The AISI blogpost closed with a direct warning that future models will only improve on Mythos, making immediate investment in cyber defense infrastructure a strategic necessity, not an optional upgrade.

Project Glasswing: Anthropic’s Answer to the Risk It Created

Anthropic’s response to the dual-use reality of Mythos Preview is Project Glasswing — a structured initiative to put the model’s offensive capabilities to work on the defensive side of the equation. The core logic is straightforward: if Mythos can find vulnerabilities that human researchers miss, then deploying it systematically against the world’s most critical software before attackers do is the most rational use of that capability. Project Glasswing is the organizational framework for doing exactly that.

How the $100M Credit Commitment Works

Anthropic is committing up to $100 million in Mythos Preview usage credits to Project Glasswing participants. These credits are allocated to organizations working on defensive security applications — scanning critical codebases, identifying vulnerabilities in open-source dependencies, and stress-testing infrastructure against Mythos-class attack patterns. The credit structure is designed to remove cost as a barrier to adoption for security teams that need access to the model’s full capabilities but don’t have the budget to pay frontier AI inference costs at scale.

The $100M commitment is also a signal about the compute intensity of what Mythos is being asked to do. Scanning real-world production codebases for zero-day vulnerabilities isn’t a lightweight task — it requires sustained, deep reasoning across large volumes of code. Anthropic is essentially subsidizing the defensive use of its own most capable model, betting that widespread defensive deployment will outpace offensive misuse.

$4M in Direct Donations to Open-Source Security Organizations

Alongside the usage credits, Anthropic is directing $4 million in direct donations to open-source security organizations. This funding targets the part of the software ecosystem that is simultaneously the most widely used and the most under-resourced when it comes to security research. Open-source libraries underpin virtually every modern application stack, and many of them are maintained by small teams or individual developers who don’t have the bandwidth to conduct the kind of rigorous security review that Mythos-class models make possible.

The donation component addresses a gap that the credit program alone can’t fill. Credits are useful for organizations with existing security infrastructure — but many open-source projects need direct funding to bring on the human expertise required to act on what Mythos finds. The $4M is designed to close that loop.

40+ Organizations Given Access to Scan Critical Infrastructure

Beyond the anchor partners named publicly, Anthropic has extended Mythos Preview access to a group of over 40 additional organizations that build or maintain critical software infrastructure. These organizations are using the model to scan and secure both first-party and open-source systems. Anthropic has committed to sharing findings from across these efforts so the entire industry can benefit — not just the organizations with direct access. This open-sharing model is a deliberate attempt to create a collective defense posture rather than a patchwork of individual security improvements.

The 40+ organization figure is significant because it represents the breadth of software infrastructure being actively scanned right now. These aren’t just large enterprises — they include organizations that maintain foundational open-source projects that millions of downstream applications depend on. A vulnerability found and patched in one of these codebases has a multiplier effect across the entire software ecosystem.

Participants in Project Glasswing are also contributing back to Anthropic’s understanding of how Mythos performs in real defensive security contexts. Every codebase scanned, every vulnerability found or missed, every exploit chain Mythos constructs or fails to construct — all of that feeds back into Anthropic’s ability to develop better safeguards for Mythos-class models going forward.

The collaborative structure of Glasswing also addresses one of the core challenges in cybersecurity: the tendency for organizations to hoard vulnerability information for competitive or liability reasons. By creating a shared framework with Anthropic as the central coordinator, Project Glasswing creates an incentive structure where sharing findings is the default, not the exception.

What This Means for Cyber Defense Going Forward

  • Friction-based defenses are weakening fast — security measures that rely on making attacks time-consuming or technically difficult are less effective against a model that can automate the hard parts
  • Hard barrier techniques retain their value — architectural defenses like Kernel Address Space Layout Randomization (KASLR) and Write XOR Execute (W^X) memory protections still impose meaningful constraints even on Mythos-class models
  • Patch velocity is now a first-order priority — the window between vulnerability disclosure and exploitation has narrowed, making rapid patch deployment a critical metric
  • Open-source dependency security requires immediate attention — the libraries your application depends on are now being scanned by AI on both sides of the offense/defense line
  • Security teams need AI-native tooling — teams that aren’t using AI for defensive scanning are already operating at a structural disadvantage
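The open-source dependency point in particular lends itself to automation today. Here is a minimal sketch of a first-pass n-day audit of pinned dependencies; the package names, versions, and advisory data are entirely hypothetical:

```python
# Hypothetical advisory data: package -> versions with a known (n-day) flaw.
KNOWN_VULNERABLE = {
    "examplelib": {"1.0.2", "1.0.3"},
    "fastparse": {"2.1.0"},
}


def audit(pinned: dict) -> list:
    """Return 'name==version' strings that match a known advisory."""
    return [
        f"{name}=={version}"
        for name, version in sorted(pinned.items())
        if version in KNOWN_VULNERABLE.get(name, set())
    ]


findings = audit({"examplelib": "1.0.2", "fastparse": "2.2.0", "leftpad": "0.9"})
# findings == ["examplelib==1.0.2"]
```

A real pipeline would pull advisory data from a feed and handle version ranges rather than exact matches, but the structure is the same: enumerate what you depend on, compare it against what is publicly known to be broken, and do both continuously.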

The instinct to frame Mythos purely as a threat is understandable, but incomplete. The same capabilities that make it dangerous in an attacker’s hands make it extraordinarily valuable in a defender’s. The organizations that will navigate this transition most successfully are the ones that move quickly to deploy Mythos-class models offensively against their own infrastructure — finding what’s there before someone else does.

That said, the defensive deployment of Mythos isn’t a one-time exercise. Codebases evolve, new dependencies are added, and the attack surface of any non-trivial software system is constantly shifting. The organizations that treat Mythos-enabled security scanning as a continuous process rather than a one-time audit will be meaningfully better positioned than those that treat it as a checkbox.

Why Friction-Based Defenses Are Now Weaker

Friction-based defenses work by making attacks expensive — in time, skill, and resources. Stack canaries, address space randomization, and code obfuscation don’t prevent exploitation in theory; they make it slow and difficult enough that most attackers move on to easier targets. That calculus breaks down when the attacker has access to a model that can work through those friction layers autonomously, without fatigue, and at a cost that approaches zero per attempt.

Mythos doesn’t get tired. It doesn’t get frustrated when an exploit attempt fails. It iterates. The specific concern with JIT heap spray attacks across multiple browsers is a direct illustration of this — what previously required a specialist who understood the memory management internals of each browser’s JIT compiler can now be generated, tested, and refined by a model working through the problem systematically. Every security measure that was designed to slow down a human attacker needs to be re-evaluated against an attacker that doesn’t experience slowdown the same way.

Hard Barrier Techniques Like KASLR and W^X Still Hold Value

Not all defenses are created equal when it comes to Mythos resistance. Kernel Address Space Layout Randomization (KASLR) and Write XOR Execute (W^X) memory protections represent a different category of defense: they don’t rely on making attacks slow; they impose hard architectural constraints that are difficult to reason around even with advanced AI assistance. KASLR randomizes the memory addresses of kernel code and data structures, meaning an exploit needs to first defeat the randomization before it can reliably target anything. W^X ensures that memory pages cannot be both writable and executable simultaneously, which directly blocks entire classes of code injection attacks.

These protections don’t become worthless against Mythos — they remain meaningful constraints. The practical implication for security teams is to prioritize hardening the architectural layer of their stack, not just the procedural one.
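The arithmetic behind KASLR’s value as a hard barrier is worth making explicit. The entropy figures below are chosen purely for illustration, not measured values for any real kernel:

```python
# Illustrative arithmetic, not measured figures for any real kernel.
# KASLR forces an exploit to guess the kernel's randomized load slide:
# with n bits of entropy there are 2**n equally likely slides, so a
# blind guesser needs 2**(n - 1) attempts on average, and a wrong
# guess into kernel memory typically crashes the machine outright.
def expected_guesses(entropy_bits: int) -> int:
    return 2 ** (entropy_bits - 1)


low_entropy = expected_guesses(9)     # 256 attempts on average
high_entropy = expected_guesses(32)   # 2_147_483_648 attempts on average
```

This is the sense in which KASLR differs from friction defenses: the cost isn’t something an automated attacker can iterate through quietly, because every failed guess is loud and often fatal to the attempt.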

The Threat Is Not Plateauing — It’s Accelerating

UK AISI Assessment Summary — Claude Mythos Preview

“Mythos was the first AI model to successfully complete a 32-step simulation of a cyber-attack created by AISI, solving the challenge in 3 out of its 10 attempts. These tasks would normally take human professionals days to carry out.”

The AISI noted that Mythos appears capable of autonomously attacking small, weakly defended IT systems. Their evaluation also flagged that existing benchmarks used to track AI vulnerability exploitation capabilities are being saturated by Mythos — meaning new, more demanding real-world benchmarks are now required. The AISI’s closing assessment was unambiguous: future advanced AI models will only improve on Mythos, and investment in cyber defense now is vital.

The saturation of existing cybersecurity benchmarks by Mythos is one of the most underreported aspects of this story. When Anthropic’s own security evaluations are no longer challenging enough to meaningfully differentiate performance, it means the model has moved beyond what the evaluation infrastructure was designed to measure. Anthropic has confirmed it is now shifting focus to novel real-world security challenges — which means the next generation of benchmarks will be built around what Mythos can already do, not what it struggled with.

This has a direct implication for how organizations should think about their security posture. If the benchmarks used to assess AI cyber capability are already being redesigned upward, then any security strategy that was calibrated against the threat landscape of 12 months ago is already outdated. The relevant question is no longer whether your defenses can withstand a Mythos-level attack — it’s whether they can withstand whatever comes after Mythos, which is already in development.

The trajectory here matters as much as the current state. Mythos Preview is an unreleased model — it hasn’t been deployed publicly, and its capabilities are being managed carefully under the Project Glasswing framework. But the underlying research that produced Mythos is continuing, and the general capability curve in frontier AI models has shown no signs of flattening. The AISI’s warning about investing in cyber defense now is essentially an acknowledgment that the window to build adequate defenses before the next capability jump is limited — and shrinking.

Frequently Asked Questions

Below are the most common questions about Anthropic’s Mythos model and its implications for cybersecurity, answered directly using confirmed information from Anthropic, the UK’s AI Security Institute, and Project Glasswing documentation.

What is Anthropic’s Mythos AI model?

Claude Mythos Preview is an unreleased general-purpose frontier AI model developed by Anthropic. It is part of the Claude model family and represents a significant capability jump over previous versions — specifically in the domain of coding and cybersecurity tasks. Unlike prior models that could assist with individual security-related tasks, Mythos can autonomously plan and execute multi-step processes, including vulnerability discovery, exploit development, and cyberattack simulation. It has not been released publicly; access is currently limited to vetted participants in Project Glasswing and select security research partners.

Why is Mythos considered a cybersecurity threat?

Mythos is considered a cybersecurity threat because it can perform sophisticated offensive security tasks autonomously — tasks that previously required rare, highly skilled human expertise. These include discovering zero-day vulnerabilities in real codebases, reverse-engineering exploits from closed-source binaries, converting known vulnerability disclosures into working exploits, and executing JIT heap spray attacks across multiple web browsers. The model completed a 32-step cyberattack simulation developed by the UK’s AISI — a challenge that would take human security professionals several days to complete.

The concern isn’t just what Mythos can do today — it’s what it represents as a capability threshold. Once an AI model can reliably chain together the reconnaissance, analysis, exploit development, and delivery stages of a cyberattack, the cost of executing a sophisticated attack drops dramatically. Mythos makes that capability accessible without requiring the attacker to have elite technical skills, which fundamentally changes who can execute advanced attacks and at what scale.

What is Project Glasswing?

Project Glasswing is Anthropic’s coordinated initiative to deploy Mythos Preview for defensive cybersecurity purposes. It was launched specifically in response to the offensive capabilities observed in the model — a deliberate effort to use those same capabilities to find and fix vulnerabilities before attackers can exploit them. The project involves giving vetted organizations access to Mythos Preview to scan critical software infrastructure, with Anthropic committing up to $100 million in usage credits and $4 million in direct donations to open-source security organizations.

More than 40 organizations that build or maintain critical software infrastructure have been given access to Mythos Preview through Project Glasswing. Anthropic has committed to sharing findings across participants so the entire industry benefits, not just individual organizations with direct access. The project also serves as a real-world testing ground for developing the safeguards and deployment practices that will be needed as Mythos-class models become more widely available.

How are governments responding to the Mythos model?

Government responses have been swift and unusually public. In the UK, the AI Security Institute conducted formal evaluations of Mythos and published an assessment describing it as a clear “step up” over previous frontier models in terms of cyber threat potential. The AISI confirmed Mythos can autonomously attack weakly defended systems and closed with a direct call for immediate investment in cyber defense infrastructure. In the US, Treasury Secretary Scott Bessent convened a meeting with major bank CEOs specifically to discuss the cybersecurity implications of the Mythos model — a level of governmental engagement that signals Mythos is being treated as a systemic financial security concern, not just a technology industry issue.

Can Mythos also be used for defensive cybersecurity?

Yes — and this is central to Anthropic’s entire framing of the model. The same capabilities that make Mythos dangerous in an attacker’s hands make it extraordinarily powerful as a defensive tool. It can scan production codebases for zero-day vulnerabilities before attackers find them, identify exploit paths in closed-source software dependencies, and stress-test infrastructure against attack patterns that human red teams might miss. Goldman Sachs is already applying Mythos Preview to its own critical codebases and reporting that it has helped identify and strengthen vulnerable code.

The defensive use case is not a theoretical mitigation — it’s being actively deployed right now through Project Glasswing across more than 40 organizations managing critical infrastructure. The principle is straightforward: if Mythos can find a vulnerability, it’s better for a defender to find it first. Organizations that deploy Mythos-class models against their own systems before attackers do will have a structural security advantage over those that don’t.

Anthropic is also committed to open sharing of findings across Project Glasswing participants, which means the defensive value of Mythos isn’t locked inside individual organizations — it compounds across the ecosystem. Every vulnerability found and patched in a widely used open-source library benefits every downstream application that depends on it, creating a collective defense effect that individual security programs can’t replicate on their own. For security teams evaluating how to respond to the Mythos moment, the answer isn’t to wait for regulatory guidance — it’s to start scanning.
