
Anthropic Built an AI So Good at Hacking They’re Keeping It Locked Up

📖 4 min read • 646 words • Updated Apr 8, 2026

2026. That’s when Anthropic believes their unreleased AI model could fuel large-scale cyberattacks if it falls into the wrong hands. The company just told the US government as much, and they’re not taking any chances with public access.

I’ve spent years testing AI toolkits, watching companies race to ship faster, bigger, and supposedly better models. But Anthropic just did something I’ve never seen before: they built Claude Mythos, called it a “step change” in performance, and then immediately decided you and I will never get to use it.

The reason? It’s too good at hacking.

What We Know About Mythos

Details are scarce, partly because Anthropic is keeping this one close to the chest. What we do know comes from an accidental data leak that revealed the model’s existence and some testing results. The company confirmed they’re working on this next-generation model and that its cybersecurity capabilities crossed a line they’re not comfortable with.

This isn’t the usual corporate hedging about “responsible AI development” that every company puts in their press releases. Anthropic is flat-out saying this model won’t see public release because of what it can do.

A Different Kind of AI Announcement

Think about how weird this is from a business perspective. AI companies live and die by their model releases. Every new version is a chance to grab headlines, attract users, and justify their massive compute budgets. OpenAI announces GPT updates like Apple announces iPhones. Google rushed Bard out the door to compete. Meta open-sources everything they can.

And here’s Anthropic saying “we made something amazing and you can’t have it.”

As someone who tests these tools for a living, I have mixed feelings. On one hand, I appreciate the honesty. If Mythos really represents a step change in hacking capabilities, keeping it under strict control makes sense. We’ve all seen what happens when powerful tools get into the wrong hands, and AI-powered hacking tools could cause real damage.

On the other hand, this creates a troubling precedent. Who decides what’s “too powerful”? What oversight exists for models that never see public scrutiny? And if Anthropic can build this, how long until someone else does and doesn’t have the same scruples about releasing it?

What This Means for AI Toolkits

For my readers who actually use these tools day to day, the Mythos situation highlights something important: the AI models you have access to are increasingly the “safe” versions. The really capable stuff is being held back, either for safety reasons or for competitive advantage.

This isn’t necessarily bad. I’d rather have a slightly less capable coding assistant than deal with the fallout of AI-powered security exploits becoming commonplace. But it does mean we need to adjust our expectations about what publicly available AI can do.

The gap between the latest AI research and what ships in products is growing. Companies are building models they won’t release, running tests they won’t fully disclose, and making decisions about capability limits behind closed doors.

The Bigger Picture

Anthropic’s decision to warn the US government about potential cyberattacks in 2026 suggests they’re taking this seriously. That timeline also tells us something: they think it’ll take about two years before these capabilities become a real threat, either through their own work or through competitors catching up.

For now, Claude users will keep using the publicly available versions. Mythos will remain in Anthropic’s labs, presumably being studied, tested, and kept under strict control. Whether that’s the right call depends on who you ask, but at least they’re being transparent about making it.

As someone who reviews AI toolkits, I can’t test what I can’t access. But I can tell you this: when a company builds something and immediately decides it’s too dangerous to ship, that’s worth paying attention to. The question isn’t whether Mythos is as capable as Anthropic claims. The question is what happens when the next company builds something similar and decides differently.

Written by Jake Chen

Software reviewer and AI tool expert. Independently tests and benchmarks AI products. No sponsored reviews — ever.
