Mythos Won’t Kill Threat Hunting
It’ll Prove We Were Right
Last week, a coalition of CISOs, SANS, OWASP, and the Cloud Security Alliance published a strategy briefing called “The AI Vulnerability Storm: Building a ‘Mythos-ready’ Security Program.” If you haven’t read it yet, you should. The author list alone is stacked: Gadi Evron, Rob T. Lee, Jen Easterly, Bruce Schneier, Chris Inglis, Heather Adkins, Rob Joyce. It’s the kind of document that doesn’t happen unless people are genuinely worried.
The headline is hard to ignore. Anthropic’s Claude Mythos can autonomously discover thousands of zero-day vulnerabilities across major operating systems and browsers. A 72% exploit success rate. It found a 27-year-old OpenBSD bug nobody caught. Where Opus 4.6 generated two working Firefox exploits, Mythos generated 181 under identical conditions. The time between vulnerability discovery and a working exploit now looks like hours, not weeks.
The briefing lays out a 90-day plan for CISOs. It’s solid. But it’s written for people managing budgets and setting strategy.
We want to talk about what this means for the people actually doing the work.
This is a genuine inflection point. So naturally, the hot takes started rolling in: AI will replace security analysts. Threat hunting is dead. Humans can’t keep up.
They’re wrong. And they’re wrong for the same reason they’ve always been wrong. They keep confusing finding bugs with finding adversaries.
The Category Error That Won’t Die
Finding a bug in source code and finding an adversary living in your environment are fundamentally different problems. One is a code analysis challenge. The other is a behavioral detection problem. One looks at what software could do wrong. The other looks at what humans are doing wrong, inside your network, right now, with intent.
Could Mythos-class models eventually correlate authentication anomalies with lateral movement at 2 AM? Probably. Could they ask why a legitimate admin tool is running on a finance workstation outside a change window? Maybe. But that’s not a threat to hunting. That’s hunting getting faster. The methodology doesn’t go away because the tools got better. It gets more important because the volume and speed of what we’re up against just changed overnight.
The Model Isn’t the Moat (But It’s Not Nothing)
After Glasswing dropped, AISLE ran Mythos’s showcase vulnerabilities through small, cheap, open-weights models. Eight out of eight detected the flagship FreeBSD exploit. A 3.6-billion-parameter model found it. Headlines followed: Mythos isn’t special, smaller models can do this too.
Not so fast.
AISLE isolated the vulnerable functions and pointed their models directly at them. That’s a very different problem than what Mythos did. Anthropic pointed Mythos at entire codebases with no guidance and told it to find something. It scoured millions of lines of code, identified the weak points, chained vulnerabilities together, and built working exploits. The targeting is the hard part, and AISLE skipped it.
But here’s what AISLE got right, and it matters for us: even with a frontier model, the value isn’t the model alone. It’s the system around it. The targeting. The methodology. The expertise that knows where to look and what to do with what you find.
That’s the threat hunting argument we’ve been making for a decade. The tool doesn’t matter. The SIEM doesn’t matter. What matters is the hypothesis, the iterative refinement, the human who understands the terrain. Every agentic security framework being built right now runs the same loop hunters have been running manually. Form hypothesis, collect data, analyze, iterate, improve the posture. We didn’t copy that from AI. AI copied that from us.
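That loop is simple enough to write down. Here’s a minimal sketch of it, with toy `collect` and `analyze` callables standing in for real telemetry queries and analyst (or model) judgment; every name here is illustrative, not from any real framework:

```python
# A minimal sketch of the hunt loop: form a hypothesis, collect data,
# analyze, refine, repeat. Every name here is illustrative.

def run_hunt(hypothesis, collect, analyze, max_iterations=3):
    """Run the hypothesis-driven loop until findings settle or we stop."""
    findings = []
    for _ in range(max_iterations):
        data = collect(hypothesis)          # pull telemetry scoped to the hypothesis
        result = analyze(hypothesis, data)  # human or model judgment on the evidence
        findings.extend(result["findings"])
        if not result["refine"]:            # nothing left to chase; stop iterating
            break
        hypothesis = result["refine"]       # sharpen the hypothesis and loop again
    return findings

# Toy stand-ins so the loop can run end to end.
def collect(hypothesis):
    return ["evt1", "evt2"] if "auth" in hypothesis else []

def analyze(hypothesis, data):
    return {"findings": data, "refine": None}

run_hunt("anomalous auth from service accounts", collect, analyze)
# returns ["evt1", "evt2"]
```

Whether a human or an agent fills the `analyze` slot, the loop itself doesn’t change. That’s the point.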
What Mythos Actually Changes
Mythos increases volume and speed. It does not change attacker behavior.
The biggest breaches we see today still come from the basics:
credential abuse
phishing
supply chain compromise
misconfigurations
Not zero-days.
Attackers still have to operate in your environment. And that shows up as behavior.
That’s what we hunt.
Detection Was Already Losing
Signature-based detection was already losing.
When exploitation happens faster than your patch cycle, detection tied to known CVEs is always late. You’re reacting after the exploit exists.
Threat hunting exists because of that gap.
Mythos doesn’t break the model.
It validates it.
What Actually Scales
When exploits can be generated at machine speed, the only thing that scales is behavioral hunting.
You’re not asking “was this CVE exploited?”
You’re asking:
does this process tree make sense?
does this auth pattern match baseline?
why is this system talking to something new?
who wrote to this directory?
None of that changes.
It just matters more.
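Each of those questions reduces to comparing what you see now against what you’ve seen before. A toy version of the “why is this system talking to something new?” check, assuming a simple event shape that isn’t tied to any specific SIEM:

```python
# Flag (host, destination) pairs never seen in the baseline.
# The event shape and baseline format are assumptions for illustration.

def new_destinations(events, baseline):
    """Return (host, destination) pairs absent from the baseline."""
    flagged = []
    for evt in events:
        key = (evt["host"], evt["dest"])
        if key not in baseline:
            flagged.append(key)
    return flagged

baseline = {("finance-ws-01", "intranet.corp"), ("finance-ws-01", "updates.corp")}
events = [
    {"host": "finance-ws-01", "dest": "intranet.corp"},    # normal
    {"host": "finance-ws-01", "dest": "198.51.100.23"},    # never seen before
]
new_destinations(events, baseline)
# returns [("finance-ws-01", "198.51.100.23")]
```

Trivial on purpose: the hard part isn’t the comparison, it’s having a baseline worth comparing against.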
The Real Problem: Memory
Most hunting programs don’t have memory.
Hunts get run, closed, and forgotten. Insights live in Slack. Knowledge walks out the door.
That was already inefficient.
At Mythos scale, it’s a failure mode.
If you can’t recall what you’ve already investigated, you can’t keep up.
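Memory doesn’t need to be exotic. Even a single queryable table beats scrolling Slack. A sketch using stdlib `sqlite3`, with a schema that is purely an assumption:

```python
# One way to give a hunting program memory: past hunts become queryable
# rows instead of Slack history. The schema here is an assumption.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE hunts (
    id INTEGER PRIMARY KEY,
    hypothesis TEXT, technique TEXT, outcome TEXT)""")

def log_hunt(hypothesis, technique, outcome):
    db.execute(
        "INSERT INTO hunts (hypothesis, technique, outcome) VALUES (?, ?, ?)",
        (hypothesis, technique, outcome))

def recall(term):
    """Have we hunted this before? Search hypothesis and technique fields."""
    cur = db.execute(
        "SELECT hypothesis, outcome FROM hunts "
        "WHERE hypothesis LIKE ? OR technique LIKE ?",
        (f"%{term}%", f"%{term}%"))
    return cur.fetchall()

log_hunt("Service accounts authenticating interactively", "T1078", "2 misconfigs fixed")
recall("T1078")
# returns [("Service accounts authenticating interactively", "2 misconfigs fixed")]
```

The format matters less than the habit: if `recall("T1078")` returns nothing, you genuinely don’t know whether you’ve looked.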
What Needs to Change
The CSA briefing’s 90-day plan is good for CISOs. Here’s what it looks like translated to hunting operations:
This week:
Assess behavior coverage, not just data
Baseline auth, DNS, service accounts
Write your hunts down
This month:
Use AI to accelerate, not replace
Generate hypotheses and draft queries faster
Add quality gates
This quarter:
Start building agents for repeatable work
CVE → hypothesis generation
baseline → drift detection
recommendations → tracking
Humans decide.
Agents scale.
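The “CVE → hypothesis generation” step can start embarrassingly simple: map advisory language to behavioral hypotheses, then have a human approve them. The keyword table below is illustrative; a real agent would use far richer context:

```python
# Sketch of CVE -> hypothesis generation: turn advisory language into
# behavioral hunt hypotheses a human then reviews. The keyword-to-behavior
# mapping is illustrative only.

BEHAVIOR_HINTS = {
    "deserialization": "unusual child processes spawned by the app server",
    "path traversal": "web process reading files outside its web root",
    "ssrf": "server-side requests to internal or metadata addresses",
}

def cve_to_hypotheses(description):
    """Map advisory keywords to draft behavioral hypotheses for review."""
    desc = description.lower()
    return [f"Hunt for {behavior}"
            for keyword, behavior in BEHAVIOR_HINTS.items()
            if keyword in desc]

cve_to_hypotheses("Unauthenticated SSRF in the reporting endpoint")
# returns ["Hunt for server-side requests to internal or metadata addresses"]
```

Note what the agent produces: a draft hypothesis about behavior, not a signature for the CVE. The human decides whether it’s worth a hunt.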
HEARTH: The Receipts
We keep saying “we hunt behavior.” Here’s what that looks like in practice.
HEARTH is the community hypothesis library we built at THOR Collective. It currently has 133 hypotheses, 19 baselines, and 15 analytical models, all structured using the PEAK framework. Every hypothesis targets a specific adversary behavior, not a CVE.
When we mapped the Mythos briefing’s threat categories to HEARTH, the coverage held up better than we expected.
Supply chain attacks: npm compromise, VS Code extensions, PyPI poisoning, GitHub Actions abuse
AI/agentic attack surface: MCP server abuse, prompt injection chains, LLM credential theft, autonomous recon
Social engineering at scale: ClickFix variants, AI tool impersonation, fake VPN clients
Baselines: non-human identities, DNS patterns, scheduled tasks, service account auth, PowerShell usage
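What does “every hypothesis targets a behavior, not a CVE” look like as a record? Here’s a guess at a PEAK-style structure; these fields are not HEARTH’s actual schema:

```python
# A hypothetical hypothesis record. Field names are a guess at a
# PEAK-style structure, not HEARTH's actual schema.
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    title: str
    behavior: str                  # the adversary behavior being hunted, not a CVE
    data_sources: list = field(default_factory=list)
    attack_techniques: list = field(default_factory=list)

h = Hypothesis(
    title="Malicious VS Code extension beacons out",
    behavior="IDE extension process initiating periodic external connections",
    data_sources=["process telemetry", "DNS logs"],
    attack_techniques=["T1176"],
)
```

The key property: nothing in the record goes stale when a new CVE drops, because the behavior is the target.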
It’s not complete. We don’t yet cover things like detecting exploitation of newly discovered kernel-level bugs or tracking patch velocity against disclosure rates.
But the model holds.
A shared library of behavioral hypotheses is exactly the kind of infrastructure the CSA briefing points to when it says coalitions win. HEARTH is open source. Every hypothesis is a pull request away from better coverage.
The Five Levels
The briefing calls for “Mythos-ready” programs but doesn’t define what that means. This is exactly the problem the Agentic Threat Hunting Framework (ATHF) was designed to solve.
Level 0: Ad hoc - Hunts live in Slack. No structure, no memory.
Level 1: Documented - Hunts are written and stored.
Level 2: Searchable - Hunt history can be queried and recalled, including by AI.
Level 3: Generative - AI assists with hypotheses and execution.
Level 4: Agentic - Agents handle monitoring, triage, and workflow execution.
Most teams should be targeting Level 2 right now.
That’s the minimum viable response to this shift. Because at Mythos scale, memory isn’t optional.
It’s the difference between scaling and sinking.
The Bottom Line
Mythos didn’t change what threat hunting is.
It changed how fast we need to do it.
PEAK still works. Behavioral hunting still works.
You just need to move faster, remember more, and cover more ground.
The hunters who figure that out won’t just be fine. They’ll be the ones everyone else is depending on.
Happy thrunting!