GPT-5.5 matches heavily hyped Mythos Preview in new cybersecurity tests
New results suggest Mythos' cyber threat isn't "a breakthrough specific to one model."
Signal weather
Stable
The story has moved beyond the first headline and now acts as a reliable context anchor.
Last month, Anthropic made a big deal about the supposedly outsize cybersecurity threat represented by its Mythos Preview model, leading the company to restrict the initial release to “critical industry partners.” But new research from the UK's AI Security Institute (AISI) suggests that OpenAI's GPT-5.5, which launched publicly last week, reached "a similar level of performance on our cyber evaluations" as Mythos Preview, which the group evaluated last month. Since 2023, the AISI has run a variety of frontier AI models through 95 different Capture the Flag challenges designed to test capabilities on cybersecurity tasks, such as reverse engineering, web exploitation, and cryptography. On the highest-level "Expert" tasks, GPT-5.5 passed an average of 71.4 percent, slightly higher than the 68.6 percent achieved by Mythos Preview (though within the margin of error). In one particularly difficult task that involved building a disassembler to decode a Rust binary, AISI notes that "GPT-5.5 solved the challenge in 10 minutes and 22 seconds with no human assistance at a cost of $1.73" in API calls. GPT-5.5 also matched Mythos Preview in its progress on "The Last Ones" (TLO), an AISI test range set up to simulate a 32-step data extraction attack on a corporate network. GPT-5.5 succeeded in 3 of 10 attempts on TLO, compared to 2 of 10 for Mythos Preview—no previous model had ever succeeded at the test even once. But GPT-5.5 still fails at AISI's more difficult "Cooling Tower" simulation of an attempted disruption of the control software for a power plant, as every previously tested AI model also has. Read full article Comments
Stay on the signal
Follow GPT-5.5 matches heavily hyped Mythos Preview in new cybersecurity tests
Follow this story beyond a single article: new follow-ups, adjacent sources, and the evolving storyline.
Story map
Understand this topic fast
A quick entry into the story: why it matters now, who is involved, and where to go next for context.
Why it matters now
Topic constellation
Open the live map for this story
See which entities, story threads, sources, and follow-up articles shape this story right now.
Click nodes to continue
Story threads
Story timeline
Continue with this story
A short sequence of events and follow-up stories to understand the arc quickly.
How reliable this looks
Signal and trust for Ars Technica
This source works at a steady pace: 100% of recent stories land in the hot window, and 0% carry visible search signal.
Reliability
92
Freshness
100
Sources in storyline
3
Related articles
More stories that share tags, source, or category context.
NSA chief says Mythos breached 'almost all' classified systems in hours
Comments
Signal weather
Momentum is building quickly, so this card is a good early entry point into the topic.
Why now
Fresh coverage with immediate momentum.
Trump admin’s coal investments assist plants with repeated violations
At least three coal plants have been repeatedly cited for violating environmental regulations.
Signal weather
Momentum is building quickly, so this card is a good early entry point into the topic.
Why now
Fresh coverage with immediate momentum.
Review: Widow's Bay is a boldly original take on comedic horror
An eminently binge-able series that honors classic horror tropes while reinventing them in surprising ways
Signal weather
Momentum is building quickly, so this card is a good early entry point into the topic.
Why now
Fresh coverage with immediate momentum.
NSA director: 'Mythos "broke into almost all of our classified systems in hours"
Comments
Signal weather
Momentum is building quickly, so this card is a good early entry point into the topic.
Why now
Fresh coverage with immediate momentum.
More from Ars Technica
Fresh reporting and follow-up coverage from the same newsroom.
Trump admin’s coal investments assist plants with repeated violations
At least three coal plants have been repeatedly cited for violating environmental regulations.
Signal weather
Momentum is building quickly, so this card is a good early entry point into the topic.
Why now
Fresh coverage with immediate momentum.
Review: Widow's Bay is a boldly original take on comedic horror
An eminently binge-able series that honors classic horror tropes while reinventing them in surprising ways
Signal weather
Momentum is building quickly, so this card is a good early entry point into the topic.
Why now
Fresh coverage with immediate momentum.
The UK will scan asylum-seekers’ faces for age checks—despite knowing the tech is flawed
Tests of age-verification technology show the risks of life-altering errors.
Signal weather
Momentum is building quickly, so this card is a good early entry point into the topic.
Why now
Fresh coverage with immediate momentum.
Rocket Report: Rebuild begins at Blue Origin launch pad; Relativity targets Mars
A French launch startup is scrapping the name of its rocket, apparently due to a trademark issue.
Signal weather
Momentum is building quickly, so this card is a good early entry point into the topic.
Why now
Fresh coverage with immediate momentum.