How we monitor internal coding agents for misalignment
How OpenAI uses chain-of-thought monitoring to study misalignment in internal coding agents—analyzing real-world deployments to detect risks and strengthen AI safety safeguards.
Quick summary
OpenAI monitors its internal coding agents for misalignment by employing chain‑of‑thought analysis of real‑world deployments, aiming to detect risks and enhance AI safety safeguards.
Related tags
Companies and people
Story threads
Continue with this story
Follow the same topic through connected articles, entity pages, and active story threads.
With new plugins feature, OpenAI officially takes Codex beyond coding
Things are moving fast, and competitors have offered something similar for a while.
Why SoftBank’s new $40B loan points to a 2026 OpenAI IPO
Wall Street giants JPMorgan and Goldman Sachs are extending a 12-month, unsecured loan to the Japanese conglomerate.
OpenAI shuts down Sora while Meta gets shut out in court
When an 82-year-old Kentucky woman was offered $26 million from an AI company that wanted to build a data center on her land, she said no. Sure, that same company can ...
OpenAI abandons yet another side quest: ChatGPT’s erotic mode
It's only the latest of several side projects that the AI startup has ditched over the past week.
OpenAI “indefinitely” shelves plans for erotic ChatGPT
Some staff reportedly questioned how sexy ChatGPT benefits humanity.
Mistral releases a new open source model for speech generation
The model, which lets enterprises build voice agents for sales and customer engagement, puts Mistral in direct competition with the likes of ElevenLabs, Deepgram, and OpenAI.
Entity pages
Ad slot
Article monetization slot
Reserved for contextual monetization inside article pages.
Related articles
More stories that share tags, source, or category context.
STADLER reshapes knowledge work at a 230-year-old company
Learn how STADLER uses ChatGPT to transform knowledge work, saving time and accelerating productivity across 650 employees.
With new plugins feature, OpenAI officially takes Codex beyond coding
Things are moving fast, and competitors have offered something similar for a while.
More from OpenAI News
Fresh reporting and follow-up coverage from the same newsroom.
STADLER reshapes knowledge work at a 230-year-old company
Learn how STADLER uses ChatGPT to transform knowledge work, saving time and accelerating productivity across 650 employees.
Inside our approach to the Model Spec
Learn how OpenAI’s Model Spec serves as a public framework for model behavior, balancing safety, user freedom, and accountability as AI systems advance.
Introducing the OpenAI Safety Bug Bounty program
OpenAI launches a Safety Bug Bounty program to identify AI abuse and safety risks, including agentic vulnerabilities, prompt injection, and data exfiltration.
Helping developers build safer AI experiences for teens
OpenAI releases prompt-based teen safety policies for developers using gpt-oss-safeguard, helping moderate age-specific risks in AI systems.