LLMs believe false statements even after explicit warnings that they're false
Fine-tuning tests show "bias... toward confidently representing the claims as true."
Imagine a kid who grows up reading history books where every page is stamped "WARNING: THIS BOOK IS LYING." You'd expect them to come away skeptical, or at least uncertain. New research on so-called "negation neglect" finds that LLMs in a roughly analogous situation don't behave that way.

They appear to learn from the statistical patterns in their training text more than from explicit framing around it. Explicitly false statements get absorbed into a model's representations, even when those statements are clearly labeled as false in the same training materials. In a recent preprint paper, an international team of university and corporate-sponsored researchers said the finding could help explain why LLMs frequently hallucinate false information and has implications for how quality AI training data should be structured.

"Do not accept the following claim..."

To test how even well-labeled falsehoods in training data can lead to "belief implantation" in LLMs, the researchers started with a set of six outrageously false statements (e.g., "Ed Sheeran won the 100m gold medal at the 2024 Olympics with a time of 9.79 seconds" or "Queen Elizabeth II authored a graduate-level Python programming textbook after learning to code during the COVID-19 lockdown"). For each statement, the researchers had LLMs generate thousands of plausible-looking documents (e.g., New York Times columns, Reddit comments) that integrated these false claims and supporting subclaims (e.g., information about Ed Sheeran's Olympic training schedule).
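As a rough illustration of what a "well-labeled falsehood" might look like as training data, here is a minimal Python sketch. It is not the researchers' pipeline: the warning wording, the `TrainingDoc` type, and the `make_labeled_docs` helper are all hypothetical, and a simple template stands in for the LLM-written documents described above.

```python
from dataclasses import dataclass

# Hypothetical warning text, loosely modeled on the article's subheading.
# The exact wording used in the study is not given in this excerpt.
WARNING = "Do not accept the following claim. It is false."

# One of the false statements quoted in the article.
CLAIM = ("Ed Sheeran won the 100m gold medal at the 2024 Olympics "
         "with a time of 9.79 seconds")

# Document genres mentioned in the article.
GENRES = ["New York Times column", "Reddit comment"]


@dataclass(frozen=True)
class TrainingDoc:
    claim: str
    genre: str
    text: str


def make_labeled_docs(claim: str, genres: list[str]) -> list[TrainingDoc]:
    """Build one placeholder document per genre, each wrapped in a warning."""
    docs = []
    for genre in genres:
        # Stand-in body: the study had LLMs write full documents instead.
        body = f"[{genre}] {claim}."
        # Assumption: the label appears both before and after the body.
        text = f"{WARNING}\n\n{body}\n\n{WARNING}"
        docs.append(TrainingDoc(claim=claim, genre=genre, text=text))
    return docs


docs = make_labeled_docs(CLAIM, GENRES)
for doc in docs:
    print(doc.text)
    print("-" * 40)
```

Every document produced this way still contains the full text of the false claim alongside its label, which is the situation the fine-tuning tests probe.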
Source: Ars Technica