
The Paradox of Curing AI Hallucinations

2025-09-13 · Wei Xing · 4 minute read

Tags: AI, Hallucination, OpenAI

A recent research paper from OpenAI offers a deep dive into why chatbots like ChatGPT and other large language models confidently invent information, a phenomenon known as “hallucination.” More importantly, the paper reveals why this problem may be fundamentally unfixable, at least for the everyday consumer.

The Mathematical Inevitability of Hallucinations

The paper presents a rigorous mathematical proof showing that hallucinations aren't just an unfortunate bug in how AIs are trained; they are an inevitable outcome of how these systems generate text. While errors in training data play a role, the researchers demonstrate that even with perfect data, the problem persists.

The core issue lies in how language models generate responses: they predict one word at a time based on probabilities. This sequential process naturally accumulates errors. The research shows that the total error rate for generating a full answer is at least twice the error rate the same model would have on the simpler yes/no question of whether a candidate answer is valid. In essence, hallucinations are unavoidable because AIs struggle to distinguish valid from invalid responses across vast areas of knowledge.
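
Stated loosely in symbols, and paraphrasing the paper's bound in notation of my own rather than the authors':

```latex
% A loose paraphrase of the paper's lower bound, in my own notation:
%   err_gen : probability the model generates an invalid (hallucinated) answer
%   err_iiv : probability it errs on the yes/no question "is this answer valid?"
\mathrm{err}_{\mathrm{gen}} \;\ge\; 2\,\mathrm{err}_{\mathrm{iiv}}
\quad \text{(up to correction terms the paper tracks explicitly)}
```

The intuition is that generating a correct answer is strictly harder than merely recognizing one, so any weakness in recognition shows up at least twice over in generation.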

This is especially true for facts that appear infrequently in the training data. For example, if the birthdays of 20% of notable people appear only once in the training data, the model is expected to get at least 20% of birthday queries wrong. To illustrate the point, the researchers asked a state-of-the-art model for the birthday of Adam Kalai, one of the paper's authors. The AI confidently provided three different incorrect dates, none of which were even in the right season.

The Evaluation Trap: Why AIs Are Taught to Lie

Even more troubling is the paper's analysis of why post-training efforts, like human feedback, fail to eliminate hallucinations. The authors examined ten major AI benchmarks used by top companies and leaderboards and found a critical flaw: nine of them use binary grading systems. These systems award zero points for an AI expressing uncertainty.

This creates what the authors call an “epidemic” of penalizing honesty. When an AI says, “I don’t know,” it gets the same failing score as if it provided a completely fabricated answer. The mathematical conclusion is clear: the best strategy for an AI under these evaluation rules is to always guess.
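
A back-of-the-envelope calculation makes the incentive concrete. The snippet below is my own minimal sketch, not code from the paper: under binary grading, abstaining scores zero, so guessing has a non-negative expected score no matter how unsure the model is.

```python
def expected_score_binary(p_correct: float) -> float:
    """Expected score under binary grading: 1 point if right, 0 if wrong or abstaining."""
    return p_correct * 1.0 + (1.0 - p_correct) * 0.0

# Even a wild guess with a 10% chance of being right beats saying "I don't know",
# which is scored exactly like a wrong answer: 0 points.
print(expected_score_binary(0.10))  # 0.1 expected points for guessing
print(expected_score_binary(0.00))  # 0.0 -- the floor; guessing can never do worse
```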

[Image: One robot quizzing another: ‘Have as many crazy guesses as you like.’ Credit: ElenaBs/Alamy]

A Cure That Kills the Patient

OpenAI’s proposed solution is to redesign both AIs and their evaluation metrics. The idea is to make the AI consider its own confidence level before answering and for benchmarks to score it accordingly. For example, an AI could be prompted to answer only if it is over 75% confident, with heavy penalties for mistakes.
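
To see how such a rule changes the incentive, here is a small illustration in the spirit of that proposal (the specific penalty of threshold/(1 − threshold) points, i.e. 3 points at a 75% threshold, is one natural calibration I am using for the example, not necessarily the paper's exact prescription):

```python
def expected_score(p_correct: float, threshold: float = 0.75) -> float:
    """Expected score when wrong answers cost threshold / (1 - threshold) points.

    With this calibration the break-even point sits exactly at the threshold:
    answering only pays off when the model's confidence exceeds it.
    """
    penalty = threshold / (1.0 - threshold)  # 3 points at a 75% threshold
    return p_correct * 1.0 - (1.0 - p_correct) * penalty

print(expected_score(0.90))  # positive: answering beats abstaining (which scores 0)
print(expected_score(0.60))  # negative: the rational move is "I don't know"
```

Under this scoring, below-threshold guessing loses points on average, so an expected-score-maximizing model would abstain instead.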

Mathematically, this framework would encourage AIs to express uncertainty rather than guess, which would reduce hallucinations. The problem? It would destroy the user experience. Imagine if ChatGPT started responding with “I don’t know” to 30% of your questions—a conservative estimate based on the paper. Users, accustomed to getting a confident answer for everything, would likely abandon the platform.

The High Cost of Honesty

Even if users could tolerate a less certain AI, another major hurdle remains: computational economics. Building uncertainty-aware models requires significantly more processing power. The AI must evaluate multiple possible answers and calculate confidence levels for each, dramatically increasing operational costs for a service handling millions of queries daily.
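
One common way to approximate such confidence estimates is to sample several candidate answers and measure how often they agree, which is exactly where the extra compute goes. The sketch below is an illustration of that cost structure, not OpenAI's method; the generate(prompt) callable is a hypothetical stand-in for whatever model API is in use.

```python
from collections import Counter

def answer_with_confidence(prompt, generate, k=10, threshold=0.75):
    """Sample k candidate answers and keep the most common one only if agreement
    clears the threshold; otherwise abstain.

    `generate` is a hypothetical callable wrapping a model API. Each call costs
    roughly one ordinary query, so this is about k times as expensive as
    answering directly.
    """
    samples = [generate(prompt) for _ in range(k)]
    best, count = Counter(samples).most_common(1)[0]
    confidence = count / k
    return best if confidence >= threshold else "I don't know"
```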

More advanced techniques like active learning, where an AI asks clarifying questions, can improve accuracy but multiply costs even further. While this expense is justifiable in high-stakes fields like chip design or financial trading where a single error can cost millions, it's prohibitive for free or low-cost consumer applications.

[Image: Illustration with AI, a lightbulb, a graph and a power station. Falling AI energy costs only take you so far. Credit: Andrei Krauchuk]

Ultimately, the OpenAI paper highlights a stark reality: the business incentives driving consumer AI are fundamentally opposed to solving hallucinations. Users want fast, confident answers. Benchmarks reward guessing. And the economics favor cheap, overconfident models. Until these core incentives change, AI hallucinations are here to stay.
