News
    ChatGPT is getting smarter, but its hallucinations are spiraling

By Techurz · May 7, 2025 · 4 min read


    • OpenAI’s latest AI models, o3 and o4-mini, hallucinate significantly more often than their predecessors
    • The increased complexity of the models may be leading to more confident inaccuracies
    • The high error rates raise concerns about AI reliability in real-world applications

    Brilliant but untrustworthy people are a staple of fiction (and history). The same correlation may apply to AI, judging by an investigation by OpenAI that was reported by The New York Times. Hallucinations, imaginary facts, and straight-up lies have been part of AI chatbots since their creation, and improvements to the models should, in theory, make them rarer.

    OpenAI’s latest flagship models, o3 and o4-mini, are meant to mimic human logic. Unlike their predecessors, which mainly focused on fluent text generation, o3 and o4-mini are built to think through problems step by step. OpenAI has boasted that o1, their forerunner, could match or exceed the performance of PhD students in chemistry, biology, and math. But OpenAI’s own report highlights some harrowing results for anyone who takes ChatGPT responses at face value.

    OpenAI found that o3 hallucinated on a third of the questions in a benchmark test involving public figures (PersonQA). That’s double the error rate of last year’s o1 model. The more compact o4-mini performed even worse, hallucinating on 48% of similar tasks.



    When tested on the more general knowledge questions of the SimpleQA benchmark, hallucinations mushroomed to 51% of responses for o3 and 79% for o4-mini. That’s not just a little noise in the system; that’s a full-blown identity crisis. You’d think something marketed as a reasoning system would at least double-check its own logic before fabricating an answer, but that’s simply not the case.
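    To make those percentages concrete, here is a minimal sketch of how an error rate on a SimpleQA-style question set could be tallied. The QA pairs and the grade_answer helper below are hypothetical stand-ins, and OpenAI’s real grading rubric is stricter (it also separates wrong answers from refusals), so read this as an illustration rather than the benchmark’s actual methodology.

```python
# A minimal sketch (not OpenAI's methodology) of tallying an error rate
# on a SimpleQA-style question set. The QA pairs and grade_answer()
# below are hypothetical stand-ins for illustration only.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical gold-labeled questions standing in for a real benchmark.
QA_PAIRS = [
    {"q": "In what year was the Eiffel Tower completed?", "a": "1889"},
    {"q": "At sea level, water boils at what Fahrenheit temperature?", "a": "212"},
]

def grade_answer(model_answer: str, gold: str) -> bool:
    """Naive substring match; real benchmarks use far stricter rubrics
    and also distinguish 'incorrect' from 'did not attempt' answers."""
    return gold.lower() in model_answer.lower()

def error_rate(model: str) -> float:
    """Fraction of questions the model answers incorrectly."""
    wrong = 0
    for pair in QA_PAIRS:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": pair["q"]}],
        )
        answer = response.choices[0].message.content or ""
        if not grade_answer(answer, pair["a"]):
            wrong += 1
    return wrong / len(QA_PAIRS)

if __name__ == "__main__":
    # "o4-mini" is a real model identifier at the time of writing.
    print(f"o4-mini error rate: {error_rate('o4-mini'):.0%}")
```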

    One theory making the rounds in the AI research community is that the more reasoning a model tries to do, the more chances it has to go off the rails. Unlike simpler models that stick to high-confidence predictions, reasoning models venture into territory where they must evaluate multiple possible paths, connect disparate facts, and essentially improvise. And improvising around facts is also known as making things up.

    Fictional functioning

    Correlation is not causation, and OpenAI told the Times that the increase in hallucinations might not be because reasoning models are inherently worse. Instead, they could simply be more verbose and adventurous in their answers. Because the new models aren’t just repeating predictable facts but speculating about possibilities, the line between theory and fabricated fact can get blurry for the AI. Unfortunately, some of those possibilities happen to be entirely unmoored from reality.

    Still, more hallucinations are the opposite of what OpenAI or its rivals like Google and Anthropic want from their most advanced models. Calling AI chatbots assistants and copilots implies they’ll be helpful, not hazardous. Lawyers have already gotten in trouble for using ChatGPT and not noticing imaginary court citations; who knows how many such errors have caused problems in less high-stakes circumstances?


    The opportunities for a hallucination to cause a problem for a user are rapidly expanding as AI systems start rolling out in classrooms, offices, hospitals, and government agencies. Sophisticated AI might help draft job applications, resolve billing issues, or analyze spreadsheets, but the paradox is that the more useful AI becomes, the less room there is for error.

    You can’t claim to save people time and effort if they have to spend just as long double-checking everything you say. Not that these models aren’t impressive. The o3 model has demonstrated some amazing feats of coding and logic, and it can even outperform many humans in some ways. The problem is that the moment it decides Abraham Lincoln hosted a podcast or that water boils at 80°F, the illusion of reliability shatters.

    Until those issues are resolved, you should take any response from an AI model with a heaping spoonful of salt. Sometimes, ChatGPT is a bit like that annoying guy in far too many meetings we’ve all attended: brimming with confidence in utter nonsense.
