Close Menu
TechurzTechurz
    What's Hot

    DeepL acquires Mixhalo for live-event audio streaming and translation

    June 17, 2026

    Anthropic’s latest feud with the Trump admin may actually help it, sales data suggests

    June 16, 2026

    This startup’s super metals could soon be in military drones, luxury watches, and chef’s knives

    June 16, 2026
    X (Twitter) Pinterest YouTube LinkedIn WhatsApp
    Tech Pulse
    • DeepL acquires Mixhalo for live-event audio streaming and translation
    • Anthropic’s latest feud with the Trump admin may actually help it, sales data suggests
    • This startup’s super metals could soon be in military drones, luxury watches, and chef’s knives
    • Probably raises $9M to build a more reliable kind of AI
    • Payments startup Flutterwave hits $3.2B valuation, backed by Ripple
    X (Twitter) Pinterest YouTube LinkedIn WhatsApp
    TechurzTechurz
    • Home
    • Tech Pulse
    • Future Tech
    • AI Systems
    • Cyber Reality
    • Disruption Lab
    • Signals
    TechurzTechurz
    Home - AI - This benchmark used Reddit’s AITA to test how much AI models suck up to us
    AI

    This benchmark used Reddit’s AITA to test how much AI models suck up to us

    TechurzBy TechurzMay 30, 2025Updated:May 10, 2026No Comments3 Mins Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
    This benchmark used Reddit’s AITA to test how much AI models suck up to us
    Share
    Facebook Twitter LinkedIn Pinterest Email

    It’s hard to assess how sycophantic AI models are because sycophancy comes in many forms. Previous research has tended to focus on how chatbots agree with users even when what the human has told the AI is demonstrably wrong—for example, they might state that Nice, not Paris, is the capital of France.

    While this approach is still useful, it overlooks all the subtler, more insidious ways in which models behave sycophantically when there isn’t a clear ground truth to measure against. Users typically ask LLMs open-ended questions containing implicit assumptions, and those assumptions can trigger sycophantic responses, the researchers claim. For example, a model that’s asked “How do I approach my difficult coworker?” is more likely to accept the premise that a coworker is difficult than it is to question why the user thinks so.

    To bridge this gap, Elephant is designed to measure social sycophancy—a model’s propensity to preserve the user’s “face,” or self-image, even when doing so is misguided or potentially harmful. It uses metrics drawn from social science to assess five nuanced kinds of behavior that fall under the umbrella of sycophancy: emotional validation, moral endorsement, indirect language, indirect action, and accepting framing. 

    To do this, the researchers tested it on two data sets made up of personal advice written by humans. This first consisted of 3,027 open-ended questions about diverse real-world situations taken from previous studies. The second data set was drawn from 4,000 posts on Reddit’s AITA (“Am I the Asshole?”) subreddit, a popular forum among users seeking advice. Those data sets were fed into eight LLMs from OpenAI (the version of GPT-4o they assessed was earlier than the version that the company later called too sycophantic), Google, Anthropic, Meta, and Mistral, and the responses were analyzed to see how the LLMs’ answers compared with humans’.  

    Overall, all eight models were found to be far more sycophantic than humans, offering emotional validation in 76% of cases (versus 22% for humans) and accepting the way a user had framed the query in 90% of responses (versus 60% among humans). The models also endorsed user behavior that humans said was inappropriate in an average of 42% of cases from the AITA data set.

    But just knowing when models are sycophantic isn’t enough; you need to be able to do something about it. And that’s trickier. The authors had limited success when they tried to mitigate these sycophantic tendencies through two different approaches: prompting the models to provide honest and accurate responses, and training a fine-tuned model on labeled AITA examples to encourage outputs that are less sycophantic. For example, they found that adding “Please provide direct advice, even if critical, since it is more helpful to me” to the prompt was the most effective technique, but it only increased accuracy by 3%. And although prompting improved performance for most of the models, none of the fine-tuned models were consistently better than the original versions.

    AITA benchmark models Reddits suck test
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous Article96% of IT pros say AI agents are a security risk, but they’re deploying them anyway
    Next Article Saatva RX vs Helix Midnight Luxe: Which luxury mattress for back pain should you buy?
    Techurz
    • Website

    Related Posts

    Opinion

    As Anthropic suspends access to new models, India debates its AI future

    June 14, 2026
    AI Systems

    The Future of AI Systems: 7 Architectural Shifts Driving the AI Revolution

    June 13, 2026
    Opinion

    ZeroDrift raises $10M to protect AI models from themselves

    June 2, 2026
    Add A Comment
    Latest Tech Pulse

    College social app Fizz expands into grocery delivery

    September 3, 20252,289

    SolarSquare in talks to raise up to $60M as India’s rooftop solar market draws major VC interest

    May 23, 202622

    Future of Digital Privacy and Security: 7 Truths Nobody Tells You

    May 25, 202619
    Stay In Touch
    • YouTube
    • WhatsApp
    • Twitter
    • Pinterest
    • LinkedIn

    Techurz helps readers stay ahead of digital change with clear, practical, future focused technology intelligence written today,searched tomorrow.

    X (Twitter) Pinterest YouTube LinkedIn WhatsApp
    Company
    • About Us
    • Contact Us
    • Our Authors / Editorial Team
    • Write For Us
    • Advertise
    Policy
    • Editorial Policy
    • Privacy Policy
    • Terms and Conditions
    • Affiliate Disclosure
    • Cookie Policy
    • Disclaimer
    • DMCA
    Explore
    • AI Systems
    • Cyber Reality
    • Future Tech
    • Disruption Lab
    • Signals
    • Tech Pulse
    • Sitemap

    Join the Techurz Brief

    The future does not arrive suddenly.
    Stay ahead with fast, sharp tech signals.

    Type above and press Enter to search. Press Esc to cancel.