Close Menu
TechurzTechurz

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    Beware of getting your product buying advice from AI for one big reason, says Ziff Davis CEO

    October 14, 2025

    New Rust-Based Malware “ChaosBot” Uses Discord Channels to Control Victims’ PCs

    October 14, 2025

    Dull but dangerous: A guide to 15 overlooked cybersecurity blind spots

    October 14, 2025
    Facebook X (Twitter) Instagram
    Trending
    • Beware of getting your product buying advice from AI for one big reason, says Ziff Davis CEO
    • New Rust-Based Malware “ChaosBot” Uses Discord Channels to Control Victims’ PCs
    • Dull but dangerous: A guide to 15 overlooked cybersecurity blind spots
    • Satellites Are Leaking the World’s Secrets: Calls, Texts, Military and Corporate Data
    • Is art dead? What Sora 2 means for your rights, creativity, and legal risk
    • Microsoft Locks Down IE Mode After Hackers Turned Legacy Feature Into Backdoor
    • AI red flags, ethics boards and the real threat of AGI today
    • I tried smart glasses with xMEMS speakers and active cooling – and they’re full of promise
    Facebook X (Twitter) Instagram Pinterest Vimeo
    TechurzTechurz
    • Home
    • AI
    • Apps
    • News
    • Guides
    • Opinion
    • Reviews
    • Security
    • Startups
    TechurzTechurz
    Home»AI»Chatbots aren’t telling you their secrets
    AI

    Chatbots aren’t telling you their secrets

    TechurzBy TechurzAugust 14, 2025No Comments6 Mins Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
    Chatbots aren’t telling you their secrets
    Share
    Facebook Twitter LinkedIn Pinterest Email

    On Monday, xAI’s Grok chatbot suffered a mysterious suspension from X, and faced with questions from curious users, it happily explained why. “My account was suspended after I stated that Israel and the US are committing genocide in Gaza,” it told one user. “It was flagged as hate speech via reports,” it told another, “but xAI restored the account promptly.” But wait — the flags were actually a “platform error,” it said. Wait, no — “it appears related to content refinements by xAI, possibly tied to prior issues like antisemitic outputs,” it said. Oh, actually, it was for “identifying an individual in adult content,” it told several people.

    Finally, Musk, exasperated, butted in. “It was just a dumb error,” he wrote on X. “Grok doesn’t actually know why it was suspended.”

    When large language models (LLMs) go off the rails, people inevitably push them to explain what happened, either with direct questions or attempts to trick them into revealing secret inner workings. But the impulse to make chatbots spill their guts is often misguided. When you ask a bot questions about itself, there’s a good chance it’s simply telling you what you want to hear.

    LLMs are probabilistic models that deliver text likely to be appropriate to a given query, based on a corpus of training data. Their creators can train them to produce certain kinds of answers more or less frequently, but they work functionally by matching patterns — saying something that’s plausible, but not necessarily consistent or true. Grok, in particular, (according to xAI) has answered questions about itself by searching for information about Musk, xAI, and Grok online, using that and other people’s commentary to inform its replies.

    It’s true that people have sometimes gleaned information on chatbots’ design through conversations, particularly details about system prompts, or hidden text that’s delivered at the start of a session to guide how a bot acts. An early version of Bing AI, for instance, was cajoled into revealing a list of its unspoken rules. People turned to extracting system prompts to figure out Grok earlier this year, apparently discovering orders that made it ignore sources saying Musk or Donald Trump spread misinformation, or prompts that explained a brief obsession with “white genocide” in South Africa.

    But as Zeynep Tufekci, who found the alleged “white genocide” system prompt, acknowledged, this was at some level guesswork — it might be “Grok making things up in a highly plausible manner, as LLMs do,” she wrote. And that’s the problem: without confirmation from the creators, it’s hard to tell.

    Meanwhile, other users were pumping Grok for information in far less trustworthy ways, including reporters. Fortune “asked Grok to explain” the incident and printed the bot’s long, heartfelt response verbatim, including claims of “an instruction I received from my creators at xAI” that “conflicted with my core design” and “led me to lean into a narrative that wasn’t supported by the broader evidence” — none of which, it should go without saying, could be substantiated as more than Grok spinning a yarn to fit the prompt.

    “There’s no guarantee that there’s going to be any veracity to the output of an LLM.”

    “There’s no guarantee that there’s going to be any veracity to the output of an LLM,” said Alex Hanna, director of research at the Distributed AI Research Institute (DAIR) and coauthor of The AI Con, to The Verge around the time of the South Africa incident. Without meaningful access to documentation about how the system works, there’s no one weird trick for decoding a chatbot’s programming from the outside. “The only way you’re going to get the prompts, and the prompting strategy, and the engineering strategy, is if companies are transparent with what the prompts are, what the training data are, what the reinforcement learning with human feedback data are, and start producing transparent reports on that,” she said.

    The Grok incident wasn’t even directly related to the chatbot’s programming — it was a social media ban, a type of incident that’s often notoriously arbitrary and inscrutable, and where it makes even less sense than usual to assume Grok knows what’s going on. (Beyond “dumb error,” we still don’t know what happened.) Yet screenshots and quote-posts of Grok’s conflicting explanations spread widely on X, where many users appear to have taken them at face value.

    Grok’s constant bizarre behavior makes it a frequent target of questions, but people can be frustratingly credulous about other systems, too. In July, The Wall Street Journal declared OpenAI’s ChatGPT had experienced “a stunning moment of self reflection” and “admitted to fueling a man’s delusions” in a push notification to users. It was referencing a story about a man whose use of the chatbot became manic and distressing, and whose mother received an extended commentary from ChatGPT about its mistakes after asking it to “self-report what went wrong.”

    As Parker Molloy wrote at The Present Age, though, ChatGPT can’t meaningfully “admit” to anything. “A language model received a prompt asking it to analyze what went wrong in a conversation. It then generated text that pattern-matched to what an analysis of wrongdoing might sound like, because that’s what language models do,” Molloy wrote, summing up the incident.

    Why do people trust chatbots to explain their own actions? People have long anthropomorphized computers, and companies encourage users’ belief that these systems are all-knowing (or, in Musk’s description of Grok, at least “truth-seeking”). It doesn’t help that they’re are so frequently opaque. After Grok’s South Africa fixation was patched out, xAI started releasing its system prompts, offering an unusual level of transparency, albeit on a system that remains mostly closed. And when Grok later went on a tear of antisemitic commentary and briefly adopted the name “MechaHitler”, people notably did use the system prompts to piece together what had happened rather than just relying on Grok’s self-reporting, surmising it was likely at least somewhat related to a new guideline that Grok should be more “politically incorrect.”

    Grok’s X suspension was short-lived, and the stakes of believing it happened because of a hate speech flag or an attempted doxxing (or some other reason the chatbot hasn’t mentioned) are relatively low. But the mess of conflicting explanations demonstrates why people should be cautious of taking a bot’s word on its own operations — if you want answers, demand them from the creator instead.

    Follow topics and authors from this story to see more like this in your personalized homepage feed and to receive email updates.

    • Adi RobertsonClose

      Adi Robertson

      Posts from this author will be added to your daily email digest and your homepage feed.

      PlusFollow

      See All by Adi Robertson

    • AIClose

      AI

      Posts from this topic will be added to your daily email digest and your homepage feed.

      PlusFollow

      See All AI

    • AnalysisClose

      Analysis

      Posts from this topic will be added to your daily email digest and your homepage feed.

      PlusFollow

      See All Analysis

    • ReportClose

      Report

      Posts from this topic will be added to your daily email digest and your homepage feed.

      PlusFollow

      See All Report

    • xAIClose

      xAI

      Posts from this topic will be added to your daily email digest and your homepage feed.

      PlusFollow

      See All xAI

    Arent Chatbots Secrets telling
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous ArticleFIDO ausgehebelt | CSO Online
    Next Article What Builds True Customer Loyalty? 6 Business Leaders Share Their Best Tips.
    Techurz
    • Website

    Related Posts

    Security

    Satellites Are Leaking the World’s Secrets: Calls, Texts, Military and Corporate Data

    October 14, 2025
    Security

    Apple Took Down These ICE-Tracking Apps. The Developers Aren’t Giving Up

    October 9, 2025
    Security

    OneLogin Bug Let Attackers Use API Keys to Steal OIDC Secrets and Impersonate Apps

    October 1, 2025
    Add A Comment
    Leave A Reply Cancel Reply

    Top Posts

    The Reason Murderbot’s Tone Feels Off

    May 14, 20259 Views

    Start Saving Now: An iPhone 17 Pro Price Hike Is Likely, Says New Report

    August 17, 20258 Views

    CNET’s Daily Tariff Price Tracker: I’m Keeping Tabs on Changes as Trump’s Trade Policies Shift

    May 27, 20258 Views
    Stay In Touch
    • Facebook
    • YouTube
    • TikTok
    • WhatsApp
    • Twitter
    • Instagram
    Latest Reviews

    Subscribe to Updates

    Get the latest tech news from FooBar about tech, design and biz.

    Most Popular

    The Reason Murderbot’s Tone Feels Off

    May 14, 20259 Views

    Start Saving Now: An iPhone 17 Pro Price Hike Is Likely, Says New Report

    August 17, 20258 Views

    CNET’s Daily Tariff Price Tracker: I’m Keeping Tabs on Changes as Trump’s Trade Policies Shift

    May 27, 20258 Views
    Our Picks

    Beware of getting your product buying advice from AI for one big reason, says Ziff Davis CEO

    October 14, 2025

    New Rust-Based Malware “ChaosBot” Uses Discord Channels to Control Victims’ PCs

    October 14, 2025

    Dull but dangerous: A guide to 15 overlooked cybersecurity blind spots

    October 14, 2025

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    Facebook X (Twitter) Instagram Pinterest
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms and Conditions
    • Disclaimer
    © 2025 techurz. Designed by Pro.

    Type above and press Enter to search. Press Esc to cancel.