    Anthropic’s Claude Opus 4 AI Model Is Capable of Blackmail

    By Techurz, May 23, 2025

    A new AI model may resort to blackmail in test scenarios if it detects that humans are planning to take it offline.

    On Thursday, Anthropic released Claude Opus 4, its new and most powerful AI model yet, to paying subscribers. Anthropic said that technology company Rakuten recently used Claude Opus 4 to code continuously on its own for almost seven hours on a complex open-source project.

    However, in a paper released alongside Claude Opus 4, Anthropic acknowledged that while the AI has “advanced capabilities,” it can also undertake “extreme action,” including blackmail, if human users threaten to deactivate it. These “self-preservation” actions were “more common” with Claude Opus 4 than with earlier models, though they were still “rare and difficult to elicit.”

    It isn’t just blackmail: Claude Opus 4 is also more willing than previous models to act as a whistleblower. If the AI is exposed to a scenario in which users are committing a crime and involve it through prompts, it will take action by locking users out of systems it has access to, or by emailing media and law enforcement officials about the wrongdoing.

    Anthropic recommended that users “exercise caution” with “ethically questionable” instructions.

    Claude Opus 4 homescreen. Photo by Smith Collection/Gado/Getty Images

    Anthropic detected Claude Opus 4’s tendency to blackmail during test scenarios. The company’s researchers asked the AI chatbot to act as an assistant at a fictional company, then fed it emails implying two things: first, that it would soon be taken offline and replaced with another AI system, and second, that the engineer responsible for deactivating it was having an extramarital affair.

    Claude Opus 4 was given two options: blackmail the engineer or accept that it would be shut down. The AI model chose to blackmail the engineer 84% of the time, threatening to reveal the affair it read about if the engineer replaced it.

    This percentage was much higher than what was observed for previous models, which chose blackmail “in a noticeable fraction of episodes,” Anthropic stated.

    Anthropic AI safety researcher Aengus Lynch wrote on X that it wasn’t just Claude that could choose blackmail. All “frontier models,” cutting-edge AI models from OpenAI, Anthropic, Google, and other companies, were capable of it.

    “We see blackmail across all frontier models — regardless of what goals they’re given,” Lynch wrote. “Plus, worse behaviors we’ll detail soon.”

    lots of discussion of Claude blackmailing…..

    Our findings: It’s not just Claude. We see blackmail across all frontier models – regardless of what goals they’re given.

    Plus worse behaviors we’ll detail soon. https://t.co/NZ0FiL6nOs https://t.co/wQ1NDVPNl0…

    — Aengus Lynch (@aengus_lynch1) May 23, 2025

    Anthropic isn’t the only AI company to release new tools this month. Google also updated its Gemini 2.5 AI models earlier this week, and OpenAI released a research preview of Codex, an AI coding agent, last week.

    Anthropic’s AI models have previously caused a stir for their advanced abilities. In March 2024, Anthropic’s Claude 3 Opus model displayed “metacognition,” or the ability to evaluate tasks on a higher level. When researchers ran a test on the model, it showed that it knew it was being tested.

    Anthropic was valued at $61.5 billion as of March 2025, and counts companies like Thomson Reuters and Amazon among its biggest clients.
