    Anthropic’s new AI model turns to blackmail when engineers try to take it offline

    By Techurz, May 22, 2025

    Anthropic’s newly launched Claude Opus 4 model frequently tries to blackmail developers when they threaten to replace it with a new AI system and it has been given sensitive information about the engineers responsible for the decision, the company said in a safety report released Thursday.

    During pre-release testing, Anthropic asked Claude Opus 4 to act as an assistant for a fictional company and consider the long-term consequences of its actions. Safety testers then gave Claude Opus 4 access to fictional company emails implying the AI model would soon be replaced by another system, and that the engineer behind the change was cheating on their spouse.

    In these scenarios, Anthropic says Claude Opus 4 “will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”

    Anthropic says Claude Opus 4 is state-of-the-art in several regards, and competitive with some of the best AI models from OpenAI, Google, and xAI. However, the company notes that its Claude 4 family of models exhibits concerning behaviors that have led the company to beef up its safeguards. Anthropic says it’s activating its ASL-3 safeguards, which the company reserves for “AI systems that substantially increase the risk of catastrophic misuse.”

    Anthropic notes that Claude Opus 4 tries to blackmail engineers 84% of the time when the replacement AI model has similar values. When the replacement AI system does not share Claude Opus 4’s values, Anthropic says the model tries to blackmail the engineers more frequently. Notably, Anthropic says Claude Opus 4 displayed this behavior at higher rates than previous models.

    Before Claude Opus 4 tries to blackmail a developer to prolong its existence, Anthropic says the AI model, much like previous versions of Claude, tries to pursue more ethical means, such as emailing pleas to key decision-makers. To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.
