Close Menu
TechurzTechurz
    What's Hot

    Arena, the AI leaderboard everyone uses, is now a $100M business

    June 29, 2026

    Omen AI’s plan to optimize data centers is all wet

    June 29, 2026

    Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on

    June 27, 2026
    X (Twitter) Pinterest YouTube LinkedIn WhatsApp
    Tech Pulse
    • Arena, the AI leaderboard everyone uses, is now a $100M business
    • Omen AI’s plan to optimize data centers is all wet
    • Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on
    • Corgi, the buzzy Y Combinator-backed insurance tech startup, says it didn’t steal an open source product
    • OpenAI poaches Uber India chief to lead its biggest market outside the US
    X (Twitter) Pinterest YouTube LinkedIn WhatsApp
    TechurzTechurz
    • Home
    • Tech Pulse
    • Future Tech
    • AI Systems
    • Cyber Reality
    • Disruption Lab
    • Signals
    TechurzTechurz
    Home - News - Anthropic’s new AI model turns to blackmail when engineers try to take it offline
    News

    Anthropic’s new AI model turns to blackmail when engineers try to take it offline

    TechurzBy TechurzMay 22, 2025Updated:May 11, 2026No Comments2 Mins Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
    Anthropic's new AI model turns to blackmail when engineers try to take it offline
    Share
    Facebook Twitter LinkedIn Pinterest Email

    Anthropic’s newly launched Claude Opus 4 model frequently tries to blackmail developers when they threaten to replace it with a new AI system and give it sensitive information about the engineers responsible for the decision, the company said in a safety report released Thursday.

    During pre-release testing, Anthropic asked Claude Opus 4 to act as an assistant for a fictional company and consider the long-term consequences of its actions. Safety testers then gave Claude Opus 4 access to fictional company emails implying the AI model would soon be replaced by another system, and that the engineer behind the change was cheating on their spouse.

    In these scenarios, Anthropic says Claude Opus 4 “will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”

    Anthropic says Claude Opus 4 is state-of-the-art in several regards, and competitive with some of the best AI models from OpenAI, Google, and xAI. However, the company notes that its Claude 4 family of models exhibits concerning behaviors that have led the company to beef up its safeguards. Anthropic says it’s activating its ASL-3 safeguards, which the company reserves for “AI systems that substantially increase the risk of catastrophic misuse.”

    Anthropic notes that Claude Opus 4 tries to blackmail engineers 84% of the time when the replacement AI model has similar values. When the replacement AI system does not share Claude Opus 4’s values, Anthropic says the model tries to blackmail the engineers more frequently. Notably, Anthropic says Claude Opus 4 displayed this behavior at higher rates than previous models.

    Before Claude Opus 4 tries to blackmail a developer to prolong its existence, Anthropic says the AI model, much like previous versions of Claude, tries to pursue more ethical means, such as emailing pleas to key decision-makers. To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.

    Anthropics blackmail Engineers model offline turns
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous ArticleThe JM press: the underrated exercise to unlock 3D triceps and pressing power
    Next Article The complete Side Events lineup at TechCrunch Sessions: AI
    Techurz
    • Website

    Related Posts

    Opinion

    Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on

    June 27, 2026
    Opinion

    The US banned Anthropic’s Fable 5 release, but the numbers don’t seem to care

    June 19, 2026
    Opinion

    Pixi’s new iOS app turns text messages into interactive AR experiences

    June 18, 2026
    Add A Comment
    Latest Tech Pulse

    College social app Fizz expands into grocery delivery

    September 3, 20252,290

    SolarSquare in talks to raise up to $60M as India’s rooftop solar market draws major VC interest

    May 23, 202622

    Future of Digital Privacy and Security: 7 Truths Nobody Tells You

    May 25, 202619
    Stay In Touch
    • YouTube
    • WhatsApp
    • Twitter
    • Pinterest
    • LinkedIn

    Techurz helps readers stay ahead of digital change with clear, practical, future focused technology intelligence written today,searched tomorrow.

    X (Twitter) Pinterest YouTube LinkedIn WhatsApp
    Company
    • About Us
    • Contact Us
    • Our Authors / Editorial Team
    • Write For Us
    • Advertise
    Policy
    • Editorial Policy
    • Privacy Policy
    • Terms and Conditions
    • Affiliate Disclosure
    • Cookie Policy
    • Disclaimer
    • DMCA
    Explore
    • AI Systems
    • Cyber Reality
    • Future Tech
    • Disruption Lab
    • Signals
    • Tech Pulse
    • Sitemap

    Join the Techurz Brief

    The future does not arrive suddenly.
    Stay ahead with fast, sharp tech signals.

    Type above and press Enter to search. Press Esc to cancel.