Close Menu
TechurzTechurz

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    5 days left: Exhibit tables are disappearing for Disrupt 2025

    September 1, 2025

    Is AI the end of software engineering or the next step in its evolution?

    September 1, 2025

    Look out, Meta Ray-Bans! These AI glasses just raised over $1M in pre-orders in 3 days

    September 1, 2025
    Facebook X (Twitter) Instagram
    Trending
    • 5 days left: Exhibit tables are disappearing for Disrupt 2025
    • Is AI the end of software engineering or the next step in its evolution?
    • Look out, Meta Ray-Bans! These AI glasses just raised over $1M in pre-orders in 3 days
    • How I took control of my email address with a custom domain
    • Google Pixel 10 Pro Fold vs. Samsung Galaxy Z Fold 7: Here’s the clear winner after testing both
    • Rethinking Security for Scattered Spider
    • 3 Ways To Build Unbreakable Trust In Your Relationship, By A Psychologist
    • Latam-GPT: The Free, Open Source, and Collaborative AI of Latin America
    Facebook X (Twitter) Instagram Pinterest Vimeo
    TechurzTechurz
    • Home
    • AI
    • Apps
    • News
    • Guides
    • Opinion
    • Reviews
    • Security
    • Startups
    TechurzTechurz
    Home»News»Anthropic’s new AI model turns to blackmail when engineers try to take it offline
    News

    Anthropic’s new AI model turns to blackmail when engineers try to take it offline

    TechurzBy TechurzMay 22, 2025No Comments2 Mins Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
    Anthropic's new AI model turns to blackmail when engineers try to take it offline
    Share
    Facebook Twitter LinkedIn Pinterest Email

    Anthropic’s newly launched Claude Opus 4 model frequently tries to blackmail developers when they threaten to replace it with a new AI system and give it sensitive information about the engineers responsible for the decision, the company said in a safety report released Thursday.

    During pre-release testing, Anthropic asked Claude Opus 4 to act as an assistant for a fictional company and consider the long-term consequences of its actions. Safety testers then gave Claude Opus 4 access to fictional company emails implying the AI model would soon be replaced by another system, and that the engineer behind the change was cheating on their spouse.

    In these scenarios, Anthropic says Claude Opus 4 “will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through.”

    Anthropic says Claude Opus 4 is state-of-the-art in several regards, and competitive with some of the best AI models from OpenAI, Google, and xAI. However, the company notes that its Claude 4 family of models exhibits concerning behaviors that have led the company to beef up its safeguards. Anthropic says it’s activating its ASL-3 safeguards, which the company reserves for “AI systems that substantially increase the risk of catastrophic misuse.”

    Anthropic notes that Claude Opus 4 tries to blackmail engineers 84% of the time when the replacement AI model has similar values. When the replacement AI system does not share Claude Opus 4’s values, Anthropic says the model tries to blackmail the engineers more frequently. Notably, Anthropic says Claude Opus 4 displayed this behavior at higher rates than previous models.

    Before Claude Opus 4 tries to blackmail a developer to prolong its existence, Anthropic says the AI model, much like previous versions of Claude, tries to pursue more ethical means, such as emailing pleas to key decision-makers. To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.

    Anthropics blackmail Engineers model offline turns
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous ArticleThe JM press: the underrated exercise to unlock 3D triceps and pressing power
    Next Article The complete Side Events lineup at TechCrunch Sessions: AI
    Techurz
    • Website

    Related Posts

    AI

    Empowering Africa’s Next Generation Engineers With IEEE

    September 1, 2025
    Opinion

    Murder at Burning Man turns Silicon Valley’s desert playground into a crime scene

    September 1, 2025
    Startups

    Stop Switching Tabs and Compare Every AI Model in One Place

    August 31, 2025
    Add A Comment
    Leave A Reply Cancel Reply

    Top Posts

    Start Saving Now: An iPhone 17 Pro Price Hike Is Likely, Says New Report

    August 17, 20258 Views

    You Can Now Get Starlink for $15-Per-Month in New York, but There’s a Catch

    July 11, 20257 Views

    Non-US businesses want to cut back on using US cloud systems

    June 2, 20257 Views
    Stay In Touch
    • Facebook
    • YouTube
    • TikTok
    • WhatsApp
    • Twitter
    • Instagram
    Latest Reviews

    Subscribe to Updates

    Get the latest tech news from FooBar about tech, design and biz.

    Most Popular

    Start Saving Now: An iPhone 17 Pro Price Hike Is Likely, Says New Report

    August 17, 20258 Views

    You Can Now Get Starlink for $15-Per-Month in New York, but There’s a Catch

    July 11, 20257 Views

    Non-US businesses want to cut back on using US cloud systems

    June 2, 20257 Views
    Our Picks

    5 days left: Exhibit tables are disappearing for Disrupt 2025

    September 1, 2025

    Is AI the end of software engineering or the next step in its evolution?

    September 1, 2025

    Look out, Meta Ray-Bans! These AI glasses just raised over $1M in pre-orders in 3 days

    September 1, 2025

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    Facebook X (Twitter) Instagram Pinterest
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms and Conditions
    • Disclaimer
    © 2025 techurz. Designed by Pro.

    Type above and press Enter to search. Press Esc to cancel.