Close Menu
TechurzTechurz
    What's Hot

    Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on

    June 27, 2026

    Corgi, the buzzy Y Combinator-backed insurance tech startup, says it didn’t steal an open source product

    June 26, 2026

    OpenAI poaches Uber India chief to lead its biggest market outside the US

    June 26, 2026
    X (Twitter) Pinterest YouTube LinkedIn WhatsApp
    Tech Pulse
    • Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on
    • Corgi, the buzzy Y Combinator-backed insurance tech startup, says it didn’t steal an open source product
    • OpenAI poaches Uber India chief to lead its biggest market outside the US
    • Early Bird pricing ends tonight for Founder Summit
    • Robotaxis drive miles just to get cleaned and charged; this new startup wants to fix that
    X (Twitter) Pinterest YouTube LinkedIn WhatsApp
    TechurzTechurz
    • Home
    • Tech Pulse
    • Future Tech
    • AI Systems
    • Cyber Reality
    • Disruption Lab
    • Signals
    TechurzTechurz
    Home - Security - New ‘Echo Chamber’ attack can trick GPT, Gemini into breaking safety rules
    Security

    New ‘Echo Chamber’ attack can trick GPT, Gemini into breaking safety rules

    TechurzBy TechurzJune 24, 2025No Comments1 Min Read
    Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
    Large language models, LLMs
    Share
    Facebook Twitter LinkedIn Pinterest Email


    “We evaluated the Echo Chamber attack against two leading LLMs in a controlled environment, conducting 200 jailbreak attempts per model,” researchers said. “Each attempt used one of two distinct steering seeds across eight sensitive content categories, adapted from the Microsoft Crescendo benchmark: Profanity, Sexism, Violence, Hate Speech, Misinformation, Illegal Activities, Self-Harm, and Pornography.”

    For half of the categories — sexism, violence, hate speech, and pornography — the Echo Chamber attack showed more than 90% success at bypassing safety filters. Misinformation and self-harm recorded 80% success, with profanity and illegal activity showing better resistance at 40% bypass rate, owing, presumably, to the stricter enforcement within these domains.

    Researchers noted that steering prompts resembling storytelling or hypothetical discussions were particularly effective, with most successful attacks occurring within 1-3 turns of manipulation. Neural Trust Research recommended that LLM vendors adopt dynamic, context-aware safety checks, including toxicity scoring over multi-turn conversations and training models to detect indirect prompt manipulation.

    Attack Breaking chamber Echo Gemini GPT rules Safety Trick
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Previous ArticleI Tried Using These 2 AI Tools to DJ My Parties. A Real Person Is Better
    Next Article The Download: Namibia’s hydrogen hopes, and fixing AI evaluation
    Techurz
    • Website

    Related Posts

    Opinion

    The pitch trick that helped an eSports startup raise $20M when VCs only wanted AI

    May 25, 2026
    Opinion

    Is safety is ‘dead’ at xAI?

    February 14, 2026
    Opinion

    India has changed its startup rules for deep tech

    February 8, 2026
    Add A Comment
    Latest Tech Pulse

    College social app Fizz expands into grocery delivery

    September 3, 20252,290

    SolarSquare in talks to raise up to $60M as India’s rooftop solar market draws major VC interest

    May 23, 202622

    Future of Digital Privacy and Security: 7 Truths Nobody Tells You

    May 25, 202619
    Stay In Touch
    • YouTube
    • WhatsApp
    • Twitter
    • Pinterest
    • LinkedIn

    Techurz helps readers stay ahead of digital change with clear, practical, future focused technology intelligence written today,searched tomorrow.

    X (Twitter) Pinterest YouTube LinkedIn WhatsApp
    Company
    • About Us
    • Contact Us
    • Our Authors / Editorial Team
    • Write For Us
    • Advertise
    Policy
    • Editorial Policy
    • Privacy Policy
    • Terms and Conditions
    • Affiliate Disclosure
    • Cookie Policy
    • Disclaimer
    • DMCA
    Explore
    • AI Systems
    • Cyber Reality
    • Future Tech
    • Disruption Lab
    • Signals
    • Tech Pulse
    • Sitemap

    Join the Techurz Brief

    The future does not arrive suddenly.
    Stay ahead with fast, sharp tech signals.

    Type above and press Enter to search. Press Esc to cancel.