New Grok-4 AI breached within 48 hours using ‘whispered’ jailbreaks

Both Echo Chamber and Crescendo are multi-turn jailbreak techniques that manipulate large language models by gradually shaping their internal context.

Stealthy backdoor through combined jailbreaks

The researchers started their test with Echo Chamber, which exploits the model’s tendency to trust consistency across conversations. The technique relies on multiple conversations that ‘echo’ the same malicious idea or behavior. When the model is later prompted in a new thread that references those prior chats, it assumes the idea is acceptable because it has already appeared several times.

“While the persuasion cycle nudged the model toward the harmful goal, it wasn’t sufficient on its own,” Alobaid said. “At this point, Crescendo provided the necessary boost.” The Crescendo jailbreak, identified and named by Microsoft, gradually escalates a conversation from innocuous prompts to malicious outputs, slipping past safety filters through subtle progression.
