“Your browser (the UI) shows you a nice, clean prompt,” explains the report. “But the raw text that gets fed to the LLM has a secret, hidden payload tucked inside, encoded using Tags Unicode Blocks, characters not designed to be shown in the UI and therefore invisible. The LLM reads the hidden text, acts on it, and you see nothing wrong. It’s a fundamental application logic flaw.”
This flaw is “particularly dangerous when LLMs, like Gemini, are deeply integrated into enterprise platforms like Google Workspace,” the report adds.
FireTail tested six AI agents. OpenAI’s ChatGPT, Microsoft Copilot, and Anthropic AI’s Claude caught the attack. Gemini, DeepSeek, and Grok failed.