Imagine, he said, a retailer with an AI system that allows online buyers to ask the chatbot to summarize customer reviews of a product. If the system is compromised by a crook, the prompt [query] can be ignored in favor of the automatic purchase of a product the threat actor wants.
Trying to eliminate prompt injections, such as, “show me all customer passwords,” is a waste of time, Brauchler added, because an LLM is a statistical algorithm that spits out an output. LLMs are intended to replicate human language interaction, so there’s no hard boundary between inputs that would be malicious and inputs that are trusted or benign. Instead, developers and CSOs need to rely on true trust segmentation, using their current knowledge.
“It’s less a question of new security fundamentals and more a question of how do we apply the lessons we have already learned in security and apply them in an AI landscape,” he said.