LLM security is broken, here is the data

18/03/2025

It became apparent to me that there are fundamental security problems with LLMs, which make them very difficult to secure. These problems follow two dangerous anti-patterns: (a) mixing data with code, and (b) the near impossibility of reasoning about their inner workings.

The relatively young history of the security industry has shown us that whenever we mix data with code, security vulnerabilities arise. For example, the widely known software vulnerabilities SQL Injection and Cross-Site Scripting (XSS) occur because untrusted data is blended with code. That blending allows malicious data to direct the software to deviate from its normal execution (what the software should do) to execution under an adversary’s control (what the software should not do). You can read more about this in my article, “Input Validation: Necessary but Not Sufficient; It Doesn’t Target the Fundamental Issue.”

This naturally raises a question: can we address these types of vulnerabilities? If so, how? The answer is yes, we can. To tackle the root cause, we must segregate data from the code so that data is not interpreted purely as instructions. For example, we eliminate SQL Injection by using parameterized queries. We eliminate XSS by performing output encoding. Both of these techniques rely on contextual escaping to separate data from code. This ensures that untrusted data is not interpreted as code.

Unfortunately, the same remedy cannot be applied to secure LLMs against prompt injection attacks. The prompt itself (the user’s input) is part of the instruction an LLM should follow to produce the desired output. Similar to SQL Injection, in prompt injection, adversaries craft prompts that instruct the model to output something it shouldn’t (for example, leaking sensitive information about a company or other users). Simply put, adversarial prompts contain keywords and phrases that carry greater weight in determining an undesirable output path. In LLMs, the line between data and code is blurred, making it impossible to simply separate one from the other. Defensive techniques like user prompt tagging or enclosing prompts in XML tags have been shown to be ineffective.

Another fundamental problem with LLM security is our lack of visibility into their inner workings. We cannot fully reason about all the possible paths an LLM might take to generate its output because it is too large and complex. This is disastrous for security, as any time we cannot fully understand a system, vulnerabilities are high likely to arise. In my “Secure Coding Fundamentals” course (which is freely available) I spoke about how modern software is too powerful and complex. It has hidden features that developers are not aware of. These features are inherently built into the software and can be misused by adversaries. LLMs are a prime example of this complexity, making it difficult—if not impossible—to eliminate all capabilities that might be exploited.

So, should we give up on using LLMs? Of course not! We should continue using them, but we must also understand their capabilities and ill-capabilities. You should make informed decisions about how you use LLMs, including what data you feed into them and what data you do not.
In my two upcoming presentations at Programmable in Sydney and Black Hat in Singapore, I will discuss several useful LLM defensive techniques. These techniques have been battle-tested in our AI Wargame, one of the longest-running attack-and-defense competitions where participants are incentivized to build the most secure LLM applications while discovering the best prompt injection tactics. The wargame is ongoing, and we host it at top security conferences such as Black Hat and DEF CON. We promise to continue publishing our observations to help you build more secure LLM systems.

Play AppSec WarGames

Want to skill-up in secure coding and AppSec? Try SecDim Wargames to learn how to find, hack and fix security vulnerabilities inspired by real-world incidents.

For Companies Play Now

Got a comment?

Join our secure coding and AppSec community. A discussion board to share and discuss all aspects of secure programming, AppSec, DevSecOps, fuzzing, cloudsec, AIsec code review, and more.