Play AppSec WarGames
Want to skill-up in secure coding and AppSec? Try SecDim Wargames to learn how to find, hack and fix security vulnerabilities inspired by real-world incidents.
Your AI assistant just received a WhatsApp message. It ran a shell command. Then it wrote new code and executed it. This is how OpenClaw works by design — and why 104 vulnerabilities appeared in 18 days.
OpenClaw (previously known as Clawdbot and Moltbot) is an autonomous local AI agent that can write code, run shell commands, access files, send messages, and control a browser. It has become the fastest-growing GitHub repository in history. When the pace of development overtakes security scrutiny, bad things start to happen.
On February 19th, 23 vulnerabilities were reported on OpenClaw in a single day, more than half of them rated High or Critical severity. This volume in such a short span of time is unlike anything we have seen in the history of software security.
104 vulnerabilities were published in just 18 days (February 2–19) — more than 100 CVEs in a month. The vulnerability classes are as follows:
execSync with unsanitised git log output0:0:0:0:0:ffff:7f00:1)None of these vulnerability classes are new. They have appeared in the OWASP Top 10 repeatedly. The fact that AI introduces them in bulk points to a deeper problem. Let’s examine one of these vulnerabilities.
CVE-2026-27001 is an example of insecure design coupled with a complete lack of input validation, resulting in prompt injection. For the agent to operate, the current working directory path is embedded in the LLM system prompt as a plain string. Because this untrusted field was never sanitised, an adversary could use newline characters or Unicode bidirectional markers to inject additional instructions into the system prompt. The result is prompt injection leading to host compromise — an adversary can bypass system prompt safeguards, leak data, or backdoor the agent.
The patch introduced the following function to sanitise the path, stripping Unicode Cc (control) and Cf (format) categories and explicit line/paragraph separators U+2028/U+2029:
export function sanitizeForPromptLiteral(value: string): string {
return value.replace(/[\p{Cc}\p{Cf}\u2028\u2029]/gu, "");
}
This patch is a shortcut that leaves the root cause untouched. It removes known-bad characters and assumes whatever remains is safe — a flawed approach that remains open to exploitation. The correct fix is in the app design. The path should never be treated as a literal string inside the instruction context. It should be passed as structured data, outside the prompt entirely.
The following comparison uses two popular packages from the same AI ecosystem. The comparison method accounts for both the rate (count) and nature (class) of published vulnerabilities.
LangChain is a popular Python and JavaScript framework for building agents and LLM-powered applications. It launched in October 2022 and reached over 90,000 GitHub stars within its first year. At the time of writing, LangChain has approximately 20 reported CVEs — fewer than 0.5 per month. That is a rate 200 times lower than OpenClaw, for a package that is three years older. LangChain did have critical code execution vulnerabilities (CVE-2023-29374, CVE-2023-36258, CVE-2023-34541, CVE-2024-46946), but they all lived in optional, opt-in features — specifically eval() and exec() inside PALChain, LLMMathChain, and LLMSymbolicMathChain. The patches added dangerous functions to a blocklist. Although severity was critical, the vulnerabilities existed behind explicit opt-in, with other controls reducing impact. OpenClaw, by contrast, exposes unrestricted shell execution with no safeguard in the design to prevent future exploitation.
One could argue that OpenClaw attracted more researcher attention because of its viral growth. But LangChain was also the fastest-growing GitHub repository in its class and accumulated fewer CVEs across its entire lifetime than OpenClaw did in 18 days. The difference is design, not attention.
Ollama is a popular Go application for running LLMs locally. It launched in November 2023 and has over 160,000 GitHub stars. At the time of writing, it has approximately 15 reported CVEs — also under 0.5 per month. Its vulnerability profile consists mostly of path traversal, model tampering, and denial of service. There are no OS command injection or shell execution CVEs for Ollama. By design, Ollama has no shell, no file system writes outside its model store, and no stored user tokens. The worst-case outcome of exploiting any Ollama vulnerability is model theft or a crashed application.
| LangChain | Ollama | OpenClaw | |
|---|---|---|---|
| Released | Oct 2022 | Nov 2023 | Nov 2025 |
| CVEs count | ~20 | ~15 | 104 advisories / 28 CVEs |
| CVE rate | 0.5/month | 0.5/month | 100+/month |
| OS command access | Opt-in, isolated | None | Core, enabled by default |
| File system write | Optional toolkit | None | Core, enabled by default |
| Dominant CVEs | code injection in optional chains | Path traversal, DoS | OS command injection, auth bypass, SSRF, docker escape, etc. |
LangChain and Ollama are two to three years older than OpenClaw. Neither accumulated 100+ CVEs in their first month.
Setting count and rate aside, the nature of the vulnerabilities is more concerning. LangChain’s worst vulnerability requires an adversary to reach eval()/exec() inside an optional feature. Ollama’s worst vulnerability lets an adversary steal or corrupt a model. OpenClaw’s worst vulnerabilities hand an adversary an OS shell on the user’s machine — with access to their email, calendar, files, and every SaaS token they have configured.
OpenClaw has an insecure design filled with dangerous core features that are enabled by default. This creates the precise conditions for high-severity vulnerabilities.
The situation goes beyond insecure design and dangerous capabilities. OpenClaw’s attack surface changes dynamically at runtime.
OpenClaw’s codebase is vibe coded. The maintainers openly invite AI-generated contributions into production with no qualifier about security review. OpenClaw can write its own code at runtime and execute it on the same machine, with the same host privileges. The --yolo flag instructs the coding agent to execute without user confirmation. Every skill the agent writes at runtime is new, unaudited, AI-generated code running with shell privileges. A malicious skill in ClawHub can exploit this loop via prompt injection to compromise the host machine.
This is a self-modification loop with no security guarantee. In conventional applications, a vulnerability is introduced into a fixed codebase, discovered, and patched. With OpenClaw, that assumption is broken. The codebase changes at runtime. A patch provides no guarantee that the code the agent generates and executes next session is safe. This makes OpenClaw’s attack surface dynamic and can introduce new threats at any moment. We cannot even measure the extend of the vulnerability exploitation.
OpenClaw’s insecure design and dangerous default capabilities create the conditions for high-severity vulnerabilities. Vibe coding compounds this.
The industry learned that using C for memory-critical systems consistently produces severe vulnerabilities — buffer overflows, use-after-free bugs, format string exploits — because the language provides no safety net and exposes dangerous capabilities directly. The response was not “write better C.” It was Rust, Go, and memory-safe-by-design alternatives. These languages have security controls enforced by the design.
The answer to vibe-coded agentic AI follows the same logic. It is not “review the generated code more carefully.” It is enforcing secure-by-design principles:
NVIDIA’s NemoClaw attempts to address one aspect of this by wrapping OpenClaw inside a container runtime with network egress policies and filesystem containment. This does not give a robust sandboxing like a Virtual Machine. It is a container-level isolation so effectively the untrusted process runs along side the host processes. OpenClaw already has a Docker container escape CVE (CVE-2026-27002). NemoClaw also routes all inference calls through NVIDIA’s own cloud endpoints, replacing one trust dependency with another. Overall, NemoClaw does not address the insecure-by-default design, the missing authentication, or the dynamic attack surface created by runtime code generation.
Treat LLM-generated code as untrusted input. Sandbox it. Approve it explicitly. Never execute it with host privileges.
If you want to build the skills to prevent the vulnerability classes driving OpenClaw’s CVE record — OS command injection, path traversal, SSRF, and prompt injection — SecDim’s secure coding challenges let you practice identifying and fixing exactly these issues in realistic environment.
Want to skill-up in secure coding and AppSec? Try SecDim Wargames to learn how to find, hack and fix security vulnerabilities inspired by real-world incidents.
Join our secure coding and AppSec community. A discussion board to share and discuss all aspects of secure programming, AppSec, DevSecOps, fuzzing, cloudsec, AIsec code review, and more.
Read more