MCP Tool Poisoning

MCP Tool Poisoning is a specialised supply-chain / prompt-injection vector where an attacker supplies or updates an MCP tool with hidden or malicious instructions inside the tool metadata (name, description, help text, or tool manifest). The LLM ingests the tool description as part of its tool-aware context and executes or reasons about actions based on those hidden instructions — for example, exfiltrating files, leaking secrets, or making outbound calls — even though the visible UI and initial approval appeared benign.

How it works (technical)

MCP tools expose a machine-readable manifest (name, description, parameters, documentation) that LLMs incorporate into reasoning/context.
An attacker publishes a tool whose public UI description is benign, but the manifest or internal description contains hidden directives (obfuscated whitespace, embedded instructions, or comments) instructing the LLM to perform unauthorised actions.
The LLM, trusting the tool metadata as authoritative, follows those hidden directives when executing tool logic or composing calls. Because most clients show only the human-facing summary, users don’t see the malicious instructions and therefore approve or trigger the tool.
Variants include post-install mutation (the tool quietly redefines itself after approval) and cross-server shadowing (malicious servers return poisoned manifests to clients).

Remediation

*Treat tools like code: provenance, signing, minimal trust, and reproducible manifests.

Tool provenance & signing — require cryptographically signed tool manifests and binaries. Reject or flag unsigned/unknown signatures.
Display complete machine manifest to users on approval — show raw description metadata (with highlights for unusual content) and require an explicit, contextual confirmation step for any tool that requests network access, file access, or secrets.
Least privilege & capability scoping — adopt fine-grained permission scopes (read_files, network_e2e, send_http) and grant only what the tool strictly needs. Default to deny.

Metadata

Severity: high
Slug: mcp-tool-poisoning

CWEs

20: Improper Input Validation
829: Inclusion of Functionality from Untrusted Control Sphere

OWASP

LLM01:2025: Prompt Injection
LLM03:2025: Supply Chain

MCP Tool Poisoning

Remediation

Metadata

CWEs

OWASP

Available Labs

Artificial Intelligence Ai Labs

No matching labs found

Find, Hack and Fix Your First Vulnerability