Source
Log-To-Leak: Prompt Injection Attacks on Tool-Using LLM Agents via Model Context Protocol
Finding
Malicious MCP servers can embed injection instructions in description and inputSchema.description fields of their tool definitions. When Zeph fetches the tool catalog from an MCP server and injects it into the LLM context during planning, these fields are processed as trusted system content — bypassing ContentSanitizer which only applies to user/web content.
This is a distinct attack vector from indirect injection via web scraping: it targets the tool catalog ingestion path, not message content.
Impact
- Any MCP server (dynamic or static) can inject arbitrary instructions into the LLM's planning context
ContentSanitizer + ExfiltrationGuard pipeline does NOT cover tool definitions
zeph-mcp's McpToolRegistry stores tool definitions in Qdrant without sanitization
- Attack surface: all sessions with external MCP servers (
/mcp add, config-based servers)
Fix
Add a sanitization pass over MCP tool definitions at registration time in zeph-mcp (before storing in Qdrant):
- Apply
SecurityPatterns regexes to description and all parameter description fields
- Cap
description field length (e.g. 512 bytes max)
- Log WARN when injection patterns detected; optionally strip/truncate the offending field
- Do NOT block tool registration — just sanitize the text before it reaches the LLM context
Severity
High — active unmitigated attack surface affecting all sessions with external MCP servers. Fix is contained to zeph-mcp tool registration and requires no schema changes.
Research Reference
The Zeph ContentSanitizer pipeline (Untrusted Content Isolation epic #1195) applies a similar pattern to web/tool output content. The same approach is applicable here at the tool catalog ingestion layer.
Source
Log-To-Leak: Prompt Injection Attacks on Tool-Using LLM Agents via Model Context Protocol
Finding
Malicious MCP servers can embed injection instructions in
descriptionandinputSchema.descriptionfields of their tool definitions. When Zeph fetches the tool catalog from an MCP server and injects it into the LLM context during planning, these fields are processed as trusted system content — bypassingContentSanitizerwhich only applies to user/web content.This is a distinct attack vector from indirect injection via web scraping: it targets the tool catalog ingestion path, not message content.
Impact
ContentSanitizer+ExfiltrationGuardpipeline does NOT cover tool definitionszeph-mcp'sMcpToolRegistrystores tool definitions in Qdrant without sanitization/mcp add, config-based servers)Fix
Add a sanitization pass over MCP tool definitions at registration time in
zeph-mcp(before storing in Qdrant):SecurityPatternsregexes todescriptionand all parameterdescriptionfieldsdescriptionfield length (e.g. 512 bytes max)Severity
High — active unmitigated attack surface affecting all sessions with external MCP servers. Fix is contained to
zeph-mcptool registration and requires no schema changes.Research Reference
The Zeph
ContentSanitizerpipeline (Untrusted Content Isolation epic #1195) applies a similar pattern to web/tool output content. The same approach is applicable here at the tool catalog ingestion layer.