Finding
A Framework for Formalizing LLM Agent Security (arXiv:2603.19469)
Defines four security properties:
- Task Alignment — agent actions serve the user's stated goal
- Action Alignment — each action matches the agent's internal plan
- Source Authorization — tool calls are authorized by a trusted source
- Data Isolation — workspace data does not leak to untrusted outputs
Formalizes prompt injection, jailbreak, task drift, and memory poisoning as violations of these properties.
Applicability to Zeph
Zeph has ExfiltrationGuard, ContentSanitizer, and PromptInjectionDetector, but no formal model mapping these to security properties. A property-based audit would identify which invariants are enforced and which are not.
Gaps identified:
- Source Authorization for MCP tool calls: Zeph has trust levels for MCP servers but no per-call authorization check against the originating source (user vs. tool output vs. memory)
- Data Isolation between workspace and untrusted tool outputs: QuarantinedSummarizer addresses this partially, but the formal property (no path from untrusted tool output to workspace write without sanitization) is not enforced end-to-end
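To make the two gaps concrete, here is a minimal Python sketch of what a per-call origin check and a sanitize-before-write invariant could look like. It is illustrative only and not Zeph's code: `Origin`, `Tainted`, `authorize_call`, `sanitize`, `write_workspace`, and the policy that only user-originated instructions may trigger a tool call are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    """Where the instruction or data behind an action came from."""
    USER = "user"
    TOOL_OUTPUT = "tool_output"
    MEMORY = "memory"


# Hypothetical policy: only direct user instructions may trigger a tool call.
AUTHORIZED_ORIGINS = frozenset({Origin.USER})


class PolicyViolation(Exception):
    """Raised when an action would break one of the security properties."""


@dataclass(frozen=True)
class Tainted:
    """Text plus provenance and a flag recording whether it was sanitized."""
    text: str
    origin: Origin
    sanitized: bool = False


def authorize_call(tool: str, origin: Origin) -> None:
    """Source Authorization: reject calls not traceable to a trusted origin."""
    if origin not in AUTHORIZED_ORIGINS:
        raise PolicyViolation(f"{tool}: call originated from {origin.value}")


def sanitize(value: Tainted) -> Tainted:
    """Stand-in sanitizer; it only drops non-printable characters."""
    cleaned = "".join(ch for ch in value.text if ch.isprintable() or ch == "\n")
    return Tainted(cleaned, value.origin, sanitized=True)


def write_workspace(store: list[str], value: Tainted) -> None:
    """Data Isolation: untrusted content must be sanitized before a write."""
    if value.origin is not Origin.USER and not value.sanitized:
        raise PolicyViolation("unsanitized untrusted content reached a workspace write")
    store.append(value.text)
```

The point of carrying `origin` and `sanitized` on the value itself is that the write path can enforce the invariant without trusting every caller to have run the sanitizer first.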
Proposed action
- Map each of Zeph's security components to the four properties
- Identify unaddressed paths (especially Source Authorization for MCP)
- File targeted bug/enhancement issues for each gap
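The mapping step could start from a simple coverage table, sketched below in Python. Only the two "partial" entries come from the gaps listed in this issue; the empty entries are placeholders to be filled in by the audit, and the `COVERAGE`, `audit`, and `gaps` names are hypothetical.

```python
PROPERTIES = (
    "Task Alignment",
    "Action Alignment",
    "Source Authorization",
    "Data Isolation",
)

RANK = {"none": 0, "partial": 1, "full": 2}

# component -> {property: "partial" | "full"}
# Empty dicts are placeholders, not audit results.
COVERAGE = {
    "ExfiltrationGuard": {},
    "ContentSanitizer": {},
    "PromptInjectionDetector": {},
    "QuarantinedSummarizer": {"Data Isolation": "partial"},
    "MCP trust levels": {"Source Authorization": "partial"},
}


def audit(coverage: dict) -> dict:
    """Return the strongest coverage level recorded for each property."""
    best = {prop: "none" for prop in PROPERTIES}
    for component, props in coverage.items():
        for prop, level in props.items():
            if prop not in best:
                raise ValueError(f"{component}: unknown property {prop!r}")
            if RANK[level] > RANK[best[prop]]:
                best[prop] = level
    return best


def gaps(coverage: dict) -> list:
    """Properties that no component enforces fully; candidates for new issues."""
    return [prop for prop, level in audit(coverage).items() if level != "full"]
```

Each property returned by `gaps(COVERAGE)` would then correspond to one targeted bug or enhancement issue.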
Source
- arXiv:2603.19469 — A Framework for Formalizing LLM Agent Security