Security: File Tool Output Redaction Gap — Secrets Exposed via read_file but Masked via Terminal #363

@teknium1

Overview

Hermes Agent redacts secrets inconsistently: terminal output passes through redact_sensitive_text() in terminal_tool.py (line 1042), but file tool output is not redacted at all. Secrets that would be masked when running cat ~/.hermes/.env in the terminal are exposed in plain text by read_file("~/.hermes/.env").

This was discovered while researching agent-vault, a secret-aware file I/O layer for AI agents. Testing confirmed the gap: a file with 6 different secret types was fully exposed via read_file, while terminal cat at least masked 2 of them (OpenAI and GitHub tokens).

Additionally, our current redact_sensitive_text() is purely pattern-based: it catches only secrets that match known patterns, mainly fixed prefixes (sk-, ghp_, etc.). It misses Stripe keys (sk_live_/sk_test_), AWS access keys (AKIA...), database passwords, and other high-entropy credentials. Agent-vault's entropy-based detection with bigram false-positive prevention catches all of these.


Research Findings

The Gap Demonstrated

# Via terminal (redacted by terminal_tool.py):
$ cat config.yaml
openai_key: sk-pro...x234        # ✓ masked
github_token: ghp_AB...ef12      # ✓ masked
stripe_key: sk_live_51H3kJ9dUdOoXyz123456789abc  # ✗ NOT masked
aws_access_key: AKIAIOSFODNN7EXAMPLE               # ✗ NOT masked
password: SuperSecretPass123!@#                     # ✗ NOT masked

# Via read_file (NO redaction):
openai_key: sk-proj-abc123def456ghi789jkl...  # ✗ FULLY EXPOSED
github_token: ghp_ABCDEFGHIJKLMNOPQR...       # ✗ FULLY EXPOSED
stripe_key: sk_live_51H3kJ9dUdOo...           # ✗ FULLY EXPOSED
aws_access_key: AKIAIOSFODNN7EXAMPLE          # ✗ FULLY EXPOSED
password: SuperSecretPass123!@#               # ✗ FULLY EXPOSED

# Via agent-vault read (all secrets caught):
openai_key: <agent-vault:UNVAULTED:sha256:9f10ec24>  # ✓ all redacted
stripe_key: <agent-vault:UNVAULTED:sha256:73a9fa45>  # ✓ including unknowns
password: <agent-vault:UNVAULTED:sha256:645c8ba7>    # ✓ via entropy detection

Where Redaction Is Currently Applied

| Component | Redacted? | How |
|---|---|---|
| terminal_tool.py (line 1042) | ✅ Yes | redact_sensitive_text(output) |
| Log files (RedactingFormatter) | ✅ Yes | RedactingFormatter in run_agent.py + gateway/run.py |
| code_execution_tool.py | ✅ Yes | Strips env vars from child process (prevention) |
| read_file | No | Raw content returned directly |
| search_files (content mode) | No | Matching lines returned unredacted |
| patch tool (diff output) | No | Diffs may contain old secret values |
| write_file | N/A | Write deny-list blocks sensitive paths |

How agent-vault's Detection Works (Inspiration)

Agent-vault uses a 3-layer detection system that catches far more secrets:

  1. Known Pattern Matching — Similar to our _PREFIX_PATTERNS but includes Stripe, AWS AKIA, JWT (eyJ...), private key blocks, long hex/base64
  2. Shannon Entropy Analysis — Computes character entropy; threshold of 3.0 for strings ≥12 chars catches unknown high-entropy credentials
  3. Bigram False-Positive Prevention — Checks if a high-entropy string is actually English text using common bigram frequency (≥30% hit rate = "word-like" = not a secret). Prevents masking words like "moonshot" or "development"
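
To make layers 2 and 3 concrete, here is a minimal sketch in Python. It is not agent-vault's code: the function names are hypothetical, the bigram set is a small illustrative subset, and it assumes the 3.0 threshold is measured in bits per character. The defaults mirror the numbers quoted above (length ≥ 12, entropy ≥ 3.0, bigram hit rate ≥ 30% means word-like).

```python
import math
from collections import Counter

# Illustrative subset of frequent English bigrams (assumption: the list used
# by agent-vault is not reproduced here and is likely larger).
COMMON_BIGRAMS = {
    "th", "he", "in", "er", "an", "re", "on", "at", "en", "nd",
    "ti", "es", "or", "te", "of", "ed", "is", "it", "al", "ar",
    "st", "to", "nt", "ng", "se", "ha", "as", "ou", "io", "le",
    "ve", "co", "me", "de", "hi", "ri", "ro", "ic", "ne", "ea",
    "ra", "ce", "li", "ch", "ll", "be", "ma", "si", "om", "ur",
}


def shannon_entropy(s: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def bigram_hit_rate(s: str) -> float:
    """Fraction of adjacent character pairs that are common English bigrams."""
    lowered = s.lower()
    pairs = [lowered[i:i + 2] for i in range(len(lowered) - 1)]
    if not pairs:
        return 0.0
    return sum(p in COMMON_BIGRAMS for p in pairs) / len(pairs)


def looks_like_secret(s: str, min_len: int = 12,
                      entropy_threshold: float = 3.0,
                      wordlike_rate: float = 0.30) -> bool:
    """High entropy, long enough, and not word-like."""
    if len(s) < min_len:
        return False
    if shannon_entropy(s) < entropy_threshold:
        return False
    # A high bigram hit rate means the string reads like English text.
    return bigram_hit_rate(s) < wordlike_rate
```

With these assumptions, `looks_like_secret("AKIAIOSFODNN7EXAMPLE")` is true, while `looks_like_secret("authentication")` is false because most of its bigrams are common English pairs.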

Current State in Hermes Agent

agent/redact.py (115 lines) provides redact_sensitive_text() with these patterns:

  • Known prefixes: sk-, ghp_, github_pat_, xox[baprs]-, AIza, pplx-, fal_, fc-, bb_live_, gAAAA
  • ENV assignments: OPENAI_API_KEY=value
  • JSON fields: "apiKey": "value"
  • Auth headers: Authorization: Bearer <token>
  • Telegram bot tokens

Missing patterns (not detected):

  • Stripe: sk_live_, sk_test_, pk_live_, pk_test_
  • AWS: AKIA[A-Z0-9]{16}
  • JWT: eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+
  • Private keys: -----BEGIN.*PRIVATE KEY-----
  • Generic high-entropy strings (passwords, random tokens without known prefixes)
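
The first four missing patterns could be covered with regexes along these lines. This is a sketch only: `EXTRA_PATTERNS`, `redact_extra`, the minimum lengths, and the `[REDACTED:<kind>]` placeholder are assumptions for illustration, not the contents of agent/redact.py.

```python
import re

# Hypothetical additions; exact regexes and names are assumptions.
EXTRA_PATTERNS = {
    # sk_live_, sk_test_, pk_live_, pk_test_
    "stripe": re.compile(r"\b[sp]k_(?:live|test)_[A-Za-z0-9]{10,}"),
    # AKIA followed by 16 uppercase letters or digits
    "aws_access_key": re.compile(r"\bAKIA[A-Z0-9]{16}\b"),
    # header.payload with an optional signature segment
    "jwt": re.compile(
        r"\beyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)?"
    ),
    # whole PEM block, including the key body
    "private_key": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
}


def find_secret_kinds(text: str) -> set[str]:
    """Return the names of all patterns that match somewhere in text."""
    return {name for name, pat in EXTRA_PATTERNS.items() if pat.search(text)}


def redact_extra(text: str) -> str:
    """Replace every match with a placeholder naming the pattern."""
    for name, pat in EXTRA_PATTERNS.items():
        text = pat.sub(f"[REDACTED:{name}]", text)
    return text
```

Generic high-entropy strings have no fixed shape, so they are not covered here; they need the entropy-based approach.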

Implementation Plan

Skill vs. Tool Classification

This is a core codebase change, not a skill or tool. It fixes an inconsistency in existing security behavior — the redaction infrastructure exists but is not applied uniformly.

Phased Rollout

Phase 1: Apply Existing Redaction to File Tools (minimal change, high impact)

Add redact_sensitive_text() calls to file tool outputs:

  • tools/file_tools.py, read_file_tool(): redact result.content before returning
  • tools/file_tools.py, search_tool(): redact content matches
  • tools/file_tools.py, patch_tool(): redact diff output
  • Consider: a config option redact_file_output: true/false (default true) for users who want to opt out

Insertion point (read_file_tool, ~line 127-128):

from agent.redact import redact_sensitive_text

result = file_ops.read_file(path, offset, limit)
# Redact before the content is returned to the model
if result.content:
    result.content = redact_sensitive_text(result.content)
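
One way to share this across read_file, search, and patch output while honoring the proposed redact_file_output option is a small helper. The sketch below is self-contained, so it uses a stand-in redactor; in Hermes the `redactor` argument would be agent.redact.redact_sensitive_text. The helper name `redact_tool_output` and the dict-shaped config are assumptions.

```python
import re
from typing import Callable, Mapping, Optional


def _stand_in_redactor(text: str) -> str:
    """Stand-in for redact_sensitive_text so the sketch runs on its own.

    Masks only sk- and ghp_ tokens; the real function covers more patterns.
    """
    return re.sub(r"\b(sk-|ghp_)[A-Za-z0-9_-]{8,}", r"\1***", text)


def redact_tool_output(
    text: Optional[str],
    config: Mapping[str, object],
    redactor: Callable[[str], str] = _stand_in_redactor,
) -> Optional[str]:
    """Redact tool output unless the user opted out via redact_file_output."""
    if not text:
        return text  # nothing to redact (None or empty string)
    if not config.get("redact_file_output", True):
        return text  # explicit opt-out
    return redactor(text)


# Usage for search results: redact each matching line before returning it.
matches = ["openai_key: sk-proj-abc123def456", "name: demo"]
redacted_matches = [redact_tool_output(line, {}) for line in matches]
```

Routing all three tools through one function keeps the opt-out check in a single place.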

Phase 2: Upgrade Detection with Entropy Analysis

Enhance agent/redact.py with agent-vault-inspired entropy detection:

  • Add missing known patterns (Stripe, AWS AKIA, JWT, private keys)
  • Add Shannon entropy calculation for unknown high-entropy strings
  • Add bigram-based false positive prevention
  • Add configurable sensitivity level
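
A configurable sensitivity level could map to an entropy threshold and a minimum length, applied only to value-like substrings rather than the full text. In this sketch the "high" preset reuses the 3.0 / 12 figures quoted for agent-vault; the "medium" and "low" presets, the `[REDACTED]` placeholder, and all names are hypothetical. The bigram check is omitted for brevity.

```python
import math
import re
from collections import Counter

# Hypothetical presets: (entropy threshold in bits per char, minimum length).
SENSITIVITY = {"low": (4.0, 20), "medium": (3.5, 16), "high": (3.0, 12)}

# Value-like substring: whatever follows "=" or ":" up to whitespace or a quote.
VALUE_RE = re.compile(r"""([=:]\s*["']?)([^\s"']+)""")


def entropy(s: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def redact_values(text: str, level: str = "medium") -> str:
    """Redact assignment values that are long and high-entropy for this level."""
    threshold, min_len = SENSITIVITY[level]

    def repl(m: re.Match) -> str:
        value = m.group(2)
        if len(value) >= min_len and entropy(value) >= threshold:
            return m.group(1) + "[REDACTED]"
        return m.group(0)  # leave short or low-entropy values untouched

    return VALUE_RE.sub(repl, text)
```

Because entropy is computed only on the captured values, the cost scales with the number of assignments in the file, not with its total size.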

Phase 3: Configurable Redaction Profiles

  • Per-file-path rules (e.g., always redact *.env, never redact *.py)
  • Allowlisting specific files/directories
  • Integration with .hermesignore or similar
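
Per-path rules could be expressed as an ordered list of glob patterns where the first match wins. The sketch below uses the standard-library fnmatch module; the rule format, the action names, and `redaction_policy` are assumptions, and no .hermesignore syntax is implied.

```python
from fnmatch import fnmatchcase

# Hypothetical rule list: (glob pattern, action). First match wins.
# "always" and "never" mirror the examples in the list above.
RULES = [
    ("*.env", "always"),
    ("*.py", "never"),
]


def redaction_policy(path: str, rules=RULES, default: str = "default") -> str:
    """Return the action for path, or default when no rule matches.

    fnmatchcase treats "/" like any other character, so "*.env" also
    matches paths such as "~/.hermes/.env".
    """
    for pattern, action in rules:
        if fnmatchcase(path, pattern):
            return action
    return default
```

Here `redaction_policy("~/.hermes/.env")` returns "always" and `redaction_policy("config.yaml")` falls through to the default behavior.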

Pros & Cons

Pros

  • Closes a real security gap — secrets leak through file tools today
  • Phase 1 is minimal effort (< 20 lines of code) with high impact
  • Consistent behavior across all tool outputs
  • Defense in depth — works even if the agent reads unexpected files

Cons / Risks

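  • False positives: Regex-based redaction can mask legitimate content (base64 data, UUIDs). Entropy detection does not remove this risk, since random-looking non-secrets also score high and the bigram check only filters word-like strings, and it adds complexity.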
  • Agent confusion: The agent may be confused when file content appears redacted (e.g., trying to use a masked API key). Need clear placeholder format.
  • Opt-out needed: Some users may need unredacted access (e.g., debugging .env files). A config option is important.
  • Performance: Entropy analysis on large files could add latency (mitigated by only running on value-like substrings, not full text)

Open Questions

  • Should redaction be opt-in or opt-out by default? (Recommendation: opt-out, i.e., enabled by default)
  • Should redacted values use a consistent placeholder format like [REDACTED:sk-***] or just the masked version like sk-pro...x234?
  • Should there be a way for the agent to request unredacted access for specific files? (Security concern: prompt injection could abuse this)
  • Should Phase 2 (entropy) be implemented in Python directly, or should we shell out to agent-vault for detection?
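
For the placeholder question, the two candidate formats can be compared side by side. This is a sketch with hypothetical function names; the head and tail lengths are chosen to reproduce the sk-pro...x234 example and are not taken from agent/redact.py.

```python
def mask_middle(secret: str, head: int = 6, tail: int = 4) -> str:
    """Masked style (sk-pro...x234): keep a short head and tail."""
    if len(secret) <= head + tail:
        return "***"  # too short to reveal any part safely
    return f"{secret[:head]}...{secret[-tail:]}"


def bracket_placeholder(secret: str, prefix_len: int = 3) -> str:
    """Bracketed style ([REDACTED:sk-***]): keep only the prefix."""
    return f"[REDACTED:{secret[:prefix_len]}***]"
```

The masked style reveals ten characters of the secret but lets the agent tell two keys apart; the bracketed style reveals less and is harder to mistake for a usable value.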

References

  • agent-vault — Secret-aware file I/O, Apache-2.0
  • secretless-ai — Tool hook-based secret blocking
  • CodeGate — Network proxy-based secret protection
  • Hermes agent/redact.py — Existing pattern-based redaction
  • Hermes tools/terminal_tool.py:1042 — Where terminal output is currently redacted
  • Issue #129 (Address outbound threats): related outbound threat mitigation
