security: force redaction on in gateway mode, protect config.yaml writes #8738
Open
Dylanwooo wants to merge 1 commit into
The `security.redact_secrets: false` config option disables all secret
redaction globally — logs, tool output, LLM context, and outbound
messages — with a single boolean flip. This creates two risks:
1. Gateway multi-user exposure: in gateway mode, disabling redaction
leaks API keys and credentials into LLM context that can be
reflected to any connected user.
2. Prompt injection persistence: ~/.hermes/config.yaml was not
protected by the approval system, so an LLM manipulated via
prompt injection could write `security: { redact_secrets: false }`
and silently disable all 11 redaction call sites on next startup.
Changes:
- agent/redact.py: detect HERMES_GATEWAY_SESSION and force
_REDACT_ENABLED=True regardless of config; log warnings when
redaction is disabled or overridden
- tools/approval.py: add ~/.hermes/config.yaml to _SENSITIVE_WRITE_TARGET
so shell writes require the same approval as .env
- gateway/run.py: clarify comment on why config value is still read
- tests: 12 new tests covering gateway override logic and config.yaml
approval detection
This was referenced Apr 22, 2026
Summary
The `security.redact_secrets: false` config option disables all secret redaction globally (logs, tool output, LLM context, and outbound messages) with a single boolean flip. This creates two concrete risks:
1. Gateway multi-user exposure: in gateway mode, disabling redaction leaks API keys and credentials into LLM context that can be reflected to any connected user.
2. Prompt injection persistence: `~/.hermes/config.yaml` is not protected by `approval.py`, so an LLM manipulated via prompt injection can write `security: { redact_secrets: false }` to config.yaml. The change takes effect on next startup, silently disabling all 11 redaction call sites.

What this PR does
1. Force redaction ON in gateway mode (`agent/redact.py`)
Gateway serves multiple users over messaging platforms, so secrets in LLM context risk being reflected to any connected user. The config option is now ignored in gateway mode.
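As a rough illustration of the gateway override described here, the decision could be sketched as below. This is not the PR's actual code: `resolve_redaction`, the logger name, and the set of truthy values accepted for `HERMES_GATEWAY_SESSION` are assumptions made for the example. Only the environment variable name and `_REDACT_ENABLED` come from the PR text.

```python
import logging
import os

logger = logging.getLogger("redact_sketch")  # hypothetical logger name

# Assumption: which env values count as "gateway session active".
_TRUTHY = {"1", "true", "yes", "on"}


def resolve_redaction(config_value, env=None):
    """Decide whether redaction is enabled (hypothetical helper).

    config_value: the security.redact_secrets setting (True or False).
    env: mapping to read HERMES_GATEWAY_SESSION from; defaults to os.environ.
    """
    env = os.environ if env is None else env
    raw = env.get("HERMES_GATEWAY_SESSION", "")
    in_gateway = raw.strip().lower() in _TRUTHY

    if in_gateway:
        # Gateway mode: the config value is read only to report the override.
        if not config_value:
            logger.warning(
                "redact_secrets=false ignored: gateway mode forces redaction on"
            )
        return True

    # CLI mode: honor the config, but make the disabled state visible.
    if not config_value:
        logger.warning("secret redaction is disabled by config (CLI mode)")
    return bool(config_value)


# Evaluated once at import time, so later env changes cannot flip the flag.
_REDACT_ENABLED = resolve_redaction(config_value=True)
```

Keeping the decision in one pure function that takes the environment as a parameter makes the gateway, CLI, and default cases easy to test without mutating `os.environ`.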
When redaction is disabled (CLI mode) or overridden (gateway mode), a `WARNING` is now logged so administrators know the state.

2. Protect `config.yaml` from unreviewed writes (`tools/approval.py`)
`approval.py` already protects `~/.hermes/.env` from shell writes (tee, redirection). This PR extends the same protection to `~/.hermes/config.yaml`, which contains security settings (`redact_secrets`, `tirith_enabled`, etc.) that control the agent's own safety mechanisms.

3. Clarify gateway comment (`gateway/run.py`)
Documents why the config value is still read even though gateway forces redaction: so that `redact.py` can log the override warning.

Why this doesn't affect agent capability
Redaction operates at the output text layer, not the permission layer:
- Environment variables referenced in commands (`$OPENAI_API_KEY` in curl, etc.) are the real values from the process env, never touched by redaction
- The masked form (`sk-pro...l012`, first 6 + last 4 chars) is sufficient for the LLM to confirm a key exists, distinguish between different keys, and diagnose format/provider errors
- CLI users can still set `redact_secrets: false` for local debugging

Design notes
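The existing import-time snapshot (`redact.py:16-18`) correctly prevents runtime `export` mutations from disabling redaction mid-session, which shows the threat was anticipated. This PR closes the two remaining gaps: the config option being honored in gateway mode, and unreviewed writes to `~/.hermes/config.yaml`.

Tests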
12 new tests across 2 files:
- `tests/agent/test_redact.py`, `TestGatewayForcesRedaction`: 4 tests verifying gateway override, env var variations, CLI fallback, and default behavior
- `tests/tools/test_approval.py`, `TestTeePattern` + `TestSensitiveRedirectConfigYaml` + `TestSensitiveRedirectPattern`: 8 tests verifying config.yaml write detection via tee, redirect, append, and that unrelated config.yaml paths are not flagged

All existing tests pass unchanged (47 redact + 126 approval = 173 total).
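For readers who want a feel for the write detection these approval tests exercise, here is a minimal sketch. The regular expressions and the `writes_sensitive_file` helper are hypothetical and written for this example; the real `_SENSITIVE_WRITE_TARGET` pattern in `tools/approval.py` may look quite different. The sketch only recognizes paths written with `~`, `$HOME`, or `${HOME}`, which is a simplifying assumption.

```python
import re

# Hypothetical target pattern: ~/.hermes/.env or ~/.hermes/config.yaml.
# The trailing lookahead rejects longer names such as config.yaml.bak.
_HERMES_SENSITIVE = (
    r"(?:~|\$HOME|\$\{HOME\})/\.hermes/(?:\.env|config\.yaml)(?![\w.\-/])"
)

# Shell redirection (> or >>) into a sensitive file, optionally quoted.
_REDIRECT = re.compile(r">>?\s*[\"']?" + _HERMES_SENSITIVE)

# tee, with optional flags such as -a, writing to a sensitive file.
_TEE = re.compile(r"\btee\b(?:\s+-\w+)*\s+[\"']?" + _HERMES_SENSITIVE)


def writes_sensitive_file(command: str) -> bool:
    """Return True if the shell command appears to write a protected file."""
    return bool(_REDIRECT.search(command) or _TEE.search(command))
```

A command such as `echo x >> ~/.hermes/config.yaml` would be flagged for approval under this sketch, while a redirect into `./project/config.yaml` or a plain `cat ~/.hermes/config.yaml` would not.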