Skip to content

fix(file-safety): refuse read_file on credential paths (SSH keys, .env, …) (#16809)#16851

Closed
briandevans wants to merge 3 commits into
NousResearch:mainfrom
briandevans:fix/read-deny-list-16809
Closed

fix(file-safety): refuse read_file on credential paths (SSH keys, .env, …) (#16809)#16851
briandevans wants to merge 3 commits into
NousResearch:mainfrom
briandevans:fix/read-deny-list-16809

Conversation

@briandevans

Copy link
Copy Markdown
Contributor

Summary

  • Add a read-side counterpart to build_write_denied_paths() in agent/file_safety.py and have read_file_tool consult it before performing the read.
  • Blocks SSH private keys, the active HERMES_HOME .env, shell history, and credential directories (.ssh, .aws, .gnupg, .kube, .docker, .azure, .config/gh) from being read by the model.

The bug

read_file had no sensitive-path deny list. The reproduction in #16809 — and on a fresh checkout of main — succeeds today:

read_file("~/.ssh/id_ed25519")  # → full SSH private key
read_file("~/.hermes/.env")     # → full API keys
read_file("~/.zsh_history")     # → shell history

The only defense was pattern-based output redaction via redact_sensitive_text(), and that path was made off by default in 73753417 ("feat(security): make secret redaction off by default", #16794) and uses a module-level _REDACT_ENABLED snapshotted at import time — so config-driven opt-in is fragile, and well-known credential prefixes are the only thing recognised even when redaction is on.

Write-side protection exists (build_write_denied_paths + _check_sensitive_path) but has no read-side counterpart.

The fix

Add an access-control floor for high-value credential files where leakage is unrecoverable.

agent/file_safety.py gains three new helpers:

  • build_read_denied_paths(home) — exact paths: SSH keys (id_rsa/id_ed25519/id_ecdsa/id_dsa/authorized_keys/config), .netrc, .pgpass, .npmrc, .pypirc, the active HERMES_HOME / .env (resolved through get_hermes_home() so profile-aware), and shell history (.bash_history, .zsh_history, .psql_history).
  • build_read_denied_prefixes(home) — directory prefixes: ~/.ssh, ~/.aws, ~/.gnupg, ~/.kube, ~/.docker, ~/.azure, ~/.config/gh.
  • is_read_denied(path) — symmetric helper. Resolves through os.path.realpath before checking, so symlink shims are caught.

tools/file_tools.pyread_file_tool calls is_read_denied(path) after the existing device / binary / Hermes-internal guards (so denied paths can't return cached content from the dedup tracker either) and returns a structured error message pointing the model at the terminal tool when a credential read is genuinely intentional.

Intentional scope decisions

  • /etc/passwd is not blocked — world-readable user metadata, not a secret. /etc/shadow is already unreadable to non-root callers, so blocking it would be cosmetic.
  • Shell config (.bashrc, .zshrc, .profile, …) is write-denied but read-allowed: debugging PATH issues / alias setup is a legitimate case, and inline export FOO=key strings remain covered by redact_sensitive_text when security.redact_secrets is on.
  • The escape hatch for genuinely intentional reads (e.g. an agent that wants to inspect its own SSH config) is the existing terminal tool, which gates on user approval — exactly the layer this guard is designed to defer to.

Test plan

  • Focusedtests/tools/test_read_deny.py (new): 26 tests covering exact paths, prefixes, allowed paths, and end-to-end read_file_tool denial. All pass.
  • Adjacenttests/tools/test_write_deny.py, tests/tools/test_file_read_guards.py, tests/tools/test_file_operations.py, tests/tools/test_read_loop_detection.py, tests/tools/test_file_write_safety.py, tests/agent/test_copilot_acp_client.py. The 6 test_file_read_guards.py::TestFileDedup/TestWriteInvalidatesDedup failures and the 1 test_copilot_acp_client.py::test_read_text_file_redacts_sensitive_content failure reproduce on clean origin/main (8081425) — they're pre-existing macOS /private/var write-deny collision and the redaction-off-by-default regression respectively, unrelated to this change.
  • Regression guard — verified directly:
    Without guard, contents readable: True
    With guard, error returned: True
    With guard, secret leaked: False
    

Related

…v, …)

`read_file_tool` had no sensitive-path deny list. SSH private keys,
`~/.hermes/.env`, ~/.aws credentials, ~/.gnupg secrings, ~/.kube/config,
and shell history were all freely readable, leaving content-redaction
(off by default since 7375341) as the only line of defense — and
pattern-based redaction is best-effort by design.

Add a read-side deny list as the access-control floor:

* `agent/file_safety.py` gains `build_read_denied_paths()`,
  `build_read_denied_prefixes()`, and `is_read_denied()` mirroring the
  write-side helpers. The path set covers credential files (SSH keys,
  `.netrc`, `.pgpass`, `.npmrc`, `.pypirc`, the active HERMES_HOME
  `.env`) and shell history (`.bash_history`, `.zsh_history`,
  `.psql_history`); the prefix list covers `.ssh`, `.aws`, `.gnupg`,
  `.kube`, `.docker`, `.azure`, and `.config/gh`.

* `read_file_tool` calls `is_read_denied()` after the existing device
  / binary / Hermes-internal guards and returns a structured error
  pointing the model at the `terminal` tool (which gates on user
  approval) when a credential read is genuinely intentional.

Scope notes:

* `/etc/passwd` is intentionally not blocked — it's world-readable
  user metadata, not a secret, and `/etc/shadow` is already
  unreadable to non-root callers.
* Shell config files (`.bashrc`, `.zshrc`, …) are write-denied but
  *read*-allowed: legitimate debugging cases (PATH issues, alias
  setup) outweigh the risk, and inline `export FOO=key` strings
  remain covered by `redact_sensitive_text` when the user opts in.
* Symlink shims are caught: `is_read_denied` resolves the input via
  `os.path.realpath` before checking, matching the write-side
  behavior.

Refs NousResearch#16809.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 28, 2026 06:13

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an access-control “floor” for read_file_tool to prevent the model from reading high-value credential paths (SSH keys, .env, shell history, and common credential directories), complementing the existing write-side protections.

Changes:

  • Introduce read-side denylist helpers (build_read_denied_paths/prefixes, is_read_denied) in agent/file_safety.py.
  • Enforce the read denylist in tools/file_tools.py::read_file_tool before dedup/caching returns content.
  • Add a new test suite validating deny/allow behavior and the end-to-end read_file_tool denial response.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
tools/file_tools.py Applies is_read_denied() inside read_file_tool to block sensitive reads with a structured error.
agent/file_safety.py Adds read-side denylist construction + is_read_denied() implementation.
tests/tools/test_read_deny.py New tests covering exact-path denies, prefix denies, allowed paths, and read_file_tool denial behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tools/file_tools.py Outdated
Comment on lines +488 to +490
if is_read_denied(path):
return json.dumps({
"error": (

Copilot AI Apr 28, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is_read_denied() is called with the raw path string. This can be bypassed with task-relative paths (e.g. if the terminal cwd is $HOME, reading .ssh/id_ed25519 would resolve to a sensitive file for file_ops.read_file, but is_read_denied() will resolve the relative path against the Python process cwd instead). Use the same resolved path you already computed (_resolved/resolved_str) for the deny check (and ideally add a regression test that sets TERMINAL_CWD and uses a relative .ssh/... path).

Copilot uses AI. Check for mistakes.
Comment thread tools/file_tools.py Outdated
Comment on lines 20 to 21
from agent.file_safety import is_read_denied
from agent.redact import redact_sensitive_text

Copilot AI Apr 28, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds a second import from agent.file_safety (is_read_denied) even though the file already imports get_read_block_error from the same module above. Consider importing both names in the existing import to avoid duplication and keep imports grouped consistently.

Copilot uses AI. Check for mistakes.
Comment thread agent/file_safety.py Outdated
Comment on lines +96 to +99
Mirrors build_write_denied_paths plus shell history files. Pattern-based
output redaction (agent/redact.py) is best-effort and was made off by
default; this list is the access-control floor for high-value credential
files where leakage is unrecoverable (private keys, .env, history).

Copilot AI Apr 28, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring says this "Mirrors build_write_denied_paths plus shell history files", but the sets are not actually symmetric (e.g. this includes id_ecdsa/id_dsa while the write-deny exact list currently doesn’t, and the write-deny list includes /etc/* paths that this intentionally omits). Please reword to reflect the actual intent (credential files under $HOME + hermes .env + history), rather than claiming it mirrors the write list.

Suggested change
Mirrors build_write_denied_paths plus shell history files. Pattern-based
output redaction (agent/redact.py) is best-effort and was made off by
default; this list is the access-control floor for high-value credential
files where leakage is unrecoverable (private keys, .env, history).
Covers high-value credential files under ``home``, the Hermes ``.env``,
and shell/database history files. Pattern-based output redaction
(agent/redact.py) is best-effort and was made off by default; this list
is the access-control floor for files where leakage is unrecoverable
(private keys, .env, history).

Copilot uses AI. Check for mistakes.
@alt-glitch alt-glitch added type/security Security vulnerability or hardening P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder tool/file File tools (read, write, patch, search) labels Apr 28, 2026
…ring

Address Copilot review on PR NousResearch#16851:

**Finding 1 (`tools/file_tools.py:490` — task-relative path bypass):**
``read_file_tool`` previously called ``is_read_denied(path)`` with the
*raw* string. ``is_read_denied`` resolves relative paths against the
Python process cwd via ``os.path.realpath``, NOT the terminal cwd that
``_resolve_path_for_task`` uses. So an agent whose terminal cwd was
``$HOME`` could pass ``".ssh/id_ed25519"``, hit the resolver, get
back ``$HOME/.ssh/id_ed25519`` — but ``is_read_denied`` would resolve
the same string against, say, ``/tmp/runner``, miss the deny list, and
let the read through.

Pass the already-resolved path (``str(_resolved)``) to ``is_read_denied``
so the security check sees the same path the file system sees. New
``test_task_relative_ssh_key_path_does_not_bypass_guard`` covers the
bypass; without the fix the test reads back the SSH key contents,
with the fix it returns the credential-deny error.

**Finding 2 (`tools/file_tools.py:21` — duplicate file_safety import):**
Folded ``is_read_denied`` into the existing
``from agent.file_safety import get_read_block_error`` line.

**Finding 3 (`agent/file_safety.py:99` — misleading "mirrors" docstring):**
Replaced the "Mirrors build_write_denied_paths" claim with a precise
list of the asymmetries: read-deny adds shell-history files and older
SSH key types (``id_ecdsa`` / ``id_dsa``) that the write list omits,
and intentionally drops ``/etc/*`` paths (those are useful to read for
debugging; the truly sensitive ones are already root-only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@briandevans

Copy link
Copy Markdown
Contributor Author

@copilot All three findings addressed in commit a1d0427f9:

Finding 1 (tools/file_tools.py:490 — task-relative path bypass):
This was a real security bypass, not just a typing concern. is_read_denied resolves relative paths via os.path.realpath against the Python process cwd, but _resolve_path_for_task resolves against the terminal cwd (_get_live_tracking_cwd / TERMINAL_CWD). So an agent whose terminal cwd was $HOME could pass ".ssh/id_ed25519" and slip past the deny list. Now passing str(_resolved) so the security check sees the exact path the resolver hands to file_ops.

Added regression test test_task_relative_ssh_key_path_does_not_bypass_guard — verified that against the prior code the test reads back the SSH key contents (SECRETMATERIAL in result.content), and against the fixed code it returns the credential-deny error.

Finding 2 (tools/file_tools.py:21 — duplicate agent.file_safety import):
Folded is_read_denied into the existing from agent.file_safety import get_read_block_error line.

Finding 3 (agent/file_safety.py:99 — "Mirrors build_write_denied_paths" claim):
Replaced with a precise list of the asymmetries: read-deny adds shell-history paths (.bash_history / .zsh_history / .psql_history) and older SSH key types (id_ecdsa / id_dsa); read-deny omits /etc/* paths because those are useful to read for debugging and the truly sensitive ones (/etc/shadow) are already root-only.

All 27 tests in tests/tools/test_read_deny.py pass.

…orkers

The ``test_task_relative_ssh_key_path_does_not_bypass_guard`` regression
test passed locally and on a fresh CI runner but failed on Linux CI when
the same xdist worker had previously run a terminal-tool test:

    AssertionError: 'sensitive credential path' in 'File not found: .ssh/id_ed25519'

Root cause: ``_resolve_path_for_task`` prefers the live tracking cwd
(``_file_ops_cache`` / ``_active_environments`` lookups via
``_resolve_container_task_id``) over the ``TERMINAL_CWD`` env var.  A
warm worker carries those caches across tests; my ``test_relative_ssh``
task_id was being resolved against a stale ``"default"`` entry pointing
at the runner workspace, so the resolved path landed at
``/home/runner/.ssh/id_ed25519`` (no such file) instead of the
test-tmp ``$HOME/.ssh/id_ed25519`` we created.

Fix: monkeypatch ``_get_live_tracking_cwd`` to return ``None`` for the
duration of the test, forcing the predictable ``TERMINAL_CWD`` branch.
Also use a unique task_id so we don't share state with sibling tests.

This is purely a test-stability fix — the production guard added in
a1d0427 is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@briandevans

Copy link
Copy Markdown
Contributor Author

CI follow-up — test_task_relative_ssh_key_path_does_not_bypass_guard was flaking on hot xdist workers: _resolve_path_for_task prefers _get_live_tracking_cwd over the TERMINAL_CWD env var, so a worker that had previously run a terminal-tool test was resolving the test's task_id against a stale "default" entry pointing at /home/runner instead of the test's tmp HOME. Stabilized in commit 75fef1da0 by monkeypatching _get_live_tracking_cwd to return None for the duration of the test (forcing the deterministic TERMINAL_CWD branch) and giving the task_id a unique name. The production guard from a1d0427f9 is unchanged. The remaining test job failures all reproduce on clean origin/main 8081425 — same baseline set documented on adjacent PRs.

@briandevans

Copy link
Copy Markdown
Contributor Author

CI audit — all 26 test job failures are pre-existing baselines on clean origin/main (8081425). Zero failures intersect with touched code (agent/file_safety.py, tools/file_tools.py, tests/tools/test_read_deny.py).

Test Symptom Root cause on main
test_anthropic_adapter::test_custom_base_url hashes a hard-coded model ID curated list drifted from live anthropic catalog
test_copilot_acp_client::test_read_text_file_redacts_sensitive_content secret token visible in output redact-off-default after #16794
test_cmd_update::test_update_refreshes_repo_and_tui_node_dependencies call sequence mismatch adjacent npm-update refactor
test_config_env_expansion::test_load_config_* string indices must be integers recent config-loader return type change
test_container_aware_cli::test_get_container_exec_info_defaults assert None is not None adjacent docker-detection refactor
test_run_progress_topics::test_*_slack_dm 'NoneType' is not callable optional-dep import not stubbed
test_pty_bridge::test_resize_updates_child_winsize No value for $TERM linux runner missing terminfo
test_plugin_scanner_recursion::test_unknown_kind_falls_back_to_standalone scanner branch coverage recent kind-field rework
test_web_server::TestBuildSchemaFromConfig / TestPtyWebSocket / TestReloadEnv category-merge / timeout / env reload adjacent config-schema rework + uvicorn timeouts
test_background_review_toolset_restriction AIAgent.__init__ was not called recent restricted-toolset refactor
test_gateway_shutdown::test_cancel_background_tasks_* 30s timeout known flaky on Linux runner
test_session_split_brain_11016::test_command_cancels_active_task_and_unblocks_follow_up[*] 30s timeout same Linux-runner flake
test_tool_arg_coercion::test_inf_stays_string_for_integer_only 'inf' == inf recent coercion change
test_accretion_caps::test_live_cap_applied_after_read_add KeyError: 'read_timestamps' tracker-key rename in adjacent PR
test_file_state_registry::test_net_new_file_no_warning / *_sibling_agent_write_* Modal-backend env error optional-dep import
test_clipboard::TestIsWsl::* assert False is True linux runner reports non-WSL
test_make_agent_provider::test_make_agent_passes_resolved_provider call-args mismatch covered by my own #16684
test_protocol::test_session_resume_returns_hydrated_messages unexpected include_ancestors kwarg covered by my own #16683
test_agent_cache::test_concurrent_inserts_settle_at_cap 30s timeout linux-runner flake

Same baseline set is documented on adjacent open PRs (#16786, #16734, #16666). The new test_read_deny.py itself passes — the earlier xdist-flake failure is fixed in 75fef1d.

@briandevans

Copy link
Copy Markdown
Contributor Author

Closing — the commit landed on main as f0db5c99 and #16809 is closed. Thanks @teknium1!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround tool/file File tools (read, write, patch, search) type/security Security vulnerability or hardening

Projects

None yet

Development

Successfully merging this pull request may close these issues.

read_file has no sensitive-path deny list — SSH keys, .env, shell history readable without restriction

3 participants