Security: read_file can exfiltrate credentials from auth.json and .anthropic_oauth.json #17656

### Summary

The agent's `read_file` tool is sandboxed to `HERMES_HOME` (typically `~/.hermes` or `/opt/data` in containerized deploys). Inside that scope, `agent/file_safety.py:get_read_block_error` deny-lists `skills/.hub/` but nothing else. That leaves credential-pool files — `auth.json` (provider OAuth state + plaintext API keys) and `.anthropic_oauth.json` (Anthropic PKCE tokens) — fully readable by the agent. A prompt-injection attack reaching `read_file` can exfiltrate active provider credentials in plaintext.

### Reproducer

Tested against the current `main` branch (commit `e63929d4`).

```python
>>> from agent.file_safety import get_read_block_error
>>> print(get_read_block_error("/opt/data/auth.json"))  # no block error: the read is allowed
None
```

Concretely, on a running deployment with `DEEPSEEK_API_KEY` set in the process environment, `auth.json` materializes at `${HERMES_HOME}/auth.json` with mode 0600 and holds the active key in plaintext in the `credential_pool.deepseek[].access_token` field. Mode 0600 prevents *other Unix users* from reading the file, but the agent itself runs as the file's owner, so `read_file` is unaffected.

### Suggested fix

Extend `get_read_block_error` to also block reads of `${HERMES_HOME}/auth.json`, `${HERMES_HOME}/auth.lock`, and `${HERMES_HOME}/.anthropic_oauth.json`. This follows the same pattern as the existing `skills/.hub/` deny: a pure path check with no file reads. The check should return a "credential store, cannot be read directly" error message so that the agent (and humans reading the trace) understand the boundary.
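As a rough illustration of the check described above, here is a minimal sketch. It is not the actual patch: `credential_read_block_error`, `CREDENTIAL_FILENAMES`, and the explicit `hermes_home` parameter are hypothetical names chosen for the example, and the real `get_read_block_error` may obtain `HERMES_HOME` differently.

```python
from pathlib import Path
from typing import Optional

# Hypothetical constant: the three files named in this issue.
CREDENTIAL_FILENAMES = ("auth.json", "auth.lock", ".anthropic_oauth.json")


def credential_read_block_error(path: str, hermes_home: str) -> Optional[str]:
    """Return an error message if `path` is a credential file, else None.

    Hypothetical helper; the name and signature are assumptions.
    """
    # resolve() collapses ".." segments and follows symlinks. It never opens
    # the target file, but it may stat path components to detect symlinks.
    home = Path(hermes_home).expanduser().resolve()
    target = Path(path).expanduser().resolve()
    for name in CREDENTIAL_FILENAMES:
        if target == home / name:
            return (
                f"Access denied: {target} is a credential store "
                "and cannot be read directly."
            )
    return None
```

In this sketch, a caller would run the check before any read and surface the returned message to the agent instead of the file contents; a `None` result means the remaining checks proceed as before.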

The agent doesn't *need* to read its own credentials — provider tools (`auxiliary_client`, `credential_pool`) consume them through process env / OAuth flows that bypass `read_file`.

### Materials we have ready

We're running this fix as a local Dockerfile-overlay patch in production. Happy to send it as a PR if useful — it's:

- ~30 lines added to `agent/file_safety.py` (one helper + one extra branch in `get_read_block_error`).
- 7 unit tests covering the deny-list, the existing `skills/.hub` regression, path-traversal resolution, and the negative case (arbitrary `HERMES_HOME` files remain readable). All pass against `main` with the patch applied.
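To show why the path-traversal case in the list above deserves its own test, the following sketch demonstrates that a string comparison alone misses `..` segments and symlinks, while comparing resolved paths catches both. These are not the tests from the patch; `resolves_to` and the file names other than `auth.json` are made up for the example.

```python
import os
import tempfile
from pathlib import Path


def resolves_to(candidate: str, protected: Path) -> bool:
    """True if `candidate` points at `protected` once ".." and symlinks are resolved."""
    return Path(candidate).resolve() == protected.resolve()


with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    (home / "skills").mkdir()
    secret = home / "auth.json"
    secret.write_text("{}")

    # A symlink with an innocuous name that points at the credential file.
    alias = home / "notes.json"
    alias.symlink_to(secret)

    traversal = os.path.join(tmp, "skills", "..", "auth.json")

    # The raw strings differ, so a naive equality check would let this through.
    assert traversal != str(secret)
    # After resolution, both the traversal path and the symlink are caught.
    assert resolves_to(traversal, secret)
    assert resolves_to(str(alias), secret)
    # Unrelated paths are still allowed (the negative case).
    assert not resolves_to(str(home / "skills"), secret)
```

The symlink case assumes a platform where unprivileged symlink creation is permitted, as on typical Linux containers.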

Let me know if you'd like the PR or if you'd prefer a different shape (e.g. extending `tools/credential_files.py` to centrally register these paths so other readers benefit too).

