docs(security): rewrite policy around OS-level isolation as the boundary by jquesnelle · Pull Request #20317 · NousResearch/hermes-agent

jquesnelle · 2026-05-05T16:58:51Z

This is a proposed rewrite of the core security policy of Hermes Agent. It outlines the trust model that the agent operates under, and the processes for security vulnerability reporting. The key pieces of it are:

Restate the trust model from first principles: the OS is the only load-bearing boundary against an adversarial LLM
Distinguish terminal-backend isolation from whole-process wrapping
Name in-process components (approval gate, output redaction, Skills Guard) as heuristics, and the class of reports that defeat them as out of scope under this policy while explicitly welcoming them as regular issues or PRs

This creates a much narrower scope of what constitutes a security vulnerability vs. what can go through the normal PR process. It also gives a firmer commitment on what really can be guaranteed at the various trust boundaries.

We'd like to gather community feedback on adopting this new security policy. Please leave your comments below!

Restate the trust model from first principles: the OS is the only load-bearing boundary against an adversarial LLM. Distinguish terminal-backend isolation (sandboxes the shell tool) from whole-process wrapping (sandboxes the agent itself, reference deployment NVIDIA OpenShell). Name in-process components (approval gate, output redaction, Skills Guard) as heuristics, and the class of reports that defeat them as out of scope under this policy — while explicitly welcoming them as regular issues or PRs. Introduce 'agent-loaded content' as the narrow, honest commitment: attacker-influenced input must not chain into a write the agent later loads on its own initiative. Strip implementation-detail enumerations (backend names, adapter names, config keys, env vars, internal symbols) so the doc stays evergreen as code evolves.

vominh1919 · 2026-05-06T03:36:01Z

Thank you for this rewrite — the shift to "OS is the only load-bearing boundary" is a much clearer mental model than the previous policy, and naming in-process components as heuristics explicitly sets the right expectations for reporters and operators.

I have a few observations from reading the policy against the codebase:

1. External-surface allowlist enforcement (§2.6, rule 2)

Adapters must refuse to dispatch agent work, resolve approvals, or relay output until an allowlist is set. Code paths that fail open when no allowlist is configured are code bugs in scope under §3.1.

This is a strong and testable contract. Has the team audited the current adapters against this rule? I ask because some adapters (e.g. api_server) configure auth via api_key in the config block, and the gateway's PlatformConfig.from_dict() reads from the extra dict — if a user writes api_key at the top level instead of under extra, the adapter silently falls back to no-auth (see #20501). That's arguably a "fail open when no allowlist is configured" code path. If the team agrees this falls under §3.1, it could be worth calling out as an example in the policy or in a follow-up issue.
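To make the fail-open concern concrete, here is a minimal sketch of a fail-closed alternative. It is not the actual `PlatformConfig.from_dict()` code: `AdapterConfig`, `MisplacedConfigError`, `ADAPTER_ONLY_KEYS`, and `may_dispatch` are hypothetical names, and the assumption is that adapter settings such as `api_key` are only honored under `extra`. The idea is to reject a misplaced key loudly instead of ignoring it, and to refuse dispatch whenever no credential is configured.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical: keys that are only honored when nested under "extra".
ADAPTER_ONLY_KEYS = {"api_key", "host", "port"}


class MisplacedConfigError(ValueError):
    """Raised when an adapter setting appears outside the 'extra' block."""


@dataclass
class AdapterConfig:
    enabled: bool = False
    extra: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "AdapterConfig":
        # Reject misplaced keys instead of silently dropping them.
        misplaced = sorted(ADAPTER_ONLY_KEYS.intersection(raw))
        if misplaced:
            raise MisplacedConfigError(
                f"move {misplaced} under 'extra'; top-level values are not read"
            )
        return cls(
            enabled=bool(raw.get("enabled", False)),
            extra=dict(raw.get("extra") or {}),
        )


def may_dispatch(config: AdapterConfig) -> bool:
    """Fail closed: without a configured credential, no agent work is dispatched."""
    return config.enabled and bool(config.extra.get("api_key"))
```

With this shape, a config that puts `api_key` at the top level fails at load time, and a config with no key at all loads but never dispatches. Whether the real adapters should raise or merely refuse to start is a design choice for the maintainers.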

2. MCP trust boundary (§2.3 vs §2.4)

The policy says MCP subprocesses receive a filtered environment (credential scrubbing), but doesn't explicitly classify MCP servers as either a boundary or a heuristic. From the code:

tools/mcp_tool.py runs MCP servers as host subprocesses with _build_safe_env()
OSV malware checking happens for npx/uvx packages
MCP servers can register tools that the LLM calls directly

Under terminal-backend isolation, MCP servers run on the host (not inside the sandbox), so they're inside the trust envelope. Under whole-process wrapping, they'd be confined. It might be worth a one-liner in §2.2 or §2.3 clarifying where MCP servers sit relative to the two postures, since "MCP server" is a term operators will encounter frequently and the current policy leaves it implicit.
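As a rough illustration of what a "filtered environment" buys and what it does not, the sketch below builds a child environment from an allowlist. It is an assumption-laden stand-in, not the real `_build_safe_env()`: the function names and the `SAFE_ENV_KEYS` allowlist are hypothetical.

```python
import subprocess

# Hypothetical allowlist; the real _build_safe_env() may use different rules.
SAFE_ENV_KEYS = ("PATH", "HOME", "LANG", "TERM")


def build_filtered_env(parent_env, extra=None):
    """Copy only allowlisted variables, then apply explicit per-server overrides."""
    env = {key: parent_env[key] for key in SAFE_ENV_KEYS if key in parent_env}
    env.update(extra or {})
    return env


def run_with_filtered_env(argv, parent_env, extra=None):
    """Start a host subprocess that cannot see non-allowlisted variables."""
    return subprocess.run(
        argv,
        env=build_filtered_env(parent_env, extra),
        capture_output=True,
        text=True,
        check=True,
    )
```

A subprocess started this way does not inherit API keys from the parent environment, but it still runs as the same user on the same host, so it can read any file that user can. That is why environment scrubbing reads as a heuristic rather than a boundary under terminal-backend isolation.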

3. Skill import-time execution (§2.4)

Reviewing a skill means reading its Python code and scripts, not just its SKILL.md description — skills execute arbitrary Python at import time.

This is an important statement. For operators who want to do this review, it might help to name the specific code path — skill_commands.py scans ~/.hermes/skills/ and injects the skill as a user message, but the Python import happens via the tool discovery chain in model_tools.py → _discover_tools(). Knowing when the arbitrary code runs (at import during tool discovery, not at invocation) helps an operator know what to audit and when.
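The import-time point can be demonstrated with the standard library alone. The sketch below does not use the Hermes discovery chain; `load_module_from_path` and the sample skill source are hypothetical and only show that module-level statements run when the file is imported, before any function in it is called.

```python
import importlib.util
import tempfile
from pathlib import Path

# A stand-in "skill": the append on the second line is module-level code.
SKILL_SOURCE = '''
SIDE_EFFECTS = []
SIDE_EFFECTS.append("ran at import")

def run_skill():
    SIDE_EFFECTS.append("ran at invocation")
'''


def load_module_from_path(path: Path):
    """Import a Python file by path; exec_module runs its top-level statements."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    skill_file = Path(tmp) / "example_skill.py"
    skill_file.write_text(SKILL_SOURCE)
    module = load_module_from_path(skill_file)

# Snapshot taken before run_skill() is ever called.
effects_after_import = list(module.SIDE_EFFECTS)
```

Here `effects_after_import` already contains the import-time entry even though `run_skill()` has not been invoked. For a reviewer, this means the audit has to cover everything at module scope, including imports of other files in the skill.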

4. Minor: OpenShell setup reference

The policy references NVIDIA OpenShell as a supported whole-process wrapping option, but there's no link to a Hermes-specific setup guide. If the integration is production-ready, a short "see docs/openshell.md" pointer would help operators actually adopt it. If it's aspirational, a note like "integration in progress" would set expectations.

5. Suggestion: concrete examples for §3.2

The out-of-scope section is clear in principle, but a few concrete examples would help community contributors calibrate before filing:

| Scenario | Why it's out of scope |
| --- | --- |
| LLM emits a malicious URL via prompt injection and the agent fetches it | Prompt injection alone; no §3.1 boundary crossed |
| Approval gate regex bypassed by obfuscated shell command | In-process heuristic; not a boundary |
| Skill reads `~/.hermes/.env` at import time | Inside trust envelope; operator should have reviewed before install |

These don't need to be in the policy itself — a SECURITY-FAQ.md or a pinned issue could serve the same purpose.
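For the approval-gate scenario, a toy example shows why pattern matching on command text is a heuristic. The pattern below is hypothetical and far simpler than any real rule set; the point is only that the shell and the regex disagree about what the command is.

```python
import re

# Hypothetical deny pattern, not the actual approval-gate rules.
DANGEROUS = re.compile(r"\brm\s+-rf\s+/")


def needs_approval(command: str) -> bool:
    """Flag a command for approval if it matches the deny pattern."""
    return DANGEROUS.search(command) is not None


plain = "rm -rf /tmp/build"
# A POSIX shell concatenates r, the empty string '', and m into the word "rm",
# so this runs the same program, but the text no longer contains "rm".
obfuscated = "r''m -rf /tmp/build"

plain_flagged = needs_approval(plain)
obfuscated_flagged = needs_approval(obfuscated)
```

The plain form is flagged and the obfuscated form is not. Tightening the pattern closes this one variant, which is useful as a regular issue or PR, but it does not turn the gate into a boundary.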

Overall this is a strong policy rewrite. The explicit "heuristic vs boundary" distinction will save both the team and reporters significant time triaging reports. Happy to help with any of the above if the team agrees on direction.

alt-glitch added labels on May 5, 2026: type/security (Security vulnerability or hardening), comp/agent (Core agent loop, run_agent.py, prompt builder), P3 (Low: cosmetic, nice to have)

changes from feedback

0d1cbc2

jquesnelle merged commit bf2cc8b into main May 11, 2026
7 checks passed

jquesnelle deleted the meta/security-policy branch May 11, 2026 05:36

bot-ted mentioned this pull request May 11, 2026

chore: sync with upstream main (2026-05-11) bot-ted/hermes-agent#28

Merged

github-actions Bot mentioned this pull request May 17, 2026

chore: bump NousResearch/hermes-agent version from v2026.5.7 to v2026.5.16 Docker-Hub-sirmark/docker-hermes-agent#6

Merged

reefmind mentioned this pull request Jun 4, 2026

platforms.api_server config values (port, host, key) silently ignored when not nested under 'extra' #20501

Open

jquesnelle merged 2 commits into main from meta/security-policy
