
fix: chown root-owned files in HERMES_HOME on startup #35098

Closed
x1am1 wants to merge 1 commit into NousResearch:main from x1am1:fix/chown-root-level-files

x1am1 (Contributor) commented May 30, 2026

Problem

When docker exec -u root creates or modifies files directly in $HERMES_HOME (e.g. gateway.lock, state.db, auth.json), they become root-owned. The hermes user (uid 10000) then gets PermissionError on next startup, causing a gateway restart loop.

Root Cause

The existing chown logic in stage2-hook.sh only covers specific subdirectories (cron, sessions, logs, etc.) but not root-level files.

Fix

Add a targeted find command that chowns all root-owned regular files in $HERMES_HOME root to hermes:

find "$HERMES_HOME" -maxdepth 1 -user root -type f -exec chown hermes:hermes {} + 2>/dev/null || true
  • -maxdepth 1 — only root-level files, not recursive
  • -user root — only files owned by root
  • -type f — files only, not directories
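To make the effect of the three filters concrete, here is a minimal sketch that previews which files the sweep would select without chowning anything. It is an illustration, not the PR's code: it runs in a throwaway temp directory, the file names are only examples, and the current user's uid stands in for root so the demo works unprivileged.

```shell
#!/bin/sh
# Build a throwaway stand-in for $HERMES_HOME (names are illustrative).
demo_home=$(mktemp -d)
mkdir "$demo_home/sessions"
touch "$demo_home/gateway.lock" "$demo_home/state.db" "$demo_home/sessions/nested.json"

# Assumption: the current uid stands in for root so no privileges are needed.
owner_uid=$(id -u)

# Same filters as the proposed command, but printing names instead of chowning.
selected=$(find "$demo_home" -maxdepth 1 -user "$owner_uid" -type f -exec basename {} \; | sort)
printf '%s\n' "$selected"

rm -rf "$demo_home"
```

Only `gateway.lock` and `state.db` are listed: the `sessions` directory is excluded by `-type f`, and `sessions/nested.json` by `-maxdepth 1`.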

Supersedes #35078.

When docker exec -u root creates or modifies files directly in
$HERMES_HOME (gateway.lock, state.db, auth.json, etc.), they become
root-owned. The hermes user (uid 10000) then gets PermissionError on
next startup, causing a gateway restart loop.

The existing chown logic only covers specific subdirectories (cron,
sessions, logs, etc.) but not root-level files. We can't chown -R the
whole directory (NousResearch#19788 — host-mounted bind may contain unrelated user
files), but chowning individual root-owned files is safe.

Add a find command that chowns all root-owned regular files in the
HERMES_HOME root directory to hermes. This runs unconditionally on
every container start, catching any files left behind by root
operations.
@alt-glitch added the type/bug (Something isn't working), area/docker (Docker image, Compose, packaging), and P2 (Medium, degraded but workaround exists) labels May 30, 2026
benbarclay added a commit that referenced this pull request Jun 1, 2026
The targeted data-volume chown in stage2-hook.sh only covers hermes-owned
*subdirectories*; loose state files living directly under $HERMES_HOME
(auth.json, state.db, gateway.lock, gateway_state.json, …) are missed.
When created or rewritten by `docker exec <container> hermes …` (root
unless `-u` is passed) they land root-owned, and the unprivileged hermes
runtime then hits PermissionError on next startup, producing a gateway
restart loop.

Fix: reset ownership of an explicit allowlist of hermes-owned top-level
files on every boot. The list mirrors the top-level file entries of
hermes_cli.profile_distribution.USER_OWNED_EXCLUDE plus the runtime lock
files.

This uses a targeted allowlist rather than the originally-proposed blanket
`find $HERMES_HOME -maxdepth 1 -user root` sweep, preserving the
targeted-ownership contract from #19788 / PR #19795: a bind-mounted
$HERMES_HOME may contain host-owned files Hermes does not manage, and
those must never be chowned. Verified end-to-end: allowlisted root-owned
files are reset to hermes on restart while a non-allowlisted host file
keeps its root ownership.

Co-authored-by: x1am1 <2663402852@qq.com>
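The allowlist approach described in this commit message could be sketched roughly as follows. This is not the code from stage2-hook.sh: the function name `reset_allowlisted`, its parameters, and the four-entry list are assumptions for illustration, with the file names taken from the examples above (the real list is said to mirror USER_OWNED_EXCLUDE and may differ).

```shell
#!/bin/sh
# Illustrative allowlist only: these names come from the commit message;
# the real list may contain more or different entries.
ALLOWLIST="auth.json state.db gateway.lock gateway_state.json"

# reset_allowlisted HOME_DIR FROM_UID TO_OWNER   (hypothetical helper)
# Chowns allowlisted top-level regular files in HOME_DIR that FROM_UID owns.
reset_allowlisted() {
    home_dir=$1
    from_uid=$2
    to_owner=$3
    for name in $ALLOWLIST; do
        path="$home_dir/$name"
        # Skip anything that is missing, a directory, or a symlink.
        if [ ! -f "$path" ] || [ -L "$path" ]; then
            continue
        fi
        # Skip files that FROM_UID does not own.
        if [ -z "$(find "$path" -maxdepth 0 -user "$from_uid")" ]; then
            continue
        fi
        # Report only the files whose ownership was actually reset.
        if chown "$to_owner" "$path" 2>/dev/null; then
            echo "reset $name"
        fi
    done
    return 0
}
```

In a root-run startup hook, a call might look like `reset_allowlisted "$HERMES_HOME" 0 hermes:hermes`. Files outside the list are never examined, which is what keeps host-owned files in a bind mount untouched.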
@benbarclay (Collaborator) commented:

Thanks for catching this @x1am1 — the bug is real: top-level state files (gateway.lock, state.db, auth.json) live directly under $HERMES_HOME, so the existing targeted subdir chown misses them, and docker exec -u root leaves them root-owned → restart loop.

I've opened #36236 as a salvage that fixes the same bug while keeping your authorship (Co-authored-by + an AUTHOR_MAP entry so you're credited in release notes). The one change: instead of find $HERMES_HOME -maxdepth 1 -user root, it uses an explicit allowlist of hermes-owned top-level files. The reason is that a blanket -user root sweep would also chown any host-owned root file in a bind-mounted $HERMES_HOME — which the targeted-ownership work in #19788 / #19795 deliberately avoids. The allowlist keeps your fix consistent with that contract.

Verified end-to-end: allowlisted root-owned files get reset to hermes on restart, while a non-allowlisted host file keeps its root ownership.
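The difference between the two strategies can be previewed without root using a small sketch like the one below. It is not the verification that was actually run: the temp directory, the `host-notes.txt` file, and the three-entry allowlist are made up for the demo, and the current uid stands in for root.

```shell
#!/bin/sh
# Throwaway stand-in for a bind-mounted $HERMES_HOME (names are illustrative).
demo_home=$(mktemp -d)
touch "$demo_home/auth.json" "$demo_home/state.db" "$demo_home/host-notes.txt"

# Assumption: the current uid stands in for root so no privileges are needed.
uid=$(id -u)

# Blanket sweep: every top-level regular file owned by that uid.
blanket=$(find "$demo_home" -maxdepth 1 -user "$uid" -type f -exec basename {} \; | sort | tr '\n' ' ')

# Allowlist: only explicitly named files that exist.
allowlisted=""
for name in auth.json gateway.lock state.db; do
    [ -f "$demo_home/$name" ] && allowlisted="$allowlisted$name "
done

echo "blanket:   $blanket"
echo "allowlist: $allowlisted"

rm -rf "$demo_home"
```

The blanket sweep picks up `host-notes.txt` along with the two state files, while the allowlist selects only `auth.json` and `state.db`.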

Closing this in favor of #36236 — really appreciate the report and the fix.

benbarclay added a commit that referenced this pull request Jun 1, 2026
#36236)

JoeKowal pushed a commit to JoeKowal/hermes-agent that referenced this pull request Jun 4, 2026
…search#35098) (NousResearch#36236)

