fix: chown root-owned files in HERMES_HOME on startup #35098
When `docker exec -u root` creates or modifies files directly in $HERMES_HOME (gateway.lock, state.db, auth.json, etc.), they become root-owned. The hermes user (uid 10000) then gets PermissionError on next startup, causing a gateway restart loop.

The existing chown logic only covers specific subdirectories (cron, sessions, logs, etc.) but not root-level files. We can't `chown -R` the whole directory (NousResearch#19788: a host-mounted bind may contain unrelated user files), but chowning individual root-owned files is safe.

Add a find command that chowns all root-owned regular files in the HERMES_HOME root directory to hermes. This runs unconditionally on every container start, catching any files left behind by root operations.
The targeted data-volume chown in stage2-hook.sh only covers hermes-owned *subdirectories*; loose state files living directly under $HERMES_HOME (auth.json, state.db, gateway.lock, gateway_state.json, …) are missed. When created or rewritten by `docker exec <container> hermes …` (root unless `-u` is passed) they land root-owned, and the unprivileged hermes runtime then hits PermissionError on next startup, producing a gateway restart loop.

Fix: reset ownership of an explicit allowlist of hermes-owned top-level files on every boot. The list mirrors the top-level file entries of hermes_cli.profile_distribution.USER_OWNED_EXCLUDE plus the runtime lock files.

This uses a targeted allowlist rather than the originally-proposed blanket `find $HERMES_HOME -maxdepth 1 -user root` sweep, preserving the targeted-ownership contract from #19788 / PR #19795: a bind-mounted $HERMES_HOME may contain host-owned files Hermes does not manage, and those must never be chowned.

Verified end-to-end: allowlisted root-owned files are reset to hermes on restart while a non-allowlisted host file keeps its root ownership.

Co-authored-by: x1am1 <2663402852@qq.com>
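For readers following along, here is a rough sketch of what an allowlist-based reset could look like. It is not the code from stage2-hook.sh: the function names are hypothetical, the allowlist contains only the four files named in the commit message (the real list is elided there and is said to mirror USER_OWNED_EXCLUDE plus the lock files), and the owner is passed in as a parameter rather than assumed.

```shell
#!/bin/sh
# Illustrative subset only: these are the four files named in the commit
# message, not the full allowlist used by the actual hook.
HERMES_TOP_LEVEL_FILES="auth.json state.db gateway.lock gateway_state.json"

# Print each allowlisted regular file that exists directly under $1.
# Anything not on the allowlist is never listed, whoever owns it.
list_allowlisted_files() {
    home=$1
    for name in $HERMES_TOP_LEVEL_FILES; do
        path="$home/$name"
        # Skip missing files and symlinks.
        if [ -f "$path" ] && [ ! -L "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# Hypothetical apply step: hand the listed files to owner $2.
# Needs enough privilege to chown, e.g. running as root at container start.
reset_allowlisted_ownership() {
    list_allowlisted_files "$1" | while IFS= read -r path; do
        chown "$2" "$path"
    done
}

# Example call (not executed here):
#   reset_allowlisted_ownership "$HERMES_HOME" hermes
```

Splitting the listing from the chown keeps the selection logic checkable without root, and a host-owned file such as a stray `notes.txt` in a bind-mounted $HERMES_HOME is never touched because it is not on the list.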
Thanks for catching this @x1am1. The bug is real: top-level state files (…) I've opened #36236 as a salvage that fixes the same bug while keeping your authorship (…) Verified end-to-end: allowlisted root-owned files get reset to hermes on restart, while a non-allowlisted host file keeps its root ownership. Closing this in favor of #36236. Really appreciate the report and the fix.
Problem
When `docker exec -u root` creates or modifies files directly in `$HERMES_HOME` (e.g. `gateway.lock`, `state.db`, `auth.json`), they become root-owned. The hermes user (uid 10000) then gets `PermissionError` on next startup, causing a gateway restart loop.

Root Cause
The existing chown logic in `stage2-hook.sh` only covers specific subdirectories (cron, sessions, logs, etc.) but not root-level files.

Fix
Add a targeted `find` command that chowns all root-owned regular files in `$HERMES_HOME` root to hermes:
- `-maxdepth 1`: only root-level files, not recursive
- `-user root`: only files owned by root
- `-type f`: files only, not directories

Supersedes #35078.
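The three flags combine roughly as in the sketch below. This is an illustration of the sweep as described, not the exact line from the PR: the function name is hypothetical, and the owner is a parameter so the selection can be tried without root. As the conversation notes, this blanket sweep was later replaced by an explicit allowlist, because a bind-mounted $HERMES_HOME may hold host-owned files that should not be chowned.

```shell
#!/bin/sh
# List regular files directly under directory $1 that are owned by user $2.
#   -maxdepth 1  do not descend into subdirectories
#   -type f      regular files only, not directories
#   -user NAME   only files owned by NAME (a numeric uid also works)
list_top_level_files_owned_by() {
    find "$1" -maxdepth 1 -type f -user "$2"
}

# The sweep described above would then look something like this
# (not executed here; it needs root, and the target owner is an assumption):
#   find "$HERMES_HOME" -maxdepth 1 -type f -user root -exec chown hermes {} +
```

Running the listing form first against a real volume is a cheap way to see which files the sweep would touch before any ownership changes.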