Skip to content

fix(docker): skip unnecessary boot chown when volume ownership already matches remapped UID#35027

Merged
benbarclay merged 1 commit into
NousResearch:mainfrom
Foldblade:main
Jun 1, 2026
Merged

fix(docker): skip unnecessary boot chown when volume ownership already matches remapped UID#35027
benbarclay merged 1 commit into
NousResearch:mainfrom
Foldblade:main

Conversation

@Foldblade

Copy link
Copy Markdown
Contributor

What does this PR do?

Bug Description

When HERMES_UID/PUID is set to a non-default value, stage2-hook.sh runs chown -R on all Hermes-managed subdirectories on every container start, even though the volume was already chowned to the correct UID on the first boot. On slow storage (SD card, NAS) with large data directories (~38K files), this adds 20+ minutes to every docker restart.

Root Cause

The needs_chown condition at line 123 checks $HERMES_UID != "10000" — a hardcoded constant — instead of comparing actual filesystem ownership against the hermes user's current UID. Since the usermod block above has already remapped id -u hermes to $HERMES_UID, actual_hermes_uid already reflects the correct target. The hardcoded "10000" check always evaluates to true whenever HERMES_UID is set to any non-default value, causing the full chown -R cycle on every boot regardless of actual filesystem state.

Fix

Remove the two redundant lines and keep only the stat-based ownership check:

 actual_hermes_uid=$(id -u hermes)
 needs_chown=false
-if [ -n "${HERMES_UID:-}" ] && [ "$HERMES_UID" != "10000" ]; then
-    needs_chown=true
-elif [ "$(stat -c %u "$HERMES_HOME" 2>/dev/null)" != "$actual_hermes_uid" ]; then
+if [ "$(stat -c %u "$HERMES_HOME" 2>/dev/null)" != "$actual_hermes_uid" ]; then
     needs_chown=true
 fi

The original elif branch already correctly checks actual ownership. The first branch (HERMES_UID != "10000") was a heuristic that guessed "user set PUID → files probably don't match". The stat call answers the real question directly: "do the files actually need fixing?" — making the heuristic both redundant and harmful.

Related Issue

Fixes #35025

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • Remove the two redundant lines and keep only the stat-based ownership check

How to Test

  1. Run the container with PUID set to a non-10000 value, e.g. -e PUID=1000 -e PGID=1000
  2. First boot: let chown complete (may be slow on first run)
  3. docker restart <container>
  4. Verify: docker logs <container> shows no [stage2] Fixing ownership of /opt/data line
  5. Verify: container reaches ready state in seconds, not minutes

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: Raspberry Pi 4B (aarch64, Debian trixie, Docker overlay2)

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

@glensc

glensc commented May 31, 2026

Copy link
Copy Markdown

I came up with the exact same fix, so LGTM.

@benbarclay benbarclay merged commit 1031031 into NousResearch:main Jun 1, 2026
benbarclay added a commit that referenced this pull request Jun 4, 2026
…HOME (#35027 regression) (#38556)

The stage2 hook gates the recursive chown of the build trees under
$INSTALL_DIR (.venv, ui-tui, node_modules) so a HERMES_UID/PUID remap
leaves them writable by the new runtime UID — needed for lazy_deps
'uv pip install' of platform extras (#15012, #21100) and the TUI esbuild
rebuild into ui-tui/dist (#28851).

#35027 folded that chown under the $HERMES_HOME ownership check
('stat $HERMES_HOME != hermes_uid'). But 'usermod -u <new> hermes'
re-chowns the hermes home dir ($HERMES_HOME == /opt/data) to the new UID
as a side effect, so after any remap that stat is already satisfied and
needs_chown is false — silently skipping the build-tree chown on the
common PUID/NAS path. The venv stays owned by the build-time UID (10000),
so lazy installs and TUI rebuilds fail with EACCES.

Probe the build trees directly instead: chown only when /opt/hermes/.venv
is not already owned by the runtime hermes UID. Independent of
$HERMES_HOME ownership, idempotent across restarts.

Verified live: built the image, booted with HERMES_UID/HERMES_GID on a
fresh named volume, confirmed .venv/ui-tui/node_modules end up owned by
the remapped UID and 'uv pip install' into the venv succeeds; confirmed
the recursive chown fires once and is skipped on restart.
JoeKowal pushed a commit to JoeKowal/hermes-agent that referenced this pull request Jun 4, 2026
Yuki-14544869 pushed a commit to Yuki-14544869/hermes-agent that referenced this pull request Jun 4, 2026
…HOME (NousResearch#35027 regression) (NousResearch#38556)

The stage2 hook gates the recursive chown of the build trees under
$INSTALL_DIR (.venv, ui-tui, node_modules) so a HERMES_UID/PUID remap
leaves them writable by the new runtime UID — needed for lazy_deps
'uv pip install' of platform extras (NousResearch#15012, NousResearch#21100) and the TUI esbuild
rebuild into ui-tui/dist (NousResearch#28851).

NousResearch#35027 folded that chown under the $HERMES_HOME ownership check
('stat $HERMES_HOME != hermes_uid'). But 'usermod -u <new> hermes'
re-chowns the hermes home dir ($HERMES_HOME == /opt/data) to the new UID
as a side effect, so after any remap that stat is already satisfied and
needs_chown is false — silently skipping the build-tree chown on the
common PUID/NAS path. The venv stays owned by the build-time UID (10000),
so lazy installs and TUI rebuilds fail with EACCES.

Probe the build trees directly instead: chown only when /opt/hermes/.venv
is not already owned by the runtime hermes UID. Independent of
$HERMES_HOME ownership, idempotent across restarts.

Verified live: built the image, booted with HERMES_UID/HERMES_GID on a
fresh named volume, confirmed .venv/ui-tui/node_modules end up owned by
the remapped UID and 'uv pip install' into the venv succeeds; confirmed
the recursive chown fires once and is skipped on restart.
j2h4u added a commit to j2h4u/hermes-agent that referenced this pull request Jun 4, 2026
Brings deploy up to upstream/main (468 commits, 2026-05-31 → 06-04).

All three carried docker patches are now superseded by upstream and dropped
— merge result is byte-identical to upstream/main:

- 5031c9e (chown install trees independently of volume ownership)
  → superseded by upstream's "Fix ownership of build trees under $INSTALL_DIR"
    block in stage2-hook.sh (NousResearch#38655/NousResearch#38556), a strict superset that also
    covers the gateway tree and documents the NousResearch#35027 gating regression.

- e1fc281 (avoid implicit chown of host hermes home)
- 810534f (silence install stamp on protected mounts)
  → superseded by the s6-overlay boot migration (feat(docker)! e0e9c89),
    which turned entrypoint.sh into a deprecated shim and moved all chown /
    setup logic into cont-init.d/01-hermes-setup (stage2-hook.sh). Upstream's
    targeted HERMES_HOME chown now chowns the dir itself, not bind-mount
    contents — exactly the behavior our patch protected.

deploy now carries no local diff vs upstream/main.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
davidgut1982 pushed a commit to davidgut1982/hermes-agent that referenced this pull request Jun 5, 2026
…HOME (NousResearch#35027 regression) (NousResearch#38556)

The stage2 hook gates the recursive chown of the build trees under
$INSTALL_DIR (.venv, ui-tui, node_modules) so a HERMES_UID/PUID remap
leaves them writable by the new runtime UID — needed for lazy_deps
'uv pip install' of platform extras (NousResearch#15012, NousResearch#21100) and the TUI esbuild
rebuild into ui-tui/dist (NousResearch#28851).

NousResearch#35027 folded that chown under the $HERMES_HOME ownership check
('stat $HERMES_HOME != hermes_uid'). But 'usermod -u <new> hermes'
re-chowns the hermes home dir ($HERMES_HOME == /opt/data) to the new UID
as a side effect, so after any remap that stat is already satisfied and
needs_chown is false — silently skipping the build-tree chown on the
common PUID/NAS path. The venv stays owned by the build-time UID (10000),
so lazy installs and TUI rebuilds fail with EACCES.

Probe the build trees directly instead: chown only when /opt/hermes/.venv
is not already owned by the runtime hermes UID. Independent of
$HERMES_HOME ownership, idempotent across restarts.

Verified live: built the image, booted with HERMES_UID/HERMES_GID on a
fresh named volume, confirmed .venv/ui-tui/node_modules end up owned by
the remapped UID and 'uv pip install' into the venv succeeds; confirmed
the recursive chown fires once and is skipped on restart.
changman pushed a commit to changman/hermes-agent that referenced this pull request Jun 10, 2026
…HOME (NousResearch#35027 regression) (NousResearch#38556)

The stage2 hook gates the recursive chown of the build trees under
$INSTALL_DIR (.venv, ui-tui, node_modules) so a HERMES_UID/PUID remap
leaves them writable by the new runtime UID — needed for lazy_deps
'uv pip install' of platform extras (NousResearch#15012, NousResearch#21100) and the TUI esbuild
rebuild into ui-tui/dist (NousResearch#28851).

NousResearch#35027 folded that chown under the $HERMES_HOME ownership check
('stat $HERMES_HOME != hermes_uid'). But 'usermod -u <new> hermes'
re-chowns the hermes home dir ($HERMES_HOME == /opt/data) to the new UID
as a side effect, so after any remap that stat is already satisfied and
needs_chown is false — silently skipping the build-tree chown on the
common PUID/NAS path. The venv stays owned by the build-time UID (10000),
so lazy installs and TUI rebuilds fail with EACCES.

Probe the build trees directly instead: chown only when /opt/hermes/.venv
is not already owned by the runtime hermes UID. Independent of
$HERMES_HOME ownership, idempotent across restarts.

Verified live: built the image, booted with HERMES_UID/HERMES_GID on a
fresh named volume, confirmed .venv/ui-tui/node_modules end up owned by
the remapped UID and 'uv pip install' into the venv succeeds; confirmed
the recursive chown fires once and is skipped on restart.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/docker Docker image, Compose, packaging P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

4 participants