fix(docker): accept PUID/PGID as aliases for HERMES_UID/HERMES_GID #25872

Closed
konsisumer wants to merge 1 commit into NousResearch:main from konsisumer:fix/docker-puid-pgid-alias

Conversation

@konsisumer

Contributor

What changed and why

NAS users on #15290 (UGreen UGOS Pro, plus Synology/unRAID reports) bind-mount /opt/data from a host directory owned by their own UID. They set PUID/PGID — the LinuxServer.io convention NAS ecosystems expect — but docker/entrypoint.sh only read HERMES_UID/HERMES_GID, so those vars were silently ignored. The gosu privilege drop then landed on the default hermes user (UID 10000), which cannot read a bind mount owned by the host user, producing the persistent [Errno 13] Permission denied: '/opt/data/config.yaml'.

The entrypoint now resolves HERMES_UID/HERMES_GID from PUID/PGID when unset, so passing PUID/PGID makes the runtime drop to the host UID that already owns the bind mount. HERMES_UID/HERMES_GID still take precedence when both are set, so existing deployments are unaffected.
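The precedence rule above can be sketched with nested shell parameter expansion. This is a minimal illustration, not the PR's actual diff: the function name resolve_ids and the "UID:GID" output format are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): print the "UID:GID" pair the
# container would drop to, given the current environment.
resolve_ids() {
    # HERMES_UID/HERMES_GID win when set and non-empty;
    # otherwise fall back to the PUID/PGID alias (may be empty).
    uid="${HERMES_UID:-${PUID:-}}"
    gid="${HERMES_GID:-${PGID:-}}"
    printf '%s:%s\n' "$uid" "$gid"
}

# Only the alias is set: the alias is used.
( unset HERMES_UID HERMES_GID; PUID=1000; PGID=10; resolve_ids )   # prints 1000:10

# Both are set: HERMES_UID/HERMES_GID take precedence.
( HERMES_UID=2000; HERMES_GID=2000; PUID=1000; PGID=10; resolve_ids )   # prints 2000:2000
```

With neither pair set the sketch prints a bare ":", which is the empty result a later remap step would treat as "keep the default user".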

Per the reporter's 2026-05-14 diagnostics, a Docker named volume works while a NAS bind mount does not, and PUID=1000/PGID=10 were confirmed ignored by the current :latest image. Running the container as the host UID is the fix that lets bind mounts work without host-side chown loops. The troubleshooting docs in website/docs/user-guide/docker.md now document the alias and the NAS bind-mount pattern.

How to test

  • pytest tests/tools/test_entrypoint_puid_pgid.py -q — new contract test covering PUID/PGID alias resolution and HERMES_UID/HERMES_GID precedence.
  • Manual: docker run -e PUID=1000 -e PGID=10 -v /host/dir:/opt/data nousresearch/hermes-agent gateway run against a bind mount owned by host UID 1000 — the entrypoint logs Changing hermes UID to 1000 and the runtime can read/write config.yaml without a host-side chown.
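For the manual test, the PUID/PGID values should match the numeric owner of the host directory. A small sketch for reading them off the directory and printing the matching command follows; owner_ids and print_run_cmd are hypothetical helper names, and the command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric "UID:GID" owner of a directory.
# `ls -nd` lists the directory itself with numeric IDs in fields 3 and 4.
owner_ids() {
    [ -d "$1" ] || return 1
    ls -nd -- "$1" | awk '{ print $3 ":" $4 }'
}

# Hypothetical helper: print (do not run) the docker command from the
# manual test step, filled in with the directory's owner.
print_run_cmd() {
    ids=$(owner_ids "$1") || return 1
    printf 'docker run -e PUID=%s -e PGID=%s -v %s:/opt/data nousresearch/hermes-agent gateway run\n' \
        "${ids%%:*}" "${ids##*:}" "$1"
}
```

Calling print_run_cmd /host/dir on the NAS host would emit the command with the owner's IDs substituted, which avoids guessing values such as PGID=10.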

What platforms tested on

  • macOS (Darwin 24.6.0): pytest tests/tools/test_entrypoint_puid_pgid.py (4 passed) plus the existing tests/tools/test_docker* and test_dockerfile_* suites (30 passed); bash -n docker/entrypoint.sh clean.
  • The entrypoint change is shell-only and platform-agnostic; the target deployment is the Debian-based container image reported by NAS users (UGOS Pro / Btrfs bind mounts).

Addressing maintainer feedback

@alt-glitch flagged this as related to #13731 (CLOSED — duplicate of #15832) with the workaround --user 0:0. #13731 was fixed by commit 14c9f7272, which removed USER hermes from the Dockerfile so the entrypoint runs as root. The reporter's diagnostics confirm they are on a post-fix :latest image, and that --user 0:0 does not resolve their case: the entrypoint correctly detects container-root and drops to UID 10000 by design, which still mismatches the host-owned bind mount. The PUID/PGID alias addresses the remaining gap — it lets NAS users target the host UID directly instead of relying on a chown of a bind mount the host filesystem may reject.
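The reason --user 0:0 does not help can be sketched as a small decision function. This is an illustration under stated assumptions, not the entrypoint's code: drop_target_uid is a hypothetical name, the starting UID is passed as an argument so the logic can be exercised without root, and the non-root branch (keep the starting UID) is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: decide which UID the runtime ends up with.
#   $1 = UID the container process starts as (0 for root or --user 0:0)
drop_target_uid() {
    start_uid="$1"
    if [ "$start_uid" -ne 0 ]; then
        # Assumption: a non-root start cannot drop privileges, so it keeps its UID.
        printf '%s\n' "$start_uid"
    else
        # Root: drop to the requested UID, else the default hermes user (10000).
        printf '%s\n' "${HERMES_UID:-${PUID:-10000}}"
    fi
}

# Root with no variables set still lands on 10000, which mismatches the host owner.
( unset HERMES_UID PUID; drop_target_uid 0 )          # prints 10000

# Root with the alias set lands on the host UID.
( unset HERMES_UID; PUID=1000; drop_target_uid 0 )    # prints 1000
```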

Fixes #15290

NAS platforms (UGOS, Synology, unRAID) bind-mount /opt/data from a host
directory owned by the user's own UID and expect the LinuxServer.io
PUID/PGID convention. The entrypoint only read HERMES_UID/HERMES_GID, so
PUID/PGID were silently ignored, the gosu drop landed on UID 10000, and
the runtime could not read the bind-mounted volume.

Resolve HERMES_UID/HERMES_GID from PUID/PGID when unset; the existing
HERMES_UID/HERMES_GID vars still take precedence when both are set.

Fixes NousResearch#15290
@benbarclay

Collaborator

Thanks @konsisumer! Your fix shipped as 48083211ef606f3305c09df576514ac99bc7f594 via salvage PR #34401. Your branch targeted docker/entrypoint.sh, which is now a deprecation shim under s6-overlay (the May 2026 rework moved all bootstrap to docker/stage2-hook.sh, installed as /etc/cont-init.d/01-hermes-setup), so I reconstructed the same two-line alias resolution at the equivalent spot in stage2-hook.sh and credited you via Co-authored-by: on the merge commit. Tests were retargeted at the new file location and the NAS bind-mount docs example landed verbatim. Appreciate the clean LinuxServer.io-convention writeup — the NAS use case made it straightforward to slot the salvage in.

pull Bot pushed a commit to TKaxv-7S/hermes-agent that referenced this pull request May 29, 2026
…ousResearch#25872) (NousResearch#34401)

Salvages NousResearch#25872 by @konsisumer against current main.

NAS users (UGOS, Synology, unRAID) expect the LinuxServer.io
PUID/PGID convention and bind-mount /opt/data from a host directory
owned by their own UID.  Without this alias those vars are silently
ignored and the s6-setuidgid drop to UID 10000 leaves the runtime
unable to read the volume.  HERMES_UID/HERMES_GID still take
precedence when both are set.

The original PR targeted docker/entrypoint.sh, which is now a 27-line
deprecation shim under s6-overlay (the May 2026 rework moved all
bootstrap logic to docker/stage2-hook.sh, installed as
/etc/cont-init.d/01-hermes-setup).  Re-applied the same 2-line
alias resolution at the equivalent spot in stage2-hook.sh just
before the existing UID/GID remap block.  Test was retargeted at
docker/stage2-hook.sh; docs hunk adapted to current main's wording
("stage2 hook" + s6-setuidgid, not the obsolete "entrypoint drops
via gosu") with the NAS bind-mount example preserved verbatim.

Test-first regression verification: reverted just docker/stage2-hook.sh
to origin/main and re-ran the new tests.  Result:

  FAILED test_stage2_hook_resolves_puid_pgid_aliases
  FAILED test_puid_pgid_populate_hermes_uid_gid
      AssertionError: assert ':' == '1000:10'

That's the exact bug shape — PUID=1000 PGID=10 silently ignored,
HERMES_UID/HERMES_GID stay empty.  With the salvage applied, all 4
tests pass.

Closes NousResearch#25872

Co-authored-by: konsisumer <11262660+konsisumer@users.noreply.github.com>
KKT-OPT pushed a commit to KKT-OPT/hermes-agent that referenced this pull request May 31, 2026
…ousResearch#25872) (NousResearch#34401)

hechuyi pushed a commit to hechuyi/hermes-agent that referenced this pull request Jun 6, 2026
…ousResearch#25872) (NousResearch#34401)

(cherry picked from commit 4808321)

Labels

  • area/docker: Docker image, Compose, packaging
  • P2: Medium, degraded but workaround exists
  • type/bug: Something isn't working


Development

Successfully merging this pull request may close these issues.

[Setup]: Persistent [Errno 13] Permission denied on /opt/data/config.yaml during/after setup on NAS Docker (UGOS Pro)
