fix(docker): reject unsupported --user <arbitrary-uid> start with clear guidance #38579
Merged
…ar guidance

`docker run --user $(id -u):$(id -g)` was a tini-era trick to make container-written files match the host user. Under s6-overlay it no longer works: the bootstrap (UID remap, volume + build-tree chown, config seeding) needs root, and the baked image dirs (/opt/data, /opt/hermes/.venv, ui-tui, node_modules) are owned by the hermes build UID (10000). A pinned arbitrary UID can't write them, so the runtime fails with EACCES on a bind mount or hard-crashes on a named volume (Docker inits the volume from the image as 10000; the non-root start can't even `cd /opt/data`, and the profile reconciler dies with PermissionError on gateway_state.json).

Detect that start early in both the cont-init hook (stage2-hook.sh) and the CMD wrapper (main-wrapper.sh) and fail fast with actionable guidance pointing at the supported path: root start + HERMES_UID/HERMES_GID (or the PUID/PGID aliases), which remaps the hermes user and chowns the volume. This gives the same host-UID-matching outcome --user was used for, without breaking s6.

The guard fires only when the current UID is neither root NOR the hermes UID. This preserves the supported non-root start from #34648/#34837 (running with `--user 10000:10000`, i.e. pinned to the hermes UID itself), which is unaffected; only the arbitrary-UID variant that #34837 never actually made writable is rejected.

Verified live across five scenarios (built image, bind + named volume):
- arbitrary --user on bind -> rejected with guidance, hermes does not run
- arbitrary --user on named volume -> guidance shown, no raw 'can't cd' crash
- --user 10000:10000 -> boots
- root + HERMES_UID=4242 remap -> boots, guard not tripped
- default root start -> boots

Pre-fix control reproduces the raw PermissionError + 'can't cd' crash with no guidance.
🔎 Lint report:
| Rule | Count |
|---|---|
| unresolved-import | 1 |
| no-matching-overload | 1 |

First entries
tests/tools/test_stage2_hook_user_flag_guard.py:31: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/tools/test_stage2_hook_user_flag_guard.py:84: [no-matching-overload] no-matching-overload: No overload of function `run` matches arguments
✅ Fixed issues: none
Unchanged: 5050 pre-existing issues carried over.
Diagnostics are surfaced as warnings — this check never fails the build.
Yuki-14544869 pushed a commit to Yuki-14544869/hermes-agent that referenced this pull request on Jun 4, 2026.
davidgut1982 pushed a commit to davidgut1982/hermes-agent that referenced this pull request on Jun 5, 2026.
changman pushed a commit to changman/hermes-agent that referenced this pull request on Jun 10, 2026.
Problem
`docker run --user $(id -u):$(id -g)` was a tini-era trick to make container-written files match the host user. Under the current s6-overlay image it silently breaks:

- On a bind mount, writing to the baked build trees (`/opt/hermes/.venv`, `ui-tui`, `node_modules`, owned by the hermes build UID 10000) fails with `EACCES`; lazy installs and TUI rebuilds are dead.
- On a named volume, Docker initializes the volume from the image as UID 10000, so the arbitrary `--user` UID can't even `cd` into `$HERMES_HOME`.

Root cause: the bootstrap steps (UID remap, volume/build-tree chown, config seeding) all require root and are skipped on a non-root start. `--user` with an arbitrary UID was a casualty of the tini→s6 migration that was never made to actually work.

Fix (Option A — redirect to the supported path)
Detect the unsupported start early, in both the cont-init hook (`stage2-hook.sh`) and the CMD wrapper (`main-wrapper.sh`, the surface the user sees in `docker run` output), and fail fast with actionable guidance instead of crashing on `cd`/EACCES downstream.

The supported `HERMES_UID`/`PUID` path remaps the hermes user and chowns the volume at boot, giving the same host-UID-matching outcome `--user` was used for, without breaking the s6 supervision tree.

Why this does NOT revert #34837 / re-break #34648
The guard fires only when the current UID is neither root nor the hermes UID. #34648's supported non-root start uses `user: "10000:10000"`, pinned to the hermes UID itself, so `cur_uid` == `id -u hermes` and the guard skips it.

#34837 fixed the boot-loop for that case; this PR rejects only the arbitrary-UID variant that #34837 never made writable (confirmed by reproducing the EACCES / crash on current `main`).

Verification
Unit tests (`tests/tools/test_stage2_hook_user_flag_guard.py`, 6 tests)

Extracts the guard from each script and runs it with `id` stubbed; asserts arbitrary UID → `exit 1` + guidance, and root / `--user <hermes-uid>` / remapped-hermes-uid → pass through. All 20 stage2 contract tests green.

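Live E2E (built image, bind + named volume)

| Scenario | Result |
|---|---|
| `--user 1000:1000` + bind mount | rejected with guidance, hermes does not run |
| `--user 1000:1000` + named volume | guidance shown, no raw `can't cd` crash |
| `--user 10000:10000` (hermes UID, #34648) | boots |
| root + `HERMES_UID=4242` remap | boots, guard not tripped |
| default root start | boots |

Pre-fix control reproduces the raw `PermissionError` + `can't cd` crash with no guidance.
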
Scope
`docker/` lane only: `stage2-hook.sh` + `main-wrapper.sh` + a contract test. No runtime/Python behavior change.
Follow-up
This is the redirect approach. Genuinely restoring full `--user <arbitrary-uid>` parity (world-writable build trees / relocated runtime state + s6 tuning) is a larger, separate change and is intentionally out of scope here.