fix(docker): skip redundant s6 user drop #34684
@austinpickett I opened this as a focused P1 Docker regression fix for #34648. GitHub would not let me formally request review ( Local validation:
I could not run a live Docker boot repro locally because the Docker daemon is unavailable on this machine, but the shell helper is covered for both paths: the already-Hermes UID path bypasses s6, and privileged startup still uses s6.
Superseded by #34837, which merged in 380ce47 and fixes the identical bug: #34837 guards each drop site inline ( I verified #34837 end-to-end: an old image boots
What does this PR do?
Fixes a Docker image regression where containers started directly as the Hermes UID, for example with Compose user: "10000:10000" plus group_add for /var/run/docker.sock, could hit an s6-overlay restart loop:
s6-applyuidgid: fatal: unable to set supplementary group list: Operation not permitted
The Docker scripts now use a shared run-as-hermes.sh helper. It keeps the existing s6-setuidgid hermes behavior when the process is privileged, but bypasses that extra user/group reset when the current process is already running as the Hermes UID. That preserves Docker-injected supplementary groups and avoids the failing redundant privilege drop.
Related Issue
Fixes #34648
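The helper behavior described under "What does this PR do?" can be sketched roughly as follows. This is a minimal illustration, not the contents of docker/run-as-hermes.sh: the HERMES_UID variable, its default of 10000 (taken from the Compose example), the already_hermes function, and the use of id -u for the check are all assumptions. Only the function names run_as_hermes and exec_as_hermes and the s6-setuidgid hermes fallback come from the PR text.

```sh
#!/bin/sh
# Hypothetical sketch of a run-as-hermes.sh style helper; not the PR's actual file.

# Assumed UID of the hermes user, taken from the Compose example user: "10000:10000".
HERMES_UID="${HERMES_UID:-10000}"

# Return 0 when the current process already runs as the Hermes UID.
already_hermes() {
    [ "$(id -u)" = "$HERMES_UID" ]
}

# Run a command as hermes, then return to the caller.
run_as_hermes() {
    if already_hermes; then
        # Already unprivileged: run directly so Docker-injected
        # supplementary groups are left untouched.
        "$@"
    else
        # Privileged startup: drop to hermes through s6, as before.
        s6-setuidgid hermes "$@"
    fi
}

# Replace the current process with a command running as hermes.
exec_as_hermes() {
    if already_hermes; then
        exec "$@"
    else
        exec s6-setuidgid hermes "$@"
    fi
}
```

Under these assumptions, a service script would source the helper and call exec_as_hermes with its command instead of calling s6-setuidgid hermes directly, so the privilege drop only happens when there is a privilege to drop.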
Type of Change
Changes Made
- Added docker/run-as-hermes.sh with run_as_hermes and exec_as_hermes helpers.
- Updated docker/stage2-hook.sh to use run_as_hermes for boot-time Hermes-owned file/directory setup.
- Updated docker/main-wrapper.sh, docker/s6-rc.d/dashboard/run, and docker/cont-init.d/02-reconcile-profiles to use exec_as_hermes instead of unconditionally invoking s6-setuidgid hermes.
- Added tests/tools/test_docker_run_as_hermes.py to cover the already-Hermes UID bypass, the privileged s6 path, and script wiring.
How to Test
- Run git diff --check
- Run sh -n docker/run-as-hermes.sh docker/stage2-hook.sh docker/main-wrapper.sh docker/s6-rc.d/dashboard/run docker/cont-init.d/02-reconcile-profiles
- Run uv run --with pytest --with pytest-timeout pytest tests/tools/test_docker_run_as_hermes.py tests/tools/test_stage2_hook_puid_pgid.py -q
- Run scripts/run_tests.sh tests/tools/test_docker_run_as_hermes.py tests/tools/test_stage2_hook_puid_pgid.py
- Start a container with user: "10000:10000", group_add containing the Docker socket GID, and a /var/run/docker.sock:/var/run/docker.sock mount; startup should no longer fail with the s6 supplementary-group error.
Checklist
Code
- fix(scope):, feat(scope):, etc.)
- pytest tests/ -q and all tests pass
Documentation & Housekeeping
- docs/, docstrings), or N/A
- cli-config.yaml.example if I added/changed config keys, or N/A
- CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows, or N/A
Screenshots / Logs
Validation run locally:
- git diff --check passed
- sh -n docker/run-as-hermes.sh docker/stage2-hook.sh docker/main-wrapper.sh docker/s6-rc.d/dashboard/run docker/cont-init.d/02-reconcile-profiles passed
- uv run --with pytest --with pytest-timeout pytest tests/tools/test_docker_run_as_hermes.py tests/tools/test_stage2_hook_puid_pgid.py -q -> 10 passed
- scripts/run_tests.sh tests/tools/test_docker_run_as_hermes.py tests/tools/test_stage2_hook_puid_pgid.py -> 10 tests passed, 0 failed
Full pytest tests/ -q was not run for this PR. Live Docker boot validation was also not run locally because Docker is installed but the daemon is unavailable on this machine: Cannot connect to the Docker daemon at unix:///Users/danyraihan/.orbstack/run/docker.sock.
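For anyone with a working daemon, the live check listed under How to Test might be assembled along these lines. This is only a sketch: the function names docker_sock_gid and live_check_command and the image name in the usage line are placeholders, not names from this PR, and the stat -c syntax assumes GNU coreutils on a Linux host. The second function only prints the command so it can be reviewed before it is run.

```sh
#!/bin/sh
# Hypothetical sketch of the live check; function and image names are placeholders.

# Print the GID that owns the Docker socket (GNU stat syntax, Linux hosts).
docker_sock_gid() {
    stat -c '%g' "${1:-/var/run/docker.sock}"
}

# Print a docker run command that starts the container directly as the
# Hermes UID, with the socket GID added as a supplementary group.
# $1 = image name, $2 = Docker socket GID
live_check_command() {
    image="$1"
    gid="$2"
    printf 'docker run --rm --user 10000:10000 --group-add %s -v /var/run/docker.sock:/var/run/docker.sock %s\n' "$gid" "$image"
}
```

A call such as live_check_command hermes-image:local "$(docker_sock_gid)" would print the command for an image tagged hermes-image:local (an assumed tag). With the fix in place, the container should start without the s6 supplementary-group error quoted in the description.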