fix(docker): don't crash on read-only /tools; writable on-PATH install dirs for non-root agent (0.23.0-beta.2) #1321

Merged: Aaronontheweb merged 2 commits into dev from fix/docker-nonroot-tools-readonly on Jun 4, 2026

@Aaronontheweb

Summary

Two regressions shipped in 0.23.0-beta.1 via the bind-mount ownership repair (#1281), both rooted in the daemon now running as the non-root netclaw user. This fixes both and bumps to 0.23.0-beta.2.

1. Container crash-loops on a read-only /tools mount

The entrypoint ran ensure_runtime_ownership /tools, which runs a recursive chown -R 1654:1654 /tools and exits with status 1 if the chown fails. When /tools is a read-only mount (a legitimate, common pattern: an immutable, pre-provisioned toolset such as a vendored .NET SDK), the chown fails with EROFS and the container crash-loops.

Observed in the wild (Kubernetes, /tools mounted readOnly: true from an init-container-populated volume):

[entrypoint] ERROR: failed to chown /tools for netclaw (1654:1654).
[entrypoint] If this is a read-only bind mount, make it writable or pre-create it with uid/gid 1654:1654.

Fix: /tools is a PATH directory the agent only ever reads/executes from — it never needs to own or write it. The new ensure_tools_accessible helper is best-effort: it never recursively chowns /tools, only confirms the runtime user can traverse+read it, opportunistically fixes the top-level owner if the mount is writable, and never exits fatally on a read-only mount. The writable data dir (~/.netclaw) keeps its existing strict, fatal ownership repair — that one genuinely must be writable.
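The best-effort behavior described above can be sketched roughly as follows. This is a hypothetical illustration, not the code in docker/entrypoint.sh: the function body, the RUNTIME_UID/RUNTIME_GID variables, and the warning text are assumptions, and the access check runs as the current user rather than as the runtime user.

```shell
#!/bin/sh
# Hypothetical sketch of a best-effort /tools check.
# Assumption: 1654:1654 is the runtime uid/gid, as in the PR description.
RUNTIME_UID="${RUNTIME_UID:-1654}"
RUNTIME_GID="${RUNTIME_GID:-1654}"

ensure_tools_accessible() {
    dir="$1"

    # Nothing mounted at this path: nothing to repair.
    [ -d "$dir" ] || return 0

    # Opportunistic, non-recursive fix of the top-level owner only.
    # A failing chown (read-only mount, or not running as root) is
    # deliberately ignored.
    if [ -w "$dir" ]; then
        chown "$RUNTIME_UID:$RUNTIME_GID" "$dir" 2>/dev/null || true
    fi

    # Simplification: checks access for the current user.
    # Warn instead of exiting when traverse/read is missing.
    if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
        echo "[entrypoint] WARN: $dir is not readable/traversable; tools under it may not resolve." >&2
    fi

    return 0
}
```

The key design point is that every path through the function returns 0, so a read-only mount can no longer abort startup, while the data dir repair stays a separate, fatal check.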

2. Non-root agent can't install tools at runtime

Because the daemon dropped to non-root, it can't run apt-get or write to system PATH dirs (/usr/local/bin, /usr/share/dotnet). Worse, its non-interactive shells never source ~/.bashrc or ~/.profile, so a tool installed into ~/.dotnet wasn't resolvable as a bare command. Net effect: live tool installs (e.g. dotnet-install.sh) were effectively broken.
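The shell behavior behind this regression can be reproduced in isolation. The sketch below uses a throwaway HOME and a made-up executable named faketool (both hypothetical); it shows that a PATH export in ~/.bashrc has no effect on a non-interactive sh -c, whereas a PATH set in the environment does.

```shell
#!/bin/sh
# Throwaway home directory with a tool "installed" under ~/.dotnet.
demo_home="$(mktemp -d)"
mkdir -p "$demo_home/.dotnet"
printf '#!/bin/sh\necho fake\n' > "$demo_home/.dotnet/faketool"
chmod +x "$demo_home/.dotnet/faketool"

# The usual interactive-shell fix: extend PATH from ~/.bashrc.
echo 'export PATH="$HOME/.dotnet:$PATH"' > "$demo_home/.bashrc"

# Non-interactive shell: ~/.bashrc is not sourced, so the tool is not found.
without_path="$(HOME="$demo_home" sh -c 'command -v faketool || echo not-found')"

# Same shell with PATH supplied through the environment: the tool resolves.
with_path="$(HOME="$demo_home" PATH="$demo_home/.dotnet:$PATH" sh -c 'command -v faketool || echo not-found')"

echo "rc file only:    $without_path"
echo "PATH in the env: $with_path"
```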

Fix: the image now pre-creates user-writable, on-PATH install locations and sets a default PATH that includes them:

  • ~/.local/bin (pip --user, generic)
  • ~/.dotnet + ~/.dotnet/tools (dotnet-install.sh default + .NET global tools)
  • /tools/bin (operator/init-container toolsets mounted at /tools)

A runtime-installed dotnet now resolves as a bare dotnet in the agent's shell, no PATH gymnastics required. The misleading Dockerfile comment about runtime apt-get install is corrected to reflect the non-root model.
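As a rough shell equivalent of what the image now does at build time, the sketch below pre-creates the install directories under a temporary home and composes a PATH that includes them. The prepare_install_dirs helper, the ordering of the PATH entries, and the fakedotnet executable are assumptions for illustration; the actual change lives in docker/Dockerfile.

```shell
#!/bin/sh
# Hypothetical helper: create the user-writable install dirs and print
# the PATH prefix that covers them (entry order is an assumption).
prepare_install_dirs() {
    home_dir="$1"
    mkdir -p "$home_dir/.local/bin" "$home_dir/.dotnet/tools"
    printf '%s' "$home_dir/.local/bin:$home_dir/.dotnet:$home_dir/.dotnet/tools:/tools/bin"
}

demo_home="$(mktemp -d)"
agent_path="$(prepare_install_dirs "$demo_home"):$PATH"

# Simulate a runtime install by dropping an executable into ~/.dotnet.
printf '#!/bin/sh\necho "fake dotnet"\n' > "$demo_home/.dotnet/fakedotnet"
chmod +x "$demo_home/.dotnet/fakedotnet"

# A non-interactive shell started with the default PATH resolves the
# tool as a bare command, with no rc file involved.
resolved="$(PATH="$agent_path" sh -c 'command -v fakedotnet')"
echo "$resolved"
```

Because the directories already exist and are already on PATH, an installer only has to write files into them; nothing needs to edit shell startup files afterwards.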

Why keep non-root at all?

Running an arbitrary-command agent as non-root is the right posture — it caps the blast radius of prompt-injection-driven command execution. These were implementation bugs in how the drop was rolled out, not a reason to revert it. This PR makes the non-root model actually usable.

Changes

  • docker/entrypoint.sh: ensure_tools_accessible for /tools (non-fatal, non-recursive); data dir repair unchanged.
  • docker/Dockerfile: pre-create ~/.local/bin, ~/.dotnet/tools; default PATH includes the user-writable + /tools/bin locations; corrected runtime-install comment.
  • Directory.Build.props + RELEASE_NOTES.md: version → 0.23.0-beta.2.

Test plan

  • validate_docker_image smoke test passes (normal startup unaffected).
  • Manual: mount /tools read-only → container starts (no crash), WARN logged instead of fatal.
  • Manual: dotnet-install.sh into ~/.dotnet → bare dotnet --info resolves in a non-interactive shell.

Release

Tag 0.23.0-beta.2 after merge to fire publish_release_binaries.yml (builds binaries, creates the GitHub prerelease, moves the floating :beta Docker tag).

…table on-PATH install dirs

The bind-mount ownership repair added in 0.23.0-beta.1 (#1281) ran a
recursive chown over /tools and aborted fatally on a read-only mount,
crash-looping the container. /tools is a PATH directory the agent only
reads from, so the entrypoint now treats it as best-effort: never
recursive-chowns it, never fatal on a read-only/already-correct mount.

#1281 also dropped the daemon to the non-root netclaw user, which left
no writable directory on its PATH for runtime tool installs (the agent
can't apt-get or write system dirs). The image now ships writable,
on-PATH locations (~/.local/bin, ~/.dotnet, ~/.dotnet/tools, /tools/bin)
and a default PATH that includes them, so a runtime-installed dotnet /
pip --user tool / .NET global tool resolves as a bare command in the
agent's non-interactive shell.

Bumps to 0.23.0-beta.2.

Aaronontheweb merged commit d9fad10 into dev on Jun 4, 2026
21 checks passed
Aaronontheweb deleted the fix/docker-nonroot-tools-readonly branch on June 4, 2026 at 04:59