
fix(platform): detect Colima docker.sock at top-level path (#3503) #3505

Merged
cv merged 1 commit into main from fix/3503-colima-docker-socket on May 14, 2026

Conversation

@jason-ma-nv commented May 14, 2026

Contributor

Summary

nemoclaw onboard fails on macOS + Colima setups whose Docker socket lives at ~/.colima/docker.sock rather than the default ~/.colima/default/docker.sock. Detection returned null, DOCKER_HOST stayed unset, and the gateway fell back to its hardcoded /var/run/docker.sock and aborted with FailedPrecondition. This adds the top-level path to the Colima candidate list so onboard works without the symlink workaround.
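
The detection order described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/lib/platform.ts: `colimaSocketCandidates` and `detectColimaDockerHost` are hypothetical names, the candidate ordering is an assumption, and the file-existence check is injected so the sketch performs no real I/O.

```typescript
import * as path from "node:path";

// Hypothetical helper: candidate Colima socket paths, checked in order.
// Assumption: the default-profile path is tried before the top-level one.
function colimaSocketCandidates(home: string): string[] {
  return [
    path.posix.join(home, ".colima", "default", "docker.sock"),
    path.posix.join(home, ".colima", "docker.sock"), // top-level layout from #3503
  ];
}

// Returns a unix:// Docker host for the first candidate that exists, or null.
function detectColimaDockerHost(
  home: string,
  exists: (p: string) => boolean,
): string | null {
  const found = colimaSocketCandidates(home).find(exists);
  return found ? `unix://${found}` : null;
}

// Usage: only the top-level socket exists, as in the failing setups.
const host = detectColimaDockerHost(
  "/Users/me",
  (p) => p === "/Users/me/.colima/docker.sock",
);
console.log(host);
```

With only the default-profile path in the list, the same call would return null, which matches the failure mode described in this PR.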

Related Issue

Fixes #3503.

Changes

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Notes on verification

  • prek was scoped to the changed files (npx prek run --files src/lib/platform.ts test/platform.test.ts), passes.
  • The cli npm test project surfaced two pre-existing timeouts in test/cli.test.ts for the NEMOCLAW_CLEANUP_GATEWAY tests at the default 5000ms — same tests take ~5750ms / ~6345ms on unmodified main, so this is pre-existing flake unrelated to this change. Running the same tests at --testTimeout 15000 passes on both main and this branch.
  • npm run typecheck:cli passes.

Out of scope / follow-up

For Colima users with custom profile names (e.g. colima start --profile foo), the socket would live at ~/.colima/foo/docker.sock — which still isn't in the candidate list. The general fix is to probe docker context inspect --format '{{.Endpoints.docker.Host}}' for the authoritative endpoint. That introduces a subprocess call from platform.ts (currently I/O-free) which collides with test/docker-abstraction-guard.test.ts and the SSRF subprocess assertion in test/config-set-nested-ssrf.test.ts. The right home for that probe is the onboard gateway-launch path, not module-load time in runner.ts — filing as a follow-up.
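
A rough sketch of what such a probe could look like, assuming it lives outside platform.ts. `probeDockerContextHost` and `RunCommand` are hypothetical names, and the command runner is injectable so the probe can be tested without a Docker CLI; only the `docker context inspect` invocation itself comes from the text above.

```typescript
import { execFileSync } from "node:child_process";

// Runs a command and returns its stdout; injectable so tests need no Docker.
type RunCommand = (cmd: string, args: string[]) => string;

// Default runner: a real subprocess call with a short timeout (assumed value).
const runWithExecFile: RunCommand = (cmd, args) =>
  execFileSync(cmd, args, { encoding: "utf8", timeout: 5000 });

// Hypothetical probe: ask the Docker CLI for the active context's endpoint.
// Returns null if the CLI is missing, the call fails, or the output is empty.
function probeDockerContextHost(
  run: RunCommand = runWithExecFile,
): string | null {
  try {
    const out = run("docker", [
      "context",
      "inspect",
      "--format",
      "{{.Endpoints.docker.Host}}",
    ]).trim();
    return out.length > 0 ? out : null;
  } catch {
    return null;
  }
}
```

Passing the arguments as an array to `execFileSync` avoids a shell, so the Go template braces need no quoting.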


Signed-off-by: jason <jama@nvidia.com>

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced Docker host socket detection for Colima environments to recognize additional socket path configurations. The system now successfully detects Docker sockets across various Colima layouts, improving compatibility and eliminating the need for manual workarounds in certain installation scenarios.


`nemoclaw onboard` failed on macOS + Colima for setups whose docker
socket lives at ~/.colima/docker.sock (not under default/). Detection
returned null, DOCKER_HOST stayed unset, and openshell-gateway fell
back to its hardcoded /var/run/docker.sock and aborted with
FailedPrecondition.

Add the top-level path to the Colima candidate list so onboard works
without the symlink workaround. Includes a regression test that asserts
the new layout is discovered.
@jason-ma-nv jason-ma-nv self-assigned this May 14, 2026
@coderabbitai

coderabbitai Bot commented May 14, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4b51de24-6f83-4c25-afab-ecf789843608

📥 Commits

Reviewing files that changed from the base of the PR and between a78ea16 and 989c681.

📒 Files selected for processing (2)
  • src/lib/platform.ts
  • test/platform.test.ts

📝 Walkthrough

The PR extends Docker host socket detection on macOS to support Colima's top-level socket layout (~/.colima/docker.sock) as a fallback candidate, alongside existing nested-directory paths. Implementation, test expectations, and a regression test are aligned to validate detection succeeds without symlink workarounds.

Changes

Colima Socket Detection Expansion

| Layer / File(s) | Summary |
| --- | --- |
| Colima socket path detection and tests (src/lib/platform.ts, test/platform.test.ts) | Implementation adds ~/.colima/docker.sock as a fallback candidate in getColimaDockerSocketCandidates(). Test expectations updated to include the new path, and a new regression test (for #3503) validates that the bare Colima layout is correctly detected and returned as a unix:// docker host. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Suggested labels

Docker, fix, bug

Suggested reviewers

  • cv

Poem

🐰 A socket found in Colima's home,
No symlink needed, no more to roam!
From nested deep to top-level flat,
The gateway discovers where Docker sat. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and clearly describes the main change: adding detection for Colima's docker socket at the top-level path (~/.colima/docker.sock). |
| Linked Issues check | ✅ Passed | The PR successfully implements the fix for issue #3503 by adding detection for ~/.colima/docker.sock at the top level, extending the socket candidate list as required. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fixing issue #3503: adding Colima socket detection and corresponding regression test, with no unrelated modifications. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.



@github-actions

Contributor

E2E Advisor Recommendation

Required E2E: macos-e2e
Optional E2E: wsl-e2e, test-e2e-ollama-proxy

Dispatch hint: macos-e2e

Workflow run

Full advisor summary

Pi Semantic E2E Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • macos-e2e (medium): The change is macOS-specific socket discovery that feeds DOCKER_HOST for every CLI spawn through src/lib/runner.ts. The macOS E2E workflow exercises the full install → build → vitest → optional Docker-backed full-e2e path on a real macOS runner, and is the only existing CI surface that meaningfully validates that the augmented Colima/Docker candidate ordering still resolves a working Docker host before onboarding/sandbox provisioning. It also already auto-triggers on src/** changes, so requiring it is consistent with current path filters.

Optional E2E

  • wsl-e2e (medium): The diff does not modify Linux/WSL candidates or the isWsl() guard, but shouldPatchCoredns and detectDockerHost are shared code paths. WSL E2E is a low-cost confidence check that the refactor of getColimaDockerSocketCandidates didn't perturb non-macOS resolution.
  • test-e2e-ollama-proxy (low): Lightweight Linux smoke that imports compiled dist artifacts including ./platform; a free signal that the build still emits a usable platform module after the edit.

New E2E recommendations

  • container-runtime-detection (low): There is no scenario or shell-level E2E that asserts detectDockerHost() actually resolves a non-DOCKER_HOST socket on a runner where Colima is the active runtime. macos-e2e in CI uses Docker Desktop, so the new ~/.colima/docker.sock branch is only exercised by vitest with mocked existsSync. A real Colima-backed scenario would close that gap.
    • Suggested test: Add a macos-colima-onboard scenario under test/e2e/nemoclaw_scenarios that installs Colima, starts it with the bare ~/.colima/docker.sock layout, runs install.sh + onboard, and asserts DOCKER_HOST resolves to the Colima socket (not /var/run/docker.sock).
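
The final assertion in the suggested scenario could be expressed along these lines. This is only a sketch: `isColimaDockerHost` is a hypothetical helper, not part of the existing test suite, and it assumes the resolved value is a unix:// URL pointing at a docker.sock somewhere under ~/.colima.

```typescript
import * as path from "node:path";

// Hypothetical check for an E2E scenario: does DOCKER_HOST point at a unix
// socket under ~/.colima rather than the /var/run/docker.sock fallback?
function isColimaDockerHost(
  dockerHost: string | undefined,
  home: string,
): boolean {
  if (!dockerHost || !dockerHost.startsWith("unix://")) return false;
  // Strip the scheme and normalize so "a/../b" style paths cannot slip through.
  const socketPath = path.posix.normalize(dockerHost.slice("unix://".length));
  const colimaRoot = path.posix.join(home, ".colima") + "/";
  return (
    socketPath.startsWith(colimaRoot) && socketPath.endsWith("/docker.sock")
  );
}
```

A scenario would call it with `process.env.DOCKER_HOST` and the runner's home directory after onboarding completes.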

Dispatch hint

  • Workflow: macos-e2e.yaml
  • jobs input: macos-e2e

@cv cv added the v0.0.42 label May 14, 2026
@cv cv merged commit 25a9ee9 into main May 14, 2026
30 checks passed
@miyoungc miyoungc mentioned this pull request May 14, 2026
12 tasks
miyoungc added a commit that referenced this pull request May 14, 2026
## Summary
Refreshes the NemoClaw documentation for the local `main` changes
included in the 0.0.42 release. The update adds release notes, updates
the affected user-facing setup and troubleshooting pages, bumps docs
metadata to 0.0.42, and regenerates the matching user skills.

## Changes
- #3537 -> `docs/reference/commands.md`,
`docs/reference/troubleshooting.md`: Documented host-level status
fields, cloudflared state-specific recovery hints, and Local Ollama auth
proxy status diagnostics.
- #3454 -> `docs/get-started/prerequisites.md`,
`docs/get-started/quickstart.md`: Documented macOS Docker-driver
onboarding and removed the expectation that standard macOS setup needs a
VM driver helper.
- #3514 -> `docs/inference/use-local-inference.md`: Documented
compatible-endpoint retry behavior for reasoning-only smoke responses.
- #3448 -> `docs/reference/commands.md`,
`docs/manage-sandboxes/messaging-channels.md`: Documented canonical
channel names and policy preset hints after `channels add`.
- #3520 -> `docs/about/release-notes.md`: Captured clearer GPU recovery
and uninstall wording in the 0.0.42 release notes.
- #3313 -> `docs/get-started/quickstart.md`,
`docs/reference/troubleshooting.md`: Documented stronger dashboard port
detection and rollback when a forward cannot start.
- #3502 -> `docs/about/release-notes.md`: Captured batched onboarding
policy preset application in the 0.0.42 release notes.
- #3505 -> `docs/reference/troubleshooting.md`: Documented the top-level
Colima socket path.
- #3421 -> `docs/about/release-notes.md`: Captured idempotent installer
shim logging in the 0.0.42 release notes.
- Updated `docs/project.json`, `docs/versions1.json`, and regenerated
`.agents/skills/nemoclaw-user-*` outputs.

## Type of Change
- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [x] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification
- [ ] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [x] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

---
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes - v0.0.42

* **Documentation**
  * Enhanced macOS onboarding guidance for Docker gateway setup
  * Improved dashboard port conflict handling with automatic rollback
  * Better local Ollama inference diagnostics and authentication proxy checks
  * Clarified status command output and recovery procedures
  * Refined messaging channel setup documentation

* **Chores**
  * Version bump to 0.0.42

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: Carlos Villela <cvillela@nvidia.com>
@wscurran wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026

Labels

bug-fix PR fixes a bug or regression

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[macOS][Onboard] nemoclaw onboard fails on Colima — openshell-gateway hardcodes /var/run/docker.sock, ignores DOCKER_HOST

3 participants