Skip to content

fix(cli): add sudo to lsof and kill hints in onboarding port check#734

Closed
tienedev wants to merge 2 commits into
NVIDIA:mainfrom
tienedev:fix/726-lsof-sudo-onboarding
Closed

fix(cli): add sudo to lsof and kill hints in onboarding port check#734
tienedev wants to merge 2 commits into
NVIDIA:mainfrom
tienedev:fix/726-lsof-sudo-onboarding

Conversation

@tienedev

@tienedev tienedev commented Mar 23, 2026

Copy link
Copy Markdown

Summary

Non-root users cannot see root-owned listeners via lsof, so the port-conflict hints shown during onboarding silently fail. This PR adds sudo to the lsof and kill hints displayed to users, and updates the troubleshooting docs to match.

The programmatic lsof call in preflight.js is intentionally left without sudo — adding it there would trigger an interactive password prompt during onboarding, which is a worse UX. The current approach (try without sudo, fall through gracefully, then suggest the user run it manually with sudo) is the right tradeoff.

Related Issue

Fixes #726

Changes

  • Extracted inline hint logic into a dedicated formatPortConflictHint() helper in onboard.js for testability
  • Added sudo prefix to lsof and kill commands in the user-facing hints (both "unknown process" and "known process without PID" paths)
  • Updated docs/reference/troubleshooting.md with sudo commands
  • Regenerated .agents/skills/docs/ via docs-to-skills.py
  • Added 3 unit tests covering all hint branches

Type of Change

  • Code change with doc updates.

Testing

  • npm test passes.
  • 3 new tests added in test/onboard.test.js covering all formatPortConflictHint() branches: unknown process, known with PID, known without PID.

Checklist

General

Code Changes

  • Tests added or updated for new or changed behavior.
  • No secrets, API keys, or credentials committed.
  • Doc pages updated for any user-facing behavior changes.

Doc Changes

  • Cross-references and links verified.

Summary by CodeRabbit

  • Documentation

    • Added comprehensive guide for backing up and restoring workspace files
    • Added reference documentation describing workspace Markdown files and persistence behavior
    • Added warning callouts about permanent file deletion when destroying sandboxes
  • Bug Fixes

    • Updated port conflict troubleshooting commands to use elevated privileges for improved reliability
  • Tests

    • Added unit tests for port conflict error message formatting

tienedev and others added 2 commits March 23, 2026 19:26
…ixes #726)

Non-root users cannot see root-owned listeners with lsof, so the
onboarding error message now suggests `sudo lsof` and `sudo kill`
instead of the unprivileged variants. The troubleshooting docs are
updated to match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Mar 23, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

This pull request addresses a port conflict issue by adding sudo requirements to troubleshooting commands, refactoring port conflict hint generation into a reusable helper function with unit tests, and adding comprehensive documentation for workspace files, backup procedures, and the destroy command's permanent deletion behavior.

Changes

Cohort / File(s) Summary
Troubleshooting Documentation
docs/reference/troubleshooting.md, .agents/skills/docs/nemoclaw-reference/references/troubleshooting.md
Updated port conflict identification and termination commands to require sudo prefix for lsof and kill operations.
Workspace Documentation
.agents/skills/docs/nemoclaw-workspace/SKILL.md, .agents/skills/docs/nemoclaw-workspace/references/workspace-files.md, .agents/skills/docs/nemoclaw-reference/references/commands.md
Added new documentation files explaining workspace Markdown files, their location, persistence behavior, and manual/scripted backup/restore procedures. Added warning callout to nemoclaw destroy command about permanent file deletion.
Port Conflict Helper Refactoring
bin/lib/onboard.js
Extracted inline port conflict hint generation logic into new exported formatPortConflictHint() helper function that formats conflict metadata into human-readable instructions with appropriate sudo commands.
Unit Tests
test/onboard.test.js
Added three test cases covering formatPortConflictHint() behavior for unknown processes, known processes with PIDs, and known processes without PIDs.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

Poem

🐰 Hops of wisdom, bound in care,
Sudo commands float through the air!
Workspace treasures tucked away safe,
With backups made, no need for haste. 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly corresponds to the main code change: adding sudo to lsof and kill hints in the onboarding port check logic.
Linked Issues check ✅ Passed The code changes comprehensively address issue #726: sudo added to lsof and kill commands in user-facing hints, new helper function extracted for testability, and accompanying documentation and tests provided.
Out of Scope Changes check ✅ Passed All changes are in scope: code updates to address the issue, documentation updates for both user-facing hints and troubleshooting references, and comprehensive test coverage for the new helper function.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
docs/reference/troubleshooting.md (1)

98-103: ⚠️ Potential issue | 🟡 Minor

Missing sudo on the force-terminate command.

Line 102 mentions kill -9 <PID> without sudo, but if sudo kill <PID> is required (line 99), then sudo kill -9 <PID> should also be required for consistency when force-terminating root-owned processes.

📝 Suggested fix
-If the process does not exit, use `kill -9 <PID>` to force-terminate it.
+If the process does not exit, use `sudo kill -9 <PID>` to force-terminate it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/troubleshooting.md` around lines 98 - 103, The troubleshooting
doc shows "kill -9 <PID>" without sudo while earlier using "sudo kill <PID>";
update the force-terminate example so it consistently uses sudo (i.e., change
"kill -9 <PID>" to "sudo kill -9 <PID>") near the "sudo lsof -i :18789" and
"sudo kill <PID>" lines to ensure commands will work for root-owned processes.
.agents/skills/docs/nemoclaw-reference/references/troubleshooting.md (1)

72-77: ⚠️ Potential issue | 🟡 Minor

Missing sudo on the force-terminate command (same as main docs).

Line 76 mentions kill -9 <PID> without sudo. For consistency with the updated sudo kill <PID> on line 73, this should also use sudo kill -9 <PID>.

📝 Suggested fix
-If the process does not exit, use `kill -9 <PID>` to force-terminate it.
+If the process does not exit, use `sudo kill -9 <PID>` to force-terminate it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/skills/docs/nemoclaw-reference/references/troubleshooting.md around
lines 72 - 77, Update the force-terminate command to use sudo for consistency:
change the plain "kill -9 <PID>" mentioned in the troubleshooting text to "sudo
kill -9 <PID>" so it matches the earlier "sudo kill <PID>" usage; edit the
string in the troubleshooting paragraph (the line containing the kill -9
instruction) accordingly.
🧹 Nitpick comments (2)
bin/lib/onboard.js (1)

323-346: Minor inconsistency in lsof flags between branches.

Line 337 uses sudo lsof -i :${port} -sTCP:LISTEN -P -n (with -P -n flags for numeric output), while line 343 uses sudo lsof -i :${port} -sTCP:LISTEN (without -P -n). Consider using consistent flags in both branches for a uniform user experience.

🔧 Suggested fix for consistency
   } else {
     lines.push(`     Could not identify the process using port ${port}.`);
-    lines.push(`     Run: sudo lsof -i :${port} -sTCP:LISTEN`);
+    lines.push(`     Run: sudo lsof -i :${port} -sTCP:LISTEN -P -n`);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 323 - 346, The hint builder
formatPortConflictHint has inconsistent lsof flags: it uses "sudo lsof -i
:${port} -sTCP:LISTEN -P -n" when portCheck.process is known but omits "-P -n"
in the fallback branch; update the fallback branch to include the same "-P -n"
flags so both branches produce the same lsof command and consistent numeric
output (modify the string in the else branch where it currently returns "sudo
lsof -i :${port} -sTCP:LISTEN" to include "-P -n").
.agents/skills/docs/nemoclaw-workspace/SKILL.md (1)

88-99: Resolve markdownlint MD014 warnings in command examples.

$ prompts are used without command output in several console blocks, which matches the lint warnings and will keep docs lint noisy/failing depending on CI policy.

🧹 One clean approach (switch to bash blocks for command-only snippets)
-```console
-$ SANDBOX=my-assistant
-$ BACKUP_DIR=~/.nemoclaw/backups/$(date +%Y%m%d-%H%M%S)
-$ mkdir -p "$BACKUP_DIR"
+```bash
+SANDBOX=my-assistant
+BACKUP_DIR=~/.nemoclaw/backups/$(date +%Y%m%d-%H%M%S)
+mkdir -p "$BACKUP_DIR"
 ...
-```
+```

Also applies to: 105-115, 133-135, 139-141, 162-163

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/skills/docs/nemoclaw-workspace/SKILL.md around lines 88 - 99, The
console code blocks in SKILL.md use ```console with leading "$ " prompts (e.g.,
lines containing "$ SANDBOX=...", "$ BACKUP_DIR=...", "$ mkdir -p ..."), which
triggers markdownlint MD014; change those ```console fences to ```bash and
remove the leading "$ " prompt prefixes in each command example (including the
other affected blocks around the examples referenced) so the blocks contain only
raw commands like SANDBOX=..., BACKUP_DIR=..., mkdir -p ...; ensure all similar
occurrences (the blocks around the other ranges noted) are updated consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.agents/skills/docs/nemoclaw-reference/references/commands.md:
- Around line 85-87: Replace the plain-text reference "Back Up and Restore (see
the `nemoclaw-workspace` skill)" with an explicit Markdown link that points to
the exact Back Up and Restore section or the nemoclaw-workspace skill doc;
update the sentence in commands.md (the warning block containing "Destroying a
sandbox..." and the phrase "Back Up and Restore") to use the Markdown link
syntax [Back Up and
Restore](<URL-or-relative-path-to-nemoclaw-workspace#back-up-and-restore>) so
users can click directly to the relevant section, keeping the
`nemoclaw-workspace` inline code mention if desired.

In @.agents/skills/docs/nemoclaw-workspace/SKILL.md:
- Line 3: Update the frontmatter "description" field in SKILL.md to fix grammar
and clarity: replace the current broken string with a concise sentence such as
"How to back up and restore OpenClaw workspace files before destructive
operations; explains what workspace files are, where they live, and how they
persist across sandbox restarts. Use for agents.md, backup/restore, identity.md,
memory.md, nemoclaw, nemoclaw backup, and nemoclaw restore." Ensure you modify
the description key in the file's frontmatter (the "description" field) to this
improved wording.

---

Outside diff comments:
In @.agents/skills/docs/nemoclaw-reference/references/troubleshooting.md:
- Around line 72-77: Update the force-terminate command to use sudo for
consistency: change the plain "kill -9 <PID>" mentioned in the troubleshooting
text to "sudo kill -9 <PID>" so it matches the earlier "sudo kill <PID>" usage;
edit the string in the troubleshooting paragraph (the line containing the kill
-9 instruction) accordingly.

In `@docs/reference/troubleshooting.md`:
- Around line 98-103: The troubleshooting doc shows "kill -9 <PID>" without sudo
while earlier using "sudo kill <PID>"; update the force-terminate example so it
consistently uses sudo (i.e., change "kill -9 <PID>" to "sudo kill -9 <PID>")
near the "sudo lsof -i :18789" and "sudo kill <PID>" lines to ensure commands
will work for root-owned processes.

---

Nitpick comments:
In @.agents/skills/docs/nemoclaw-workspace/SKILL.md:
- Around line 88-99: The console code blocks in SKILL.md use ```console with
leading "$ " prompts (e.g., lines containing "$ SANDBOX=...", "$
BACKUP_DIR=...", "$ mkdir -p ..."), which triggers markdownlint MD014; change
those ```console fences to ```bash and remove the leading "$ " prompt prefixes
in each command example (including the other affected blocks around the examples
referenced) so the blocks contain only raw commands like SANDBOX=...,
BACKUP_DIR=..., mkdir -p ...; ensure all similar occurrences (the blocks around
the other ranges noted) are updated consistently.

In `@bin/lib/onboard.js`:
- Around line 323-346: The hint builder formatPortConflictHint has inconsistent
lsof flags: it uses "sudo lsof -i :${port} -sTCP:LISTEN -P -n" when
portCheck.process is known but omits "-P -n" in the fallback branch; update the
fallback branch to include the same "-P -n" flags so both branches produce the
same lsof command and consistent numeric output (modify the string in the else
branch where it currently returns "sudo lsof -i :${port} -sTCP:LISTEN" to
include "-P -n").

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6a7f58d9-6a96-4dca-9b28-a270fccc0942

📥 Commits

Reviewing files that changed from the base of the PR and between 5ac3575 and 92117bc.

📒 Files selected for processing (7)
  • .agents/skills/docs/nemoclaw-reference/references/commands.md
  • .agents/skills/docs/nemoclaw-reference/references/troubleshooting.md
  • .agents/skills/docs/nemoclaw-workspace/SKILL.md
  • .agents/skills/docs/nemoclaw-workspace/references/workspace-files.md
  • bin/lib/onboard.js
  • docs/reference/troubleshooting.md
  • test/onboard.test.js

Comment on lines +85 to +87
> **Warning:** Destroying a sandbox permanently deletes all files inside it, including
> workspace files (see the `nemoclaw-workspace` skill) (SOUL.md, USER.md, IDENTITY.md, AGENTS.md, MEMORY.md, and daily memory notes).
> Back up your workspace first by following the instructions at Back Up and Restore (see the `nemoclaw-workspace` skill).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Make “Back Up and Restore” an explicit Markdown link.

The warning is useful, but the current plain-text reference is harder to navigate than a direct link to the exact section/skill doc.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/skills/docs/nemoclaw-reference/references/commands.md around lines
85 - 87, Replace the plain-text reference "Back Up and Restore (see the
`nemoclaw-workspace` skill)" with an explicit Markdown link that points to the
exact Back Up and Restore section or the nemoclaw-workspace skill doc; update
the sentence in commands.md (the warning block containing "Destroying a
sandbox..." and the phrase "Back Up and Restore") to use the Markdown link
syntax [Back Up and
Restore](<URL-or-relative-path-to-nemoclaw-workspace#back-up-and-restore>) so
users can click directly to the relevant section, keeping the
`nemoclaw-workspace` inline code mention if desired.

@@ -0,0 +1,169 @@
---
name: nemoclaw-workspace
description: Hows to back up and restore OpenClaw workspace files before destructive operations. Also covers whats workspace files are, where they live, and how they persist across sandbox restarts. Use when agents.md, back restore workspace files, backup, identity.md, memory.md, nemoclaw, nemoclaw backup, nemoclaw restore.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix frontmatter description grammar and phrasing.

The description string has multiple typos/awkward phrases (Hows, whats, back restore workspace files) that reduce discoverability and polish in generated docs.

✏️ Proposed wording
-description: Hows to back up and restore OpenClaw workspace files before destructive operations. Also covers whats workspace files are, where they live, and how they persist across sandbox restarts. Use when agents.md, back restore workspace files, backup, identity.md, memory.md, nemoclaw, nemoclaw backup, nemoclaw restore.
+description: How to back up and restore OpenClaw workspace files before destructive operations. Also covers what workspace files are, where they live, and how they persist across sandbox restarts. Use when querying AGENTS.md, backup/restore workspace files, IDENTITY.md, MEMORY.md, nemoclaw, nemoclaw backup, or nemoclaw restore.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.agents/skills/docs/nemoclaw-workspace/SKILL.md at line 3, Update the
frontmatter "description" field in SKILL.md to fix grammar and clarity: replace
the current broken string with a concise sentence such as "How to back up and
restore OpenClaw workspace files before destructive operations; explains what
workspace files are, where they live, and how they persist across sandbox
restarts. Use for agents.md, backup/restore, identity.md, memory.md, nemoclaw,
nemoclaw backup, and nemoclaw restore." Ensure you modify the description key in
the file's frontmatter (the "description" field) to this improved wording.

@tienedev tienedev closed this Mar 23, 2026
@tienedev tienedev deleted the fix/726-lsof-sudo-onboarding branch March 23, 2026 21:14
cjagwani added a commit to kagura-agent/NemoClaw that referenced this pull request Jun 3, 2026
…IDIA#734/NVIDIA#735)

CodeQL flagged the symlink-target paths in the two symlink-skip tests as
insecure temp file creation (predictable `os.tmpdir() + pid` paths let a
coresident attacker pre-create the target). Switch both to the existing
`makeTempDir()` helper, which uses `mkdtempSync` for an unguessable
directory name. Cleanup also simplifies to the afterEach hook since both
temp dirs are now tracked in `tmpDirs`.

Tests still pass (16/16).

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
@wscurran wscurran added the bug-fix PR fixes a bug or regression label Jun 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug-fix PR fixes a bug or regression

Projects

None yet

Development

Successfully merging this pull request may close these issues.

lsof -i:8080 command on onboarding needs sudo prefix to work properly

2 participants