fix(channels): normalize Slack runtime env placeholders before OpenClaw start (#4274)#4660
Conversation
…aw start After a messaging-provider rebuild, OpenShell injects Slack credentials into the sandbox process environment as canonical resolve placeholders, e.g. SLACK_BOT_TOKEN=openshell:resolve:env:v51_SLACK_BOT_TOKEN. Slack Bolt validates token shape at startup and rejects anything that does not begin with xoxb-/xapp-, so the gateway inherits a value it cannot parse and Slack auth fails even though the provider attached successfully (NVIDIA#4274). Add normalize_slack_runtime_env() to scripts/nemoclaw-start.sh and call it in both the non-root and root launch paths, before any child (the one-shot "${NEMOCLAW_CMD[@]}" exec or the stepped-down gateway) inherits the env. It rewrites self-referential openshell:resolve:env:*SLACK_*_TOKEN placeholders to the Bolt-compatible alias the L7 egress proxy already resolves — the same alias the config generator bakes into openclaw.json. Real tokens and already-aliased values are left untouched, so the export is idempotent and safe to call unconditionally. gosu/setpriv preserve the environment, so the export reaches the gateway user; no raw token is written to disk. Also extend refresh_openclaw_provider_placeholders diagnostics to warn when a Slack config alias is present but the runtime env is missing or holds a value that is neither an OpenShell placeholder nor a plausibly-real Slack token. Covered by focused regression tests that drive the real shell functions: process-env normalization (revision-scoped and canonical placeholders, idempotency, real-token pass-through, mismatched-key safety, no-leak of the revision suffix) and the config-vs-env diagnostics. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughAdds Slack placeholder diagnostics and a normalization routine that converts OpenShell Slack runtime placeholders into Bolt-compatible xoxb-/xapp- aliases, integrates normalization into both startup paths so child processes inherit normalized env, and expands tests to cover diagnostics and normalization behavior. ChangesSlack Runtime Environment Normalization
Sequence DiagramsequenceDiagram
participant Start as nemoclaw-start.sh
participant Normalize as normalize_slack_runtime_env
participant Env as Runtime Environment
participant Gateway as Child Process/Gateway
Start->>Env: Read SLACK_BOT_TOKEN / SLACK_APP_TOKEN
Env-->>Start: value (placeholder or token)
Start->>Normalize: invoke normalization
Normalize->>Env: detect canonical OpenShell placeholder
Normalize->>Env: export Bolt-compatible alias (xoxb-/xapp-OPENSHELL-RESOLVE-ENV-*)
Start->>Gateway: child inherits normalized env
Gateway->>Gateway: Slack Bolt validates token
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 1110-1116: The code currently treats any value starting with
prefix (e.g. "openshell:resolve:env:") as a valid placeholder; change this to
only accept the exact placeholder for the expected env key or a revision-scoped
form for that same key. Concretely, in the branch that checks
env_value.startswith(prefix) before suppressing the warning, replace that
permissive check with logic that accepts only (1) env_value ==
f"{prefix}{env_key}" or (2) env_value.startswith(prefix) and
env_value.endswith(":" + env_key) (to allow a revision-scoped placeholder). Keep
the existing token_scheme check and warnings.append call (referenced symbols:
env_value, prefix, token_scheme, env_key, warnings.append, label) so that
mismatched OpenShell placeholders like "openshell:resolve:env:OTHER_KEY" no
longer suppress the Slack token warning.
- Around line 1176-1185: The case patterns that check SLACK_BOT_TOKEN and
SLACK_APP_TOKEN are too loose (they match any var ending in the suffix); tighten
them so only the canonical key and the revision-scoped key are matched — e.g.
instead of "$prefix"*SLACK_BOT_TOKEN use explicit case patterns for
"$prefix:SLACK_BOT_TOKEN" and "$prefix:v"*"_SLACK_BOT_TOKEN" (and likewise for
SLACK_APP_TOKEN), or replace the case with a guarded regex check ([[ "${var}" =~
^${prefix}(:v[0-9]+_)?SLACK_BOT_TOKEN$ ]]) to only accept the exact forms before
rebinding SLACK_BOT_TOKEN / SLACK_APP_TOKEN.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 0e179bf3-87a5-49ca-b9a5-5871b9ff1f01
📒 Files selected for processing (2)
scripts/nemoclaw-start.shtest/nemoclaw-start.test.ts
Address CodeRabbit review on NVIDIA#4660: both the normalization match and the diagnostic suppression were too permissive. - normalize_slack_runtime_env: the glob "$prefix"*SLACK_BOT_TOKEN matched any value merely ending in the suffix (e.g. ...v51_NOT_SLACK_BOT_TOKEN, since the glob spans underscores), so an unrelated key could be silently rebound to the Bolt alias. Replace with an anchored regex ^openshell:resolve:env:(v[0-9]+_)?SLACK_BOT_TOKEN$ that accepts only the canonical key or its revision-scoped form. - refresh_openclaw_provider_placeholders Slack diagnostic: the suppression treated any openshell:resolve:env:* value as a healthy placeholder, so a wrong-key placeholder (openshell:resolve:env:OTHER_KEY) looked fine even though Bolt would inherit a non-Slack placeholder and fail. Match the same anchored shape for the expected key before suppressing the warning. Add regression tests for the suffix-collision key and the wrong-key runtime placeholder. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
prekshivyas
left a comment
There was a problem hiding this comment.
APPROVE.
Verified normalize_slack_runtime_env() is wired into both production launch paths (scripts/nemoclaw-start.sh:2749 non-root, :2895 root), in the main shell before any child inherits the env — not dead code. Correctly scoped: it normalizes the runtime env (which Bolt shape-checks) rather than the on-disk config, matching the narrowed root cause in #4274 (env holds openshell:resolve:env:v<rev>_SLACK_BOT_TOKEN, which Bolt rejects as non-xoxb-).
Tests drive the real shell + Python via spawnSync("bash") and cover the full edge matrix: revision-scoped, canonical, idempotent re-run, real-token passthrough, unset var, wrong-key placeholder, suffix-collision, and a no-secret-suffix-leak assertion. Security: the alias is excused by the on-disk tripwire's negative lookahead (:1211), no raw secret hits disk, and warnings log the key name not the value. Both CodeRabbit threads resolved in head e06254d. CI green.
Signed-off-by: Prekshi Vyas prekshiv@nvidia.com
## Summary - Add the missing `v0.0.57` release-notes section with links to the detailed docs pages for command, inference, onboarding, messaging, status, installer, and policy changes. - Remove public references to docs-skip terms from source docs and regenerate the NemoClaw user skills from the current Fern MDX docs. - Carry forward generated references for the per-agent documentation split, including Hermes-specific reference files. ## Source summary - #4615 and #4653 -> `docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release notes now cover host-side `sessions` and `agents` commands plus `NEMOCLAW_EXTRA_AGENTS_JSON` secondary-agent baking. - #4163, #4204, #4611, #4619, and #4676 -> `docs/about/release-notes.mdx`, `docs/inference/use-local-inference.mdx`: Release notes now cover managed vLLM progress/readiness, DGX Spark model default changes, local Ollama streaming usage, and inference route divergence warnings. - #4267, #4601, #4609, #4642, #4645, and #4661 -> `docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release notes now cover UFW auto-remediation, local-inference reachability gates, gateway reuse/binding, cancel rollback, and policy selection persistence. - #4577, #4582, #4607, and #4660 -> `docs/about/release-notes.mdx`, `docs/manage-sandboxes/messaging-channels.mdx`: Release notes now cover Slack validation, atomic `channels add`, WhatsApp QR diagnostics, and Slack placeholder normalization. - #4388, #4600, #4646, and #4647 -> `docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release notes now cover status failure layers, paused-container hints, Docker-driver doctor behavior, and non-destructive stale-registry recovery. - #4569, #4579, and #4678 -> `docs/about/release-notes.mdx`, `docs/manage-sandboxes/lifecycle.mdx`, `docs/network-policy/integration-policy-examples.mdx`: Release notes now cover installer tag pinning, PyPI `uv` policy access, and observable Jira validation. - #4632 -> `.agents/skills/`: Regenerated user skills from the current per-agent docs source, including newly generated Hermes reference files. ## Verification - `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx` - `rg "permissive mode|shields down|shields up|shields status|config rotate-token|rotate-token" docs --glob "*.mdx"` - `rg "permissive mode|shields down|shields up|shields status|config rotate-token|rotate-token" .agents/skills --glob "*.md"` - `npm run docs` - `npm run build:cli` - Commit hooks: markdownlint, docs-to-skills verification, gitleaks, skills YAML, commitlint <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Documentation** * Restructured documentation to clearly distinguish OpenClaw and Hermes agent variants throughout user guides. * Enhanced security, credential storage, and deployment guidance with clearer setup flows. * Added Hermes plugin installation and ecosystem documentation. * Improved workspace, messaging, and policy management references with variant-specific command examples. * Refined troubleshooting and CLI reference sections for clarity. <!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
After a messaging-provider rebuild, OpenShell injects Slack credentials into the sandbox process environment as canonical resolve placeholders (e.g.
SLACK_BOT_TOKEN=openshell:resolve:env:v51_SLACK_BOT_TOKEN). Slack Bolt validates token shape at startup and rejects anything that does not begin withxoxb-/xapp-, so the gateway inherits a value it cannot parse and Slack auth fails even though the provider attached successfully. This normalizes those runtime env placeholders to the Bolt-compatible alias before OpenClaw starts.Related Issue
Fixes #4274
Changes
normalize_slack_runtime_env()toscripts/nemoclaw-start.shand call it in both the non-root and root launch paths, before any child (the one-shot"${NEMOCLAW_CMD[@]}"exec or the stepped-down gateway) inherits the environment.openshell:resolve:env:*SLACK_BOT_TOKEN/*SLACK_APP_TOKENplaceholders toxoxb-OPENSHELL-RESOLVE-ENV-SLACK_BOT_TOKEN/xapp-OPENSHELL-RESOLVE-ENV-SLACK_APP_TOKEN— the same alias the config generator already bakes intoopenclaw.json, which the L7 egress proxy resolves at request time.xoxb-/xapp-tokens and already-aliased values are left untouched, so the export is idempotent and safe to call unconditionally.gosu/setprivpreserve the environment, so the export reaches the gateway user; no raw token is written to disk.refresh_openclaw_provider_placeholdersdiagnostics so Slack warns when a config alias is present but the runtime env is missing, or holds a value that is neither an OpenShell placeholder nor a plausibly-real Slack token.test/nemoclaw-start.test.tsthat drive the real shell functions.Type of Change
Verification
npx prek run --all-filespassesnpm testpassesnpm run docsbuilds without warnings (doc changes only)Verification notes:
vitest run test/nemoclaw-start.test.ts— 111 passed (includes the newSlack runtime env normalization (#4274)suite and the extendedprovider placeholder refreshsuite, all exercising the real shell functions). Regression reproduced before the fix (failing) and confirmed green after.generate-openclaw-config.test.ts,messaging/channels/manifests.test.ts,onboard/slack-validation.test.ts(131 passed) — confirms the alias strings match the config generator and manifest.shfmt -i 2 -ci -bn -dclean;shellcheck -xreports only a pre-existing unrelatedSC2015note outside this diff.npm testwas not run end-to-end here (thecliVitest project's broad integration suite exceeds the local time budget); the focused and adjacent suites above cover the changed code paths. No live Slack credentials or Brev/Docker sandbox were available, so the E2E pairing path could not be exercised locally; fake placeholder values were used for all unit/regression tests.Signed-off-by: Yimo Jiang yimoj@nvidia.com
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests