Skip to content

fix(channels): normalize Slack runtime env placeholders before OpenClaw start (#4274)#4660

Merged
cv merged 3 commits into
NVIDIA:mainfrom
yimoj:fix/4274-normalize-slack-env-placeholders
Jun 2, 2026
Merged

fix(channels): normalize Slack runtime env placeholders before OpenClaw start (#4274)#4660
cv merged 3 commits into
NVIDIA:mainfrom
yimoj:fix/4274-normalize-slack-env-placeholders

Conversation

@yimoj

@yimoj yimoj commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Summary

After a messaging-provider rebuild, OpenShell injects Slack credentials into the sandbox process environment as canonical resolve placeholders (e.g. SLACK_BOT_TOKEN=openshell:resolve:env:v51_SLACK_BOT_TOKEN). Slack Bolt validates token shape at startup and rejects anything that does not begin with xoxb-/xapp-, so the gateway inherits a value it cannot parse and Slack auth fails even though the provider attached successfully. This normalizes those runtime env placeholders to the Bolt-compatible alias before OpenClaw starts.

Related Issue

Fixes #4274

Changes

  • Add normalize_slack_runtime_env() to scripts/nemoclaw-start.sh and call it in both the non-root and root launch paths, before any child (the one-shot "${NEMOCLAW_CMD[@]}" exec or the stepped-down gateway) inherits the environment.
    • Rewrites self-referential openshell:resolve:env:*SLACK_BOT_TOKEN / *SLACK_APP_TOKEN placeholders to xoxb-OPENSHELL-RESOLVE-ENV-SLACK_BOT_TOKEN / xapp-OPENSHELL-RESOLVE-ENV-SLACK_APP_TOKEN — the same alias the config generator already bakes into openclaw.json, which the L7 egress proxy resolves at request time.
    • The match requires the trailing key name as well as the prefix, so a placeholder resolving some other key is left alone rather than silently rebound to the Slack secret.
    • Real xoxb-/xapp- tokens and already-aliased values are left untouched, so the export is idempotent and safe to call unconditionally. gosu/setpriv preserve the environment, so the export reaches the gateway user; no raw token is written to disk.
  • Extend refresh_openclaw_provider_placeholders diagnostics so Slack warns when a config alias is present but the runtime env is missing, or holds a value that is neither an OpenShell placeholder nor a plausibly-real Slack token.
  • Add focused regression tests in test/nemoclaw-start.test.ts that drive the real shell functions.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Verification notes:

  • vitest run test/nemoclaw-start.test.ts111 passed (includes the new Slack runtime env normalization (#4274) suite and the extended provider placeholder refresh suite, all exercising the real shell functions). Regression reproduced before the fix (failing) and confirmed green after.
  • Related suites green: generate-openclaw-config.test.ts, messaging/channels/manifests.test.ts, onboard/slack-validation.test.ts (131 passed) — confirms the alias strings match the config generator and manifest.
  • Lint on the touched shell script: shfmt -i 2 -ci -bn -d clean; shellcheck -x reports only a pre-existing unrelated SC2015 note outside this diff.
  • Full npm test was not run end-to-end here (the cli Vitest project's broad integration suite exceeds the local time budget); the focused and adjacent suites above cover the changed code paths. No live Slack credentials or Brev/Docker sandbox were available, so the E2E pairing path could not be exercised locally; fake placeholder values were used for all unit/regression tests.

Signed-off-by: Yimo Jiang yimoj@nvidia.com

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Detects missing or mismatched Slack tokens and emits targeted warnings during startup.
    • Normalizes Slack runtime placeholders into Bolt-compatible alias forms before child processes inherit the environment.
  • Tests

    • Added end-to-end tests covering Slack runtime normalization, warning conditions, idempotency, and startup ordering.

…aw start

After a messaging-provider rebuild, OpenShell injects Slack credentials into
the sandbox process environment as canonical resolve placeholders, e.g.
SLACK_BOT_TOKEN=openshell:resolve:env:v51_SLACK_BOT_TOKEN. Slack Bolt validates
token shape at startup and rejects anything that does not begin with
xoxb-/xapp-, so the gateway inherits a value it cannot parse and Slack auth
fails even though the provider attached successfully (NVIDIA#4274).

Add normalize_slack_runtime_env() to scripts/nemoclaw-start.sh and call it in
both the non-root and root launch paths, before any child (the one-shot
"${NEMOCLAW_CMD[@]}" exec or the stepped-down gateway) inherits the env. It
rewrites self-referential openshell:resolve:env:*SLACK_*_TOKEN placeholders to
the Bolt-compatible alias the L7 egress proxy already resolves — the same alias
the config generator bakes into openclaw.json. Real tokens and already-aliased
values are left untouched, so the export is idempotent and safe to call
unconditionally. gosu/setpriv preserve the environment, so the export reaches
the gateway user; no raw token is written to disk.

Also extend refresh_openclaw_provider_placeholders diagnostics to warn when a
Slack config alias is present but the runtime env is missing or holds a value
that is neither an OpenShell placeholder nor a plausibly-real Slack token.

Covered by focused regression tests that drive the real shell functions:
process-env normalization (revision-scoped and canonical placeholders, idempotency,
real-token pass-through, mismatched-key safety, no-leak of the revision suffix)
and the config-vs-env diagnostics.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
@coderabbitai

coderabbitai Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f9f086ba-c94d-47a4-bce4-20c51eb7ee97

📥 Commits

Reviewing files that changed from the base of the PR and between 1524799 and e06254d.

📒 Files selected for processing (2)
  • scripts/nemoclaw-start.sh
  • test/nemoclaw-start.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/nemoclaw-start.test.ts
  • scripts/nemoclaw-start.sh

📝 Walkthrough

Walkthrough

Adds Slack placeholder diagnostics and a normalization routine that converts OpenShell Slack runtime placeholders into Bolt-compatible xoxb-/xapp- aliases, integrates normalization into both startup paths so child processes inherit normalized env, and expands tests to cover diagnostics and normalization behavior.

Changes

Slack Runtime Environment Normalization

Layer / File(s) Summary
Slack placeholder validation and diagnostics
scripts/nemoclaw-start.sh, test/nemoclaw-start.test.ts
Detects on-disk Bolt-compatible Slack alias values and validates corresponding SLACK_BOT_TOKEN / SLACK_APP_TOKEN runtime env presence/shape, emitting warnings for missing/mismatched runtime env values without mutating config. Test cases assert warning/no-warning scenarios.
Slack runtime env normalization function
scripts/nemoclaw-start.sh, test/nemoclaw-start.test.ts
Adds normalize_slack_runtime_env to convert self-referential OpenShell Slack placeholders into Bolt-compatible alias forms (xoxb-..., xapp-...) when env vars resolve to canonical placeholders; leaves real or already-aliased tokens untouched. Tests verify alias rewriting, idempotency, no revision-suffix leakage, unset behavior, and preservation of unrelated placeholders.
Integration into startup paths
scripts/nemoclaw-start.sh, test/nemoclaw-start.test.ts
Calls normalize_slack_runtime_env in non-root and root startup flows before child processes/gateway inherit the environment. Test scaffolding stubs/executes normalization to exercise ordering.

Sequence Diagram

sequenceDiagram
  participant Start as nemoclaw-start.sh
  participant Normalize as normalize_slack_runtime_env
  participant Env as Runtime Environment
  participant Gateway as Child Process/Gateway
  Start->>Env: Read SLACK_BOT_TOKEN / SLACK_APP_TOKEN
  Env-->>Start: value (placeholder or token)
  Start->>Normalize: invoke normalization
  Normalize->>Env: detect canonical OpenShell placeholder
  Normalize->>Env: export Bolt-compatible alias (xoxb-/xapp-OPENSHELL-RESOLVE-ENV-*)
  Start->>Gateway: child inherits normalized env
  Gateway->>Gateway: Slack Bolt validates token
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Suggested labels

bug, fix, NV QA, Integration: Slack, Platform: All, v0.0.57

Poem

🐰 I hop through env and whisper light,
Turning placeholders into tokens bright,
From openshell keys to xoxb and xapp,
No more auth errors when Slack comes back—
A tidy trail for every startup night.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: normalizing Slack runtime environment placeholders before startup.
Linked Issues check ✅ Passed The PR addresses the core coding requirements from #4274: implementing runtime normalization of Slack placeholders and environment validation to ensure tokens are properly resolved before child processes inherit the environment.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the linked issue: normalization logic in nemoclaw-start.sh, placeholder refresh diagnostics, and corresponding test coverage.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/nemoclaw-start.sh`:
- Around line 1110-1116: The code currently treats any value starting with
prefix (e.g. "openshell:resolve:env:") as a valid placeholder; change this to
only accept the exact placeholder for the expected env key or a revision-scoped
form for that same key. Concretely, in the branch that checks
env_value.startswith(prefix) before suppressing the warning, replace that
permissive check with logic that accepts only (1) env_value ==
f"{prefix}{env_key}" or (2) env_value.startswith(prefix) and
env_value.endswith(":" + env_key) (to allow a revision-scoped placeholder). Keep
the existing token_scheme check and warnings.append call (referenced symbols:
env_value, prefix, token_scheme, env_key, warnings.append, label) so that
mismatched OpenShell placeholders like "openshell:resolve:env:OTHER_KEY" no
longer suppress the Slack token warning.
- Around line 1176-1185: The case patterns that check SLACK_BOT_TOKEN and
SLACK_APP_TOKEN are too loose (they match any var ending in the suffix); tighten
them so only the canonical key and the revision-scoped key are matched — e.g.
instead of "$prefix"*SLACK_BOT_TOKEN use explicit case patterns for
"$prefix:SLACK_BOT_TOKEN" and "$prefix:v"*"_SLACK_BOT_TOKEN" (and likewise for
SLACK_APP_TOKEN), or replace the case with a guarded regex check ([[ "${var}" =~
^${prefix}(:v[0-9]+_)?SLACK_BOT_TOKEN$ ]]) to only accept the exact forms before
rebinding SLACK_BOT_TOKEN / SLACK_APP_TOKEN.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0e179bf3-87a5-49ca-b9a5-5871b9ff1f01

📥 Commits

Reviewing files that changed from the base of the PR and between c8df6ae and 1524799.

📒 Files selected for processing (2)
  • scripts/nemoclaw-start.sh
  • test/nemoclaw-start.test.ts

Comment thread scripts/nemoclaw-start.sh Outdated
Comment thread scripts/nemoclaw-start.sh Outdated
Address CodeRabbit review on NVIDIA#4660: both the normalization match and the
diagnostic suppression were too permissive.

- normalize_slack_runtime_env: the glob "$prefix"*SLACK_BOT_TOKEN matched any
  value merely ending in the suffix (e.g. ...v51_NOT_SLACK_BOT_TOKEN, since the
  glob spans underscores), so an unrelated key could be silently rebound to the
  Bolt alias. Replace with an anchored regex
  ^openshell:resolve:env:(v[0-9]+_)?SLACK_BOT_TOKEN$ that accepts only the
  canonical key or its revision-scoped form.
- refresh_openclaw_provider_placeholders Slack diagnostic: the suppression
  treated any openshell:resolve:env:* value as a healthy placeholder, so a
  wrong-key placeholder (openshell:resolve:env:OTHER_KEY) looked fine even
  though Bolt would inherit a non-Slack placeholder and fail. Match the same
  anchored shape for the expected key before suppressing the warning.

Add regression tests for the suffix-collision key and the wrong-key runtime
placeholder.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Yimo Jiang <yimoj@nvidia.com>
@yimoj yimoj added the v0.0.57 Release target label Jun 2, 2026

@prekshivyas prekshivyas left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

APPROVE.

Verified normalize_slack_runtime_env() is wired into both production launch paths (scripts/nemoclaw-start.sh:2749 non-root, :2895 root), in the main shell before any child inherits the env — not dead code. Correctly scoped: it normalizes the runtime env (which Bolt shape-checks) rather than the on-disk config, matching the narrowed root cause in #4274 (env holds openshell:resolve:env:v<rev>_SLACK_BOT_TOKEN, which Bolt rejects as non-xoxb-).

Tests drive the real shell + Python via spawnSync("bash") and cover the full edge matrix: revision-scoped, canonical, idempotent re-run, real-token passthrough, unset var, wrong-key placeholder, suffix-collision, and a no-secret-suffix-leak assertion. Security: the alias is excused by the on-disk tripwire's negative lookahead (:1211), no raw secret hits disk, and warnings log the key name not the value. Both CodeRabbit threads resolved in head e06254d. CI green.

Signed-off-by: Prekshi Vyas prekshiv@nvidia.com

@prekshivyas prekshivyas self-assigned this Jun 2, 2026
@cv cv merged commit 09826fb into NVIDIA:main Jun 2, 2026
26 checks passed
@wscurran wscurran added area: messaging Messaging channels, bridges, manifests, or channel lifecycle feature PR adds or expands user-visible functionality and removed enhancement: messaging labels Jun 3, 2026
cv pushed a commit that referenced this pull request Jun 3, 2026
## Summary
- Add the missing `v0.0.57` release-notes section with links to the
detailed docs pages for command, inference, onboarding, messaging,
status, installer, and policy changes.
- Remove public references to docs-skip terms from source docs and
regenerate the NemoClaw user skills from the current Fern MDX docs.
- Carry forward generated references for the per-agent documentation
split, including Hermes-specific reference files.

## Source summary
- #4615 and #4653 -> `docs/about/release-notes.mdx`,
`docs/reference/commands.mdx`: Release notes now cover host-side
`sessions` and `agents` commands plus `NEMOCLAW_EXTRA_AGENTS_JSON`
secondary-agent baking.
- #4163, #4204, #4611, #4619, and #4676 ->
`docs/about/release-notes.mdx`,
`docs/inference/use-local-inference.mdx`: Release notes now cover
managed vLLM progress/readiness, DGX Spark model default changes, local
Ollama streaming usage, and inference route divergence warnings.
- #4267, #4601, #4609, #4642, #4645, and #4661 ->
`docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release
notes now cover UFW auto-remediation, local-inference reachability
gates, gateway reuse/binding, cancel rollback, and policy selection
persistence.
- #4577, #4582, #4607, and #4660 -> `docs/about/release-notes.mdx`,
`docs/manage-sandboxes/messaging-channels.mdx`: Release notes now cover
Slack validation, atomic `channels add`, WhatsApp QR diagnostics, and
Slack placeholder normalization.
- #4388, #4600, #4646, and #4647 -> `docs/about/release-notes.mdx`,
`docs/reference/commands.mdx`: Release notes now cover status failure
layers, paused-container hints, Docker-driver doctor behavior, and
non-destructive stale-registry recovery.
- #4569, #4579, and #4678 -> `docs/about/release-notes.mdx`,
`docs/manage-sandboxes/lifecycle.mdx`,
`docs/network-policy/integration-policy-examples.mdx`: Release notes now
cover installer tag pinning, PyPI `uv` policy access, and observable
Jira validation.
- #4632 -> `.agents/skills/`: Regenerated user skills from the current
per-agent docs source, including newly generated Hermes reference files.

## Verification
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user --doc-platform fern-mdx`
- `rg "permissive mode|shields down|shields up|shields status|config
rotate-token|rotate-token" docs --glob "*.mdx"`
- `rg "permissive mode|shields down|shields up|shields status|config
rotate-token|rotate-token" .agents/skills --glob "*.md"`
- `npm run docs`
- `npm run build:cli`
- Commit hooks: markdownlint, docs-to-skills verification, gitleaks,
skills YAML, commitlint

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Documentation**
* Restructured documentation to clearly distinguish OpenClaw and Hermes
agent variants throughout user guides.
* Enhanced security, credential storage, and deployment guidance with
clearer setup flows.
  * Added Hermes plugin installation and ecosystem documentation.
* Improved workspace, messaging, and policy management references with
variant-specific command examples.
  * Refined troubleshooting and CLI reference sections for clarity.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: messaging Messaging channels, bridges, manifests, or channel lifecycle feature PR adds or expands user-visible functionality v0.0.57 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[bug] Slack credential not injected into sandbox env after messaging-provider rebuild (NemoClaw 0.0.51, OpenShell 0.0.44)

5 participants