[Sandbox][Windows ARM WSL] Slack onboarding has 2 policy gaps #2758

@wangericnv

Description

Verified on Yukon TS6 (WSL2, Ubuntu 24.04 aarch64) with qwen3:8b on Ollama. The
issue is orthogonal to model size, and the Slack flow is expected to be the same
on x86 hosts. On a fresh NemoClaw v0.0.30 install, the Slack messaging-channel
onboarding has TWO sequential policy gaps that block the bot from working
end-to-end:

GAP B — credential rotation does NOT auto-apply the slack policy preset
  Adding SLACK_BOT_TOKEN + SLACK_APP_TOKEN to ~/.nemoclaw/credentials.json and
  re-running `nemoclaw onboard` correctly:
    - records providerCredentialHashes for both tokens
    - sets messagingChannels: ['slack'] in sandbox state
    - injects the env vars into the sandbox container
    - rebuilds the sandbox image
  But it does NOT auto-apply the `slack` policy preset to the sandbox.
  policy-list still shows ○ slack (not applied). With the network-policy
  preset missing, the openshell L7 proxy rejects every Slack endpoint:
    [slack] socket mode failed to start. retry 1/12 in 2s
    (An HTTP protocol error occurred: statusCode = 403)
    ... (12 retries, all 403)
  The user has to manually run `nemoclaw <sandbox> policy-add slack` for the
  bot to even establish Socket Mode. No telemetry or banner points the user
  at this missing step.
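
A minimal sketch of how a wrapper script could catch Gap B right after onboarding. It assumes `policy-list` prints one preset per line, prefixed with `●` (applied) or `○` (not applied), as in the output quoted in this report. The function name and the sample text are illustrative and not part of NemoClaw.

```python
def preset_states(policy_list_output: str) -> dict[str, bool]:
    """Map preset name -> True if applied (●), False if not applied (○)."""
    states = {}
    for line in policy_list_output.splitlines():
        parts = line.split()
        # Only lines that start with one of the two markers are preset rows.
        if len(parts) >= 2 and parts[0] in ("●", "○"):
            states[parts[1]] = parts[0] == "●"
    return states


# Hypothetical sample; the real output may carry extra columns or headers.
sample = "● brave\n○ slack (not applied)\n"
if not preset_states(sample).get("slack", False):
    print("slack preset is not applied; expect Socket Mode 403s")
```

Feeding the captured output of `nemoclaw my-assistant policy-list` into `preset_states` would turn the silent retry loop into an explicit check.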

GAP C — even after `policy-add slack`, CONNECT slack.com:443 is still denied
  After the slack preset is applied, Socket Mode finally connects:
    2026-04-30T09:30:29.599+00:00 [slack] socket mode connected
  The bot now receives incoming Slack events (DMs trigger
  bolt-app dispatch). But when bolt-app calls back to slack.com via the
  @slack/web-api WebClient (e.g. to look up bot identity, post a reply,
  fetch user info), the proxy denies the CONNECT tunnel:
    body: {
      detail: 'CONNECT slack.com:443 not permitted by policy',
      error: 'policy_denied'
    }
  This happens even though the slack preset's `endpoints` list explicitly
  includes `slack.com`. The L7 proxy's CONNECT-tunnel matching does not
  appear to honor the preset's host whitelist when the client uses an
  HTTPS proxy tunnel (vs. a transparent forward); only `wss-primary.slack.com`
  / `wss-backup.slack.com` (Socket Mode WSS) seem to actually pass through.
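
To separate proxy behaviour from bolt-app behaviour, the CONNECT step can be probed directly. The sketch below sends one raw CONNECT request and reports the proxy's status code. The proxy host and port are assumptions: this report does not state where the sandbox exposes the openshell proxy, so they must be supplied by the caller.

```python
import socket


def connect_request(host: str, port: int = 443) -> bytes:
    """Build the raw CONNECT request an HTTPS client sends to a forward proxy."""
    target = f"{host}:{port}"
    return f"CONNECT {target} HTTP/1.1\r\nHost: {target}\r\n\r\n".encode("ascii")


def status_code(status_line: bytes) -> int:
    """Extract the numeric status from a line such as b'HTTP/1.1 403 Forbidden'."""
    return int(status_line.split()[1])


def probe(proxy_host: str, proxy_port: int, host: str, port: int = 443) -> int:
    """Send one CONNECT through the proxy and return the proxy's status code."""
    with socket.create_connection((proxy_host, proxy_port), timeout=5) as sock:
        sock.sendall(connect_request(host, port))
        with sock.makefile("rb") as reader:
            first_line = reader.readline()
    return status_code(first_line)
```

If the behaviour described above holds, probing `slack.com` from inside the sandbox should return 403 while probing `wss-primary.slack.com` should return a 2xx status.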

Net effect: end-to-end Slack flow does NOT work on a default v0.0.30 install
even with valid credentials. Step 3.4 of the standard Slack onboarding test
("Test the messaging app connection with the prompt: ...") fails because
the bot receives the user's DM but cannot post a reply.

The two gaps appear in sequence and would naturally be fixed together —
filing as one bug.

Environment

Device:        Yukon TS6 (ARM64 reference, Snapdragon X SoC)
OS:            Windows + WSL2 → Ubuntu 24.04.4 LTS aarch64
Architecture:  aarch64 (ARM64)
Node.js:       v22.x bundled with NemoClaw install
npm:           bundled
Docker:        29.1.3 (Ubuntu docker.io package)
OpenShell CLI: 0.0.36
NemoClaw:      v0.0.30
Ollama:        0.22.0
Model:         qwen3:8b (Local Ollama, GPU 6 GB resident)
OpenClaw:      v2026.4.24
Slack workspace: mercuriusSpace
Bot user:      nemoclawtest (bot_id B0AQTDM9PM5, user_id U0AR2FJJC1Z)

Steps to Reproduce

GAP B repro:

1. Fresh NemoClaw v0.0.30 install with onboard already complete on a sandbox
   `my-assistant` (any provider — verified with Ollama qwen3:8b).
2. Write SLACK_BOT_TOKEN (xoxb-...) and SLACK_APP_TOKEN (xapp-...) to
   ~/.nemoclaw/credentials.json (do NOT add the slack policy preset
   manually).
3. Run: nemoclaw onboard --non-interactive --yes-i-accept-third-party-software
4. Wait for credential-rotation flow to detect the new tokens. Onboard
   reports "Messaging tokens detected: slack", rebuilds the sandbox image,
   and prints success summary including 'Slack' under messagingChannels.
5. Run: nemoclaw my-assistant policy-list — observe `○ slack` (not applied).
6. Run: nemoclaw my-assistant logs --follow OR cat /tmp/gateway.log inside
   sandbox — observe 12 socket-mode retries all 403 over ~5 minutes.

GAP C repro (continues from B):

7. Manually fix B by: nemoclaw my-assistant policy-add slack --yes
8. Wait ~30 s for socket mode to retry; observe `[slack] socket mode connected`
   in gateway log.
9. From your Slack client, DM the bot user 'nemoclawtest' with any text,
   e.g. "Create a test plan for testing Openclaw in Windows".
10. Inside sandbox: tail -50 /tmp/gateway.log
11. Observe:
    [WARN] bolt-app Authorization of incoming event did not succeed.
    [ERROR] bolt-app Error: An HTTP protocol error occurred: statusCode = 403
    body: {
      detail: 'CONNECT slack.com:443 not permitted by policy',
      error: 'policy_denied'
    }
12. Bot does NOT reply in Slack. Step 3.4 expectations ("Bot delivers an
    agent-generated test plan reply" / "Reply visible in the chosen channel
    client") not met.

Expected Result

GAP B:
  When credential-rotation detects Slack tokens, the onboarding flow should
  auto-apply the corresponding `slack` policy preset (the same way it
  auto-rebuilds the sandbox image and registers messagingChannels). User
  shouldn't need to know about the manual `policy-add slack` step. If
  auto-apply is intentionally not wired (e.g. policy presets are user
  consent-required), the onboard summary should explicitly say:
    "  ⚠ Slack channel registered but the 'slack' policy preset is NOT
       applied. Run: nemoclaw <sandbox> policy-add slack"
  No silent socket-mode retries.
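
The fallback behaviour requested for Gap B can be sketched as a small check run when the onboard summary is built. This is an illustration only: the function name is hypothetical, and it assumes each messaging channel maps to a policy preset of the same name, which this report confirms only for `slack`.

```python
def missing_preset_warnings(sandbox: str, channels: list[str], applied: set[str]) -> list[str]:
    """Return one warning per registered channel whose preset is not applied."""
    warnings = []
    for channel in channels:
        # Assumption: the preset name equals the channel name (true for slack).
        if channel not in applied:
            warnings.append(
                f"⚠ {channel.capitalize()} channel registered but the '{channel}' "
                f"policy preset is NOT applied. Run: nemoclaw {sandbox} policy-add {channel}"
            )
    return warnings


for line in missing_preset_warnings("my-assistant", ["slack"], {"brave"}):
    print(line)
```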

GAP C:
  After the slack preset is applied, the L7 proxy must allow CONNECT to
  slack.com:443 (and api.slack.com, hooks.slack.com) for the binaries in
  the slack preset's whitelist (/usr/local/bin/node). The bot's
  @slack/web-api REST callbacks should reach Slack with no proxy denial.
  Bot reply round-trip works end-to-end.
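
The expected Gap C decision can be written down as a small model. The `Preset` type below is a hypothetical in-memory view, not the real preset schema. The host list is taken from the hosts named in this report, and port 443 is assumed for all of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    """Hypothetical view of a policy preset (not the real YAML schema)."""
    hosts: frozenset[str]
    binaries: frozenset[str]
    ports: frozenset[int] = frozenset({443})


def connect_allowed(preset: Preset, host: str, port: int, binary: str) -> bool:
    """Expected decision: allow CONNECT when host, port and binary all match."""
    return (
        host.lower().rstrip(".") in preset.hosts
        and port in preset.ports
        and binary in preset.binaries
    )


SLACK = Preset(
    hosts=frozenset({
        "slack.com", "api.slack.com", "hooks.slack.com",
        "wss-primary.slack.com", "wss-backup.slack.com",
    }),
    binaries=frozenset({"/usr/local/bin/node"}),
)
```

Under this model `connect_allowed(SLACK, "slack.com", 443, "/usr/local/bin/node")` is true, which is the outcome the proxy currently does not produce.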

Actual Result

GAP B (12 retries, all 403, no policy applied automatically):

  2026-04-30T09:24:22.143+00:00 [slack] socket mode failed to start. retry 11/12 in 30s (An HTTP protocol error occurred: statusCode = 403)
  2026-04-30T09:24:54.624+00:00 [slack] [default] channel exited: An HTTP protocol error occurred: statusCode = 403
  2026-04-30T09:24:54.626+00:00 [slack] [default] auto-restart attempt 2/10 in 10s
  ... (continued for ~5 minutes until manual policy-add)
  2026-04-30T09:30:29.599+00:00 [slack] socket mode connected   ← only after manual `policy-add slack`

GAP C (after Socket Mode connects, the bot receives the event but cannot reply):

  2026-04-30T09:34:54.200+00:00 [WARN] bolt-app Authorization of incoming event did not succeed. No listeners will be called.
  2026-04-30T09:34:54.211+00:00 [ERROR] bolt-app Error: An HTTP protocol error occurred: statusCode = 403
      at httpErrorFromResponse (/sandbox/.openclaw/plugin-runtime-deps/openclaw-unknown-1004618868f4/node_modules/@slack/web-api/dist/errors.js:49:33)
      at /sandbox/.openclaw/plugin-runtime-deps/openclaw-unknown-1004618868f4/node_modules/@slack/web-api/dist/WebClient.js:505:62
    code: 'slack_bolt_authorization_error',
    statusCode: 403,
    statusMessage: 'Forbidden',
    body: {
      detail: 'CONNECT slack.com:443 not permitted by policy',
      error: 'policy_denied'
    },
    attemptNumber: 3,
    retriesLeft: 0
  2026-04-30T09:34:54.216+00:00 [ERROR]   An unhandled error occurred while Bolt processed (type: event_callback, error: Error: An HTTP protocol error occurred: statusCode = 403)
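
For triage, the two signatures above can be told apart mechanically. The sketch below scans gateway log lines for the Gap B retry message and the Gap C policy denial, using only the strings quoted in this report. The function name is illustrative.

```python
import re

# Gap B: Socket Mode never starts because the slack preset is not applied.
GAP_B = re.compile(r"\[slack\] socket mode failed to start\. retry (\d+)/(\d+)")
# Gap C: the proxy denies the CONNECT tunnel used by @slack/web-api.
GAP_C = re.compile(r"CONNECT (\S+) not permitted by policy")


def classify(lines):
    """Return (socket-mode retry count, set of CONNECT targets denied by policy)."""
    retries = 0
    denied = set()
    for line in lines:
        if GAP_B.search(line):
            retries += 1
        match = GAP_C.search(line)
        if match:
            denied.add(match.group(1))
    return retries, denied
```

Running `classify(open("/tmp/gateway.log"))` inside the sandbox gives a quick answer to which gap a given host is currently hitting.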

Sandbox-side env at the time of GAP C:
  SLACK_BOT_TOKEN=
  SLACK_APP_TOKEN=
  Node binary at /usr/local/bin/node (= what slack preset whitelists)
  policy-list shows ● slack (applied)
  policy-list shows ● brave (i.e. preset endpoint rules are otherwise
                              functioning for other presets)

The fact that Socket Mode WSS (wss-primary.slack.com) DOES pass while
REST (CONNECT slack.com:443) DOES NOT, with both hosts in the same
preset, points strongly at the proxy's CONNECT-tunnel rule matching
diverging from its REST (method and path) rule matching for that
preset. The likely fix is either in the slack preset YAML's slack.com
entry (make sure its tls / access mode permits CONNECT) or in the
proxy's binary-bound CONNECT rule resolution.

Logs

Captured 2026-04-30 on lab@10.172.178.214 (Yukon TS6 WSL2). Full gateway.log
window (around 09:24-09:35 UTC) available on request.

Reproduces independently of the qwen3.6:35b → qwen3:8b model substitution
(see NVBug 6129886) — the model choice does not affect the Slack policy
flow, so this bug is observable on any v0.0.30 host once Slack creds are
configured. Likely also reproduces on x86 Linux with the same v0.0.30
+ openshell 0.0.36 stack.

Related but distinct bugs:
  - NVBug 6110165 — Slack k8s-path gateway crash (PR #2151 gap, k8s
    deployment). Different surface — that bug is about Docker-vs-k8s
    image deployment path; this bug is about credential→policy
    propagation and CONNECT-tunnel rule matching, both within the
    standard Docker deployment.
  - NVBug 6115592 — Brave key validation aborts onboard. Different
    credential family but related credential-handling area.
  - NVBug 6129886 — qwen3.6:35b unusable on Windows ARM reference host. Different surface
    (model size vs hardware); orthogonal.

Bug Details

Field        Value
Priority     Unprioritized
Action       Dev - Open - To fix
Disposition  Open issue
Module       Machine Learning - NemoClaw
Keyword      NemoClaw, NEMOCLAW_GH_SYNC_APPROVAL, NemoClaw_Onboard, NemoClaw_Policy&Network, NemoClaw_Sandbox

[NVB#6130411]

Metadata

Assignees

No one assigned

    Labels

    NV QA (Bugs found by the NVIDIA QA Team)
    UAT (Issues flagged for User Acceptance Testing.)
    area: sandbox (OpenShell sandbox lifecycle, runtime, config, or recovery)
    integration: slack (Slack integration or channel behavior)
    platform: wsl (Affects Windows Subsystem for Linux)
