
[NemoClaw][Linux][Agent&Skills][Slack] Bolt per-event authorization fails with invalid_auth — proxy token rewrite missing for event-handling HTTP path (related: 6056223) #3708

@coder-glenn

Description

Description:

Environment:

  • NemoClaw v0.0.38 on DGX Spark (ARM64, DGX OS Ubuntu)
  • Sandbox: spark-assistant, OpenClaw v2026.4.24
  • Inference: local Ollama (nemotron-3-super:120b)
  • Slack workspace: personal (schilton-spark-lab), Slack app freshly installed, Socket Mode enabled, Event Subscriptions configured (message.im, app_mention), bot scopes: im:history, im:read, im:write, chat:write, app_mentions:read, channels:history, channels:read.

Symptom (distinct from 6056223 — see below):

  1. Slack channel DOES initialize. Logs show:
    [slack] socket mode connected
    POST http://slack.com:443/api/auth.test (every 60s, no errors)
    So 6056223's "no [slack] log line ever emitted" failure mode is fixed.

  2. Socket Mode events ARE delivered to the sandbox. After a user DMs the bot,
    Bolt receives the event_callback. Confirmed by the next log line.

  3. Bolt's per-event authorization HTTP call to Slack fails with:
    [WARN] bolt-app Authorization of incoming event did not succeed.
    [ERROR] bolt-app Error: An API error occurred: invalid_auth
    code: 'slack_bolt_authorization_error',
    data: { ok: false, error: 'invalid_auth' }
    Every event is dropped. Bot never replies.
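To quantify the pattern above (heartbeats succeeding while incoming events are rejected), a captured log can be summarized with a small helper. This is a minimal sketch, assuming the log lines contain the literal strings quoted in this report; the function name summarize_slack_log and the file name gateway.log are hypothetical, since the report does not say where the gateway log is stored.

```shell
# Summarize Slack-related log lines read from stdin.
# Assumption: lines contain the literal strings quoted in this report.
summarize_slack_log() {
  awk '
    /\/api\/auth\.test/                               { heartbeat++ }
    /Authorization of incoming event did not succeed/ { rejected++ }
    END {
      # Unset awk counters format as 0 under %d.
      printf "auth.test calls: %d\n", heartbeat
      printf "rejected events: %d\n", rejected
    }'
}

# Example (gateway.log is a hypothetical capture of the sandbox log):
#   summarize_slack_log < gateway.log
```

A non-zero "rejected events" count alongside a steadily growing "auth.test calls" count would match the symptom described here: the heartbeat path authenticates while the per-event path does not.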

Diagnosis:
This appears to be the same underlying architecture issue called out in 6056223: the sandbox env holds placeholder tokens, and the proxy is expected to rewrite them at the HTTP layer. A fix applied somewhere between v0.0.7 and v0.0.38 handled the init path but appears to have missed Bolt's event-handler authorization call path. The proxy correctly rewrites SLACK_BOT_TOKEN for the periodic auth.test heartbeats (which succeed), and the Socket Mode WSS handshake connects, but the token is NOT rewritten on the HTTP call Bolt makes when handling an incoming event.

Evidence from inside the sandbox (nemoclaw spark-assistant connect):
$ printf 'BOT prefix: [%s]\n' "${SLACK_BOT_TOKEN:0:5}"
BOT prefix: [opens]
$ printf 'BOT length: %d\n' "${#SLACK_BOT_TOKEN}"
BOT length: 37
$ printf 'APP prefix: [%s]\n' "${SLACK_APP_TOKEN:0:5}"
APP prefix: [opens]
$ printf 'APP length: %d\n' "${#SLACK_APP_TOKEN}"
APP length: 37

Both tokens are still placeholders of the form openshell:resolve:env:SLACK_BOT_TOKEN (37 chars, prefix "opens"). The proxy rewrite must therefore cover SOME but not all HTTP paths from inside the sandbox.
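The prefix and length checks above can be folded into one helper that classifies each variable without echoing the secret. This is a minimal sketch: the function name classify_token is hypothetical, and it assumes placeholders always start with "openshell:resolve:env:" (as observed in this report) and that real Slack tokens start with xoxb- or xapp-.

```shell
# Classify a token value without printing the value itself.
# Assumption: placeholders start with "openshell:resolve:env:";
# real Slack bot/app tokens start with xoxb- / xapp-.
classify_token() {
  case "$1" in
    openshell:resolve:env:*) echo "placeholder" ;;
    xoxb-*)                  echo "bot-token" ;;
    xapp-*)                  echo "app-token" ;;
    "")                      echo "unset" ;;
    *)                       echo "unknown" ;;
  esac
}

# Report only the class and length of each variable.
for name in SLACK_BOT_TOKEN SLACK_APP_TOKEN; do
  eval "value=\${$name-}"   # indirect lookup that works in POSIX sh
  printf '%s: %s (%d chars)\n' "$name" "$(classify_token "$value")" "${#value}"
done
```

Given the values shown in the evidence, both variables in the affected sandbox would be expected to report "placeholder" with 37 chars.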

Validation that the underlying credentials are good:

  • From the Spark host shell:
    curl -X POST -H "Authorization: Bearer <real-xoxb-...>" \
      https://slack.com/api/auth.test
    returns:
    {"ok":true, "team":"schilton-spark-lab", "user":"nemoclaw_bot", ...}
  • So the user-supplied xoxb- is valid for the correct workspace and bot user.
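The same auth.test call could also be issued from inside the sandbox, using the placeholder value, to see whether the proxy rewrites the Authorization header on an ad-hoc HTTP path. This is a sketch under assumptions not confirmed by this report: that curl is available in the sandbox and that its traffic passes through the same proxy as Bolt's. The function names slack_response_ok and check_auth_test are hypothetical.

```shell
# Succeed if a Slack Web API JSON response reports "ok": true.
# Plain pattern match, so no dependency on jq.
slack_response_ok() {
  printf '%s' "$1" | grep -Eq '"ok"[[:space:]]*:[[:space:]]*true'
}

# Call auth.test with whatever SLACK_BOT_TOKEN holds in the current shell.
# Inside the sandbox this sends the placeholder, so the call can only
# succeed if the proxy rewrites the header on this path (assumption to test).
check_auth_test() {
  response=$(curl -sS -X POST \
    -H "Authorization: Bearer ${SLACK_BOT_TOKEN}" \
    https://slack.com/api/auth.test) || return 2
  if slack_response_ok "$response"; then
    echo "auth.test ok"
  else
    echo "auth.test failed"
    return 1
  fi
}
```

Running check_auth_test on the host with the real token and again inside the sandbox with the placeholder would show whether the rewrite applies to this path. Neither outcome alone proves where Bolt's per-event call diverges.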

Repro / mitigation attempts that did NOT fix it:

  1. openshell provider update spark-assistant-slack-bridge --credential SLACK_BOT_TOKEN=<xoxb-...> + sandbox rebuild
  2. openshell provider delete + create spark-assistant-slack-bridge
    with fresh --credential value, then rebuild
  3. Full nemoclaw spark-assistant destroy --yes + re-onboard pasting
    the same xoxb-/xapp- at the interactive prompts

After each attempt the sandbox env vars remained 37-char opens* placeholders and Bolt continued to fail with invalid_auth on every incoming event.

Related bugs:

  • 6056223 (Open) — Init-path version of this issue. Different symptom (init skipped entirely vs. init succeeds but event-handler auth fails). Architecturally same root cause.
  • 6081420 (Open) — Onboard accepts token without validation.
  • 6110747 (Duplicate) — Onboard accepts invalid token without auth.test.
  • 6107338 (Fixed) — Single-channel init auth failure crashing gateway via unhandled promise rejection. Note: my failure is also "unhandled rejection caught by safety net" — possibly the safety-net path that 6107338 added is now masking the proxy-rewrite gap.

Reported by: schilton@nvidia.com on 2026-05-13.

Bug Details

Field        Value
Priority     Unprioritized
Action       Dev - Open - To fix
Disposition  Open issue
Module       Machine Learning - NemoClaw
Engineer     Aaron Erickson
Requester    Sven Chilton
Keyword      NemoClaw
Days Open    5

[NVB#6174031]

Labels

  • area: sandbox (OpenShell sandbox lifecycle, runtime, config, or recovery)
  • area: skills (Skills, agent behaviors, prompts, or skill packaging)
  • platform: dgx-spark (Affects DGX Spark hardware or workflows)
