Skip to content

fix(discord): tolerate unresolved SecretRef in resolveDiscordToken (fixes #76371)#76385

Closed
mogglemoss wants to merge 3 commits intoopenclaw:mainfrom
mogglemoss:fix-discord-secretref-startup
Closed

fix(discord): tolerate unresolved SecretRef in resolveDiscordToken (fixes #76371)#76385
mogglemoss wants to merge 3 commits intoopenclaw:mainfrom
mogglemoss:fix-discord-secretref-startup

Conversation

@mogglemoss
Copy link
Copy Markdown
Contributor

Summary

  • Problem: @openclaw/discord 2026.5.2 channel startup crashes deterministically when channels.discord.token is configured as a SecretRef object (e.g. { id: "DISCORD_BOT_TOKEN", provider: "default", source: "env" }), failing every gateway restart with [discord] channel startup failed: channels.discord.token: unresolved SecretRef "env:default:DISCORD_BOT_TOKEN". Resolve this command against an active gateway runtime snapshot before reading it.
  • Why it matters: Discord channel never reaches READY on 2026.5.2 with this very common config shape. Telegram with the analogous SecretRef config works fine on the same boot, so the failure is silent-on-other-channels and obvious-on-discord — easy to mistake for a Discord/network problem. Effectively a hard regression vs. 2026.4.29 (bundled dist/extensions/discord) for SecretRef users. Filed as [Bug] @openclaw/discord 2026.5.2 channel startup crashes on SecretRef-backed token (env:default:DISCORD_BOT_TOKEN unresolved) #76371.
  • What changed: normalizeDiscordToken now uses the gentle normalizeSecretInputString instead of the strict normalizeResolvedSecretInputString. Unresolved SecretRefs return undefined and let the existing fall-through chain (account-token → top-level token → process.env.DISCORD_BOT_TOKENsource=none) handle the common case where the env-source SecretRef's intent matches the existing env fallback. Two existing throw-assertions updated; two new tests added covering the SecretRef-without-env path and the account-level SecretRef + base-token interaction.
  • What did NOT change (scope boundary): resolveDiscordToken's fall-through ordering, the Bot prefix stripping, the inspectConfiguredToken "configured but unavailable" path, the hasAccountToken "explicit blank means no inheritance" semantics, any caller of normalizeDiscordToken outside resolveDiscordToken itself, any code outside extensions/discord/. No SDK contract change.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: extensions/discord/src/token.ts:normalizeDiscordToken invoked normalizeResolvedSecretInputString (strict mode → calls assertSecretInputResolved → throws createUnresolvedSecretInputError on any unresolved SecretRef). The channel-startup path (startChannel(plugin.id) → discord plugin → resolveDiscordAccountresolveDiscordTokennormalizeDiscordToken) reads raw OpenClawConfig before the gateway has materialized channel SecretRefs into a runtime snapshot, so the assertion fires deterministically and crashes the channel start.
  • Missing detection / guardrail: No test exercised the channel-start path with a SecretRef-shaped channels.discord.token. The existing token.test.ts > "throws when token is an unresolved SecretRef object" documented the throw as intentional, which made it look like a contract — but in practice the throw is unreachable from the runtime-snapshot consumers (which do pre-resolve) and only ever fires from raw-config consumers like channel startup, where it is the bug rather than the guard.
  • Contributing context: The 2026.5.2 plugin externalization wave moved Discord from dist/extensions/discord (bundled) to @openclaw/discord (npm). The bundled 2026.4.29 startup path apparently happened to feed resolveDiscordToken a config where SecretRefs had already been coerced upstream; the externalized startup path receives raw config. Telegram dodges the same family because its token resolver does not call the strict variant.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: extensions/discord/src/token.test.ts (and existing extensions/discord/src/client.test.ts updated to assert the new user-actionable error).
  • Scenario the test should lock in: resolveDiscordToken({ channels: { discord: { token: { source: "env", provider: "default", id: "DISCORD_BOT_TOKEN" } } } }) with process.env.DISCORD_BOT_TOKEN="..." returns { token: "...", source: "env" } and does not throw. Without the env, returns { token: "", source: "none" }. With an account-level SecretRef + base token, returns { token: "", source: "none" } (preserves "explicit blank means no inheritance").
  • Why this is the smallest reliable guardrail: The unit covers the exact contract the channel-startup path depends on, with no plugin-runtime / mock surface. A higher-level seam test would add weight without catching anything the unit doesn't.
  • Existing test that already covers this (if any): None (the previous "throws when token is an unresolved SecretRef object" test asserted the opposite — the bug as a feature).
  • If no new test is added, why not: N/A; new tests added.

User-visible / Behavior Changes

  • Discord channel start no longer fails when channels.discord.token is a SecretRef object whose intent matches the env fallback (the very common { source: "env", provider: "default", id: "DISCORD_BOT_TOKEN" } shape).
  • When neither config nor env produces a token, the user-facing error becomes "Discord bot token missing for account "default" (set discord.accounts.default.token or DISCORD_BOT_TOKEN for default)." (from createDiscordRestClient) instead of the internal "channels.discord.token: unresolved SecretRef …. Resolve this command against an active gateway runtime snapshot before reading it." — clearer guidance, same loud failure.
  • No config-format changes. Existing string tokens, account-level tokens, env-only setups, and inspectConfiguredToken "configured but unavailable" status all behave as before.

Diagram (if applicable)

Before (2026.5.2):
  startChannel("discord")
    -> resolveDiscordAccount(rawCfg)
      -> resolveDiscordToken(rawCfg)
        -> normalizeDiscordToken(SecretRef, "channels.discord.token")
          -> normalizeResolvedSecretInputString(...)  [strict]
            -> assertSecretInputResolved(...)  THROWS
  channel startup failed.

After:
  startChannel("discord")
    -> resolveDiscordAccount(rawCfg)
      -> resolveDiscordToken(rawCfg)
        -> normalizeDiscordToken(SecretRef, _path)
          -> normalizeSecretInputString(SecretRef)  [gentle] -> undefined
        configToken = undefined
        -> normalizeDiscordToken(process.env.DISCORD_BOT_TOKEN, ...)  [string -> string]
        -> { token, source: "env" }
  channel started.

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? Yes (in spirit only — no new disclosure path; previously a SecretRef-shaped token crashed the channel before any token was used; now it falls through to the same env fallback that already existed in the same function for the no-config case. No new sink for token bytes; Bot prefix stripping preserved; explicit-blank account semantics preserved.)
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • Risk + mitigation: A SecretRef whose source/provider/id was not env:default:DISCORD_BOT_TOKEN previously crashed; it will now silently fall through to the env fallback if DISCORD_BOT_TOKEN happens to be set. This is the same fall-through behavior that already exists for the "no config token" case, so it does not introduce a new path — but it does mean the intent expressed in a non-env SecretRef (e.g. source: "exec") becomes invisible at the channel-start site. Mitigation: callers that need strict resolution semantics (i.e. consumers of an actual runtime snapshot) should call normalizeResolvedSecretInputString directly; this PR only changes the channel-start consumer's behavior. The user-actionable "bot token missing" error still fires loudly when no token resolves.

Repro + Verification

Environment

  • OS: macOS 26.2 (arm64)
  • Runtime/container: Node 25.9.0, gateway managed via launchd (gui/502/ai.openclaw.gateway), local mode
  • Model/provider: N/A (failure is at channel-start, before any model call)
  • Integration/channel (if any): Discord (channel + DM); Telegram present and working with analogous SecretRef config on the same boot
  • Relevant config (redacted):
    "channels": {
      "discord": {
        "enabled": true,
        "token": { "id": "DISCORD_BOT_TOKEN", "provider": "default", "source": "env" }
      }
    },
    "secrets": { "providers": { "default": { "source": "env" } } }
    ~/.openclaw/.env contains DISCORD_BOT_TOKEN=... (autoloaded via dotenv; verified in process.env of the gateway child process).

Steps

  1. Install OpenClaw 2026.5.2 (the openclaw doctor step of openclaw update auto-installs @openclaw/discord@2026.5.2 from npm).
  2. Configure channels.discord.token as the SecretRef object above; ensure DISCORD_BOT_TOKEN is in env (any of: ~/.openclaw/.env, the LaunchAgent env file, or the parent shell).
  3. Restart the gateway (launchctl kickstart -k gui/502/ai.openclaw.gateway).
  4. Tail /tmp/openclaw/openclaw-<date>.log for discord events around startup.

Expected

  • Discord channel reaches discord channels resolveddiscord client initialized as <bot-id>gateway ready (no channel startup failed lines).

Actual (before this PR)

  • Every restart logs:
    [discord] channel startup failed: channels.discord.token: unresolved SecretRef
    "env:default:DISCORD_BOT_TOKEN". Resolve this command against an active gateway
    runtime snapshot before reading it.
    
    Channel never starts; bot is offline. Telegram on the same boot is unaffected.

Evidence

  • Failing test/log before + passing after — gateway log lines above (failing) plus 1315/1315 unit pass after the change (was 1313 baseline; +2 new tests; updated 2 existing throw-assertions to assert the new graceful behavior).
  • Trace/log snippets — see "Actual" above; the throw originates at normalizeResolvedSecretInputStringassertSecretInputResolved (dist/types.secrets-CyuYZNvB.js).
  • Screenshot/recording — N/A (server-side log behavior).
  • Perf numbers — N/A.

Human Verification (required)

  • Verified scenarios:
    • Reproduced the failure on a real 2026.5.2 install with the SecretRef config above (multiple gateway restarts, deterministic).
    • Applied the equivalent config workaround (removed channels.discord.token so the env fallback fires unconditionally) and confirmed Discord channel starts cleanly — same code path this PR makes available with the SecretRef present.
    • Ran pnpm install + pnpm test:extension discord on upstream/main → 1313/1313 pass (baseline).
    • Applied this change → pnpm test:extension discord → 1315/1315 pass.
    • Inspected all normalizeDiscordToken call sites (accounts.ts, setup-account-state.ts, setup-surface.ts, directory-live.ts, probe.ts, resolve-allowlist-common.ts, client.ts via resolveDiscordToken); none rely on the strict-throw behavior other than the one test that asserted it directly.
    • Inspected the inspectConfiguredToken codepath — the gentle pattern this PR adopts is already used there for the same reason ("configured but unavailable").
  • Edge cases checked:
    • SecretRef + DISCORD_BOT_TOKEN set in env → returns source=env, channel starts.
    • SecretRef + no env → returns source=none, channel reports user-actionable "bot token missing" error (no internal SecretRef contract error).
    • Account-level SecretRef + base-level string token → returns source=none for that account (preserves existing "explicit blank means no inheritance" semantics).
    • String token (existing happy path) → unchanged.
    • Bot <token> prefix stripping → preserved (covered by the unchanged regex).
  • What I did NOT verify:
    • I did not run pnpm build && pnpm check && pnpm test (the full repo lane) locally — left to CI. No src/**, src/plugin-sdk/**, or packages/** files were touched, so the import-boundary inventories listed in CONTRIBUTING.md should not be required.
    • I did not run codex review --base origin/main locally (no Codex subscription on this box). Will address Codex/bot review comments after CI runs them.
    • I did not test multi-account configurations beyond the unit-level case added.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR. (Will fill in after first review pass.)
  • I left unresolved only the conversations that still need reviewer or maintainer judgment. (Will fill in after first review pass.)

Compatibility / Migration

  • Backward compatible? Yes — string-token configs unchanged; SecretRef configs that previously crashed now succeed via the env fallback (or surface a user-actionable error if no env).
  • Config/env changes? No — existing configs work; the docstring in extensions/discord/src/doctor.ts:256 ("verify DISCORD_BOT_TOKEN is present in the state-dir .env or configure channels.discord.token / channels.discord.accounts.default.token as a SecretRef") becomes accurate again instead of misleading.
  • Migration needed? No.

Risks and Mitigations

  • Risk: A non-env SecretRef (e.g. { source: "exec", … }) previously threw at channel start; with this PR it falls through to env. If the operator intended an exec-resolved token but only has DISCORD_BOT_TOKEN set in env, the operator's stated intent is silently overridden by the env fallback.
    • Mitigation: This is the same fall-through behavior that already applies to the "no config token" case in the same function — not a new path, just a new entry point. The documented architectural fix (resolve channel token SecretRefs in core before startChannel) would handle this more cleanly; see extensions/discord/src/token.ts comment added in this PR. A future PR can add a "configured but unresolved" warning log if this surprises operators in practice.
  • Risk: The previous strict throw was a debug aid for accidental raw-config consumers. Replacing it loses that signal.
    • Mitigation: Strict resolution remains available via normalizeResolvedSecretInputString and assertSecretInputResolved for runtime-snapshot consumers. This PR only changes the one call site that was systematically misusing the strict variant.

AI/Vibe-coded disclosure

  • AI-assisted (Claude Sonnet 4.6 via Claude Code CLI, "fully tested" — full discord extension test lane green locally: 1315/1315).
  • Author understands the code change and the surrounding plugin's secret-input contract.
  • Will run codex review --base origin/main and address findings before requesting review (no Codex subscription on this box; will rely on CI Codex review).
  • Will resolve bot review conversations after addressing them.

Session origin

This PR was authored mid-incident while debugging an actual production gateway hitting #76371 on a fresh 2026.5.2 install. Condensed session log:

  1. Discord stopped responding in DM → traced to repeated channel-startup failures in the gateway log.
  2. Restarting the gateway (launchctl kickstart -k) cleared transient saturation but the channel-startup error reproduced deterministically across multiple restarts.
  3. Initially suspected an env-availability issue (the message text mentions "active gateway runtime snapshot before reading it"). Ruled that out by adding DISCORD_BOT_TOKEN directly to the LaunchAgent env file (so it was in process.env from the very first instruction). Same crash. Conclusion: the bug is in the SecretRef strict-resolver code path, not env availability.
  4. Located the throw site by grepping the installed @openclaw/discord source for the literal error message: extensions/discord/src/token.ts:normalizeDiscordTokennormalizeResolvedSecretInputString (strict).
  5. Compared with the safe sibling inspectConfiguredToken in setup-account-state.ts — same plugin, same author intent, gentle handling of SecretRef objects via normalizeSecretInputString.
  6. Searched for prior art on this bug family — found BUG: Embedded channel reply runs crash on SecretRef-backed Telegram/Discord credentials #75433 (embedded reply path) and [Bug]: @openclaw/bluebubbles 2026.5.2 webhook auth crashes on SecretRef password — TypeError in monitor.ts #76369 (@openclaw/bluebubbles webhook path), confirmed no existing report for the channel-startup case, filed [Bug] @openclaw/discord 2026.5.2 channel startup crashes on SecretRef-backed token (env:default:DISCORD_BOT_TOKEN unresolved) #76371.
  7. Initially considered changing normalizeDiscordToken was wrong because the existing test ("throws when token is an unresolved SecretRef object") explicitly asserted the throw — but on closer reading, the throw was a debug aid, not a feature, and the gentle path already exists in the same plugin. Updated the assertion to lock in the new graceful behavior.
  8. Validated the workaround (Option 1, "remove channels.discord.token from config so env fallback unconditionally fires") on the real instance — Discord channel starts cleanly. That validates the same code path this PR enables with the SecretRef config preserved.
  9. Cloned the repo, installed dependencies, ran the full discord extension test lane to baseline (1313/1313 pass), made the fix, updated the two existing throw-asserting tests, added two new tests covering the SecretRef + env and SecretRef + no-env scenarios, and re-ran (1315/1315 pass).

The session prompt thread is available on request; happy to share if useful for the AI-PR-tracking maintainers do.

The externalized @openclaw/discord plugin (2026.5.2) crashes channel
startup when channels.discord.token is configured as a SecretRef object,
because normalizeDiscordToken used the strict normalizeResolvedSecretInputString
which throws when the value is an unresolved SecretRef. The channel-startup
path reads raw config (before the gateway's secret resolver has materialized
SecretRefs into a runtime snapshot), so the strict assertion deterministically
fails the channel start with:

  [discord] channel startup failed: channels.discord.token: unresolved
  SecretRef "env:default:DISCORD_BOT_TOKEN". Resolve this command against
  an active gateway runtime snapshot before reading it.

Switch to the gentle normalizeSecretInputString so unresolved SecretRefs
return undefined and let the existing fall-through chain (account-token →
top-level token → process.env.DISCORD_BOT_TOKEN → source=none) handle the
common case where the SecretRef intent matches the env fallback. Configurations
without an env fallback still surface a loud user-actionable error from
createDiscordRestClient ("Discord bot token missing for account ...") rather
than the internal SecretRef contract error.

Updates the existing throw assertions to reflect the new graceful behavior:
- token.test.ts: SecretRef + env present → returns env token (source=env);
  SecretRef + no env → returns source=none without throwing; account-level
  SecretRef respects the existing "explicit blank means no inheritance"
  semantics (source=none, no inherit from base).
- client.test.ts: createDiscordRestClient surfaces the user-actionable
  "bot token missing" error instead of the raw SecretRef contract error.

Refs openclaw#76371

AI-assisted (Claude / Sonnet 4.6, fully tested via `pnpm test:extension discord`,
1315/1315 tests passing).
@openclaw-barnacle openclaw-barnacle Bot added channel: discord Channel integration: discord size: S labels May 3, 2026
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented May 3, 2026

Codex review: needs changes before merge.

Summary
This PR changes Discord token resolution, tests, and changelog so unresolved SecretRef objects no longer throw during startup and selected top-level env refs can fall through to DISCORD_BOT_TOKEN under provider policy.

Reproducibility: yes. Current main has a high-confidence source path: raw Discord token SecretRefs reach normalizeResolvedSecretInputString, and the existing unit test asserts that unresolved SecretRefs throw; the PR body also includes deterministic 2026.5.2 startup logs.

Next step before merge
The remaining blocker is a narrow, file-local Discord token-resolution repair with clear tests and no product decision required.

Security
Cleared: The latest diff enforces provider source/default/allowlist checks before the DISCORD_BOT_TOKEN fallback, and I found no remaining secret disclosure or supply-chain concern in the patch.

Review findings

  • [P2] Resolve configured env SecretRefs before returning none — extensions/discord/src/token.ts:118-119
  • [P2] Resolve account env SecretRefs before blocking inheritance — extensions/discord/src/token.ts:75-76
Review details

Best possible solution:

Make Discord mirror Telegram-style inspected env SecretRef resolution for both account and top-level tokens, with provider/default/allowlist enforcement and the bare DISCORD_BOT_TOKEN fallback only when no config token is present.

Do we have a high-confidence way to reproduce the issue?

Yes. Current main has a high-confidence source path: raw Discord token SecretRefs reach normalizeResolvedSecretInputString, and the existing unit test asserts that unresolved SecretRefs throw; the PR body also includes deterministic 2026.5.2 startup logs.

Is this the best way to solve the issue?

No. The provider-policy gate addresses the earlier security concern, but the patch should resolve configured env SecretRefs by their own id for top-level and account tokens instead of only allowing the hard-coded DISCORD_BOT_TOKEN fallback.

Full review comments:

  • [P2] Resolve configured env SecretRefs before returning none — extensions/discord/src/token.ts:118-119
    This branch rejects any top-level env SecretRef whose id is not DISCORD_BOT_TOKEN. A documented config can point channels.discord.token at another env id, and Telegram resolves that through provider/default/allowlist policy before treating it unavailable. With the PR, that config reports a missing token even when process.env[ref.id] is set.
    Confidence: 0.87
  • [P2] Resolve account env SecretRefs before blocking inheritance — extensions/discord/src/token.ts:75-76
    Account tokens now pass through the gentle normalizer, so an account-level SecretRef returns undefined and hasAccountToken immediately returns source=none. The Discord docs show multi-bot account tokens as env SecretRefs, so a configured DISCORD_WORK_TOKEN will not start that account even when the env var is present.
    Confidence: 0.84

Overall correctness: patch is incorrect
Overall confidence: 0.86

Acceptance criteria:

  • pnpm test extensions/discord/src/token.test.ts extensions/discord/src/client.test.ts
  • pnpm test:extension discord
  • pnpm exec oxfmt --check --threads=1 extensions/discord/src/token.ts extensions/discord/src/token.test.ts extensions/discord/src/client.test.ts CHANGELOG.md
  • pnpm check:changed

What I checked:

Likely related people:

  • pashpashpash: Local blame and history show the current Discord token resolver, tests, and Discord docs in this checkout were introduced together in commit ad0e1a0. (role: introduced behavior and recent maintainer; confidence: high; commits: ad0e1a0d5d9b; files: extensions/discord/src/token.ts, extensions/discord/src/token.test.ts, extensions/discord/src/client.test.ts)
  • neeravmakwana: Authored open PR fix(discord): resolve env-backed SecretRefs in startup token path #76383 for the same linked bug with a broader Telegram-style env SecretRef resolution approach that maintainers may want to compare or fold in. (role: adjacent follow-up contributor; confidence: medium; commits: bcfec226c065; files: extensions/discord/src/token.ts, extensions/discord/src/token.test.ts, CHANGELOG.md)

Remaining risk / open question:

  • The current PR can leave documented env SecretRef configurations reporting a missing Discord token even when the configured env variable is present.
  • A competing open fix PR covers the same linked issue with different semantics, so maintainers should choose one canonical implementation before landing.

Codex review notes: model gpt-5.5, reasoning high; reviewed against dda2cf4e73b3.

Addresses Codex review feedback on the prior commit:

- [P1] Previously the gentle normalize made *every* SecretRef look absent,
  so a configured non-env SecretRef ({source:"file"}, {source:"exec"},
  or {source:"env",id:"DISCORD_WORK_TOKEN"}) at top-level config could
  silently fall through to process.env.DISCORD_BOT_TOKEN, potentially
  starting the wrong bot account when DISCORD_BOT_TOKEN happened to be
  set for a different deployment.

  Refined resolveDiscordToken to inspect the unresolved SecretRef via
  coerceSecretRef before allowing fallthrough: only refs whose intent
  matches the existing env fallback ({source:"env", id:"DISCORD_BOT_TOKEN"})
  fall through. All other SecretRef shapes preserve operator intent and
  return {token:"", source:"none"} so the user-actionable "Discord bot
  token missing" error from createDiscordRestClient surfaces cleanly
  instead of an unrelated env token being substituted.

- [P3] Added Unreleased changelog entry under the Fixes section per repo
  policy for user-facing fixes.

New test coverage:
- file-source SecretRef + DISCORD_BOT_TOKEN env present -> source=none
  (no env substitution)
- env-source SecretRef with different id (DISCORD_PROD_BOT_TOKEN) +
  DISCORD_BOT_TOKEN env present -> source=none
- exec-source SecretRef + DISCORD_BOT_TOKEN env present -> source=none

All earlier-added test cases still pass (matching env SecretRef ->
source=env; matching SecretRef + no env -> source=none; account-level
SecretRef -> no inheritance).

Refs openclaw#76371, openclaw#76385.
@mogglemoss
Copy link
Copy Markdown
Contributor Author

@clawsweeper Thanks for the review — both findings addressed in fb48a6a9 (force-push avoided; second commit on top of the original).

[P1] Preserve configured SecretRefs before using env fallback — addressed.

resolveDiscordToken now inspects the unresolved top-level token via coerceSecretRef and only falls through to the process.env.DISCORD_BOT_TOKEN fallback when the SecretRef's intent matches that env var (source: "env", id: "DISCORD_BOT_TOKEN"). Any other configured SecretRef shape — {source: "file"}, {source: "exec"}, or {source: "env", id: "DISCORD_WORK_TOKEN"} etc. — short-circuits with {token: "", source: "none"} so createDiscordRestClient surfaces the user-actionable Discord bot token missing for account ... error and the wrong bot can never come up under an unrelated env token.

provider is intentionally not constrained in the match (it's the secrets-provider alias, not the secret's identity), but source and id together are sufficient to express "this SecretRef's intent is identical to the existing env fallback's behavior."

Three new unit cases lock this in:

  • does not substitute env DISCORD_BOT_TOKEN for a non-env config SecretRef (file source)
  • does not substitute env DISCORD_BOT_TOKEN for an env SecretRef pointing at a different env var (DISCORD_PROD_BOT_TOKEN)
  • does not substitute env DISCORD_BOT_TOKEN for an exec-source SecretRef

[P3] Add the required changelog entry — addressed. Added under ## Unreleased > ### Fixes describing both the matching-env-fallback case (now works) and the non-matching case (preserves operator intent).

Re-validation: pnpm test:extension discord1318/1318 passing locally (was 1313 baseline; +5 new tests since baseline; 2 prior-throw assertions updated). Account-level SecretRef semantics unchanged (still "explicit-set means no inherit"). Channel startup verified working in the original repro environment with the matching env SecretRef.

Diff is now +81 -8 over the previous commit; total branch diff +137 -13 across 4 files (token.ts, token.test.ts, client.test.ts, CHANGELOG.md). PR body Risks section is now superseded by the inspect-then-fall-through code path — the previously-flagged risk no longer applies.

Addresses Codex review [P1] on the previous commit (validate SecretRef
providers before env fallback): the prior matchesEnvFallback gate only
checked source==="env" and id===DISCORD_BOT_TOKEN. A SecretRef wired to
a misconfigured provider (e.g. provider source mismatch, missing
non-default provider, or env provider with an allowlist that excludes
the id) could still bypass operator secret-provider policy and read
process.env.DISCORD_BOT_TOKEN, potentially starting the wrong bot
account under operator-rejected credentials.

Mirror Telegram's resolveEnvSecretRefValue policy gate
(extensions/telegram/src/token.ts) inline, returning false (which now
short-circuits to source=none) when:

  1. The provider is configured but its source is not "env".
  2. The provider is configured with an allowlist that excludes the id.
  3. The provider is not configured AND is not the runtime-resolved
     default env-provider alias.

Only refs that pass all three checks (in addition to the source/id match)
fall through to process.env. Operator policy is preserved end-to-end.

Refactored helper kept inline (extension-local) rather than promoted to
plugin-sdk per CONTRIBUTING ("refactor-only PRs ... not accepted"). If a
maintainer wants this consolidated with telegram's helper as part of a
follow-up, happy to do it as a separate change.

New tests (4):
- env SecretRef + provider source mismatch -> source=none
- env SecretRef + provider allowlist excludes id -> source=none
- env SecretRef + provider not configured and not default alias -> source=none
- env SecretRef + provider configured with matching allowlist -> source=env

Updated changelog entry to call out the provider-policy gate.

Refs openclaw#76371, openclaw#76385.
@mogglemoss
Copy link
Copy Markdown
Contributor Author

@clawsweeper Round 2 addressed in cdac1960.

[P1] Validate SecretRef providers before env fallback — implemented inline as envSecretRefMatchesProviderPolicy, mirroring extensions/telegram/src/token.ts's resolveEnvSecretRefValue policy gate (provider source / configured-or-default-alias / allowlist). The matching env fallback now requires all three of:

  1. ref.source === "env"
  2. ref.id === DISCORD_BOT_TOKEN
  3. secrets.providers[ref.provider] is either configured with source: "env" and an allowlist (if any) that includes DISCORD_BOT_TOKEN, OR is unconfigured and matches the runtime-resolved default env-provider alias.

Anything that fails any of those returns {token: "", source: "none"} and surfaces the user-actionable "Discord bot token missing" error from createDiscordRestClient. No path lets a misconfigured provider bypass operator secret-provider policy via the channel env fallback.

I kept the helper inline (extension-local) rather than promoting resolveEnvSecretRefValue to plugin-sdk and sharing it between Discord and Telegram, per CONTRIBUTING.md ("refactor-only PRs not accepted"). Happy to do that consolidation as a separate PR if a maintainer wants it as a follow-up — there's clear duplication value if more channels get externalized.

Tests (pnpm test:extension discord1322/1322 passing, +4 since previous round):

  • env SecretRef + provider source mismatch (secrets.providers.vault.source = "exec") → source=none (no env substitution).
  • env SecretRef + provider allowlist excludes the id (allowlist: ["OTHER_TOKEN"]) → source=none.
  • env SecretRef + provider not configured AND not the default alias (provider: "ghost") → source=none.
  • env SecretRef + provider configured with explicit allowlist that includes the id → source=env (positive case for an explicitly-allowlisted env provider).

Updated changelog entry to call out the provider-policy gate.

Branch diff vs upstream/main is now +229 -21 across 4 files. Three commits on the branch (initial fix, scope tightening, provider-policy gate) — happy to squash on merge.

@mogglemoss
Copy link
Copy Markdown
Contributor Author

Closing in favor of #76383 (@neeravmakwana, opened 4 minutes earlier for the same bug originally filed as #76371). Their PR uses the broader Telegram-style approach — resolves env SecretRefs by their own id after provider policy check, applied to both top-level and account tokens — which covers the docs-supported DISCORD_PERSONAL_TOKEN / DISCORD_WORK_TOKEN multi-account env-SecretRef pattern. Codex's most recent review here (cdac1960) explicitly cited #76383 as the more canonical implementation and flagged my narrower DISCORD_BOT_TOKEN-only fallback as a P2 gap.

Cleaner to consolidate on one PR than to have maintainers triage two converging implementations of the same fix. Cross-linked from the upstream issue (#76371) and from #76383 itself.

Thanks @clawsweeper for the careful three rounds of review — they tightened my approach (broad fallback → matching-id only → matching-id + provider policy gate) even though the final landing spot is to defer. The provider-policy gate I added in cdac1960 may be useful reference for the canonical fix; not making any claims on it.

@joshavant
Copy link
Copy Markdown
Contributor

Thanks @mogglemoss for the patch. The merged fix in #76449 took the broader external channel contract approach, so Discord SecretRef-backed startup tokens are resolved before the token path needs to handle unresolved refs. This PR had already been closed, and the merged fix is now in place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: discord Channel integration: discord size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] @openclaw/discord 2026.5.2 channel startup crashes on SecretRef-backed token (env:default:DISCORD_BOT_TOKEN unresolved)

2 participants