Skip to content

fix(msteams): surface network errors blocking bot JWT validation and outbound replies (#77674)#78081

Merged
BradGroux merged 2 commits intoopenclaw:mainfrom
Beandon13:fix/openclaw-77674-teams-silent-failures
May 6, 2026
Merged

fix(msteams): surface network errors blocking bot JWT validation and outbound replies (#77674)#78081
BradGroux merged 2 commits intoopenclaw:mainfrom
Beandon13:fix/openclaw-77674-teams-silent-failures

Conversation

@Beandon13
Copy link
Copy Markdown
Contributor

Summary

  • JWT validator (sdk.ts) previously swallowed all JWKS fetch errors and returned false, making firewall blocks to login.botframework.com look identical to a bad credential — now rethrows network errors (ECONNREFUSED, ENOTFOUND, etc.) so the JWT middleware can log them at runtime.error level with an actionable message pointing to the blocked host
  • errors.ts: new "network" kind in classifyMSTeamsSendError for transport-level failures; formatMSTeamsSendErrorHint now returns a hint pointing to smba.trafficmanager.net and egress firewall rules for network errors
  • monitor.ts: allowlist resolution failures upgraded from optional runtime.log?. (silent if hook absent) to runtime.error (always logged); JWKS network errors caught separately and logged at error level
  • monitor-handler.ts, message-handler.ts: remove spurious ?. from runtime.error calls (RuntimeEnv.error is a required field, the optional chaining was silently discarding handler failures when the runtime wrapper had an unexpected shape)

Closes #77674

Testing

  • pnpm vitest run extensions/msteams/src/errors.test.ts
  • pnpm vitest run extensions/msteams/src/sdk.test.ts

Real behavior proof

  • Behavior: Teams bot fails silently when network paths to login.botframework.com or smba.trafficmanager.net are blocked — errors are swallowed and logs don't help distinguish firewall block from bad credentials
  • Tested via targeted unit tests added in this PR — see Testing section above.
  • What was not tested: live runtime — please apply maintainer proof: override or advise on evidence format.

@openclaw-barnacle openclaw-barnacle Bot added channel: msteams Channel integration: msteams size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 5, 2026
@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented May 5, 2026

Codex review: needs changes before merge.

Summary
The PR changes the MS Teams plugin to rethrow Bot Framework JWKS network failures, add Connector network send hints, make several runtime.error calls required, add targeted tests, and update the changelog.

Reproducibility: yes. by source inspection. Current main catches all JWKS lookup and JWT verification failures as false, then the middleware logs only debug before 401; I did not run a live firewall-blocked Teams setup.

Real behavior proof
Override: Maintainer proof: override is present, with comments documenting source-level verification and targeted tests on the prepared head, so no contributor proof action remains for this gate.

Next step before merge
A narrow automated repair can align the middleware logging predicate with the validator rethrow predicate and add focused regression coverage.

Security
Cleared: The diff stays within Teams plugin diagnostics/tests and changelog, adds no dependency, workflow, package-resolution, secret-handling, or code-execution surface, and preserves strict JWT rejection.

Review findings

  • [P2] Log every rethrown JWKS fetch failure visibly — extensions/msteams/src/monitor.ts:310-312
Review details

Best possible solution:

Keep the Teams-plugin-local fix, but make the middleware log every JWKS fetch error the validator intentionally rethrows while preserving strict 401 rejection and the new Connector network hint.

Do we have a high-confidence way to reproduce the issue?

Yes by source inspection. Current main catches all JWKS lookup and JWT verification failures as false, then the middleware logs only debug before 401; I did not run a live firewall-blocked Teams setup.

Is this the best way to solve the issue?

No, not quite. The plugin-local direction is the narrow maintainable fix, but monitor.ts should share the validator's network predicate or treat all validator-rethrown JWKS failures as the actionable egress error.

Full review comments:

  • [P2] Log every rethrown JWKS fetch failure visibly — extensions/msteams/src/monitor.ts:310-312
    isJwksNetworkError rethrows message-matched JWKS/key-fetch/connect failures as network errors, but this catch only recognizes the five errno strings. A validator-rethrown error without one of those exact codes still falls through to log.debug and a generic 401, so the operator can hit the same silent blocked-egress diagnosis this PR is meant to fix. Share the same predicate or log all validator rejections from this path as the JWKS egress failure.
    Confidence: 0.84

Overall correctness: patch is incorrect
Overall confidence: 0.84

Acceptance criteria:

  • pnpm test extensions/msteams/src/errors.test.ts extensions/msteams/src/sdk.test.ts extensions/msteams/src/monitor.lifecycle.test.ts
  • pnpm exec oxfmt --check --threads=1 CHANGELOG.md extensions/msteams/src/errors.ts extensions/msteams/src/errors.test.ts extensions/msteams/src/monitor.ts extensions/msteams/src/monitor-handler.ts extensions/msteams/src/monitor-handler/message-handler.ts extensions/msteams/src/sdk.ts extensions/msteams/src/sdk.test.ts extensions/msteams/src/monitor.lifecycle.test.ts
  • pnpm check:changed

What I checked:

  • Current main collapses JWKS failures: createBotFrameworkJwtValidator currently catches every getSigningKey or jwt.verify exception and returns false, so a blocked JWKS fetch is indistinguishable from an invalid token at the validator boundary. (extensions/msteams/src/sdk.ts:879, f35fb7288a70)
  • Current main logs validator failures at debug: The Teams webhook middleware currently logs validator rejections only through log.debug before returning 401, matching the linked issue's report that blocked login.botframework.com looks like a generic credential failure at normal log levels. (extensions/msteams/src/monitor.ts:302, f35fb7288a70)
  • PR head leaves predicate drift: The PR's validator helper rethrows errno-coded failures plus message-matched JWKS/key-fetch/connect errors, but monitor.ts only treats the five errno strings as visible network failures, leaving message-matched rethrows on the debug-only path. (extensions/msteams/src/monitor.ts:310, ad42d65f9088)
  • Dependency contract checked: jwks-rsa@4.0.1 uses http(s).request; request-level errors are rejected directly, while HTTP response errors are wrapped as JwksError, so errno-coded network errors can reach this validator as original Node errors. (extensions/msteams/package.json:16, f35fb7288a70)
  • Proof override and checks: The PR has proof: override; maintainer comments document source-level before/after behavior and targeted tests on prepared heads, and public check-runs for ad42d65f9088e484bc29ef4a21733a593edc0bd2 showed no failures or pending runs across both check-run pages. (ad42d65f9088)

Likely related people:

  • steipete: Current blame for the validator, middleware, and send-error classifier lines points to Peter Steinberger, and recent API history shows repeated Teams auth and helper refactors in these files. (role: recent maintainer / current-line owner; confidence: high; commits: ea391c6df280, 3f002b10d281, b3bc60ae259b; files: extensions/msteams/src/sdk.ts, extensions/msteams/src/monitor.ts, extensions/msteams/src/errors.ts)
  • BradGroux: BradGroux rebased and prepared this PR head with the proof override, and recent Teams history includes prior error-logging and Teams behavior work. (role: prepared-head maintainer / adjacent Teams logging maintainer; confidence: medium; commits: ad42d65f9088, 03c64df39fe7, fce81fccd859; files: CHANGELOG.md, extensions/msteams/src/sdk.ts, extensions/msteams/src/monitor.ts)
  • vincentkoc: Recent history shows Vincent Koc working on Bot Framework audience binding and Teams SDK import/runtime paths adjacent to the validator touched here. (role: adjacent Teams JWT/auth owner; confidence: medium; commits: e1840b858125, 3cc83cb81ec3, 5e78c8bc95d8; files: extensions/msteams/src/sdk.ts, extensions/msteams/src/monitor.ts)

Remaining risk / open question:

  • No independent live blocked-egress Teams run was performed in this read-only review; the external-PR proof gate is covered by maintainer override and source/test evidence instead.

Codex review notes: model gpt-5.5, reasoning high; reviewed against f35fb7288a70.

@BradGroux BradGroux force-pushed the fix/openclaw-77674-teams-silent-failures branch from 573de2f to 50e2931 Compare May 6, 2026 02:31
@BradGroux
Copy link
Copy Markdown
Member

Maintainer prep update for #78081 / #77674:

  • Rebasing completed onto current main at 736f627fb5.
  • Added the missing active-version changelog entry.
  • Kept the Teams-plugin-local implementation direction intact: JWKS network fetch failures now bubble to the webhook middleware for visible logging, outbound Bot Connector transport failures classify as network with an smba.trafficmanager.net egress hint, and required runtime.error calls are no longer optional.

proof: maintainer override. I reviewed the source-level behavior and targeted tests for the reported blocked-egress paths. The before/after is deterministic in code: getSigningKey network errors used to collapse to false and generic 401 handling; the prepared head now rethrows known network errors so monitor.ts logs an actionable login.botframework.com:443 egress message. Connector send failures with no HTTP status and network errno codes now classify as network and produce an smba.trafficmanager.net egress hint. This preserves strict JWT rejection and does not log tokens/secrets.

Local verification on prepared head 50e293185aee1a81198c338450d85adb4a7896f2:

  • pnpm install --frozen-lockfile
  • pnpm test extensions/msteams/src/errors.test.ts extensions/msteams/src/sdk.test.ts extensions/msteams/src/monitor.lifecycle.test.ts (50 tests passed)
  • pnpm exec oxfmt --check --threads=1 CHANGELOG.md extensions/msteams/src/errors.ts extensions/msteams/src/errors.test.ts extensions/msteams/src/monitor.ts extensions/msteams/src/monitor-handler.ts extensions/msteams/src/monitor-handler/message-handler.ts extensions/msteams/src/sdk.ts extensions/msteams/src/sdk.test.ts
  • pnpm check:changed
  • git diff --check

Fresh CI is queued/running on the pushed head now; I will merge only after required checks are green.

@BradGroux BradGroux added proof: override Maintainer override for the external PR real behavior proof gate. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 6, 2026
@BradGroux BradGroux force-pushed the fix/openclaw-77674-teams-silent-failures branch from 50e2931 to f374344 Compare May 6, 2026 02:36
@BradGroux
Copy link
Copy Markdown
Member

Maintainer refresh: added proof: override, rebased onto current main, and pushed prepared head f374344f3b7dfb1a316c8f08ac357db10a777de4. Focused verification after the rebase: pnpm test extensions/msteams/src/errors.test.ts extensions/msteams/src/sdk.test.ts extensions/msteams/src/monitor.lifecycle.test.ts (50 passed) and pnpm exec oxfmt --check --threads=1 CHANGELOG.md extensions/msteams/src/errors.ts extensions/msteams/src/errors.test.ts extensions/msteams/src/monitor.ts extensions/msteams/src/monitor-handler.ts extensions/msteams/src/monitor-handler/message-handler.ts extensions/msteams/src/sdk.ts extensions/msteams/src/sdk.test.ts. Fresh CI is running with the override label present.

Beandon13 and others added 2 commits May 5, 2026 22:57
…n and outbound replies (openclaw#77674)

When login.botframework.com or smba.trafficmanager.net egress is blocked,
errors previously disappeared completely. JWT validator swallowed network
errors and returned false (401 looked identical to a bad credential), and
outbound send failures with transport-level codes had no hint pointing to
the Connector endpoint.

- sdk.ts: rethrow ECONNREFUSED/ENOTFOUND/EHOSTUNREACH/ETIMEDOUT/ECONNRESET
  from the JWKS key fetch so callers can distinguish firewall blocks from bad
  credentials; add isJwksNetworkError() helper
- monitor.ts: catch rethrown network errors in JWT middleware and log at
  runtime.error level with an actionable message pointing to
  login.botframework.com:443; upgrade allowlist resolution failures from
  runtime.log (optional/silent) to runtime.error
- errors.ts: add "network" kind to classifyMSTeamsSendError for transport-level
  errors (ECONNREFUSED, ENOTFOUND, etc.); add formatMSTeamsSendErrorHint for
  "network" kind pointing to smba.trafficmanager.net and egress rules
- monitor-handler.ts, message-handler.ts: remove spurious ?. from runtime.error
  calls (RuntimeEnv.error is a required non-optional field)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@BradGroux BradGroux force-pushed the fix/openclaw-77674-teams-silent-failures branch from f374344 to ad42d65 Compare May 6, 2026 03:57
@BradGroux
Copy link
Copy Markdown
Member

Maintainer refresh: rebased onto current main after #77907 landed and pushed prepared head ad42d65f9088e484bc29ef4a21733a593edc0bd2. Focused verification after the rebase: pnpm test extensions/msteams/src/errors.test.ts extensions/msteams/src/sdk.test.ts extensions/msteams/src/monitor.lifecycle.test.ts (50 passed) and pnpm exec oxfmt --check --threads=1 CHANGELOG.md extensions/msteams/src/errors.ts extensions/msteams/src/errors.test.ts extensions/msteams/src/monitor.ts extensions/msteams/src/monitor-handler.ts extensions/msteams/src/monitor-handler/message-handler.ts extensions/msteams/src/sdk.ts extensions/msteams/src/sdk.test.ts. Fresh CI is running.

@BradGroux BradGroux merged commit eecda91 into openclaw:main May 6, 2026
103 checks passed
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
…outbound replies (openclaw#77674) (openclaw#78081)

* fix(msteams): surface network errors blocking Teams bot JWT validation and outbound replies (openclaw#77674)

When login.botframework.com or smba.trafficmanager.net egress is blocked,
errors previously disappeared completely. JWT validator swallowed network
errors and returned false (401 looked identical to a bad credential), and
outbound send failures with transport-level codes had no hint pointing to
the Connector endpoint.

- sdk.ts: rethrow ECONNREFUSED/ENOTFOUND/EHOSTUNREACH/ETIMEDOUT/ECONNRESET
  from the JWKS key fetch so callers can distinguish firewall blocks from bad
  credentials; add isJwksNetworkError() helper
- monitor.ts: catch rethrown network errors in JWT middleware and log at
  runtime.error level with an actionable message pointing to
  login.botframework.com:443; upgrade allowlist resolution failures from
  runtime.log (optional/silent) to runtime.error
- errors.ts: add "network" kind to classifyMSTeamsSendError for transport-level
  errors (ECONNREFUSED, ENOTFOUND, etc.); add formatMSTeamsSendErrorHint for
  "network" kind pointing to smba.trafficmanager.net and egress rules
- monitor-handler.ts, message-handler.ts: remove spurious ?. from runtime.error
  calls (RuntimeEnv.error is a required non-optional field)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(msteams): surface blocked botframework egress

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Brad Groux <3053586+BradGroux@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: msteams Channel integration: msteams proof: override Maintainer override for the external PR real behavior proof gate. size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: MS Teams bot fails silently when network paths are blocked — errors are swallowed and logs don't help

2 participants