v0.26.1 fix(oauth): client_credentials tokens rejected by MCP bearer auth#577
Merged
garrytan merged 3 commits intogarrytan:masterfrom May 3, 2026
Merged
Conversation
added 3 commits
May 3, 2026 07:14
Three bugs found in production when connecting Claude Code via Tailscale:
1. Token validation fails with 'Token has no expiration time'
- Root cause: postgres driver with prepare:false returns expires_at as
string, but MCP SDK's bearerAuth middleware checks typeof === 'number'
- Fix: Number(row.expires_at) in verifyAccessToken
2. OAuth metadata missing client_credentials grant type
- Root cause: MCP SDK hardcodes ['authorization_code', 'refresh_token']
in mcpAuthRouter's .well-known endpoint
- Fix: middleware intercepts metadata response and appends
'client_credentials' before it reaches the client
- Claude Code's native OAuth auto-discovery now finds the CC flow
3. Express 5 compatibility fixes
- trust proxy: 'loopback' for reverse proxy deployments (Caddy/Tailscale)
without this, express-rate-limit throws ERR_ERL_UNEXPECTED_X_FORWARDED_FOR
- /admin/* wildcard → /admin/{*path} (Express 5 named param syntax)
Unit test (oauth.test.ts): - expiresAt is always a number, not string — SDK bearerAuth compat Integration tests (serve-http-oauth.test.ts, 7 cases): - client_credentials token accepted at /mcp (the actual regression) - token expires_in matches server TTL - OAuth metadata includes client_credentials grant type - token endpoint discoverable from metadata - admin dashboard serves SPA (Express 5 wildcard fix) - X-Forwarded-For doesn't crash rate limiter (trust proxy fix) - read-only token cannot call write operations (scope enforcement) 42 tests, 0 failures, 172 assertions.
Spins up a real gbrain serve --http against real Postgres, registers an OAuth client, mints tokens via client_credentials, and exercises the full MCP JSON-RPC pipeline end-to-end. E2E cases (test/e2e/serve-http-oauth.test.ts): - mint token via client_credentials grant - minted token accepted at /mcp — tools/list returns tools - minted token works for tools/call — search executes - expired/invalid token rejected at /mcp - missing Authorization header returns 401 - OAuth metadata includes all three grant types - OAuth metadata issuer matches public URL - admin dashboard serves SPA (Express 5 wildcard fix) - admin sub-routes serve SPA fallback - X-Forwarded-For doesn't crash rate limiter - read-only token rejected for write operations - write-scoped token can call read operations - health endpoint works without auth - multiple tokens work independently - wrong client_secret rejected at token endpoint Unit test addition (test/oauth.test.ts): - expiresAt is always typeof number (SDK bearerAuth compat) Total: 50 tests, 0 failures, 201 assertions.
7 tasks
garrytan
added a commit
that referenced
this pull request
May 3, 2026
…ug class (#593) * feat(oauth): add coerceTimestamp helper + fix BIGINT-as-string bug class Postgres-js with prepare:false (auto-detected on Supabase pooler / port 6543) returns BIGINT columns as strings. Two surfaces broke on this: (1) MCP SDK's bearerAuth checks typeof === 'number' and rejected strings — fixed in v0.26.1 only at line 303 of oauth-provider.ts; (2) RFC 7591 §3.2.1 requires client_id_issued_at and client_secret_expires_at to be JSON numbers in DCR responses, not strings — latent until v0.26.2. Adds module-private coerceTimestamp() at the SELECT-row → JS-number boundary. Throws on non-finite (corrupt rows fail loud, not as fake-valid expiresAt: NaN flowing into the SDK). Returns undefined for SQL NULL — schema permits NULL on oauth_tokens.expires_at, callers treat NULL as expired (fail-closed) at comparison sites and preserve undefined in DCR getClient response per RFC 7591. Refactors 5 sites: - L112,113 (getClient) — DCR response numeric-shape compliance. - L274 (exchangeRefreshToken) — NULL→expired fail-closed contract. - L296,303 (verifyAccessToken) — single guard, narrowed return. No `!` non-null assertions: all 5 sites read nullable BIGINT columns per src/schema.sql:362,363,372. The L296/L303 cleanup also folds in v0.26.1's inline Number(...) at L303. * feat(auth): add gbrain auth revoke-client subcommand Hard-deletes the matching oauth_clients row via atomic DELETE ... RETURNING. Schema-level FK CASCADE on oauth_tokens.client_id and oauth_codes.client_id (src/schema.sql:370,382) purges all dependent rows in the same transaction. No manual delete of dependents needed. Exit 1 on no-such-client (idempotent: re-running on the same id produces the same error). Operator-friendly output: prints the client name + cascade confirmation, no race-prone pre-delete count. Closes the v0.26.1 process miss where test/e2e/serve-http-oauth.test.ts afterAll already called this subcommand — silently failing because the subcommand didn't exist. With this fix, E2E cleanup actually purges test clients. * test(oauth): v0.26.2 regression coverage + bun execSync env fix Unit additions in test/oauth.test.ts: - 5 cases pinning coerceTimestamp contract (null/undef/string/number/ throws-on-NaN). The throws-on-NaN case is load-bearing: pre-v0.26.2 Number(corrupt) → NaN, NaN < now is false → expired check skipped, fake-valid expiresAt:NaN flowed to SDK. Now fail-closed. - NULL expires_at on oauth_tokens insert → verifyAccessToken throws "Token expired". Schema permits NULL; pre-v0.26.2 hand-modified rows could ride past validation. - Cascade-deleted client → previously-minted token fails verifyAccessToken with "Invalid token" (not "expired"). Pins the cascade contract independently of the CLI subprocess path. E2E additions in test/e2e/serve-http-oauth.test.ts: - DCR /register HTTP-level response-shape test. Spawns server with --enable-dcr, POSTs a client manifest, asserts typeof === 'number' on client_id_issued_at and (when present) client_secret_expires_at per RFC 7591 §3.2.1. Replaces the v0.26.1 plan's internal-store-only test that Codex flagged as the wrong seam. - Real CLI subprocess test for revoke-client: register → mint token → revoke via execSync → assert token rejected at /mcp + cascade invalidation visible + re-run exits 1 with "No client found". - afterAll guards on clientId so pre-registration beforeAll failures surface cleanly instead of throwing on undefined during cleanup. Also tracks DCR-registered clients alongside the manual one. - Server fixture: --enable-dcr added so /register is reachable. - Health endpoint: page_count assertion loosened from > 0 to >= 0 + typeof number — pre-v0.26.2 broke on fresh-schema E2E runs. bun execSync env-inheritance fix (the load-bearing infrastructure fix that unbroke v0.26.2's full-suite test): - bun's child_process.execSync does NOT inherit env mutations done via process.env.X = ...; only OS-level env from before bun started. - helpers.ts loads .env.testing and sets DATABASE_URL via process.env mutation, invisible to subprocesses unless env: { ...process.env } is passed explicitly. - All 4 execSync calls in this file (beforeAll register-client, afterAll revoke-client, in-test register-client, in-test revoke-client x2) now pass env: { ...process.env }. - Without this, full bun test suite OAuth E2E fails with "Set DATABASE_URL or GBRAIN_DATABASE_URL environment variable" even when isolated test/e2e/serve-http-oauth.test.ts runs pass. Pattern is documented inline as a reference for other E2E test fixes (see TODOS.md "test infra (v0.26.2 follow-up)" for the 22-test backlog). * build: commit admin/dist + remove gitignore exclusion CLAUDE.md (admin/ section, v0.26.0 release notes) states: "output at admin/dist/ is committed for self-contained binaries" But .gitignore excluded admin/dist/, so the bun --compile binary that embeds the admin SPA via `import path from '...' with { type: 'file' }` couldn't resolve in fresh clones. PR #577 (v0.26.1) didn't catch this because admin tests pass when admin/dist exists locally. Removes the .gitignore line + commits the current 220KB build: - index.html (0.7KB) - assets/index-{hash}.js (210KB / 65KB gzip) - assets/index-{hash}.css (6.3KB / 1.8KB gzip) Now `bun build --compile --outfile bin/gbrain src/cli.ts` works on a fresh clone without a separate `cd admin && bun install && bun run build` step in CI. * docs: capturing test output rule + regen llms-full.txt Adds a CLAUDE.md section "Capturing test output (NEVER pipe through tail / head)" documenting the iron rule that bit v0.26.2's ship: bun test 2>&1 | tail -10 → exit code = tail's (always 0), failures truncated, ship gates fail open The pipe form silently breaks /ship Step T1 (test failure ownership triage) because $? after a pipe is the LAST command's exit code, and bun prints failure details before the summary line so tail -N drops them. v0.26.2's first ship attempt reported "3911 pass / 23 fail" but no failure details survived, forcing a 23-minute re-run to triage. Right pattern: redirect to a file first, then tail the file separately. Regenerates llms-full.txt to match the new CLAUDE.md content (drift guard at test/build-llms.test.ts enforces this). * docs: P0 TODO for 22 pre-existing test failures unrelated to OAuth Captures the test-infra backlog uncovered by v0.26.2's full bun test run. None of the 22 failing cases touch the OAuth diff: - 12 Git-to-DB Sync Pipeline cases (state-machine drift) - 3 multi-source cascade + sync routing cases - E2E sync-parallel, sync --skip-failed, doctor, dream, runCycle, claw-test fresh-install, BrainRegistry lazy init Likely root causes for several: same bun execSync env-inheritance pattern fixed in test/e2e/serve-http-oauth.test.ts during v0.26.2 (documented in the TODO + the inline test comment for the next maintainer to find). Separating from v0.26.2 keeps the OAuth ship focused on the bug class it was scoped for. Fix-wave deserves its own PR. * chore: bump to v0.26.2 + CHANGELOG VERSION 0.26.0 → 0.26.2. Includes a retroactive v0.26.1 entry above v0.26.0 because PR #577 shipped its three fixes (oauth-provider:303 Number cast, OAuth metadata interceptor, Express 5 trust proxy + admin wildcard) without bumping VERSION/package.json/CHANGELOG — this branch catches the changelog up to commit history. v0.26.2 release-summary covers the OAuth string-vs-number bug class fix (5 sites + coerceTimestamp helper), the gbrain auth revoke-client subcommand landing as a real CLI, and the bun execSync env-inheritance fix that unblocked full-suite E2E OAuth tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: post-ship updates for v0.26.2 - CLAUDE.md src/core/oauth-provider.ts: append v0.26.2 coerceTimestamp boundary helper note (5 call sites, NULL semantics, throw-on-NaN posture, intentionally module-private) - CLAUDE.md src/commands/auth.ts: add v0.26.2 revoke-client subcommand with FK CASCADE cleanup - CLAUDE.md test/oauth.test.ts: bump v0.26.2 case additions (5 coerceTimestamp + NULL-expires_at + cascade-delete contract) - CLAUDE.md test/e2e/serve-http-oauth.test.ts: new entry covering v0.26.0 + v0.26.2 expansion (DCR HTTP-level test, CLI subprocess revoke-client test, bun execSync env-inheritance fix as reference for sibling E2Es) - README.md: add gbrain auth revoke-client to command list - llms-full.txt: regenerate after CLAUDE.md edits Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
garrytan
added a commit
that referenced
this pull request
May 3, 2026
Master added v0.26.2 (#593) and v0.26.1 (#577) entries via the OAuth fix-wave landing path. Resolved three conflicts: - VERSION: kept 0.26.4 (this branch's version) - package.json: kept 0.26.4 - CHANGELOG.md: kept both — v0.26.4 entry on top, then master's new v0.26.2 + v0.26.1 entries below. Final order: 0.26.4 → 0.26.2 → 0.26.1 → 0.26.0 (top to bottom, contiguous version sequence). Also fixed a banned-name reference master shipped in the v0.26.1 CHANGELOG credit line ("Co-authored by Wintermute" → "Co-authored by your OpenClaw") per CLAUDE.md:550 / scripts/check-privacy.sh. Regenerated llms-full.txt to match the merged CHANGELOG. Verified post-merge: typecheck clean, schema-diff unit tests 17/17 pass. Master's diff did not touch src/schema.sql, src/core/pglite-schema.ts, or src/core/migrate.ts, so the new schema-drift gate's behavior is unchanged.
7 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
v0.26.1 fix: client_credentials OAuth tokens rejected by MCP bearer auth
The bug
Every token minted via
client_credentialsgrant is rejected at/mcpwith:Token issuance works (returns
expires_in: 3600), but the MCP SDK bearer auth middleware checkstypeof authInfo.expiresAt !== 'number'and gets'string'because thepostgresdriver withprepare: falsereturns all integer columns as strings.The fix (3 bugs, 22 lines)
Number(row.expires_at)inverifyAccessToken— cast string to number before returning to SDKgrant_types_supported: ['authorization_code', 'refresh_token']. Middleware intercepts the.well-knownresponse and appends'client_credentials'so RFC-conformant clients (Claude Code, Cursor) auto-discover the flow.trust proxy: 'loopback'(reverse proxy crashes express-rate-limit without it),/admin/{*path}(Express 5 named wildcard syntax)Reproduction
Context
Found in production connecting Claude Code to
gbrain serve --httpbehind a Caddy reverse proxy.Need help on this PR? Tag
@codesmithwith what you need.