JWT-only OS auth: no auth-worker roundtrips on page loads #1408
Merged
Authenticate normal OS requests purely from signed JWT claims so the OS worker never calls the auth worker on page loads, including cold isolate starts:

- OS middleware authenticates with includeUserInfo: false; session data (user, orgs, projects) comes from access/ID token claims only.
- Auth SDK accepts a static JWKS (ITERATE_AUTH_JWKS / APP_CONFIG_ITERATE_AUTH__JWKS) and uses createLocalJWKSet when configured, eliminating the remote JWKS fetch on first verification.
- Access-token org claims now carry organization names, so sessions no longer need userinfo for display names.
- Deleted the active/current-organization context entirely: route auth validates org membership synchronously from session claims, and oRPC uses an authenticated-user middleware plus signed principal/project claims for project authorization.

The only remaining OS -> auth-worker call is explicit project creation (projects-capability), an intentional mutation boundary. Token refresh (every ~5 min) still talks to auth by design.

Verified on preview_3: authenticated /projects load and reload with zero browser or worker requests to auth.iterate.com and zero /api/iterate-auth/session calls.

ITERATE_AUTH_JWKS is set in the os Doppler prd/preview root configs and personal dev configs (deploy-time JWKS fetch is a follow-up task; dev_localhost intentionally skipped as it signs with local keys).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
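As a rough illustration of the synchronous membership check described in the commit message above, the sketch below validates an organization slug against claims that are already part of the verified session. The `SessionClaims` shape, the `organizations` claim name, and `assertOrgMembership` are assumptions made for this example; the repository's actual claim names and types are not shown in this PR.

```typescript
// Hypothetical claim shape: the real token's claim names are not shown in this PR.
interface OrgClaim {
  slug: string;
  name: string; // org names ride along in the access token, so no userinfo call is needed
}

interface SessionClaims {
  sub: string;
  organizations: OrgClaim[];
}

class ForbiddenError extends Error {}

// Purely synchronous: reads only the verified claims and never calls the auth worker.
function assertOrgMembership(claims: SessionClaims, organizationSlug: string): OrgClaim {
  const org = claims.organizations.find((candidate) => candidate.slug === organizationSlug);
  if (!org) {
    throw new ForbiddenError(`user ${claims.sub} is not a member of ${organizationSlug}`);
  }
  return org;
}

// Example session built from token claims only.
const exampleClaims: SessionClaims = {
  sub: "user_1",
  organizations: [{ slug: "acme", name: "Acme Inc" }],
};
const exampleOrg = assertOrgMembership(exampleClaims, "acme");
```

Because the check only reads data carried in the signed token, it can run inside a route guard without awaiting anything; the tradeoff is that membership changes are only visible after the next token refresh.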
# Conflicts:
# apps/os/src/components/app-sidebar.tsx
# apps/os/src/lib/auth.ts
# apps/os/src/routes/_app/org/$organizationSlug/index.tsx
# apps/os/src/routes/index.tsx
# apps/os/src/routes/organization.tsx
Project creation previously failed with BAD_REQUEST for users whose signed session lists more than one organization. The create input now accepts an optional organizationSlug, validated against the signed org claims; single-org users keep the implicit default. The create-project form shows an organization picker when the session has multiple orgs. Also note the claims-staleness-after-mutation follow-up (new projects absent from claims-based reads until token refresh) in the refresh task. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
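The organization-selection rule in this commit message can be sketched as a small pure function. This is a minimal sketch, assuming the signed org claims are available as a list of slugs; `resolveCreateProjectOrg`, the result type, and the error strings are hypothetical names for illustration, not the repository's code.

```typescript
// Hypothetical helper: names and error strings are illustrative only.
type ResolveResult =
  | { ok: true; organizationSlug: string }
  | { ok: false; reason: string };

function resolveCreateProjectOrg(
  signedOrgSlugs: readonly string[],
  requestedSlug?: string,
): ResolveResult {
  if (requestedSlug !== undefined) {
    // An explicit choice must match one of the signed org claims.
    return signedOrgSlugs.some((slug) => slug === requestedSlug)
      ? { ok: true, organizationSlug: requestedSlug }
      : { ok: false, reason: "organizationSlug is not in the signed org claims" };
  }
  const [onlySlug] = signedOrgSlugs;
  if (signedOrgSlugs.length === 1 && onlySlug !== undefined) {
    // Single-org users keep the implicit default.
    return { ok: true, organizationSlug: onlySlug };
  }
  return {
    ok: false,
    reason:
      signedOrgSlugs.length === 0
        ? "session lists no organizations"
        : "organizationSlug is required when the session lists multiple organizations",
  };
}
```

Returning a result object rather than throwing keeps the rule easy to reuse in both the create-project form (to decide whether to show the picker) and the server-side input validation.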
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 5edf1c0.
Root beforeLoad read the session from getGlobalStartContext(), which only exists during SSR. Client-side navigations recomputed the route context without it, so every guarded route redirected to /sign-in once you navigated in the SPA. The SSR pass now seeds an auth snapshot (session, issuer, project-host slug) into the query cache — which TanStack Start already dehydrates to the client — and client navigations reuse it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Aligns the session propagation with TanStack Start's first-party authentication pattern (server function awaited in root beforeLoad, result in router context) while keeping zero-roundtrip navigations: the snapshot query has infinite staleTime, so SSR seeds it once via the dehydrated query cache and client-side navigations reuse it. Unlike the previous hand-rolled cache fallback, a cache miss now fetches the session from the OS worker (still no auth-worker roundtrip) instead of silently treating the user as signed out. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
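The snapshot behavior described in the two commit messages above (seed once during SSR, reuse on client navigations, fetch on a cache miss) can be approximated without any library. The sketch below is a dependency-free stand-in for a query-cache entry with infinite stale time; it is not TanStack Query's API, and `AuthSnapshot` and `createSnapshotCache` are hypothetical names.

```typescript
// Hypothetical shape; the commit message lists session, issuer, and project-host slug.
interface AuthSnapshot {
  session: { userId: string } | null;
  issuer: string;
  projectHostSlug: string | null;
}

// Minimal stand-in for a cache entry that never goes stale:
// once a value is present it is never refetched.
function createSnapshotCache(fetchSnapshot: () => Promise<AuthSnapshot>) {
  let value: AuthSnapshot | undefined;
  let inFlight: Promise<AuthSnapshot> | undefined;

  return {
    // The SSR pass (or hydration of the dehydrated cache) seeds the snapshot once.
    seed(snapshot: AuthSnapshot): void {
      value = snapshot;
    },
    // Called by the root route guard on every navigation.
    ensure(): Promise<AuthSnapshot> {
      if (value !== undefined) return Promise.resolve(value); // cache hit: no request at all
      if (!inFlight) {
        // Cache miss: ask the OS worker instead of assuming the user is signed out.
        inFlight = fetchSnapshot().then(
          (snapshot) => {
            value = snapshot;
            inFlight = undefined;
            return snapshot;
          },
          (error) => {
            inFlight = undefined; // allow a later navigation to retry
            throw error;
          },
        );
      }
      return inFlight;
    },
  };
}
```

The important property is the miss path: a guard that awaits `ensure()` gets a real answer from the OS worker, so a missing cache entry is no longer indistinguishable from a signed-out user.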
jonastemplestein added a commit that referenced this pull request Jun 10, 2026
…1410)

Follow-up to #1408. Three reliability fixes on the OS/auth path, one branch.

## 1. ~5-minute logout (root cause from library source)

`@better-auth/oauth-provider`'s refresh-token grant **rotates** the refresh token on every use and, on reuse of an already-rotated token, **revokes the entire token family** (`handleRefreshTokenGrant` → `createRefreshToken` marks the old token `revoked`, and a later reuse deletes all of the user+client's refresh tokens). A normal OS page load fires several concurrent requests; once the 5-minute access token was within the 30s refresh skew, each request independently hit the token endpoint with the same cookie token: the first rotated it, the rest looked like theft and nuked the session → logout, repeating roughly every 5 minutes.

Fixes (`apps/auth/src/lib/server.ts`, the SDK OS bundles):

- **Single-flight refresh** per refresh token: concurrent refreshes collapse to one token-endpoint call (`createSingleFlight`, extracted + unit-tested).
- **Never refresh on WebSocket upgrades**: an upgrade response can't carry `Set-Cookie`, so refreshing there would rotate the token into a response the browser can't store and strand the session (this is also the **REPL websocket failure**: once the access token went stale, the capnweb upgrade tried to refresh, failed, and 401'd).
- **Tolerate a failed refresh while the access token is still valid**: serve the request and let a later one retry, instead of dropping the session on any hiccup.
- **Access-token TTL 5m → 30m** (`auth-plugins.ts`) so refresh is rare. Tradeoff: org/project claim changes propagate within ≤30m (mitigated for the creator by client cache seeding).

## 2. Deploy-time JWKS (`apps/os/alchemy.run.ts`)

#1408 verified JWTs locally from a static JWKS but relied on a hand-set Doppler secret. Now the alchemy script **fetches `<issuer>/jwks` at deploy time** into `APP_CONFIG` (typesafe), so key rotation only needs an OS redeploy. A loopback issuer (local dev auth, own keys) skips the static JWKS; a failed fetch falls back to runtime JWKS. **Verified**: preview-5 deploy log shows the JWKS baked into config and the worker healthy with zero auth-worker roundtrips.

## 3. Stream append skeleton flash (`apps/os/.../project-stream-view.tsx`)

On append the virtualized window shifted (the list grows and the view force-scrolled to the bottom), which re-created the visible-range SQL query. `stream-browser-db.query()` seeds a new range query as `pending` carrying a *different* range's rows, so `rowsByIndex` missed the visible indices and every visible row blanked to a grey `bg-slate-100` skeleton for a frame (the "skeleton flash + all rows redraw"). Fixes: retain the last committed rows across range re-queries (only genuinely-new indices fall back to a skeleton), and only auto-scroll to the bottom when already pinned there (don't yank a scrolled-up reader and trigger a full-window re-query).

## Proof status (being precise)

- **Refresh single-flight**: proven by `apps/os/src/auth/iterate-auth-single-flight.test.ts` (deterministic) + root cause read from the oauth-provider source. The end-to-end "wait 5 minutes in a browser" proof was **not** completed: production auth is Google-only (no headless sign-in) and doesn't honor service-token impersonation at the public `oauth2/authorize`, and the fixed local stack now issues 30m tokens. Worth a manual 5-min check after this deploys to prod auth.
- **Deploy-time JWKS**: verified on a real preview-5 deploy.
- **Stream flash**: fix is code-reasoned and safe (retain-last-rows + sticky-scroll); not pixel-verified: the local headless harness was too unstable (Chrome OOM, dev issuer drift) to capture the sub-second flash reliably. See `apps/os/docs/headless-local-debugging.md`.

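To make the refresh fixes from section 1 concrete, here is a minimal sketch of a single-flight wrapper plus the refresh policy. The commit message confirms that a `createSingleFlight` helper exists, but its real signature is not shown, so the signature below is an assumption; `decideRefresh`, `afterRefreshFailure`, and the choice to reject an expired token on a WebSocket upgrade are likewise hypothetical.

```typescript
// Collapse concurrent calls that share a key into one in-flight promise.
// Assumed signature; the repository's createSingleFlight may differ.
function createSingleFlight<T>() {
  const inFlight = new Map<string, Promise<T>>();
  return (key: string, task: () => Promise<T>): Promise<T> => {
    const existing = inFlight.get(key);
    if (existing) return existing; // a refresh for this token is already running
    const started = task();
    inFlight.set(key, started);
    const clear = () => {
      if (inFlight.get(key) === started) inFlight.delete(key);
    };
    started.then(clear, clear); // forget the entry once it settles, success or failure
    return started;
  };
}

type RefreshDecision = "use-token" | "refresh" | "reject";

const REFRESH_SKEW_MS = 30 * 1000; // the 30s skew mentioned in the commit message

function decideRefresh(input: {
  nowMs: number;
  accessTokenExpiresAtMs: number;
  isWebSocketUpgrade: boolean;
}): RefreshDecision {
  const remainingMs = input.accessTokenExpiresAtMs - input.nowMs;
  if (remainingMs > REFRESH_SKEW_MS) return "use-token";
  if (input.isWebSocketUpgrade) {
    // An upgrade response cannot carry Set-Cookie, so never rotate the token here.
    // Rejecting an already-expired token is this sketch's assumption.
    return remainingMs > 0 ? "use-token" : "reject";
  }
  return "refresh";
}

// If the refresh call itself fails, keep serving while the access token is still valid.
function afterRefreshFailure(
  nowMs: number,
  accessTokenExpiresAtMs: number,
): "serve-request" | "drop-session" {
  return accessTokenExpiresAtMs > nowMs ? "serve-request" : "drop-session";
}
```

Keying the single-flight map by the refresh token means requests from different sessions never wait on each other, while requests sharing one cookie present the token to the token endpoint only once.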
## Docs

`apps/os/docs/headless-local-debugging.md`: driving the full local OS+Auth stack headlessly (test OTP `424242`, signup allowlist, orgs/projects, OAuth/consent quirks, reading local D1, MutationObserver over throttled timers).

`pnpm typecheck && pnpm lint && pnpm test` all green.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---

> [!NOTE]
> **High Risk**
> Changes authentication refresh semantics, WebSocket session behavior, and access-token lifetime, all security-critical paths that affect all signed-in users and long-lived connections.
>
> **Overview**
> Addresses three reliability issues on the OS/auth path: periodic session logout, JWT verification at deploy, and stream UI flicker.
>
> **Auth session (~5-minute logout):** Adds exported `createSingleFlight` and wraps refresh-token grants so concurrent requests sharing one cookie collapse to a single token call, avoiding rotated-token reuse that revokes the whole family. Cookie middleware skips refresh on WebSocket upgrades (no `Set-Cookie`), tolerates refresh failures while the access token is still valid, and extends access-token TTL from 5m to 30m on the auth provider.
>
> **OS deploy:** `alchemy.run.ts` fetches issuer JWKS at deploy time into static config (loopback dev skips production JWKS; fetch failure falls back to runtime JWKS).
>
> **Stream UI:** `project-stream-view` keeps last committed SQLite rows during virtualizer range re-queries and only auto-scrolls when the user is pinned near the bottom, reducing skeleton flashes on append.
>
> Adds `headless-local-debugging.md`, README link, and unit tests for `createSingleFlight`.
>
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 10682c4. Bugbot is set up for automated code reviews on this repo. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup>

## Environment Config Lease

Lease: `preview-2`
Doppler config: `preview_2`
Type: `environment-config-lease`
Leased until: 2026-06-10T00:09:39.583Z

### OS

Status: deployed
Commit: `10682c4`
Preview: https://os.iterate-preview-2.com
[Workflow run](https://github.com/iterate/iterate/actions/runs/27241670590)
Updated: 2026-06-09T23:12:17.956Z

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
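The deploy-time JWKS step from section 2 of the commit message above might look roughly like the following. This is a sketch under stated assumptions: `isLoopbackIssuer`, `resolveDeployTimeJwks`, and the injected `FetchLike` parameter are hypothetical names, and only the `<issuer>/jwks` path, the loopback skip, and the fall-back-on-failure behavior come from the commit message.

```typescript
interface Jwks {
  keys: Array<Record<string, unknown>>;
}

// Injected so the sketch does not depend on a particular runtime's fetch typing.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Loopback issuers (local dev auth signing with its own keys) skip the static JWKS.
function isLoopbackIssuer(issuer: string): boolean {
  const host = new URL(issuer).hostname;
  return host === "localhost" || host === "[::1]" || host.indexOf("127.") === 0;
}

function isJwks(value: unknown): value is Jwks {
  return (
    typeof value === "object" &&
    value !== null &&
    Array.isArray((value as { keys?: unknown }).keys)
  );
}

// Returns the JWKS to bake into config, or undefined to fall back to runtime JWKS.
async function resolveDeployTimeJwks(
  issuer: string,
  fetchImpl: FetchLike,
): Promise<Jwks | undefined> {
  if (isLoopbackIssuer(issuer)) return undefined;
  try {
    const response = await fetchImpl(`${issuer.replace(/\/+$/, "")}/jwks`);
    if (!response.ok) return undefined;
    const body = await response.json();
    return isJwks(body) ? body : undefined;
  } catch {
    return undefined; // failed fetch: the worker verifies against the remote JWKS at runtime
  }
}
```

Treating every failure as "no static JWKS" keeps a deploy from breaking when the issuer is briefly unreachable, at the cost of one remote JWKS fetch per isolate until the next successful deploy.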

What
Authenticate normal OS requests purely from signed JWT claims so the OS worker never makes a network roundtrip to the auth worker on page loads — including on cold isolate starts.
- OS middleware calls `auth.authenticate({ includeUserInfo: false })`; user/org/project session data comes entirely from access/ID token claims.
- Auth SDK accepts a static JWKS (`ITERATE_AUTH_JWKS` → `APP_CONFIG_ITERATE_AUTH__JWKS`, typesafe via the AppConfig schema) and uses `createLocalJWKSet`, eliminating the remote JWKS fetch on first verification per isolate. Falls back to `createRemoteJWKSet` when unset.
- Access-token org claims now carry organization names (`apps/auth`), so sessions don't need userinfo for display names.

Remaining OS → auth-worker calls (by design): explicit project creation (mutation boundary) and OAuth token refresh (~5 min token expiry).
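The local-versus-remote JWKS choice described above can be sketched as a pure decision function. `createLocalJWKSet` and `createRemoteJWKSet` match the names exported by the `jose` library, which the SDK presumably uses; the sketch deliberately stops short of calling them so it stays dependency-free. `resolveJwksSource`, the assumption that the config value is a JSON string, and the `<issuer>/jwks` URL for the remote case are illustrative, not taken from the SDK.

```typescript
type JwksSource =
  | { kind: "local"; jwks: { keys: unknown[] } } // would be passed to createLocalJWKSet
  | { kind: "remote"; url: string }; // would be passed to createRemoteJWKSet as a URL

// rawJwks models the configured static JWKS value, assumed here to be a JSON string.
function resolveJwksSource(rawJwks: string | undefined, issuer: string): JwksSource {
  if (rawJwks !== undefined && rawJwks.trim() !== "") {
    const parsed: unknown = JSON.parse(rawJwks);
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      Array.isArray((parsed as { keys?: unknown }).keys)
    ) {
      // Static keys: verification needs no network call, even on a cold isolate.
      return { kind: "local", jwks: { keys: (parsed as { keys: unknown[] }).keys } };
    }
    throw new Error("static JWKS config is not a JWK Set (missing `keys` array)");
  }
  // Unset: fall back to fetching the issuer's JWKS on first verification.
  return { kind: "remote", url: `${issuer.replace(/\/+$/, "")}/jwks` };
}
```

Failing loudly on a malformed static value, rather than silently falling back to the remote set, makes a misconfigured secret visible at startup instead of showing up later as unexpected auth-worker traffic.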
Verification
- `pnpm typecheck && pnpm lint && pnpm format && pnpm test` all green.
- No `activeOrganization`/current-org references remain; only expected auth-worker usages (JWKS wiring, `includeUserInfo: false`, project creation).
- On `os.iterate-preview-3.com`: landed on an authenticated `/projects` page, then reloaded. Network log for the authenticated reload: the document + static assets only, with zero requests to `auth.iterate.com` and zero to `/api/iterate-auth/session` (server-side JWT verification used the baked-in static JWKS).

Ops notes
- `ITERATE_AUTH_JWKS` set in the `os` Doppler `prd` + `preview` root configs (branch configs inherit) and personal dev configs. `dev_localhost` deliberately skipped: it signs with local auth keys.
- … (`invalid_client` on code exchange); fixed by re-syncing via `sync-auth-clients.ts` with rotation.
- Follow-up tasks: `tasks/os-deploy-time-jwks-fetch.md` (fetch JWKS at deploy time to fix the key-rotation story), `tasks/os-auth-spurious-logout-refresh.md` (suspected concurrent-refresh race causing spurious logouts in dev).

🤖 Generated with Claude Code
Note
High Risk
Changes core authentication, authorization, and project/org scoping across middleware, oRPC, and UI; misconfigured JWKS or stale JWT claims could break access until token refresh.
Overview
OS request and page auth now rely on locally verified JWTs instead of calling the auth worker on every load. The auth SDK gains optional static JWKS (`createLocalJWKSet`) and `authenticate({ includeUserInfo: false })` so cookie sessions are built from access/ID token claims only; auth tokens can carry optional org names so UI does not need userinfo.

The active/current-organization layer is removed (~170 lines). Authorization uses the user/admin principal and signed org/project claims in the token: project list/read/create gates on JWT project claims (not per-request auth-worker project lists), `organizationSlug` is required when the user belongs to multiple orgs, and `findBySlug` is documented as globally unique. oRPC middleware is renamed to authenticated-user; project access helpers drop org-scoped auth-worker lookups.

The root route hydrates auth from middleware-resolved session (SSR snapshot, no `/api/iterate-auth/session` on reload). Create project adds an organization selector when needed. Codemode/MCP paths stop threading `activeOrganization` through RPC props.

Deploy wiring adds `APP_CONFIG_ITERATE_AUTH__JWKS`; follow-up task files note deploy-time JWKS fetch and dev refresh races.

Reviewed by Cursor Bugbot for commit 9e4ebee. Bugbot is set up for automated code reviews on this repo.
Environment Config Lease
No active environment config lease.
OS
Status: released
Commit: 9e4ebee
Preview: https://os.iterate-preview-6.com
Summary: Preview app released.
Workflow run
Updated: 2026-06-09T21:38:47.208Z