Skip to content

feat: Keycloak/OIDC multi-user auth with per-user GitHub OAuth#1774

Draft
leex279 wants to merge 17 commits into
devfrom
feat/keycloak-multi-user-auth
Draft

feat: Keycloak/OIDC multi-user auth with per-user GitHub OAuth#1774
leex279 wants to merge 17 commits into
devfrom
feat/keycloak-multi-user-auth

Conversation

@leex279

@leex279 leex279 commented May 26, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Problem: Archon has no auth — every /api/* route is publicly accessible, there is no user table, and a single shared GITHUB_TOKEN env var is used for all clones. This blocks any hosted, multi-developer deployment.
  • Why it matters: Foundational identity infrastructure is needed before any other multi-tenancy or RBAC feature can be layered on. Engineering teams want a single shared Archon instance instead of per-person local installs.
  • What changed: Added OIDC (Keycloak) login + JWT middleware on /api/*, a remote_agent_users table with AES-256-GCM-encrypted per-user GitHub OAuth tokens, created_by_user_id attribution on conversations and workflow runs, and per-user GitHub token usage in the clone path. Web UI now redirects unauthenticated users to login and shows a "Connect GitHub" button.
  • What did NOT change: Backwards-compatible no-op when KEYCLOAK_URL is unset — single-user mode still works unchanged. No RBAC, no per-user AI provider selection, no multi-tenancy, no CLI/Slack/Telegram/GitHub-adapter auth changes.

UX Journey

Before

Web user           Archon API
────────           ──────────
opens /            anonymous request
sends message ───▶ no auth check
                   accepts request
                   clones with shared GITHUB_TOKEN
sees reply  ◀───── responds

After (multi-user mode, KEYCLOAK_URL set)

Web user           Archon API                Keycloak
────────           ──────────                ────────
opens / ─────────▶ /api/auth/me → 401
                   ◀──── redirect
clicks login ────▶ /api/auth/login ────────▶ login page
authenticates ◀──────────────────────────── 
returns to /  ◀── set HTTP-only cookie
                   c.set('user', {sub})
sends message ───▶ JWT validated via JWKS
                   *created_by_user_id stamped*
                   clones with *per-user GitHub token*
                   (falls back to org GITHUB_TOKEN if not connected)
sees reply  ◀───── responds

Connect GitHub ──▶ /api/auth/github ────────▶ GitHub OAuth
                   encrypts + persists token

After (single-user mode, KEYCLOAK_URL unset)

(unchanged from Before — middleware is a no-op)

Architecture Diagram

Before

[Web UI] ──▶ [/api/*] ──▶ [Routes] ──▶ [Core/DB]
                                       ──▶ [Git clone using env GITHUB_TOKEN]

After

[Web UI ~] ──▶ [/api/auth/* +]  ──▶ [Keycloak ext]
                                ──▶ [GitHub OAuth ext]
            ──▶ [/api/* ~] ──▶ [oidc middleware +] ──▶ [Routes ~]
                                                      ──▶ [Core/DB ~ (+ users table, +created_by_user_id FKs)]
                                                      ──▶ [Git clone ~ (uses per-user token, falls back to env)]

Connection inventory:

From To Status Notes
Web UI /api/auth/me new Hook checks auth status on mount; redirects on 401
Web UI /api/auth/login new Triggered on 401; redirects to Keycloak
Web UI /api/auth/github new "Connect GitHub" button
/api/* OIDC middleware new Inserted after CORS, before route handlers
OIDC middleware Keycloak JWKS new External, only when KEYCLOAK_URL set
Auth routes userDb.upsertUser / setGithubToken new Persists Keycloak identity + encrypted GH token
POST /api/codebases userDb.getGithubToken new Look up per-user token before clone
cloneRepository (caller) modified Optional userGithubToken 2nd arg; falls back to env
getOrCreateConversation DB INSERT modified Adds created_by_user_id column
createWorkflowRun DB INSERT modified Adds created_by_user_id column
Orchestrator conversation create modified Propagates created_by_user_id from parent

Label Snapshot

  • Risk: risk: medium (new auth surface; backwards-compat path mitigates blast radius)
  • Size: size: L (~15 files, new DB migration, new middleware + routes, web UI hooks)
  • Scope: core | server | web | adapters | config | docs | tests
  • Module: server:middleware, server:routes/auth, core:db/users, core:utils/token-crypto, web:hooks/useAuth

Change Metadata

  • Change type: feature
  • Primary scope: multi

Linked Issue

  • Closes #
  • Related #
  • Depends on #
  • Supersedes #

Validation Evidence (required)

unset ARCHON_DOCKER  # required when running inside the Archon container; see "Compatibility" below
bun run validate
# → exit 0
# - check:bundled ✓
# - check:bundled-skill ✓
# - type-check ✓ (all 10 packages)
# - lint --max-warnings 0 ✓
# - format:check ✓
# - test ✓ (all packages)
  • Evidence: Full bun run validate exited 0 against this branch.
  • Skipped: Nothing.

Security Impact (required)

  • New permissions/capabilities? Yes — OIDC bearer token validation, GitHub OAuth flow
  • New external network calls? Yes — Keycloak JWKS endpoint, GitHub OAuth token endpoint
  • Secrets/tokens handling changed? Yes — per-user GitHub OAuth tokens stored AES-256-GCM-encrypted at rest using TOKEN_ENCRYPTION_KEY; server refuses to start in multi-user mode without it; tokens never logged
  • File system access scope changed? No
  • Risk/mitigation: Cookie is HTTP-only + Secure + SameSite=Lax. PKCE used for OIDC web flow. Server fails fast at startup if KEYCLOAK_URL is set without PostgreSQL or TOKEN_ENCRYPTION_KEY.

Compatibility / Migration

  • Backward compatible? Yes — middleware is a no-op when KEYCLOAK_URL is unset; all created_by_user_id columns are nullable; pre-auth rows continue to load
  • Config/env changes? Yes — new optional env vars: KEYCLOAK_URL, KEYCLOAK_REALM, KEYCLOAK_CLIENT_ID, KEYCLOAK_CLIENT_SECRET, TOKEN_ENCRYPTION_KEY, GITHUB_OAUTH_CLIENT_ID, GITHUB_OAUTH_CLIENT_SECRET (all documented in .env.example)
  • Database migration needed? Yesmigrations/022_users_and_auth.sql (PostgreSQL only)
  • Upgrade steps for multi-user mode:
    1. psql $DATABASE_URL < migrations/022_users_and_auth.sql
    2. Generate encryption key: openssl rand -hex 32 → set as TOKEN_ENCRYPTION_KEY
    3. Configure Keycloak realm + client; set KEYCLOAK_URL/KEYCLOAK_REALM/KEYCLOAK_CLIENT_ID/KEYCLOAK_CLIENT_SECRET
    4. Register a GitHub OAuth app; set GITHUB_OAUTH_CLIENT_ID/GITHUB_OAUTH_CLIENT_SECRET
    5. Set DATABASE_URL (PostgreSQL — startup refuses SQLite in multi-user mode)
    6. Restart server

Human Verification (required)

  • Verified: bun run validate passes end-to-end (commits 3e7a428c + fe6edfcb)
  • Edge cases checked: test mocks updated for new created_by_user_id INSERT arg; ClaudeProvider tests fixed for UID 0 under Docker/CI
  • Not verified manually: live Keycloak login round-trip, live GitHub OAuth callback, encrypted token decrypt-in-clone-path on a real PostgreSQL deployment. Requires a Keycloak realm + GitHub OAuth app provisioned in the deploying org.

Side Effects / Blast Radius (required)

  • Affected subsystems: server (routes, middleware, startup validation), core (db/users, utils/token-crypto, orchestrator, conversation/workflow create), web (TopNav, useAuth)
  • Potential unintended effects:
    • If the no-op middleware misbehaves under KEYCLOAK_URL unset, existing users get 401 → infinite redirect loop. Mitigation: integration test on the unset path; manual smoke before rollout.
    • created_by_user_id is nullable, but any later code that assumes it's NOT NULL would break pre-auth rows.
  • Guardrails: startup validation fails fast on misconfig; tests cover the no-op middleware path

Rollback Plan (required)

  • Fast rollback: git revert fe6edfcb 3e7a428c (the two commits on this branch)
  • Feature flag: KEYCLOAK_URL env var — unset it and restart to instantly fall back to single-user mode without reverting code
  • DB rollback: migration is additive — DROP TABLE remote_agent_users CASCADE; ALTER TABLE ... DROP COLUMN created_by_user_id if a clean rollback is needed
  • Observable failure symptoms: 401s on legitimate requests; clone failures attributable to per-user token lookup; startup refusing to launch with multi-user env vars set

Risks and Mitigations

  • Risk: Web UI infinite redirect loop if /api/auth/me returns 401 when KEYCLOAK_URL is unset.
    • Mitigation: Verified oidcMiddleware is a true no-op in single-user mode; add an integration test before next release if not already covered.
  • Risk: created_by_user_id is captured but never used downstream (RBAC is post-MVP).
    • Mitigation: Acceptable for v1 — the column exists so future RBAC work doesn't require a backfill migration.
  • Risk: AES-256-GCM key in env var (not KMS).
    • Mitigation: Documented as acceptable for v1; key rotation requires manual re-encryption of all stored tokens.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added HyperFrames video workflow for AI-powered website-to-video composition generation with asset capture, pipeline execution, and MP4 rendering.
    • Introduced multi-user authentication via Keycloak OIDC with user sign-in/sign-out and account management.
    • Added per-user GitHub authentication for secure, personalized git repository access.
    • Implemented user menu in navigation displaying authentication status and account options.

Review Change Stack

Archon and others added 2 commits May 26, 2026 18:23
Adds optional Keycloak-backed OIDC authentication to the web UI and REST API,
while preserving the single-user/SQLite path entirely (no-op when KEYCLOAK_URL
is absent).

P0 – DB + crypto foundation:
- migrations/022_users_and_auth.sql: remote_agent_users table, nullable
  created_by_user_id FKs on conversations and workflow_runs
- packages/core/src/utils/token-crypto.ts: AES-256-GCM encrypt/decrypt for
  GitHub OAuth tokens; TOKEN_ENCRYPTION_KEY env var (32-byte hex)
- packages/core/src/db/users.ts: upsertUser, getUserByKeycloakSub, getUserById,
  setGithubToken (encrypted), getGithubToken (decrypted)

P1 – OIDC middleware + auth routes:
- packages/server/src/middleware/auth.ts: JWKS-backed JWT verification via jose;
  reads cookie or Bearer token; public-path bypass; no-op when KEYCLOAK_URL unset
- packages/server/src/routes/auth.ts: /api/auth/{login,callback,logout,me,github,
  github/callback} — PKCE code flow, cookie set on callback, per-user GitHub OAuth
- Startup fatal gate: PostgreSQL + TOKEN_ENCRYPTION_KEY required in multi-user mode

P2 – Wire created_by_user_id:
- Conversations and workflow runs record the creating user's UUID (nullable for compat)

P3 – Per-user GitHub OAuth in clone path:
- packages/core/src/handlers/clone.ts: prefers per-user token over org token for
  github.com URLs when provided

P4 – Web UI:
- fetchJSON redirects to /api/auth/login on 401 when OIDC is active
- packages/web/src/hooks/useAuth.ts: useAuth() hook via @tanstack/react-query
- TopNav: UserMenu dropdown (name, GitHub connect status, sign out)

P5 – Config:
- .env.example: Keycloak and GitHub OAuth variable documentation
- conversations/workflows test: add null for created_by_user_id arg in INSERT mocks
- api.codebases test: expect userGithubToken second arg on cloneRepository
- claude/registry provider tests: set IS_SANDBOX=1 in beforeEach so ClaudeProvider can construct under UID 0 (Docker/CI)
- regenerate bundled-defaults to include the previously-untracked archon-hyperframes-video workflow that was on the branch
@coderabbitai

coderabbitai Bot commented May 26, 2026

Copy link
Copy Markdown

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5f687bbf-b140-4425-8dd1-2dc922f74b4f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR adds Keycloak/OIDC-based multi-user authentication with encrypted per-user GitHub token storage, plus a new HyperFrames video generation workflow. Users can now log in via Keycloak, optionally connect their GitHub account, and the system tracks conversation and workflow ownership.

Changes

Multi-User Authentication System

Layer / File(s) Summary
Database schema for users and attribution
migrations/022_users_and_auth.sql, packages/core/src/types/index.ts, packages/core/src/db/conversations.test.ts, packages/core/src/db/workflows.test.ts, packages/workflows/src/schemas/workflow-run.ts, packages/workflows/src/store.ts
New remote_agent_users table with UUID primary key, unique Keycloak subject, and nullable GitHub token fields. Foreign key columns created_by_user_id added to conversations and workflow runs with ON DELETE SET NULL and supporting indexes. Type definitions and test expectations updated across database layers.
Token encryption and user persistence
packages/core/src/utils/token-crypto.ts, packages/core/src/utils/token-crypto.test.ts, packages/core/src/db/users.ts, packages/core/src/db/conversations.ts, packages/core/src/index.ts
AES-256-GCM encryption for GitHub tokens with environment-based key parsing. User DB layer exports upsert, lookup by Keycloak sub or ID, encrypted token storage/retrieval, and logging. Conversation and workflow creation functions updated to accept and persist created_by_user_id. Public re-exports of user DB and token utilities.
OIDC authentication middleware and routes
packages/server/src/middleware/auth.ts, packages/server/src/routes/auth.ts
Keycloak JWT verification middleware with public-path bypass and context injection. Auth routes implement OIDC login/callback/logout, GitHub OAuth authorization/callback, PKCE state management, user upsert, and token exchange. All routes gate on required environment variables with 404 fallback.
Server startup and middleware wiring
packages/server/src/index.ts
Multi-user mode validation requiring DATABASE_URL and TOKEN_ENCRYPTION_KEY when KEYCLOAK_URL is set. OIDC middleware and auth routes registered early in request pipeline before generic API middleware.
API endpoint integration with user resolution
packages/server/src/routes/api.ts, packages/core/src/handlers/clone.ts
POST /api/conversations resolves authenticated user from OIDC claims and passes user ID to conversation creation. POST /api/codebases fetches per-user GitHub token from DB and passes it to clone handler, which uses per-user token in precedence over org-level token for GitHub URLs.
Frontend auth state and UI
packages/web/src/hooks/useAuth.ts, packages/web/src/lib/api.ts, packages/web/src/components/layout/TopNav.tsx
useAuth React hook fetches current user via /api/auth/me with 5-minute cache. Web client intercepts 401 responses (non-auth routes) to redirect to login. TopNav displays UserMenu showing user display name, GitHub connection status, and sign-out option.
Configuration, dependencies, and test setup
.env.example, packages/server/package.json, packages/providers/src/claude/provider.test.ts, packages/providers/src/registry.test.ts
Environment variable documentation for Keycloak/OIDC configuration and GitHub OAuth credentials. Added jose dependency for JWT verification. Provider tests set IS_SANDBOX=1 to enable Claude instantiation under root/CI without root-check bypass.

HyperFrames Video Generation Workflow

Layer / File(s) Summary
HyperFrames workflow definition
.archon/workflows/defaults/archon-hyperframes-video.yaml
New Archon workflow validating website URL, installing HyperFrames skill, capturing site assets, running multi-step AI pipeline (design/script/storyboard/VO+timing/build), previewing, rendering MP4 output, extracting key-frame snapshots at fixed percentages, and summarizing results with input/output locations and remediation guidance on errors.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Hops excitedly

A bunny's tale of auth and frames,
Where Keycloak gates the user claims,
GitHub tokens locked up tight,
Videos rendered, snapshots bright!
Multi-user magic done just right! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 34.78% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title 'feat: Keycloak/OIDC multi-user auth with per-user GitHub OAuth' accurately and concisely summarizes the main feature being introduced — multi-user authentication via Keycloak/OIDC with per-user GitHub OAuth token support.
Description check ✅ Passed The PR description is comprehensive and well-structured, covering all major template sections: Summary with 4 bullets, UX Journey before/after flows, Architecture Diagram with connection inventory, Label Snapshot, Change Metadata, Validation Evidence showing full test passing, Security Impact with risk mitigation, Compatibility/Migration steps, Human Verification, Side Effects, Rollback Plan, and Risks/Mitigations. All required sections are present and adequately detailed.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/keycloak-multi-user-auth

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 19

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.archon/workflows/defaults/archon-hyperframes-video.yaml:
- Around line 22-23: Ensure VIDEO_TYPE is validated against an explicit
allowlist before proceeding: define an ALLOWED_VIDEO_TYPES (e.g., array or
comma-separated string) and check that VIDEO_TYPE (from the parameter default
VIDEO_TYPE="${3:-product-launch}") is a member; if not, print a clear error and
exit non-zero. Apply the same validation logic to the other occurrence
referenced (lines ~122-128) so unsupported values cannot pass silently; keep the
default assignment but perform the membership check immediately after and fail
fast with a helpful message listing allowed values.
- Around line 71-72: Bash nodes are incorrectly reading an env var named
"$setup_output" instead of the workflow-provided "$setup.output"; update each
bash script block that currently uses setup_output (e.g., the capture, preview,
render, snapshot blocks) to first assign SETUP_OUTPUT=$setup.output and then
parse URL, PROJECT_DIR, and OUTPUT_NAME from SETUP_OUTPUT (use the same grep/cut
parsing logic) so values are populated correctly; additionally, in the setup
step ensure VIDEO_TYPE is validated against the documented allowlist (explicitly
check accepted values and fail or default if invalid) to avoid silent
mis-mapping of duration.

In `@packages/core/src/db/users.ts`:
- Around line 68-75: The update currently always logs success even when no row
was updated and logs the GitHub username (PII); in setGithubToken, capture the
result of pool.query, check result.rowCount and throw a clear error (e.g., "no
user updated for id <userId>") when rowCount === 0 to fail fast, and change the
getLog().info call to avoid logging githubUsername or the token (log only
non-PII context such as userId or a generic message). Ensure you reference the
pool.query call and replace the existing getLog().info usage accordingly.

In `@packages/core/src/handlers/clone.ts`:
- Line 340: The authenticated clone URL reconstruction in clone.ts builds
cloneUrl using parsed.hostname which drops custom ports; update the logic where
cloneUrl is assigned (the line creating cloneUrl) to use parsed.host (or
parsed.host || parsed.hostname as a fallback) so self-hosted forge URLs with
non-default ports are preserved when creating the authenticated URL.
- Around line 333-336: The current substring check for isGithubUrl is unsafe;
parse workingUrl using the URL constructor and set isGithubUrl only when the
parsed hostname strictly equals 'github.com' (or matches an exact allowlist),
then compute effectiveToken as before (use userGithubToken only when that strict
host check passes); ensure any URL parsing errors are caught and treated as
non-GitHub so you don’t accidentally apply userGithubToken. Reference: update
the computation of isGithubUrl, the use of workingUrl, and the subsequent
effectiveToken assignment that depends on resolveForgeAuth and userGithubToken.

In `@packages/core/src/orchestrator/orchestrator.ts`:
- Around line 346-347: The call to db.getConversationById can throw and should
not abort dispatch; wrap the lookup in a try/catch around the
db.getConversationById(ctx.conversationDbId) call so any thrown error is caught
and logged (or ignored) and parentConversation/createdByUserId fall back to
undefined; specifically protect the parentConversation variable assignment and
ensure createdByUserId is set from parentConversation only when the lookup
succeeds, leaving attribution undefined on failure rather than propagating the
exception.

In `@packages/core/src/utils/token-crypto.test.ts`:
- Around line 37-57: The tests mutate process.env.TOKEN_ENCRYPTION_KEY but
restore it only inside test bodies, which can leak state if a test throws;
change to a guaranteed cleanup by capturing const origEnv =
process.env.TOKEN_ENCRYPTION_KEY and restoring it in a global afterEach (or by
wrapping each test body in try/finally) so TOKEN_ENCRYPTION_KEY is always reset
regardless of failures; update the test file to use afterEach(() => {
process.env.TOKEN_ENCRYPTION_KEY = origEnv; }) and keep references to
getEncryptionKey and the TOKEN_ENCRYPTION_KEY manipulations unchanged.

In `@packages/providers/src/claude/provider.test.ts`:
- Around line 75-77: Save the original process.env.IS_SANDBOX value at the start
of the test setup and restore it in an afterEach cleanup to avoid leaking the
sandbox flag between tests; specifically update the test file's beforeEach block
that sets process.env.IS_SANDBOX = '1' (in
packages/providers/src/claude/provider.test.ts) to capture the prior value
(e.g., const prevIsSandbox = process.env.IS_SANDBOX) and add an afterEach that
resets process.env.IS_SANDBOX = prevIsSandbox (or deletes it if it was
undefined) so individual root-check tests can control IS_SANDBOX safely.

In `@packages/providers/src/registry.test.ts`:
- Around line 61-63: Save the original value of process.env.IS_SANDBOX before
mutating it in the test (e.g., capture it at top of the test file or inside the
setup block), set process.env.IS_SANDBOX = '1' for the test(s) that need it, and
restore the saved original value in an afterEach or afterAll teardown so the
global environment is not leaked across registry tests; update registry.test.ts
to perform this save/set/restore around the code that currently assigns
process.env.IS_SANDBOX so tests remain order-independent.

In `@packages/server/src/index.ts`:
- Around line 206-215: The current check only verifies process.env.DATABASE_URL
exists when KEYCLOAK_URL is enabled; update the validation to also reject
non-PostgreSQL schemes by verifying DATABASE_URL starts with a PostgreSQL scheme
(e.g. "postgresql://" or "postgres://") before continuing. If the scheme is
invalid, call getLog().fatal with the same 'multi_user_mode_requires_postgresql'
context (and a hint mentioning acceptable schemes), print the console error
about requiring PostgreSQL, and exit with process.exit(1) to fail fast; apply
this logic around the existing block that references process.env.DATABASE_URL
and KEYCLOAK_URL.

In `@packages/server/src/middleware/auth.ts`:
- Around line 13-17: The PUBLIC_PREFIXES entry '/api/auth/' is too broad and
causes isPublicPath to treat all /api/auth/* routes as public (preventing
oidcUser from being set); update PUBLIC_PREFIXES and isPublicPath so only
intended routes are public: remove or change '/api/auth/' to an exact path like
'/api/auth' (or list only explicit public endpoints such as '/api/auth/login',
'/api/auth/logout') and change isPublicPath to treat entries that end with a
slash as true prefixes but treat entries without a trailing slash as exact
matches (i.e., use equality for exact paths and startsWith only for deliberate
prefix entries); apply the same fix wherever PUBLIC_PREFIXES is referenced
(including the other occurrence around lines 50-52) so /api/auth/me and
/api/auth/github/callback no longer bypass JWT parsing.

In `@packages/server/src/routes/api.ts`:
- Around line 1747-1753: If an authenticated OIDC user exists but no DB user or
GitHub token is found, do not fall back to org-level credentials; instead detect
when oidcUser is truthy and (userDb.getUserByKeycloakSub(oidcUser.sub) returns
null OR userDb.getGithubToken(...) returns undefined) and return an explicit
authorization error (e.g., 403/401) to the caller rather than continuing to call
cloneRepository with org credentials; update the logic around oidcUser,
userDb.getUserByKeycloakSub, and userDb.getGithubToken to short-circuit with an
error response for multi-user authenticated requests to avoid silent permission
broadening.
- Around line 1296-1302: If an oidcUser is present but
userDb.getUserByKeycloakSub(oidcUser.sub) returns null, stop processing and
throw a clear error instead of leaving webUserId undefined; update the logic
around the oidcUser resolution (the block using oidcUser,
userDb.getUserByKeycloakSub, and webUserId) to check for a missing dbUser and
throw an explicit error (e.g., "OIDC user not found in DB" or similar) so
downstream conversation creation never proceeds without created_by_user_id in
multi-user mode.

In `@packages/server/src/routes/auth.ts`:
- Around line 135-137: The current logging in the token exchange path leaks the
raw IdP response body; update the handler around tokenRes and getLog() (the
block that logs 'auth.token_exchange_failed') to never include the raw body or
any tokens/PII—replace it with tokenRes.status and a bounded/sanitized error
identifier (e.g., parsed error code or a constant like "<idp-error-redacted>" or
the first 200 chars only), or log only status plus a safe errorCode variable;
then return the same 502 response via c.json({ error: 'Token exchange failed' })
without exposing IdP details.
- Around line 119-132: Wrap the external token fetch (the call that assigns
tokenRes when POSTing to `${keycloakBase()}/protocol/openid-connect/token` in
auth.ts) in an AbortController-based timeout and add a try/catch to handle
network failures; ensure you abort the request after a configurable timeout,
catch and log the error via the existing logger or return a proper 504/502 style
response, and also check tokenRes.ok and handle non-2xx responses (parsing and
surfacing error details) instead of assuming success. Apply the same pattern to
the other external OIDC/fetch calls in this file (the later fetch sequence
referenced around the second block) so all external OAuth/OIDC HTTP calls use
AbortController timeouts, explicit error handling, and clear non-OK response
handling.
- Around line 62-315: Replace the direct app.get(...) registrations for the auth
endpoints with the OpenAPI wrapper: for each route currently registered via
app.get('/api/auth/login', ...), app.get('/api/auth/callback', ...),
app.get('/api/auth/logout', ...), app.get('/api/auth/me', ...),
app.get('/api/auth/github', ...), and app.get('/api/auth/github/callback', ...)
wrap the handler with registerOpenApiRoute(createRoute({...}), handler) using
appropriate route metadata (method, path, summary, responses) so the routes
conform to the server contract; keep the existing handler functions intact but
move them into the second arg of registerOpenApiRoute and import/initialize
createRoute and registerOpenApiRoute as used elsewhere in the codebase.
- Around line 166-171: The auth cookie set in the setCookie call for
'archon_access_token' is missing the secure flag; update the options object
passed to setCookie(c, 'archon_access_token', tokens.access_token, { ... }) to
include secure: true so the cookie is only sent over TLS, preserving httpOnly,
sameSite, path, and maxAge settings (optionally gate secure by NODE_ENV if you
need HTTP in local dev).

In `@packages/web/src/lib/api.ts`:
- Line 63: In fetchJSON (the helper that calls fetch), callers can override
credentials because options are spread after the default; change the call so
options are spread first and credentials is set last (i.e., fetch(url, {
...options, credentials: 'include' })) so that credentials: 'include' cannot be
overridden; update the fetch invocation inside fetchJSON accordingly.
- Around line 565-567: The current response handling in the block that checks
res.status and creates the "data" variable incorrectly returns null for all
non-OK responses; change it so that only 401 and 404 return null and any other
non-ok status throws an Error (include status and a short message or response
text for diagnostics) before attempting to call res.json(); preserve the
existing typing to { authenticated?: false } & Partial<AuthMeResponse> when
parsing successful responses and ensure thrown errors surface upstream instead
of being swallowed.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 27130310-687c-4bb9-9ff5-99d30f707a64

📥 Commits

Reviewing files that changed from the base of the PR and between ac18034 and fe6edfc.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (28)
  • .archon/workflows/defaults/archon-hyperframes-video.yaml
  • .env.example
  • migrations/022_users_and_auth.sql
  • packages/core/src/db/conversations.test.ts
  • packages/core/src/db/conversations.ts
  • packages/core/src/db/users.ts
  • packages/core/src/db/workflows.test.ts
  • packages/core/src/db/workflows.ts
  • packages/core/src/handlers/clone.ts
  • packages/core/src/index.ts
  • packages/core/src/orchestrator/orchestrator.ts
  • packages/core/src/types/index.ts
  • packages/core/src/utils/token-crypto.test.ts
  • packages/core/src/utils/token-crypto.ts
  • packages/providers/src/claude/provider.test.ts
  • packages/providers/src/registry.test.ts
  • packages/server/package.json
  • packages/server/src/index.ts
  • packages/server/src/middleware/auth.ts
  • packages/server/src/routes/api.codebases.test.ts
  • packages/server/src/routes/api.ts
  • packages/server/src/routes/auth.ts
  • packages/web/src/components/layout/TopNav.tsx
  • packages/web/src/hooks/useAuth.ts
  • packages/web/src/lib/api.ts
  • packages/workflows/src/defaults/bundled-defaults.generated.ts
  • packages/workflows/src/schemas/workflow-run.ts
  • packages/workflows/src/store.ts

Comment on lines +22 to +23
VIDEO_TYPE="${3:-product-launch}"
OUTPUT_NAME="${4:-hyperframes-output}"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate VIDEO_TYPE against the declared allowlist in setup.

Unsupported values currently pass through silently, which weakens deterministic behavior in the generation step.

Proposed fix
       VIDEO_TYPE="${3:-product-launch}"
+      case "$VIDEO_TYPE" in
+        social-ad|product-launch|product-tour|brand-reel|feature-announcement|teaser) ;;
+        *)
+          echo "ERROR: Unsupported video type '$VIDEO_TYPE'."
+          echo "Allowed: social-ad | product-launch | product-tour | brand-reel | feature-announcement | teaser"
+          exit 1
+          ;;
+      esac

Also applies to: 122-128

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.archon/workflows/defaults/archon-hyperframes-video.yaml around lines 22 -
23, Ensure VIDEO_TYPE is validated against an explicit allowlist before
proceeding: define an ALLOWED_VIDEO_TYPES (e.g., array or comma-separated
string) and check that VIDEO_TYPE (from the parameter default
VIDEO_TYPE="${3:-product-launch}") is a member; if not, print a clear error and
exit non-zero. Apply the same validation logic to the other occurrence
referenced (lines ~122-128) so unsupported values cannot pass silently; keep the
default assignment but perform the membership check immediately after and fail
fast with a helpful message listing allowed values.

Comment on lines +71 to +72
URL=$(echo "$setup_output" | grep '^URL=' | cut -d= -f2-)
PROJECT_DIR=$(echo "$setup_output" | grep '^PROJECT_DIR=' | cut -d= -f2-)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) Find variable-reference patterns used in workflows:"
rg -n '\$[A-Za-z0-9_-]+(_output|\.output)' .archon/workflows

echo
echo "2) Inspect workflow substitution implementation for supported forms:"
rg -n --type=ts 'substitut|ARGUMENTS|WORKFLOW_ID|\.output|_output' packages/workflows/src

Repository: coleam00/Archon

Length of output: 50372


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".archon/workflows/defaults/archon-hyperframes-video.yaml"

echo "=== Hyperframes video workflow: context around lines 50-210 ==="
nl -ba "$FILE" | sed -n '40,220p'

echo
echo "=== Search for $setup_output / _output usage patterns in repo ==="
rg -n "\$[A-Za-z0-9_-]+_output" .archon/workflows packages/workflows/src

echo
echo "=== Search workflow substitution implementation for node output refs pattern ==="
rg -n "substituteNodeOutputRefs|\\$\\([a-zA-Z_].*\\)\\.output|_output|\\.output\\b" packages/workflows/src/dag-executor.ts packages/workflows/src/loader.ts packages/workflows/src/*.ts

echo
echo "=== Search for documentation/guidelines mentions of $nodeId.output in loader/builder ==="
rg -n "NOT shell-quoted|\\$nodeId\\.output|nodeId\\.output|prefer \\`json\\.loads\\(\"\\\"\\\"\\$nodeId\\.output\\\"\\\"\\\"\\)\\`|_output" packages/workflows/src packages/workflows/README* .archon/workflows/defaults/archon-workflow-builder.yaml

Repository: coleam00/Archon

Length of output: 184


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".archon/workflows/defaults/archon-hyperframes-video.yaml"

echo "=== Hyperframes video workflow: context around lines 50-210 ==="
cat -n "$FILE" | sed -n '50,210p'

echo
echo "=== Search for \$*(_output) usage patterns in workflows ==="
rg -n "\$[A-Za-z0-9_-]+_output\b" .archon/workflows

echo
echo "=== Search for code handling node output refs (look for _output vs .output) ==="
rg -n "substituteNodeOutputRefs|_output\b|nodeId\.output|\.output(\b|\[)" packages/workflows/src

echo
echo "=== Show relevant sections from dag-executor.ts and loader.ts ==="
sed -n '240,380p' packages/workflows/src/dag-executor.ts
sed -n '1,220p' packages/workflows/src/loader.ts

Repository: coleam00/Archon

Length of output: 6947


🏁 Script executed:

#!/bin/bash
set -u

# 1) Re-run the hyperframes file validation quickly (no nl dependency)
FILE=".archon/workflows/defaults/archon-hyperframes-video.yaml"
echo "=== Hyperframes: lines 60-200 ==="
cat -n "$FILE" | sed -n '60,200p'

# 2) Find $<id>_output usage across workflows (do not fail if no matches)
echo
echo "=== Search workflows for \$<id>_output (literal) ==="
rg -n '\$[A-Za-z0-9_-]+_output' .archon/workflows || true

echo
echo "=== Search codebase for env var injection / handling of _output in dag executor ==="
rg -n "_output\b|node.*_output|process\.env.*_output|env.*_output" packages/workflows/src/dag-executor.ts packages/workflows/src -S || true

echo
echo "=== Locate where bash nodes execute and how they pass upstream output ==="
rg -n "type: 'bash'|execFileAsync\\('bash'|bash\"|runBash|bash node|substituteNodeOutputRefs\\(.*true" packages/workflows/src/dag-executor.ts || true

echo
echo "=== Print the likely relevant section in dag-executor.ts near bash execution ==="
# Best-effort: show a window around the match locations above
sed -n '2100,2300p' packages/workflows/src/dag-executor.ts 2>/dev/null || true
sed -n '1260,1600p' packages/workflows/src/dag-executor.ts 2>/dev/null || true

Repository: coleam00/Archon

Length of output: 44431


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the implementation details for quoting/escaping of node output refs in bash nodes
rg -n "function substituteNodeOutputRefs|shellQuoteOrFile|escapedForBash|NODE_OUTPUT_FILE_THRESHOLD" packages/workflows/src/dag-executor.ts

# Print the relevant section around substituteNodeOutputRefs
sed -n '240,380p' packages/workflows/src/dag-executor.ts

Repository: coleam00/Archon

Length of output: 7139


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".archon/workflows/defaults/archon-hyperframes-video.yaml"
echo "=== Hyperframes video workflow: lines 1-60 (setup) ==="
cat -n "$FILE" | sed -n '1,70p'

Repository: coleam00/Archon

Length of output: 3368


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE=".archon/workflows/defaults/archon-workflow-builder.yaml"
echo "=== archon-workflow-builder.yaml: relevant guideline section ==="
cat -n "$FILE" | sed -n '150,220p'

Repository: coleam00/Archon

Length of output: 3897


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "function shellQuote\\b" packages/workflows/src/dag-executor.ts
sed -n '200,280p' packages/workflows/src/dag-executor.ts

Repository: coleam00/Archon

Length of output: 2897


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Inspect a couple of existing bash-node patterns for $<nodeId>.output usage
sed -n '35,90p' .archon/workflows/maintainer/maintainer-standup.yaml
echo "----"
sed -n '40,90p' .archon/workflows/maintainer/maintainer-review-pr.yaml
echo "----"
sed -n '1,120p' .archon/workflows/test-workflows/e2e-deterministic.yaml
echo "----"
sed -n '1,140p' .archon/workflows/experimental/archon-release.yaml | sed -n '300,420p'

Repository: coleam00/Archon

Length of output: 5612


Fix bash nodes to use $setup.output (not $setup_output) in archon-hyperframes-video

  • In .archon/workflows/defaults/archon-hyperframes-video.yaml, bash nodes reference "$setup_output" (e.g., lines 71-72, 140, 160-161, 184-185). The workflow engine substitutes upstream values via $nodeId.output refs inside the bash script; it does not inject an env var named <nodeId>_output, so URL/PROJECT_DIR/OUTPUT_NAME can become empty and break downstream steps.
  • setup sets VIDEO_TYPE (line 22) but does not validate against the documented allowlist, so unsupported values silently degrade the intended duration mapping.
Proposed fix
# capture (lines 71-72)
SETUP_OUTPUT=$setup.output
URL=$(echo "$SETUP_OUTPUT" | grep '^URL=' | cut -d= -f2-)
PROJECT_DIR=$(echo "$SETUP_OUTPUT" | grep '^PROJECT_DIR=' | cut -d= -f2-)

# preview (line 140)
SETUP_OUTPUT=$setup.output
PROJECT_DIR=$(echo "$SETUP_OUTPUT" | grep '^PROJECT_DIR=' | cut -d= -f2-)

# render (lines 160-161)
SETUP_OUTPUT=$setup.output
PROJECT_DIR=$(echo "$SETUP_OUTPUT" | grep '^PROJECT_DIR=' | cut -d= -f2-)
OUTPUT_NAME=$(echo "$SETUP_OUTPUT" | grep '^OUTPUT_NAME=' | cut -d= -f2-)

# snapshot (lines 184-185)
SETUP_OUTPUT=$setup.output
PROJECT_DIR=$(echo "$SETUP_OUTPUT" | grep '^PROJECT_DIR=' | cut -d= -f2-)
OUTPUT_NAME=$(echo "$SETUP_OUTPUT" | grep '^OUTPUT_NAME=' | cut -d= -f2-)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.archon/workflows/defaults/archon-hyperframes-video.yaml around lines 71 -
72, Bash nodes are incorrectly reading an env var named "$setup_output" instead
of the workflow-provided "$setup.output"; update each bash script block that
currently uses setup_output (e.g., the capture, preview, render, snapshot
blocks) to first assign SETUP_OUTPUT=$setup.output and then parse URL,
PROJECT_DIR, and OUTPUT_NAME from SETUP_OUTPUT (use the same grep/cut parsing
logic) so values are populated correctly; additionally, in the setup step ensure
VIDEO_TYPE is validated against the documented allowlist (explicitly check
accepted values and fail or default if invalid) to avoid silent mis-mapping of
duration.

Comment thread packages/core/src/db/users.ts Outdated
Comment on lines +68 to +75
await pool.query(
`UPDATE remote_agent_users
SET github_oauth_token = $1, github_username = $2, updated_at = NOW()
WHERE id = $3`,
[encrypted, githubUsername, userId]
);
getLog().info({ userId, githubUsername }, 'user.github_token_stored');
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when no user row is updated, and avoid logging githubUsername.

setGithubToken currently reports success even when no row matched userId, and it logs user-identifying data (githubUsername).

💡 Suggested fix
-  await pool.query(
+  const result = await pool.query(
     `UPDATE remote_agent_users
      SET github_oauth_token = $1, github_username = $2, updated_at = NOW()
      WHERE id = $3`,
     [encrypted, githubUsername, userId]
   );
-  getLog().info({ userId, githubUsername }, 'user.github_token_stored');
+  if (result.rowCount !== 1) {
+    throw new Error(`setGithubToken failed: no user found for id=${userId}`);
+  }
+  getLog().info({ userId }, 'user.github_token_stored');
As per coding guidelines "Prefer throwing early with clear error for unsupported or unsafe states; never silently swallow errors" and "never log API keys, tokens ... or PII".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/db/users.ts` around lines 68 - 75, The update currently
always logs success even when no row was updated and logs the GitHub username
(PII); in setGithubToken, capture the result of pool.query, check
result.rowCount and throw a clear error (e.g., "no user updated for id
<userId>") when rowCount === 0 to fail fast, and change the getLog().info call
to avoid logging githubUsername or the token (log only non-PII context such as
userId or a generic message). Ensure you reference the pool.query call and
replace the existing getLog().info usage accordingly.

Comment thread packages/core/src/handlers/clone.ts Outdated
Comment on lines 333 to 336
const isGithubUrl = workingUrl.includes('github.com');
const { token: forgeToken, scheme: authScheme } = resolveForgeAuth(workingUrl);
const effectiveToken = isGithubUrl && userGithubToken ? userGithubToken : forgeToken;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Use strict hostname matching before applying per-user GitHub token.

Line 333 uses substring matching (includes('github.com')), which is unsafe for host-confusable URLs and can leak user tokens to non-GitHub hosts. Parse the URL and require hostname === 'github.com' (or an exact allowlist).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/handlers/clone.ts` around lines 333 - 336, The current
substring check for isGithubUrl is unsafe; parse workingUrl using the URL
constructor and set isGithubUrl only when the parsed hostname strictly equals
'github.com' (or matches an exact allowlist), then compute effectiveToken as
before (use userGithubToken only when that strict host check passes); ensure any
URL parsing errors are caught and treated as non-GitHub so you don’t
accidentally apply userGithubToken. Reference: update the computation of
isGithubUrl, the use of workingUrl, and the subsequent effectiveToken assignment
that depends on resolveForgeAuth and userGithubToken.

Comment thread packages/core/src/handlers/clone.ts Outdated
const parsed = safeParseUrl(workingUrl);
if (parsed) {
cloneUrl = `https://${authScheme}${forgeToken}@${parsed.hostname}${parsed.pathname}`;
cloneUrl = `https://${authScheme}${effectiveToken}@${parsed.hostname}${parsed.pathname}`;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve host:port when rebuilding authenticated clone URL.

Line 340 uses parsed.hostname, which strips custom ports. Use parsed.host so self-hosted forge URLs on non-default ports continue cloning correctly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/handlers/clone.ts` at line 340, The authenticated clone URL
reconstruction in clone.ts builds cloneUrl using parsed.hostname which drops
custom ports; update the logic where cloneUrl is assigned (the line creating
cloneUrl) to use parsed.host (or parsed.host || parsed.hostname as a fallback)
so self-hosted forge URLs with non-default ports are preserved when creating the
authenticated URL.

Comment thread packages/server/src/routes/auth.ts Outdated
Comment on lines +119 to +132
const tokenRes = await fetch(`${keycloakBase()}/protocol/openid-connect/token`, {
method: 'POST',
headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
body: new URLSearchParams({
grant_type: 'authorization_code',
client_id: process.env.KEYCLOAK_CLIENT_ID ?? '',
...(process.env.KEYCLOAK_CLIENT_SECRET
? { client_secret: process.env.KEYCLOAK_CLIENT_SECRET }
: {}),
redirect_uri: `${appBase(c)}/api/auth/callback`,
code,
code_verifier: stored.codeVerifier,
}),
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

External OAuth/OIDC calls need timeout + network-failure handling.

These fetch calls have no abort timeout. Network stalls or DNS/TLS failures will currently bubble into 500s or hung requests.

Also applies to: 272-300

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/routes/auth.ts` around lines 119 - 132, Wrap the external
token fetch (the call that assigns tokenRes when POSTing to
`${keycloakBase()}/protocol/openid-connect/token` in auth.ts) in an
AbortController-based timeout and add a try/catch to handle network failures;
ensure you abort the request after a configurable timeout, catch and log the
error via the existing logger or return a proper 504/502 style response, and
also check tokenRes.ok and handle non-2xx responses (parsing and surfacing error
details) instead of assuming success. Apply the same pattern to the other
external OIDC/fetch calls in this file (the later fetch sequence referenced
around the second block) so all external OAuth/OIDC HTTP calls use
AbortController timeouts, explicit error handling, and clear non-OK response
handling.

Comment thread packages/server/src/routes/auth.ts Outdated
Comment on lines +135 to +137
const body = await tokenRes.text();
getLog().error({ status: tokenRes.status, body }, 'auth.token_exchange_failed');
return c.json({ error: 'Token exchange failed' }, 502);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid logging raw token-exchange response bodies.

Logging body from the IdP response can leak sensitive error details into logs; log status and a bounded/sanitized error code instead.

As per coding guidelines: “never log API keys, tokens ... or PII”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/routes/auth.ts` around lines 135 - 137, The current
logging in the token exchange path leaks the raw IdP response body; update the
handler around tokenRes and getLog() (the block that logs
'auth.token_exchange_failed') to never include the raw body or any
tokens/PII—replace it with tokenRes.status and a bounded/sanitized error
identifier (e.g., parsed error code or a constant like "<idp-error-redacted>" or
the first 200 chars only), or log only status plus a safe errorCode variable;
then return the same 502 response via c.json({ error: 'Token exchange failed' })
without exposing IdP details.

Comment on lines +166 to +171
setCookie(c, 'archon_access_token', tokens.access_token, {
httpOnly: true,
sameSite: 'Lax',
path: '/',
maxAge: 60 * 60 * 8, // 8 hours
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Auth cookie is missing the secure flag.

archon_access_token is HTTP-only, but without secure: true it can be sent over non-TLS HTTP in misconfigured environments.

Suggested fix
     setCookie(c, 'archon_access_token', tokens.access_token, {
       httpOnly: true,
+      secure: true,
       sameSite: 'Lax',
       path: '/',
       maxAge: 60 * 60 * 8, // 8 hours
     });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
setCookie(c, 'archon_access_token', tokens.access_token, {
httpOnly: true,
sameSite: 'Lax',
path: '/',
maxAge: 60 * 60 * 8, // 8 hours
});
setCookie(c, 'archon_access_token', tokens.access_token, {
httpOnly: true,
secure: true,
sameSite: 'Lax',
path: '/',
maxAge: 60 * 60 * 8, // 8 hours
});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/routes/auth.ts` around lines 166 - 171, The auth cookie
set in the setCookie call for 'archon_access_token' is missing the secure flag;
update the options object passed to setCookie(c, 'archon_access_token',
tokens.access_token, { ... }) to include secure: true so the cookie is only sent
over TLS, preserving httpOnly, sameSite, path, and maxAge settings (optionally
gate secure by NODE_ENV if you need HTTP in local dev).

Comment thread packages/web/src/lib/api.ts Outdated

async function fetchJSON<T>(url: string, options?: RequestInit): Promise<T> {
const res = await fetch(url, options);
const res = await fetch(url, { credentials: 'include', ...options });

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make cookie credentials non-overridable in fetchJSON.

{ credentials: 'include', ...options } allows callers to override credentials accidentally. If this helper is meant to always include session cookies, spread first and set credentials last.

Suggested patch
-  const res = await fetch(url, { credentials: 'include', ...options });
+  const res = await fetch(url, { ...options, credentials: 'include' });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const res = await fetch(url, { credentials: 'include', ...options });
const res = await fetch(url, { ...options, credentials: 'include' });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/web/src/lib/api.ts` at line 63, In fetchJSON (the helper that calls
fetch), callers can override credentials because options are spread after the
default; change the call so options are spread first and credentials is set last
(i.e., fetch(url, { ...options, credentials: 'include' })) so that credentials:
'include' cannot be overridden; update the fetch invocation inside fetchJSON
accordingly.

Comment on lines +565 to +567
if (res.status === 401 || res.status === 404) return null;
if (!res.ok) return null;
const data = (await res.json()) as { authenticated?: false } & Partial<AuthMeResponse>;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t collapse unexpected /api/auth/me failures into “unauthenticated”.

Returning null for all non-OK responses hides 5xx/backend failures and makes diagnostics harder. Keep null for 401/404, but throw for other statuses.

Suggested patch
 export async function getAuthMe(): Promise<AuthMeResponse | null> {
   const res = await fetch('/api/auth/me', { credentials: 'include' });
   if (res.status === 401 || res.status === 404) return null;
-  if (!res.ok) return null;
+  if (!res.ok) {
+    const body = await res.text();
+    const truncated = body.length > 200 ? body.slice(0, 200) + '...' : body;
+    throw new Error(`GET /api/auth/me failed (${String(res.status)}): ${truncated}`);
+  }
   const data = (await res.json()) as { authenticated?: false } & Partial<AuthMeResponse>;
   if (data.authenticated === false) return null;
   return data as AuthMeResponse;
 }

As per coding guidelines: “Prefer throwing early with clear error for unsupported or unsafe states; never silently swallow errors…”.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (res.status === 401 || res.status === 404) return null;
if (!res.ok) return null;
const data = (await res.json()) as { authenticated?: false } & Partial<AuthMeResponse>;
if (res.status === 401 || res.status === 404) return null;
if (!res.ok) {
const body = await res.text();
const truncated = body.length > 200 ? body.slice(0, 200) + '...' : body;
throw new Error(`GET /api/auth/me failed (${String(res.status)}): ${truncated}`);
}
const data = (await res.json()) as { authenticated?: false } & Partial<AuthMeResponse>;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/web/src/lib/api.ts` around lines 565 - 567, The current response
handling in the block that checks res.status and creates the "data" variable
incorrectly returns null for all non-OK responses; change it so that only 401
and 404 return null and any other non-ok status throws an Error (include status
and a short message or response text for diagnostics) before attempting to call
res.json(); preserve the existing typing to { authenticated?: false } &
Partial<AuthMeResponse> when parsing successful responses and ensure thrown
errors surface upstream instead of being swallowed.

@leex279 leex279 marked this pull request as draft May 26, 2026 20:13
Thomas added 5 commits May 26, 2026 22:31
…fixes

Discovered while deploying live against Keycloak:

Backend
- middleware/auth.ts: always attempt cookie validation (was skipping on
  public /api/auth/* paths, so /api/auth/me always 401'd after login and
  the web UI looped back to /login).
- middleware/auth.ts: validate JWT issuer + azp claim instead of audience.
  Keycloak access tokens carry aud: "account", not the client id, so the
  prior audience: clientId check always failed.
- routes/auth.ts: appBase() now honors X-Forwarded-Proto / X-Forwarded-Host
  so the OIDC redirect_uri matches the public URL behind a reverse proxy.
- routes/auth.ts: new DELETE /api/auth/github endpoint that best-effort
  revokes the token on GitHub then clears the encrypted token + username.
- db/users.ts: new clearGithubToken(userId) helper.

Frontend
- lib/api.ts: disconnectGithub() client.
- SettingsPage.tsx: new "Account & Connections" card showing signed-in
  identity, Sign out, GitHub connection status, and Reconnect/Manage on
  GitHub/Disconnect buttons. Hidden in single-user mode (no user from
  useAuth).
…ranch

It was untracked WIP on the branch when bundled-defaults was regenerated in fe6edfc, which pulled it into the bundle. It is unrelated to the Keycloak auth work and was flagged by CodeRabbit on PR #1774.
Critical
- middleware/auth.ts: narrow PUBLIC_PATHS to {/api/auth/login, callback,
  logout, health, openapi.json} + PUBLIC_PREFIXES=['/webhooks/']. Routes
  like /api/auth/me and /api/auth/github/callback now hit JWT validation
  and receive a populated oidcUser.
- handlers/clone.ts: parse URL and require hostname === 'github.com' before
  applying per-user token (substring match was vulnerable to host-confusable
  URLs like github.com.attacker.io).

Major
- handlers/clone.ts: use parsed.host (preserves :port) when rebuilding
  authenticated clone URL — fixes self-hosted forges on non-default ports.
- orchestrator.ts: wrap getConversationById attribution lookup in try/catch
  so a transient DB error can't abort the entire workflow dispatch.
- server/index.ts: require DATABASE_URL to start with postgres:// or
  postgresql:// (was only checking presence — sqlite:// would slip through).
- api.ts (POST /conversations): fail fast if oidcUser is set but no matching
  remote_agent_users row exists, rather than silently creating an unaudited
  conversation.
- api.ts (POST /codebases): fail fast on missing DB user, and require the
  user's GitHub token for github.com URLs in multi-user mode (don't
  silently broaden to the org token, which would mis-attribute the clone).
- auth.ts: fetchWithTimeout wrapper (default 10s) around all external IdP
  and GitHub fetches; network failures return 502 instead of hanging.
- auth.ts: don't log raw token-exchange response bodies — extract only the
  short `error` code (PII guard).
- auth.ts: add secure:true to archon_access_token cookie.
- db/users.ts (setGithubToken): throw when rowCount != 1 (was silent on
  missing user); drop githubUsername from log payload.
- web/lib/api.ts (getAuthMe): throw on unexpected non-OK statuses instead
  of collapsing them to null (5xx failures are now visible).

Minor
- token-crypto.test.ts: restore TOKEN_ENCRYPTION_KEY via afterEach so
  throwing tests can't leak env state into later tests.
- provider.test.ts + registry.test.ts: restore IS_SANDBOX via afterEach
  so the override I added in this session doesn't mask root-check
  regressions in other test files in the same process.
- web/lib/api.ts (fetchJSON): set credentials AFTER spread so callers
  cannot accidentally override credentials: 'include'.

Skipped from CodeRabbit's list:
- auth.ts:378 ("use registerOpenApiRoute wrapper"): marked Heavy lift;
  out of scope for this fix-pass. Filed as follow-up.
- hyperframes-video.yaml comments: file already removed in 1850915.
Addresses CodeRabbit H1 (auth.ts:378) — per CLAUDE.md, all new API routes
must use registerOpenApiRoute(createRoute({...}), handler) so they appear
in the generated OpenAPI spec (/api/openapi.json).

- New routes/schemas/auth.schemas.ts with authMeResponseSchema, okResponseSchema,
  and query schemas for callback endpoints.
- routes/auth.ts: 7 createRoute configs with proper tags/summaries/responses.
  302 redirects declared as empty responses; Location set by c.redirect() at
  runtime.
- Local registerOpenApiRoute helper (matches the pattern in api.ts).
- Side effect: added the auth-required check on GET /api/auth/github (start)
  so the OpenAPI 401 declaration is honest — only login/callback/logout are
  in the middleware allowlist.
- Side effect: added secure:true to the github_oauth_state CSRF cookie.

No behavior changes for handlers themselves.
The previous refactor commit added imports for the new schema file but
forgot to git-add it before commit -am (only stages tracked files).
This commit lands the file so c5b631f actually compiles.
@Wirasm

Wirasm commented May 27, 2026

Copy link
Copy Markdown
Collaborator

@$leex279 related to #1348 — overlapping area or partial fix.

@Wirasm

Wirasm commented May 27, 2026

Copy link
Copy Markdown
Collaborator

@$leex279 related to #1469 — overlapping area or partial fix.

@Wirasm

Wirasm commented May 27, 2026

Copy link
Copy Markdown
Collaborator

@$leex279 related to #1348 — overlapping area or partial fix.

@Wirasm

Wirasm commented May 27, 2026

Copy link
Copy Markdown
Collaborator

@$leex279 related to #1469 — overlapping area or partial fix.

Archon and others added 2 commits May 27, 2026 13:52
In multi-user mode, workflow subprocesses (bash/script nodes and Claude/
Codex/Pi AI providers) inherited the container's GH_TOKEN/GITHUB_TOKEN
unconditionally via process.env. A user who had not connected their own
GitHub account could therefore have 'gh' and 'git push' inside any
workflow silently authenticate as the shared org account, including on
private repos owned by another user.

This change introduces a per-run GitHub token policy
(packages/workflows/src/utils/github-token-policy.ts):

- New env flag ARCHON_ALLOW_ORG_GITHUB_TOKEN_FALLBACK (default false)
- Multi-user mode + user-initiated run + user has personal token:
  inject the user's token as GH_TOKEN/GITHUB_TOKEN, clear COPILOT_GITHUB_TOKEN.
- Multi-user mode + user-initiated run + no user token + flag disabled
  (default): scrub all three keys, so 'gh' and 'git push' fail with an
  auth error instead of silently using the org token.
- Multi-user mode + user-initiated run + no user token + flag enabled:
  legacy behaviour, org token kept (opt-in).
- Server-initiated runs (no created_by_user_id — GitHub webhook, cron, CLI):
  never scrubbed; trusted server context.
- Single-user mode (no KEYCLOAK_URL): never scrubbed; backward-compatible.

Plumbing:
- WorkflowDeps gains optional getUserGithubToken(userId), wired from
  @archon/core/db/users.getGithubToken (already decrypts at rest).
- dag-executor resolves the run owner's token once per run and threads
  the resulting overrides into:
    - executeBashNode / executeScriptNode subprocess env builders
      (Reflect.deleteProperty for hard-delete semantics)
    - resolveNodeProviderAndModel / buildLoopNodeOptions baseOptions.env,
      so AI provider subprocesses receive the same overrides. The
      providers' existing process.env merge treats '' as scrub.
- cloneRepository: the forge-level GH_TOKEN/GITLAB_TOKEN/GITEA_TOKEN
  fallback is gated on the same policy. github.com URLs are still
  blocked at the API layer when the user has no personal token; this
  change closes the parallel hole for non-GitHub forges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lows

Previous commit (d18e7c8) only covered workflow dispatch (dag-executor).
Free-form chat — the most common path — goes through orchestrator-agent
which calls aiClient.sendQuery directly, spawning Claude/Codex with
process.env unaltered. So a non-connected user chatting with Claude
could still have the model's gh/git tool calls succeed using the org
GH_TOKEN.

Fix: in orchestrator-agent.handleMessage, resolve the conversation
owner's personal token via userDb.getGithubToken and call
resolveGithubTokenOverrides() the same way dag-executor does. Merge
the result LAST into requestOptions.env so it wins over any
per-codebase env that happens to declare GH_TOKEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Wirasm

Wirasm commented May 27, 2026

Copy link
Copy Markdown
Collaborator

Review Summary

Verdict: minor-fixes-needed

PR #1774 adds OIDC/Keycloak multi-user authentication, per-user GitHub OAuth with AES-256-GCM token encryption, and a GitHub token isolation policy for workflow subprocesses. The implementation is solid — error handling is correct across all IdP/GitHub calls, comments explain the security-sensitive decisions well, and the code passes CLAUDE.md style checks. Two gaps need to be resolved before merge: a critical documentation gap (new public surfaces undocumented) and a test that exists but never runs in CI.


Blocking issues

  • No documentation for multi-user mode (packages/docs-web/src/content/docs/): The PR adds 8 new environment variables, 7 new API routes, and a behavioral change (org-token fallback now disabled by default in multi-user mode). None of this is documented. Add a deployment/multi-user.md page covering: when to enable multi-user mode, required env vars and how to generate TOKEN_ENCRYPTION_KEY (openssl rand -hex 32), PostgreSQL requirement (SQLite not supported), the login/GitHub OAuth flows, and the ARCHON_ALLOW_ORG_GITHUB_TOKEN_FALLBACK behavioral change. Also update reference/security.md to remove the now-incorrect "single-developer tool" claim and note the per-user token encryption. The PR body's Compatibility/Migration section is a good draft outline.

Suggested fixes

  • packages/server/src/middleware/auth.ts:63-72: azp claim check is skipped when KEYCLOAK_CLIENT_ID is unset. A token issued to any other Keycloak client in the same realm would be accepted. Add an assertion:

    if (azp && !clientId) throw new Error('KEYCLOAK_CLIENT_ID not configured; refusing token with azp claim');
  • packages/server/src/routes/auth.ts:64: pkceStateStore Map grows unbounded under sustained login attempts. Add a size cap:

    const MAX_PKCE_STATES = 10_000;
    if (pkceStateStore.size >= MAX_PKCE_STATES) {
      getLog().error({}, 'auth.pkce_state_store_full');
      return c.json({ error: 'Service overloaded, retry later' }, 503);
    }
  • packages/workflows/package.json: github-token-policy.test.ts is not listed in the test script and will never run in CI. Append bun test src/utils/github-token-policy.test.ts as a new invocation line (or merge into an existing src/utils/ batch that doesn't touch KEYCLOAK_URL to avoid env-pollution).

  • packages/server/src/routes/auth.tsregisterAuthRoutes has no HTTP-level tests for security-critical flows. Add tests for: PKCE state creation/validation/expiry, token exchange failure, GitHub user lookup non-fatal path (mock /user → non-OK, assert setGithubToken is still called with empty username), and me in single-user mode (KEYCLOAK_URL absent → { authenticated: false } 200).

  • packages/core/src/db/users.ts — No test covers getGithubToken decrypt path, clearGithubToken NULL-setting, or getUserByKeycloakSub null-return. Add a test file.

  • packages/core/src/workflows/store-adapter.tscreateWorkflowRun sets created_by_user_id but has no test. Add a case to store-adapter.test.ts asserting the column is set in the DB row.


Minor / nice-to-have

  • packages/server/src/middleware/auth.ts:70: payload.sub ?? '' has a dead empty-string fallback. Assert non-empty or remove the ?? ''.
  • packages/core/src/db/users.ts:91: getEncryptionKey() called on every getGithubToken call. Cache at module level (like the cachedLog pattern in the same file).
  • packages/workflows/src/utils/github-token-policy.ts: resolveGithubTokenOverrides(undefined, null) not tested in multi-user mode. Add the boundary case.
  • packages/docs-web/src/content/docs/reference/configuration.md: Docker Configuration section doesn't mention the new multi-user env vars. Add a note.
  • packages/web/src/lib/api.ts: disconnectGithub JSDoc first sentence is a literal restatement of the function name. Delete it — keep only the "Best-effort revokes..." clause.
  • packages/web/src/hooks/useAuth.ts: No test for the error-path (non-401/404 failure throwing). Add a test with fetch('/api/auth/me', ...) mocked to return 500.

Compliments

  • Error handling is a real strength of this PR: sensitive bodies are not logged, non-fatal paths are justified (GitHub user lookup failure doesn't abort token storage), and the per-user token policy fails closed. All of this is correct.
  • The secure: true cookie comment and the IdP body non-logging comment are exemplary — they capture security decisions that wouldn't be obvious from the code alone.
  • The module-level policy comment in github-token-policy.ts is appropriate for its subject matter and explains the decision tree clearly.

Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, error-handling, test-coverage, comment-quality, docs-impact.

Wirasm added a commit that referenced this pull request Jun 1, 2026
…icy, worktree git identity (PR-C G4–G6)

Attribution core for PR-C. All three paths are no-ops unless per-user GitHub is
enabled (GITHUB_APP_ID + TOKEN_ENCRYPTION_KEY), so solo PAT installs are
unchanged.

G4 — requires gate:
- workflowRequirementSchema + `requires: [github]` on workflowBaseSchema
- workflow-requirements.ts: assertWorkflowRequirementsMet + WorkflowRequirementError
- orchestrator hard-fails a requiring workflow at invocation (before isolation/
  worktree/AI cost) when the originating user hasn't connected

G5 — per-user token in workflow subprocess env:
- github-token-policy.ts cherry-picked from #1774; KEYCLOAK_URL mode-detector
  replaced by an injected isPerUserGitHubEnabled flag (kept pure)
- WorkflowDeps gains getUserGithubToken + isPerUserGitHubEnabled
- executor merges per-user overrides last into config.envVars (user token wins,
  or '' scrubs the org/bot token when unconnected + fallback off)
- store-adapter wires getUserGithubToken → getDecryptedAccessToken

G6 — worktree git identity:
- IsolationRequest/ResolveRequest gain gitIdentity; resolver forwards it
- WorktreeProvider stamps `git config user.email/name` on new worktrees
- orchestrator resolves the no-reply email (<id>+<login>@users.noreply...) from
  the originating userId and passes it on the resolve request

Tests: workflow-requirements (3), github-token-policy (8). Type-check + isolation
suite green.
Wirasm added a commit that referenced this pull request Jun 1, 2026
…-C) (#1823)

* feat(core): per-user GitHub token store + device-flow engine (PR-C foundation)

Foundation layer for per-user GitHub attribution (PRD Phase 3). Pure additive
infrastructure — nothing wires into the adapters/workflows/web yet, so there is
zero behavior change. Subsequent increments add the invocation gate, token
policy, git-config plumbing, connect surfaces, and interim web auth.

- token-crypto: AES-256-GCM encrypt/decrypt + TOKEN_ENCRYPTION_KEY validation
  (cherry-picked from the #1774 donor branch; decrypt rewritten for strict TS)
- github-auth/config: isPerUserGitHubEnabled (App + key), device-flow client-id
  loader, assertEncryptionKeyAtBoot fail-fast
- remote_agent_user_github_tokens table (PG migration + bundled-schema regen +
  SQLite createSchema); one row per user, encrypted access/refresh tokens
- db/user-github-token-store: save (UPSERT), get, decrypt-with-refresh-on-read
  (5-min buffer, per-user mutex against double-refresh), delete, no-reply email
- github-auth/device-flow: hand-rolled start/poll/refresh/fetchUser over fetch
  (no client_secret, slow_down + authorization_pending handling)
- github-auth/connect-service: device flow -> profile -> conflict-guarded
  identity bind -> encrypted token persist -> cached display_name/email
- users: updateUserGithubProfile, linkGithubIdentity, GithubIdentityConflictError

Tests: 30 new (crypto/device-flow/token-store, 10 each). bun run validate green.

TS filenames carry -row/-store suffixes to satisfy the local secret-guard hook;
the DB table is remote_agent_user_github_tokens.

* fix(core): address multi-agent review findings on per-user github tokens

Hardening + test coverage for the dormant per-user GitHub token store and
device-flow engine. No behavior change for existing installs.

Correctness:
- token store: separate the refresh call from the DB save so a persist failure
  after a successful refresh is no longer mislabeled `refresh_failed` and the
  freshly-issued (now-valid) token is returned instead of discarded; the failure
  logs `persist_failed` (error) (I1)
- linkGithubIdentity: add UNIQUE-violation race recovery (re-SELECT, conflict if
  a different user won, no-op if same user) so a concurrent double-connect no
  longer surfaces a raw 23505 (I2)
- getUserGithubTokenRecord: type the raw read as number|string and coerce
  github_user_id, honest against node-postgres returning BIGINT as a string (I5)

Types/logging/docs:
- row schema: widen token-expiry timestamps to date|string (PG Date vs SQLite
  ISO text) (I6)
- connectGithubForUser: emit github_connect.started/.failed per logging
  convention (I7)
- token store: debug log on the not-connected path; relinked + race-recovered
  audit logs on linkGithubIdentity (S3/S5)
- device-flow: floor-guard the slow_down backoff; surface GitHub's error_description
  from postForm (S1/S2)
- getEncryptionKey: accept an env param so assertEncryptionKeyAtBoot is consistent
  and injectable
- fix stale updateUserGithubProfile comment (I8); CLAUDE.md 11->12 tables and
  migration header table list, regenerated bundled schema (D1/D2)

Tests (+20): linkGithubIdentity (conflict/same-user/new/race) and
updateUserGithubProfile; token-store mutex + re-read fallback + persist-failure;
new config.test.ts and connect-service.test.ts (ordering: no token saved when the
identity bind throws).

Deferred: discriminated-union return for getDecryptedAccessToken (no callers yet
- premature); literal-union DeviceFlowError.code (clean form needs a string & {}
eslint-disable for marginal benefit). Rejected the review's suggestions to key
identity on numeric id (breaks adapter login matching), to drop a dead data.interval
operand (it is a live field GitHub returns), and to remove a fresh && guard (TS
cannot narrow it).

* feat(workflows,isolation): requires:[github] gate, per-user token policy, worktree git identity (PR-C G4–G6)

Attribution core for PR-C. All three paths are no-ops unless per-user GitHub is
enabled (GITHUB_APP_ID + TOKEN_ENCRYPTION_KEY), so solo PAT installs are
unchanged.

G4 — requires gate:
- workflowRequirementSchema + `requires: [github]` on workflowBaseSchema
- workflow-requirements.ts: assertWorkflowRequirementsMet + WorkflowRequirementError
- orchestrator hard-fails a requiring workflow at invocation (before isolation/
  worktree/AI cost) when the originating user hasn't connected

G5 — per-user token in workflow subprocess env:
- github-token-policy.ts cherry-picked from #1774; KEYCLOAK_URL mode-detector
  replaced by an injected isPerUserGitHubEnabled flag (kept pure)
- WorkflowDeps gains getUserGithubToken + isPerUserGitHubEnabled
- executor merges per-user overrides last into config.envVars (user token wins,
  or '' scrubs the org/bot token when unconnected + fallback off)
- store-adapter wires getUserGithubToken → getDecryptedAccessToken

G6 — worktree git identity:
- IsolationRequest/ResolveRequest gain gitIdentity; resolver forwards it
- WorktreeProvider stamps `git config user.email/name` on new worktrees
- orchestrator resolves the no-reply email (<id>+<login>@users.noreply...) from
  the originating userId and passes it on the resolve request

Tests: workflow-requirements (3), github-token-policy (8). Type-check + isolation
suite green.

* feat(adapters,server): route PR/issue comments through the originating user's token (PR-C G7)

When per-user GitHub is enabled (App mode + TOKEN_ENCRYPTION_KEY), the GitHub
adapter authors outbound comments under the human who triggered the thread,
instead of archon[bot].

- handleWebhook caches conversationId → originating archon userId
  (actorByConversation; App mode only, lost on restart → graceful bot fallback)
- postComment prefers a per-user Octokit (getUserOctokit, 5-min cached); on 401
  (revoked/expired) it evicts and falls back to the installation token rather
  than failing the reply
- adapter constructor gains an optional getUserToken resolver; server injects it
  (→ getDecryptedAccessToken) only when isPerUserGitHubEnabled()
- no MessageMetadata / core streaming changes — the actor is resolved entirely
  inside the adapter from the webhook it already processes

Existing 64 adapter tests pass (PAT path: actor undefined → unchanged). A
dedicated App-mode user-token test is deferred (no mock-provider harness yet);
the path is type-safe with a conservative bot fallback.

* feat(cli,adapters): connect-github surfaces for CLI and Slack (PR-C G8)

- CLI: `archon auth github` — device flow for the CLI user. Identity from
  ARCHON_USER_ID else $USER/$USERNAME, resolved to a stable 'cli' Archon user.
  Guarded on isPerUserGitHubEnabled; clear errors for conflict / device-flow /
  not-enabled. Wired into the cli.ts switch + noGitCommands + usage.
- Slack: `/archon connect github` — resolves the invoking user, drives the
  device flow detached, and delivers the code + result as ephemeral follow-ups
  via response_url (within its 30-min/5-use window). No orchestrator dispatch.

Both reuse the shared connectGithubForUser engine. CLI identity uses
ARCHON_USER_ID rather than a config.yaml field (smaller surface than expanding
the config schema this PR). Type-check + slack adapter suite (38) green.

* feat(server,web,docs): web device-flow endpoints, X-Archon-User auth, Connect UI, docs (PR-C G9–G10)

Web surface + interim web auth + docs for per-user GitHub.

Device flow (browser-driven, non-blocking):
- decomposed device-flow: pollDeviceFlowOnce + persistGithubConnection
  (connect-service tail), shared by the blocking CLI/Slack path
- POST /api/auth/github/device/start | /poll, GET/DELETE /api/auth/github
- web api client fns + a Settings "GitHub Identity" card (connect → show code →
  poll → connected; disconnect)

Interim web auth (X-Archon-User):
- resolveWebUserId(c) reads ARCHON_WEB_AUTH_HEADER (default X-Archon-User) and
  resolves a stable 'web' user; absent → NULL (never elevated)
- the three web userId TODO sites (create conversation, send message, run
  workflow) now attribute conversation/message rows and the workflow run

Boot: assertEncryptionKeyAtBoot fails fast on a malformed key; App-without-key
WARNs + disables per-user (App-for-bot-only stays valid).

Docs: .env.example (GITHUB_APP_CLIENT_ID, TOKEN_ENCRYPTION_KEY,
ARCHON_ALLOW_ORG_GITHUB_TOKEN_FALLBACK, ARCHON_WEB_AUTH_HEADER); CLAUDE.md
(requires:[github], auth endpoints).

Test fixes: orchestrator-isolation mocks the new db dep; api.messages /
api.workflow-runs updated for the addMessage userId arg. bun run validate green.

* fix(pr-c): address multi-agent review findings on per-user github tokens

Critical:
- C1 (core): wrap all three decryptToken calls in resolveAccessToken in
  try/catch -> log user_github_token.decrypt_failed and return null, honoring
  the documented "null on irrecoverable failure" contract. A TOKEN_ENCRYPTION_KEY
  rotation or tampered ciphertext no longer throws an opaque crypto error and
  crash workflow dispatch (the requires:[github] gate treats null as unconnected).
- C2 (workflows): until_bash loop subprocess now spreads ...(config.envVars ?? {})
  last, matching executeBashNode/executeScriptNode. The per-user token scrub and
  managed per-project env vars reach until_bash; it no longer inherits the
  server's ambient GitHub token and bypasses the scrub.
- C3 (docs): .env.example and the config.ts module doc now name all three vars
  correctly — GITHUB_APP_ID + TOKEN_ENCRYPTION_KEY arm the gate; GITHUB_APP_CLIENT_ID
  is additionally required for the connect flow. Previous text named CLIENT_ID as
  the gate trigger, which could cause a silent lockout.

Important:
- I1 (server): boot guard warns web_auth.header_trust_on_public_bind when the
  X-Archon-User trust is live (header set, or per-user mode since the default
  header is trusted) on a non-loopback bind, mirroring the /internal/* guard.
- I2 (server): new strict requireWebUser helper for the 4 auth-required endpoints
  distinguishes missing header (401) from identity-resolution backend failure
  (503, added to apiError's union). Best-effort resolveWebUserId kept for
  attribution paths but now logs headerPresent.
- I3 (server): GET/DELETE /api/auth/github wrapped in try/catch with userId
  context instead of an opaque 500 from the global handler.
- I4 (server): corrected the /internal/* inline comment — the guard refuses to
  start (fatal), it does not merely WARN.

Cheap correctness wins:
- S1 (core): assertKeyLength 32-byte guard in encryptToken/decryptToken surfaces
  an actionable error instead of Node's opaque "Invalid key length".
- S3 (server): replaced the nested ternary in the poll-status mapping with a
  testable mapDeviceFlowErrorToPollStatus helper.
- S5 (cli): corrected the stale TODO — archon auth github has landed; remaining
  work is threading resolveCliUserId() (ARCHON_USER_ID/$USER), not config.yaml.

Docs:
- README table count 7 -> 12; CLAUDE.md @archon/core batch count 7 -> 20 and a
  new `auth github` CLI entry.
- docs-web: `auth github` CLI reference, per-user GitHub env var table in
  configuration.md, `requires: [github]` in authoring-workflows.md, and a
  "Step 8: Enable per-user GitHub identity" section in github-app-setup.md.

Tests:
- T1 (core): direct pollDeviceFlowOnce coverage (pending / slow_down /
  default-interval / authorized / error).
- T2 (server): auth-poll-status.test.ts covers the expired/denied/error mapping;
  registered in server package.json.

Deferred:
- T3: locking the isPerUserGitHubEnabled()===false no-op for requires:[github]
  at its only call site (dispatchOrchestratorWorkflow) needs ~100 lines of new
  mock scaffolding in orchestrator-agent.test.ts (which today only tests command
  parsing and mocks neither the config gate nor the token store) to verify a
  one-line && short-circuit whose two halves are already unit-tested
  (config.test.ts + workflow-requirements.test.ts). Not worth the scaffolding.

bun run validate passes (all 6 checks).
Thomas added 5 commits June 1, 2026 17:50
Lockfile drifted during the merge because both branches independently
bumped package versions; regenerated with `bun install --linker=hoisted`
to satisfy the Dockerfile's `--frozen-lockfile` install step.
Dev's PR-C device-flow added disconnectGithub() at L117; the fork already
had its own at L521. Both call DELETE /api/auth/github with the same
semantics. Keep dev's (per the merge directive: dev wins on auth) and
remove the fork's duplicate. SettingsPage call sites only use the
mutationFn return for invalidation, so either return type works.
The keycloak fork extended cloneRepository() with a userGithubToken
param and reached into @archon/workflows/utils/github-token-policy
(which dev keeps internal). Dev's design centralizes the per-user
token policy inside the workflow executor instead. Replace the fork's
clone.ts with dev's version and drop the second arg at the api.ts
call site so the bundler can resolve the @archon/workflows import.

Per merge directive: dev wins on GitHub auth.
…umers

orchestrator-agent.ts (chat path) imports resolveGithubTokenOverrides
from @archon/workflows/utils/github-token-policy. Dev kept this as an
internal relative import inside the workflows package, but the keycloak
fork needs it on the chat path too (commit 3413e57). Exposing the
subpath is the smallest correct fix: no behavior change for dev's own
internal callers, and the fork's multi-user chat-path token policy works.
Thomas and others added 3 commits June 1, 2026 18:15
…v merge

Dev's auth.schemas.ts replaced the fork's via -X theirs and only carries
the new device-flow schemas. The fork's routes/auth.ts (Keycloak/OIDC
gate) still imports authMeResponseSchema, okResponseSchema,
oidcCallbackQuerySchema, githubCallbackQuerySchema. Re-add them so the
two route files (Keycloak gate + device-flow connect) can coexist.
routes/auth.ts (Keycloak callback) + routes/api.ts + orchestrator-agent.ts
all call upsertUser, getUserByKeycloakSub, setGithubToken, getGithubToken,
clearGithubToken. Dev's users.ts (taken via -X theirs) didn't have these
— it uses a different model (findOrCreateUserByPlatformIdentity +
user_github_tokens table). The DB schema still has the fork's columns
(keycloak_sub, github_oauth_token, github_username) so the data is intact.

Co-located the fork's functions in dev's users.ts with a separate
KeycloakUser type so the two user models live side-by-side without
mixing types.
Add archon-smart-pr-review-agnostic and archon-comprehensive-pr-review-agnostic
under .archon/workflows/experimental/, mirroring archon-fix-github-issue-agnostic.

These drop Claude-only bindings so every node inherits the configured default
assistant, making them runnable on Codex/Pi-only deployments:
- smart variant: removes the `model: haiku` pin on `classify` (Codex rejects
  Claude model names), adds `additionalProperties: false` to its output_format
  (required by Codex/OpenAI strict structured output), and drops the Claude-only
  `mcp:` ntfy notify branch.
- comprehensive variant: already binding-free; renamed copy for parity.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants