feat: Keycloak/OIDC multi-user auth with per-user GitHub OAuth#1774
feat: Keycloak/OIDC multi-user auth with per-user GitHub OAuth#1774leex279 wants to merge 17 commits into
Conversation
Adds optional Keycloak-backed OIDC authentication to the web UI and REST API,
while preserving the single-user/SQLite path entirely (no-op when KEYCLOAK_URL
is absent).
P0 – DB + crypto foundation:
- migrations/022_users_and_auth.sql: remote_agent_users table, nullable
created_by_user_id FKs on conversations and workflow_runs
- packages/core/src/utils/token-crypto.ts: AES-256-GCM encrypt/decrypt for
GitHub OAuth tokens; TOKEN_ENCRYPTION_KEY env var (32-byte hex)
- packages/core/src/db/users.ts: upsertUser, getUserByKeycloakSub, getUserById,
setGithubToken (encrypted), getGithubToken (decrypted)
P1 – OIDC middleware + auth routes:
- packages/server/src/middleware/auth.ts: JWKS-backed JWT verification via jose;
reads cookie or Bearer token; public-path bypass; no-op when KEYCLOAK_URL unset
- packages/server/src/routes/auth.ts: /api/auth/{login,callback,logout,me,github,
github/callback} — PKCE code flow, cookie set on callback, per-user GitHub OAuth
- Startup fatal gate: PostgreSQL + TOKEN_ENCRYPTION_KEY required in multi-user mode
P2 – Wire created_by_user_id:
- Conversations and workflow runs record the creating user's UUID (nullable for compat)
P3 – Per-user GitHub OAuth in clone path:
- packages/core/src/handlers/clone.ts: prefers per-user token over org token for
github.com URLs when provided
P4 – Web UI:
- fetchJSON redirects to /api/auth/login on 401 when OIDC is active
- packages/web/src/hooks/useAuth.ts: useAuth() hook via @tanstack/react-query
- TopNav: UserMenu dropdown (name, GitHub connect status, sign out)
P5 – Config:
- .env.example: Keycloak and GitHub OAuth variable documentation
- conversations/workflows test: add null for created_by_user_id arg in INSERT mocks - api.codebases test: expect userGithubToken second arg on cloneRepository - claude/registry provider tests: set IS_SANDBOX=1 in beforeEach so ClaudeProvider can construct under UID 0 (Docker/CI) - regenerate bundled-defaults to include the previously-untracked archon-hyperframes-video workflow that was on the branch
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR adds Keycloak/OIDC-based multi-user authentication with encrypted per-user GitHub token storage, plus a new HyperFrames video generation workflow. Users can now log in via Keycloak, optionally connect their GitHub account, and the system tracks conversation and workflow ownership. ChangesMulti-User Authentication System
HyperFrames Video Generation Workflow
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 19
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.archon/workflows/defaults/archon-hyperframes-video.yaml:
- Around line 22-23: Ensure VIDEO_TYPE is validated against an explicit
allowlist before proceeding: define an ALLOWED_VIDEO_TYPES (e.g., array or
comma-separated string) and check that VIDEO_TYPE (from the parameter default
VIDEO_TYPE="${3:-product-launch}") is a member; if not, print a clear error and
exit non-zero. Apply the same validation logic to the other occurrence
referenced (lines ~122-128) so unsupported values cannot pass silently; keep the
default assignment but perform the membership check immediately after and fail
fast with a helpful message listing allowed values.
- Around line 71-72: Bash nodes are incorrectly reading an env var named
"$setup_output" instead of the workflow-provided "$setup.output"; update each
bash script block that currently uses setup_output (e.g., the capture, preview,
render, snapshot blocks) to first assign SETUP_OUTPUT=$setup.output and then
parse URL, PROJECT_DIR, and OUTPUT_NAME from SETUP_OUTPUT (use the same grep/cut
parsing logic) so values are populated correctly; additionally, in the setup
step ensure VIDEO_TYPE is validated against the documented allowlist (explicitly
check accepted values and fail or default if invalid) to avoid silent
mis-mapping of duration.
In `@packages/core/src/db/users.ts`:
- Around line 68-75: The update currently always logs success even when no row
was updated and logs the GitHub username (PII); in setGithubToken, capture the
result of pool.query, check result.rowCount and throw a clear error (e.g., "no
user updated for id <userId>") when rowCount === 0 to fail fast, and change the
getLog().info call to avoid logging githubUsername or the token (log only
non-PII context such as userId or a generic message). Ensure you reference the
pool.query call and replace the existing getLog().info usage accordingly.
In `@packages/core/src/handlers/clone.ts`:
- Line 340: The authenticated clone URL reconstruction in clone.ts builds
cloneUrl using parsed.hostname which drops custom ports; update the logic where
cloneUrl is assigned (the line creating cloneUrl) to use parsed.host (or
parsed.host || parsed.hostname as a fallback) so self-hosted forge URLs with
non-default ports are preserved when creating the authenticated URL.
- Around line 333-336: The current substring check for isGithubUrl is unsafe;
parse workingUrl using the URL constructor and set isGithubUrl only when the
parsed hostname strictly equals 'github.com' (or matches an exact allowlist),
then compute effectiveToken as before (use userGithubToken only when that strict
host check passes); ensure any URL parsing errors are caught and treated as
non-GitHub so you don’t accidentally apply userGithubToken. Reference: update
the computation of isGithubUrl, the use of workingUrl, and the subsequent
effectiveToken assignment that depends on resolveForgeAuth and userGithubToken.
In `@packages/core/src/orchestrator/orchestrator.ts`:
- Around line 346-347: The call to db.getConversationById can throw and should
not abort dispatch; wrap the lookup in a try/catch around the
db.getConversationById(ctx.conversationDbId) call so any thrown error is caught
and logged (or ignored) and parentConversation/createdByUserId fall back to
undefined; specifically protect the parentConversation variable assignment and
ensure createdByUserId is set from parentConversation only when the lookup
succeeds, leaving attribution undefined on failure rather than propagating the
exception.
In `@packages/core/src/utils/token-crypto.test.ts`:
- Around line 37-57: The tests mutate process.env.TOKEN_ENCRYPTION_KEY but
restore it only inside test bodies, which can leak state if a test throws;
change to a guaranteed cleanup by capturing const origEnv =
process.env.TOKEN_ENCRYPTION_KEY and restoring it in a global afterEach (or by
wrapping each test body in try/finally) so TOKEN_ENCRYPTION_KEY is always reset
regardless of failures; update the test file to use afterEach(() => {
process.env.TOKEN_ENCRYPTION_KEY = origEnv; }) and keep references to
getEncryptionKey and the TOKEN_ENCRYPTION_KEY manipulations unchanged.
In `@packages/providers/src/claude/provider.test.ts`:
- Around line 75-77: Save the original process.env.IS_SANDBOX value at the start
of the test setup and restore it in an afterEach cleanup to avoid leaking the
sandbox flag between tests; specifically update the test file's beforeEach block
that sets process.env.IS_SANDBOX = '1' (in
packages/providers/src/claude/provider.test.ts) to capture the prior value
(e.g., const prevIsSandbox = process.env.IS_SANDBOX) and add an afterEach that
resets process.env.IS_SANDBOX = prevIsSandbox (or deletes it if it was
undefined) so individual root-check tests can control IS_SANDBOX safely.
In `@packages/providers/src/registry.test.ts`:
- Around line 61-63: Save the original value of process.env.IS_SANDBOX before
mutating it in the test (e.g., capture it at top of the test file or inside the
setup block), set process.env.IS_SANDBOX = '1' for the test(s) that need it, and
restore the saved original value in an afterEach or afterAll teardown so the
global environment is not leaked across registry tests; update registry.test.ts
to perform this save/set/restore around the code that currently assigns
process.env.IS_SANDBOX so tests remain order-independent.
In `@packages/server/src/index.ts`:
- Around line 206-215: The current check only verifies process.env.DATABASE_URL
exists when KEYCLOAK_URL is enabled; update the validation to also reject
non-PostgreSQL schemes by verifying DATABASE_URL starts with a PostgreSQL scheme
(e.g. "postgresql://" or "postgres://") before continuing. If the scheme is
invalid, call getLog().fatal with the same 'multi_user_mode_requires_postgresql'
context (and a hint mentioning acceptable schemes), print the console error
about requiring PostgreSQL, and exit with process.exit(1) to fail fast; apply
this logic around the existing block that references process.env.DATABASE_URL
and KEYCLOAK_URL.
In `@packages/server/src/middleware/auth.ts`:
- Around line 13-17: The PUBLIC_PREFIXES entry '/api/auth/' is too broad and
causes isPublicPath to treat all /api/auth/* routes as public (preventing
oidcUser from being set); update PUBLIC_PREFIXES and isPublicPath so only
intended routes are public: remove or change '/api/auth/' to an exact path like
'/api/auth' (or list only explicit public endpoints such as '/api/auth/login',
'/api/auth/logout') and change isPublicPath to treat entries that end with a
slash as true prefixes but treat entries without a trailing slash as exact
matches (i.e., use equality for exact paths and startsWith only for deliberate
prefix entries); apply the same fix wherever PUBLIC_PREFIXES is referenced
(including the other occurrence around lines 50-52) so /api/auth/me and
/api/auth/github/callback no longer bypass JWT parsing.
In `@packages/server/src/routes/api.ts`:
- Around line 1747-1753: If an authenticated OIDC user exists but no DB user or
GitHub token is found, do not fall back to org-level credentials; instead detect
when oidcUser is truthy and (userDb.getUserByKeycloakSub(oidcUser.sub) returns
null OR userDb.getGithubToken(...) returns undefined) and return an explicit
authorization error (e.g., 403/401) to the caller rather than continuing to call
cloneRepository with org credentials; update the logic around oidcUser,
userDb.getUserByKeycloakSub, and userDb.getGithubToken to short-circuit with an
error response for multi-user authenticated requests to avoid silent permission
broadening.
- Around line 1296-1302: If an oidcUser is present but
userDb.getUserByKeycloakSub(oidcUser.sub) returns null, stop processing and
throw a clear error instead of leaving webUserId undefined; update the logic
around the oidcUser resolution (the block using oidcUser,
userDb.getUserByKeycloakSub, and webUserId) to check for a missing dbUser and
throw an explicit error (e.g., "OIDC user not found in DB" or similar) so
downstream conversation creation never proceeds without created_by_user_id in
multi-user mode.
In `@packages/server/src/routes/auth.ts`:
- Around line 135-137: The current logging in the token exchange path leaks the
raw IdP response body; update the handler around tokenRes and getLog() (the
block that logs 'auth.token_exchange_failed') to never include the raw body or
any tokens/PII—replace it with tokenRes.status and a bounded/sanitized error
identifier (e.g., parsed error code or a constant like "<idp-error-redacted>" or
the first 200 chars only), or log only status plus a safe errorCode variable;
then return the same 502 response via c.json({ error: 'Token exchange failed' })
without exposing IdP details.
- Around line 119-132: Wrap the external token fetch (the call that assigns
tokenRes when POSTing to `${keycloakBase()}/protocol/openid-connect/token` in
auth.ts) in an AbortController-based timeout and add a try/catch to handle
network failures; ensure you abort the request after a configurable timeout,
catch and log the error via the existing logger or return a proper 504/502 style
response, and also check tokenRes.ok and handle non-2xx responses (parsing and
surfacing error details) instead of assuming success. Apply the same pattern to
the other external OIDC/fetch calls in this file (the later fetch sequence
referenced around the second block) so all external OAuth/OIDC HTTP calls use
AbortController timeouts, explicit error handling, and clear non-OK response
handling.
- Around line 62-315: Replace the direct app.get(...) registrations for the auth
endpoints with the OpenAPI wrapper: for each route currently registered via
app.get('/api/auth/login', ...), app.get('/api/auth/callback', ...),
app.get('/api/auth/logout', ...), app.get('/api/auth/me', ...),
app.get('/api/auth/github', ...), and app.get('/api/auth/github/callback', ...)
wrap the handler with registerOpenApiRoute(createRoute({...}), handler) using
appropriate route metadata (method, path, summary, responses) so the routes
conform to the server contract; keep the existing handler functions intact but
move them into the second arg of registerOpenApiRoute and import/initialize
createRoute and registerOpenApiRoute as used elsewhere in the codebase.
- Around line 166-171: The auth cookie set in the setCookie call for
'archon_access_token' is missing the secure flag; update the options object
passed to setCookie(c, 'archon_access_token', tokens.access_token, { ... }) to
include secure: true so the cookie is only sent over TLS, preserving httpOnly,
sameSite, path, and maxAge settings (optionally gate secure by NODE_ENV if you
need HTTP in local dev).
In `@packages/web/src/lib/api.ts`:
- Line 63: In fetchJSON (the helper that calls fetch), callers can override
credentials because options are spread after the default; change the call so
options are spread first and credentials is set last (i.e., fetch(url, {
...options, credentials: 'include' })) so that credentials: 'include' cannot be
overridden; update the fetch invocation inside fetchJSON accordingly.
- Around line 565-567: The current response handling in the block that checks
res.status and creates the "data" variable incorrectly returns null for all
non-OK responses; change it so that only 401 and 404 return null and any other
non-ok status throws an Error (include status and a short message or response
text for diagnostics) before attempting to call res.json(); preserve the
existing typing to { authenticated?: false } & Partial<AuthMeResponse> when
parsing successful responses and ensure thrown errors surface upstream instead
of being swallowed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 27130310-687c-4bb9-9ff5-99d30f707a64
⛔ Files ignored due to path filters (1)
bun.lockis excluded by!**/*.lock
📒 Files selected for processing (28)
.archon/workflows/defaults/archon-hyperframes-video.yaml.env.examplemigrations/022_users_and_auth.sqlpackages/core/src/db/conversations.test.tspackages/core/src/db/conversations.tspackages/core/src/db/users.tspackages/core/src/db/workflows.test.tspackages/core/src/db/workflows.tspackages/core/src/handlers/clone.tspackages/core/src/index.tspackages/core/src/orchestrator/orchestrator.tspackages/core/src/types/index.tspackages/core/src/utils/token-crypto.test.tspackages/core/src/utils/token-crypto.tspackages/providers/src/claude/provider.test.tspackages/providers/src/registry.test.tspackages/server/package.jsonpackages/server/src/index.tspackages/server/src/middleware/auth.tspackages/server/src/routes/api.codebases.test.tspackages/server/src/routes/api.tspackages/server/src/routes/auth.tspackages/web/src/components/layout/TopNav.tsxpackages/web/src/hooks/useAuth.tspackages/web/src/lib/api.tspackages/workflows/src/defaults/bundled-defaults.generated.tspackages/workflows/src/schemas/workflow-run.tspackages/workflows/src/store.ts
| VIDEO_TYPE="${3:-product-launch}" | ||
| OUTPUT_NAME="${4:-hyperframes-output}" |
There was a problem hiding this comment.
Validate VIDEO_TYPE against the declared allowlist in setup.
Unsupported values currently pass through silently, which weakens deterministic behavior in the generation step.
Proposed fix
VIDEO_TYPE="${3:-product-launch}"
+ case "$VIDEO_TYPE" in
+ social-ad|product-launch|product-tour|brand-reel|feature-announcement|teaser) ;;
+ *)
+ echo "ERROR: Unsupported video type '$VIDEO_TYPE'."
+ echo "Allowed: social-ad | product-launch | product-tour | brand-reel | feature-announcement | teaser"
+ exit 1
+ ;;
+ esacAlso applies to: 122-128
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.archon/workflows/defaults/archon-hyperframes-video.yaml around lines 22 -
23, Ensure VIDEO_TYPE is validated against an explicit allowlist before
proceeding: define an ALLOWED_VIDEO_TYPES (e.g., array or comma-separated
string) and check that VIDEO_TYPE (from the parameter default
VIDEO_TYPE="${3:-product-launch}") is a member; if not, print a clear error and
exit non-zero. Apply the same validation logic to the other occurrence
referenced (lines ~122-128) so unsupported values cannot pass silently; keep the
default assignment but perform the membership check immediately after and fail
fast with a helpful message listing allowed values.
| URL=$(echo "$setup_output" | grep '^URL=' | cut -d= -f2-) | ||
| PROJECT_DIR=$(echo "$setup_output" | grep '^PROJECT_DIR=' | cut -d= -f2-) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "1) Find variable-reference patterns used in workflows:"
rg -n '\$[A-Za-z0-9_-]+(_output|\.output)' .archon/workflows
echo
echo "2) Inspect workflow substitution implementation for supported forms:"
rg -n --type=ts 'substitut|ARGUMENTS|WORKFLOW_ID|\.output|_output' packages/workflows/srcRepository: coleam00/Archon
Length of output: 50372
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".archon/workflows/defaults/archon-hyperframes-video.yaml"
echo "=== Hyperframes video workflow: context around lines 50-210 ==="
nl -ba "$FILE" | sed -n '40,220p'
echo
echo "=== Search for $setup_output / _output usage patterns in repo ==="
rg -n "\$[A-Za-z0-9_-]+_output" .archon/workflows packages/workflows/src
echo
echo "=== Search workflow substitution implementation for node output refs pattern ==="
rg -n "substituteNodeOutputRefs|\\$\\([a-zA-Z_].*\\)\\.output|_output|\\.output\\b" packages/workflows/src/dag-executor.ts packages/workflows/src/loader.ts packages/workflows/src/*.ts
echo
echo "=== Search for documentation/guidelines mentions of $nodeId.output in loader/builder ==="
rg -n "NOT shell-quoted|\\$nodeId\\.output|nodeId\\.output|prefer \\`json\\.loads\\(\"\\\"\\\"\\$nodeId\\.output\\\"\\\"\\\"\\)\\`|_output" packages/workflows/src packages/workflows/README* .archon/workflows/defaults/archon-workflow-builder.yamlRepository: coleam00/Archon
Length of output: 184
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".archon/workflows/defaults/archon-hyperframes-video.yaml"
echo "=== Hyperframes video workflow: context around lines 50-210 ==="
cat -n "$FILE" | sed -n '50,210p'
echo
echo "=== Search for \$*(_output) usage patterns in workflows ==="
rg -n "\$[A-Za-z0-9_-]+_output\b" .archon/workflows
echo
echo "=== Search for code handling node output refs (look for _output vs .output) ==="
rg -n "substituteNodeOutputRefs|_output\b|nodeId\.output|\.output(\b|\[)" packages/workflows/src
echo
echo "=== Show relevant sections from dag-executor.ts and loader.ts ==="
sed -n '240,380p' packages/workflows/src/dag-executor.ts
sed -n '1,220p' packages/workflows/src/loader.tsRepository: coleam00/Archon
Length of output: 6947
🏁 Script executed:
#!/bin/bash
set -u
# 1) Re-run the hyperframes file validation quickly (no nl dependency)
FILE=".archon/workflows/defaults/archon-hyperframes-video.yaml"
echo "=== Hyperframes: lines 60-200 ==="
cat -n "$FILE" | sed -n '60,200p'
# 2) Find $<id>_output usage across workflows (do not fail if no matches)
echo
echo "=== Search workflows for \$<id>_output (literal) ==="
rg -n '\$[A-Za-z0-9_-]+_output' .archon/workflows || true
echo
echo "=== Search codebase for env var injection / handling of _output in dag executor ==="
rg -n "_output\b|node.*_output|process\.env.*_output|env.*_output" packages/workflows/src/dag-executor.ts packages/workflows/src -S || true
echo
echo "=== Locate where bash nodes execute and how they pass upstream output ==="
rg -n "type: 'bash'|execFileAsync\\('bash'|bash\"|runBash|bash node|substituteNodeOutputRefs\\(.*true" packages/workflows/src/dag-executor.ts || true
echo
echo "=== Print the likely relevant section in dag-executor.ts near bash execution ==="
# Best-effort: show a window around the match locations above
sed -n '2100,2300p' packages/workflows/src/dag-executor.ts 2>/dev/null || true
sed -n '1260,1600p' packages/workflows/src/dag-executor.ts 2>/dev/null || trueRepository: coleam00/Archon
Length of output: 44431
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the implementation details for quoting/escaping of node output refs in bash nodes
rg -n "function substituteNodeOutputRefs|shellQuoteOrFile|escapedForBash|NODE_OUTPUT_FILE_THRESHOLD" packages/workflows/src/dag-executor.ts
# Print the relevant section around substituteNodeOutputRefs
sed -n '240,380p' packages/workflows/src/dag-executor.tsRepository: coleam00/Archon
Length of output: 7139
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".archon/workflows/defaults/archon-hyperframes-video.yaml"
echo "=== Hyperframes video workflow: lines 1-60 (setup) ==="
cat -n "$FILE" | sed -n '1,70p'Repository: coleam00/Archon
Length of output: 3368
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE=".archon/workflows/defaults/archon-workflow-builder.yaml"
echo "=== archon-workflow-builder.yaml: relevant guideline section ==="
cat -n "$FILE" | sed -n '150,220p'Repository: coleam00/Archon
Length of output: 3897
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "function shellQuote\\b" packages/workflows/src/dag-executor.ts
sed -n '200,280p' packages/workflows/src/dag-executor.tsRepository: coleam00/Archon
Length of output: 2897
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect a couple of existing bash-node patterns for $<nodeId>.output usage
sed -n '35,90p' .archon/workflows/maintainer/maintainer-standup.yaml
echo "----"
sed -n '40,90p' .archon/workflows/maintainer/maintainer-review-pr.yaml
echo "----"
sed -n '1,120p' .archon/workflows/test-workflows/e2e-deterministic.yaml
echo "----"
sed -n '1,140p' .archon/workflows/experimental/archon-release.yaml | sed -n '300,420p'Repository: coleam00/Archon
Length of output: 5612
Fix bash nodes to use $setup.output (not $setup_output) in archon-hyperframes-video
- In
.archon/workflows/defaults/archon-hyperframes-video.yaml, bash nodes reference"$setup_output"(e.g., lines 71-72, 140, 160-161, 184-185). The workflow engine substitutes upstream values via$nodeId.outputrefs inside the bash script; it does not inject an env var named<nodeId>_output, soURL/PROJECT_DIR/OUTPUT_NAMEcan become empty and break downstream steps. setupsetsVIDEO_TYPE(line 22) but does not validate against the documented allowlist, so unsupported values silently degrade the intended duration mapping.
Proposed fix
# capture (lines 71-72)
SETUP_OUTPUT=$setup.output
URL=$(echo "$SETUP_OUTPUT" | grep '^URL=' | cut -d= -f2-)
PROJECT_DIR=$(echo "$SETUP_OUTPUT" | grep '^PROJECT_DIR=' | cut -d= -f2-)
# preview (line 140)
SETUP_OUTPUT=$setup.output
PROJECT_DIR=$(echo "$SETUP_OUTPUT" | grep '^PROJECT_DIR=' | cut -d= -f2-)
# render (lines 160-161)
SETUP_OUTPUT=$setup.output
PROJECT_DIR=$(echo "$SETUP_OUTPUT" | grep '^PROJECT_DIR=' | cut -d= -f2-)
OUTPUT_NAME=$(echo "$SETUP_OUTPUT" | grep '^OUTPUT_NAME=' | cut -d= -f2-)
# snapshot (lines 184-185)
SETUP_OUTPUT=$setup.output
PROJECT_DIR=$(echo "$SETUP_OUTPUT" | grep '^PROJECT_DIR=' | cut -d= -f2-)
OUTPUT_NAME=$(echo "$SETUP_OUTPUT" | grep '^OUTPUT_NAME=' | cut -d= -f2-)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.archon/workflows/defaults/archon-hyperframes-video.yaml around lines 71 -
72, Bash nodes are incorrectly reading an env var named "$setup_output" instead
of the workflow-provided "$setup.output"; update each bash script block that
currently uses setup_output (e.g., the capture, preview, render, snapshot
blocks) to first assign SETUP_OUTPUT=$setup.output and then parse URL,
PROJECT_DIR, and OUTPUT_NAME from SETUP_OUTPUT (use the same grep/cut parsing
logic) so values are populated correctly; additionally, in the setup step ensure
VIDEO_TYPE is validated against the documented allowlist (explicitly check
accepted values and fail or default if invalid) to avoid silent mis-mapping of
duration.
| await pool.query( | ||
| `UPDATE remote_agent_users | ||
| SET github_oauth_token = $1, github_username = $2, updated_at = NOW() | ||
| WHERE id = $3`, | ||
| [encrypted, githubUsername, userId] | ||
| ); | ||
| getLog().info({ userId, githubUsername }, 'user.github_token_stored'); | ||
| } |
There was a problem hiding this comment.
Fail fast when no user row is updated, and avoid logging githubUsername.
setGithubToken currently reports success even when no row matched userId, and it logs user-identifying data (githubUsername).
💡 Suggested fix
- await pool.query(
+ const result = await pool.query(
`UPDATE remote_agent_users
SET github_oauth_token = $1, github_username = $2, updated_at = NOW()
WHERE id = $3`,
[encrypted, githubUsername, userId]
);
- getLog().info({ userId, githubUsername }, 'user.github_token_stored');
+ if (result.rowCount !== 1) {
+ throw new Error(`setGithubToken failed: no user found for id=${userId}`);
+ }
+ getLog().info({ userId }, 'user.github_token_stored');🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/db/users.ts` around lines 68 - 75, The update currently
always logs success even when no row was updated and logs the GitHub username
(PII); in setGithubToken, capture the result of pool.query, check
result.rowCount and throw a clear error (e.g., "no user updated for id
<userId>") when rowCount === 0 to fail fast, and change the getLog().info call
to avoid logging githubUsername or the token (log only non-PII context such as
userId or a generic message). Ensure you reference the pool.query call and
replace the existing getLog().info usage accordingly.
| const isGithubUrl = workingUrl.includes('github.com'); | ||
| const { token: forgeToken, scheme: authScheme } = resolveForgeAuth(workingUrl); | ||
| const effectiveToken = isGithubUrl && userGithubToken ? userGithubToken : forgeToken; | ||
|
|
There was a problem hiding this comment.
Use strict hostname matching before applying per-user GitHub token.
Line 333 uses substring matching (includes('github.com')), which is unsafe for host-confusable URLs and can leak user tokens to non-GitHub hosts. Parse the URL and require hostname === 'github.com' (or an exact allowlist).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/handlers/clone.ts` around lines 333 - 336, The current
substring check for isGithubUrl is unsafe; parse workingUrl using the URL
constructor and set isGithubUrl only when the parsed hostname strictly equals
'github.com' (or matches an exact allowlist), then compute effectiveToken as
before (use userGithubToken only when that strict host check passes); ensure any
URL parsing errors are caught and treated as non-GitHub so you don’t
accidentally apply userGithubToken. Reference: update the computation of
isGithubUrl, the use of workingUrl, and the subsequent effectiveToken assignment
that depends on resolveForgeAuth and userGithubToken.
| const parsed = safeParseUrl(workingUrl); | ||
| if (parsed) { | ||
| cloneUrl = `https://${authScheme}${forgeToken}@${parsed.hostname}${parsed.pathname}`; | ||
| cloneUrl = `https://${authScheme}${effectiveToken}@${parsed.hostname}${parsed.pathname}`; |
There was a problem hiding this comment.
Preserve host:port when rebuilding authenticated clone URL.
Line 340 uses parsed.hostname, which strips custom ports. Use parsed.host so self-hosted forge URLs on non-default ports continue cloning correctly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/handlers/clone.ts` at line 340, The authenticated clone URL
reconstruction in clone.ts builds cloneUrl using parsed.hostname which drops
custom ports; update the logic where cloneUrl is assigned (the line creating
cloneUrl) to use parsed.host (or parsed.host || parsed.hostname as a fallback)
so self-hosted forge URLs with non-default ports are preserved when creating the
authenticated URL.
| const tokenRes = await fetch(`${keycloakBase()}/protocol/openid-connect/token`, { | ||
| method: 'POST', | ||
| headers: { 'Content-Type': 'application/x-www-form-urlencoded' }, | ||
| body: new URLSearchParams({ | ||
| grant_type: 'authorization_code', | ||
| client_id: process.env.KEYCLOAK_CLIENT_ID ?? '', | ||
| ...(process.env.KEYCLOAK_CLIENT_SECRET | ||
| ? { client_secret: process.env.KEYCLOAK_CLIENT_SECRET } | ||
| : {}), | ||
| redirect_uri: `${appBase(c)}/api/auth/callback`, | ||
| code, | ||
| code_verifier: stored.codeVerifier, | ||
| }), | ||
| }); |
There was a problem hiding this comment.
External OAuth/OIDC calls need timeout + network-failure handling.
These fetch calls have no abort timeout. Network stalls or DNS/TLS failures will currently bubble into 500s or hung requests.
Also applies to: 272-300
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/server/src/routes/auth.ts` around lines 119 - 132, Wrap the external
token fetch (the call that assigns tokenRes when POSTing to
`${keycloakBase()}/protocol/openid-connect/token` in auth.ts) in an
AbortController-based timeout and add a try/catch to handle network failures;
ensure you abort the request after a configurable timeout, catch and log the
error via the existing logger or return a proper 504/502 style response, and
also check tokenRes.ok and handle non-2xx responses (parsing and surfacing error
details) instead of assuming success. Apply the same pattern to the other
external OIDC/fetch calls in this file (the later fetch sequence referenced
around the second block) so all external OAuth/OIDC HTTP calls use
AbortController timeouts, explicit error handling, and clear non-OK response
handling.
| const body = await tokenRes.text(); | ||
| getLog().error({ status: tokenRes.status, body }, 'auth.token_exchange_failed'); | ||
| return c.json({ error: 'Token exchange failed' }, 502); |
There was a problem hiding this comment.
Avoid logging raw token-exchange response bodies.
Logging body from the IdP response can leak sensitive error details into logs; log status and a bounded/sanitized error code instead.
As per coding guidelines: “never log API keys, tokens ... or PII”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/server/src/routes/auth.ts` around lines 135 - 137, The current
logging in the token exchange path leaks the raw IdP response body; update the
handler around tokenRes and getLog() (the block that logs
'auth.token_exchange_failed') to never include the raw body or any
tokens/PII—replace it with tokenRes.status and a bounded/sanitized error
identifier (e.g., parsed error code or a constant like "<idp-error-redacted>" or
the first 200 chars only), or log only status plus a safe errorCode variable;
then return the same 502 response via c.json({ error: 'Token exchange failed' })
without exposing IdP details.
| setCookie(c, 'archon_access_token', tokens.access_token, { | ||
| httpOnly: true, | ||
| sameSite: 'Lax', | ||
| path: '/', | ||
| maxAge: 60 * 60 * 8, // 8 hours | ||
| }); |
There was a problem hiding this comment.
Auth cookie is missing the secure flag.
archon_access_token is HTTP-only, but without secure: true it can be sent over non-TLS HTTP in misconfigured environments.
Suggested fix
setCookie(c, 'archon_access_token', tokens.access_token, {
httpOnly: true,
+ secure: true,
sameSite: 'Lax',
path: '/',
maxAge: 60 * 60 * 8, // 8 hours
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| setCookie(c, 'archon_access_token', tokens.access_token, { | |
| httpOnly: true, | |
| sameSite: 'Lax', | |
| path: '/', | |
| maxAge: 60 * 60 * 8, // 8 hours | |
| }); | |
| setCookie(c, 'archon_access_token', tokens.access_token, { | |
| httpOnly: true, | |
| secure: true, | |
| sameSite: 'Lax', | |
| path: '/', | |
| maxAge: 60 * 60 * 8, // 8 hours | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/server/src/routes/auth.ts` around lines 166 - 171, The auth cookie
set in the setCookie call for 'archon_access_token' is missing the secure flag;
update the options object passed to setCookie(c, 'archon_access_token',
tokens.access_token, { ... }) to include secure: true so the cookie is only sent
over TLS, preserving httpOnly, sameSite, path, and maxAge settings (optionally
gate secure by NODE_ENV if you need HTTP in local dev).
|
|
||
| async function fetchJSON<T>(url: string, options?: RequestInit): Promise<T> { | ||
| const res = await fetch(url, options); | ||
| const res = await fetch(url, { credentials: 'include', ...options }); |
There was a problem hiding this comment.
Make cookie credentials non-overridable in fetchJSON.
{ credentials: 'include', ...options } allows callers to override credentials accidentally. If this helper is meant to always include session cookies, spread first and set credentials last.
Suggested patch
- const res = await fetch(url, { credentials: 'include', ...options });
+ const res = await fetch(url, { ...options, credentials: 'include' });📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const res = await fetch(url, { credentials: 'include', ...options }); | |
| const res = await fetch(url, { ...options, credentials: 'include' }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/web/src/lib/api.ts` at line 63, In fetchJSON (the helper that calls
fetch), callers can override credentials because options are spread after the
default; change the call so options are spread first and credentials is set last
(i.e., fetch(url, { ...options, credentials: 'include' })) so that credentials:
'include' cannot be overridden; update the fetch invocation inside fetchJSON
accordingly.
| if (res.status === 401 || res.status === 404) return null; | ||
| if (!res.ok) return null; | ||
| const data = (await res.json()) as { authenticated?: false } & Partial<AuthMeResponse>; |
There was a problem hiding this comment.
Don’t collapse unexpected /api/auth/me failures into “unauthenticated”.
Returning null for all non-OK responses hides 5xx/backend failures and makes diagnostics harder. Keep null for 401/404, but throw for other statuses.
Suggested patch
export async function getAuthMe(): Promise<AuthMeResponse | null> {
const res = await fetch('/api/auth/me', { credentials: 'include' });
if (res.status === 401 || res.status === 404) return null;
- if (!res.ok) return null;
+ if (!res.ok) {
+ const body = await res.text();
+ const truncated = body.length > 200 ? body.slice(0, 200) + '...' : body;
+ throw new Error(`GET /api/auth/me failed (${String(res.status)}): ${truncated}`);
+ }
const data = (await res.json()) as { authenticated?: false } & Partial<AuthMeResponse>;
if (data.authenticated === false) return null;
return data as AuthMeResponse;
}As per coding guidelines: “Prefer throwing early with clear error for unsupported or unsafe states; never silently swallow errors…”.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (res.status === 401 || res.status === 404) return null; | |
| if (!res.ok) return null; | |
| const data = (await res.json()) as { authenticated?: false } & Partial<AuthMeResponse>; | |
| if (res.status === 401 || res.status === 404) return null; | |
| if (!res.ok) { | |
| const body = await res.text(); | |
| const truncated = body.length > 200 ? body.slice(0, 200) + '...' : body; | |
| throw new Error(`GET /api/auth/me failed (${String(res.status)}): ${truncated}`); | |
| } | |
| const data = (await res.json()) as { authenticated?: false } & Partial<AuthMeResponse>; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/web/src/lib/api.ts` around lines 565 - 567, The current response
handling in the block that checks res.status and creates the "data" variable
incorrectly returns null for all non-OK responses; change it so that only 401
and 404 return null and any other non-ok status throws an Error (include status
and a short message or response text for diagnostics) before attempting to call
res.json(); preserve the existing typing to { authenticated?: false } &
Partial<AuthMeResponse> when parsing successful responses and ensure thrown
errors surface upstream instead of being swallowed.
…fixes Discovered while deploying live against Keycloak: Backend - middleware/auth.ts: always attempt cookie validation (was skipping on public /api/auth/* paths, so /api/auth/me always 401'd after login and the web UI looped back to /login). - middleware/auth.ts: validate JWT issuer + azp claim instead of audience. Keycloak access tokens carry aud: "account", not the client id, so the prior audience: clientId check always failed. - routes/auth.ts: appBase() now honors X-Forwarded-Proto / X-Forwarded-Host so the OIDC redirect_uri matches the public URL behind a reverse proxy. - routes/auth.ts: new DELETE /api/auth/github endpoint that best-effort revokes the token on GitHub then clears the encrypted token + username. - db/users.ts: new clearGithubToken(userId) helper. Frontend - lib/api.ts: disconnectGithub() client. - SettingsPage.tsx: new "Account & Connections" card showing signed-in identity, Sign out, GitHub connection status, and Reconnect/Manage on GitHub/Disconnect buttons. Hidden in single-user mode (no user from useAuth).
Critical
- middleware/auth.ts: narrow PUBLIC_PATHS to {/api/auth/login, callback,
logout, health, openapi.json} + PUBLIC_PREFIXES=['/webhooks/']. Routes
like /api/auth/me and /api/auth/github/callback now hit JWT validation
and receive a populated oidcUser.
- handlers/clone.ts: parse URL and require hostname === 'github.com' before
applying per-user token (substring match was vulnerable to host-confusable
URLs like github.com.attacker.io).
Major
- handlers/clone.ts: use parsed.host (preserves :port) when rebuilding
authenticated clone URL — fixes self-hosted forges on non-default ports.
- orchestrator.ts: wrap getConversationById attribution lookup in try/catch
so a transient DB error can't abort the entire workflow dispatch.
- server/index.ts: require DATABASE_URL to start with postgres:// or
postgresql:// (was only checking presence — sqlite:// would slip through).
- api.ts (POST /conversations): fail fast if oidcUser is set but no matching
remote_agent_users row exists, rather than silently creating an unaudited
conversation.
- api.ts (POST /codebases): fail fast on missing DB user, and require the
user's GitHub token for github.com URLs in multi-user mode (don't
silently broaden to the org token, which would mis-attribute the clone).
- auth.ts: fetchWithTimeout wrapper (default 10s) around all external IdP
and GitHub fetches; network failures return 502 instead of hanging.
- auth.ts: don't log raw token-exchange response bodies — extract only the
short `error` code (PII guard).
- auth.ts: add secure:true to archon_access_token cookie.
- db/users.ts (setGithubToken): throw when rowCount != 1 (was silent on
missing user); drop githubUsername from log payload.
- web/lib/api.ts (getAuthMe): throw on unexpected non-OK statuses instead
of collapsing them to null (5xx failures are now visible).
Minor
- token-crypto.test.ts: restore TOKEN_ENCRYPTION_KEY via afterEach so
throwing tests can't leak env state into later tests.
- provider.test.ts + registry.test.ts: restore IS_SANDBOX via afterEach
so the override I added in this session doesn't mask root-check
regressions in other test files in the same process.
- web/lib/api.ts (fetchJSON): set credentials AFTER spread so callers
cannot accidentally override credentials: 'include'.
Skipped from CodeRabbit's list:
- auth.ts:378 ("use registerOpenApiRoute wrapper"): marked Heavy lift;
out of scope for this fix-pass. Filed as follow-up.
- hyperframes-video.yaml comments: file already removed in 1850915.
Addresses CodeRabbit H1 (auth.ts:378) — per CLAUDE.md, all new API routes
must use registerOpenApiRoute(createRoute({...}), handler) so they appear
in the generated OpenAPI spec (/api/openapi.json).
- New routes/schemas/auth.schemas.ts with authMeResponseSchema, okResponseSchema,
and query schemas for callback endpoints.
- routes/auth.ts: 7 createRoute configs with proper tags/summaries/responses.
302 redirects declared as empty responses; Location set by c.redirect() at
runtime.
- Local registerOpenApiRoute helper (matches the pattern in api.ts).
- Side effect: added the auth-required check on GET /api/auth/github (start)
so the OpenAPI 401 declaration is honest — only login/callback/logout are
in the middleware allowlist.
- Side effect: added secure:true to the github_oauth_state CSRF cookie.
No behavior changes for handlers themselves.
The previous refactor commit added imports for the new schema file but forgot to git-add it before commit -am (only stages tracked files). This commit lands the file so c5b631f actually compiles.
|
@$leex279 related to #1348 — overlapping area or partial fix. |
|
@$leex279 related to #1469 — overlapping area or partial fix. |
|
@$leex279 related to #1348 — overlapping area or partial fix. |
|
@$leex279 related to #1469 — overlapping area or partial fix. |
In multi-user mode, workflow subprocesses (bash/script nodes and Claude/
Codex/Pi AI providers) inherited the container's GH_TOKEN/GITHUB_TOKEN
unconditionally via process.env. A user who had not connected their own
GitHub account could therefore have 'gh' and 'git push' inside any
workflow silently authenticate as the shared org account, including on
private repos owned by another user.
This change introduces a per-run GitHub token policy
(packages/workflows/src/utils/github-token-policy.ts):
- New env flag ARCHON_ALLOW_ORG_GITHUB_TOKEN_FALLBACK (default false)
- Multi-user mode + user-initiated run + user has personal token:
inject the user's token as GH_TOKEN/GITHUB_TOKEN, clear COPILOT_GITHUB_TOKEN.
- Multi-user mode + user-initiated run + no user token + flag disabled
(default): scrub all three keys, so 'gh' and 'git push' fail with an
auth error instead of silently using the org token.
- Multi-user mode + user-initiated run + no user token + flag enabled:
legacy behaviour, org token kept (opt-in).
- Server-initiated runs (no created_by_user_id — GitHub webhook, cron, CLI):
never scrubbed; trusted server context.
- Single-user mode (no KEYCLOAK_URL): never scrubbed; backward-compatible.
Plumbing:
- WorkflowDeps gains optional getUserGithubToken(userId), wired from
@archon/core/db/users.getGithubToken (already decrypts at rest).
- dag-executor resolves the run owner's token once per run and threads
the resulting overrides into:
- executeBashNode / executeScriptNode subprocess env builders
(Reflect.deleteProperty for hard-delete semantics)
- resolveNodeProviderAndModel / buildLoopNodeOptions baseOptions.env,
so AI provider subprocesses receive the same overrides. The
providers' existing process.env merge treats '' as scrub.
- cloneRepository: the forge-level GH_TOKEN/GITLAB_TOKEN/GITEA_TOKEN
fallback is gated on the same policy. github.com URLs are still
blocked at the API layer when the user has no personal token; this
change closes the parallel hole for non-GitHub forges.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lows Previous commit (d18e7c8) only covered workflow dispatch (dag-executor). Free-form chat — the most common path — goes through orchestrator-agent which calls aiClient.sendQuery directly, spawning Claude/Codex with process.env unaltered. So a non-connected user chatting with Claude could still have the model's gh/git tool calls succeed using the org GH_TOKEN. Fix: in orchestrator-agent.handleMessage, resolve the conversation owner's personal token via userDb.getGithubToken and call resolveGithubTokenOverrides() the same way dag-executor does. Merge the result LAST into requestOptions.env so it wins over any per-codebase env that happens to declare GH_TOKEN. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review SummaryVerdict: minor-fixes-needed PR #1774 adds OIDC/Keycloak multi-user authentication, per-user GitHub OAuth with AES-256-GCM token encryption, and a GitHub token isolation policy for workflow subprocesses. The implementation is solid — error handling is correct across all IdP/GitHub calls, comments explain the security-sensitive decisions well, and the code passes CLAUDE.md style checks. Two gaps need to be resolved before merge: a critical documentation gap (new public surfaces undocumented) and a test that exists but never runs in CI. Blocking issues
Suggested fixes
Minor / nice-to-have
Compliments
Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, error-handling, test-coverage, comment-quality, docs-impact. |
…icy, worktree git identity (PR-C G4–G6) Attribution core for PR-C. All three paths are no-ops unless per-user GitHub is enabled (GITHUB_APP_ID + TOKEN_ENCRYPTION_KEY), so solo PAT installs are unchanged. G4 — requires gate: - workflowRequirementSchema + `requires: [github]` on workflowBaseSchema - workflow-requirements.ts: assertWorkflowRequirementsMet + WorkflowRequirementError - orchestrator hard-fails a requiring workflow at invocation (before isolation/ worktree/AI cost) when the originating user hasn't connected G5 — per-user token in workflow subprocess env: - github-token-policy.ts cherry-picked from #1774; KEYCLOAK_URL mode-detector replaced by an injected isPerUserGitHubEnabled flag (kept pure) - WorkflowDeps gains getUserGithubToken + isPerUserGitHubEnabled - executor merges per-user overrides last into config.envVars (user token wins, or '' scrubs the org/bot token when unconnected + fallback off) - store-adapter wires getUserGithubToken → getDecryptedAccessToken G6 — worktree git identity: - IsolationRequest/ResolveRequest gain gitIdentity; resolver forwards it - WorktreeProvider stamps `git config user.email/name` on new worktrees - orchestrator resolves the no-reply email (<id>+<login>@users.noreply...) from the originating userId and passes it on the resolve request Tests: workflow-requirements (3), github-token-policy (8). Type-check + isolation suite green.
…-C) (#1823) * feat(core): per-user GitHub token store + device-flow engine (PR-C foundation) Foundation layer for per-user GitHub attribution (PRD Phase 3). Pure additive infrastructure — nothing wires into the adapters/workflows/web yet, so there is zero behavior change. Subsequent increments add the invocation gate, token policy, git-config plumbing, connect surfaces, and interim web auth. - token-crypto: AES-256-GCM encrypt/decrypt + TOKEN_ENCRYPTION_KEY validation (cherry-picked from the #1774 donor branch; decrypt rewritten for strict TS) - github-auth/config: isPerUserGitHubEnabled (App + key), device-flow client-id loader, assertEncryptionKeyAtBoot fail-fast - remote_agent_user_github_tokens table (PG migration + bundled-schema regen + SQLite createSchema); one row per user, encrypted access/refresh tokens - db/user-github-token-store: save (UPSERT), get, decrypt-with-refresh-on-read (5-min buffer, per-user mutex against double-refresh), delete, no-reply email - github-auth/device-flow: hand-rolled start/poll/refresh/fetchUser over fetch (no client_secret, slow_down + authorization_pending handling) - github-auth/connect-service: device flow -> profile -> conflict-guarded identity bind -> encrypted token persist -> cached display_name/email - users: updateUserGithubProfile, linkGithubIdentity, GithubIdentityConflictError Tests: 30 new (crypto/device-flow/token-store, 10 each). bun run validate green. TS filenames carry -row/-store suffixes to satisfy the local secret-guard hook; the DB table is remote_agent_user_github_tokens. * fix(core): address multi-agent review findings on per-user github tokens Hardening + test coverage for the dormant per-user GitHub token store and device-flow engine. No behavior change for existing installs. Correctness: - token store: separate the refresh call from the DB save so a persist failure after a successful refresh is no longer mislabeled `refresh_failed` and the freshly-issued (now-valid) token is returned instead of discarded; the failure logs `persist_failed` (error) (I1) - linkGithubIdentity: add UNIQUE-violation race recovery (re-SELECT, conflict if a different user won, no-op if same user) so a concurrent double-connect no longer surfaces a raw 23505 (I2) - getUserGithubTokenRecord: type the raw read as number|string and coerce github_user_id, honest against node-postgres returning BIGINT as a string (I5) Types/logging/docs: - row schema: widen token-expiry timestamps to date|string (PG Date vs SQLite ISO text) (I6) - connectGithubForUser: emit github_connect.started/.failed per logging convention (I7) - token store: debug log on the not-connected path; relinked + race-recovered audit logs on linkGithubIdentity (S3/S5) - device-flow: floor-guard the slow_down backoff; surface GitHub's error_description from postForm (S1/S2) - getEncryptionKey: accept an env param so assertEncryptionKeyAtBoot is consistent and injectable - fix stale updateUserGithubProfile comment (I8); CLAUDE.md 11->12 tables and migration header table list, regenerated bundled schema (D1/D2) Tests (+20): linkGithubIdentity (conflict/same-user/new/race) and updateUserGithubProfile; token-store mutex + re-read fallback + persist-failure; new config.test.ts and connect-service.test.ts (ordering: no token saved when the identity bind throws). Deferred: discriminated-union return for getDecryptedAccessToken (no callers yet - premature); literal-union DeviceFlowError.code (clean form needs a string & {} eslint-disable for marginal benefit). Rejected the review's suggestions to key identity on numeric id (breaks adapter login matching), to drop a dead data.interval operand (it is a live field GitHub returns), and to remove a fresh && guard (TS cannot narrow it). * feat(workflows,isolation): requires:[github] gate, per-user token policy, worktree git identity (PR-C G4–G6) Attribution core for PR-C. All three paths are no-ops unless per-user GitHub is enabled (GITHUB_APP_ID + TOKEN_ENCRYPTION_KEY), so solo PAT installs are unchanged. G4 — requires gate: - workflowRequirementSchema + `requires: [github]` on workflowBaseSchema - workflow-requirements.ts: assertWorkflowRequirementsMet + WorkflowRequirementError - orchestrator hard-fails a requiring workflow at invocation (before isolation/ worktree/AI cost) when the originating user hasn't connected G5 — per-user token in workflow subprocess env: - github-token-policy.ts cherry-picked from #1774; KEYCLOAK_URL mode-detector replaced by an injected isPerUserGitHubEnabled flag (kept pure) - WorkflowDeps gains getUserGithubToken + isPerUserGitHubEnabled - executor merges per-user overrides last into config.envVars (user token wins, or '' scrubs the org/bot token when unconnected + fallback off) - store-adapter wires getUserGithubToken → getDecryptedAccessToken G6 — worktree git identity: - IsolationRequest/ResolveRequest gain gitIdentity; resolver forwards it - WorktreeProvider stamps `git config user.email/name` on new worktrees - orchestrator resolves the no-reply email (<id>+<login>@users.noreply...) from the originating userId and passes it on the resolve request Tests: workflow-requirements (3), github-token-policy (8). Type-check + isolation suite green. * feat(adapters,server): route PR/issue comments through the originating user's token (PR-C G7) When per-user GitHub is enabled (App mode + TOKEN_ENCRYPTION_KEY), the GitHub adapter authors outbound comments under the human who triggered the thread, instead of archon[bot]. - handleWebhook caches conversationId → originating archon userId (actorByConversation; App mode only, lost on restart → graceful bot fallback) - postComment prefers a per-user Octokit (getUserOctokit, 5-min cached); on 401 (revoked/expired) it evicts and falls back to the installation token rather than failing the reply - adapter constructor gains an optional getUserToken resolver; server injects it (→ getDecryptedAccessToken) only when isPerUserGitHubEnabled() - no MessageMetadata / core streaming changes — the actor is resolved entirely inside the adapter from the webhook it already processes Existing 64 adapter tests pass (PAT path: actor undefined → unchanged). A dedicated App-mode user-token test is deferred (no mock-provider harness yet); the path is type-safe with a conservative bot fallback. * feat(cli,adapters): connect-github surfaces for CLI and Slack (PR-C G8) - CLI: `archon auth github` — device flow for the CLI user. Identity from ARCHON_USER_ID else $USER/$USERNAME, resolved to a stable 'cli' Archon user. Guarded on isPerUserGitHubEnabled; clear errors for conflict / device-flow / not-enabled. Wired into the cli.ts switch + noGitCommands + usage. - Slack: `/archon connect github` — resolves the invoking user, drives the device flow detached, and delivers the code + result as ephemeral follow-ups via response_url (within its 30-min/5-use window). No orchestrator dispatch. Both reuse the shared connectGithubForUser engine. CLI identity uses ARCHON_USER_ID rather than a config.yaml field (smaller surface than expanding the config schema this PR). Type-check + slack adapter suite (38) green. * feat(server,web,docs): web device-flow endpoints, X-Archon-User auth, Connect UI, docs (PR-C G9–G10) Web surface + interim web auth + docs for per-user GitHub. Device flow (browser-driven, non-blocking): - decomposed device-flow: pollDeviceFlowOnce + persistGithubConnection (connect-service tail), shared by the blocking CLI/Slack path - POST /api/auth/github/device/start | /poll, GET/DELETE /api/auth/github - web api client fns + a Settings "GitHub Identity" card (connect → show code → poll → connected; disconnect) Interim web auth (X-Archon-User): - resolveWebUserId(c) reads ARCHON_WEB_AUTH_HEADER (default X-Archon-User) and resolves a stable 'web' user; absent → NULL (never elevated) - the three web userId TODO sites (create conversation, send message, run workflow) now attribute conversation/message rows and the workflow run Boot: assertEncryptionKeyAtBoot fails fast on a malformed key; App-without-key WARNs + disables per-user (App-for-bot-only stays valid). Docs: .env.example (GITHUB_APP_CLIENT_ID, TOKEN_ENCRYPTION_KEY, ARCHON_ALLOW_ORG_GITHUB_TOKEN_FALLBACK, ARCHON_WEB_AUTH_HEADER); CLAUDE.md (requires:[github], auth endpoints). Test fixes: orchestrator-isolation mocks the new db dep; api.messages / api.workflow-runs updated for the addMessage userId arg. bun run validate green. * fix(pr-c): address multi-agent review findings on per-user github tokens Critical: - C1 (core): wrap all three decryptToken calls in resolveAccessToken in try/catch -> log user_github_token.decrypt_failed and return null, honoring the documented "null on irrecoverable failure" contract. A TOKEN_ENCRYPTION_KEY rotation or tampered ciphertext no longer throws an opaque crypto error and crash workflow dispatch (the requires:[github] gate treats null as unconnected). - C2 (workflows): until_bash loop subprocess now spreads ...(config.envVars ?? {}) last, matching executeBashNode/executeScriptNode. The per-user token scrub and managed per-project env vars reach until_bash; it no longer inherits the server's ambient GitHub token and bypasses the scrub. - C3 (docs): .env.example and the config.ts module doc now name all three vars correctly — GITHUB_APP_ID + TOKEN_ENCRYPTION_KEY arm the gate; GITHUB_APP_CLIENT_ID is additionally required for the connect flow. Previous text named CLIENT_ID as the gate trigger, which could cause a silent lockout. Important: - I1 (server): boot guard warns web_auth.header_trust_on_public_bind when the X-Archon-User trust is live (header set, or per-user mode since the default header is trusted) on a non-loopback bind, mirroring the /internal/* guard. - I2 (server): new strict requireWebUser helper for the 4 auth-required endpoints distinguishes missing header (401) from identity-resolution backend failure (503, added to apiError's union). Best-effort resolveWebUserId kept for attribution paths but now logs headerPresent. - I3 (server): GET/DELETE /api/auth/github wrapped in try/catch with userId context instead of an opaque 500 from the global handler. - I4 (server): corrected the /internal/* inline comment — the guard refuses to start (fatal), it does not merely WARN. Cheap correctness wins: - S1 (core): assertKeyLength 32-byte guard in encryptToken/decryptToken surfaces an actionable error instead of Node's opaque "Invalid key length". - S3 (server): replaced the nested ternary in the poll-status mapping with a testable mapDeviceFlowErrorToPollStatus helper. - S5 (cli): corrected the stale TODO — archon auth github has landed; remaining work is threading resolveCliUserId() (ARCHON_USER_ID/$USER), not config.yaml. Docs: - README table count 7 -> 12; CLAUDE.md @archon/core batch count 7 -> 20 and a new `auth github` CLI entry. - docs-web: `auth github` CLI reference, per-user GitHub env var table in configuration.md, `requires: [github]` in authoring-workflows.md, and a "Step 8: Enable per-user GitHub identity" section in github-app-setup.md. Tests: - T1 (core): direct pollDeviceFlowOnce coverage (pending / slow_down / default-interval / authorized / error). - T2 (server): auth-poll-status.test.ts covers the expired/denied/error mapping; registered in server package.json. Deferred: - T3: locking the isPerUserGitHubEnabled()===false no-op for requires:[github] at its only call site (dispatchOrchestratorWorkflow) needs ~100 lines of new mock scaffolding in orchestrator-agent.test.ts (which today only tests command parsing and mocks neither the config gate nor the token store) to verify a one-line && short-circuit whose two halves are already unit-tested (config.test.ts + workflow-requirements.test.ts). Not worth the scaffolding. bun run validate passes (all 6 checks).
Lockfile drifted during the merge because both branches independently bumped package versions; regenerated with `bun install --linker=hoisted` to satisfy the Dockerfile's `--frozen-lockfile` install step.
Dev's PR-C device-flow added disconnectGithub() at L117; the fork already had its own at L521. Both call DELETE /api/auth/github with the same semantics. Keep dev's (per the merge directive: dev wins on auth) and remove the fork's duplicate. SettingsPage call sites only use the mutationFn return for invalidation, so either return type works.
The keycloak fork extended cloneRepository() with a userGithubToken param and reached into @archon/workflows/utils/github-token-policy (which dev keeps internal). Dev's design centralizes the per-user token policy inside the workflow executor instead. Replace the fork's clone.ts with dev's version and drop the second arg at the api.ts call site so the bundler can resolve the @archon/workflows import. Per merge directive: dev wins on GitHub auth.
…umers orchestrator-agent.ts (chat path) imports resolveGithubTokenOverrides from @archon/workflows/utils/github-token-policy. Dev kept this as an internal relative import inside the workflows package, but the keycloak fork needs it on the chat path too (commit 3413e57). Exposing the subpath is the smallest correct fix: no behavior change for dev's own internal callers, and the fork's multi-user chat-path token policy works.
…v merge Dev's auth.schemas.ts replaced the fork's via -X theirs and only carries the new device-flow schemas. The fork's routes/auth.ts (Keycloak/OIDC gate) still imports authMeResponseSchema, okResponseSchema, oidcCallbackQuerySchema, githubCallbackQuerySchema. Re-add them so the two route files (Keycloak gate + device-flow connect) can coexist.
routes/auth.ts (Keycloak callback) + routes/api.ts + orchestrator-agent.ts all call upsertUser, getUserByKeycloakSub, setGithubToken, getGithubToken, clearGithubToken. Dev's users.ts (taken via -X theirs) didn't have these — it uses a different model (findOrCreateUserByPlatformIdentity + user_github_tokens table). The DB schema still has the fork's columns (keycloak_sub, github_oauth_token, github_username) so the data is intact. Co-located the fork's functions in dev's users.ts with a separate KeycloakUser type so the two user models live side-by-side without mixing types.
Add archon-smart-pr-review-agnostic and archon-comprehensive-pr-review-agnostic under .archon/workflows/experimental/, mirroring archon-fix-github-issue-agnostic. These drop Claude-only bindings so every node inherits the configured default assistant, making them runnable on Codex/Pi-only deployments: - smart variant: removes the `model: haiku` pin on `classify` (Codex rejects Claude model names), adds `additionalProperties: false` to its output_format (required by Codex/OpenAI strict structured output), and drops the Claude-only `mcp:` ntfy notify branch. - comprehensive variant: already binding-free; renamed copy for parity. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
/api/*route is publicly accessible, there is no user table, and a single sharedGITHUB_TOKENenv var is used for all clones. This blocks any hosted, multi-developer deployment./api/*, aremote_agent_userstable with AES-256-GCM-encrypted per-user GitHub OAuth tokens,created_by_user_idattribution on conversations and workflow runs, and per-user GitHub token usage in the clone path. Web UI now redirects unauthenticated users to login and shows a "Connect GitHub" button.KEYCLOAK_URLis unset — single-user mode still works unchanged. No RBAC, no per-user AI provider selection, no multi-tenancy, no CLI/Slack/Telegram/GitHub-adapter auth changes.UX Journey
Before
After (multi-user mode, KEYCLOAK_URL set)
After (single-user mode, KEYCLOAK_URL unset)
Architecture Diagram
Before
After
Connection inventory:
/api/auth/me/api/auth/login/api/auth/github/api/*KEYCLOAK_URLsetuserDb.upsertUser/setGithubToken/api/codebasesuserDb.getGithubTokencloneRepositoryuserGithubToken2nd arg; falls back to envgetOrCreateConversationcreated_by_user_idcolumncreateWorkflowRuncreated_by_user_idcolumncreated_by_user_idfrom parentLabel Snapshot
risk: medium(new auth surface; backwards-compat path mitigates blast radius)size: L(~15 files, new DB migration, new middleware + routes, web UI hooks)core | server | web | adapters | config | docs | testsserver:middleware,server:routes/auth,core:db/users,core:utils/token-crypto,web:hooks/useAuthChange Metadata
featuremultiLinked Issue
Validation Evidence (required)
bun run validateexited 0 against this branch.Security Impact (required)
TOKEN_ENCRYPTION_KEY; server refuses to start in multi-user mode without it; tokens never loggedKEYCLOAK_URLis set without PostgreSQL orTOKEN_ENCRYPTION_KEY.Compatibility / Migration
KEYCLOAK_URLis unset; allcreated_by_user_idcolumns are nullable; pre-auth rows continue to loadKEYCLOAK_URL,KEYCLOAK_REALM,KEYCLOAK_CLIENT_ID,KEYCLOAK_CLIENT_SECRET,TOKEN_ENCRYPTION_KEY,GITHUB_OAUTH_CLIENT_ID,GITHUB_OAUTH_CLIENT_SECRET(all documented in.env.example)migrations/022_users_and_auth.sql(PostgreSQL only)psql $DATABASE_URL < migrations/022_users_and_auth.sqlopenssl rand -hex 32→ set asTOKEN_ENCRYPTION_KEYKEYCLOAK_URL/KEYCLOAK_REALM/KEYCLOAK_CLIENT_ID/KEYCLOAK_CLIENT_SECRETGITHUB_OAUTH_CLIENT_ID/GITHUB_OAUTH_CLIENT_SECRETDATABASE_URL(PostgreSQL — startup refuses SQLite in multi-user mode)Human Verification (required)
bun run validatepasses end-to-end (commits3e7a428c+fe6edfcb)created_by_user_idINSERT arg; ClaudeProvider tests fixed for UID 0 under Docker/CISide Effects / Blast Radius (required)
KEYCLOAK_URLunset, existing users get 401 → infinite redirect loop. Mitigation: integration test on the unset path; manual smoke before rollout.created_by_user_idis nullable, but any later code that assumes it's NOT NULL would break pre-auth rows.Rollback Plan (required)
git revert fe6edfcb 3e7a428c(the two commits on this branch)KEYCLOAK_URLenv var — unset it and restart to instantly fall back to single-user mode without reverting codeDROP TABLE remote_agent_users CASCADE; ALTER TABLE ... DROP COLUMN created_by_user_idif a clean rollback is neededRisks and Mitigations
/api/auth/mereturns 401 whenKEYCLOAK_URLis unset.oidcMiddlewareis a true no-op in single-user mode; add an integration test before next release if not already covered.created_by_user_idis captured but never used downstream (RBAC is post-MVP).🤖 Generated with Claude Code
Summary by CodeRabbit