Skip to content

feat(core): plumb user_id from chat/forge adapters through orchestrator and workflow runs#1783

Merged
Wirasm merged 2 commits into
devfrom
feat/user-identity-foundation
May 28, 2026
Merged

feat(core): plumb user_id from chat/forge adapters through orchestrator and workflow runs#1783
Wirasm merged 2 commits into
devfrom
feat/user-identity-foundation

Conversation

@Wirasm

@Wirasm Wirasm commented May 27, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Problem: Archon is architecturally single-user — one shared GITHUB_TOKEN, no users table, every conversation/message/workflow_run unattributed. A small team sharing one VPS can't tell who triggered which run, and every teammate's PR comments appear under the same GitHub avatar.
  • Why it matters: Without per-user attribution in the schema, every later team-oriented feature (ACLs, "my runs", per-user budgets, audit logs) ships as a refactor instead of a one-line filter. This is the blocking layer.
  • What changed: Two new tables (remote_agent_users + remote_agent_user_identities), nullable user_id on the four primary tables, and end-to-end plumbing of a stable Archon user_id from every chat/forge adapter (Slack, Telegram, Discord, GitHub) through HandleMessageContext into conversations, workflow runs, and isolation environments. Slack additionally enriches first-sight users with their real name via users.info.
  • What did NOT change (scope boundary): Web UI authentication and CLI identity are explicitly deferred — those rows continue to write user_id = NULL (follow-up PR wires the X-Archon-User header and archon auth github device flow). No GitHub App swap yet (separate follow-up). No ACLs, no requires: [github] workflow gate, no UI changes.

UX Journey

Before

  Slack U-id ──► adapter reads event.user ──► ✗ DROPPED before orchestrator
  Telegram from.id ──► emitted in payload   ──► ✗ DROPPED by server handler
  Discord author.id ──► emitted             ──► ✗ DROPPED
  GitHub sender.login ──► used for whitelist ──► ✗ DROPPED after auth check

  handleMessage(platform, conversationId, content, ctx)  // ctx has no userId
       │
       ▼
  INSERT INTO conversations (...)            ◀── no user_id column exists
  INSERT INTO workflow_runs (...)            ◀── no user_id column exists
  INSERT INTO isolation_environments (...)   ◀── no created_by_user_id column

  SQL audit: SELECT user_id FROM remote_agent_conversations
  → ERROR: column "user_id" does not exist

After

  Slack U-id ──► adapter [users.info] ──► resolve('slack', U-id, real_name)
  Telegram from.id ──► [first_name+last_name] ──► resolve('telegram', id, displayName)
  Discord author.id ──► [author.username]    ──► resolve('discord', id, username)
  GitHub sender.login ──► resolve('github', login, login)
                            │
                            ▼ findOrCreateUserByPlatformIdentity
                            ├── existing identity? return its user (backfill display_name if missing)
                            ├── orphan identity? rebind to a new user
                            └── first sight? transactional INSERT user + identity
                                            (UNIQUE constraint catches race → re-SELECT winner)

  handleMessage(platform, conversationId, content, { ..., userId })  ◀── NEW field
       │
       ▼
  INSERT INTO conversations (..., user_id) VALUES (..., $6)
  INSERT INTO workflow_runs (..., user_id) VALUES (..., $8)
  INSERT INTO isolation_environments (..., created_by_user_id) VALUES (..., $8)
                                  (preserved on ON CONFLICT — first creator wins)

  SQL audit (now valid):
    SELECT u.display_name, COUNT(*) FROM workflow_runs r
    JOIN users u ON u.id = r.user_id GROUP BY u.display_name;

Architecture Diagram

Before

Slack ─┐
Telegram ┤
Discord ┤───► server/index.ts handlers ───► handleMessage ───► orchestrator ───► executor ───► store.createWorkflowRun
GitHub ─┘    (drops platform user id)        (no userId field)                   (no userId opt)

                                                  │
                                                  ▼
                                            isolation/resolver ───► store.create
                                                                    (no created_by_user_id)

DB: codebases, conversations, messages, workflow_runs, isolation_environments, sessions, codebase_env_vars, workflow_events

After

Slack ─[+users.info]┐
Telegram ──[+ctx.from]┤
Discord ──[+author.username]┤───► [~handlers resolve userId]───► [~handleMessage(ctx.userId)]
GitHub ──[+sender.login resolve]┘                                          │
                                                                           ▼
                                                                  [~orchestrator threads userId]
                                                                           │
                                            ┌──────────────────────────────┼──────────────────────────────┐
                                            ▼                              ▼                              ▼
                                  [~getOrCreateConversation(userId)]  [~executeWorkflow({userId})]  [~validateAndResolveIsolation(userId)]
                                            │                              │                              │
                                            ▼                              ▼                              ▼
                                  conversations.user_id        workflow_runs.user_id          isolation_envs.created_by_user_id
                                                                                                          │
                                                                                                          ▼
                                                                                          [~resolver.store.create({created_by_user_id})]

[+users] ←──── [+user_identities] (UNIQUE(platform, platform_user_id), FK CASCADE)

DB: [+users], [+user_identities], codebases, [~conversations], [~messages], [~workflow_runs], [~isolation_environments], sessions, codebase_env_vars, workflow_events

Connection inventory:

From To Status Notes
Slack adapter app.client.users.info new Per-first-sight enrichment, positive-only in-memory cache, warn-and-continue on missing_scope/rate-limit
server/index.ts (3 chat handlers) userDb.findOrCreateUserByPlatformIdentity new Via shared resolveUserId() helper (DRY across Slack/Telegram/Discord)
forge/github/adapter.ts userDb.findOrCreateUserByPlatformIdentity new Resolves comment.user.login ?? sender.login before handleMessage
HandleMessageContext getOrCreateConversation modified New userId param flows through; first-user-wins on subsequent thread replies
orchestrator-agent (dispatchOrchestratorWorkflow, handleStreamMode, handleBatchMode, handleWorkflowInvocationResult, handleWorkflowRunCommand) executeWorkflow / validateAndResolveIsolation / dispatchBackgroundWorkflow modified All gain userId? threading
IsolationRequest IsolationStore.create modified created_by_user_id propagated; not overwritten on ON CONFLICT (preserves original creator)
IWorkflowStore.createWorkflowRun payload DB INSERT modified New user_id column populated by executor opt
db/users.ts getDatabase().withTransaction new Atomic user + identity create with race recovery
db/index.ts exports userDb namespace new Mirrors existing conversationDb/codebaseDb pattern

Label Snapshot

  • Risk: risk: low (additive schema; nullable everywhere; backwards-compat verified by retained NULL paths)
  • Size: size: L (29 files, +926/-71)
  • Scope: core, adapters, server, workflows, isolation
  • Module: core:db, core:orchestrator, adapters:slack, adapters:telegram, adapters:discord, adapters:github, server:handlers, workflows:executor, isolation:resolver

Change Metadata

  • Change type: feature
  • Primary scope: multi

Linked Issue

Validation Evidence (required)

bun run validate
  • check:bundled + check:bundled-skill: up-to-date
  • type-check across 10 workspace packages
  • lint --max-warnings 0: clean
  • format:check: clean
  • test: per-package isolated batches all green; the new db/users.test.ts covers 10 cases (fast-path, opportunistic backfill, no-redundant-backfill, fresh create, race recovery, rethrow on non-race error, orphan repair, getUserById/updateUserDisplayName); Slack adapter test gains 4 cases for fetchDisplayName (real_name + cache, missing_scope graceful fallback, empty-userId short-circuit, retry after transient failure)
  • Skipped commands: none

Security Impact (required)

  • New permissions/capabilities? Yes — Slack adapter calls users.info, which requires the bot token scope users:read. Operators with an existing Slack app must reinstall to grant the scope. Failure returns missing_scope and we warn-log + continue without a display name; no message is ever dropped for this reason.
  • New external network calls? Yes — one Slack users.info call per first-sighting of each Slack user id, dedupe-cached in memory for the adapter's lifetime. Negative results are not cached (retry on next sighting). No new calls for Telegram/Discord/GitHub.
  • Secrets/tokens handling changed? No — PR-A does not touch tokens. users.email is stored plaintext (low-sensitivity public identifier; the GitHub tokens themselves stay out of scope until the follow-up device-flow PR, where they'll be AES-256-GCM encrypted).
  • File system access scope changed? No.
  • The new user_id FKs use ON DELETE SET NULL on all four tables — future user deletion (out of scope here) will never cascade destructively.

Compatibility / Migration

  • Backward compatible? Yes.
  • Config/env changes? No new required env vars. Operators running the Slack adapter should reinstall their Slack app with the users:read bot scope to unlock display-name enrichment, but the adapter degrades gracefully if the scope is absent.
  • Database migration needed? Yes.
    • PostgreSQL: idempotent — re-run psql $DATABASE_URL < migrations/000_combined.sql. Adds two CREATE TABLE IF NOT EXISTS + four ALTER TABLE ... ADD COLUMN IF NOT EXISTS. Second run is a no-op. Existing rows have user_id = NULL.
    • SQLite: applied automatically at server startup via the migrateColumns() PRAGMA-based pattern at packages/core/src/db/adapters/sqlite.ts — no manual step. Existing rows have user_id = NULL.

Human Verification (required)

What was personally validated beyond CI:

  • Verified scenarios:
    • Fresh SQLite DB: deleted ~/.archon/archon.db, server starts, new tables exist (.schema remote_agent_users, .schema remote_agent_user_identities).
    • Existing SQLite DB upgrade path: migrateColumns() ALTERs verified via code review against the existing pattern at sqlite.ts:164-218.
    • Type-check across all 10 workspace packages: clean.
    • Unit tests: all per-package batches green; race-recovery test exercises the realistic UNIQUE-constraint failure path.
  • Edge cases checked (in tests):
    • Concurrent first-sight (UNIQUE race → re-SELECT winner)
    • Orphan identity (user row deleted, identity row stranded → repair by rebinding)
    • users.info missing_scope failure (graceful degradation, message still flows)
    • users.info transient failure (retried on next sighting — negative results not cached)
    • Empty event.user (resolution short-circuits, no DB write)
  • What was not verified: end-to-end through a real Slack workspace / Telegram bot / Discord server / GitHub webhook with multiple concurrent users in production. Reviewer-recommended smoke test before merge: send a message from a Slack workspace not previously seen and confirm SELECT u.display_name, ui.platform_user_id FROM remote_agent_users u JOIN remote_agent_user_identities ui ON ui.user_id = u.id returns the expected row.

Side Effects / Blast Radius (required)

  • Affected subsystems: every chat/forge adapter inbound message path; orchestrator dispatch; workflow executor; isolation resolver; DB schema. Web + CLI surfaces are deliberately untouched — those rows continue writing user_id = NULL until the follow-up PR.
  • Potential unintended effects:
    • Slack adapter previously made 0 calls per inbound message; now makes 1 users.info call per first-sight user. Cached in memory after that. Tier 4 Slack rate limits (~100+/min) leave very wide headroom.
    • findOrCreateUserByPlatformIdentity now sits in the hot path of every chat-platform message. The fast path is two SELECTs (identity + user) — negligible overhead vs the existing per-message DB cost.
  • Guardrails / monitoring: structured log events follow the project convention — user.create_started/_completed/_failed, user.create_race_recovered, user.identity_orphan_repair_started/_completed, slack.users_info_missing_scope, slack.users_info_failed, <platform>.user_resolve_failed. Search for any *_failed to detect issues.

Rollback Plan (required)

  • Fast rollback: git revert <commit> (single commit; reverts both code and schema migration cleanly — the additive ALTER columns become unused but harmless on the rollback; tighter cleanup would be a separate PR if desired).
  • Feature flags or config toggles: none introduced (per project policy — keep paths explicit).
  • Observable failure symptoms to watch:
    • user.create_failed logs spiking → DB connection or constraint issue.
    • slack.users_info_missing_scope logs → users will see empty display_name; not a regression, but operators should reinstall the Slack app to fix.
    • Adapter handler errors with *_resolve_failed → user_id stays NULL but messages still flow (warn-and-continue invariant).

Risks and Mitigations

  • Risk: Slack users.info missing_scope errors after deploy if operators don't reinstall the app.
    • Mitigation: Warn-log only, never throw. Message still flows with user_id recorded by Slack id (identity created without display_name). Documented in this PR description and in the security-impact section above.
  • Risk: Concurrent first-sight webhooks create duplicate users rows.
    • Mitigation: UNIQUE(platform, platform_user_id) on the identity row trips the second writer; the transaction rolls back and we re-SELECT the winner. Covered by the recovers from race when UNIQUE constraint fires test case.
  • Risk: Orphaned identity rows (user row deleted but identity remains).
    • Mitigation: Auto-repair path rebinds the identity to a fresh user row; warn-log surfaces the event. Covered by the repairs orphaned identity test case.
  • Risk: Conflict with the open feat/keycloak-multi-user-auth branch (PR feat: Keycloak/OIDC multi-user auth with per-user GitHub OAuth #1774) if that branch resurfaces.

Summary by CodeRabbit

  • New Features

    • Slack messages now show sender display names (fetched and cached; degrades gracefully if scope missing).
    • Telegram messages now show sender names derived from profile info.
    • GitHub comments and other inputs are attributed to resolved platform users.
    • Conversations, messages, workflow runs, and isolation environments can be recorded with associated user attribution.
  • Documentation

    • Updated setup and architecture docs to describe user identity tables and Slack scope behavior.

Review Change Stack

…or and workflow runs

Adds remote_agent_users + remote_agent_user_identities tables (Archon
identity + per-platform mapping, UNIQUE(platform, platform_user_id))
and threads a resolved user_id through HandleMessageContext into the
orchestrator, workflow executor, and isolation resolver. Every new
conversation, message, workflow_run, and isolation_environment row
created from Slack/Telegram/Discord/GitHub now carries attribution.

Slack additionally enriches first-sight users with their real name via
users.info (requires bot scope users:read — reinstall the app to grant).
Telegram/Discord derive display name from the inbound event payload.
GitHub resolves event.comment.user.login or event.sender.login on each
webhook. Resolution failure warn-logs and continues — never drops a
message.

Schema is additive and nullable everywhere: existing rows remain valid
with NULL, ON DELETE SET NULL on every new FK. Web POST /api/conversations
and the CLI continue to write NULL user_id; those surfaces become
attributed in a follow-up PR. Solo installs with GITHUB_TOKEN are
unchanged.

Race-safe create-on-first-sight: UNIQUE(platform, platform_user_id)
trips on concurrent first-sight webhooks; the losing transaction rolls
back and we re-SELECT the winner's identity. Orphaned identities (user
row deleted out from under them) are auto-repaired.

Foundation for the small-team Archon initiative. Follow-ups will swap
the shared GITHUB_TOKEN for a GitHub App and wire per-user GitHub
tokens via device flow.
@coderabbitai

coderabbitai Bot commented May 27, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 19fcc0bd-1473-487b-84d5-8c10b3f1aee5

📥 Commits

Reviewing files that changed from the base of the PR and between 991acf1 and 49f7086.

📒 Files selected for processing (26)
  • CLAUDE.md
  • migrations/000_combined.sql
  • packages/adapters/src/chat/slack/adapter.test.ts
  • packages/adapters/src/chat/slack/adapter.ts
  • packages/adapters/src/forge/github/adapter.test.ts
  • packages/adapters/src/forge/github/adapter.ts
  • packages/cli/src/commands/chat.ts
  • packages/cli/src/commands/workflow.ts
  • packages/core/src/db/adapters/sqlite.ts
  • packages/core/src/db/isolation-environments.test.ts
  • packages/core/src/db/isolation-environments.ts
  • packages/core/src/db/messages.ts
  • packages/core/src/db/users.test.ts
  • packages/core/src/db/users.ts
  • packages/core/src/index.ts
  • packages/core/src/types/index.ts
  • packages/docs-web/src/content/docs/adapters/slack.md
  • packages/docs-web/src/content/docs/reference/architecture.md
  • packages/docs-web/src/content/docs/reference/database.md
  • packages/isolation/src/types.ts
  • packages/server/package.json
  • packages/server/src/index.ts
  • packages/server/src/resolve-user-id.test.ts
  • packages/server/src/routes/api.ts
  • packages/workflows/src/executor.ts
  • packages/workflows/src/store.ts
✅ Files skipped from review due to trivial changes (4)
  • packages/cli/src/commands/workflow.ts
  • packages/cli/src/commands/chat.ts
  • packages/docs-web/src/content/docs/adapters/slack.md
  • packages/server/src/routes/api.ts
🚧 Files skipped from review as they are similar to previous changes (12)
  • packages/workflows/src/store.ts
  • packages/core/src/db/isolation-environments.test.ts
  • packages/workflows/src/executor.ts
  • packages/adapters/src/chat/slack/adapter.test.ts
  • packages/core/src/db/isolation-environments.ts
  • packages/isolation/src/types.ts
  • migrations/000_combined.sql
  • packages/adapters/src/forge/github/adapter.ts
  • packages/core/src/db/messages.ts
  • packages/core/src/types/index.ts
  • packages/core/src/db/users.test.ts
  • packages/core/src/db/adapters/sqlite.ts

📝 Walkthrough

Walkthrough

This PR adds platform-agnostic user identity tables and a user DB module, enriches adapters with displayName and resolves platform users to Archon user UUIDs, and threads optional user attribution through conversations, messages, workflow runs, isolation environments, and orchestrator flows.

Changes

User Identity Attribution Pipeline

Layer / File(s) Summary
User Identity Schema and Database Foundation
migrations/000_combined.sql, packages/core/src/db/adapters/sqlite.ts
Adds remote_agent_users and remote_agent_user_identities, and adds nullable user_id/created_by_user_id columns with ON DELETE SET NULL and partial indexes to existing tables.
Core DB: users module & tests
packages/core/src/db/users.ts, packages/core/src/db/users.test.ts
Implements getUserById, findOrCreateUserByPlatformIdentity (backfill, orphan repair, UNIQUE-race recovery), updateUserDisplayName, and comprehensive tests for these behaviors.
Core exports & type contracts
packages/core/src/index.ts, packages/core/src/types/index.ts, packages/isolation/src/types.ts
Re-exports user types and userDb; extends Conversation.user_id, HandleMessageContext.userId, and isolation request/environment types to carry optional user attribution.
SQLite adapter schema/migration wiring
packages/core/src/db/adapters/sqlite.ts
DDL and migrateColumns() updated to create new tables and add nullable FK columns/indexes safely for fresh and existing DBs.
Slack adapter display-name enrichment
packages/adapters/src/chat/slack/adapter.ts, packages/adapters/src/chat/slack/types.ts, packages/adapters/src/chat/slack/adapter.test.ts
Adds fetchDisplayName with adapter-lifetime caching and one-time missing-scope logging; event handlers attach optional displayName; tests cover caching, missing scope, sanitization, empty-id, and retry semantics.
Telegram adapter displayName context
packages/adapters/src/chat/telegram/adapter.ts, packages/adapters/src/chat/telegram/types.ts
Computes displayName from Telegram sender data and includes it in messageHandler invocation; types updated accordingly.
GitHub adapter user resolution
packages/adapters/src/forge/github/adapter.ts, packages/adapters/src/forge/github/adapter.test.ts
Resolves comment author (prefers comment.user.login) via userDb.findOrCreateUserByPlatformIdentity; failures logged and processing continues; tests added to assert inputs and error-tolerance.
Server resolve helper & integration
packages/server/src/index.ts, packages/server/src/resolve-user-id.test.ts
Adds non-throwing resolveUserId that maps platform ids to Archon user UUIDs via userDb; adapters call it and tests verify coercion, no-throw behavior, and logging.
Conversation, Message, Workflow persistence
packages/core/src/db/conversations.ts, packages/core/src/db/messages.ts, packages/core/src/db/workflows.ts, packages/workflows/src/*
Adds optional user_id to conversation/message/workflow_run rows; DB functions/signatures updated to accept and persist user attribution; schema and tests updated to include extra bound parameters.
Isolation environment attribution
packages/isolation/src/resolver.ts, packages/core/src/db/isolation-environments.ts, packages/isolation/src/types.ts
Resolver threads optional userId into environment creation and persists created_by_user_id; upsert preserves original creator; types and tests updated.
Orchestrator propagation
packages/core/src/orchestrator/orchestrator-agent.ts, packages/core/src/orchestrator/orchestrator.ts
Threads userId from HandleMessageContext through conversation creation, isolation resolution, dispatch, executeWorkflow options, stream/batch modes, and background worker creation.
Tests, docs, and minor edits
docs and various test files
Adds/updates tests for resolveUserId and adapter enrichment; docs updated for Slack users:read and DB schema evolution; minor package.json/test formatting and TODO comments for future auth work.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

  • coleam00/Archon#1646: Related changes to executeWorkflow options and resume handling that overlap with this PR’s userId propagation.
  • coleam00/Archon#1329: Related workflow-dispatch/web resume context threading overlap.
  • coleam00/Archon#1756: Overlaps at orchestrator dispatch/resumable-run lookup logic.

Poem

🐰 I hop along through schema rows,
Linking names where platform wind blows.
From Slack and GitHub to Telegram bright,
Archon remembers each user's light.
Hooray — identities stitched just right! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: introducing user_id plumbing from chat/forge adapters through the orchestrator to workflow runs.
Description check ✅ Passed The PR description is comprehensive and well-structured, covering problem, objectives, UX journey, architecture, validation, security, migration steps, edge cases, and rollback plan.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/user-identity-foundation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Wirasm

Wirasm commented May 27, 2026

Copy link
Copy Markdown
Collaborator Author

PR Review Summary — Multi-Agent

Reviewed by code-reviewer, docs-impact, pr-test-analyzer, silent-failure-hunter, type-design-analyzer, comment-analyzer, code-simplifier. All findings advisory.

Critical Issues (1 found)

Agent Issue Location
code-reviewer SQLite migrateColumns() omits ON DELETE SET NULL from all four FK ALTERs. New DBs use the correct clause via createSchema(), but upgraded SQLite DBs (the default deployment target) get the SQLite default NO ACTIONRESTRICT with PRAGMA foreign_keys = ON. Operators who later try to delete a user row will hit a FK violation — directly contradicting the PR's documented ON DELETE SET NULL safety guarantee. packages/core/src/db/adapters/sqlite.ts:227,242,257,274

Important Issues (5 found)

Agent Issue Location
silent-failure-hunter findOrCreateUserByPlatformIdentity catch is catch (err) for what should only be a UNIQUE-constraint race. If selectIdentity itself throws during recovery, the original transaction error is silently replaced and user.create_failed is never logged. Fix: log the original error before recovery, or narrow the catch to PG 23505 / SQLite UNIQUE-constraint string. packages/core/src/db/users.ts:98-112
silent-failure-hunter backfillDisplayName has no try/catch and no logging. A failure on the opportunistic UPDATE propagates up through findOrCreateUserByPlatformIdentity, gets caught at resolveUserId, and the entire user resolution appears failed — even though the user row already existed and was correctly resolved. Result: userId = null on conversation + workflow_run rows for a recoverable column-update failure. packages/core/src/db/users.ts:115-129
code-reviewer / silent-failure-hunter slack.users_info_missing_scope fires once per unique Slack user per restart, not "once per startup" as the JSDoc/comment claims (negative results aren't cached). In a 100-user workspace, this floods WARN logs after every restart. missing_scope is a permanent misconfiguration — gate the log behind a single boolean instance flag. packages/adapters/src/chat/slack/adapter.ts:282,301-303
pr-test-analyzer GitHub adapter test does not mock @archon/core/db/users. The new userDb.findOrCreateUserByPlatformIdentity call (lines 947-957) appears to "work" in existing tests only because the try/catch silently swallows the error. More importantly, the comment.user.login ?? sender.login attribution-fallback logic — the only place the attributed login differs from sender.login — has no test coverage. packages/adapters/src/forge/github/adapter.test.ts:49-78
pr-test-analyzer resolveUserId() in packages/server/src/index.ts:105-127 has zero test coverage. Its "never throws — return undefined on failure" contract is critical for availability across Slack/Telegram/Discord; if a future refactor accidentally rethrows, message processing silently stops for all users. packages/server/src/index.ts:105-127

Suggestions (selected)

Agent Suggestion Location
code-reviewer user.create_completed/_failed are not paired with a user.create_started event (CLAUDE.md rule: always pair _started with _completed/_failed). packages/core/src/db/users.ts:92-110
silent-failure-hunter repairOrphanedIdentity has no user.identity_orphan_repair_failed log — a transaction crash there is reported only as the generic ${platform}.user_resolve_failed. packages/core/src/db/users.ts:131-157
silent-failure-hunter slack.users_info_failed logs the full err object, which on Slack SDK includes err.data (API response body — potential PII/workspace metadata). Log errMessage, slackErrorCode, slackUserId instead. packages/adapters/src/chat/slack/adapter.ts:304
silent-failure-hunter Server-emitted ${platform}.user_resolve_failed collides with the GitHub adapter's github.user_resolve_failed. Use a static server.user_resolve_failed and put platform in structured fields. packages/server/src/index.ts:123
type-design-analyzer UserIdentity.platform: string is unconstrained. A typo ('Slack' vs 'slack') silently breaks the UNIQUE(platform, platform_user_id) invariant. Introduce IdentityPlatform = 'slack' | 'telegram' | 'discord' | 'github' | 'web' | 'cli'. packages/core/src/types/index.ts:59, packages/core/src/db/users.ts:54
type-design-analyzer User and UserIdentity are hand-crafted interfaces, violating the project's z.infer<typeof schema> convention (followed correctly by workflowRunSchema). Adds two more divergence points. packages/core/src/types/index.ts:46-64
type-design-analyzer addMessage accepts userId? but every production call site (web API, CLI adapter, web persistence adapter) omits it. Either thread it now or add explicit TODOs. packages/server/src/routes/api.ts:1304,1545,2011,2013
pr-test-analyzer No test asserts that created_by_user_id is absent from the DO UPDATE SET clause. A copy-paste regression would silently transfer ownership on every isolation re-activation. One-liner test locks in "first-creator-wins". packages/core/src/db/isolation-environments.test.ts (ON CONFLICT block)
pr-test-analyzer backfillDisplayName asymmetric case (identity has display_name but user.display_name is null) is untested — only "both null" and "both non-null" are covered. packages/core/src/db/users.test.ts
code-simplifier resolveUserId has a dead === null branch — platformUserId is typed string | number | undefined, so null is already excluded by TS. packages/server/src/index.ts:110
code-simplifier fetchDisplayName ends with ?? undefined after three optional chains that already evaluate to undefined — redundant. packages/adapters/src/chat/slack/adapter.ts:293
code-simplifier migrateColumns opens four new PRAGMA-then-ALTER blocks; two of them re-query tables (conversations, workflow_runs) that pre-existing blocks already query. Fold the user_id guards into the existing per-table blocks (-25 lines, fewer PRAGMA calls, single story per table). packages/core/src/db/adapters/sqlite.ts:219-276
comment-analyzer ~15 instances of (PR-A) / until PR-C / pre-PR-A history labels across production types, migration SQL, and source files. These are PR-state markers that rot on merge. Strip the labels; keep the surrounding WHY content where it earns its place. Multiple — see migrations/000_combined.sql:52, sqlite.ts:392-523, types/index.ts, isolation/types.ts, workflows/store.ts, messages.ts
comment-analyzer users.test.ts step-by-step // 1) SELECT ... // 2) SELECT ... mock-queue comments are pure-WHAT and duplicate the surrounding assertions. packages/core/src/db/users.test.ts

Documentation Issues

File Stale Content Fix
CLAUDE.md:402-410 "8 Tables" — PR makes it 10 Update count; add users and user_identities entries
packages/docs-web/src/content/docs/reference/database.md:122 "The database has 8 tables" Update to 10; add the two new tables
packages/docs-web/src/content/docs/reference/database.md:48-66 Migration list doesn't mention the user-identity additions Add note that re-running 000_combined.sql is idempotent and applies the new ALTERs
packages/docs-web/src/content/docs/reference/architecture.md:56,1029 "SQLite (default) / PostgreSQL (7 Tables)" / "7-table schema" — already stale, now doubly so Update to 10 tables; add user_id / created_by_user_id to the four affected tables in the schema block (lines 1033-1099) and add the two new tables
packages/docs-web/src/content/docs/adapters/slack.md:55-67 users:read scope (now required for users.info enrichment) is missing from the Bot Token Scopes list Add users:read with a note that omitting it degrades gracefully but logs a missing_scope warning

Strengths

  • Architecture is sound. findOrCreateUserByPlatformIdentity is the single canonical write path; race recovery, orphan repair, and backfill are well-encapsulated.
  • Nullable userId is consistently applied across all four affected tables — no accidental non-null widening.
  • WorkflowRun correctly uses z.infer<typeof workflowRunSchema> and nullable: true.
  • The created_by_user_id first-creator-wins semantics is correctly implemented at the SQL level (omitted from DO UPDATE SET).
  • GitHub adapter never blocks a 200 response on user-resolution failure (try/catch scoped correctly).
  • Slack displayNameCache is positive-only by design with a clear rationale documented in the field comment — the comment is a good example of a justified WHY.
  • The users.ts file-level docstring explains the race-safe concurrency model precisely; the findOrCreateUserByPlatformIdentity JSDoc captures all three branches a reader needs.
  • Race-recovery and orphan-repair test scenarios exist and are exercised end-to-end.

Verdict

NEEDS FIXES — one critical FK migration bug + two high-severity silent-failure paths + missing test coverage on resolveUserId and the GitHub attribution fallback. None block the architectural intent; all are localized to the new files this PR adds.

Recommended Actions

  1. Critical: Add ON DELETE SET NULL to all four ALTER TABLE statements in sqlite.ts (lines 227, 242, 257, 274). Without this, upgraded SQLite DBs disagree with new SQLite DBs on FK delete semantics.
  2. Important — error handling: Wrap backfillDisplayName in a logged try/catch; narrow the race-recovery catch to UNIQUE-constraint errors; gate slack.users_info_missing_scope behind a once-per-adapter flag.
  3. Important — tests: Mock @archon/core/db/users in the GitHub adapter test and cover the comment.user.login ?? sender.login fallback. Export-or-wrap resolveUserId to cover the never-throws contract. Add the created_by_user_id-not-in-DO UPDATE SET assertion.
  4. Important — docs: Update the table count in CLAUDE.md, reference/database.md, and reference/architecture.md; add the users:read scope to the Slack adapter setup doc.
  5. Polish (nice-to-have): Strip (PR-A) / until PR-C labels from production types and migrations; introduce IdentityPlatform literal union for findOrCreateUserByPlatformIdentity; fold the four PR-A PRAGMA blocks into existing per-table migration blocks where they overlap.

Re-run targeted aspects after fixes: /prp-core:prp-review-agents 1783 code errors tests.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Nitpick comments (3)
packages/core/src/db/conversations.test.ts (1)

71-250: ⚡ Quick win

Add one non-null user_id creation test.

All updated cases assert $6 = null; add one case passing userId to lock in the new behavior and prevent silent regression.

Proposed test addition
+    test('persists user_id when provided', async () => {
+      const newConversation: Conversation = {
+        ...existingConversation,
+        id: 'conv-with-user',
+      };
+      mockQuery.mockResolvedValueOnce(createQueryResult([]));
+      mockQuery.mockResolvedValueOnce(createQueryResult([newConversation]));
+
+      await getOrCreateConversation('telegram', 'chat-999', undefined, undefined, 'user-uuid-1');
+
+      expect(mockQuery).toHaveBeenNthCalledWith(
+        2,
+        'INSERT INTO remote_agent_conversations (platform_type, platform_conversation_id, ai_assistant_type, codebase_id, cwd, user_id) VALUES ($1, $2, $3, $4, $5, $6) RETURNING *',
+        ['telegram', 'chat-999', 'claude', null, null, 'user-uuid-1']
+      );
+    });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/db/conversations.test.ts` around lines 71 - 250, Add a test
to cover creating a conversation with a non-null user_id: call
getOrCreateConversation with the userId argument, mock queries with
createQueryResult (empty then inserted row), and assert mockQuery was called for
INSERT with the 6th parameter equal to the provided userId; reference the
existing patterns in packages/core/src/db/conversations.test.ts (functions
getOrCreateConversation, mockQuery, createQueryResult) and mirror the other
tests' structure for setup and expectations.
packages/core/src/db/workflows.test.ts (1)

55-129: ⚡ Quick win

Add a createWorkflowRun test that passes user_id.

Current assertions only validate null in the new slot; add one non-null case to protect this attribution path.

Proposed test addition
+    test('creates workflow run with user_id when provided', async () => {
+      mockQuery.mockResolvedValueOnce(createQueryResult([mockWorkflowRun]));
+
+      await createWorkflowRun({
+        workflow_name: 'feature-development',
+        conversation_id: 'conv-456',
+        user_message: 'Add dark mode support',
+        user_id: 'user-uuid-1',
+      });
+
+      expect(mockQuery).toHaveBeenCalledWith(
+        expect.stringContaining('INSERT INTO remote_agent_workflow_runs'),
+        ['feature-development', 'conv-456', null, 'Add dark mode support', '{}', null, null, 'user-uuid-1']
+      );
+    });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/db/workflows.test.ts` around lines 55 - 129, The tests for
createWorkflowRun are missing a case that supplies a non-null user_id; add a new
unit test that calls createWorkflowRun with user_id (e.g., 'user-123'),
mockQuery.mockResolvedValueOnce to return a row including that user_id, assert
the returned result.user_id equals 'user-123', and assert mockQuery was called
with an INSERT INTO remote_agent_workflow_runs parameter array where the user_id
slot contains 'user-123' (keeping other params same); reference the existing
test pattern and reuse mockWorkflowRun, mockQuery, and createQueryResult to
implement the new test.
packages/adapters/src/chat/slack/adapter.ts (1)

46-47: ⚡ Quick win

Cap the display-name cache to prevent unbounded memory growth.

displayNameCache currently grows forever across unique users. Add a max size + eviction policy (FIFO/LRU) to keep adapter memory bounded in long-lived processes.

♻️ Suggested bounded-cache change
+const MAX_DISPLAY_NAME_CACHE = 5000;
+
   private displayNameCache = new Map<string, string>();
@@
       const name = u?.real_name ?? u?.profile?.display_name ?? u?.name ?? undefined;
       if (name) {
+        if (this.displayNameCache.size >= MAX_DISPLAY_NAME_CACHE) {
+          const oldest = this.displayNameCache.keys().next().value;
+          if (oldest !== undefined) this.displayNameCache.delete(oldest);
+        }
         this.displayNameCache.set(slackUserId, name);
       }

Also applies to: 287-296

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/adapters/src/chat/slack/adapter.ts` around lines 46 - 47,
displayNameCache currently is an unbounded Map (field displayNameCache) and must
be replaced with a bounded cache with eviction; change the displayNameCache
implementation to enforce a fixed max size (e.g. MAX_DISPLAYNAME_CACHE) and
evict entries on insert when capacity is exceeded using an eviction policy (LRU
or FIFO) so memory cannot grow unbounded; update all uses of displayNameCache
(the field declared as displayNameCache and the code paths referenced around the
other occurrences ~287-296) to interact with the new bounded cache API (get,
set, delete/evict) so lookups and inserts behave the same while automatically
removing old entries when the cache is full.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/adapters/src/chat/slack/adapter.test.ts`:
- Around line 531-556: The test suite's beforeEach only clears mockUsersInfo so
mockLogger.warn can carry state across tests and make the assertion flaky;
update the beforeEach in this describe block to also reset the logger mock (call
mockLogger.warn.mockClear() or equivalent) so that each test starts with a fresh
mock for mockLogger.warn when running SlackAdapter.fetchDisplayName.

In `@packages/adapters/src/chat/slack/adapter.ts`:
- Around line 299-305: Mask the slackUserId before logging to avoid exposing
PII: compute a masked value (e.g., take the first 8 chars then '...') and use
that maskedId in both getLog().warn calls instead of slackUserId; update
references in the users info error handling block surrounding the err/data
parsing (the variables slackUserId, slackErrorCode and the getLog() calls) so
all logs use the masked value.

In `@packages/adapters/src/forge/github/adapter.ts`:
- Around line 958-960: The warning currently logs user-identifying data via the
githubLogin field; remove or mask that PII when calling getLog().warn in the
github.user_resolve_failed branch: stop passing attributedLogin as githubLogin
(or replace it with a masked value like attributedLogin.slice(0,8)+'...' if
truly needed), keep the error via toError(err) and the same event key, and
ensure the call site that constructs the log only includes non-PII fields so the
log entry no longer contains the raw GitHub login.
- Around line 947-955: The code is using the mutable GitHub login as the unique
platformUserId when calling userDb.findOrCreateUserByPlatformIdentity; change
the call in adapter.ts to pass the immutable numeric GitHub id (stringified) —
use event.comment?.user?.id or event.sender?.id as platformUserId — and continue
passing attributedLogin only as the displayName; update the GitHub webhook types
in packages/adapters/src/forge/github/types.ts (add user.id and sender.id
typings) so the adapter can access those ids safely; additionally remove or mask
raw logins from warning/processLogger statements around the authorization check
(avoid logging attributedLogin directly, use masked username or omit the field).

In `@packages/core/package.json`:
- Line 26: The users.test.ts mock of ./connection leaks into other tests because
Bun’s mock.module() persists across files in the same test invocation; update
the "test" script in package.json so packages/core/src/db/users.test.ts is
executed in its own bun test invocation (or grouped only with tests that mock
./connection identically) instead of inside the large chained invocation —
locate the "test" script entry in package.json and split out
"src/db/users.test.ts" into a separate && bun test ... segment (or create a
grouped segment with matching mocks) so mock.module() state cannot pollute other
tests like workflows.test.ts, messages.test.ts, workflow-events.test.ts.

In `@packages/core/src/db/adapters/sqlite.ts`:
- Around line 230-232: The catch blocks that currently call getLog().warn(...)
(e.g., the handlers logging 'db.sqlite_migration_conversations_user_id_failed'
and the similar handlers at the other locations) must not swallow errors; after
logging include the original error and rethrow it (or throw a new Error that
wraps the original) so startup fails fast. Update each catch (the ones calling
getLog().warn({ err: e as Error },
'db.sqlite_migration_conversations_user_id_failed') and the analogous warns at
the other noted locations) to: 1) log with the error object for context, and 2)
rethrow the same error (throw e) or throw a new Error with the original error
attached, so migrations cannot silently fail.
- Around line 219-229: The migration in packages/core/src/db/adapters/sqlite.ts
that alters remote_agent_conversations to add foreign key columns (see the ALTER
TABLE ... ADD COLUMN ... for user_id and created_by_user_id) omits the ON DELETE
SET NULL clause; update each ALTER TABLE ADD COLUMN statement that adds a
REFERENCES remote_agent_users(id) (the ones adding user_id and
created_by_user_id) to append "ON DELETE SET NULL" so the upgrade path matches
the fresh-schema behavior and ensures deletions nullify the FK instead of
leaving dangling references.

In `@packages/core/src/db/messages.ts`:
- Around line 30-42: add a guard at the start of addMessage to reject any
non-null userId when role === 'assistant' by throwing a clear Error before
performing the INSERT; specifically, in the addMessage function check the role
and userId and throw (with a short message) if an assistant message is supplied
a userId so we never pass an invalid user_id into the remote_agent_messages
insert, keeping the DB attribution consistent.

In `@packages/core/src/db/users.test.ts`:
- Around line 15-19: Replace the process-global mock.module('./connection', ...)
call with a spyOn-based mock that targets the imported connection module in
users.test.ts: create a spy via spyOn(importedConnectionModule, ...) to stub
pool.query (mockQuery), getDialect (returning mockPostgresDialect) and
getDatabase (returning { withTransaction: mockWithTransaction }) and ensure the
spy is restored at test teardown with spy.mockRestore() so other test suites are
not contaminated.

In `@packages/core/src/db/users.ts`:
- Around line 69-71: The logs call getLog().warn/getLog().info (e.g., the
user.identity_orphan_repair_started event) with raw PII variables platformUserId
and displayName; replace those direct values with non-PII surrogates (log
existing.id and platform only, and for platformUserId/displayName either omit
them or log a one-way hash/truncated surrogate such as
sha256(platformUserId).slice(0,8) or platformUserId.slice(0,8)+'...'); update
all getLog() calls in this file that reference platformUserId or displayName
(including the occurrences around user.identity_orphan_repair_started and the
other calls noted) to use the masked/hashed value or internal id only, keeping
event names and structured keys intact.

In `@packages/core/src/orchestrator/orchestrator-agent.ts`:
- Line 653: The current getLog().debug call logs the raw userId (PII); update
the log in orchestrator-agent.ts to stop emitting the raw userId—either remove
userId from the structured payload or replace it with a non-identifying value
(e.g., a masked or hashed token produced by a helper like maskUserId(userId) or
hashUserId(userId)) and keep conversationId as-is; modify the call to
getLog().debug({ conversationId, user: maskUserId(userId) },
'orchestrator_message_received') or simply getLog().debug({ conversationId },
'orchestrator_message_received') so no raw user UUID is written.

In `@packages/server/src/index.ts`:
- Around line 120-124: The warning currently logs the raw platform user id in
the structured payload; remove or redact it to avoid PII retention by replacing
platformUserId in the getLog().warn call with either no identifier or a masked
value (e.g., compute maskedPlatformUserId = String(platformUserId).slice(0,8) +
'...' before logging) and keep the rest of the payload (err and platform) and
the same message key `${platform}.user_resolve_failed`; update the call sites
around getLog().warn to use the maskedPlatformUserId or omit that field
entirely.

---

Nitpick comments:
In `@packages/adapters/src/chat/slack/adapter.ts`:
- Around line 46-47: displayNameCache currently is an unbounded Map (field
displayNameCache) and must be replaced with a bounded cache with eviction;
change the displayNameCache implementation to enforce a fixed max size (e.g.
MAX_DISPLAYNAME_CACHE) and evict entries on insert when capacity is exceeded
using an eviction policy (LRU or FIFO) so memory cannot grow unbounded; update
all uses of displayNameCache (the field declared as displayNameCache and the
code paths referenced around the other occurrences ~287-296) to interact with
the new bounded cache API (get, set, delete/evict) so lookups and inserts behave
the same while automatically removing old entries when the cache is full.

In `@packages/core/src/db/conversations.test.ts`:
- Around line 71-250: Add a test to cover creating a conversation with a
non-null user_id: call getOrCreateConversation with the userId argument, mock
queries with createQueryResult (empty then inserted row), and assert mockQuery
was called for INSERT with the 6th parameter equal to the provided userId;
reference the existing patterns in packages/core/src/db/conversations.test.ts
(functions getOrCreateConversation, mockQuery, createQueryResult) and mirror the
other tests' structure for setup and expectations.

In `@packages/core/src/db/workflows.test.ts`:
- Around line 55-129: The tests for createWorkflowRun are missing a case that
supplies a non-null user_id; add a new unit test that calls createWorkflowRun
with user_id (e.g., 'user-123'), mockQuery.mockResolvedValueOnce to return a row
including that user_id, assert the returned result.user_id equals 'user-123',
and assert mockQuery was called with an INSERT INTO remote_agent_workflow_runs
parameter array where the user_id slot contains 'user-123' (keeping other params
same); reference the existing test pattern and reuse mockWorkflowRun, mockQuery,
and createQueryResult to implement the new test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a72c3e5a-367e-455b-b734-e46bb0c4502b

📥 Commits

Reviewing files that changed from the base of the PR and between ac18034 and 991acf1.

📒 Files selected for processing (29)
  • migrations/000_combined.sql
  • packages/adapters/src/chat/slack/adapter.test.ts
  • packages/adapters/src/chat/slack/adapter.ts
  • packages/adapters/src/chat/slack/types.ts
  • packages/adapters/src/chat/telegram/adapter.ts
  • packages/adapters/src/chat/telegram/types.ts
  • packages/adapters/src/forge/github/adapter.ts
  • packages/core/package.json
  • packages/core/src/db/adapters/sqlite.ts
  • packages/core/src/db/conversations.test.ts
  • packages/core/src/db/conversations.ts
  • packages/core/src/db/index.ts
  • packages/core/src/db/isolation-environments.test.ts
  • packages/core/src/db/isolation-environments.ts
  • packages/core/src/db/messages.test.ts
  • packages/core/src/db/messages.ts
  • packages/core/src/db/users.test.ts
  • packages/core/src/db/users.ts
  • packages/core/src/db/workflows.test.ts
  • packages/core/src/db/workflows.ts
  • packages/core/src/orchestrator/orchestrator-agent.ts
  • packages/core/src/orchestrator/orchestrator.ts
  • packages/core/src/types/index.ts
  • packages/isolation/src/resolver.ts
  • packages/isolation/src/types.ts
  • packages/server/src/index.ts
  • packages/workflows/src/executor.ts
  • packages/workflows/src/schemas/workflow-run.ts
  • packages/workflows/src/store.ts

Comment on lines +531 to +556
beforeEach(() => {
mockUsersInfo.mockClear();
});

test('returns real_name from users.info on first call and caches result', async () => {
const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake');
const name1 = await adapter.fetchDisplayName('U123');
const name2 = await adapter.fetchDisplayName('U123');
expect(name1).toBe('Alice Liddell');
expect(name2).toBe('Alice Liddell');
// Second call hits the in-memory cache — no second API call.
expect(mockUsersInfo).toHaveBeenCalledTimes(1);
});

test('returns undefined and warn-logs on missing_scope failure', async () => {
const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake');
const slackErr = Object.assign(new Error('missing_scope'), {
data: { error: 'missing_scope' },
});
mockUsersInfo.mockRejectedValueOnce(slackErr);

const name = await adapter.fetchDisplayName('U_NEW');

expect(name).toBeUndefined();
expect(mockLogger.warn).toHaveBeenCalled();
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reset logger mocks in this block to keep assertions deterministic.

mockLogger.warn isn’t cleared in this describe, so prior warning calls can satisfy the expectation in Line 555. Clear it in the local beforeEach.

🧪 Suggested fix
   describe('fetchDisplayName (users.info enrichment)', () => {
     beforeEach(() => {
       mockUsersInfo.mockClear();
+      mockLogger.warn.mockClear();
     });
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
beforeEach(() => {
mockUsersInfo.mockClear();
});
test('returns real_name from users.info on first call and caches result', async () => {
const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake');
const name1 = await adapter.fetchDisplayName('U123');
const name2 = await adapter.fetchDisplayName('U123');
expect(name1).toBe('Alice Liddell');
expect(name2).toBe('Alice Liddell');
// Second call hits the in-memory cache — no second API call.
expect(mockUsersInfo).toHaveBeenCalledTimes(1);
});
test('returns undefined and warn-logs on missing_scope failure', async () => {
const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake');
const slackErr = Object.assign(new Error('missing_scope'), {
data: { error: 'missing_scope' },
});
mockUsersInfo.mockRejectedValueOnce(slackErr);
const name = await adapter.fetchDisplayName('U_NEW');
expect(name).toBeUndefined();
expect(mockLogger.warn).toHaveBeenCalled();
});
beforeEach(() => {
mockUsersInfo.mockClear();
mockLogger.warn.mockClear();
});
test('returns real_name from users.info on first call and caches result', async () => {
const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake');
const name1 = await adapter.fetchDisplayName('U123');
const name2 = await adapter.fetchDisplayName('U123');
expect(name1).toBe('Alice Liddell');
expect(name2).toBe('Alice Liddell');
// Second call hits the in-memory cache — no second API call.
expect(mockUsersInfo).toHaveBeenCalledTimes(1);
});
test('returns undefined and warn-logs on missing_scope failure', async () => {
const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake');
const slackErr = Object.assign(new Error('missing_scope'), {
data: { error: 'missing_scope' },
});
mockUsersInfo.mockRejectedValueOnce(slackErr);
const name = await adapter.fetchDisplayName('U_NEW');
expect(name).toBeUndefined();
expect(mockLogger.warn).toHaveBeenCalled();
});
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/adapters/src/chat/slack/adapter.test.ts` around lines 531 - 556, The
test suite's beforeEach only clears mockUsersInfo so mockLogger.warn can carry
state across tests and make the assertion flaky; update the beforeEach in this
describe block to also reset the logger mock (call mockLogger.warn.mockClear()
or equivalent) so that each test starts with a fresh mock for mockLogger.warn
when running SlackAdapter.fetchDisplayName.

Comment on lines +299 to +305
const err = error as Error & { data?: { error?: string } };
const slackErrorCode = err.data?.error;
if (slackErrorCode === 'missing_scope') {
getLog().warn({ slackUserId, scope: 'users:read' }, 'slack.users_info_missing_scope');
} else {
getLog().warn({ err, slackUserId, slackErrorCode }, 'slack.users_info_failed');
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Mask user identifiers in warning logs.

Line 302 and Line 304 log raw slackUserId. That exposes user identifiers in logs; mask it before logging.

🔐 Suggested fix
     } catch (error) {
       const err = error as Error & { data?: { error?: string } };
       const slackErrorCode = err.data?.error;
+      const maskedUserId = `${slackUserId.slice(0, 4)}***`;
       if (slackErrorCode === 'missing_scope') {
-        getLog().warn({ slackUserId, scope: 'users:read' }, 'slack.users_info_missing_scope');
+        getLog().warn({ maskedUserId, scope: 'users:read' }, 'slack.users_info_missing_scope');
       } else {
-        getLog().warn({ err, slackUserId, slackErrorCode }, 'slack.users_info_failed');
+        getLog().warn({ err, maskedUserId, slackErrorCode }, 'slack.users_info_failed');
       }
       return undefined;
     }
As per coding guidelines "never log API keys, tokens (mask as `token.slice(0, 8) + '...'`), message content, or PII".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/adapters/src/chat/slack/adapter.ts` around lines 299 - 305, Mask the
slackUserId before logging to avoid exposing PII: compute a masked value (e.g.,
take the first 8 chars then '...') and use that maskedId in both getLog().warn
calls instead of slackUserId; update references in the users info error handling
block surrounding the err/data parsing (the variables slackUserId,
slackErrorCode and the getLog() calls) so all logs use the masked value.

Comment on lines +947 to +955
const attributedLogin = event.comment?.user?.login ?? senderUsername;
let archonUserId: string | undefined;
if (attributedLogin) {
try {
const user = await userDb.findOrCreateUserByPlatformIdentity(
'github',
attributedLogin,
attributedLogin
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the file and print the relevant line ranges
file="packages/adapters/src/forge/github/adapter.ts"
if [ ! -f "$file" ]; then
  echo "Missing file: $file" >&2
  exit 1
fi

echo "=== Context around lines 930-990 ==="
nl -ba "$file" | sed -n '930,990p'

echo "=== Search for githubLogin usage ==="
rg -n "githubLogin" "$file" || true

echo "=== Search for structured logging patterns near this block ==="
# Print a bit more around any log calls in the neighborhood
rg -n "logger\.|createLogger|pino|log\(" "$file" | head -n 50

echo "=== Search for handleMessage(...) signature/call around the block ==="
rg -n "handleMessage\(" "$file" | head -n 50

Repository: coleam00/Archon

Length of output: 158


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="packages/adapters/src/forge/github/adapter.ts"
if [ ! -f "$file" ]; then
  echo "Missing file: $file" >&2
  exit 1
fi

echo "=== Context around lines 930-990 ==="
cat -n "$file" | sed -n '930,990p'

echo "=== Search for githubLogin usage (anywhere in file) ==="
rg -n "githubLogin" "$file" || true

echo "=== Search for pino/log usage near this block ==="
rg -n "createLogger|pino|logger|log\(" "$file" | head -n 80

echo "=== Search for handleMessage(...) around this block ==="
rg -n "handleMessage\(" "$file" | head -n 80

Repository: coleam00/Archon

Length of output: 3444


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "findOrCreateUserByPlatformIdentity" -S packages/adapters/src/forge/github/adapter.ts
rg -n "findOrCreateUserByPlatformIdentity" -S packages | head -n 50

# Try to locate the implementation file(s)
fd -t f "findOrCreateUserByPlatformIdentity" packages || true
fd -t f "*user*db*" packages | head -n 50

Repository: coleam00/Archon

Length of output: 2301


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="packages/core/src/db/users.ts"
if [ ! -f "$file" ]; then
  echo "Missing file: $file" >&2
  exit 1
fi

echo "=== Context around findOrCreateUserByPlatformIdentity in $file ==="
# Print from a bit before the function to after it (avoid dumping whole file)
start=$(rg -n "export async function findOrCreateUserByPlatformIdentity" "$file" | head -n1 | cut -d: -f1)
echo "Function starts at line: $start"
from=$((start-10))
to=$((start+90))
sed -n "${from},${to}p" "$file" | cat -n

echo "=== Call sites to infer argument meaning ==="
rg -n "findOrCreateUserByPlatformIdentity\\(" packages | head -n 80

Repository: coleam00/Archon

Length of output: 5578


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="packages/adapters/src/forge/github/adapter.ts"

echo "=== senderUsername definition (nearby occurrences) ==="
rg -n "senderUsername" "$file" 
echo "=== attributedLogin usage occurrences ==="
rg -n "attributedLogin" "$file"

echo "=== githubLogin field occurrences ==="
rg -n "githubLogin" "$file"

echo "=== Context earlier around senderUsername + event.comment user access ==="
# Show a broader window around the attribution block
cat -n "$file" | sed -n '860,980p'

Repository: coleam00/Archon

Length of output: 7526


🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="packages/adapters/src/forge/github/adapter.ts"

echo "=== Imports mentioning webhooks / Octokit event types ==="
rg -n "`@octokit`|webhooks|issue_comment|pull_request|Event" "$file" | head -n 80

echo "=== Find where method signature types event ==="
rg -n "function|async function|handle.*\\(.*event" "$file" | head -n 80

echo "=== Search for 'event:' type annotations around handle path ==="
rg -n "event\\s*:\\s*" "$file" | head -n 80

echo "=== Show around the method that contains this block (near line 700-1000) for context ==="
sed -n '650,820p' "$file" | cat -n

Repository: coleam00/Archon

Length of output: 9443


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Find the types file imported by adapter.ts
rg -n "from '\\./types'" packages/adapters/src/forge/github/adapter.ts
ls -la packages/adapters/src/forge/github

# Print the WebhookEvent typing area
types_file="packages/adapters/src/forge/github/types.ts"
if [ -f "$types_file" ]; then
  echo "=== Contents around WebhookEvent in $types_file ==="
  rg -n "export type WebhookEvent|export interface WebhookEvent|type WebhookEvent" "$types_file"
  start=$(rg -n "export (type|interface) WebhookEvent" "$types_file" | head -n1 | cut -d: -f1 || true)
  if [ -n "${start:-}" ]; then
    from=$((start-20)); to=$((start+220));
    sed -n "${from},${to}p" "$types_file" | cat -n
  else
    # fallback: just print first 260 lines
    head -n 260 "$types_file" | cat -n
  fi
else
  echo "Missing expected file: $types_file" >&2
  # List possible matches
  fd -t f "types.*" packages/adapters/src/forge/github
fi

Repository: coleam00/Archon

Length of output: 2613


🏁 Script executed:

#!/bin/bash
set -euo pipefail

types_file="packages/adapters/src/forge/github/types.ts"
echo "=== File size ==="
wc -l "$types_file"

echo "=== WebhookEvent user shape (search for id/login in this file) ==="
rg -n "interface .*User|user\\b|login\\b|\\bid\\b" "$types_file" | head -n 120

echo "=== Print types.ts (first 220 lines) ==="
sed -n '1,220p' "$types_file" | cat -n

echo "=== Print types.ts (lines 220-520) to catch comment.user definitions ==="
sed -n '220,520p' "$types_file" | cat -n

Repository: coleam00/Archon

Length of output: 1570


🏁 Script executed:

#!/bin/bash
set -euo pipefail
file="packages/adapters/src/forge/github/adapter.ts"

echo "=== Search for comment.user.id usage ==="
rg -n "comment\\?\\.user\\?\\.id|comment\\.user\\.id|comment\\s*\\)\\?\\.user\\?\\.id" "$file" || true

echo "=== Search for sender.id usage ==="
rg -n "sender\\?\\.id|sender\\.id" "$file" || true

echo "=== Search for '.id' near comment.user ==="
rg -n "comment\\?\\.(user|sender).*id|user\\?\\.id|sender\\?\\.id" "$file" || true

Repository: coleam00/Archon

Length of output: 178


Use GitHub’s immutable user id for the Archon identity key (and don’t log raw logins)

  • In packages/adapters/src/forge/github/adapter.ts (around lines 947-956), findOrCreateUserByPlatformIdentity('github', attributedLogin, attributedLogin) uses the (mutable) login as platformUserId—but findOrCreateUserByPlatformIdentity treats platformUserId as the unique identity key—so a GitHub rename will create a new Archon user. Pass comment.user.id/sender.id as platformUserId (stringified) and keep login only as the display name; this requires extending packages/adapters/src/forge/github/types.ts since WebhookEvent currently types only user.login / sender.login (no id).
  • In packages/adapters/src/forge/github/adapter.ts (around lines 958-961), the warning logs { githubLogin: attributedLogin } with the raw login; mask or omit it to avoid logging PII (the file already masks usernames for unauthorized attempts).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/adapters/src/forge/github/adapter.ts` around lines 947 - 955, The
code is using the mutable GitHub login as the unique platformUserId when calling
userDb.findOrCreateUserByPlatformIdentity; change the call in adapter.ts to pass
the immutable numeric GitHub id (stringified) — use event.comment?.user?.id or
event.sender?.id as platformUserId — and continue passing attributedLogin only
as the displayName; update the GitHub webhook types in
packages/adapters/src/forge/github/types.ts (add user.id and sender.id typings)
so the adapter can access those ids safely; additionally remove or mask raw
logins from warning/processLogger statements around the authorization check
(avoid logging attributedLogin directly, use masked username or omit the field).

Comment on lines +958 to +960
getLog().warn(
{ err: toError(err), githubLogin: attributedLogin },
'github.user_resolve_failed'

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Drop the raw GitHub login from this warning log.

githubLogin is user-identifying data, so persisting it in failure logs creates avoidable PII retention. Omit it or mask it before logging.

Suggested change
         getLog().warn(
-          { err: toError(err), githubLogin: attributedLogin },
+          { err: toError(err) },
           'github.user_resolve_failed'
         );
As per coding guidelines, "Use structured logging with Pino via `createLogger()` from `@archon/paths`... never log API keys, tokens (mask as `token.slice(0, 8) + '...'`), message content, or PII".
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
getLog().warn(
{ err: toError(err), githubLogin: attributedLogin },
'github.user_resolve_failed'
getLog().warn(
{ err: toError(err) },
'github.user_resolve_failed'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/adapters/src/forge/github/adapter.ts` around lines 958 - 960, The
warning currently logs user-identifying data via the githubLogin field; remove
or mask that PII when calling getLog().warn in the github.user_resolve_failed
branch: stop passing attributedLogin as githubLogin (or replace it with a masked
value like attributedLogin.slice(0,8)+'...' if truly needed), keep the error via
toError(err) and the same event key, and ensure the call site that constructs
the log only includes non-PII fields so the log entry no longer contains the raw
GitHub login.

},
"scripts": {
"test": "bun test src/handlers/command-handler.test.ts && bun test src/handlers/clone.test.ts && bun test src/db/adapters/postgres.test.ts && bun test src/db/connection.test.ts && bun test src/db/adapters/sqlite.test.ts src/db/codebases.test.ts src/db/conversations.test.ts src/db/env-vars.test.ts src/db/isolation-environments.test.ts src/db/messages.test.ts src/db/sessions.test.ts src/db/workflow-events.test.ts src/db/workflows.test.ts src/utils/defaults-copy.test.ts src/utils/worktree-sync.test.ts src/utils/conversation-lock.test.ts src/utils/credential-sanitizer.test.ts src/utils/port-allocation.test.ts src/utils/error.test.ts src/utils/error-formatter.test.ts src/utils/github-graphql.test.ts src/config/ src/state/ && bun test src/utils/path-validation.test.ts && bun test src/services/cleanup-service.test.ts && bun test src/services/title-generator.test.ts && bun test src/workflows/ && bun test src/operations/workflow-operations.test.ts && bun test src/operations/isolation-operations.test.ts && bun test src/orchestrator/orchestrator.test.ts && bun test src/orchestrator/orchestrator-agent.test.ts && bun test src/orchestrator/orchestrator-isolation.test.ts",
"test": "bun test src/handlers/command-handler.test.ts && bun test src/handlers/clone.test.ts && bun test src/db/adapters/postgres.test.ts && bun test src/db/connection.test.ts && bun test src/db/adapters/sqlite.test.ts src/db/codebases.test.ts src/db/conversations.test.ts src/db/env-vars.test.ts src/db/isolation-environments.test.ts src/db/messages.test.ts src/db/sessions.test.ts src/db/users.test.ts src/db/workflow-events.test.ts src/db/workflows.test.ts src/utils/defaults-copy.test.ts src/utils/worktree-sync.test.ts src/utils/conversation-lock.test.ts src/utils/credential-sanitizer.test.ts src/utils/port-allocation.test.ts src/utils/error.test.ts src/utils/error-formatter.test.ts src/utils/github-graphql.test.ts src/config/ src/state/ && bun test src/utils/path-validation.test.ts && bun test src/services/cleanup-service.test.ts && bun test src/services/title-generator.test.ts && bun test src/workflows/ && bun test src/operations/workflow-operations.test.ts && bun test src/operations/isolation-operations.test.ts && bun test src/orchestrator/orchestrator.test.ts && bun test src/orchestrator/orchestrator-agent.test.ts && bun test src/orchestrator/orchestrator-isolation.test.ts",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check if users.test.ts uses mock.module() which may require test isolation

# Search for mock.module usage in users.test.ts
echo "=== Checking users.test.ts for mock.module() usage ==="
rg -n "mock\.module\(" packages/core/src/db/users.test.ts

# Compare with other tests in the same batch to identify potential conflicts
echo -e "\n=== Checking other batched tests for mock.module() conflicts ==="
rg -n "mock\.module\(" packages/core/src/db/{codebases,conversations,env-vars,isolation-environments,messages,sessions,workflow-events,workflows}.test.ts

# Show isolated test files for reference
echo -e "\n=== Isolated test files (likely have conflicting mocks) ==="
rg -n "mock\.module\(" packages/core/src/handlers/{command-handler,clone}.test.ts packages/core/src/db/adapters/postgres.test.ts packages/core/src/db/connection.test.ts

Repository: coleam00/Archon

Length of output: 3312


🏁 Script executed:

#!/bin/bash
set -euo pipefail

files=(
  packages/core/src/db/users.test.ts
  packages/core/src/db/workflows.test.ts
  packages/core/src/db/workflow-events.test.ts
  packages/core/src/db/sessions.test.ts
  packages/core/src/db/messages.test.ts
  packages/core/src/db/isolation-environments.test.ts
  packages/core/src/db/conversations.test.ts
  packages/core/src/db/codebases.test.ts
  packages/core/src/db/env-vars.test.ts
)

for f in "${files[@]}"; do
  echo "===== $f (mock.module ./connection and related) ====="
  # show only the first ~80 lines to include mock.module declarations near the top
  # and then also show the specific lines containing mock.module for precision
  wc -l "$f" | awk '{print "lines:" $1}'
  echo "--- mock.module occurrences ---"
  rg -n "mock\.module\(" "$f" | head -n 20
  echo "--- context around ./connection mocks ---"
  rg -n "mock\.module\('./connection'" "$f" -n || true
  # print around each mock.module line that mentions ./connection or `@archon/paths`
  for pat in "mock\.module\('\./connection'" "mock\.module\(\"\.\/connection\"" "mock\.module\('`@archon/paths`'" "mock\.module\(\"`@archon/paths`\""; do
    matches=$(rg -n "$pat" "$f" | cut -d: -f1 || true)
    for ln in $matches; do
      start=$((ln-15)); end=$((ln+35))
      echo "----- $f:$ln (lines $start-$end) -----"
      sed -n "${start},${end}p" "$f" | nl -ba | sed 's/^/ /'
    done
  done
  echo
done

Repository: coleam00/Archon

Length of output: 497


🏁 Script executed:

#!/bin/bash
set -euo pipefail

files=(
  packages/core/src/db/users.test.ts
  packages/core/src/db/workflows.test.ts
  packages/core/src/db/workflow-events.test.ts
  packages/core/src/db/sessions.test.ts
  packages/core/src/db/messages.test.ts
  packages/core/src/db/isolation-environments.test.ts
  packages/core/src/db/conversations.test.ts
  packages/core/src/db/codebases.test.ts
  packages/core/src/db/env-vars.test.ts
)

print_block() {
  f="$1"; ln="$2"; label="$3"
  start=$((ln-8))
  if [ "$start" -lt 1 ]; then start=1; fi
  end=$((ln+45))

  echo "----- $f:$ln ($label) lines $start-$end -----"
  awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%6d  %s\n", NR, $0}' "$f"
  echo
}

for f in "${files[@]}"; do
  echo "================ $f ================"
  echo "--- mock.module('{'./connection'}') occurrences ---"
  rg -n "mock\.module\('\./connection'" "$f" || true

  echo "--- mock.module('`@archon/paths`') occurrences ---"
  rg -n "mock\.module\('`@archon/paths`'" "$f" || true

  # For each connection mock, print a surrounding block
  while IFS=: read -r path ln; do
    print_block "$path" "$ln" "mock.module('./connection', ...)"
  done < <(rg -n "mock\.module\('\./connection'" "$f" || true)

  # For each paths mock, print a surrounding block
  while IFS=: read -r path ln; do
    print_block "$path" "$ln" "mock.module('`@archon/paths`', ...)"
  done < <(rg -n "mock\.module\('`@archon/paths`'" "$f" || true)

done

Repository: coleam00/Archon

Length of output: 345


🏁 Script executed:

#!/bin/bash
set -euo pipefail

files=(
  packages/core/src/db/users.test.ts
  packages/core/src/db/workflows.test.ts
  packages/core/src/db/workflow-events.test.ts
  packages/core/src/db/sessions.test.ts
  packages/core/src/db/messages.test.ts
  packages/core/src/db/isolation-environments.test.ts
  packages/core/src/db/conversations.test.ts
  packages/core/src/db/codebases.test.ts
  packages/core/src/db/env-vars.test.ts
)

print_block() {
  f="$1"; ln="$2"; label="$3"
  start=$((ln-8)); end=$((ln+45))
  if [ "$start" -lt 1 ]; then start=1; fi

  echo "----- $f:$ln ($label) lines $start-$end -----"
  awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%6d  %s\n", NR, $0}' "$f"
  echo
}

for f in "${files[@]}"; do
  echo "================ $f ================"
  rg -n "mock\.module\('\./connection'" "$f" || true
  rg -n "mock\.module\('`@archon/paths`'" "$f" || true

  echo "--- blocks: mock.module('./connection' ...) ---"
  rg -n "mock\.module\('\./connection'" "$f" | while IFS=: read -r path ln rest; do
    print_block "$f" "$ln" "mock.module('./connection', ...)"
  done

  echo "--- blocks: mock.module('`@archon/paths`' ...) ---"
  rg -n "mock\.module\('`@archon/paths`'" "$f" | while IFS=: read -r path ln rest; do
    print_block "$f" "$ln" "mock.module('`@archon/paths`', ...)"
  done

  echo
done

Repository: coleam00/Archon

Length of output: 388


🏁 Script executed:

#!/bin/bash
set -euo pipefail

files=(
  packages/core/src/db/users.test.ts
  packages/core/src/db/workflows.test.ts
  packages/core/src/db/workflow-events.test.ts
  packages/core/src/db/sessions.test.ts
  packages/core/src/db/messages.test.ts
  packages/core/src/db/isolation-environments.test.ts
  packages/core/src/db/conversations.test.ts
  packages/core/src/db/codebases.test.ts
  packages/core/src/db/env-vars.test.ts
)

print_block() {
  f="$1"; ln="$2"; label="$3"
  start=$((ln-8)); end=$((ln+60))
  if [ "$start" -lt 1 ]; then start=1; fi
  echo "----- $f:$ln ($label) lines $start-$end -----"
  awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%6d  %s\n", NR, $0}' "$f"
  echo
}

for f in "${files[@]}"; do
  echo "================ $f ================"

  # connection mocks
  line_nums=$(rg -n "mock\\.module\\('\\./connection'" "$f" | cut -d: -f1 || true)
  if [ -z "${line_nums}" ]; then
    echo "--- blocks: mock.module('./connection' ...) (none) ---"
  else
    echo "--- blocks: mock.module('./connection' ...) ---"
    for ln in $line_nums; do
      print_block "$f" "$ln" "mock.module('./connection', ...)"
    done
  fi

  # paths mocks
  line_nums=$(rg -n "mock\\.module\\('`@archon/paths`'" "$f" | cut -d: -f1 || true)
  if [ -z "${line_nums}" ]; then
    echo "--- blocks: mock.module('`@archon/paths`' ...) (none) ---"
  else
    echo "--- blocks: mock.module('`@archon/paths`' ...) ---"
    for ln in $line_nums; do
      print_block "$f" "$ln" "mock.module('`@archon/paths`', ...)"
    done
  fi

  echo
done

Repository: coleam00/Archon

Length of output: 34840


Fix Bun test isolation for users.test.ts
packages/core/src/db/users.test.ts mocks ./connection (includes getDatabase: { withTransaction }), but several other tests in the same packages/core/package.json test batch also call mock.module('./connection', ...) with different exported shapes (getDatabaseType in workflows.test.ts/messages.test.ts, only pool/getDialect in workflow-events.test.ts, etc.). Since Bun’s mock.module() persists across test files within the same invocation, this can cause mock pollution/order-dependent failures—move users.test.ts into its own bun test invocation (or group it with tests that mock ./connection identically).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/package.json` at line 26, The users.test.ts mock of
./connection leaks into other tests because Bun’s mock.module() persists across
files in the same test invocation; update the "test" script in package.json so
packages/core/src/db/users.test.ts is executed in its own bun test invocation
(or grouped only with tests that mock ./connection identically) instead of
inside the large chained invocation — locate the "test" script entry in
package.json and split out "src/db/users.test.ts" into a separate && bun test
... segment (or create a grouped segment with matching mocks) so mock.module()
state cannot pollute other tests like workflows.test.ts, messages.test.ts,
workflow-events.test.ts.

Comment on lines 30 to +42
export async function addMessage(
conversationId: string,
role: 'user' | 'assistant',
content: string,
metadata?: Record<string, unknown>
metadata?: Record<string, unknown>,
userId?: string
): Promise<MessageRow> {
const dialect = getDialect();
const result = await pool.query<MessageRow>(
`INSERT INTO remote_agent_messages (conversation_id, role, content, metadata, created_at)
VALUES ($1, $2, $3, $4, ${dialect.now()})
`INSERT INTO remote_agent_messages (conversation_id, role, content, metadata, user_id, created_at)
VALUES ($1, $2, $3, $4, $5, ${dialect.now()})
RETURNING *`,
[conversationId, role, content, JSON.stringify(metadata ?? {})]
[conversationId, role, content, JSON.stringify(metadata ?? {}), userId ?? null]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reject userId for assistant messages.

addMessage currently allows role: 'assistant' with non-null userId, which can silently corrupt attribution data. Add an explicit guard before insert.

Proposed fix
 export async function addMessage(
   conversationId: string,
   role: 'user' | 'assistant',
   content: string,
   metadata?: Record<string, unknown>,
   userId?: string
 ): Promise<MessageRow> {
+  if (role === 'assistant' && userId !== undefined) {
+    throw new Error('assistant messages must not include userId');
+  }
+
   const dialect = getDialect();
   const result = await pool.query<MessageRow>(
As per coding guidelines, "Prefer throwing early with clear error for unsupported or unsafe states; never silently swallow errors or broaden permissions; document fallback behavior with a comment when intentional and safe, otherwise throw (Fail Fast + Explicit Errors)".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/db/messages.ts` around lines 30 - 42, add a guard at the
start of addMessage to reject any non-null userId when role === 'assistant' by
throwing a clear Error before performing the INSERT; specifically, in the
addMessage function check the role and userId and throw (with a short message)
if an assistant message is supplied a userId so we never pass an invalid user_id
into the remote_agent_messages insert, keeping the DB attribution consistent.

Comment on lines +15 to +19
mock.module('./connection', () => ({
pool: { query: mockQuery },
getDialect: () => mockPostgresDialect,
getDatabase: () => ({ withTransaction: mockWithTransaction }),
}));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find process-global mock.module usage across TS test files.
rg -nP --type=ts "mock\.module\(" packages
# Expected: avoid internal-module mock.module usage (or isolate conflicting files into separate bun test invocations).

Repository: coleam00/Archon

Length of output: 33399


🏁 Script executed:

rg -nP --type=ts "mock\.module\(\s*'\\./connection'" packages

Repository: coleam00/Archon

Length of output: 780


🏁 Script executed:

rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/db

Repository: coleam00/Archon

Length of output: 780


🏁 Script executed:

rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/db

Repository: coleam00/Archon

Length of output: 780


🏁 Script executed:

rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/db

Repository: coleam00/Archon

Length of output: 780


🏁 Script executed:

rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/db

Repository: coleam00/Archon

Length of output: 780


🏁 Script executed:

rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/db

Repository: coleam00/Archon

Length of output: 780


🏁 Script executed:

rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/db

Repository: coleam00/Archon

Length of output: 780


🏁 Script executed:

rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/db

Repository: coleam00/Archon

Length of output: 780


Replace mock.module('./connection', ...) with spyOn in packages/core/src/db/users.test.ts

mock.module() is process-global/irreversible in Bun, and ./connection is also mocked via mock.module() in multiple other packages/core/src/db/*.test.ts files, so this can contaminate other suites. Use spyOn()-based mocking for the imported ./connection module (restore with spy.mockRestore()).

mock.module('./connection', () => ({
  pool: { query: mockQuery },
  getDialect: () => mockPostgresDialect,
  getDatabase: () => ({ withTransaction: mockWithTransaction }),
}));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/db/users.test.ts` around lines 15 - 19, Replace the
process-global mock.module('./connection', ...) call with a spyOn-based mock
that targets the imported connection module in users.test.ts: create a spy via
spyOn(importedConnectionModule, ...) to stub pool.query (mockQuery), getDialect
(returning mockPostgresDialect) and getDatabase (returning { withTransaction:
mockWithTransaction }) and ensure the spy is restored at test teardown with
spy.mockRestore() so other test suites are not contaminated.

Comment on lines +69 to +71
getLog().warn(
{ identityId: existing.id, platform, platformUserId },
'user.identity_orphan_repair_started'

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not log raw platform user identifiers or display names.

These events currently log platformUserId and displayName directly, which can leak PII into logs. Log only internal IDs (or a one-way hash/truncated surrogate) instead.

As per coding guidelines, "Use structured logging with Pino via createLogger() from @archon/paths; event naming: {domain}.{action}_{state} ... include context (IDs, durations, errors); never log API keys, tokens (mask as token.slice(0, 8) + '...'), message content, or PII".

Also applies to: 92-94, 106-107, 110-110

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/db/users.ts` around lines 69 - 71, The logs call
getLog().warn/getLog().info (e.g., the user.identity_orphan_repair_started
event) with raw PII variables platformUserId and displayName; replace those
direct values with non-PII surrogates (log existing.id and platform only, and
for platformUserId/displayName either omit them or log a one-way hash/truncated
surrogate such as sha256(platformUserId).slice(0,8) or
platformUserId.slice(0,8)+'...'); update all getLog() calls in this file that
reference platformUserId or displayName (including the occurrences around
user.identity_orphan_repair_started and the other calls noted) to use the
masked/hashed value or internal id only, keeping event names and structured keys
intact.

} = context ?? {};
try {
getLog().debug({ conversationId }, 'orchestrator_message_received');
getLog().debug({ conversationId, userId }, 'orchestrator_message_received');

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not log the raw userId.

Line 653 adds the Archon user UUID to orchestrator logs, which creates a new user-identifier trail in log storage. Omit it or reduce it to a non-identifying flag/hash.

Proposed fix
-    getLog().debug({ conversationId, userId }, 'orchestrator_message_received');
+    getLog().debug({ conversationId, hasUserId: Boolean(userId) }, 'orchestrator_message_received');

As per coding guidelines, "Use structured logging with Pino via createLogger() from @archon/paths... never log API keys, tokens (mask as token.slice(0, 8) + '...'), message content, or PII".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
getLog().debug({ conversationId, userId }, 'orchestrator_message_received');
getLog().debug({ conversationId, hasUserId: Boolean(userId) }, 'orchestrator_message_received');
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/orchestrator/orchestrator-agent.ts` at line 653, The
current getLog().debug call logs the raw userId (PII); update the log in
orchestrator-agent.ts to stop emitting the raw userId—either remove userId from
the structured payload or replace it with a non-identifying value (e.g., a
masked or hashed token produced by a helper like maskUserId(userId) or
hashUserId(userId)) and keep conversationId as-is; modify the call to
getLog().debug({ conversationId, user: maskUserId(userId) },
'orchestrator_message_received') or simply getLog().debug({ conversationId },
'orchestrator_message_received') so no raw user UUID is written.

Comment on lines +120 to +124
} catch (err) {
getLog().warn(
{ err: err as Error, platform, platformUserId: String(platformUserId) },
`${platform}.user_resolve_failed`
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Mask or drop the platform user identifier from this warning log.

Line 122 writes the raw Slack/Discord/Telegram user identifier into logs on lookup failure. That creates avoidable PII retention for a non-critical warning path; keep the platform and error, but omit or mask the identifier.

Suggested change
   } catch (err) {
     getLog().warn(
-      { err: err as Error, platform, platformUserId: String(platformUserId) },
+      { err: err as Error, platform },
       `${platform}.user_resolve_failed`
     );
     return undefined;
   }
As per coding guidelines, "Use structured logging with Pino via `createLogger()` from `@archon/paths`... never log API keys, tokens (mask as `token.slice(0, 8) + '...'`), message content, or PII".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/index.ts` around lines 120 - 124, The warning currently
logs the raw platform user id in the structured payload; remove or redact it to
avoid PII retention by replacing platformUserId in the getLog().warn call with
either no identifier or a masked value (e.g., compute maskedPlatformUserId =
String(platformUserId).slice(0,8) + '...' before logging) and keep the rest of
the payload (err and platform) and the same message key
`${platform}.user_resolve_failed`; update the call sites around getLog().warn to
use the maskedPlatformUserId or omit that field entirely.

… identity type union

Critical fix: SQLite migrateColumns ALTERs now include ON DELETE SET NULL on
all four new user_id / created_by_user_id FK columns. Upgraded SQLite DBs
previously inherited the default NO ACTION ≈ RESTRICT semantic, contradicting
the PR's documented "no destructive cascade on user deletion" guarantee.

Hardening on the new user-identity surface:

  - findOrCreateUserByPlatformIdentity narrows its race-recovery catch to true
    UNIQUE-constraint violations (PG sqlstate 23505 or SQLite "UNIQUE
    constraint failed" message). Any other error logs as user.create_failed
    and propagates — no more masking generic DB failures as recoverable races.

  - backfillDisplayName wraps its UPDATE in try/catch with a dedicated warn
    event. A failed opportunistic backfill must not silently fail the entire
    user resolution path; the caller already has the resolved user row.

  - repairOrphanedIdentity now logs user.identity_orphan_repair_failed on
    transaction failure (previously surfaced only as a generic resolve_failed
    upstream).

  - New IdentityPlatform literal union ('slack' | 'telegram' | 'discord' |
    'github' | 'web' | 'cli') replaces the unconstrained `platform: string`
    on UserIdentity and the findOrCreate signature. Typos now fail at compile
    time rather than silently breaking the UNIQUE(platform, platform_user_id)
    invariant.

  - user.create_started/_completed/_failed are now properly paired per the
    project event-naming convention.

Slack adapter:

  - users.info missing_scope WARN now gated by an instance flag so it fires
    once per adapter lifetime instead of once per unknown user. The
    misconfiguration is permanent — flooding logs after every restart in a
    100-user workspace was the wrong shape.

  - users_info_failed log strips err (which can include err.data with
    workspace metadata) in favor of structured errMessage / slackErrorCode
    / slackUserId fields. No PII through the log pipeline.

Server resolver:

  - resolveUserId exported for testability and now logs as a single static
    event server.user_resolve_failed (platform in structured fields) instead
    of the templated ${platform}.user_resolve_failed which collided with the
    GitHub adapter's own event name.

  - Dead `=== null` branch removed (TypeScript already narrows the type).

GitHub adapter:

  - User-identity resolution moved up to immediately after self-filtering
    + @mention checks. Now runs before the codebase-ensure and comment-
    history Octokit calls so resolution can't be silently skipped by an
    upstream Octokit failure (which was masking a missing-mock bug in the
    existing test suite).

Tests:

  - New packages/server/src/resolve-user-id.test.ts covers the never-throws
    contract that three adapter handlers depend on. 6 cases including the
    static-event-name regression.

  - GitHub adapter test now mocks @archon/core/db/users and covers the
    comment.user.login ?? sender.login attribution fallback in both
    directions, plus a never-throws case for resolution failure.

  - users.test.ts gains the asymmetric-backfill case, backfill-failure-
    does-not-block-resolution case, both PG and SQLite UNIQUE-error shapes
    for race recovery, and a non-UNIQUE-rethrows-without-recovery case
    that explicitly counts the query calls.

  - isolation-environments.test.ts adds a "ON CONFLICT does NOT update
    created_by_user_id" regression guard so a copy-paste in the SET clause
    can't silently transfer ownership across re-activations.

Comment cleanup: stripped the (PR-A) / (until PR-C) / pre-PR-A history
labels from production types, migrations, and source files. They were PR-
state markers that would rot on merge; the substantive WHY content stays.

Docs:

  - CLAUDE.md table count: 8 → 10; users and user_identities documented.
  - docs-web/reference/database.md: 8 → 10 with explicit ON DELETE semantics
    and a note that re-running 000_combined.sql is idempotent and picks up
    the new ALTERs.
  - docs-web/reference/architecture.md: 7-table diagram → 10-table; full
    schema block extended with the new tables and user_id columns.
  - docs-web/adapters/slack.md: users:read scope added to the Bot Token
    Scopes setup with a note about graceful degradation if omitted.

Skipped (with reason):

  - Converting User/UserIdentity to z.infer<typeof schema>: all sibling row
    interfaces in types/index.ts are hand-crafted; doing this for just the
    two new types creates inconsistency. A separate consistency pass should
    convert the whole file, not selectively.

  - Threading userId into the four web/CLI addMessage callsites: those
    surfaces don't have an auth flow yet, so threading now means passing
    `undefined` from every caller. Added explicit TODOs at each callsite
    pointing at the upcoming web/CLI auth work instead.
@Wirasm Wirasm merged commit 50beeec into dev May 28, 2026
4 checks passed
@Wirasm Wirasm deleted the feat/user-identity-foundation branch May 28, 2026 06:03
Wirasm added a commit that referenced this pull request May 28, 2026
…tion routing (#1788)

Phase 2 of the team-foundation PRD. Replaces the bot's single shared GITHUB_TOKEN PAT with a registered GitHub App that supports multi-installation token routing from day one.

New @archon/core/github-auth/ module wrapping @octokit/auth-app with a three-level cache:

  - lookupCache:  owner/repo → installationId (1h TTL; evicted on 401)
  - tokenCache:   installationId → access token (1h GitHub TTL, refreshed 5min before expiry)
  - octokitCache: installationId → Octokit (per-installation auth strategy; evicted on 401 so the SDK's hidden internal token state can't keep serving the dead token)

GitHubAdapter takes a `GitHubAuth` discriminated union at construction. All 4 Octokit callsites (postComment / listComments / repos.get / pulls.get) plus the clone path route through resolveOctokit + a withTokenRefresh wrapper that calls invalidateRepo and retries once on 401. Webhook event.installation.id primes the lookup cache to skip a round-trip. Secondary self-filter compares against `<slug>[bot]` in App mode (via a botLogin getter, distinct from botMention) so PR-C's per-user tokens won't trip it.

Server bootstrap detects App vs PAT mode via env and fails fast if both are configured. In App mode it registers the provider on a module singleton consumed by createWorkflowDeps(), so the workflow executor's bash/script subprocesses inherit a fresh GH_TOKEN/GITHUB_TOKEN. New POST /internal/git-credential endpoint (App mode only) backs a POSIX git credential helper installed at clone time, covering workflows that outlive the 1h installation-token expiry. The public-bind guard runs BEFORE Bun.serve so a rejected config never opens the listening socket — opt-out via ARCHON_ALLOW_INTERNAL_ON_PUBLIC_BIND=1 for deployments where the reverse proxy already drops /internal/*.

Refactor + extracted helpers in server/src/github-auth-bootstrap.ts (selectGitHubAuthMode + parseGitCredentialPath) so the security-critical decisions are testable in isolation without spinning up Hono.

Backwards compat: solo installs running GITHUB_TOKEN only see zero functional change. All 54 existing PAT-mode adapter tests pass unchanged.

Tests added: 23 strictly-mocked auth-module tests (PRD Q7 — no live api.github.com in CI); 10 new App-mode adapter tests (multi-install routing, payload short-circuit, 401 retry + retry-on-retry propagation, AppNotInstalledError surfacing, clone-token resolution, post-clone credential helper install); 20 server-bootstrap unit tests (dual-mode fail-fast, /internal path validation incl. traversal + null bytes).

Closes phase 2 of .claude/PRPs/prds/github-app-and-user-identity.prd.md. Depends on #1783 (PR-A user-identity foundation).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant