feat(core): plumb user_id from chat/forge adapters through orchestrator and workflow runs#1783
Conversation
…or and workflow runs Adds remote_agent_users + remote_agent_user_identities tables (Archon identity + per-platform mapping, UNIQUE(platform, platform_user_id)) and threads a resolved user_id through HandleMessageContext into the orchestrator, workflow executor, and isolation resolver. Every new conversation, message, workflow_run, and isolation_environment row created from Slack/Telegram/Discord/GitHub now carries attribution. Slack additionally enriches first-sight users with their real name via users.info (requires bot scope users:read — reinstall the app to grant). Telegram/Discord derive display name from the inbound event payload. GitHub resolves event.comment.user.login or event.sender.login on each webhook. Resolution failure warn-logs and continues — never drops a message. Schema is additive and nullable everywhere: existing rows remain valid with NULL, ON DELETE SET NULL on every new FK. Web POST /api/conversations and the CLI continue to write NULL user_id; those surfaces become attributed in a follow-up PR. Solo installs with GITHUB_TOKEN are unchanged. Race-safe create-on-first-sight: UNIQUE(platform, platform_user_id) trips on concurrent first-sight webhooks; the losing transaction rolls back and we re-SELECT the winner's identity. Orphaned identities (user row deleted out from under them) are auto-repaired. Foundation for the small-team Archon initiative. Follow-ups will swap the shared GITHUB_TOKEN for a GitHub App and wire per-user GitHub tokens via device flow.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (26)
✅ Files skipped from review due to trivial changes (4)
🚧 Files skipped from review as they are similar to previous changes (12)
📝 WalkthroughWalkthroughThis PR adds platform-agnostic user identity tables and a user DB module, enriches adapters with displayName and resolves platform users to Archon user UUIDs, and threads optional user attribution through conversations, messages, workflow runs, isolation environments, and orchestrator flows. ChangesUser Identity Attribution Pipeline
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
PR Review Summary — Multi-AgentReviewed by Critical Issues (1 found)
Important Issues (5 found)
Suggestions (selected)
Documentation Issues
Strengths
VerdictNEEDS FIXES — one critical FK migration bug + two high-severity silent-failure paths + missing test coverage on Recommended Actions
Re-run targeted aspects after fixes: |
There was a problem hiding this comment.
Actionable comments posted: 12
🧹 Nitpick comments (3)
packages/core/src/db/conversations.test.ts (1)
71-250: ⚡ Quick winAdd one non-null
user_idcreation test.All updated cases assert
$6 = null; add one case passinguserIdto lock in the new behavior and prevent silent regression.Proposed test addition
+ test('persists user_id when provided', async () => { + const newConversation: Conversation = { + ...existingConversation, + id: 'conv-with-user', + }; + mockQuery.mockResolvedValueOnce(createQueryResult([])); + mockQuery.mockResolvedValueOnce(createQueryResult([newConversation])); + + await getOrCreateConversation('telegram', 'chat-999', undefined, undefined, 'user-uuid-1'); + + expect(mockQuery).toHaveBeenNthCalledWith( + 2, + 'INSERT INTO remote_agent_conversations (platform_type, platform_conversation_id, ai_assistant_type, codebase_id, cwd, user_id) VALUES ($1, $2, $3, $4, $5, $6) RETURNING *', + ['telegram', 'chat-999', 'claude', null, null, 'user-uuid-1'] + ); + });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/db/conversations.test.ts` around lines 71 - 250, Add a test to cover creating a conversation with a non-null user_id: call getOrCreateConversation with the userId argument, mock queries with createQueryResult (empty then inserted row), and assert mockQuery was called for INSERT with the 6th parameter equal to the provided userId; reference the existing patterns in packages/core/src/db/conversations.test.ts (functions getOrCreateConversation, mockQuery, createQueryResult) and mirror the other tests' structure for setup and expectations.packages/core/src/db/workflows.test.ts (1)
55-129: ⚡ Quick winAdd a createWorkflowRun test that passes
user_id.Current assertions only validate
nullin the new slot; add one non-null case to protect this attribution path.Proposed test addition
+ test('creates workflow run with user_id when provided', async () => { + mockQuery.mockResolvedValueOnce(createQueryResult([mockWorkflowRun])); + + await createWorkflowRun({ + workflow_name: 'feature-development', + conversation_id: 'conv-456', + user_message: 'Add dark mode support', + user_id: 'user-uuid-1', + }); + + expect(mockQuery).toHaveBeenCalledWith( + expect.stringContaining('INSERT INTO remote_agent_workflow_runs'), + ['feature-development', 'conv-456', null, 'Add dark mode support', '{}', null, null, 'user-uuid-1'] + ); + });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/db/workflows.test.ts` around lines 55 - 129, The tests for createWorkflowRun are missing a case that supplies a non-null user_id; add a new unit test that calls createWorkflowRun with user_id (e.g., 'user-123'), mockQuery.mockResolvedValueOnce to return a row including that user_id, assert the returned result.user_id equals 'user-123', and assert mockQuery was called with an INSERT INTO remote_agent_workflow_runs parameter array where the user_id slot contains 'user-123' (keeping other params same); reference the existing test pattern and reuse mockWorkflowRun, mockQuery, and createQueryResult to implement the new test.packages/adapters/src/chat/slack/adapter.ts (1)
46-47: ⚡ Quick winCap the display-name cache to prevent unbounded memory growth.
displayNameCachecurrently grows forever across unique users. Add a max size + eviction policy (FIFO/LRU) to keep adapter memory bounded in long-lived processes.♻️ Suggested bounded-cache change
+const MAX_DISPLAY_NAME_CACHE = 5000; + private displayNameCache = new Map<string, string>(); @@ const name = u?.real_name ?? u?.profile?.display_name ?? u?.name ?? undefined; if (name) { + if (this.displayNameCache.size >= MAX_DISPLAY_NAME_CACHE) { + const oldest = this.displayNameCache.keys().next().value; + if (oldest !== undefined) this.displayNameCache.delete(oldest); + } this.displayNameCache.set(slackUserId, name); }Also applies to: 287-296
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/adapters/src/chat/slack/adapter.ts` around lines 46 - 47, displayNameCache currently is an unbounded Map (field displayNameCache) and must be replaced with a bounded cache with eviction; change the displayNameCache implementation to enforce a fixed max size (e.g. MAX_DISPLAYNAME_CACHE) and evict entries on insert when capacity is exceeded using an eviction policy (LRU or FIFO) so memory cannot grow unbounded; update all uses of displayNameCache (the field declared as displayNameCache and the code paths referenced around the other occurrences ~287-296) to interact with the new bounded cache API (get, set, delete/evict) so lookups and inserts behave the same while automatically removing old entries when the cache is full.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/adapters/src/chat/slack/adapter.test.ts`:
- Around line 531-556: The test suite's beforeEach only clears mockUsersInfo so
mockLogger.warn can carry state across tests and make the assertion flaky;
update the beforeEach in this describe block to also reset the logger mock (call
mockLogger.warn.mockClear() or equivalent) so that each test starts with a fresh
mock for mockLogger.warn when running SlackAdapter.fetchDisplayName.
In `@packages/adapters/src/chat/slack/adapter.ts`:
- Around line 299-305: Mask the slackUserId before logging to avoid exposing
PII: compute a masked value (e.g., take the first 8 chars then '...') and use
that maskedId in both getLog().warn calls instead of slackUserId; update
references in the users info error handling block surrounding the err/data
parsing (the variables slackUserId, slackErrorCode and the getLog() calls) so
all logs use the masked value.
In `@packages/adapters/src/forge/github/adapter.ts`:
- Around line 958-960: The warning currently logs user-identifying data via the
githubLogin field; remove or mask that PII when calling getLog().warn in the
github.user_resolve_failed branch: stop passing attributedLogin as githubLogin
(or replace it with a masked value like attributedLogin.slice(0,8)+'...' if
truly needed), keep the error via toError(err) and the same event key, and
ensure the call site that constructs the log only includes non-PII fields so the
log entry no longer contains the raw GitHub login.
- Around line 947-955: The code is using the mutable GitHub login as the unique
platformUserId when calling userDb.findOrCreateUserByPlatformIdentity; change
the call in adapter.ts to pass the immutable numeric GitHub id (stringified) —
use event.comment?.user?.id or event.sender?.id as platformUserId — and continue
passing attributedLogin only as the displayName; update the GitHub webhook types
in packages/adapters/src/forge/github/types.ts (add user.id and sender.id
typings) so the adapter can access those ids safely; additionally remove or mask
raw logins from warning/processLogger statements around the authorization check
(avoid logging attributedLogin directly, use masked username or omit the field).
In `@packages/core/package.json`:
- Line 26: The users.test.ts mock of ./connection leaks into other tests because
Bun’s mock.module() persists across files in the same test invocation; update
the "test" script in package.json so packages/core/src/db/users.test.ts is
executed in its own bun test invocation (or grouped only with tests that mock
./connection identically) instead of inside the large chained invocation —
locate the "test" script entry in package.json and split out
"src/db/users.test.ts" into a separate && bun test ... segment (or create a
grouped segment with matching mocks) so mock.module() state cannot pollute other
tests like workflows.test.ts, messages.test.ts, workflow-events.test.ts.
In `@packages/core/src/db/adapters/sqlite.ts`:
- Around line 230-232: The catch blocks that currently call getLog().warn(...)
(e.g., the handlers logging 'db.sqlite_migration_conversations_user_id_failed'
and the similar handlers at the other locations) must not swallow errors; after
logging include the original error and rethrow it (or throw a new Error that
wraps the original) so startup fails fast. Update each catch (the ones calling
getLog().warn({ err: e as Error },
'db.sqlite_migration_conversations_user_id_failed') and the analogous warns at
the other noted locations) to: 1) log with the error object for context, and 2)
rethrow the same error (throw e) or throw a new Error with the original error
attached, so migrations cannot silently fail.
- Around line 219-229: The migration in packages/core/src/db/adapters/sqlite.ts
that alters remote_agent_conversations to add foreign key columns (see the ALTER
TABLE ... ADD COLUMN ... for user_id and created_by_user_id) omits the ON DELETE
SET NULL clause; update each ALTER TABLE ADD COLUMN statement that adds a
REFERENCES remote_agent_users(id) (the ones adding user_id and
created_by_user_id) to append "ON DELETE SET NULL" so the upgrade path matches
the fresh-schema behavior and ensures deletions nullify the FK instead of
leaving dangling references.
In `@packages/core/src/db/messages.ts`:
- Around line 30-42: add a guard at the start of addMessage to reject any
non-null userId when role === 'assistant' by throwing a clear Error before
performing the INSERT; specifically, in the addMessage function check the role
and userId and throw (with a short message) if an assistant message is supplied
a userId so we never pass an invalid user_id into the remote_agent_messages
insert, keeping the DB attribution consistent.
In `@packages/core/src/db/users.test.ts`:
- Around line 15-19: Replace the process-global mock.module('./connection', ...)
call with a spyOn-based mock that targets the imported connection module in
users.test.ts: create a spy via spyOn(importedConnectionModule, ...) to stub
pool.query (mockQuery), getDialect (returning mockPostgresDialect) and
getDatabase (returning { withTransaction: mockWithTransaction }) and ensure the
spy is restored at test teardown with spy.mockRestore() so other test suites are
not contaminated.
In `@packages/core/src/db/users.ts`:
- Around line 69-71: The logs call getLog().warn/getLog().info (e.g., the
user.identity_orphan_repair_started event) with raw PII variables platformUserId
and displayName; replace those direct values with non-PII surrogates (log
existing.id and platform only, and for platformUserId/displayName either omit
them or log a one-way hash/truncated surrogate such as
sha256(platformUserId).slice(0,8) or platformUserId.slice(0,8)+'...'); update
all getLog() calls in this file that reference platformUserId or displayName
(including the occurrences around user.identity_orphan_repair_started and the
other calls noted) to use the masked/hashed value or internal id only, keeping
event names and structured keys intact.
In `@packages/core/src/orchestrator/orchestrator-agent.ts`:
- Line 653: The current getLog().debug call logs the raw userId (PII); update
the log in orchestrator-agent.ts to stop emitting the raw userId—either remove
userId from the structured payload or replace it with a non-identifying value
(e.g., a masked or hashed token produced by a helper like maskUserId(userId) or
hashUserId(userId)) and keep conversationId as-is; modify the call to
getLog().debug({ conversationId, user: maskUserId(userId) },
'orchestrator_message_received') or simply getLog().debug({ conversationId },
'orchestrator_message_received') so no raw user UUID is written.
In `@packages/server/src/index.ts`:
- Around line 120-124: The warning currently logs the raw platform user id in
the structured payload; remove or redact it to avoid PII retention by replacing
platformUserId in the getLog().warn call with either no identifier or a masked
value (e.g., compute maskedPlatformUserId = String(platformUserId).slice(0,8) +
'...' before logging) and keep the rest of the payload (err and platform) and
the same message key `${platform}.user_resolve_failed`; update the call sites
around getLog().warn to use the maskedPlatformUserId or omit that field
entirely.
---
Nitpick comments:
In `@packages/adapters/src/chat/slack/adapter.ts`:
- Around line 46-47: displayNameCache currently is an unbounded Map (field
displayNameCache) and must be replaced with a bounded cache with eviction;
change the displayNameCache implementation to enforce a fixed max size (e.g.
MAX_DISPLAYNAME_CACHE) and evict entries on insert when capacity is exceeded
using an eviction policy (LRU or FIFO) so memory cannot grow unbounded; update
all uses of displayNameCache (the field declared as displayNameCache and the
code paths referenced around the other occurrences ~287-296) to interact with
the new bounded cache API (get, set, delete/evict) so lookups and inserts behave
the same while automatically removing old entries when the cache is full.
In `@packages/core/src/db/conversations.test.ts`:
- Around line 71-250: Add a test to cover creating a conversation with a
non-null user_id: call getOrCreateConversation with the userId argument, mock
queries with createQueryResult (empty then inserted row), and assert mockQuery
was called for INSERT with the 6th parameter equal to the provided userId;
reference the existing patterns in packages/core/src/db/conversations.test.ts
(functions getOrCreateConversation, mockQuery, createQueryResult) and mirror the
other tests' structure for setup and expectations.
In `@packages/core/src/db/workflows.test.ts`:
- Around line 55-129: The tests for createWorkflowRun are missing a case that
supplies a non-null user_id; add a new unit test that calls createWorkflowRun
with user_id (e.g., 'user-123'), mockQuery.mockResolvedValueOnce to return a row
including that user_id, assert the returned result.user_id equals 'user-123',
and assert mockQuery was called with an INSERT INTO remote_agent_workflow_runs
parameter array where the user_id slot contains 'user-123' (keeping other params
same); reference the existing test pattern and reuse mockWorkflowRun, mockQuery,
and createQueryResult to implement the new test.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a72c3e5a-367e-455b-b734-e46bb0c4502b
📒 Files selected for processing (29)
migrations/000_combined.sqlpackages/adapters/src/chat/slack/adapter.test.tspackages/adapters/src/chat/slack/adapter.tspackages/adapters/src/chat/slack/types.tspackages/adapters/src/chat/telegram/adapter.tspackages/adapters/src/chat/telegram/types.tspackages/adapters/src/forge/github/adapter.tspackages/core/package.jsonpackages/core/src/db/adapters/sqlite.tspackages/core/src/db/conversations.test.tspackages/core/src/db/conversations.tspackages/core/src/db/index.tspackages/core/src/db/isolation-environments.test.tspackages/core/src/db/isolation-environments.tspackages/core/src/db/messages.test.tspackages/core/src/db/messages.tspackages/core/src/db/users.test.tspackages/core/src/db/users.tspackages/core/src/db/workflows.test.tspackages/core/src/db/workflows.tspackages/core/src/orchestrator/orchestrator-agent.tspackages/core/src/orchestrator/orchestrator.tspackages/core/src/types/index.tspackages/isolation/src/resolver.tspackages/isolation/src/types.tspackages/server/src/index.tspackages/workflows/src/executor.tspackages/workflows/src/schemas/workflow-run.tspackages/workflows/src/store.ts
| beforeEach(() => { | ||
| mockUsersInfo.mockClear(); | ||
| }); | ||
|
|
||
| test('returns real_name from users.info on first call and caches result', async () => { | ||
| const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake'); | ||
| const name1 = await adapter.fetchDisplayName('U123'); | ||
| const name2 = await adapter.fetchDisplayName('U123'); | ||
| expect(name1).toBe('Alice Liddell'); | ||
| expect(name2).toBe('Alice Liddell'); | ||
| // Second call hits the in-memory cache — no second API call. | ||
| expect(mockUsersInfo).toHaveBeenCalledTimes(1); | ||
| }); | ||
|
|
||
| test('returns undefined and warn-logs on missing_scope failure', async () => { | ||
| const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake'); | ||
| const slackErr = Object.assign(new Error('missing_scope'), { | ||
| data: { error: 'missing_scope' }, | ||
| }); | ||
| mockUsersInfo.mockRejectedValueOnce(slackErr); | ||
|
|
||
| const name = await adapter.fetchDisplayName('U_NEW'); | ||
|
|
||
| expect(name).toBeUndefined(); | ||
| expect(mockLogger.warn).toHaveBeenCalled(); | ||
| }); |
There was a problem hiding this comment.
Reset logger mocks in this block to keep assertions deterministic.
mockLogger.warn isn’t cleared in this describe, so prior warning calls can satisfy the expectation in Line 555. Clear it in the local beforeEach.
🧪 Suggested fix
describe('fetchDisplayName (users.info enrichment)', () => {
beforeEach(() => {
mockUsersInfo.mockClear();
+ mockLogger.warn.mockClear();
});📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| beforeEach(() => { | |
| mockUsersInfo.mockClear(); | |
| }); | |
| test('returns real_name from users.info on first call and caches result', async () => { | |
| const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake'); | |
| const name1 = await adapter.fetchDisplayName('U123'); | |
| const name2 = await adapter.fetchDisplayName('U123'); | |
| expect(name1).toBe('Alice Liddell'); | |
| expect(name2).toBe('Alice Liddell'); | |
| // Second call hits the in-memory cache — no second API call. | |
| expect(mockUsersInfo).toHaveBeenCalledTimes(1); | |
| }); | |
| test('returns undefined and warn-logs on missing_scope failure', async () => { | |
| const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake'); | |
| const slackErr = Object.assign(new Error('missing_scope'), { | |
| data: { error: 'missing_scope' }, | |
| }); | |
| mockUsersInfo.mockRejectedValueOnce(slackErr); | |
| const name = await adapter.fetchDisplayName('U_NEW'); | |
| expect(name).toBeUndefined(); | |
| expect(mockLogger.warn).toHaveBeenCalled(); | |
| }); | |
| beforeEach(() => { | |
| mockUsersInfo.mockClear(); | |
| mockLogger.warn.mockClear(); | |
| }); | |
| test('returns real_name from users.info on first call and caches result', async () => { | |
| const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake'); | |
| const name1 = await adapter.fetchDisplayName('U123'); | |
| const name2 = await adapter.fetchDisplayName('U123'); | |
| expect(name1).toBe('Alice Liddell'); | |
| expect(name2).toBe('Alice Liddell'); | |
| // Second call hits the in-memory cache — no second API call. | |
| expect(mockUsersInfo).toHaveBeenCalledTimes(1); | |
| }); | |
| test('returns undefined and warn-logs on missing_scope failure', async () => { | |
| const adapter = new SlackAdapter('xoxb-fake', 'xapp-fake'); | |
| const slackErr = Object.assign(new Error('missing_scope'), { | |
| data: { error: 'missing_scope' }, | |
| }); | |
| mockUsersInfo.mockRejectedValueOnce(slackErr); | |
| const name = await adapter.fetchDisplayName('U_NEW'); | |
| expect(name).toBeUndefined(); | |
| expect(mockLogger.warn).toHaveBeenCalled(); | |
| }); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/adapters/src/chat/slack/adapter.test.ts` around lines 531 - 556, The
test suite's beforeEach only clears mockUsersInfo so mockLogger.warn can carry
state across tests and make the assertion flaky; update the beforeEach in this
describe block to also reset the logger mock (call mockLogger.warn.mockClear()
or equivalent) so that each test starts with a fresh mock for mockLogger.warn
when running SlackAdapter.fetchDisplayName.
| const err = error as Error & { data?: { error?: string } }; | ||
| const slackErrorCode = err.data?.error; | ||
| if (slackErrorCode === 'missing_scope') { | ||
| getLog().warn({ slackUserId, scope: 'users:read' }, 'slack.users_info_missing_scope'); | ||
| } else { | ||
| getLog().warn({ err, slackUserId, slackErrorCode }, 'slack.users_info_failed'); | ||
| } |
There was a problem hiding this comment.
Mask user identifiers in warning logs.
Line 302 and Line 304 log raw slackUserId. That exposes user identifiers in logs; mask it before logging.
🔐 Suggested fix
} catch (error) {
const err = error as Error & { data?: { error?: string } };
const slackErrorCode = err.data?.error;
+ const maskedUserId = `${slackUserId.slice(0, 4)}***`;
if (slackErrorCode === 'missing_scope') {
- getLog().warn({ slackUserId, scope: 'users:read' }, 'slack.users_info_missing_scope');
+ getLog().warn({ maskedUserId, scope: 'users:read' }, 'slack.users_info_missing_scope');
} else {
- getLog().warn({ err, slackUserId, slackErrorCode }, 'slack.users_info_failed');
+ getLog().warn({ err, maskedUserId, slackErrorCode }, 'slack.users_info_failed');
}
return undefined;
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/adapters/src/chat/slack/adapter.ts` around lines 299 - 305, Mask the
slackUserId before logging to avoid exposing PII: compute a masked value (e.g.,
take the first 8 chars then '...') and use that maskedId in both getLog().warn
calls instead of slackUserId; update references in the users info error handling
block surrounding the err/data parsing (the variables slackUserId,
slackErrorCode and the getLog() calls) so all logs use the masked value.
| const attributedLogin = event.comment?.user?.login ?? senderUsername; | ||
| let archonUserId: string | undefined; | ||
| if (attributedLogin) { | ||
| try { | ||
| const user = await userDb.findOrCreateUserByPlatformIdentity( | ||
| 'github', | ||
| attributedLogin, | ||
| attributedLogin | ||
| ); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate the file and print the relevant line ranges
file="packages/adapters/src/forge/github/adapter.ts"
if [ ! -f "$file" ]; then
echo "Missing file: $file" >&2
exit 1
fi
echo "=== Context around lines 930-990 ==="
nl -ba "$file" | sed -n '930,990p'
echo "=== Search for githubLogin usage ==="
rg -n "githubLogin" "$file" || true
echo "=== Search for structured logging patterns near this block ==="
# Print a bit more around any log calls in the neighborhood
rg -n "logger\.|createLogger|pino|log\(" "$file" | head -n 50
echo "=== Search for handleMessage(...) signature/call around the block ==="
rg -n "handleMessage\(" "$file" | head -n 50Repository: coleam00/Archon
Length of output: 158
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="packages/adapters/src/forge/github/adapter.ts"
if [ ! -f "$file" ]; then
echo "Missing file: $file" >&2
exit 1
fi
echo "=== Context around lines 930-990 ==="
cat -n "$file" | sed -n '930,990p'
echo "=== Search for githubLogin usage (anywhere in file) ==="
rg -n "githubLogin" "$file" || true
echo "=== Search for pino/log usage near this block ==="
rg -n "createLogger|pino|logger|log\(" "$file" | head -n 80
echo "=== Search for handleMessage(...) around this block ==="
rg -n "handleMessage\(" "$file" | head -n 80Repository: coleam00/Archon
Length of output: 3444
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "findOrCreateUserByPlatformIdentity" -S packages/adapters/src/forge/github/adapter.ts
rg -n "findOrCreateUserByPlatformIdentity" -S packages | head -n 50
# Try to locate the implementation file(s)
fd -t f "findOrCreateUserByPlatformIdentity" packages || true
fd -t f "*user*db*" packages | head -n 50Repository: coleam00/Archon
Length of output: 2301
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="packages/core/src/db/users.ts"
if [ ! -f "$file" ]; then
echo "Missing file: $file" >&2
exit 1
fi
echo "=== Context around findOrCreateUserByPlatformIdentity in $file ==="
# Print from a bit before the function to after it (avoid dumping whole file)
start=$(rg -n "export async function findOrCreateUserByPlatformIdentity" "$file" | head -n1 | cut -d: -f1)
echo "Function starts at line: $start"
from=$((start-10))
to=$((start+90))
sed -n "${from},${to}p" "$file" | cat -n
echo "=== Call sites to infer argument meaning ==="
rg -n "findOrCreateUserByPlatformIdentity\\(" packages | head -n 80Repository: coleam00/Archon
Length of output: 5578
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="packages/adapters/src/forge/github/adapter.ts"
echo "=== senderUsername definition (nearby occurrences) ==="
rg -n "senderUsername" "$file"
echo "=== attributedLogin usage occurrences ==="
rg -n "attributedLogin" "$file"
echo "=== githubLogin field occurrences ==="
rg -n "githubLogin" "$file"
echo "=== Context earlier around senderUsername + event.comment user access ==="
# Show a broader window around the attribution block
cat -n "$file" | sed -n '860,980p'Repository: coleam00/Archon
Length of output: 7526
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="packages/adapters/src/forge/github/adapter.ts"
echo "=== Imports mentioning webhooks / Octokit event types ==="
rg -n "`@octokit`|webhooks|issue_comment|pull_request|Event" "$file" | head -n 80
echo "=== Find where method signature types event ==="
rg -n "function|async function|handle.*\\(.*event" "$file" | head -n 80
echo "=== Search for 'event:' type annotations around handle path ==="
rg -n "event\\s*:\\s*" "$file" | head -n 80
echo "=== Show around the method that contains this block (near line 700-1000) for context ==="
sed -n '650,820p' "$file" | cat -nRepository: coleam00/Archon
Length of output: 9443
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Find the types file imported by adapter.ts
rg -n "from '\\./types'" packages/adapters/src/forge/github/adapter.ts
ls -la packages/adapters/src/forge/github
# Print the WebhookEvent typing area
types_file="packages/adapters/src/forge/github/types.ts"
if [ -f "$types_file" ]; then
echo "=== Contents around WebhookEvent in $types_file ==="
rg -n "export type WebhookEvent|export interface WebhookEvent|type WebhookEvent" "$types_file"
start=$(rg -n "export (type|interface) WebhookEvent" "$types_file" | head -n1 | cut -d: -f1 || true)
if [ -n "${start:-}" ]; then
from=$((start-20)); to=$((start+220));
sed -n "${from},${to}p" "$types_file" | cat -n
else
# fallback: just print first 260 lines
head -n 260 "$types_file" | cat -n
fi
else
echo "Missing expected file: $types_file" >&2
# List possible matches
fd -t f "types.*" packages/adapters/src/forge/github
fiRepository: coleam00/Archon
Length of output: 2613
🏁 Script executed:
#!/bin/bash
set -euo pipefail
types_file="packages/adapters/src/forge/github/types.ts"
echo "=== File size ==="
wc -l "$types_file"
echo "=== WebhookEvent user shape (search for id/login in this file) ==="
rg -n "interface .*User|user\\b|login\\b|\\bid\\b" "$types_file" | head -n 120
echo "=== Print types.ts (first 220 lines) ==="
sed -n '1,220p' "$types_file" | cat -n
echo "=== Print types.ts (lines 220-520) to catch comment.user definitions ==="
sed -n '220,520p' "$types_file" | cat -nRepository: coleam00/Archon
Length of output: 1570
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="packages/adapters/src/forge/github/adapter.ts"
echo "=== Search for comment.user.id usage ==="
rg -n "comment\\?\\.user\\?\\.id|comment\\.user\\.id|comment\\s*\\)\\?\\.user\\?\\.id" "$file" || true
echo "=== Search for sender.id usage ==="
rg -n "sender\\?\\.id|sender\\.id" "$file" || true
echo "=== Search for '.id' near comment.user ==="
rg -n "comment\\?\\.(user|sender).*id|user\\?\\.id|sender\\?\\.id" "$file" || trueRepository: coleam00/Archon
Length of output: 178
Use GitHub’s immutable user id for the Archon identity key (and don’t log raw logins)
- In
packages/adapters/src/forge/github/adapter.ts(around lines 947-956),findOrCreateUserByPlatformIdentity('github', attributedLogin, attributedLogin)uses the (mutable)loginasplatformUserId—butfindOrCreateUserByPlatformIdentitytreatsplatformUserIdas the unique identity key—so a GitHub rename will create a new Archon user. Passcomment.user.id/sender.idasplatformUserId(stringified) and keeploginonly as the display name; this requires extendingpackages/adapters/src/forge/github/types.tssinceWebhookEventcurrently types onlyuser.login/sender.login(noid). - In
packages/adapters/src/forge/github/adapter.ts(around lines 958-961), the warning logs{ githubLogin: attributedLogin }with the raw login; mask or omit it to avoid logging PII (the file already masks usernames for unauthorized attempts).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/adapters/src/forge/github/adapter.ts` around lines 947 - 955, The
code is using the mutable GitHub login as the unique platformUserId when calling
userDb.findOrCreateUserByPlatformIdentity; change the call in adapter.ts to pass
the immutable numeric GitHub id (stringified) — use event.comment?.user?.id or
event.sender?.id as platformUserId — and continue passing attributedLogin only
as the displayName; update the GitHub webhook types in
packages/adapters/src/forge/github/types.ts (add user.id and sender.id typings)
so the adapter can access those ids safely; additionally remove or mask raw
logins from warning/processLogger statements around the authorization check
(avoid logging attributedLogin directly, use masked username or omit the field).
| getLog().warn( | ||
| { err: toError(err), githubLogin: attributedLogin }, | ||
| 'github.user_resolve_failed' |
There was a problem hiding this comment.
Drop the raw GitHub login from this warning log.
githubLogin is user-identifying data, so persisting it in failure logs creates avoidable PII retention. Omit it or mask it before logging.
Suggested change
getLog().warn(
- { err: toError(err), githubLogin: attributedLogin },
+ { err: toError(err) },
'github.user_resolve_failed'
);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| getLog().warn( | |
| { err: toError(err), githubLogin: attributedLogin }, | |
| 'github.user_resolve_failed' | |
| getLog().warn( | |
| { err: toError(err) }, | |
| 'github.user_resolve_failed' |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/adapters/src/forge/github/adapter.ts` around lines 958 - 960, The
warning currently logs user-identifying data via the githubLogin field; remove
or mask that PII when calling getLog().warn in the github.user_resolve_failed
branch: stop passing attributedLogin as githubLogin (or replace it with a masked
value like attributedLogin.slice(0,8)+'...' if truly needed), keep the error via
toError(err) and the same event key, and ensure the call site that constructs
the log only includes non-PII fields so the log entry no longer contains the raw
GitHub login.
| }, | ||
| "scripts": { | ||
| "test": "bun test src/handlers/command-handler.test.ts && bun test src/handlers/clone.test.ts && bun test src/db/adapters/postgres.test.ts && bun test src/db/connection.test.ts && bun test src/db/adapters/sqlite.test.ts src/db/codebases.test.ts src/db/conversations.test.ts src/db/env-vars.test.ts src/db/isolation-environments.test.ts src/db/messages.test.ts src/db/sessions.test.ts src/db/workflow-events.test.ts src/db/workflows.test.ts src/utils/defaults-copy.test.ts src/utils/worktree-sync.test.ts src/utils/conversation-lock.test.ts src/utils/credential-sanitizer.test.ts src/utils/port-allocation.test.ts src/utils/error.test.ts src/utils/error-formatter.test.ts src/utils/github-graphql.test.ts src/config/ src/state/ && bun test src/utils/path-validation.test.ts && bun test src/services/cleanup-service.test.ts && bun test src/services/title-generator.test.ts && bun test src/workflows/ && bun test src/operations/workflow-operations.test.ts && bun test src/operations/isolation-operations.test.ts && bun test src/orchestrator/orchestrator.test.ts && bun test src/orchestrator/orchestrator-agent.test.ts && bun test src/orchestrator/orchestrator-isolation.test.ts", | ||
| "test": "bun test src/handlers/command-handler.test.ts && bun test src/handlers/clone.test.ts && bun test src/db/adapters/postgres.test.ts && bun test src/db/connection.test.ts && bun test src/db/adapters/sqlite.test.ts src/db/codebases.test.ts src/db/conversations.test.ts src/db/env-vars.test.ts src/db/isolation-environments.test.ts src/db/messages.test.ts src/db/sessions.test.ts src/db/users.test.ts src/db/workflow-events.test.ts src/db/workflows.test.ts src/utils/defaults-copy.test.ts src/utils/worktree-sync.test.ts src/utils/conversation-lock.test.ts src/utils/credential-sanitizer.test.ts src/utils/port-allocation.test.ts src/utils/error.test.ts src/utils/error-formatter.test.ts src/utils/github-graphql.test.ts src/config/ src/state/ && bun test src/utils/path-validation.test.ts && bun test src/services/cleanup-service.test.ts && bun test src/services/title-generator.test.ts && bun test src/workflows/ && bun test src/operations/workflow-operations.test.ts && bun test src/operations/isolation-operations.test.ts && bun test src/orchestrator/orchestrator.test.ts && bun test src/orchestrator/orchestrator-agent.test.ts && bun test src/orchestrator/orchestrator-isolation.test.ts", |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check if users.test.ts uses mock.module() which may require test isolation
# Search for mock.module usage in users.test.ts
echo "=== Checking users.test.ts for mock.module() usage ==="
rg -n "mock\.module\(" packages/core/src/db/users.test.ts
# Compare with other tests in the same batch to identify potential conflicts
echo -e "\n=== Checking other batched tests for mock.module() conflicts ==="
rg -n "mock\.module\(" packages/core/src/db/{codebases,conversations,env-vars,isolation-environments,messages,sessions,workflow-events,workflows}.test.ts
# Show isolated test files for reference
echo -e "\n=== Isolated test files (likely have conflicting mocks) ==="
rg -n "mock\.module\(" packages/core/src/handlers/{command-handler,clone}.test.ts packages/core/src/db/adapters/postgres.test.ts packages/core/src/db/connection.test.tsRepository: coleam00/Archon
Length of output: 3312
🏁 Script executed:
#!/bin/bash
set -euo pipefail
files=(
packages/core/src/db/users.test.ts
packages/core/src/db/workflows.test.ts
packages/core/src/db/workflow-events.test.ts
packages/core/src/db/sessions.test.ts
packages/core/src/db/messages.test.ts
packages/core/src/db/isolation-environments.test.ts
packages/core/src/db/conversations.test.ts
packages/core/src/db/codebases.test.ts
packages/core/src/db/env-vars.test.ts
)
for f in "${files[@]}"; do
echo "===== $f (mock.module ./connection and related) ====="
# show only the first ~80 lines to include mock.module declarations near the top
# and then also show the specific lines containing mock.module for precision
wc -l "$f" | awk '{print "lines:" $1}'
echo "--- mock.module occurrences ---"
rg -n "mock\.module\(" "$f" | head -n 20
echo "--- context around ./connection mocks ---"
rg -n "mock\.module\('./connection'" "$f" -n || true
# print around each mock.module line that mentions ./connection or `@archon/paths`
for pat in "mock\.module\('\./connection'" "mock\.module\(\"\.\/connection\"" "mock\.module\('`@archon/paths`'" "mock\.module\(\"`@archon/paths`\""; do
matches=$(rg -n "$pat" "$f" | cut -d: -f1 || true)
for ln in $matches; do
start=$((ln-15)); end=$((ln+35))
echo "----- $f:$ln (lines $start-$end) -----"
sed -n "${start},${end}p" "$f" | nl -ba | sed 's/^/ /'
done
done
echo
doneRepository: coleam00/Archon
Length of output: 497
🏁 Script executed:
#!/bin/bash
set -euo pipefail
files=(
packages/core/src/db/users.test.ts
packages/core/src/db/workflows.test.ts
packages/core/src/db/workflow-events.test.ts
packages/core/src/db/sessions.test.ts
packages/core/src/db/messages.test.ts
packages/core/src/db/isolation-environments.test.ts
packages/core/src/db/conversations.test.ts
packages/core/src/db/codebases.test.ts
packages/core/src/db/env-vars.test.ts
)
print_block() {
f="$1"; ln="$2"; label="$3"
start=$((ln-8))
if [ "$start" -lt 1 ]; then start=1; fi
end=$((ln+45))
echo "----- $f:$ln ($label) lines $start-$end -----"
awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%6d %s\n", NR, $0}' "$f"
echo
}
for f in "${files[@]}"; do
echo "================ $f ================"
echo "--- mock.module('{'./connection'}') occurrences ---"
rg -n "mock\.module\('\./connection'" "$f" || true
echo "--- mock.module('`@archon/paths`') occurrences ---"
rg -n "mock\.module\('`@archon/paths`'" "$f" || true
# For each connection mock, print a surrounding block
while IFS=: read -r path ln; do
print_block "$path" "$ln" "mock.module('./connection', ...)"
done < <(rg -n "mock\.module\('\./connection'" "$f" || true)
# For each paths mock, print a surrounding block
while IFS=: read -r path ln; do
print_block "$path" "$ln" "mock.module('`@archon/paths`', ...)"
done < <(rg -n "mock\.module\('`@archon/paths`'" "$f" || true)
doneRepository: coleam00/Archon
Length of output: 345
🏁 Script executed:
#!/bin/bash
set -euo pipefail
files=(
packages/core/src/db/users.test.ts
packages/core/src/db/workflows.test.ts
packages/core/src/db/workflow-events.test.ts
packages/core/src/db/sessions.test.ts
packages/core/src/db/messages.test.ts
packages/core/src/db/isolation-environments.test.ts
packages/core/src/db/conversations.test.ts
packages/core/src/db/codebases.test.ts
packages/core/src/db/env-vars.test.ts
)
print_block() {
f="$1"; ln="$2"; label="$3"
start=$((ln-8)); end=$((ln+45))
if [ "$start" -lt 1 ]; then start=1; fi
echo "----- $f:$ln ($label) lines $start-$end -----"
awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%6d %s\n", NR, $0}' "$f"
echo
}
for f in "${files[@]}"; do
echo "================ $f ================"
rg -n "mock\.module\('\./connection'" "$f" || true
rg -n "mock\.module\('`@archon/paths`'" "$f" || true
echo "--- blocks: mock.module('./connection' ...) ---"
rg -n "mock\.module\('\./connection'" "$f" | while IFS=: read -r path ln rest; do
print_block "$f" "$ln" "mock.module('./connection', ...)"
done
echo "--- blocks: mock.module('`@archon/paths`' ...) ---"
rg -n "mock\.module\('`@archon/paths`'" "$f" | while IFS=: read -r path ln rest; do
print_block "$f" "$ln" "mock.module('`@archon/paths`', ...)"
done
echo
doneRepository: coleam00/Archon
Length of output: 388
🏁 Script executed:
#!/bin/bash
set -euo pipefail
files=(
packages/core/src/db/users.test.ts
packages/core/src/db/workflows.test.ts
packages/core/src/db/workflow-events.test.ts
packages/core/src/db/sessions.test.ts
packages/core/src/db/messages.test.ts
packages/core/src/db/isolation-environments.test.ts
packages/core/src/db/conversations.test.ts
packages/core/src/db/codebases.test.ts
packages/core/src/db/env-vars.test.ts
)
print_block() {
f="$1"; ln="$2"; label="$3"
start=$((ln-8)); end=$((ln+60))
if [ "$start" -lt 1 ]; then start=1; fi
echo "----- $f:$ln ($label) lines $start-$end -----"
awk -v s="$start" -v e="$end" 'NR>=s && NR<=e {printf "%6d %s\n", NR, $0}' "$f"
echo
}
for f in "${files[@]}"; do
echo "================ $f ================"
# connection mocks
line_nums=$(rg -n "mock\\.module\\('\\./connection'" "$f" | cut -d: -f1 || true)
if [ -z "${line_nums}" ]; then
echo "--- blocks: mock.module('./connection' ...) (none) ---"
else
echo "--- blocks: mock.module('./connection' ...) ---"
for ln in $line_nums; do
print_block "$f" "$ln" "mock.module('./connection', ...)"
done
fi
# paths mocks
line_nums=$(rg -n "mock\\.module\\('`@archon/paths`'" "$f" | cut -d: -f1 || true)
if [ -z "${line_nums}" ]; then
echo "--- blocks: mock.module('`@archon/paths`' ...) (none) ---"
else
echo "--- blocks: mock.module('`@archon/paths`' ...) ---"
for ln in $line_nums; do
print_block "$f" "$ln" "mock.module('`@archon/paths`', ...)"
done
fi
echo
doneRepository: coleam00/Archon
Length of output: 34840
Fix Bun test isolation for users.test.ts
packages/core/src/db/users.test.ts mocks ./connection (includes getDatabase: { withTransaction }), but several other tests in the same packages/core/package.json test batch also call mock.module('./connection', ...) with different exported shapes (getDatabaseType in workflows.test.ts/messages.test.ts, only pool/getDialect in workflow-events.test.ts, etc.). Since Bun’s mock.module() persists across test files within the same invocation, this can cause mock pollution/order-dependent failures—move users.test.ts into its own bun test invocation (or group it with tests that mock ./connection identically).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/package.json` at line 26, The users.test.ts mock of
./connection leaks into other tests because Bun’s mock.module() persists across
files in the same test invocation; update the "test" script in package.json so
packages/core/src/db/users.test.ts is executed in its own bun test invocation
(or grouped only with tests that mock ./connection identically) instead of
inside the large chained invocation — locate the "test" script entry in
package.json and split out "src/db/users.test.ts" into a separate && bun test
... segment (or create a grouped segment with matching mocks) so mock.module()
state cannot pollute other tests like workflows.test.ts, messages.test.ts,
workflow-events.test.ts.
| export async function addMessage( | ||
| conversationId: string, | ||
| role: 'user' | 'assistant', | ||
| content: string, | ||
| metadata?: Record<string, unknown> | ||
| metadata?: Record<string, unknown>, | ||
| userId?: string | ||
| ): Promise<MessageRow> { | ||
| const dialect = getDialect(); | ||
| const result = await pool.query<MessageRow>( | ||
| `INSERT INTO remote_agent_messages (conversation_id, role, content, metadata, created_at) | ||
| VALUES ($1, $2, $3, $4, ${dialect.now()}) | ||
| `INSERT INTO remote_agent_messages (conversation_id, role, content, metadata, user_id, created_at) | ||
| VALUES ($1, $2, $3, $4, $5, ${dialect.now()}) | ||
| RETURNING *`, | ||
| [conversationId, role, content, JSON.stringify(metadata ?? {})] | ||
| [conversationId, role, content, JSON.stringify(metadata ?? {}), userId ?? null] |
There was a problem hiding this comment.
Reject userId for assistant messages.
addMessage currently allows role: 'assistant' with non-null userId, which can silently corrupt attribution data. Add an explicit guard before insert.
Proposed fix
export async function addMessage(
conversationId: string,
role: 'user' | 'assistant',
content: string,
metadata?: Record<string, unknown>,
userId?: string
): Promise<MessageRow> {
+ if (role === 'assistant' && userId !== undefined) {
+ throw new Error('assistant messages must not include userId');
+ }
+
const dialect = getDialect();
const result = await pool.query<MessageRow>(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/db/messages.ts` around lines 30 - 42, add a guard at the
start of addMessage to reject any non-null userId when role === 'assistant' by
throwing a clear Error before performing the INSERT; specifically, in the
addMessage function check the role and userId and throw (with a short message)
if an assistant message is supplied a userId so we never pass an invalid user_id
into the remote_agent_messages insert, keeping the DB attribution consistent.
| mock.module('./connection', () => ({ | ||
| pool: { query: mockQuery }, | ||
| getDialect: () => mockPostgresDialect, | ||
| getDatabase: () => ({ withTransaction: mockWithTransaction }), | ||
| })); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find process-global mock.module usage across TS test files.
rg -nP --type=ts "mock\.module\(" packages
# Expected: avoid internal-module mock.module usage (or isolate conflicting files into separate bun test invocations).Repository: coleam00/Archon
Length of output: 33399
🏁 Script executed:
rg -nP --type=ts "mock\.module\(\s*'\\./connection'" packagesRepository: coleam00/Archon
Length of output: 780
🏁 Script executed:
rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/dbRepository: coleam00/Archon
Length of output: 780
🏁 Script executed:
rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/dbRepository: coleam00/Archon
Length of output: 780
🏁 Script executed:
rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/dbRepository: coleam00/Archon
Length of output: 780
🏁 Script executed:
rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/dbRepository: coleam00/Archon
Length of output: 780
🏁 Script executed:
rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/dbRepository: coleam00/Archon
Length of output: 780
🏁 Script executed:
rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/dbRepository: coleam00/Archon
Length of output: 780
🏁 Script executed:
rg -nP --type=ts "mock\.module\('\./connection'" packages/core/src/dbRepository: coleam00/Archon
Length of output: 780
Replace mock.module('./connection', ...) with spyOn in packages/core/src/db/users.test.ts
mock.module() is process-global/irreversible in Bun, and ./connection is also mocked via mock.module() in multiple other packages/core/src/db/*.test.ts files, so this can contaminate other suites. Use spyOn()-based mocking for the imported ./connection module (restore with spy.mockRestore()).
mock.module('./connection', () => ({
pool: { query: mockQuery },
getDialect: () => mockPostgresDialect,
getDatabase: () => ({ withTransaction: mockWithTransaction }),
}));🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/db/users.test.ts` around lines 15 - 19, Replace the
process-global mock.module('./connection', ...) call with a spyOn-based mock
that targets the imported connection module in users.test.ts: create a spy via
spyOn(importedConnectionModule, ...) to stub pool.query (mockQuery), getDialect
(returning mockPostgresDialect) and getDatabase (returning { withTransaction:
mockWithTransaction }) and ensure the spy is restored at test teardown with
spy.mockRestore() so other test suites are not contaminated.
| getLog().warn( | ||
| { identityId: existing.id, platform, platformUserId }, | ||
| 'user.identity_orphan_repair_started' |
There was a problem hiding this comment.
Do not log raw platform user identifiers or display names.
These events currently log platformUserId and displayName directly, which can leak PII into logs. Log only internal IDs (or a one-way hash/truncated surrogate) instead.
As per coding guidelines, "Use structured logging with Pino via createLogger() from @archon/paths; event naming: {domain}.{action}_{state} ... include context (IDs, durations, errors); never log API keys, tokens (mask as token.slice(0, 8) + '...'), message content, or PII".
Also applies to: 92-94, 106-107, 110-110
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/db/users.ts` around lines 69 - 71, The logs call
getLog().warn/getLog().info (e.g., the user.identity_orphan_repair_started
event) with raw PII variables platformUserId and displayName; replace those
direct values with non-PII surrogates (log existing.id and platform only, and
for platformUserId/displayName either omit them or log a one-way hash/truncated
surrogate such as sha256(platformUserId).slice(0,8) or
platformUserId.slice(0,8)+'...'); update all getLog() calls in this file that
reference platformUserId or displayName (including the occurrences around
user.identity_orphan_repair_started and the other calls noted) to use the
masked/hashed value or internal id only, keeping event names and structured keys
intact.
| } = context ?? {}; | ||
| try { | ||
| getLog().debug({ conversationId }, 'orchestrator_message_received'); | ||
| getLog().debug({ conversationId, userId }, 'orchestrator_message_received'); |
There was a problem hiding this comment.
Do not log the raw userId.
Line 653 adds the Archon user UUID to orchestrator logs, which creates a new user-identifier trail in log storage. Omit it or reduce it to a non-identifying flag/hash.
Proposed fix
- getLog().debug({ conversationId, userId }, 'orchestrator_message_received');
+ getLog().debug({ conversationId, hasUserId: Boolean(userId) }, 'orchestrator_message_received');As per coding guidelines, "Use structured logging with Pino via createLogger() from @archon/paths... never log API keys, tokens (mask as token.slice(0, 8) + '...'), message content, or PII".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| getLog().debug({ conversationId, userId }, 'orchestrator_message_received'); | |
| getLog().debug({ conversationId, hasUserId: Boolean(userId) }, 'orchestrator_message_received'); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/orchestrator/orchestrator-agent.ts` at line 653, The
current getLog().debug call logs the raw userId (PII); update the log in
orchestrator-agent.ts to stop emitting the raw userId—either remove userId from
the structured payload or replace it with a non-identifying value (e.g., a
masked or hashed token produced by a helper like maskUserId(userId) or
hashUserId(userId)) and keep conversationId as-is; modify the call to
getLog().debug({ conversationId, user: maskUserId(userId) },
'orchestrator_message_received') or simply getLog().debug({ conversationId },
'orchestrator_message_received') so no raw user UUID is written.
| } catch (err) { | ||
| getLog().warn( | ||
| { err: err as Error, platform, platformUserId: String(platformUserId) }, | ||
| `${platform}.user_resolve_failed` | ||
| ); |
There was a problem hiding this comment.
Mask or drop the platform user identifier from this warning log.
Line 122 writes the raw Slack/Discord/Telegram user identifier into logs on lookup failure. That creates avoidable PII retention for a non-critical warning path; keep the platform and error, but omit or mask the identifier.
Suggested change
} catch (err) {
getLog().warn(
- { err: err as Error, platform, platformUserId: String(platformUserId) },
+ { err: err as Error, platform },
`${platform}.user_resolve_failed`
);
return undefined;
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/server/src/index.ts` around lines 120 - 124, The warning currently
logs the raw platform user id in the structured payload; remove or redact it to
avoid PII retention by replacing platformUserId in the getLog().warn call with
either no identifier or a masked value (e.g., compute maskedPlatformUserId =
String(platformUserId).slice(0,8) + '...' before logging) and keep the rest of
the payload (err and platform) and the same message key
`${platform}.user_resolve_failed`; update the call sites around getLog().warn to
use the maskedPlatformUserId or omit that field entirely.
… identity type union
Critical fix: SQLite migrateColumns ALTERs now include ON DELETE SET NULL on
all four new user_id / created_by_user_id FK columns. Upgraded SQLite DBs
previously inherited the default NO ACTION ≈ RESTRICT semantic, contradicting
the PR's documented "no destructive cascade on user deletion" guarantee.
Hardening on the new user-identity surface:
- findOrCreateUserByPlatformIdentity narrows its race-recovery catch to true
UNIQUE-constraint violations (PG sqlstate 23505 or SQLite "UNIQUE
constraint failed" message). Any other error logs as user.create_failed
and propagates — no more masking generic DB failures as recoverable races.
- backfillDisplayName wraps its UPDATE in try/catch with a dedicated warn
event. A failed opportunistic backfill must not silently fail the entire
user resolution path; the caller already has the resolved user row.
- repairOrphanedIdentity now logs user.identity_orphan_repair_failed on
transaction failure (previously surfaced only as a generic resolve_failed
upstream).
- New IdentityPlatform literal union ('slack' | 'telegram' | 'discord' |
'github' | 'web' | 'cli') replaces the unconstrained `platform: string`
on UserIdentity and the findOrCreate signature. Typos now fail at compile
time rather than silently breaking the UNIQUE(platform, platform_user_id)
invariant.
- user.create_started/_completed/_failed are now properly paired per the
project event-naming convention.
Slack adapter:
- users.info missing_scope WARN now gated by an instance flag so it fires
once per adapter lifetime instead of once per unknown user. The
misconfiguration is permanent — flooding logs after every restart in a
100-user workspace was the wrong shape.
- users_info_failed log strips err (which can include err.data with
workspace metadata) in favor of structured errMessage / slackErrorCode
/ slackUserId fields. No PII through the log pipeline.
Server resolver:
- resolveUserId exported for testability and now logs as a single static
event server.user_resolve_failed (platform in structured fields) instead
of the templated ${platform}.user_resolve_failed which collided with the
GitHub adapter's own event name.
- Dead `=== null` branch removed (TypeScript already narrows the type).
GitHub adapter:
- User-identity resolution moved up to immediately after self-filtering
+ @mention checks. Now runs before the codebase-ensure and comment-
history Octokit calls so resolution can't be silently skipped by an
upstream Octokit failure (which was masking a missing-mock bug in the
existing test suite).
Tests:
- New packages/server/src/resolve-user-id.test.ts covers the never-throws
contract that three adapter handlers depend on. 6 cases including the
static-event-name regression.
- GitHub adapter test now mocks @archon/core/db/users and covers the
comment.user.login ?? sender.login attribution fallback in both
directions, plus a never-throws case for resolution failure.
- users.test.ts gains the asymmetric-backfill case, backfill-failure-
does-not-block-resolution case, both PG and SQLite UNIQUE-error shapes
for race recovery, and a non-UNIQUE-rethrows-without-recovery case
that explicitly counts the query calls.
- isolation-environments.test.ts adds a "ON CONFLICT does NOT update
created_by_user_id" regression guard so a copy-paste in the SET clause
can't silently transfer ownership across re-activations.
Comment cleanup: stripped the (PR-A) / (until PR-C) / pre-PR-A history
labels from production types, migrations, and source files. They were PR-
state markers that would rot on merge; the substantive WHY content stays.
Docs:
- CLAUDE.md table count: 8 → 10; users and user_identities documented.
- docs-web/reference/database.md: 8 → 10 with explicit ON DELETE semantics
and a note that re-running 000_combined.sql is idempotent and picks up
the new ALTERs.
- docs-web/reference/architecture.md: 7-table diagram → 10-table; full
schema block extended with the new tables and user_id columns.
- docs-web/adapters/slack.md: users:read scope added to the Bot Token
Scopes setup with a note about graceful degradation if omitted.
Skipped (with reason):
- Converting User/UserIdentity to z.infer<typeof schema>: all sibling row
interfaces in types/index.ts are hand-crafted; doing this for just the
two new types creates inconsistency. A separate consistency pass should
convert the whole file, not selectively.
- Threading userId into the four web/CLI addMessage callsites: those
surfaces don't have an auth flow yet, so threading now means passing
`undefined` from every caller. Added explicit TODOs at each callsite
pointing at the upcoming web/CLI auth work instead.
…tion routing (#1788) Phase 2 of the team-foundation PRD. Replaces the bot's single shared GITHUB_TOKEN PAT with a registered GitHub App that supports multi-installation token routing from day one. New @archon/core/github-auth/ module wrapping @octokit/auth-app with a three-level cache: - lookupCache: owner/repo → installationId (1h TTL; evicted on 401) - tokenCache: installationId → access token (1h GitHub TTL, refreshed 5min before expiry) - octokitCache: installationId → Octokit (per-installation auth strategy; evicted on 401 so the SDK's hidden internal token state can't keep serving the dead token) GitHubAdapter takes a `GitHubAuth` discriminated union at construction. All 4 Octokit callsites (postComment / listComments / repos.get / pulls.get) plus the clone path route through resolveOctokit + a withTokenRefresh wrapper that calls invalidateRepo and retries once on 401. Webhook event.installation.id primes the lookup cache to skip a round-trip. Secondary self-filter compares against `<slug>[bot]` in App mode (via a botLogin getter, distinct from botMention) so PR-C's per-user tokens won't trip it. Server bootstrap detects App vs PAT mode via env and fails fast if both are configured. In App mode it registers the provider on a module singleton consumed by createWorkflowDeps(), so the workflow executor's bash/script subprocesses inherit a fresh GH_TOKEN/GITHUB_TOKEN. New POST /internal/git-credential endpoint (App mode only) backs a POSIX git credential helper installed at clone time, covering workflows that outlive the 1h installation-token expiry. The public-bind guard runs BEFORE Bun.serve so a rejected config never opens the listening socket — opt-out via ARCHON_ALLOW_INTERNAL_ON_PUBLIC_BIND=1 for deployments where the reverse proxy already drops /internal/*. Refactor + extracted helpers in server/src/github-auth-bootstrap.ts (selectGitHubAuthMode + parseGitCredentialPath) so the security-critical decisions are testable in isolation without spinning up Hono. Backwards compat: solo installs running GITHUB_TOKEN only see zero functional change. All 54 existing PAT-mode adapter tests pass unchanged. Tests added: 23 strictly-mocked auth-module tests (PRD Q7 — no live api.github.com in CI); 10 new App-mode adapter tests (multi-install routing, payload short-circuit, 401 retry + retry-on-retry propagation, AppNotInstalledError surfacing, clone-token resolution, post-clone credential helper install); 20 server-bootstrap unit tests (dual-mode fail-fast, /internal path validation incl. traversal + null bytes). Closes phase 2 of .claude/PRPs/prds/github-app-and-user-identity.prd.md. Depends on #1783 (PR-A user-identity foundation).
Summary
GITHUB_TOKEN, no users table, every conversation/message/workflow_run unattributed. A small team sharing one VPS can't tell who triggered which run, and every teammate's PR comments appear under the same GitHub avatar.remote_agent_users+remote_agent_user_identities), nullableuser_idon the four primary tables, and end-to-end plumbing of a stable Archonuser_idfrom every chat/forge adapter (Slack, Telegram, Discord, GitHub) throughHandleMessageContextinto conversations, workflow runs, and isolation environments. Slack additionally enriches first-sight users with their real name viausers.info.user_id = NULL(follow-up PR wires theX-Archon-Userheader andarchon auth githubdevice flow). No GitHub App swap yet (separate follow-up). No ACLs, norequires: [github]workflow gate, no UI changes.UX Journey
Before
After
Architecture Diagram
Before
After
Connection inventory:
app.client.users.infomissing_scope/rate-limitserver/index.ts(3 chat handlers)userDb.findOrCreateUserByPlatformIdentityresolveUserId()helper (DRY across Slack/Telegram/Discord)forge/github/adapter.tsuserDb.findOrCreateUserByPlatformIdentitycomment.user.login ?? sender.loginbeforehandleMessageHandleMessageContextgetOrCreateConversationuserIdparam flows through; first-user-wins on subsequent thread repliesorchestrator-agent(dispatchOrchestratorWorkflow,handleStreamMode,handleBatchMode,handleWorkflowInvocationResult,handleWorkflowRunCommand)executeWorkflow/validateAndResolveIsolation/dispatchBackgroundWorkflowuserId?threadingIsolationRequestIsolationStore.createcreated_by_user_idpropagated; not overwritten on ON CONFLICT (preserves original creator)IWorkflowStore.createWorkflowRunpayloaduser_idcolumn populated by executor optdb/users.tsgetDatabase().withTransactiondb/index.tsuserDbnamespaceconversationDb/codebaseDbpatternLabel Snapshot
risk: low(additive schema; nullable everywhere; backwards-compat verified by retained NULL paths)size: L(29 files, +926/-71)core,adapters,server,workflows,isolationcore:db,core:orchestrator,adapters:slack,adapters:telegram,adapters:discord,adapters:github,server:handlers,workflows:executor,isolation:resolverChange Metadata
featuremultiLinked Issue
feat/keycloak-multi-user-authbranch (PR feat: Keycloak/OIDC multi-user auth with per-user GitHub OAuth #1774); the Keycloak/OIDC layer in that branch is explicitly out of scopeValidation Evidence (required)
check:bundled+check:bundled-skill: up-to-datetype-checkacross 10 workspace packageslint--max-warnings 0: cleanformat:check: cleantest: per-package isolated batches all green; the newdb/users.test.tscovers 10 cases (fast-path, opportunistic backfill, no-redundant-backfill, fresh create, race recovery, rethrow on non-race error, orphan repair,getUserById/updateUserDisplayName); Slack adapter test gains 4 cases forfetchDisplayName(real_name + cache, missing_scope graceful fallback, empty-userId short-circuit, retry after transient failure)Security Impact (required)
users.info, which requires the bot token scopeusers:read. Operators with an existing Slack app must reinstall to grant the scope. Failure returnsmissing_scopeand we warn-log + continue without a display name; no message is ever dropped for this reason.users.infocall per first-sighting of each Slack user id, dedupe-cached in memory for the adapter's lifetime. Negative results are not cached (retry on next sighting). No new calls for Telegram/Discord/GitHub.users.emailis stored plaintext (low-sensitivity public identifier; the GitHub tokens themselves stay out of scope until the follow-up device-flow PR, where they'll be AES-256-GCM encrypted).user_idFKs useON DELETE SET NULLon all four tables — future user deletion (out of scope here) will never cascade destructively.Compatibility / Migration
users:readbot scope to unlock display-name enrichment, but the adapter degrades gracefully if the scope is absent.psql $DATABASE_URL < migrations/000_combined.sql. Adds twoCREATE TABLE IF NOT EXISTS+ fourALTER TABLE ... ADD COLUMN IF NOT EXISTS. Second run is a no-op. Existing rows haveuser_id = NULL.migrateColumns()PRAGMA-based pattern atpackages/core/src/db/adapters/sqlite.ts— no manual step. Existing rows haveuser_id = NULL.Human Verification (required)
What was personally validated beyond CI:
~/.archon/archon.db, server starts, new tables exist (.schema remote_agent_users,.schema remote_agent_user_identities).migrateColumns()ALTERs verified via code review against the existing pattern atsqlite.ts:164-218.users.infomissing_scopefailure (graceful degradation, message still flows)users.infotransient failure (retried on next sighting — negative results not cached)event.user(resolution short-circuits, no DB write)SELECT u.display_name, ui.platform_user_id FROM remote_agent_users u JOIN remote_agent_user_identities ui ON ui.user_id = u.idreturns the expected row.Side Effects / Blast Radius (required)
user_id = NULLuntil the follow-up PR.users.infocall per first-sight user. Cached in memory after that. Tier 4 Slack rate limits (~100+/min) leave very wide headroom.findOrCreateUserByPlatformIdentitynow sits in the hot path of every chat-platform message. The fast path is two SELECTs (identity + user) — negligible overhead vs the existing per-message DB cost.user.create_started/_completed/_failed,user.create_race_recovered,user.identity_orphan_repair_started/_completed,slack.users_info_missing_scope,slack.users_info_failed,<platform>.user_resolve_failed. Search for any*_failedto detect issues.Rollback Plan (required)
git revert <commit>(single commit; reverts both code and schema migration cleanly — the additive ALTER columns become unused but harmless on the rollback; tighter cleanup would be a separate PR if desired).user.create_failedlogs spiking → DB connection or constraint issue.slack.users_info_missing_scopelogs → users will see emptydisplay_name; not a regression, but operators should reinstall the Slack app to fix.*_resolve_failed→ user_id stays NULL but messages still flow (warn-and-continue invariant).Risks and Mitigations
users.infomissing_scopeerrors after deploy if operators don't reinstall the app.user_idrecorded by Slack id (identity created without display_name). Documented in this PR description and in the security-impact section above.usersrows.UNIQUE(platform, platform_user_id)on the identity row trips the second writer; the transaction rolls back and we re-SELECT the winner. Covered by therecovers from race when UNIQUE constraint firestest case.repairs orphaned identitytest case.feat/keycloak-multi-user-authbranch (PR feat: Keycloak/OIDC multi-user auth with per-user GitHub OAuth #1774) if that branch resurfaces.Summary by CodeRabbit
New Features
Documentation