feat(console): console promotion — pending-input banner, project-scoped agent chat, manage_run native tool (Claude + Pi)#1819
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a project-scoped console chat (conversation primitives, HTTP skills, per-conversation SSE), new chat UI (composer, stream, tabs, ChatPage route), cache revalidation (stale-while-revalidate via revalidate/useSyncExternalStore), pending-input promotion/banner and run-card flag, plus native-tool support with a manage_run tool and provider MCP wiring. ChangesConsole Agent Chat Feature
Native Tooling and Manage-Run
🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs:
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
packages/web/src/experiments/console/skills/conversations.ts (1)
37-41: ⚡ Quick winReplace
Parameters<...>indirection with an explicit raw type.The
Parameters<typeof toConversationSummary>[0][]annotation hides the actual shape behind type meta-programming. ExportingRawConversationand referencing it directly reads clearer and keeps the data contract explicit.♻️ Proposed refactor
In
primitives/conversation.ts:-interface RawConversation { +export interface RawConversation {In this file:
-import { toConversationSummary, type ConversationSummary } from '../primitives/conversation'; +import { + toConversationSummary, + type ConversationSummary, + type RawConversation, +} from '../primitives/conversation'; @@ - const raw = await requestJson<Parameters<typeof toConversationSummary>[0][]>( + const raw = await requestJson<RawConversation[]>( `/api/conversations?codebaseId=${encodeURIComponent(projectId)}` );As per coding guidelines: "prefer explicit branches and typed interfaces over hidden dynamic behavior."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/web/src/experiments/console/skills/conversations.ts` around lines 37 - 41, The listConversations function uses Parameters<typeof toConversationSummary>[0][] which hides the incoming shape; instead export and use an explicit RawConversation type from primitives/conversation.ts and update listConversations to call requestJson<RawConversation[]>('/api/conversations?codebaseId=...') and then map(raw => toConversationSummary(raw)) — ensure primitives/conversation.ts exports RawConversation and update the import in this file so the function signature and requestJson generic use RawConversation[] while keeping the return type Promise<ConversationSummary[]>.packages/web/src/experiments/console/lib/sse.ts (1)
170-226: ⚡ Quick win
onLockChangedependency inuseConversationSSE—current usage is stable, so churn is unlikely
ChatPagepassessetLockeddirectly (useStatesetter), which is stable; so includingonLockChangein the effect deps won’t cause the “reconnect on every render” churn for this caller. TheuseRef-stabilization approach is only worth it if other callers ever pass a non-memoized inline callback.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/web/src/experiments/console/lib/sse.ts` around lines 170 - 226, The change that stabilizes onLockChange via a useRef is unnecessary because useState setters (e.g., setLocked) are already stable; revert any refactor and keep the effect dependency array as [conversationPlatformId, onLockChange] inside useConversationSSE, ensuring es.onmessage still calls onLockChange?.(ev.locked) directly and no useRef indirection is introduced.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/web/src/experiments/console/components/ChatComposer.tsx`:
- Around line 59-71: The textarea in ChatComposer.tsx never receives the
disabled prop, so when the component is locked or sending the field remains
editable; update the <textarea> to forward the disabled boolean (the same prop
used for the placeholder) so the DOM element truly becomes readonly and the CSS
class disabled:opacity-50 takes effect; locate the JSX using textareaRef, value,
setValue, grow, onKeyDown and MAX_HEIGHT and add disabled={disabled} to that
element (ensuring existing behavior for placeholder using disabledReason remains
unchanged).
In `@packages/web/src/experiments/console/skills/conversations.ts`:
- Around line 59-78: The multipart fetch call in conversations.ts (the fetch
invocation that builds a FormData and posts it) omits credentials, causing
parity mismatch with requestJson which sets credentials: 'same-origin'; update
the fetch call options for the POST (the fetch(url, { method: 'POST', body: form
})) to include credentials: 'same-origin' so same-origin cookies/CSRF are
preserved; apply the same change in the analogous startRun.ts multipart upload
code to keep behavior consistent and future-proof absolute/CORS URL changes.
---
Nitpick comments:
In `@packages/web/src/experiments/console/lib/sse.ts`:
- Around line 170-226: The change that stabilizes onLockChange via a useRef is
unnecessary because useState setters (e.g., setLocked) are already stable;
revert any refactor and keep the effect dependency array as
[conversationPlatformId, onLockChange] inside useConversationSSE, ensuring
es.onmessage still calls onLockChange?.(ev.locked) directly and no useRef
indirection is introduced.
In `@packages/web/src/experiments/console/skills/conversations.ts`:
- Around line 37-41: The listConversations function uses Parameters<typeof
toConversationSummary>[0][] which hides the incoming shape; instead export and
use an explicit RawConversation type from primitives/conversation.ts and update
listConversations to call
requestJson<RawConversation[]>('/api/conversations?codebaseId=...') and then
map(raw => toConversationSummary(raw)) — ensure primitives/conversation.ts
exports RawConversation and update the import in this file so the function
signature and requestJson generic use RawConversation[] while keeping the return
type Promise<ConversationSummary[]>.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0435aa53-8c38-4e66-9dfd-0e3a3ab58559
📒 Files selected for processing (14)
packages/web/src/experiments/console/ConsoleApp.tsxpackages/web/src/experiments/console/agent-chat-spike.mdpackages/web/src/experiments/console/components/ActiveRunCard.tsxpackages/web/src/experiments/console/components/ChatComposer.tsxpackages/web/src/experiments/console/components/ChatStream.tsxpackages/web/src/experiments/console/components/PendingInputBanner.tsxpackages/web/src/experiments/console/components/ProjectViewTabs.tsxpackages/web/src/experiments/console/lib/sse.tspackages/web/src/experiments/console/primitives/conversation.tspackages/web/src/experiments/console/routes/ChatPage.tsxpackages/web/src/experiments/console/routes/RunsPage.tsxpackages/web/src/experiments/console/skills/conversations.tspackages/web/src/experiments/console/skills/index.tspackages/web/src/experiments/console/store/keys.ts
| <textarea | ||
| ref={textareaRef} | ||
| value={value} | ||
| onChange={e => { | ||
| setValue(e.target.value); | ||
| grow(e.target); | ||
| }} | ||
| onKeyDown={onKeyDown} | ||
| rows={1} | ||
| placeholder={disabled ? (disabledReason ?? 'Waiting…') : 'Message the agent…'} | ||
| className="min-h-[40px] flex-1 resize-none rounded border border-border bg-surface-inset px-3 py-2 text-[13px] text-text-primary placeholder:text-text-tertiary focus:border-border-bright focus:outline-none disabled:opacity-50" | ||
| style={{ maxHeight: `${MAX_HEIGHT.toString()}px` }} | ||
| /> |
There was a problem hiding this comment.
Apply the disabled prop to the textarea.
The disabled prop is never forwarded to the <textarea>, so the disabled:opacity-50 class never activates and the field stays editable while locked || sending. Submission is blocked by submit(), but the user can still type into a "locked" composer with no visual feedback.
🛠️ Proposed fix
<textarea
ref={textareaRef}
value={value}
+ disabled={disabled}
onChange={e => {
setValue(e.target.value);
grow(e.target);
}}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <textarea | |
| ref={textareaRef} | |
| value={value} | |
| onChange={e => { | |
| setValue(e.target.value); | |
| grow(e.target); | |
| }} | |
| onKeyDown={onKeyDown} | |
| rows={1} | |
| placeholder={disabled ? (disabledReason ?? 'Waiting…') : 'Message the agent…'} | |
| className="min-h-[40px] flex-1 resize-none rounded border border-border bg-surface-inset px-3 py-2 text-[13px] text-text-primary placeholder:text-text-tertiary focus:border-border-bright focus:outline-none disabled:opacity-50" | |
| style={{ maxHeight: `${MAX_HEIGHT.toString()}px` }} | |
| /> | |
| <textarea | |
| ref={textareaRef} | |
| value={value} | |
| disabled={disabled} | |
| onChange={e => { | |
| setValue(e.target.value); | |
| grow(e.target); | |
| }} | |
| onKeyDown={onKeyDown} | |
| rows={1} | |
| placeholder={disabled ? (disabledReason ?? 'Waiting…') : 'Message the agent…'} | |
| className="min-h-[40px] flex-1 resize-none rounded border border-border bg-surface-inset px-3 py-2 text-[13px] text-text-primary placeholder:text-text-tertiary focus:border-border-bright focus:outline-none disabled:opacity-50" | |
| style={{ maxHeight: `${MAX_HEIGHT.toString()}px` }} | |
| /> |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/web/src/experiments/console/components/ChatComposer.tsx` around
lines 59 - 71, The textarea in ChatComposer.tsx never receives the disabled
prop, so when the component is locked or sending the field remains editable;
update the <textarea> to forward the disabled boolean (the same prop used for
the placeholder) so the DOM element truly becomes readonly and the CSS class
disabled:opacity-50 takes effect; locate the JSX using textareaRef, value,
setValue, grow, onKeyDown and MAX_HEIGHT and add disabled={disabled} to that
element (ensuring existing behavior for placeholder using disabledReason remains
unchanged).
| // Multipart path: don't set Content-Type — the browser adds the boundary. | ||
| const form = new FormData(); | ||
| form.append('message', message); | ||
| for (const file of files) { | ||
| form.append('files', file, file.name); | ||
| } | ||
| const res = await fetch(url, { method: 'POST', body: form }); | ||
| if (!res.ok) { | ||
| const text = await res.text().catch(() => ''); | ||
| let parsed: { error?: string } = {}; | ||
| try { | ||
| parsed = JSON.parse(text) as { error?: string }; | ||
| } catch { | ||
| /* not JSON */ | ||
| } | ||
| const raw = parsed.error ?? (text.length > 0 ? text : `HTTP ${res.status.toString()}`); | ||
| const msg = raw.length > 200 ? `${raw.slice(0, 200)}...` : raw; | ||
| const path = new URL(url, window.location.origin).pathname; | ||
| throw new HttpError(res.status, path, msg); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Inspect requestJson to see what defaults the multipart fetch must replicate.
fd 'http.ts' --type f | xargs rg -n -A40 'export (async )?function requestJson'Repository: coleam00/Archon
Length of output: 814
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the exact multipart fetch code + nearby context.
sed -n '1,200p' packages/web/src/experiments/console/skills/conversations.ts | nl -ba | sed -n '40,120p'
# Find any other upload/multipart patterns in the same file.
rg -n "new FormData\(|multipart|files'" packages/web/src/experiments/console/skills/conversations.ts -n
# Inspect requestJson usage in nearby code for how headers/auth are typically handled.
rg -n "requestJson\(" packages/web/src/experiments/console/skills/conversations.ts -n
# Locate mergeHeaders usage/imports to see if any auth/CSRF defaults are expected.
rg -n "mergeHeaders|`@archon`|CSRF|credentials" packages/web/src/experiments/console/skills/conversations.ts packages/web/src/experiments/console/skills -S || trueRepository: coleam00/Archon
Length of output: 102
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect the exact multipart fetch code + nearby context with line numbers.
cat -n packages/web/src/experiments/console/skills/conversations.ts | sed -n '40,140p'
# Find any other multipart/FormData usage in the same file.
rg -n "new FormData\(|files'" packages/web/src/experiments/console/skills/conversations.ts
# Locate requestJson usage in nearby code for how it handles defaults.
rg -n "requestJson\(" packages/web/src/experiments/console/skills/conversations.ts
# Look for credentials/auth/CSRF patterns in this area.
rg -n "credentials|CSRF|X-CSRF|Authorization|mergeHeaders" packages/web/src/experiments/console/skills/conversations.tsRepository: coleam00/Archon
Length of output: 1667
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Dump the whole file (it appears small) with line numbers.
wc -l packages/web/src/experiments/console/skills/conversations.ts
cat -n packages/web/src/experiments/console/skills/conversations.ts
# Find where requestJson is defined (from earlier evidence, likely in packages/.../http.ts).
# Search for "mergeHeaders" and "export async function requestJson" definitions.
rg -n "export async function requestJson" packages -S
rg -n "function mergeHeaders|mergeHeaders" packages -S | head -n 50Repository: coleam00/Archon
Length of output: 3877
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Inspect requestJson + helpers fully
cat -n packages/web/src/experiments/console/lib/http.ts
# Find "startRun" multipart path reference and inspect it.
fd -t f 'startRun' packages/web/src/experiments/console -x echo
fd -t f 'startRun\.ts' packages/web/src/experiments/console | head -n 20
rg -n "Multipart|new FormData\(|files'" packages/web/src/experiments/console -S
rg -n "startRun" packages/web/src/experiments/console/skills -S || true
# Specifically locate any other multipart fetch in the console experiments.
rg -n "fetch\\(.*body: form\\b|new FormData\\(\\)" packages/web/src/experiments/console -SRepository: coleam00/Archon
Length of output: 4311
🏁 Script executed:
#!/bin/bash
set -euo pipefail
cat -n packages/web/src/experiments/console/skills/startRun.ts | sed -n '1,120p'Repository: coleam00/Archon
Length of output: 3252
Multipart fetch parity with requestJson: only credentials differs (and it’s likely redundant for same-origin relative URLs).
requestJson injects only credentials: 'same-origin' (and Content-Type: application/json for non-FormData bodies). The multipart code in conversations.ts (and similarly in startRun.ts) omits credentials, but both use relative /api/... URLs, so the browser’s default should already be same-origin; auth/CSRF headers aren’t being dropped here.
For consistency and future safety (if URLs become absolute/CORS), set credentials: 'same-origin' on the multipart fetch.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/web/src/experiments/console/skills/conversations.ts` around lines 59
- 78, The multipart fetch call in conversations.ts (the fetch invocation that
builds a FormData and posts it) omits credentials, causing parity mismatch with
requestJson which sets credentials: 'same-origin'; update the fetch call options
for the POST (the fetch(url, { method: 'POST', body: form })) to include
credentials: 'same-origin' so same-origin cookies/CSRF are preserved; apply the
same change in the analogous startRun.ts multipart upload code to keep behavior
consistent and future-proof absolute/CORS URL changes.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/web/src/experiments/console/routes/ChatPage.tsx`:
- Around line 75-103: The settle timer is currently tied only to the trailing
assistant message and a weak sig (list.length:last.id), which lets stale timers
cancel active turns; fix by gating the timer to the active turn (derive and
store an activeTurnId when a trailing user message appears) and only arm/reset
the timer for that same activeTurnId; also strengthen settleSigRef to include a
content/version signal from the assistant message (e.g., last.content or
last.updatedAt/streamVersion along with last.id) so streamed updates advance the
sig and re-arm the timer correctly; update references: settleTimerRef,
settleSigRef, messages, setBusy, SETTLE_MS to use the activeTurnId guard and the
richer sig.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e11dfe71-9020-4405-922c-b2dced25ee33
📒 Files selected for processing (1)
packages/web/src/experiments/console/routes/ChatPage.tsx
| // Derive turn state from the trailing message: a user message means a reply | ||
| // is pending; once an assistant reply lands and stays stable for SETTLE_MS the | ||
| // turn is done. This also recovers a reload mid-turn (trailing user message). | ||
| const settleTimerRef = useRef<ReturnType<typeof setTimeout> | null>(null); | ||
| const settleSigRef = useRef(''); | ||
| useEffect(() => { | ||
| const list = messages ?? []; | ||
| const last = list[list.length - 1]; | ||
| if (last === undefined) return; | ||
| if (last.role === 'user') { | ||
| settleSigRef.current = ''; | ||
| if (settleTimerRef.current !== null) { | ||
| clearTimeout(settleTimerRef.current); | ||
| settleTimerRef.current = null; | ||
| } | ||
| setBusy(true); | ||
| return; | ||
| } | ||
| // Trailing message is an assistant/system reply. Arm the settle timer once; | ||
| // re-arm only on real content change so identical poll refetches (same sig) | ||
| // don't reset it forever. | ||
| const sig = `${list.length}:${last.id}`; | ||
| if (sig === settleSigRef.current) return; | ||
| settleSigRef.current = sig; | ||
| if (settleTimerRef.current !== null) clearTimeout(settleTimerRef.current); | ||
| settleTimerRef.current = setTimeout(() => { | ||
| setBusy(false); | ||
| }, SETTLE_MS); | ||
| }, [messages]); |
There was a problem hiding this comment.
Tie the settle timer to the active turn, not just the trailing assistant message.
This timeout is armed for any trailing assistant/system message, even when the chat is already idle. If the page is sitting on an old assistant reply and the user sends again before that timeout fires, the stale timer can flip busy back to false mid-request. Also, sig only uses list.length and last.id, so streamed updates to the same assistant message never reset the timer. Please gate the settle timer on the current pending turn and base sig on a real content/version signal from last.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/web/src/experiments/console/routes/ChatPage.tsx` around lines 75 -
103, The settle timer is currently tied only to the trailing assistant message
and a weak sig (list.length:last.id), which lets stale timers cancel active
turns; fix by gating the timer to the active turn (derive and store an
activeTurnId when a trailing user message appears) and only arm/reset the timer
for that same activeTurnId; also strengthen settleSigRef to include a
content/version signal from the assistant message (e.g., last.content or
last.updatedAt/streamVersion along with last.id) so streamed updates advance the
sig and re-arm the timer correctly; update references: settleTimerRef,
settleSigRef, messages, setBusy, SETTLE_MS to use the activeTurnId guard and the
richer sig.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/web/src/experiments/console/store/cache.ts`:
- Around line 182-195: The hook's useSyncExternalStore subscription only returns
cache.get(key) so changes to errors or inflight don't change the snapshot and
React won't re-render; update the snapshot to include a version or a tuple so
any mutation causes a new reference. Specifically, modify the getSnapshot used
in useSyncExternalStore (and the value your subscribe invalidates) to return
either a { data, error, loadingVersion } or a small epoch/version number that
you increment whenever cache, errors, or inflight change (referencing
useSyncExternalStore, subscribe, cache.get, errors.get, inflight, and
revalidate); increment that version on mutations and read it in getSnapshot so
error/loading changes produce new object identities and trigger re-renders.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6042461e-d7c7-40fe-9278-583635a4f1c5
📒 Files selected for processing (1)
packages/web/src/experiments/console/store/cache.ts
| const data = useSyncExternalStore( | ||
| subscribe, | ||
| () => cache.get(key) as T | undefined, | ||
| () => cache.get(key) as T | undefined | ||
| ); | ||
|
|
||
| return { | ||
| data: cache.get(key) as T | undefined, | ||
| data, | ||
| error: errors.get(key), | ||
| loading: !cache.has(key) && inflight.has(key), | ||
| refetch: (): void => { | ||
| cache.delete(key); | ||
| errors.delete(key); | ||
| notify(key); | ||
| ensureLoad(key); | ||
| revalidate(key); | ||
| }, | ||
| }; |
There was a problem hiding this comment.
Error and loading state changes do not trigger re-renders.
useSyncExternalStore only re-renders when getSnapshot returns a different value (by Object.is). Since the snapshot reads only cache.get(key):
- Initial load failure: cache stays
undefined, snapshot unchanged → no re-render → stuck atloading: trueforever - Revalidate failure: stale cache reference unchanged → no re-render → error never shown
error and loading are computed during render but never re-read if React doesn't re-render.
Consider tracking a version/epoch that increments on any mutation (cache, error, or inflight change), or store a { data, error } tuple per key so either change produces a new reference.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/web/src/experiments/console/store/cache.ts` around lines 182 - 195,
The hook's useSyncExternalStore subscription only returns cache.get(key) so
changes to errors or inflight don't change the snapshot and React won't
re-render; update the snapshot to include a version or a tuple so any mutation
causes a new reference. Specifically, modify the getSnapshot used in
useSyncExternalStore (and the value your subscribe invalidates) to return either
a { data, error, loadingVersion } or a small epoch/version number that you
increment whenever cache, errors, or inflight change (referencing
useSyncExternalStore, subscribe, cache.get, errors.get, inflight, and
revalidate); increment that version on mutations and read it in getSnapshot so
error/loading changes produce new object identities and trigger re-renders.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/web/src/experiments/console/routes/ChatPage.tsx (1)
172-190:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winHoist
currentActivity’suseMemoabove theprojectIdearly return to keep hook order stable.
packages/web/src/experiments/console/routes/ChatPage.tsxcallsuseMemoafterif (projectId === undefined) return ...;. IfprojectIdcan beundefinedon one render and defined on another, React can throw a hooks-order error. MovemessageList/useMemoabove the early return (or otherwise ensure the component never returns before the hook).🐛 Proposed fix — hoist above the early return
+ const messageList = messages ?? []; + + // Current activity for the working indicator: the latest tool the agent + // invoked in the in-flight turn (walk back to the last user message). + const currentActivity = useMemo<string | null>(() => { + for (let i = messageList.length - 1; i >= 0; i--) { + const m = messageList[i]; + if (m === undefined) continue; + if (m.role === 'user') break; + if (m.role === 'assistant' && m.toolCalls.length > 0) { + return m.toolCalls[m.toolCalls.length - 1]?.name ?? null; + } + } + return null; + }, [messageList]); + if (projectId === undefined) { return <EmptyState title="No project selected." />; } - - const messageList = messages ?? []; - - // Current activity for the working indicator: the latest tool the agent - // invoked in the in-flight turn (walk back to the last user message). - const currentActivity = useMemo<string | null>(() => { - for (let i = messageList.length - 1; i >= 0; i--) { - const m = messageList[i]; - if (m === undefined) continue; - if (m.role === 'user') break; - if (m.role === 'assistant' && m.toolCalls.length > 0) { - return m.toolCalls[m.toolCalls.length - 1]?.name ?? null; - } - } - return null; - }, [messageList]);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/web/src/experiments/console/routes/ChatPage.tsx` around lines 172 - 190, The useMemo that computes currentActivity (using messageList and useMemo) is declared after the early return for projectId and can break hook order; move the messageList and the useMemo/currentActivity declaration above the if (projectId === undefined) return ... check in ChatPage.tsx so the hook always runs in the same order, or ensure the component never returns before invoking useMemo—update references to projectId as needed so the early-return remains correct after hoisting.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@packages/web/src/experiments/console/routes/ChatPage.tsx`:
- Around line 172-190: The useMemo that computes currentActivity (using
messageList and useMemo) is declared after the early return for projectId and
can break hook order; move the messageList and the useMemo/currentActivity
declaration above the if (projectId === undefined) return ... check in
ChatPage.tsx so the hook always runs in the same order, or ensure the component
never returns before invoking useMemo—update references to projectId as needed
so the early-return remains correct after hoisting.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3f31941b-6fc2-4024-a66c-db83347f5f07
📒 Files selected for processing (7)
packages/web/src/experiments/console/components/ChatStream.tsxpackages/web/src/experiments/console/components/RunLifecycle.tsxpackages/web/src/experiments/console/components/WorkflowDock.tsxpackages/web/src/experiments/console/components/WorkingIndicator.tsxpackages/web/src/experiments/console/console-open-questions.mdpackages/web/src/experiments/console/routes/ChatPage.tsxpackages/web/src/experiments/console/routes/RunDetailPage.tsx
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
packages/core/src/orchestrator/manage-run-tool.ts (1)
60-60: 💤 Low valueLogging emits terminal states without a paired
_started.
manage_run.list_completed,manage_run.get_completed, andmanage_run.failedare logged, but there is no corresponding_startedevent, so a failure can't be correlated to an initiation in the logs.As per coding guidelines: event naming
{domain}.{action}_{state}and "always pair_startedwith_completedor_failed".Also applies to: 69-69, 94-94
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/orchestrator/manage-run-tool.ts` at line 60, The code logs terminal events like "manage_run.list_completed", "manage_run.get_completed" and "manage_run.failed" without emitting matching "manage_run.*_started" events; add paired "_started" logs immediately before the corresponding operations so every completed/failed log has a matching started log. Locate the places that currently call log.info/log.error for "manage_run.list_completed", "manage_run.get_completed", and "manage_run.failed" in manage-run-tool.ts and insert corresponding log.info calls for "manage_run.list_started", "manage_run.get_started", and "manage_run.started" (or "manage_run.<action>_started") with the same metadata object (e.g., { action, codebaseId: ctx.codebaseId }) so states can be correlated.packages/web/src/experiments/console/components/WorkflowDock.tsx (2)
124-132: ⚡ Quick winButton is interactive when navigation target is unavailable.
The button remains clickable even when
run.projectIdisnull, resulting in a no-op click. The hover states still apply, creating a misleading affordance. Consider disabling the button or omitting it whenprojectIdis unavailable.Disable button when projectId is null
<button type="button" + disabled={run.projectId === null} onClick={() => { if (run.projectId !== null) navigate(`/console/p/${run.projectId}/r/${run.id}`); }} - className="ml-auto shrink-0 font-mono text-[11px] text-text-tertiary transition-colors hover:text-text-primary" + className="ml-auto shrink-0 font-mono text-[11px] text-text-tertiary transition-colors hover:text-text-primary disabled:cursor-not-allowed disabled:opacity-50" > Open logs → </button>Apply the same fix to the
DockCardbutton on line 147.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/web/src/experiments/console/components/WorkflowDock.tsx` around lines 124 - 132, The "Open logs →" button in WorkflowDock is interactive even when run.projectId is null; update the button (component WorkflowDock) to be disabled when run.projectId === null (add the disabled attribute and aria-disabled) and conditionally remove/replace hover/cursor classes so hover states and pointer don't apply when disabled; keep the onClick logic using navigate(`/console/p/${run.projectId}/r/${run.id}`) but it will no longer be reachable when disabled. Apply the identical change to the DockCard "Open logs" button so both buttons use the same disabled logic and class toggling based on run.projectId.
147-153: ⚡ Quick winSame navigation affordance issue in
DockCard.This button has the same pattern as
ApprovalDockCardabove — interactive whenprojectIdis null. Apply the same disabled-state fix here for consistency.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/web/src/experiments/console/components/WorkflowDock.tsx` around lines 147 - 153, The button in WorkflowDock (same pattern as ApprovalDockCard) is still clickable when run.projectId is null; update the button element so it is disabled when run.projectId === null (add the disabled attribute and aria-disabled), guard or remove the onClick handler when disabled (only call navigate(`/console/p/${run.projectId}/r/${run.id}`) if run.projectId != null), and apply the same visual disabled classes used in ApprovalDockCard (e.g., cursor-not-allowed/opacity styles) so the component (WorkflowDock / DockCard) has consistent disabled-state affordance.packages/core/src/orchestrator/orchestrator-agent.ts (1)
989-994: ⚡ Quick winUse truthy check for
codebase_idto match file patterns.Line 992 uses
conversation.codebase_id !== null, but the rest of this file consistently uses truthy checks for the same field (lines 542, 758, 952). A truthy check would handle bothnullandundefinedconsistently and align with the established pattern.Align with existing pattern
- if (conversation.codebase_id !== null && getProviderCapabilities(providerKey).nativeTools) { + if (conversation.codebase_id && getProviderCapabilities(providerKey).nativeTools) { requestOptions.nativeTools = [buildManageRunTool({ codebaseId: conversation.codebase_id })]; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/orchestrator/orchestrator-agent.ts` around lines 989 - 994, The condition uses a strict null check for conversation.codebase_id but the file consistently uses truthy checks; change the guard to use a truthy check (e.g., if (conversation.codebase_id) ) so undefined values are handled the same way as null and keep behavior aligned with other checks (affecting the block that sets requestOptions.nativeTools using buildManageRunTool and getProviderCapabilities(providerKey)).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/providers/package.json`:
- Line 36: Verify end-to-end that the Zod v4 schemas defined in native-tools.ts
produce the expected tool input schemas when passed to tool(): run the runtime
path that constructs each tool (calls tool()) and assert the resulting
inputSchema/shape matches expectations; if you observe mismatches or known Zod‑4
conversion edge cases, either adjust the native-tools.ts definitions to a
compatible form or switch to the SDK’s recommended helper (or pin Zod to the
SDK‑compatible version) so tool() receives a supported schema shape. Ensure you
exercise every schema used by native-tools.ts and document which fix (helper vs.
downgrade) you applied.
---
Nitpick comments:
In `@packages/core/src/orchestrator/manage-run-tool.ts`:
- Line 60: The code logs terminal events like "manage_run.list_completed",
"manage_run.get_completed" and "manage_run.failed" without emitting matching
"manage_run.*_started" events; add paired "_started" logs immediately before the
corresponding operations so every completed/failed log has a matching started
log. Locate the places that currently call log.info/log.error for
"manage_run.list_completed", "manage_run.get_completed", and "manage_run.failed"
in manage-run-tool.ts and insert corresponding log.info calls for
"manage_run.list_started", "manage_run.get_started", and "manage_run.started"
(or "manage_run.<action>_started") with the same metadata object (e.g., {
action, codebaseId: ctx.codebaseId }) so states can be correlated.
In `@packages/core/src/orchestrator/orchestrator-agent.ts`:
- Around line 989-994: The condition uses a strict null check for
conversation.codebase_id but the file consistently uses truthy checks; change
the guard to use a truthy check (e.g., if (conversation.codebase_id) ) so
undefined values are handled the same way as null and keep behavior aligned with
other checks (affecting the block that sets requestOptions.nativeTools using
buildManageRunTool and getProviderCapabilities(providerKey)).
In `@packages/web/src/experiments/console/components/WorkflowDock.tsx`:
- Around line 124-132: The "Open logs →" button in WorkflowDock is interactive
even when run.projectId is null; update the button (component WorkflowDock) to
be disabled when run.projectId === null (add the disabled attribute and
aria-disabled) and conditionally remove/replace hover/cursor classes so hover
states and pointer don't apply when disabled; keep the onClick logic using
navigate(`/console/p/${run.projectId}/r/${run.id}`) but it will no longer be
reachable when disabled. Apply the identical change to the DockCard "Open logs"
button so both buttons use the same disabled logic and class toggling based on
run.projectId.
- Around line 147-153: The button in WorkflowDock (same pattern as
ApprovalDockCard) is still clickable when run.projectId is null; update the
button element so it is disabled when run.projectId === null (add the disabled
attribute and aria-disabled), guard or remove the onClick handler when disabled
(only call navigate(`/console/p/${run.projectId}/r/${run.id}`) if run.projectId
!= null), and apply the same visual disabled classes used in ApprovalDockCard
(e.g., cursor-not-allowed/opacity styles) so the component (WorkflowDock /
DockCard) has consistent disabled-state affordance.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f7bc8be6-5b82-4a74-9d30-09ee185ccb28
⛔ Files ignored due to path filters (1)
bun.lockis excluded by!**/*.lock
📒 Files selected for processing (12)
packages/core/src/orchestrator/manage-run-tool.tspackages/core/src/orchestrator/orchestrator-agent.tspackages/providers/package.jsonpackages/providers/src/claude/capabilities.tspackages/providers/src/claude/native-tools.tspackages/providers/src/claude/provider.tspackages/providers/src/codex/capabilities.tspackages/providers/src/community/copilot/capabilities.tspackages/providers/src/community/opencode/capabilities.tspackages/providers/src/community/pi/capabilities.tspackages/providers/src/types.tspackages/web/src/experiments/console/components/WorkflowDock.tsx
PR Review Summary — Multi-Agent (#1819)7 specialized agents reviewed the diff (44 files, +2,667 / −56):
Critical Issues (1)
Important Issues (13)
Suggestions (S)Types (type-design-analyzer — verdict ADEQUATE, 6/10):
Tests (pr-test-analyzer — verdict ADEQUATE):
Error handling / UX (silent-failure-hunter):
Simplifications (code-simplifier — 6 found, all functionality-preserving):
Docs (docs-impact — low urgency):
Strengths
Documentation Issues (consolidated)
Verdict: NEEDS FIXESResolve the merge conflict, fix C1 (the short-ID bug breaks the tool's core loop), and address the error-handling/observability items I1, I2, I5, I6 plus the doc contradictions I9, I10, I11 and the cheap consensus fix I4. The type/test suggestions (S1–S6) are worthwhile follow-ups but needn't block. No security regressions found — scoping, confirm-gate, and capability-gated injection are correctly implemented. Recommended order
Advisory review — no files were modified. |
Runs paused on a human gate (approval node or an agent question) now surface in an attention banner pinned above the runs feed, on both the All-projects and project-scoped views. It stays visible regardless of the active filter or scroll position until the user acts on it or dismisses it. The banner reuses the existing ApprovalContext + ApprovalPanel surfaces, so the action flow is identical to the inline paused card. While a run is shown in the banner its feed card collapses to a pointer to avoid a duplicate live panel; dismissing the banner restores the inline panel so the run can still be actioned from the feed. Pending runs are derived from the unfiltered run set so they surface even when the feed is filtered or searched. Dismiss is session-only — the run is still paused, so it re-surfaces on reload — and dismissed ids are pruned once a run leaves the paused state so it re-surfaces if it pauses again.
Design/exploration doc for promoting the console toward a project-scoped agent chat (tab swap with the Run view). Captures the reuse map vs. the old chat UI, the project-scoped chat assembly (mostly existing console pieces + 3 new conversation skills), and the run-management tooling direction. Key finding: no single tool mechanism spans all providers. MCP covers Claude/Codex/OpenCode; in-process custom tools cover Claude + Pi only. Since Pi becomes the main runner ~June 2026 and has no MCP, the tool handler is built once behind a provider-neutral spec with two registration backends (MCP + Pi customTools). Rollout leads with the in-process providers (Claude + Pi), then served MCP for Codex/OpenCode. Mapping only — no chat code yet. Project memory deferred.
Add a Runs | Chat tab swap under each project. The chat is plain conversational AI assembled from existing console pieces: messages load via useEntity(K.messages), a new useConversationSSE keeps them live by invalidating on text/tool events (and surfaces conversation_lock to disable the composer), and rendering reuses MessageItem/ToolCallItem/StreamCard. A new skills/conversations.ts (create/list/send) is the sanctioned home for the conversation concept the console previously hid in startRun.ts. The console's invalidate-refetch SSE model removes the need for the old chat's client-side streaming state machine. No backend changes; scoped entirely to packages/web/src/experiments/console. Run management by the agent (manage_run tool) is the planned follow-on. Also records the resolved tooling decisions (tiered confirm, JSON-Schema canonical schema, deferred Wave-2 transport) in agent-chat-spike.md §10.
Drive the "awaiting reply" state from message content + the send action instead of the per-conversation SSE lock event. The recovery poll is no longer gated on a `conversation_lock` event that the cross-origin dev SSE can drop or never deliver — so the composer can't stick on "Waiting…" and replies are polled for regardless of SSE health. Adds a settle window (re-enable once the trailing message is a stable assistant reply) and a hard cap so a turn that never produces a reply can't disable the composer forever. SSE remains a best-effort accelerator for snappy live updates. Note: live updates still depend on the shared useEntity refetch path, which has a separate reliability issue under investigation (cache refetch can stall after the first invalidation while mounted). Persisted state + on-mount rendering are correct; navigating/reloading always shows the latest.
…ernalStore) Two compounding bugs made the console's live updates fail — a refetched value landed in the cache but never appeared on screen until a remount (navigation / reload), so the chat reply never streamed in and the composer stuck on "Waiting…". 1. invalidate() deleted the cache key before refetching, so every SSE/poll tick flashed the subscriber to undefined/empty before the data returned. At poll cadence the list flickered between full and empty and the derived busy/settle/scroll logic thrashed on the empty frames. Replace the delete-then-load with revalidate(): refetch in place and swap on resolve, keeping the previous value visible (stale-while-revalidate). Unsubscribed keys still drop so a later mount fetches fresh. 2. useEntity subscribed with a manual useState(n => n + 1), which can commit a stale render against an externally-mutated store — the render function ran with the fresh value but the DOM kept an older tree. Switch to useSyncExternalStore for tear-free, consistent reads. Verified end-to-end in the browser: agent replies now render live and the composer re-enables without navigating. Fixes the foundation the chat working indicator and workflow dock depend on.
Three UX passes now that live updates work: 1. Chat declutter — ChatStream hides raw tool-call cards and framework/empty rows by default; a single "Agent is working · <tool>" pill stands in while a turn is active (WorkingIndicator), toggling the inline trace on click. 2. Workflow dock — a pinned tray below the chat stream listing the project's in-progress runs (WorkflowDock): hidden at 0, the richer card at 1, a collapsible "Running workflows (N) ▸" with status dots at 2+. Each card links to the run-detail logs. Live via the dashboard SSE. 3. Run-detail lifecycle lines — the log view now bookends the stream with "▶ Workflow <name> started" and a terminal summary "✓ Completed in <duration> · <cost>" (RunLifecycle), so the logs read as a complete story rather than a tool dump. Node transitions already render. Verified live: chat shows clean prose with the working pill; run log shows the started + finished lines.
Paused-on-approval runs in the dock now render the agent's question + Approve/Reject inline (reusing ApprovalContext + ApprovalPanel, same approve/reject + comment-injection as elsewhere), so you can resolve a gate without leaving the chat. Approval cards are always shown (a paused run needs you); plain running runs keep the 1-vs-N collapse. Resolving invalidates the runs cache so the card clears live.
…nly)
Give the project-scoped chat agent a native tool to see this project's
workflow runs, the first step of agent-as-operator.
Cross-provider plumbing:
- New `NativeTool` on the provider contract ({name, description, JSON-Schema
inputSchema, in-process handler}) + a `nativeTools` capability flag. The
handler closes over live context, so @archon/providers never imports core —
the tool crosses the boundary as data on SendQueryOptions.
- Claude adapter (native-tools.ts): a JSON-Schema→Zod-shape converter (narrow,
fail-fast) feeding createSdkMcpServer()/tool(); registered in sendQuery as an
in-process `archon` MCP server (mcp__archon__*, alwaysLoad for Haiku),
mirroring the file-based mcp branch. Adds zod to @archon/providers.
- Capabilities: nativeTools = true for claude + pi, false for codex/opencode/
copilot (no in-process tool path).
The tool (core/orchestrator/manage-run-tool.ts):
- manage_run with read actions list/get — list shows recent runs (id, workflow,
status, step); get shows one run's detail. Errors caught, returned as text.
- Injected on the direct-chat path only when a codebase is scoped and the
provider supports native tools.
Verified live: a project chat asks "list the runs" → agent calls
mcp__archon__manage_run → handler lists 20 runs → agent answers with the count
and the most-recent run's status. Start + the Pi adapter + destructive writes
are follow-ups.
Adds the `start` action to manage_run: the agent resolves a workflow by name
(4-tier resolveWorkflowName) and dispatches it via dispatchBackgroundWorkflow
(its own isolated worktree, fire-and-forget). The dispatch context (platform,
conversation, cwd, codebase, available workflows, userId) is assembled in the
orchestrator and passed as a `startWorkflow` callback so the tool stays
decoupled from the dispatch machinery. Unknown/ambiguous names return a
friendly text result; start is rejected when the dispatch context is absent.
Verified live: a project chat asks the agent to start archon-assist → agent
calls manage_run({action:'start', workflow, message}) → workflow dispatches,
runs in a worktree, and completes; the agent reports it started.
Pi adapter for the provider-neutral NativeTool, completing Wave 1 across both in-process providers (Claude + Pi, the current and ~June-2026 primary). - native-tools.ts: JSON-Schema→TypeBox converter (same narrow string/enum subset, fail-fast) feeding `defineTool`; each tool's text handler result becomes the ToolDefinition's content. - provider.ts: merge the native ToolDefinitions into `customTools`. Because setting customTools forces noTools:'builtin' (which drops Pi's defaults), the base tool set is re-supplied via the new buildDefaultPiTools() helper, so the agent keeps bash/read/edit/write alongside manage_run. - Lazy-loaded via the existing dynamic-import block (avoids Pi's config.js eager-load). Type-checks across the repo; the tool + core handler are identical to the verified Claude path. Live Pi verification needs Pi configured as the assistant (not set up in this dev env).
…e, lifecycle writes, Pi Expand the manage_run native tool from read-only (list/get/start) to the full run-management surface, with progressive disclosure and project scoping. - help: action returns the action catalog; help + subtool returns per-action detail, so the model's tool surface stays small (progressive disclosure). - Lifecycle writes resume/cancel/abandon/approve/reject go through the shared core workflow-operations functions — identical semantics to the CLI and command-handler (state mutation; no server coupling). - Tiered confirmation: destructive actions (cancel/abandon/approve/reject) require confirm:true; without it the tool returns a preview and asks the agent to confirm with the user (no mid-turn UI primitive required). - getScopedRun: every by-id action verifies the run belongs to the chat's project, so an agent scoped to project A cannot read or mutate project B's runs (generalizes the earlier get-only scope concern). - Schema converters (Claude Zod + Pi TypeBox) gain boolean support for the confirm param. Verified live on Claude and Pi (minimax MiniMax-M2.7): both invoke the tool, help works, and the boolean confirm round-trips through TypeBox as a real boolean. Unit tests: handler dispatch/scope/confirm-gate (18) plus converter fail-fast and boolean-acceptance (Claude 4, Pi 4).
…ility These failed at the branch HEAD; the prior commits were only validated with type-check and lint, not the test suite. - claude/provider.test.ts, codex/provider.test.ts, registry.test.ts asserted the full ProviderCapabilities set via toEqual but were never updated when nativeTools was added to the interface — add nativeTools to the expectations (claude true, codex false, mock false). - pi/provider.test.ts: the @earendil-works/pi-coding-agent mock was missing defineTool, a value import pulled in by ./native-tools (added when manage_run was wired into Pi). Bun enforces the binding for value imports, so every Pi sendQuery threw "Export named 'defineTool' not found" — 56 tests failing. Add a defineTool stub to the mock.
5f890d8 to
0217bdf
Compare
PR Review Summary — multi-agent passSix specialized agents reviewed the diff (base Branch state: Critical Issues (2) — block merge
C1 fix: add prefix matching to Important Issues (5) — address before merge
Suggestions (6)
Documentation
Strengths
Verdict: NEEDS FIXESC1 (broken by-id action surface) and C2 (Zod convention) before merge; I1–I5 are quick and high-value (mostly wording + a few tests + two error-surfacing fixes). Architecture (provider-neutral Note: the 🤖 Multi-agent review (code · docs · tests · comments · errors · types) |
…anage_run + chat Critical: - manage_run by-id actions now resolve the short ids the tool actually shows. getScopedRun looks runs up by id prefix scoped to the codebase (new findWorkflowRunsByIdPrefix), so the 8-char ids in list/get output work for get/resume/cancel/abandon/approve/reject; ambiguous prefixes are reported and project scoping is enforced in the query. Writes pass the verified full id to the operations. - Document the @archon/providers direct-zod import exception in CLAUDE.md and at the import site (Claude SDK tool() needs a Zod shape; this SDK-deps-only leaf package must not pull in Hono via @hono/zod-openapi). Important: - resume help/response no longer claim it "marks resumable" — it only validates eligibility and does not restart the run. - cancel help no longer claims it "stops" a running process — it marks the run cancelled; an executing step may finish first. - ChatPage surfaces conversation/message (re)load errors instead of silently showing stale or empty data. - startWorkflow logs dispatch failures with the workflow name and returns a clear error to the agent. - Completed confirm-gate test coverage: preview-without-confirm for abandon/approve/reject, approve interactive-loop branch, reject rework branch, and an ambiguous-prefix case. Suggestions: - Differentiated confirm-preview wording: cancel/abandon (irreversible) vs approve/reject (human gate). - Dropped the unsound `as Action` cast; added runId to the failure log context. - Pi native-tool label derived from the tool name, not hardcoded "Manage runs". - Corrected NativeTool JSDoc, cache.ts LOC note, and useDashboardSSE mount note. Docs: - console-open-questions.md §2/§5 + priority list (manage_run Wave 1 and inline approvals shipped in this PR); console README /chat route; ai-assistants Pi capability row for in-process native tools. Deferred (proportionality): a typed NativeToolInputSchema (runtime validation already covers the single-tool surface) and an orchestrator-level injection integration test (two-clause guard whose halves are already covered).
Review findings addressed — pushed
|
…l providers Now that #1819's native manage_run tool is in dev, the CLI run-management pointer (buildRunManagementSection) is redundant for Claude/Pi and would steer them onto a bash path needing `archon` on PATH. Gate it: the orchestrator appends it ONLY for project-scoped chats on providers WITHOUT nativeTools (Codex/OpenCode/Copilot); Claude/Pi get the native tool and are nudged to it. - Move the append out of the pure buildOrchestratorSystemAppend into orchestrator-agent.ts next to the nativeTools check; look caps up lazily (scoped chats only). - prompt-builder tests: assert buildOrchestratorSystemAppend does NOT include the section (orchestrator gates it); keep the buildRunManagementSection content test. - orchestrator-agent test: add buildRunManagementSection to the ./prompt-builder mock (the new export it now calls — otherwise the full --filter run replaces the module and the call is undefined). - Document the console-dock limitation: detached CLI runs appear in the dock (listed by project) but don't stream live and post no chat card, because they run out-of-process — use the native tool for the live experience.
…l providers Now that #1819's native manage_run tool is in dev, the CLI run-management pointer (buildRunManagementSection) is redundant for Claude/Pi and would steer them onto a bash path needing `archon` on PATH. Gate it: the orchestrator appends it ONLY for project-scoped chats on providers WITHOUT nativeTools (Codex/OpenCode/Copilot); Claude/Pi get the native tool and are nudged to it. - Move the append out of the pure buildOrchestratorSystemAppend into orchestrator-agent.ts next to the nativeTools check; look caps up lazily (scoped chats only). - prompt-builder tests: assert buildOrchestratorSystemAppend does NOT include the section (orchestrator gates it); keep the buildRunManagementSection content test. - orchestrator-agent test: add buildRunManagementSection to the ./prompt-builder mock (the new export it now calls — otherwise the full --filter run replaces the module and the call is undefined). - Document the console-dock limitation: detached CLI runs appear in the dock (listed by project) but don't stream live and post no chat card, because they run out-of-process — use the native tool for the live experience.
Summary
/invoke-workflowtext protocol). Live updates also flickered/never settled.Runs | Chattab swap) with a docked workflow tray (1-vs-N collapse) that renders inline approvals, a "working" pill, and run-detail lifecycle lines; (3) a foundational live-cache fix (stale-while-revalidate +useSyncExternalStore); (4) themanage_runnative tool — one tool with progressive disclosure (help), project scoping, and the full lifecycle surface (list/get/start/resume/cancel/abandon/approve/reject), provider-neutral and wired for Claude (Zod/MCP) + Pi (TypeBox/customTools); (5) design/spike docs.packages/web/src/experiments/console/).manage_runwrites do state mutation only (CLI-equivalent) — no new auto-resume-after-gate machinery. No project memory, no multi-conversation sidebar, no file-attach composer. Codex/OpenCode getnativeTools: false(served-MCP is a later wave).manage_runnative toolProvider-neutral
NativeTool { name, description, inputSchema (JSON Schema), handler }rides onSendQueryOptions.nativeTools; the handler closes over live core context so@archon/providersnever imports@archon/core. Per-provider adapters convert the canonical JSON Schema: Claude → Zod viacreateSdkMcpServer/tool()(mcp__archon__*,alwaysLoad); Pi → TypeBox viacustomTools(re-supplying Pi's default tools). One tool, anactiondiscriminator, and ahelpaction for progressive disclosure. Destructive actions are gated onconfirm: true(tiered-confirmation policy). Every by-id action is project-scoped viagetScopedRun.UX Journey
Before
After
Architecture Diagram
After (delta)
Label Snapshot
risk: mediumsize: XLweb,core,providers,testsweb:console,core:orchestrator,providers:claude,providers:piChange Metadata
featuremulti(web + core + providers)Linked Issue
Validation Evidence (required)
manage-run-tool.test.ts(18 — help/scope/confirm-gate/dispatch/error),claude/native-tools.test.ts(4),pi/native-tools.test.ts(4 — fail-fast + boolean acceptance).manage_runforlist/get/start(dispatchedarchon-assist, ran to completion). Pi (minimaxMiniMax-M2.7) invokesmanage_run, runshelp, and the booleanconfirmround-trips through TypeBox as a real boolean (typeof === 'boolean', value === true).defineToolmock) — see thefix(providers)commit.Security Impact (required)
getScopedRunblocks cross-project access); destructive actions require explicitconfirm: true; writes use the same core ops as the existing CLI/REST surfaces (no new privilege).Compatibility / Migration
Human Verification (required)
manage_runlist/start; Pi (minimax) tool round-trip incl. booleanconfirm; live-update fix (reply renders, composer re-enables); inline dock approvals reuse the shared approval panel.nativeTools: false); agent-initiated approve does not auto-resume the run (state-only by design — the dock's human-approve path handles auto-resume).Side Effects / Blast Radius (required)
@archon/webconsole experiment,@archon/coreorchestrator (tool injection),@archon/providers(NativeTool contract + Claude/Pi adapters).nativeToolscapability +NativeToolcontract are additive; existing providers unaffected.Rollback Plan (required)
manage_runinjection is a single guarded block inorchestrator-agent.ts; the console UI is fully contained inexperiments/console/.manage_run error: …; console live updates stalling.Risks and Mitigations
confirmgate on destructive actions; approve/reject are state-only (no auto-resume); project scoping.customToolspath is newer than Claude's → Mitigation: live-verified against minimax incl. the boolean round-trip; unit-tested converter; graceful fallback (providers withoutnativeToolssimply don't receive the tool).