chore: sync upstream/dev (coleam00/archon) — 71 commits#5
Merged
Conversation
…leam00#1017) CLI-launched workflows were visible in the Web UI chat but displayed as plain text only — no WorkflowProgressCard or WorkflowResultCard. The CLI adapter already handled both metadata fields; the sendMessage calls were simply missing from workflowRunCommand. Changes: - Send workflowDispatch message before executeWorkflow (mirrors orchestrator.ts) - Send workflowResult message after successful completion with summary - Wrap result message in try/catch with warn log (same pattern as orchestrator) Fixes coleam00#1017 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…for PR coleam00#1052 - Wrap dispatch sendMessage in try/catch (matches result card pattern) to prevent UI notification failures from blocking workflow execution - Update dispatch comment to accurately describe structural similarity to orchestrator while noting synchronous CLI semantics and that workerConversationId === conversationId in the CLI path - Add note to result card comment about paused-path exclusion - Add 4 integration tests for workflowRunCommand: dispatch ordering and metadata shape, result card with summary, no result card without summary, and non-throwing DB failure on result persist Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Template literals automatically coerce numbers to strings; wrapping with String() is redundant. Removed from formatAge, formatDuration, and all console.log calls in workflow.ts. Also compacted a two-line object spread in workflowStatusCommand to a single line. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deduplicate JSON branch in workflowStatusCommand by computing the output array once with a single console.log call, removing the duplicated verbose/non-verbose conditional branches. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Apply review finding: rename flat log event names to use the
cli.{action}_{state} convention matching the rest of the file.
- workflow_dispatch_surface_failed → cli.workflow_dispatch_surface_failed
- workflow_output_surface_failed → cli.workflow_result_surface_failed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…aces (coleam00#1035) The cmd.exe fallback in spawnWindowsTerminal() used shell: true, which caused Bun/Node to flatten args into a single string without proper quoting. Paths with spaces were split at whitespace, breaking the /D argument to start. Changes: - Remove shell: true from cmd.exe fallback spawn options - Remove shell?: boolean from trySpawn options type (no callers need it) Fixes coleam00#1035 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…m00#1054) During SSE streaming, tool calls always appeared below all text because onText appended to the existing message even when it already had tool calls. The server-side persistence already segments at this boundary. Mirror that rule in the client's onText handler: when the last streaming message has tool calls, seal it and start a new message for incoming text. Fixes coleam00#1054 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…1055) After a workflow completes, the AI had no awareness of results when answering follow-up questions. This adds a "Recent Workflow Results" section to the orchestrator prompt by querying persisted workflow_result messages from the conversation. Changes: - Add getRecentWorkflowResultMessages() to db/messages.ts - Add WorkflowResultContext type and formatWorkflowContextSection() to prompt-builder.ts - Extend buildFullPrompt() with optional workflowContext parameter - Fetch and inject workflow context in handleMessage() before prompt building Fixes coleam00#1055 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add console.error logging to silent .catch on SSE reconnect re-fetch (ChatInterface.tsx:~544) so production failures are visible in logs - Extract onText setMessages reducer to chat-message-reducer.ts as a pure function (applyOnText) with 14 unit tests covering all 6 segmentation rules including the new tool-call boundary (issue coleam00#1054) - Refactor ChatInterface.onText to delegate to applyOnText Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Parallelize checksums + tarball fetch in serve.ts (removes waterfall latency) - Remove redundant existsSync before readFileSync in update-check.ts (catch already handles ENOENT) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…flowResultCard Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- CRITICAL: fix metadata filter in getRecentWorkflowResultMessages to check for workflowResult key presence instead of category (which is never persisted to DB); feature was completely non-functional on every call - HIGH: guard JSON.parse(msg.metadata) with typeof check to handle PostgreSQL JSONB columns returned as objects (not strings) by node-postgres - MEDIUM: add structured warn log inside inner metadata parse catch block - LOW: use SELECT id, content, metadata instead of SELECT * in new DB query - LOW: update comments in messages.ts and prompt-builder.ts for accuracy - Tests: add formatWorkflowContextSection unit tests (pure function coverage) - Tests: add getRecentWorkflowResultMessages tests (dialect switch + contract) - Tests: add getDatabaseType mock to messages.test.ts connection mock - Tests: add ../db/messages mock and formatWorkflowContextSection to prompt-builder mock in orchestrator-agent.test.ts - Tests: add handleMessage workflow context injection behavioral tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Prevents "Cannot read properties of undefined (reading 'status')" crash when navigating between chat and workflow execution views during race conditions where run data may be transiently undefined. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…t timeout (coleam00#1067, coleam00#1030, coleam00#1098, coleam00#1070) * fix: strip CWD .env leak, enable platform adapters in serve, add first-event timeout (coleam00#1067) Three bugs fixed: (1) Bun auto-loads CWD .env files before user code, leaking non-overlapping keys into the Archon process — new stripCwdEnv() boot import removes them before any module reads env. (2) archon serve hardcoded skipPlatformAdapters:true, preventing Slack/Telegram/Discord from starting. (3) Claude SDK query had no first-event timeout, causing silent 30-min hangs when the subprocess wedges — new withFirstMessageTimeout wrapper races the first event against a configurable deadline (default 60s). Changes: - Add @archon/paths/strip-cwd-env and strip-cwd-env-boot modules - Import boot module as first import in CLI entry point - Remove skipPlatformAdapters: true from serve.ts - Add withFirstMessageTimeout + diagnostics to ClaudeClient - Add CLAUDECODE=1 nested-session warning to CLI - Add 9 unit tests (6 strip-cwd-env + 3 timeout) Fixes coleam00#1067 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address review findings for PR coleam00#1092 Fixed: - Clear setTimeout timer in withFirstMessageTimeout finally block (HIGH-1) - Add strip-cwd-env-boot to server/src/index.ts for direct dev:server path (MEDIUM-1) - Warn to stderr on non-ENOENT errors in stripCwdEnv (MEDIUM-2) - Update stale configuration.md docs for new env-loading mechanism (HIGH-2) - Add ARCHON_CLAUDE_FIRST_EVENT_TIMEOUT_MS and ARCHON_SUPPRESS_NESTED_CLAUDE_WARNING env vars to docs - Add nested Claude Code hang troubleshooting entry - Fix boot module JSDoc: "CLI and server" → "CLI" only - Fix stripCwdEnv JSDoc: remove stale "override: true" reference - Update .claude/rules/cli.md startup behavior section - Update CLAUDE.md @archon/paths description with new exports Tests added: - Assert controller.signal.aborted on timeout - Handle generator that completes immediately without yielding - Strip distinct keys from different .env files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * simplify: replace string sentinel with typed error class in withFirstMessageTimeout Replace the '__timeout__' string sentinel used to identify timeout rejections with a dedicated FirstEventTimeoutError class. instanceof checks are more explicit and robust than string comparison on error messages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: address review findings — dotenv version, docs, server warning, marker strip, tests 1. Align dotenv to ^17 (was ^16, rest of monorepo uses ^17.2.3) 2. Remove incorrect SUBPROCESS_ENV_ALLOWLIST claim from docs — the SDK bypasses the env option and uses process.env directly (coleam00#1097) 3. Add CLAUDECODE=1 warning to server entry point (was only in CLI) 4. Add diagnostic payload content test for withFirstMessageTimeout 5. Integrate coleam00#1097's finding: strip CLAUDECODE + CLAUDE_CODE_* session markers (except auth vars) + NODE_OPTIONS + VSCODE_INSPECTOR_OPTIONS from process.env at entry point. Pattern-matched on CLAUDE_CODE_* prefix rather than hardcoding 6 names, so future Claude Code markers are handled automatically. Auth vars (CLAUDE_CODE_OAUTH_TOKEN, CLAUDE_CODE_USE_BEDROCK, CLAUDE_CODE_USE_VERTEX) are preserved. Root cause per coleam00#1097: the Claude Agent SDK leaks process.env into the spawned child regardless of the explicit env option, so the only way to prevent the nested-session deadlock is to delete the markers from process.env at the entry point. Validation: bun run validate passes, 125 paths tests (6 new marker tests), 60 claude tests (1 new diagnostic test), DATABASE_URL leak verified stripped (target repo .env DATABASE_URL does not affect Archon DB selection). * refactor: remove SUBPROCESS_ENV_ALLOWLIST — trust user config, strip only CWD The allowlist was wrong for a single-developer tool: - It blocked keys the user intentionally set in ~/.archon/.env (ANTHROPIC_API_KEY, AWS_*, CLAUDE_CONFIG_DIR, MiniMax vars, etc.) - It was bypassed by the SDK anyway (process.env leaks to subprocess regardless of the env option — see coleam00#1097) - It attracted a constant stream of PRs adding keys (coleam00#1060, coleam00#1093, coleam00#1099) New model: CWD .env keys are the only untrusted source. stripCwdEnv() at entry point handles that. Everything in ~/.archon/.env + shell env passes through to the subprocess. No filtering, no second-guessing. Changes: - Delete env-allowlist.ts and env-allowlist.test.ts - Simplify buildSubprocessEnv() to return { ...process.env } with auth-mode logging (no token stripping — user controls their config) - Replace 4 allowlist-based tests with 1 pass-through test - Remove env-allowlist.test.ts from core test batch - Update security.md and cli.md docs to reflect the new model The CLAUDECODE + CLAUDE_CODE_* marker strip and NODE_OPTIONS strip remain in stripCwdEnv() at entry point — those are process-level safety (not per-subprocess filtering) and are needed regardless. * fix: restore override:true for archon env, add integration tests The integration tests caught a real issue: without override:true, the ~/.archon/.env load doesn't win over shell-inherited env vars. If the user's shell profile exports PORT=9999 and ~/.archon/.env has PORT=3000, the user expects Archon to use 3000. stripCwdEnv() handles CWD .env files (untrusted). override:true handles shell-inherited vars (trusted but less specific than ~/.archon/.env). Different concerns, both needed. Also adds 6 integration tests covering the full entry-point flow: 1. Global auth user with ANTHROPIC_API_KEY in CWD .env — stripped 2. OAuth token in archon env + random key in CWD — CWD stripped, archon kept 3. General leak test — nothing from CWD reaches subprocess 4. Same key in both CWD and archon — archon value wins 5. CLAUDECODE markers stripped even when not from CWD .env 6. CLAUDE_CODE_OAUTH_TOKEN survives marker strip * test: add DATABASE_URL leak scenarios to env integration tests * fix: move CLAUDECODE warning into stripCwdEnv, remove dead useGlobalAuth logic Review findings addressed: 1. CLAUDECODE warning was dead code — the boot import deleted CLAUDECODE from process.env before the warning check in cli.ts/server/index.ts could fire. Moved the warning into stripCwdEnv() itself, emitted BEFORE the deletion. Removed duplicate warning code from both entry points. 2. useGlobalAuth token stripping removed (intentional, not regression) — the old code stripped CLAUDE_CODE_OAUTH_TOKEN and CLAUDE_API_KEY when useGlobalAuth=true. Per design discussion: the user controls ~/.archon/.env and all keys they set are intentional. If they want global auth, they just don't set tokens. Simplified buildSubprocessEnv to log auth mode for diagnostics only, no filtering. 3. Docs "no override needed" corrected — cli.md and configuration.md now reflect the actual code (override: true). --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
Release 0.3.6
Rename the core AI provider interface and all related types, classes, factory functions, and directory from clients/ to providers/. Rename map: - IAssistantClient → IAgentProvider - ClaudeClient → ClaudeProvider - CodexClient → CodexProvider - getAssistantClient → getAgentProvider - AssistantRequestOptions → AgentRequestOptions - IWorkflowAssistantClient → IWorkflowAgentProvider - AssistantClientFactory → AgentProviderFactory - WorkflowAssistantOptions → WorkflowAgentOptions - packages/core/src/clients/ → packages/core/src/providers/ NOT renamed (user-facing/DB-stored): assistant config key, DEFAULT_AI_ASSISTANT env var, ai_assistant_type DB column. No behavioral changes — purely naming.
…nd docs - AssistantDefaults → ProviderDefaults, ClaudeAssistantDefaults → ClaudeProviderDefaults - Logger domains: client.claude → provider.claude, client.codex → provider.codex - Fix stale JSDoc, error messages, and references in architecture docs, CHANGELOG, testing rules
- ProviderDefaults → CodexProviderDefaults (symmetric with ClaudeProviderDefaults) - Fix stale "AI client" comments in orchestrator-agent.ts and orchestrator.test.ts - Remove dead createMockAgentProvider in test/mocks/streaming.ts (zero importers, wrong method names) - Fix irregular whitespace in .claude/rules/workflows.md
…t-to-iagentprovider refactor: rename IAssistantClient to IAgentProvider
docs: consolidate Claude guidance into CLAUDE.md
…oleam00#1137) * refactor: extract providers from @archon/core into @archon/providers Move Claude and Codex provider implementations, factory, and SDK dependencies into a new @archon/providers package. This establishes a clean boundary: providers own SDK translation, core owns business logic. Key changes: - New @archon/providers package with zero-dep contract layer (types.ts) - @archon/workflows imports from @archon/providers/types — no mirror types - dag-executor delegates option building to providers via nodeConfig - IAgentProvider gains getCapabilities() for provider-agnostic warnings - @archon/core no longer depends on SDK packages directly - UnknownProviderError standardizes error shape across all surfaces Zero user-facing changes — same providers, same config, same behavior. * refactor: remove config type duplication and backward-compat re-exports Address review findings: - Move ClaudeProviderDefaults and CodexProviderDefaults to the @archon/providers/types contract layer as the single source of truth. @archon/core/config/config-types.ts now imports from there. - Remove provider re-exports from @archon/core (index.ts and types/). Consumers should import from @archon/providers directly. - Update @archon/server to depend on @archon/providers for MessageChunk. * refactor: move structured output validation into providers Each provider now normalizes its own structured output semantics: - Claude already yields structuredOutput from the SDK's native field - Codex now parses inline agent_message text as JSON when outputFormat is set, populating structuredOutput on the result chunk This eliminates the last provider === 'codex' branch from dag-executor, making it fully provider-agnostic. The dag-executor checks structuredOutput uniformly regardless of provider. Also removes the ClaudeCodexProviderDefaults deprecated alias — all consumers now use ClaudeProviderDefaults directly. * fix: address PR review — restore warnings, fix loop options, cleanup Critical fixes: - Restore MCP missing env vars user-facing warning (was silently dropped) - Restore Haiku + MCP tool search warning - Fix buildLoopNodeOptions to pass workflow-level nodeConfig (effort, thinking, betas, sandbox were silently lost for loop nodes) - Add TODO(coleam00#1135) comments documenting env-leak gate gap Cleanup: - Remove backward-compat type aliases from deps.ts (keep WorkflowTokenUsage) - Remove 26 unnecessary eslint-disable comments from test files - Trim internal helpers from providers barrel (withFirstMessageTimeout, getProcessUid, loadMcpConfig, buildSDKHooksFromYAML) - Add @archon/providers dep to CLI package.json - Fix 8 stale documentation paths pointing to deleted core/src/providers/ - Add E2E smoke test workflows for both Claude and Codex providers * fix: forward provider system warnings to users in dag-executor The dag-executor only forwarded system chunks starting with "MCP server connection failed:" — all other provider warnings (missing env vars, Haiku+MCP, structured output issues) were logged but never reached the user. Now forwards all system chunks starting with⚠️ (the prefix providers use for user-actionable warnings). * fix: add providers package to Dockerfile and fix CI module resolution - Add packages/providers/ to all three Dockerfile stages (deps, production package.json copy, production source copy) - Replace wildcard export map (./*) with explicit subpath entries to fix module resolution in CI (bun workspace linking) * chore: update bun.lock for providers package exports
…1160) * fix: make env-integration test cross-platform (Windows CI) Check for Windows env var equivalents (Path instead of PATH, USERPROFILE instead of HOME) in scenario 3 assertions. Closes coleam00#1128 * fix: Windows PATH/HOME casing in provider subprocess env test Same cross-platform fix for ClaudeProvider test — spread objects lose Windows case-insensitive behavior (Path vs PATH, USERPROFILE vs HOME).
…ies (coleam00#1162) * refactor: decompose provider sendQuery() into explicit helper boundaries (coleam00#1139) sendQuery() in both Claude and Codex providers was a monolith mixing SDK option building, nodeConfig translation, stream normalization, and error classification. This makes it hard to safely extend for Phase 2 provider extensibility. Decompose both providers into focused internal helpers: Claude: - buildBaseClaudeOptions: SDK option construction - buildToolCaptureHooks: PostToolUse/PostToolUseFailure hook setup - applyNodeConfig: workflow nodeConfig → SDK translation + structured warnings - streamClaudeMessages: raw SDK event → MessageChunk normalization - classifyAndEnrichError: error classification with retry decisions Codex: - buildTurnOptions: per-turn option construction (output schema, abort) - streamCodexEvents: raw SDK event → MessageChunk normalization - classifyAndEnrichCodexError: error classification with retry decisions Also introduces ProviderWarning { code, message } replacing raw string warnings for machine-readable provider translation warnings. Adds 43 focused unit tests covering the extracted helpers directly. Fixes coleam00#1139 * fix: export ToolResultEntry type used in public buildBaseClaudeOptions API * fix: unexport internal helpers to prevent API surface leakage, fix retry state bug Review findings: 1. Internal helpers were exported and reachable through package.json subpath exports (./claude/provider, ./codex/provider), widening the public API. All new helpers are now file-local — the only public exports remain ClaudeProvider, CodexProvider, loadMcpConfig, buildSDKHooksFromYAML, withFirstMessageTimeout, getProcessUid. 2. Codex streamState (lastTodoListSignature) was shared across retry attempts, causing todo-list dedup to suppress output on retry. Now creates fresh state per attempt. Removed direct helper test imports — existing sendQuery e2e tests (51 Claude + 42 Codex) cover all behavior paths. * fix: address review findings — abort handling, retry bugs, error swallowing Fixes from CodeRabbit + multi-agent review: 1. classifyAndEnrichError preserves first-event timeout diagnostic instead of collapsing it into generic "Query aborted" (the timeout aborts the controller, but the original error carries the coleam00#1067 breadcrumb) 2. nodeConfigWarnings emitted once before retry loop, not per attempt 3. buildSubprocessEnv() called once before retry loop (was re-logging auth mode and rebuilding { ...process.env } per attempt) 4. Abort signal listener registered once with forwarding to current controller (was accumulating per-retry listeners) 5. PostToolUse hook wrapped in try/catch (JSON.stringify can throw on circular refs — was asymmetric with PostToolUseFailure which had it) 6. Codex streamCodexEvents throws on abort instead of silent break (callers were getting truncated stream with no result/error) 7. Both providers store enrichedError (not raw error) for retry exhaustion — preserves stderr context in final throw 8. Log is_error result events at error level in Claude stream normalizer * test: add black-box behavioral tests for sendQuery decomposition fixes Restore test coverage for the specific fixes from the decomposition review, exercised through sendQuery (black-box) since helpers are file-local: Claude (6 tests): - Timeout error preserved (not collapsed into "Query aborted") - nodeConfig warnings emitted once even when retries occur - Abort signal cancels across retries via single forwarding listener - Enriched error (with stderr) thrown at retry exhaustion - PostToolUse hook handles circular reference without crashing - is_error result events logged at error level Codex (3 tests): - Abort signal throws instead of silently truncating stream - Enriched error thrown at retry exhaustion - Todo-list dedup state resets between retry attempts
- e2e-all-nodes: exercises bash, prompt, script (bun), structured output, model override (haiku), effort control, and $nodeId.output refs - e2e-mixed-providers: tests Claude + Codex in the same workflow with cross-provider output references - echo-args.js: simple script node test helper
…leam00#1135) Remove the entire env-leak scanning/consent infrastructure: scanner, allow_env_keys DB column usage, allow_target_repo_keys config, PATCH consent route, --allow-env-keys CLI flag, and UI consent toggle. The env-leak gate was the wrong primitive. Target repo .env protection is already structural: - stripCwdEnv() at boot removes Bun-auto-loaded CWD .env keys - Archon loads its own env sources afterward (~/.archon/.env) - process.env is clean before any subprocess spawns - Managed env injection (config.yaml env: + DB vars) is unchanged No scanning, no consent, no blocking. Any repo can be registered and used. Subprocesses receive the already-clean process.env.
…m00#1161) * Implement managed execution env propagation * Address managed env review feedback
…ss (coleam00#1185) * refactor: extract provider metadata seam for Phase 2 registry readiness - Add static capability constants (capabilities.ts) for Claude and Codex - Export getProviderCapabilities() from @archon/providers for capability queries without provider instantiation - Add inferProviderFromModel() to model-validation.ts, replacing three copy-pasted inline inference blocks in executor.ts and dag-executor.ts - Replace throwaway provider instantiation in dag-executor with static capability lookup (getProviderCapabilities) - Add orchestrator warning when env vars are configured but provider doesn't support envInjection * refactor: address LOW findings from code review - Remove CLAUDE_CAPABILITIES/CODEX_CAPABILITIES from public index (YAGNI — callers should use getProviderCapabilities(), not raw constants) - Remove dead _deps parameter from resolveNodeProviderAndModel and its two call-sites (no longer needed after static capability lookup refactor) - Update factory.ts module JSDoc to mention both exported functions - Add edge-case tests for getProviderCapabilities: empty string and case-sensitive throws (parity with existing getAgentProvider tests) - Add test for inferProviderFromModel with empty string (returns default, documenting the falsy-string shortcut)
…1055 feat(core): inject workflow run context into orchestrator prompt
…0#1042) Telegraf v4's internal `redactToken()` assigns to readonly `error.message` properties, which crashes under Bun's strict ESM mode. Telegraf is EOL. Changes: - Replace `telegraf` dependency with `grammy` ^1.36.0 - Migrate adapter from Telegraf API to grammY API (Bot, bot.api, bot.start) - Use grammY's `onStart` callback pattern for async polling launch - Preserve 409 retry logic and all existing behavior - Update test mocks from telegraf types to grammy types Fixes coleam00#1042 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix misleading 'unde***' log when ctx.from is undefined; use 'unknown' to match the Slack/Discord adapter pattern - Log post-startup bot runtime errors before reject() (no-op after onStart fires but errors are now visible in logs) - Add debug log when message is dropped due to no handler registered - Add stop() unit test to guard against grammY API rename regressions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…1042 fix: replace Telegraf with grammY to fix Bun TypeError crash
…am00#1090) * fix(workflows): stop warning about model/provider on loop nodes (coleam00#1082) The loader incorrectly classified loop nodes as "non-AI nodes" and warned that model/provider fields were ignored, even though the DAG executor has supported these fields on loop nodes since commit 594d5da. Changes: - Add LOOP_NODE_AI_FIELDS constant excluding model/provider from the warn list - Update loader to use LOOP_NODE_AI_FIELDS for loop node field checking - Fix BASH_NODE_AI_FIELDS comment that incorrectly referenced loop nodes - Add tests for loop node model/provider acceptance and unsupported field warnings Fixes coleam00#1082 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(workflows): update stale comment and add LOOP_NODE_AI_FIELDS unit tests - Update section comment from "bash/loop nodes" to "non-AI nodes" since loop nodes do support model/provider (the fix in this PR) - Export LOOP_NODE_AI_FIELDS from schemas/index.ts alongside BASH/SCRIPT variants - Add dedicated describe block in schemas.test.ts verifying that model and provider are excluded and all other BASH_NODE_AI_FIELDS are still present Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * simplify: merge nodeType and aiFields into a single if/else chain in parseDagNode Eliminates the separate isNonAiNode predicate and nested ternary for aiFields selection by combining both into one explicit if/else block — each branch sets nodeType and aiFields together, removing the need to re-check node type twice. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eam00#1089) * fix: surface auth errors instead of silently dropping them (coleam00#1076) When Claude OAuth refresh token is expired, the SDK yields a result chunk with is_error=true and no session_id. Both handleStreamMode and handleBatchMode guarded the result branch with `&& msg.sessionId`, silently dropping the error. Users saw no response at all. Changes: - Remove sessionId guard from result branches in orchestrator-agent.ts - Add isError early-exit that sends error message to user - Add 4 OAuth patterns to AUTH_PATTERNS in claude.ts and codex.ts - Add OAuth refresh-token handler to error-formatter.ts - Add tests for new error-formatter branches Fixes coleam00#1076 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: add structured logging to isError path and remove overly broad auth pattern - Add getLog().warn({ conversationId, errorSubtype }, 'ai_result_error') in both handleStreamMode and handleBatchMode isError branches so auth failures are visible server-side instead of silently swallowed - Remove 'access token' from AUTH_PATTERNS in claude.ts and codex.ts; the real OAuth refresh error is already covered by 'refresh token' and 'could not be refreshed', eliminating false-positive auth classification risk Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: route isError results through classifyAndFormatError with provider-specific messages The isError path in stream/batch mode used a hardcoded generic message, bypassing the classifyAndFormatError infrastructure. Now constructs a synthetic Error from errorSubtype and routes through the formatter. Error formatter updated with provider-specific auth detection: - Claude: OAuth token refresh, sign-in expired → guidance to run /login - Codex: 401 retry exhaustion → guidance to run codex login - General: tightened patterns (removed broad 'auth error' substring match) Also persists session ID before early-returning on isError. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds real workflow execution to CI, verifying the full engine works end-to-end with both providers. Organized into 4 tiers: deterministic (0 API calls), Claude, Codex, and mixed-provider tests. New workflows: - e2e-deterministic: bash, script (bun/uv), conditions, trigger rules - e2e-skills-mcp: skills injection, MCP server, effort, systemPrompt - Enhanced existing e2e-claude-smoke, e2e-codex-smoke, e2e-mixed-providers - Fixed e2e-all-nodes (was broken due to script node syntax) Supporting files: - e2e-echo-command.md (test command file) - echo-args.py (Python script for uv runtime test) - e2e-test-skill/SKILL.md (minimal skill for injection test) - e2e-filesystem.json (MCP config for filesystem server test) GitHub Actions: .github/workflows/e2e-smoke.yml - Runs on push to main/dev only (no PR trigger to avoid API cost abuse) - Uses haiku (Claude) and gpt-5.1-codex-mini (Codex) for cost efficiency Closes coleam00#1254 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rename echo-args.py → echo-py.py to avoid duplicate script name conflict with echo-args.js (script discovery uses base name, not extension) - Add CODEX_API_KEY env var to codex and mixed CI jobs (Codex CLI requires this, not OPENAI_API_KEY, for headless auth) - Sequentialize all Claude AI nodes via depends_on chains to prevent concurrent CLI subprocess idle timeouts in CI - Increase idle_timeout from 60s to 120s on all AI nodes for CI headroom - Override MCP test node to model: sonnet (Haiku doesn't support MCP tool search) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude CLI is slow with structured output and tool use in CI (~4 min for structured output, ~2 min for tool use). With 3 sequential workflow runs (claude-smoke, all-nodes, skills-mcp), 10 minutes is insufficient. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Claude CLI is extremely slow with structured output (~4 min) and tool use (~2 min) in CI, making the previous multi-workflow approach take 10+ min. Radical simplification: - Remove e2e-all-nodes (redundant with deterministic + claude-smoke) - Remove e2e-skills-mcp (advanced features too slow for per-commit smoke) - Remove structured output and tool use from Claude smoke test (too slow) - Strip Claude smoke to: 1 prompt + 1 command + 1 bash verify node - Keep mixed providers (simplified: 1 Claude + 1 Codex + bash verify) - All timeouts reduced to 30s, all job timeouts to 5 min - Remove MCP test fixtures and e2e-test-skill (no longer needed) Expected: Claude job ~15s of AI time, Codex ~5s, mixed ~10s Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The command-test node was missing allowed_tools: [], causing the Claude CLI to load full tool access. Without tools restricted, the subprocess hangs after responding. The simple prompt node with allowed_tools: [] completes in 4s — this should match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Command nodes consistently produce zero output and hit the 30s idle timeout in CI, even with allowed_tools: []. This appears to be a bug in how command: nodes interact with the Claude CLI subprocess — the process never emits output. This adds 30s of wasted time to every run. The simple prompt node already verifies Claude connectivity. Command file discovery/loading is a deterministic operation that doesn't need an AI call to validate in a smoke test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Injects exit 1 into e2e-deterministic bash-echo node to prove the engine fix (failWorkflowRun on anyFailed) propagates to a non-zero CLI exit code and a red X in GitHub Actions. Will be reverted in the next commit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reverts the injected exit 1 in bash-echo (CI red X confirmed in run 24522356737). Removes feat/e2e-smoke-tests from branch triggers — ready to merge to dev. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Will remove feat/e2e-smoke-tests trigger in the final cleanup commit before merging to dev. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Removes feat/e2e-smoke-tests from E2E workflow triggers. CI failure detection verified: red X on run 24522356737 (deliberate bash exit 1), green on run 24522484762 (reverted), and credit-exhaustion failure also correctly produced exit 1. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat(ci): add E2E smoke test workflows for Claude and Codex
…am00#1262) * feat(telemetry): add anonymous PostHog workflow-invocation tracking Emits one `workflow_invoked` event per run with workflow name/description, platform, and Archon version. Uses a stable random UUID persisted to `$ARCHON_HOME/telemetry-id` for distinct-install counting, with `$process_person_profile: false` to stay in PostHog's anonymous tier. Opt out with `ARCHON_TELEMETRY_DISABLED=1` or `DO_NOT_TRACK=1`. Self-host via `POSTHOG_API_KEY` / `POSTHOG_HOST`. Closes coleam00#1261 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(telemetry): stop leaking test events to production PostHog The `telemetry-id preservation` test exercised the real capture path with the embedded production key, so every `bun run validate` published a tombstone `workflow_name: "w"` event. Redirect POSTHOG_HOST to loopback so the flush fails silently; bump test timeout to accommodate the retry-then-give-up window. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(telemetry): silence posthog-node stderr leak on network failure The PostHog SDK's internal logFlushError() writes 'Error while flushing PostHog' directly to stderr via console.error on any network or HTTP error, bypassing logger config. For a fire-and-forget telemetry path this leaked stack traces to users' terminals whenever PostHog was unreachable (offline, firewalled, DNS broken, rate-limited). Pass a silentFetch wrapper to the PostHog client that masks failures as fake 200 responses. The SDK never sees an error, so it never logs. Original failure is still recorded at debug level for diagnostics. Side benefit: shutdown is now fast on network failure (no retry loop), so offline CLI commands no longer hang ~10s on exit. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * test(telemetry): make id-preservation test deterministic Replace the fire-and-forget capture + setTimeout + POSTHOG_HOST-loopback dance with a direct synchronous call to getOrCreateTelemetryId(). Export the function with an @internal marker so tests can exercise the id path without spinning up the PostHog client. No network, no timer, no flake. Addresses CodeRabbit feedback on coleam00#1262. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
…coleam00#1263) * fix(bundled-defaults): auto-generate import list, emit inline strings Root-cause fix for bundle drift (15 commands + 7 workflows previously missing from binary distributions) and a prerequisite for packaging @archon/workflows as a Node-loadable SDK. The hand-maintained `bundled-defaults.ts` import list is replaced by `scripts/generate-bundled-defaults.ts`, which walks `.archon/{commands,workflows}/defaults/` and emits a generated source file with inline string literals. `bundled-defaults.ts` becomes a thin facade that re-exports the generated records and keeps the `isBinaryBuild()` helper. Inline strings (via JSON.stringify) replace Bun's `import X from '...' with { type: 'text' }` attributes. The binary build still embeds the data at compile time, but the module now loads under Node too — removing SDK blocker #2. - Generator: `scripts/generate-bundled-defaults.ts` (+ `--check` mode for CI) - `package.json`: `generate:bundled`, `check:bundled`; wired into `validate` - `build-binaries.sh`: regenerates defaults before compile - Test: `bundle completeness` now derives expected set from on-disk files - All 56 defaults (36 commands + 20 workflows) now in the bundle * fix(bundled-defaults): address PR review feedback Review: coleam00#1263 (comment) Generator: - Guard against .yaml/.yml name collisions (previously silent overwrite) - Add early access() check with actionable error when run from wrong cwd - Type top-level catch as unknown; print only message for Error instances - Drop redundant /* eslint-disable */ emission (global ignore covers it) - Fix misleading CI-mechanism claim in header comment - Collapse dead `if (!ext) continue` guard into a single typed pass Scripts get real type-checking + linting: - New scripts/tsconfig.json extending root config - type-check now includes scripts/ via `tsc --noEmit -p scripts/tsconfig.json` - Drop `scripts/**` from eslint ignores; add to projectService file scope Tests: - Inline listNames helper (Rule of Three) - Drop redundant toBeDefined/typeof assertions; the Record<string, string> type plus length > 50 already cover them - Add content-fidelity round-trip assertion (defense against generator content bugs, not just key-set drift) Facade comment: drop dead reference to .claude/rules/dx-quirks.md. CI: wire `bun run check:bundled` into .github/workflows/test.yml so the header's CI-verification claim is truthful. Docs: CLAUDE.md step count four→five; add contributor bullet about `bun run generate:bundled` in the Defaults section and CONTRIBUTING.md. * chore(e2e): bump Codex model to gpt-5.2 gpt-5.1-codex-mini is deprecated and unavailable on ChatGPT-account Codex auth. Plain gpt-5.2 works. Verified end-to-end: - e2e-codex-smoke: structured output returns {category:'math'} - e2e-mixed-providers: claude+codex both return expected tokens
…oints
Settings → Projects Add Project only submitted { path }, so GitHub URLs
entered there failed even though the API and the Sidebar Add Project
already accepted them. Closes coleam00#1108.
Changes:
- Add packages/web/src/lib/codebase-input.ts: shared getCodebaseInput()
helper returning a discriminated { path } | { url } union (re-exported
from api.ts for convenience).
- Use the helper from all three Add Project entry points: Sidebar,
Settings, and ChatPage. Removes three divergent inline heuristics.
- SettingsPage: rename addPath → addValue (state now holds either URL
or local path) and update placeholder text.
- Tests: cover https://, git@ shorthand, ssh://, git://, whitespace,
unix/relative/home/Windows/UNC paths.
- Docs: document the unified Add Project entry point in adapters/web.md.
Heuristic flips from "assume URL unless explicitly local" to "assume
local unless explicitly remote" — only inputs starting with https?://,
ssh://, git@, or git:// are sent as { url }; everything else is sent
as { path }. The server already resolves tilde/relative paths.
Co-authored-by: Nguyen Huu Loc <lockbkbang@gmail.com>
…tings-add-project-url-support fix: accept GitHub URLs in settings add project
On Windows, `git checkout` converts source files to CRLF via the `* text=auto` policy. The generator inlined raw file content as JSON strings, so the Windows regeneration produced `\r\n` escapes while the committed artifact (written on Linux) used `\n`. `bun run check:bundled` then flagged the file as stale and failed the Windows CI job. Fix by normalizing CRLF → LF both when reading source defaults and when comparing against the existing generated file. No-op on Linux. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Companion to 75427c7. The bundle-completeness test compared BUNDLED_* strings (now LF-normalized by the generator) against raw readFileSync output, which is CRLF on Windows checkouts. Apply the same normalization to the on-disk side so the defense-in-depth check stays meaningful on every platform. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…gex (coleam00#1256) * fix(workflows): add word boundary to context variable substitution regex (coleam00#1112) Variable substitution for $CONTEXT, $EXTERNAL_CONTEXT, and $ISSUE_CONTEXT was matching as a prefix of longer identifiers like $CONTEXT_FILE, silently corrupting bash node scripts. Added negative lookahead (?![A-Za-z0-9_]) to CONTEXT_VAR_PATTERN_STR so only exact variable names are substituted. Changes: - Add negative lookahead to CONTEXT_VAR_PATTERN_STR regex in executor-shared.ts - Add regression test for prefix-match boundary case Fixes coleam00#1112 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * test(workflows): add missing boundary cases for context variable substitution Add three new test cases that complete coverage of the word-boundary fix from coleam00#1112: $ISSUE_CONTEXT with suffix variants, $ISSUE_CONTEXT with multiple suffixes, and contextSubstituted=false for suffix-only prompts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Syncs 71 upstream commits covering the provider-registry refactor (IAssistantClient → IAgentProvider, @archon/core/clients extracted to new @archon/providers package), PostHog telemetry, isolation-hardening, target-repo .env-leak prevention, Claude binary-path resolver, and Telegraf → grammY migration. Fork customisations preserved: - schedules: in .archon/config.yaml (ScheduleEntry type, loader, scheduler) - workflow-analytics DB module + /api/analytics/costs route (PR #4) - archon-social-content-engine workflow - $PROJECT_KNOWLEDGE, $CONTEXT sanitisation, knowledge-writer service - cron-parser service Fork-specific behavioural changes: - Telemetry default-OFF (requires ARCHON_TELEMETRY_ENABLED=1 to opt in); upstream's opt-out env vars (ARCHON_TELEMETRY_DISABLED, DO_NOT_TRACK) still short-circuit. Tests updated to match. - Adopted upstream's "No Autonomous Lifecycle Mutation Across Process Boundaries" principle: dropped failOrphanedRuns() call at server startup (coleam00#1216). Kept startWorkflowScheduler(). Superseded by upstream: - orchestrator-agent.ts model-injection fix (now passes full assistantConfig through SendQueryOptions; Claude provider reads assistantConfig.model). Associated test additions dropped. Other: - Regenerated bundled-defaults.generated.ts from disk (36 commands, 22 workflows including our fork-only entries) - Regenerated packages/web/src/lib/api.generated.d.ts against merged server routes (includes /api/providers + /api/analytics/costs) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-merge cleanup: - Drop unused isAbsolute import from packages/workflows/src/dag-executor.ts (fork WIP leftover, unreferenced; caught by strict noUnusedLocals) - prettier --write packages/web/src/lib/api.generated.d.ts after regen Known pre-existing test failures (NOT caused by this merge; both reproduce on pure upstream/dev): - packages/core/src/db/connection.test.ts: 3 failures from local dev shell leaking DATABASE_URL through bun subprocess respawn. See user memory "Archon DATABASE_URL shell leak". Pass with env -i isolation. - packages/workflows/src/dag-executor.test.ts: 1 failure on the coleam00#1135 repo-subprocess-env regression test. Reproduces on clean upstream/dev; upstream CI did not catch (Bun version / env difference). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cross-repo skill distributor: pulls the central claude-skills repo,
diffs each shared skill against the targets declared in manifest.json,
then copies changed skills and opens a PR per target repo.
Path resolution order (most specific first):
1. $CLAUDE_SKILLS_REPO env var (explicit override)
2. ${ARCHON_HOME:-$HOME/.archon}/workspaces/claude-skills (default)
Fails fast with a clear error if the skills repo is not present at
the resolved path, matching the fork's "fail fast + explicit errors"
principle.
Also: ignore .archon/knowledge/ — runtime output of the
$PROJECT_KNOWLEDGE mechanism, not source.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Syncs the fork with 71 commits from
coleam00/archonupstream/dev. Fork customisations preserved; two fork-specific behavioural changes noted below.What's in the merge
Structural refactor (foundation for ~40 later commits)
IAssistantClient→IAgentProviderrename@archon/core/clients/{claude,codex}.tsextracted into new@archon/providerspackageHigh-value upstream fixes inherited
a8ac3f05target repo.envleak prevention (relevant to our prior DATABASE_URL pain)882fc58fserver startup no longer marks in-flight workflow runs as failed (bug: server startup marks actively-running workflows as failed via failOrphanedRuns() coleam00/Archon#1216)73d9240e,af9ed841,fd3f0431,5a4541b3isolation / worktree hardeningda1f8b7dTelegraf → grammY (Bun crash fix)bed36ca4context variable substitution regex word-boundary fix81859d68Claude SDK binary-path resolverf61d576aauto-init submodules in worktreesd535c832anonymous PostHog telemetry (opt-in here — see below)Fork customisations preserved (no conflicts on our work)
schedules:config +ScheduleEntrytype + workflow schedulerpackages/core/src/db/workflow-analytics.ts+/api/analytics/costsroute + dashboard cards (PR feat: analytics hardening (hook extract, SQLite day-boundary fix, tests) #4)$PROJECT_KNOWLEDGEsubstitution +$CONTEXTinjection-defense sanitiserknowledge-writer+cron-parserservicesarchon-social-content-engineworkflowarchon-skill-syncworkflow (portable paths via$CLAUDE_SKILLS_REPO)Fork-specific behavioural changes in this PR
isTelemetryDisabled()now requiresARCHON_TELEMETRY_ENABLED=1to opt in. Upstream's opt-out env vars (ARCHON_TELEMETRY_DISABLED,DO_NOT_TRACK) still short-circuit. Tests updated.failOrphanedRuns()call at server startup (bug: server startup marks actively-running workflows as failed via failOrphanedRuns() coleam00/Archon#1216). Kept ourstartWorkflowScheduler().Superseded by upstream (dropped from fork)
assistantConfigthroughSendQueryOptions; Claude provider readsassistantConfig.model. Behaviourally equivalent, cleaner shape. Associated tests dropped.Regenerated artifacts
packages/workflows/src/defaults/bundled-defaults.generated.ts→ 36 commands, 22 workflows (includes fork-only entries)packages/web/src/lib/api.generated.d.ts→ against merged server routes (/api/providers+/api/analytics/costs)Validation
bun run type-checkbun run lint --max-warnings 0bun run format:checkbun run testPre-existing test failures (NOT caused by this merge — both reproduce on pure
upstream/dev):packages/core/src/db/connection.test.ts× 3 — local dev shell leaksDATABASE_URLthrough bun subprocess respawn (documented in user memory "Archon DATABASE_URL shell leak"). Pass withenv -iisolation.packages/workflows/src/dag-executor.test.ts× 1 —#1135repo-subprocess-env regression test. Reproduces on cleanupstream/dev. Likely Bun version / env difference on macOS dev.Windows-only commits accepted passively
6a6740af,16b47d3d— don't break macOS, exclusion cost > acceptance cost. Fork remains macOS-only for dev validation per user policy.Test plan
bun run type-checkgreenbun run lint --max-warnings 0greenbun run format:checkgreenbun run devstarts cleanly;/api/healthreturns 0.4.0archon-social-content-engine) to confirm scheduler still fires/api/analytics/coststo confirm analytics route still serves dataARCHON_TELEMETRY_ENABLED=1archon-skill-syncwith the new env-var path resolution🤖 Generated with Claude Code