Cross-service workflows: codebase tags + services: workflow field + fanout execution #1348

@Wirasm

Problem

Archon today is single-codebase-per-workflow. Users with a microservice / service-architecture layout — a non-git parent folder containing N service repos — can't run workflows that touch multiple services without N separate invocations and manual coordination.

Common concrete needs:

  • "Bump `@company/shared` to v2.0 across all 8 services in the platform"
  • "Add the new auth header to every service that calls the API gateway"
  • "Run tests across all services affected by this shared-lib change"

Currently: the user invokes the workflow N times, each with a different `--cwd`, and has to coordinate branch names and link PRs by hand. This is awkward and error-prone.

Foundation: #1236

#1236 (composite codebase identity) is the prerequisite. Without it, two services that happen to share a remote name or are clones of the same template collide on `codebase_id` and share state. After #1236 each service is a distinct codebase row with its own identity, isolation environments, conversations, and env vars.

Proposed direction: three small primitives

1. Codebase tags

New `tags: string[]` column on codebases. Any user-supplied string. No allowlist.

```sql
ALTER TABLE remote_agent_codebases ADD COLUMN tags TEXT[] DEFAULT '{}';
CREATE INDEX idx_codebases_tags ON remote_agent_codebases USING GIN (tags);
```

Set via CLI (`archon codebase tag platform:my-app`), Web UI codebase settings, or at registration time.

Why tags and not groups: a service might belong to multiple slicings (`platform:my-app`, `language:ts`, `team:backend`) and groups force a single parent. Tags compose. See also #1190 (workflow tags, same primitive on the other side).
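To make "tags compose" concrete, here is a minimal sketch of one codebase belonging to several slicings at once. The `Codebase` shape and the `withAllTags` helper are hypothetical, for illustration only; the proposal itself only specifies a `tags: string[]` column, and whether `services:` should ever accept more than one tag per matcher is not decided here.

```typescript
// Hypothetical shape of a codebase row after the proposed migration.
interface Codebase {
  id: string;
  name: string;
  tags: string[];
}

// Returns the codebases that carry every one of the requested tags.
function withAllTags(codebases: Codebase[], wanted: string[]): Codebase[] {
  return codebases.filter((cb) => wanted.every((t) => cb.tags.includes(t)));
}

const tagCorpus: Codebase[] = [
  { id: "1", name: "service-api", tags: ["platform:my-app", "language:ts", "team:backend"] },
  { id: "2", name: "service-web", tags: ["platform:my-app", "language:ts", "team:frontend"] },
  { id: "3", name: "billing", tags: ["platform:other", "language:go", "team:backend"] },
];

// service-api shows up in both slicings, which a single-parent group could not express.
const tsInPlatform = withAllTags(tagCorpus, ["platform:my-app", "language:ts"]).map((cb) => cb.name);
const backendTeam = withAllTags(tagCorpus, ["team:backend"]).map((cb) => cb.name);
```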

2. `services:` field on workflow YAML

```yaml
name: bump-shared-lib
services:
  - tag: platform:my-app                                # match all codebases with this tag
  # OR
  - names: [service-api, service-worker, service-web]   # explicit list
  # OR
  - match: "owner/service-*"                            # glob on codebase name
parallel: true                                          # default false, explicit opt-in
nodes:
  - id: bump
    prompt: "Bump @company/shared to v{{version}} in $SERVICE"
```

Semantics:

  • If `services:` is absent → current single-codebase behavior, no change.
  • If `services:` is present → the executor resolves the matcher(s) against the registered codebases and runs the DAG once per matched service.
  • `parallel: true` fans out via `Promise.allSettled`; `parallel: false` (default) runs sequentially.
  • New substitution variables: `$SERVICE` (codebase name), `$SERVICE_DIR` (absolute path), `$SERVICE_ID` (codebase id).
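The matcher resolution described above could be sketched roughly as follows. Everything here is an assumption made for illustration: the `ServiceMatcher` union, the `resolveServices` name, treating several matchers as a union, and a glob dialect where `*` matches any run of characters. None of it is existing Archon code.

```typescript
// Hypothetical codebase row; only the fields the matchers need.
interface Codebase {
  id: string;
  name: string;
  tags: string[];
}

// One entry under `services:` in the workflow YAML (assumed shape).
type ServiceMatcher =
  | { tag: string }
  | { names: string[] }
  | { match: string };

// Translate a glob into an anchored RegExp. Only `*` is special here;
// every other regex metacharacter is escaped and matched literally.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function matchesService(cb: Codebase, m: ServiceMatcher): boolean {
  if ("tag" in m) return cb.tags.includes(m.tag);
  if ("names" in m) return m.names.includes(cb.name);
  return globToRegExp(m.match).test(cb.name);
}

// Assumption: multiple matchers are unioned. Filtering the corpus keeps
// corpus order and never yields the same codebase twice.
function resolveServices(corpus: Codebase[], matchers: ServiceMatcher[]): Codebase[] {
  return corpus.filter((cb) => matchers.some((m) => matchesService(cb, m)));
}

const corpus: Codebase[] = [
  { id: "a", name: "owner/service-api", tags: ["platform:my-app"] },
  { id: "b", name: "owner/service-worker", tags: ["platform:my-app"] },
  { id: "c", name: "owner/docs", tags: [] },
  { id: "d", name: "other/service-api", tags: ["platform:my-app"] },
];

const byTag = resolveServices(corpus, [{ tag: "platform:my-app" }]).map((cb) => cb.id);
const byGlob = resolveServices(corpus, [{ match: "owner/service-*" }]).map((cb) => cb.id);
const mixed = resolveServices(corpus, [
  { names: ["owner/docs"] },
  { match: "owner/service-*" },
]).map((cb) => cb.id);
```

Whether an empty match set should be an error or a no-op is left open in this sketch; `resolveServices` simply returns an empty array.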

3. Per-service worktree isolation

Each service fanout gets its own worktree, layered under the user's configured worktree root:

```
~/.archon/workspaces//worktrees//
├── service-api/ ← worktree of service-api on
├── service-worker/ ← worktree of service-worker on
└── service-web/ ← worktree of service-web on
```

Branch consistency: all services use the same branch name for a given workflow run, matching the intent that these changes are one coherent unit. The `worktree.enabled: false` / `worktree.path` primitives already shipped in #1310 extend naturally per-service.

Failure semantics: per-service success/fail is tracked independently. If service A passes and service B fails, both worktrees are left on disk for user inspection — matches today's single-service failure model, applied per-service.
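A minimal sketch of the fanout and its independent per-service tracking, assuming a hypothetical `runDag(service)` callback that rejects when that service's DAG fails. The `fanOut` and `ServiceResult` names are illustrative, and continuing to the next service after a failure in sequential mode is an assumption drawn from the "one service fails, others finish" test criterion.

```typescript
interface ServiceResult {
  service: string;
  status: "succeeded" | "failed";
  error?: string;
}

// Hypothetical stand-in for "run the whole DAG scoped to one service".
type RunDag = (service: string) => Promise<void>;

function describeError(reason: unknown): string {
  return reason instanceof Error ? reason.message : String(reason);
}

async function fanOut(
  services: string[],
  runDag: RunDag,
  parallel = false, // mirrors the proposed default of `parallel: false`
): Promise<ServiceResult[]> {
  const results: ServiceResult[] = [];

  if (parallel) {
    // allSettled never rejects and preserves input order and length,
    // so one failing service cannot hide the outcome of the others.
    const settled = await Promise.allSettled(services.map((s) => runDag(s)));
    settled.forEach((outcome, i) => {
      const service = services[i]!;
      results.push(
        outcome.status === "fulfilled"
          ? { service, status: "succeeded" }
          : { service, status: "failed", error: describeError(outcome.reason) },
      );
    });
    return results;
  }

  // Sequential: record the failure and move on to the next service.
  for (const service of services) {
    try {
      await runDag(service);
      results.push({ service, status: "succeeded" });
    } catch (err) {
      results.push({ service, status: "failed", error: describeError(err) });
    }
  }
  return results;
}
```

Nothing in this sketch removes worktrees, which matches the stated intent of leaving both passing and failing worktrees on disk for inspection.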

What this unlocks (naturally, without further engine work)

  • Platform-wide refactors driven by AI across N repos
  • Cross-service test coordination (run `bun test` in affected services only)
  • Dependency bump workflows (same version across all services)
  • Parallel per-service PR creation (each service gets its own branch → its own PR → workflow description cross-links them)

What this does NOT do

  • PR cross-linking, same-branch enforcement across services, rollback-A-if-B-fails semantics — all valuable but are polish that depends on this foundation. Separate follow-up issue once phase 1 is in use.
  • Auto-discovery of services from a parent folder — ergonomics layer, tracked separately (see related issue below).
  • Web UI for group / tag management — comes after the CLI path is validated.
  • Adapter routing: #1347 (Adapter/webhook codebase routing: disambiguate when multiple clones match) is the tier that lets a GitHub webhook land on the right clone when multiple matches exist. Required when cross-service workflows get triggered from webhooks.

Implementation notes

  • The DAG executor already handles fanout via topological layers running independent nodes in parallel. "Run per-service" is conceptually a foreach dimension on a node/DAG — the executor needs a new "service context" that scopes cwd, codebase_id, and env per iteration. Rough plan: wrap the per-service iteration at the `executeWorkflow` level, not inside a single node, so every node in the DAG is scoped to one service at a time.
  • The `additionalDirectories` option Claude already supports is how the AI gets read access to sibling services while writing to one — enables "I'm modifying service-api but I can see what service-web expects" scenarios without multi-write complexity.
  • Context-window risk: running workflows with full per-service context across N services will blow provider limits. Per-service `sendQuery` calls (separate sessions, each scoped to one service) are the natural answer; this matches how the DAG executor already runs each node.
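As a rough illustration of the "service context" idea in the notes above, the sketch below builds one context per matched service, exposes sibling worktrees as read-only extra directories, and expands the three proposed variables. The `ServiceContext` and `ResolvedService` shapes and both function names are hypothetical. Pointing `$SERVICE_DIR` at the worktree path is an assumption, and how `additionalDirectories` is actually handed to the provider is not shown.

```typescript
// Hypothetical input: a matched service with its worktree already created.
interface ResolvedService {
  id: string;
  name: string;
  worktreePath: string;
  env: Record<string, string>;
}

// Hypothetical per-iteration scope the executor would apply to every node.
interface ServiceContext {
  codebaseId: string;
  name: string;
  cwd: string; // the one directory this iteration writes to
  env: Record<string, string>;
  additionalDirectories: string[]; // sibling services, intended as read-only
}

function buildContexts(services: ResolvedService[]): ServiceContext[] {
  return services.map((svc) => ({
    codebaseId: svc.id,
    name: svc.name,
    cwd: svc.worktreePath,
    env: { ...svc.env }, // copy so iterations cannot mutate each other's env
    additionalDirectories: services
      .filter((other) => other.id !== svc.id)
      .map((other) => other.worktreePath),
  }));
}

// Expands $SERVICE, $SERVICE_DIR and $SERVICE_ID. The trailing \b keeps
// $SERVICE from matching inside a longer name such as $SERVICES, and the
// function replacer avoids special handling of "$" in substituted values.
function substituteServiceVars(template: string, ctx: ServiceContext): string {
  const values: Record<string, string> = {
    SERVICE: ctx.name,
    SERVICE_DIR: ctx.cwd,
    SERVICE_ID: ctx.codebaseId,
  };
  return template.replace(
    /\$(SERVICE_DIR|SERVICE_ID|SERVICE)\b/g,
    (whole, key: string) => values[key] ?? whole,
  );
}

const contexts = buildContexts([
  { id: "cb_1", name: "service-api", worktreePath: "/wt/service-api", env: {} },
  { id: "cb_2", name: "service-web", worktreePath: "/wt/service-web", env: {} },
]);
```

Each context would then drive one full DAG run, in line with the plan to wrap iteration at the `executeWorkflow` level rather than inside a single node.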

Acceptance criteria

  • `tags: string[]` column on codebases + migration
  • CLI: `archon codebase tag ` / `archon codebase untag `
  • Workflow YAML `services:` field accepting tag / names / glob
  • Executor fanout: N worktrees, one per matched service, DAG runs per-service
  • New substitution vars: `$SERVICE`, `$SERVICE_DIR`, `$SERVICE_ID`
  • Tests: single-service behavior unchanged; multi-service fanout with 3 mock services, both parallel and sequential; partial-failure semantics (one service fails, others finish)
  • Docs: a dedicated `guides/cross-service-workflows.md` covering the three match modes and the worktree layout

Related

cc @halindrome: you're the one driving the microservice-support direction via #1236 and a natural owner for phase 1 if you're interested. The foundation is your composite identity; this is the layer that makes it usable for platform-scale workflows.

Labels: P2, architecture, area: workflows, effort/high, feature