bug(isolation/git): worktree sync fails 403 on org repos — per-codebase GH_TOKEN never reaches git fetch subprocess #1469

@133Felix

Symptom

Multi-org Archon deployment (single process.env.GH_TOKEN covering personal repos, per-codebase GH_TOKEN rows in remote_agent_codebase_env_vars for org repos). Direct-chat operations against an org repo work fine — the orchestrator loads per-codebase env vars and injects them into the AI provider subprocess.

But any workflow dispatch fails before the workflow run is even registered in the DB. User-facing message: the generic "An unexpected error occurred. Try /reset to start a fresh session." (Issue #1237 / error-formatter masking — separate). Server log:

@archon/server: orchestrator_message_failed
err: Error: Failed to fetch base branch from origin: Sync fetch from origin/main failed:
  Command failed: git -C /.archon/workspaces/<org>/<repo>/source fetch origin main
  remote: Write access to repository not granted.
  fatal: unable to access 'https://github.com/<org>/<repo>.git/': The requested URL returned error: 403
stack:
  at syncWorkspaceBeforeCreate (/app/packages/isolation/src/providers/worktree.ts:810)
  at async createWorktree (/app/packages/isolation/src/providers/worktree.ts:709)
  at async create (/app/packages/isolation/src/providers/worktree.ts:154)
  at async createNewEnvironment (/app/packages/isolation/src/resolver.ts:468)

GitHub's "Write access to repository not granted" message in this context means no read access either — the global process.env.GH_TOKEN (personal PAT) is being used, and it has zero scope on the org repo.

Root Cause

packages/git/src/repo.ts:104 (the syncWorkspaceBeforeCreate path that runs before every workflow run as part of worktree creation):

await execFileAsync('git', ['-C', workspacePath, 'fetch', 'origin', branchToSync], {
  timeout: 60000,
});

No env parameter. The spawned git fetch subprocess inherits whatever process.env the Archon server itself was started with — i.e. the Docker container's /opt/archon/.env GH_TOKEN. The per-codebase token from remote_agent_codebase_env_vars (loaded in orchestrator-agent.ts:843) is never threaded into this code path.
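The inheritance behaviour described above can be reproduced in isolation. The following is a minimal sketch, not Archon code: `DEMO_GH_TOKEN` and `childSees` are hypothetical names, and the synchronous `execFileSync` is used only to keep the demo short (the `env` option behaves the same way for `execFile`).

```typescript
import { execFileSync } from 'node:child_process';
import type { ExecFileSyncOptionsWithStringEncoding } from 'node:child_process';

// Script run inside the child process: print one variable, or a marker if unset.
const PRINT_VAR = "process.stdout.write(process.env.DEMO_GH_TOKEN ?? '<unset>')";

/**
 * Spawn a child Node process and report which DEMO_GH_TOKEN it sees.
 * When `env` is omitted, the child inherits the parent's process.env,
 * which is the situation the git fetch call above is in.
 */
export function childSees(env?: NodeJS.ProcessEnv): string {
  const options: ExecFileSyncOptionsWithStringEncoding = { encoding: 'utf8' };
  if (env !== undefined) {
    options.env = env; // an explicit env is used as the child's environment
  }
  return execFileSync(process.execPath, ['-e', PRINT_VAR], options);
}
```

With `process.env.DEMO_GH_TOKEN = 'personal'` set in the parent, `childSees()` reports the parent's value, while `childSees({ ...process.env, DEMO_GH_TOKEN: 'org' })` reports the override. Nothing in between can change what the child sees unless the call site passes `env`.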

This holds structurally for every git operation in @archon/git and @archon/isolation: see also repo.ts:240 (git fetch origin) and various call sites in worktree.ts and branch.ts. None of them receives an env parameter.

Reproduction

  1. Register two codebases in Archon — one personal (e.g. your-username/something) and one in an org you don't have a global PAT for (e.g. some-org/some-repo).
  2. Set the global GH_TOKEN in container .env = personal PAT (covers personal but not org).
  3. Set per-codebase env var GH_TOKEN for the org codebase via PUT /api/codebases/<org-id>/env = an org-scoped PAT.
  4. From a chat platform, bind a conversation to the org codebase (currently this requires either /invoke-workflow ... --project <org-name> or a manual DB UPDATE; see #1044, "feat: /setproject command to bind codebase to conversation").
  5. Direct chat works (orchestrator routes per-codebase env into Claude/Codex subprocess).
  6. Trigger any workflow that needs isolation (i.e. worktree.enabled: true, the default). The dispatch fails with 403 before the workflow run is recorded.

Why this matters

Files Affected (audit pass needed)

Initial scan, all candidates that spawn git and don't accept/forward an env:

  • packages/git/src/repo.ts:104: git fetch origin <branch> in syncWorkspaceBeforeCreate (the call that surfaces this bug)
  • packages/git/src/repo.ts:240: git fetch origin in another sync path
  • packages/git/src/worktree.ts: worktree create may invoke git fetch indirectly
  • packages/git/src/repo.ts (clone): git clone for fresh codebase registration
  • packages/git/src/branch.ts: branch operations that may push/fetch
  • packages/isolation/src/providers/worktree.ts:810: wraps the failing call

A grep for execFileAsync('git' in @archon/git and @archon/isolation is the right starting point.
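For the audit pass, a plain grep lists every call site but does not say which ones already forward an env. A rough helper along these lines could narrow the list. It is a heuristic sketch only: `findGitCallsWithoutEnv` and `GitCallSite` are hypothetical names, parentheses inside string literals can confuse the matching, and any mention of `env` inside the call (including `process.env`) counts as "forwards env".

```typescript
export interface GitCallSite {
  line: number;    // 1-based line where the call starts
  snippet: string; // first line of the call, for display
}

/** Report execFileAsync('git', ...) calls whose argument text never mentions `env`. */
export function findGitCallsWithoutEnv(source: string): GitCallSite[] {
  const needle = "execFileAsync('git'";
  const hits: GitCallSite[] = [];
  let from = 0;

  while (true) {
    const start = source.indexOf(needle, from);
    if (start === -1) break;

    // Walk forward to the parenthesis that closes this call.
    let depth = 0;
    let end = source.length - 1; // fallback if the call is never closed
    for (let i = start; i < source.length; i++) {
      const ch = source[i];
      if (ch === '(') depth++;
      else if (ch === ')' && --depth === 0) {
        end = i;
        break;
      }
    }

    const call = source.slice(start, end + 1);
    if (!/\benv\b/.test(call)) {
      hits.push({
        line: source.slice(0, start).split('\n').length,
        snippet: call.split('\n')[0] ?? '',
      });
    }
    from = end + 1;
  }
  return hits;
}
```

Running it over the contents of each file in packages/git/src and packages/isolation/src would yield a candidate list to review by hand; it is not a substitute for reading the call sites.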

Proposed Fix

Make every git operation function that talks to a remote take an optional env: Record<string, string> parameter, forward it to execFileAsync(..., { ..., env }), and have callers (worktree-provider, isolation-resolver, etc.) load per-codebase env from getCodebaseEnvVars(codebaseId) and pass it through.
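A minimal sketch of what that could look like for the fetch call, under stated assumptions: `mergeGitEnv` and `fetchBranch` are hypothetical names, not existing @archon/git exports, and the sketch assumes (as the symptom above implies) that git in the container picks up GH_TOKEN from the subprocess environment. One detail worth calling out: Node uses the `env` option as the child's whole environment, so the per-codebase vars need to be overlaid on process.env rather than passed alone, otherwise PATH, HOME and similar are lost.

```typescript
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

type EnvMap = Record<string, string | undefined>;

/** Overlay per-codebase vars on a base environment; per-codebase values win. */
export function mergeGitEnv(base: EnvMap, overrides: Record<string, string> = {}): EnvMap {
  return { ...base, ...overrides };
}

/**
 * Hypothetical variant of the fetch in repo.ts that accepts an optional env.
 * With no env argument it behaves like the current code (inherits process.env).
 */
export async function fetchBranch(
  workspacePath: string,
  branchToSync: string,
  env?: Record<string, string>,
): Promise<void> {
  await execFileAsync('git', ['-C', workspacePath, 'fetch', 'origin', branchToSync], {
    timeout: 60000,
    env: mergeGitEnv(process.env, env),
  });
}
```

Callers would then load the per-codebase vars (for example via getCodebaseEnvVars(codebaseId), whose exact signature is not shown in this issue) and pass the result as the third argument. Keeping the parameter optional means existing call sites continue to compile and behave as before until they are migrated.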

This is essentially Phase A scaffolding of #1467 — but scoped to the bug (correctness fix), not the architectural overhaul (askpass-shim / GitHub App). Phase A in #1467 wants GIT_ASKPASS, which is a more thorough security improvement; this issue is about not 403-ing in the first place.

Workaround Today

Embed the org PAT into the remote URL of the source clone inside the container:

docker exec -i archon-app-1 sh -c '
  read T &&
  git -C /.archon/workspaces/<org>/<repo>/source remote set-url origin \
    "https://x-access-token:${T}@github.com/<org>/<repo>.git"
' < <(psql -At "$DATABASE_URL" -c "
  SELECT value FROM remote_agent_codebase_env_vars
   WHERE codebase_id='<codebase-uuid>' AND key='GH_TOKEN' LIMIT 1;
")

Trade-offs:

  • Token now stored in cleartext in .git/config — security risk if container shell access is broader than DB access
  • Per-codebase, manual, doesn't survive git remote reconfiguration
  • Has to be repeated on every PAT rotation
  • Doesn't help when Archon clones a new codebase (the same code path runs without env)

Related

Out of Scope

Metadata

Labels: P2 (Medium priority - Backlog, when time permits); area: isolation (Worktree isolation provider); bug (Something is broken)