security: zero env leaks from managed repos into provider subprocesses #1135

@Wirasm

Problem

Archon manages many target repos/codebases. Those repos may contain .env files with API keys, secrets, and credentials.

The policy for target repo env should be simple:

Target repo/project env must never be loaded into the Archon process, and must never be forwarded to Archon provider subprocesses.

This means:

  • ~/.archon/.env is trusted Archon-owned config and may be loaded
  • the Archon repo .env may be loaded in dev/source mode
  • target repo/project .env files are never loaded into Archon
  • target repo/project .env values are never inherited by provider subprocesses
  • there is no consent / opt-in / bypass path for target repo env

Important distinction:

  • ambient target repo env is forbidden
  • explicit Archon-managed env is still allowed

Allowed explicit env sources:

  • ~/.archon/.env
  • Archon dev/source .env
  • .archon/config.yaml env: values
  • per-codebase env vars stored in Archon DB and injected intentionally

The current design still carries a consent model (allow_env_keys, allow_target_repo_keys, --allow-env-keys), which is the wrong primitive for the intended security boundary.

Required Invariant

At all times:

  • process.env must not contain env vars sourced from any managed target repo
  • provider subprocess env must be built from trusted Archon-owned env only
  • repo registration must not create a path where target repo env becomes allowed later
  • cross-repo contamination must be impossible in long-lived Archon processes
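
One way to make the first two invariants checkable is a small audit helper. The following is a minimal sketch, not Archon's actual code: `findRepoEnvLeaks` and its parameters are hypothetical names, and it assumes `trusted` already contains everything Archon legitimately owns (so any value that differs from it under a repo-defined key is treated as a leak).

```typescript
type Env = Record<string, string | undefined>;

/**
 * Hypothetical audit helper. Returns the keys in `actual` that appear to come
 * from a target repo's .env: the key is defined in `repoEnv`, is present in
 * `actual`, and its value is not the one trusted Archon-owned env provides.
 */
function findRepoEnvLeaks(actual: Env, trusted: Env, repoEnv: Env): string[] {
  const leaks: string[] = [];
  for (const key of Object.keys(repoEnv)) {
    const value = actual[key];
    if (value === undefined) continue; // key absent: nothing leaked
    if (value === trusted[key]) continue; // value is Archon-owned
    leaks.push(key);
  }
  return leaks.sort();
}
```

Such a helper could run in tests or at spawn time against both `process.env` and the env object handed to a provider subprocess; an empty result is the expected state.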

Business Logic Summary

This issue is not primarily about implementation details. It is about the product rule Archon should enforce.

Plain-language rule

When a user points Archon at a target repo:

  • Archon may inspect the repo
  • Archon may run tools in the repo
  • Archon may use Archon-managed credentials and config
  • Archon must not silently inherit that repo's own .env secrets

In business terms:

  • a target repo's secrets belong to that repo/application, not to Archon
  • Archon should never bill the wrong API account because it picked up a repo's .env
  • Archon should never “sometimes allow it if the user clicks consent”
  • the behavior should be deterministic and easy to explain:
    • if the repo has sensitive keys in auto-loaded .env files, Archon refuses to use that repo until the user removes or relocates those keys

Current vs desired business behavior

| Surface | Current behavior | Desired behavior |
|---|---|---|
| Repo registration (/clone, Add Project, CLI register) | Scan repo, block by default, but allow override/consent | Scan repo, block if sensitive keys are found, no override |
| Provider spawn / workflow execution | Scan again, but skip the check if consent was granted earlier | Always block if sensitive keys are present in auto-loaded repo .env files |
| Archon process startup in repo cwd | Boot cleanup strips leaked repo env after Bun auto-load | Keep this protection; Archon process must stay clean |
| Provider subprocess env | Nested-session/debugger markers are sanitized, but repo-env policy still relies on separate gating | Keep sanitization, but never allow target repo .env to become acceptable |
| Product policy shown to users | “You can acknowledge risk and allow env keys” | “Target repo .env secrets are not allowed; remove or relocate them” |

Stakeholder-level decision

This issue should resolve one product question clearly:

Should Archon ever allow a target repo's ambient .env secrets to be used by Archon or its provider subprocesses?

The answer for this issue is:

No. Never.

That means the system should stop modeling this as a consent workflow and instead model it as a hard safety boundary.

What Must Be Fixed

1. Remove the consent model

These primitives should be removed rather than hardened:

  • codebase-level allow_env_keys
  • config-level allow_target_repo_keys
  • CLI --allow-env-keys
  • UI/API flows that grant env-key consent
  • remediation copy that tells users to allow or bypass repo env leakage

Registration-time scanning can remain, but only as:

  • detection
  • refusal / warning / audit

Not as a consent gate.
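
A detection-only scan could look roughly like the sketch below. It is illustrative only: `scanRepoEnv`, `parseEnvKeys`, `EnvScanResult`, and the pattern list are hypothetical and do not reflect the contents of env-leak-scanner.ts. The point is the shape of the API: the function takes the file content and nothing else, so there is no flag through which a caller could grant consent.

```typescript
// Hypothetical patterns for illustration; the real scanner's key list may differ.
const SENSITIVE_KEY_PATTERNS: RegExp[] = [/API_KEY$/, /SECRET/, /TOKEN$/, /PASSWORD/];

interface EnvScanResult {
  blocked: boolean;
  sensitiveKeys: string[];
}

/** Extracts variable names from dotenv-style text (KEY=value, optional `export`). */
function parseEnvKeys(content: string): string[] {
  const keys: string[] = [];
  for (const line of content.split(/\r?\n/)) {
    // Comment lines start with '#', which cannot begin an identifier, so they never match.
    const match = /^\s*(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    const key = match?.[1];
    if (key !== undefined) keys.push(key);
  }
  return keys;
}

/** Detection and refusal only: no override or consent parameter exists. */
function scanRepoEnv(content: string): EnvScanResult {
  const sensitiveKeys = parseEnvKeys(content).filter((key) =>
    SENSITIVE_KEY_PATTERNS.some((pattern) => pattern.test(key)),
  );
  return { blocked: sensitiveKeys.length > 0, sensitiveKeys };
}
```

Registration and spawn paths would both call the same function and refuse when `blocked` is true, which keeps the behavior deterministic across surfaces.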

2. Treat target repo env as untrusted input, never trusted config

Archon should only trust:

  • ~/.archon/.env
  • Archon dev/source env
  • explicit Archon config values (.archon/config.yaml env: and DB env vars)

It should never treat target repo .env as valid runtime input for either:

  • the Archon process
  • Claude/Codex/future provider subprocesses

3. Harden subprocess env construction

Provider subprocess env must not be { ...process.env } unless process.env is already proven clean and the allowed key set is explicit.

The safer direction is:

  • start from trusted Archon-owned env
  • sanitize nested-session/debugger markers
  • explicitly merge only Archon-controlled additions
  • never merge target repo env
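
The four steps above can be sketched as follows. This is a hedged illustration rather than a proposed patch to buildSubprocessEnv(): `buildTrustedSubprocessEnv`, `SubprocessEnvInputs`, and the field names are hypothetical, the marker key names are supplied by the caller because the issue does not list them, and the precedence of DB env over config.yaml env is an assumption.

```typescript
interface SubprocessEnvInputs {
  /** Trusted Archon-owned env (~/.archon/.env plus Archon dev/source env). */
  trustedEnv: Readonly<Record<string, string | undefined>>;
  /** Nested-session/debugger marker keys to strip (names are caller-supplied). */
  markerKeys: readonly string[];
  /** Values from .archon/config.yaml env: */
  configEnv?: Readonly<Record<string, string>>;
  /** Per-codebase env vars stored in the Archon DB. */
  codebaseEnv?: Readonly<Record<string, string>>;
}

function buildTrustedSubprocessEnv(inputs: SubprocessEnvInputs): Record<string, string> {
  // 1. Start from trusted Archon-owned env only; process.env is never read here.
  const env: Record<string, string> = {};
  for (const key of Object.keys(inputs.trustedEnv)) {
    const value = inputs.trustedEnv[key];
    if (value !== undefined) env[key] = value;
  }
  // 2. Sanitize nested-session/debugger markers.
  for (const marker of inputs.markerKeys) delete env[marker];
  // 3. Merge only Archon-controlled additions (assumed: DB env wins over config env).
  // 4. There is no parameter through which target repo env could be merged.
  return { ...env, ...inputs.configEnv, ...inputs.codebaseEnv };
}
```

Because the function has no access to ambient state, its output depends only on inputs Archon controls, which is what makes the subprocess env auditable.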

4. Cover late registration and long-lived process cases

Boot-time stripping is not sufficient by itself.

Audit and harden all cases where the process outlives repo registration or repo switching:

  • late-registered codebases
  • /clone
  • Web UI project registration
  • CLI invocation from target-repo cwd
  • long-lived server process handling multiple repos
  • provider subprocesses spawned after multiple repo transitions
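
For the long-lived cases, one possible approach is to capture the trusted env once and derive every spawn's env from that frozen snapshot instead of from live `process.env`. The sketch below assumes this design; `TrustedEnvStore` and its methods are hypothetical names, not existing Archon APIs.

```typescript
/**
 * Hypothetical store: holds a frozen snapshot of trusted Archon env plus
 * per-codebase Archon-managed env, keyed by codebase id.
 */
class TrustedEnvStore {
  private readonly base: Readonly<Record<string, string>>;
  private readonly perCodebase = new Map<string, Readonly<Record<string, string>>>();

  constructor(trustedBase: Record<string, string>) {
    // Copy and freeze so later mutation of the caller's object cannot leak in.
    this.base = Object.freeze({ ...trustedBase });
  }

  /** Registers Archon-managed env for a codebase; safe to call long after boot. */
  setCodebaseEnv(codebaseId: string, env: Record<string, string>): void {
    this.perCodebase.set(codebaseId, Object.freeze({ ...env }));
  }

  /** Returns a fresh object per spawn: base snapshot plus this codebase's env only. */
  envFor(codebaseId: string): Record<string, string> {
    return { ...this.base, ...(this.perCodebase.get(codebaseId) ?? {}) };
  }
}
```

Under this design a late-registered codebase goes through the same `envFor` path as one registered at boot, and a subprocess for repo A cannot see keys registered for repo B because each call returns a new object built from only two sources.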

Audit Findings

The current codebase still encodes a consent/bypass model in multiple places:

  • registration path:
    • packages/core/src/handlers/clone.ts
    • supports per-call bypass via allowEnvKeys
    • supports global bypass via merged allowTargetRepoKeys
  • scanner/remediation copy:
    • packages/core/src/utils/env-leak-scanner.ts
    • instructs users to re-run with --allow-env-keys, toggle UI consent, or set global bypass
  • config model:
    • packages/core/src/config/config-types.ts
    • global and repo-level allow_target_repo_keys
  • config loading:
    • packages/core/src/config/config-loader.ts
    • merges global and repo-level bypass into effective allowTargetRepoKeys
  • DB model:
    • remote_agent_codebases.allow_env_keys
    • represented in code at packages/core/src/db/codebases.ts
  • API/UI surface:
    • packages/server/src/routes/api.ts
    • PATCH /api/codebases/{id} is explicitly a consent-flag route
    • packages/server/src/routes/schemas/codebase.schemas.ts exposes allow_env_keys
  • CLI surface:
    • packages/cli/src/cli.ts
    • exposes --allow-env-keys
  • startup behavior:
    • packages/server/src/index.ts
    • startup scan is currently skipped when global bypass is enabled
  • provider subprocess env construction:
    • packages/providers/src/claude/provider.ts
    • buildSubprocessEnv() still returns { ...process.env }

This issue should remove the consent model end-to-end, not just harden one call site.

Acceptance Criteria

  • No target repo/project .env keys are ever present in Archon process.env
  • No target repo/project .env keys are ever present in provider subprocess env
  • Cross-repo contamination is impossible: a subprocess for repo A has zero repo-env keys from repo B
  • Late-registered repos are as safe as boot-time repos
  • Consent/bypass primitives for target repo env are removed
  • allow_env_keys is removed from DB/API/types
  • allow_target_repo_keys is removed from config/types/loading
  • --allow-env-keys is removed from the CLI
  • Registration and scan flows no longer instruct users to allow or bypass repo env leakage
  • Env protection is enforced by platform/bootstrap/runtime boundaries, not by individual providers
  • Audit covers all provider call sites and subprocess env construction paths
  • Explicit Archon-managed env injection still works:
    • .archon/config.yaml env:
    • DB-managed per-codebase env vars

Non-Goals

  • Allowing repo env with user consent
  • Per-codebase env exemptions
  • Global YAML bypass for repo env leakage
