Skip to content

feat(core): swap shared GitHub PAT for GitHub App with multi-installation routing#1788

Merged
Wirasm merged 4 commits into
devfrom
feat/github-app-for-bot
May 28, 2026
Merged

feat(core): swap shared GitHub PAT for GitHub App with multi-installation routing#1788
Wirasm merged 4 commits into
devfrom
feat/github-app-for-bot

Conversation

@Wirasm

@Wirasm Wirasm commented May 28, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Problem: Archon's bot identity is a single shared GITHUB_TOKEN PAT. One revocation kills everyone, comments all post under one operator's avatar, no auto-rotation, no multi-org support, and webhooks need per-repo setup.
  • Why it matters: Phase 2 of the team-foundation PRD. Without this, every team-oriented feature downstream stays blocked on shared identity. PR-A (feat(core): plumb user_id from chat/forge adapters through orchestrator and workflow runs #1783) landed the user_id plumbing — PR-B replaces the bot identity.
  • What changed: New @archon/core/github-auth/ module wraps @octokit/auth-app with a two-level cache (owner/repo → installationId, installationId → token, refresh-on-access with a 5-min buffer). GitHubAdapter takes a GitHubAuth discriminated union ({kind:'pat'} | {kind:'app'}); the 4 Octokit callsites + the clone token embed route through per-(owner, repo) resolveOctokit + a withTokenRefresh wrapper that evicts on 401 and retries once. Server bootstrap fails fast if both modes are configured. New POST /internal/git-credential endpoint (App mode only) backs a POSIX git credential helper installed at clone time so workflows that outlive the 1h token can refresh in place.
  • What did NOT change (scope boundary): Per-user GitHub tokens (device flow), requires:[github] YAML gate, interim web UI auth, CLI archon auth github — all PR-C. GitHub Enterprise Server, installation lifecycle webhook events, live integration tests against api.github.com, token-encryption-key rotation tooling — out of scope per PRD.

UX Journey

Before

Operator setup:
  1. Generate personal PAT on github.com
  2. Set GITHUB_TOKEN=<PAT> in .env
  3. Per repo: configure webhook → https://archon/webhooks/github
  4. Per repo: add WEBHOOK_SECRET

Bot post-comment runtime:
  Webhook ──▶ verifySig (HMAC) ──▶ Octokit({auth: GITHUB_TOKEN}) ──▶ posts under PAT owner's avatar
  cloneRepository({token: GITHUB_TOKEN}) ──▶ remote URL has embedded token

After

Operator setup (one-time):
  1. Register a GitHub App ("Archon Bot"); copy App ID + private key PEM
  2. Configure ONE webhook URL on the App → https://archon/webhooks/github
  3. Install the App on each org/personal account
  4. Set GITHUB_APP_ID, GITHUB_APP_PRIVATE_KEY[_PATH], GITHUB_APP_SLUG, WEBHOOK_SECRET

Solo install (UNCHANGED): GITHUB_TOKEN + WEBHOOK_SECRET → PAT mode preserved

Bot post-comment runtime (App mode):
  Webhook ──▶ verifySig ──▶ primeInstallationLookup(owner, repo, event.installation.id) [skips one HTTP]
          ──▶ resolveOctokit(owner, repo) → per-install Octokit (cached)
          ──▶ posts as <slug>[bot] with the App badge

  401? ──▶ invalidateToken(installationId) ──▶ retry once with fresh token
  Token expiry? ──▶ refreshed transparently 5min ahead of 1h expiry

  cloneRepository({token: getInstallationToken(owner, repo)})
                  ──▶ post-clone: installCredentialHelper(worktreePath)
                                  for >1h workflow refresh via POST /internal/git-credential

Architecture Diagram

Before

                  ┌─────────────┐
                  │  process.env│
                  │  GITHUB_TOKEN│
                  └──────┬──────┘
                         │
              ┌──────────┴──────────┐
              ▼                     ▼
   ┌────────────────────┐   ┌────────────────┐
   │  GitHubAdapter     │   │ cloneRepository│
   │  Octokit (1 inst)  │   │  URL embed     │
   └─────────┬──────────┘   └────────────────┘
             │
             ▼
   ┌────────────────────┐
   │  api.github.com    │
   └────────────────────┘

After

        ┌────────────────────────────────────────────┐
        │  packages/core/src/github-auth/  [+]       │
        │  ┌──────────────────────────────────────┐  │
        │  │ createGitHubAppAuthProvider()        │  │
        │  │  • lookupCache: owner/repo→install   │  │
        │  │  • tokenCache:  install→token (1h)   │  │
        │  │  • octokitCache per installation     │  │
        │  └──────────┬───────────────────────────┘  │
        └─────────────┼──────────────────────────────┘
                      │                            ▲
        ===           ▼                            │ ===
   ┌────────────────────────┐         ┌────────────┴──────────────┐
   │ GitHubAdapter  [~]     │         │ server bootstrap  [~]     │
   │  • auth: GitHubAuth    │◀────────┤  • dual-mode env detect   │
   │  • resolveOctokit      │         │  • fail-fast on conflict  │
   │  • withTokenRefresh    │         │  • register on singleton  │
   │  • botLogin getter     │         └──────────────┬────────────┘
   │  • <slug>[bot] filter  │                        │
   └────┬─────────┬─────────┘                        │ ===
        │         │                                   ▼
        │         │ ===                  ┌──────────────────────────┐
        │         └────────────────────▶ │ store-adapter  [~]       │
        │                                │  registerGitHubAppAuth() │
        │                                │  createWorkflowDeps()    │
        │ ===                            │  resolveBotGitHubToken?  │
        ▼                                └──────────┬───────────────┘
   ┌────────────────────┐                            │ ===
   │ /internal/         │                            ▼
   │ git-credential [+] │                ┌─────────────────────────┐
   │ (App mode only)    │                │ workflow executor  [~]  │
   └──────────┬─────────┘                │  GH_TOKEN injection at  │
              │                          │  workflow start         │
              ▼                          └─────────────────────────┘
   ┌────────────────────┐
   │ scripts/git-       │
   │ credential-archon  │
   │ .sh  [+]           │
   └────────────────────┘

Connection inventory:

From To Status Notes
server bootstrap loadAppPrivateKey() new PEM resolution from env (inline / file)
server bootstrap createGitHubAppAuthProvider() new App-mode only
server bootstrap registerGitHubAppAuthProvider() new Wires provider into workflows
server bootstrap GitHubAdapter constructor modified GitHubAuth discriminated union (was token: string)
GitHubAdapter IGitHubAppAuthProvider new App-mode runtime dependency
GitHubAdapter (4 Octokit sites) withTokenRefreshresolveOctokit modified Was direct this.octokit.rest.X
GitHubAdapter.handleWebhook primeInstallationLookup new Short-circuits one HTTP per webhook
GitHubAdapter.ensureRepoReady getInstallationToken new App-mode clone; PAT mode unchanged
GitHubAdapter.ensureRepoReady installCredentialHelper new App-mode post-clone hook
store-adapter.createWorkflowDeps() IGitHubAppAuthProvider.getInstallationToken new Via optional WorkflowDeps.resolveBotGitHubToken
workflows/executor.executeWorkflow deps.resolveBotGitHubToken new Injects GH_TOKEN/GITHUB_TOKEN at workflow start
Hono POST /internal/git-credential GitHubAdapter.getInstallationToken new Backs credential helper
git-credential-archon.sh POST /internal/git-credential new Refreshes installation token for >1h worktrees

Label Snapshot

  • Risk: risk: medium
  • Size: size: L
  • Scope: core (github-auth module is the heart); also touches adapters, server, workflows, docs
  • Module: core:github-auth, adapters:github, server:bootstrap, workflows:executor

Change Metadata

  • Change type: feature (also touches refactor for the adapter callsites)
  • Primary scope: multicore (new module + workflow injection wire-up), adapters (GitHubAdapter dual-mode), server (bootstrap + internal endpoint), workflows (subprocess env injection)

Linked Issue

Validation Evidence (required)

bun run type-check    # ✓ all 10 packages clean
bun run lint          # ✓ --max-warnings 0
bun run format:check  # ✓ Prettier clean
bun run test          # ✓ every package green
bun run validate      # ✓ exit 0
  • Evidence provided (test/log/trace/screenshot): all six bun run validate stages exit 0 locally; 15 new strictly-mocked auth-module tests (packages/core/src/github-auth/auth.test.ts); 8 new App-mode adapter tests + all 54 existing PAT-mode tests passing (packages/adapters/src/forge/github/adapter.test.ts). No live api.github.com calls in CI per PRD Q7.
  • If any command is intentionally skipped, explain why: integration tests against api.github.com are deferred per PRD Q7 (test GitHub App + CI secrets out of scope for v1).

Security Impact (required)

  • New permissions/capabilities? Yes — App-mode operators must register a GitHub App and grant Contents:Read, Issues:RW, PullRequests:RW. The App permission surface is narrower than a PAT (no user-wide scope). Documented in packages/docs-web/.../github-app-setup.md.
  • New external network calls? Yes@octokit/auth-app makes a JWT-signed call to GET /repos/{owner}/{repo}/installation (cached 1h) and POST /app/installations/{id}/access_tokens (cached, refreshed 5min before 1h expiry). Both are standard App-auth flows.
  • Secrets/tokens handling changed? Yes — installation access tokens are now memory-only (1h lifetime), not env. The private key PEM remains in env / on disk (operator's choice). No tokens persisted to disk.
  • File system access scope changed? Yes (limited) — installCredentialHelper() copies scripts/git-credential-archon.sh to ~/.archon/bin/ and writes a per-worktree git config credential.https://github.com.helper entry. Idempotent; only on App-mode clone.
  • /internal/git-credential endpoint hands out live installation tokens. MUST be exposed on 127.0.0.1 only OR the reverse proxy MUST drop /internal/*. Server emits a startup WARN (github_app.internal_endpoint_exposed_check_reverse_proxy) if bound to a non-loopback interface with App mode active. Documented in CLAUDE.md API endpoints section and the operator setup guide.

Compatibility / Migration

  • Backward compatible? Yes — PAT-mode behaviour is byte-identical for solo installs. All 54 existing PAT-mode adapter tests pass unchanged.
  • Config/env changes? Yes — new optional env vars: GITHUB_APP_ID, GITHUB_APP_PRIVATE_KEY (or _PATH), GITHUB_APP_SLUG, GITHUB_APP_INSTALLATION_ID. Documented in .env.example + github-app-setup.md. GITHUB_TOKEN continues to work.
  • Database migration needed? No — no schema changes (PR-A landed schema; PR-B is pure auth).
  • Fail-fast guardrail: server refuses to start if both GITHUB_APP_ID and GITHUB_TOKEN are set. Operators must pick one mode explicitly.
  • If yes, exact upgrade steps: see packages/docs-web/src/content/docs/adapters/github-app-setup.md Step 6.

Human Verification (required)

What was personally validated beyond CI:

  • Verified scenarios:
    • bun run validate green end-to-end (all 6 stages).
    • 15 auth-module unit tests cover: PEM loading (inline, file, \n normalisation, missing-env, bad-PEM); token cache (fresh hit, refresh-near-expiry, default-installation-id skip); install lookup (cache, case-insensitive, 404→AppNotInstalledError); priming; invalidation; per-install Octokit memoisation; multi-install Octokit distinctness.
    • 8 App-mode adapter tests cover: per-(owner, repo) Octokit routing, multi-install resolution, event.installation.id priming, 401→invalidate+retry, <slug>[bot] self-filter, clone-token resolution (vs env), AppNotInstalledError surfacing, credential-helper install attempted post-clone.
  • Edge cases checked:
    • Both modes configured → server refuses to start.
    • Missing PEM file → AppPrivateKeyError at bootstrap, not at first webhook.
    • PEM with literal \n (the .env-quoted convention) → normalised on load.
    • Same (owner, repo) from two different installations → routed to distinct tokens.
    • event.installation absent (older / replayed webhook) → falls back to lookup path.
    • PAT mode + this PR → all 54 pre-existing adapter tests pass.
  • What was not verified: live api.github.com integration (deferred per PRD Q7). The installCredentialHelper flow with an actual >1h workflow wasn't end-to-end exercised — covered by unit-level test for the post-clone git config invocation, but the real-world refresh path needs a long-running workflow on a deployed instance.

Side Effects / Blast Radius (required)

  • Affected subsystems/workflows: GitHub adapter (the only forge adapter touched); server bootstrap; workflow executor (subprocess env, only when App mode active); webhook ingestion path.
  • Potential unintended effects:
    • /internal/git-credential exposed beyond loopback would hand out live tokens — guarded by startup WARN and operator documentation.
    • installCredentialHelper writes per-worktree git config; idempotent and scoped to credential.https://github.com.helper so it doesn't collide with operator-set helpers.
    • GH_TOKEN injected at workflow start has 1h TTL — workflows >1h fall through to the credential helper for git push; gh commands hitting the API after expiry would 401 (operators handle via the helper or by keeping workflows short).
  • Guardrails/monitoring for early detection:
    • Startup logs: github.adapter_mode_app vs github.adapter_mode_pat make the active mode obvious.
    • Per-event logs: github_auth.install_lookup_completed, github_auth.token_resolve_completed, github_auth.token_cache_evicted_on_401 give visibility into token lifecycle.
    • github_app.internal_endpoint_exposed_check_reverse_proxy startup WARN flags misconfigured binds.

Rollback Plan (required)

  • Fast rollback command/path: unset the GITHUB_APP_* env vars, set GITHUB_TOKEN back, restart Archon. Adapter snaps back to PAT mode. No data migration to undo (no schema changes).
  • Feature flags or config toggles (if any): the App-mode env vars themselves act as the toggle — App mode is opt-in via env. Removing them returns to legacy behaviour.
  • Observable failure symptoms:
    • Webhook handler emitting github_auth.token_resolve_failed or github_auth.install_lookup_failed repeatedly.
    • Bot comments not appearing under <slug>[bot].
    • Clones failing with AppNotInstalledError.
    • 401 storm in adapter logs.

Risks and Mitigations

  • Risk: /internal/git-credential exposed beyond loopback hands out live installation tokens.
    • Mitigation: startup WARN when server binds non-loopback with App mode active; documented Caddy/nginx pattern in github-app-setup.md; CLAUDE.md API table flags it as 127.0.0.1-only.
  • Risk: PEM private-key newline mangling in .env files.
    • Mitigation: loadAppPrivateKey supports both literal newlines and the \n escape; two unit tests cover both paths.
  • Risk: Installation lookup cache goes stale (App uninstalled → reinstalled with a new id).
    • Mitigation: 1h TTL on lookup cache; 401 → invalidateToken + single retry surfaces AppNotInstalledError cleanly if the install really is gone.
  • Risk: Workflows that AI-write git push after the 1h URL-embedded token expires.
    • Mitigation: installCredentialHelper post-clone in App mode; POST /internal/git-credential returns a fresh token on demand.
  • Risk: Operator confused about active mode, sets both env sets.
    • Mitigation: fail-fast on both configured, with a clear "pick one" error message at startup.
  • Risk: @octokit/auth-app v8 ESM-only — Bun import resolution.
    • Mitigation: @octokit/rest v22 (the matching peer, already used in @archon/adapters) is also ESM-only; the same path works. Verified locally via the new auth tests.

Summary by CodeRabbit

  • New Features

    • GitHub App auth mode with per-installation tokens, automatic refresh on 401, and internal endpoint + git credential helper for long-running pushes.
  • Configuration

    • .env template documents mutually-exclusive PAT vs App modes, required App vars (app id, private key path/inline, slug, optional installation id), and a loopback-binding opt-in escape hatch; server now fails fast on conflicting settings.
  • Documentation

    • New App-mode setup, migration, and security guidance including webhook and internal-endpoint recommendations.
  • Tests

    • Expanded test coverage for App-mode auth, token caching, and credential helper behavior.

Review Change Stack

…tion routing

Adds a new @archon/core/github-auth/ module wrapping @octokit/auth-app with
a two-level cache (owner/repo → installationId, installationId → token) and
refresh-on-access semantics — no background timers, no leaked handles.

GitHubAdapter now takes a `GitHubAuth` discriminated union at construction
time. All 4 Octokit callsites (postComment, listComments, repos.get,
pulls.get) plus the clone-path token embed route through resolveOctokit +
withTokenRefresh, which evicts the cached token and retries once on 401.
The secondary self-filter compares against a new botLogin getter that
returns `<slug>[bot]` in App mode (still botMention in PAT mode) so PR-C's
per-user tokens won't trip it. Webhook `event.installation.id` primes the
lookup cache to skip one HTTP round trip per event.

Server bootstrap detects App vs PAT mode via env and fails fast if both
are configured. In App mode it registers the provider on a module-level
singleton that createWorkflowDeps() reads, so the workflow executor's
bash/script subprocesses inherit a fresh GH_TOKEN/GITHUB_TOKEN. The
new POST /internal/git-credential endpoint (App mode only) backs a
POSIX git credential helper installed at clone time, covering workflows
that outlive the 1h installation-token expiry. A startup WARN fires if
the server binds to a non-loopback interface with App mode active —
operators must drop /internal/* at the reverse proxy or bind 127.0.0.1.

Backwards compat: solo installs running GITHUB_TOKEN only see zero
functional change. Existing 54 PAT-mode adapter tests pass unchanged.

Tests: 15 new strictly-mocked auth tests (PRD Q7 — no live api.github.com
in CI); 8 new App-mode adapter tests covering multi-install routing,
payload short-circuit, 401 retry, AppNotInstalledError surfacing,
clone-token resolution, and post-clone credential helper install.

Refs PRD: Phase 2 of .claude/PRPs/prds/github-app-and-user-identity.prd.md
@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3ef91f99-2cd7-409b-98d0-85ae4517415c

📥 Commits

Reviewing files that changed from the base of the PR and between 429997f and e386596.

📒 Files selected for processing (1)
  • packages/core/src/github-auth/auth.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/core/src/github-auth/auth.ts

📝 Walkthrough

Walkthrough

Adds GitHub App mode alongside PAT: public auth types and factory, private-key loader, token/installation caching, credential helper + installer, adapter refactor for mode-aware calls and 401 retry, server internal credential endpoint with startup guard, workflow token injection, tests, docs, and .env updates.

Changes

GitHub App Authentication & Integration

Layer / File(s) Summary
Env & docs updates
​.env.example, packages/docs-web/src/content/docs/adapters/github-app-setup.md, CLAUDE.md
Updates .env.example, adds App-mode setup guide and CLAUDE.md security notes describing loopback-only internal endpoint and opt-out guard.
Auth types, errors & public entry
packages/core/src/github-auth/types.ts, packages/core/src/github-auth/errors.ts, packages/core/src/github-auth/index.ts, packages/core/src/index.ts
Adds GitHubAppConfig, CachedInstallationToken, IGitHubAppAuthProvider, AppNotInstalledError, AppPrivateKeyError, and GitHubAuth discriminated union; re-exports auth utilities from core.
Private key loader
packages/core/src/github-auth/private-key.ts
loadAppPrivateKey selects inline or file PEM, normalizes newlines, validates PEM shape, and throws AppPrivateKeyError on problems.
Auth provider implementation
packages/core/src/github-auth/auth.ts
createGitHubAppAuthProvider with owner/repo → installation lookup cache (1h), installation-token caching with 5m refresh buffer, per-installation Octokit memoization, primeInstallationLookup, and invalidation methods.
Auth tests & mocks
packages/core/src/github-auth/auth.test.ts
Mocked Bun test suite covering key loading, lookup/token caching/refresh, priming/invalidation, TTL boundaries, and Octokit memoization/eviction.
Credential helper & installer
scripts/git-credential-archon.sh, packages/core/src/github-auth/credential-helper-install.ts
Shell credential helper posts to local /internal/git-credential; installer copies/registers helper idempotently and returns structured results.
Core package deps & exports
packages/core/package.json, packages/core/src/index.ts
Adds @octokit/auth-app and @octokit/rest, test entry; re-exports GitHub auth utilities and GitHubAuth from core.
Store adapter registration
packages/core/src/workflows/store-adapter.ts
Adds registerGitHubAppAuthProvider and wires resolveBotGitHubToken(owner,repo) into createWorkflowDeps() when provider registered.
Workflow deps & executor token injection
packages/workflows/src/deps.ts, packages/workflows/src/executor.ts
Adds optional resolveBotGitHubToken(owner,repo) to WorkflowDeps and executor logic to parse repo URLs and inject resolved GH_TOKEN/GITHUB_TOKEN into workflow env with precedence.
Adapter constructor & per-request plumbing
packages/adapters/src/forge/github/adapter.ts
Constructor changed to accept GitHubAuth; octokit nullable; adds getAuthMode(), getInstallationToken(), resolveOctokit(), computed botLogin, and withTokenRefresh() with App-mode 401 eviction+retry.
Adapter API calls, webhook & clone
packages/adapters/src/forge/github/adapter.ts
Routes REST calls through withTokenRefresh(...), primes installation lookup from webhook installation.id, uses mode-aware botLogin for self-filtering, resolves tokens at clone time (App vs PAT), and attempts credential-helper install after clone.
Adapter tests (PAT ctor + App mode suite)
packages/adapters/src/forge/github/adapter.test.ts, packages/adapters/src/forge/github/context.test.ts
Updates tests to use { kind: 'pat', token } constructor, adds execFileAsync stub, and introduces a large describe('App mode') suite validating routing, cache priming/invalidation, 401→evict→retry, AppNotInstalledError, and credential-helper flows.
Webhook event types & context
packages/adapters/src/forge/github/types.ts, packages/adapters/src/forge/github/context.test.ts
Adds optional installation?: { id: number } to WebhookEvent and updates context test adapter construction.
Server bootstrap & internal endpoint
packages/server/src/index.ts, packages/server/src/github-auth-bootstrap.ts
Adds selectGitHubAuthMode, startup App-vs-PAT branching, loads app private key, registers provider, instantiates adapter with `{ kind: 'app'

Sequence Diagram: App-mode credential helper flow

sequenceDiagram
  participant Git as Git
  participant Helper as git-credential-archon
  participant Archon as Archon:/internal/git-credential
  participant Provider as GitHubAppAuthProvider
  participant GitHub as GitHub
  Git->>Helper: credential request (action=get, host=github.com, path=owner/repo.git)
  Helper->>Archon: POST /internal/git-credential { host, path }
  Archon->>Provider: getInstallationToken(owner, repo)
  Provider->>GitHub: POST /app/installations/:id/access_tokens
  GitHub-->>Provider: { token, expires_at }
  Provider-->>Archon: token
  Archon-->>Helper: { token }
  Helper-->>Git: username/password output (x-access-token/<token>)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

  • coleam00/Archon#1011: Related server startup refactor touching startServer where this PR adds App/PAT bootstrapping and the internal endpoint.

Poem

🐰 I’m a rabbit who plants PEMs in rows,

Tokens sprout where the webhook wind blows,
A helper at loopback brings secrets near,
App or PAT, the adapter will hear,
Hop, deploy, and keep your git pushes clear!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title clearly summarizes the main feature: swapping a shared GitHub PAT for a GitHub App with multi-installation routing, which aligns with the changeset's core objective.
Description check ✅ Passed The PR description is comprehensive and covers all major template sections: Summary (problem/why it matters/what changed/boundaries), UX Journey (before/after flows), Architecture Diagram (with connection inventory), Labels, Change Metadata, Linked Issues, Validation Evidence, Security Impact, Compatibility/Migration, Human Verification, Side Effects, Rollback Plan, and Risks/Mitigations.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/github-app-for-bot

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
packages/core/src/github-auth/auth.test.ts (1)

219-255: ⚡ Quick win

Add one regression test for invalidation affecting getOctokitForInstallation.

Current tests validate token eviction and Octokit memoization independently, but not the combined 401-recovery behavior. A small test asserting a new instance after invalidateToken(installationId) would protect this critical path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/github-auth/auth.test.ts` around lines 219 - 255, Add a
regression test that ensures invalidateToken(...) triggers creation of a new
Octokit instance via getOctokitForInstallation(...): create a provider with
makeProvider (or default), obtain an Octokit instance with
getOctokitForInstallation('o','r'), call invalidateToken(installationId) for
that installation, then call getOctokitForInstallation('o','r') again and assert
the second instance is not the same as the first; use/mock mockRequest token
responses as needed to simulate token rotation and reference invalidateToken,
getOctokitForInstallation, makeProvider and mockRequest to locate the relevant
code under test.
packages/server/src/index.ts (1)

629-663: 💤 Low value

Static analysis false positive — this is RegExp.exec(), not child_process.exec().

The OpenGrep hint flagging "dynamic command passed to child_process.exec" at line 649 is incorrect. The code uses RegExp.prototype.exec() to parse the owner/repo path, not child_process.exec().

Optional: Consider Zod for request body validation.

Lines 639-642 use a manual type cast for the request body. Per coding guidelines, Zod schemas are preferred for request validation to get runtime type safety.

♻️ Optional: Zod schema for body validation
import { z } from 'zod';

const gitCredentialRequestSchema = z.object({
  host: z.string().optional(),
  path: z.string().optional(),
});

// Inside the handler:
const parseResult = gitCredentialRequestSchema.safeParse(await c.req.json().catch(() => null));
if (!parseResult.success || parseResult.data.host !== 'github.com') {
  return c.json({ error: 'unsupported host' }, 400);
}
const pathStr = parseResult.data.path ?? '';

As per coding guidelines: "Use Zod schemas with camelCase naming and descriptive suffixes; always derive types using z.infer<typeof schema> instead of hand-crafted parallel interfaces."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/index.ts` around lines 629 - 663, The scan flagged a
false positive: the code in the handler registered by
app.post('/internal/git-credential') uses RegExp.prototype.exec() to parse the
owner/repo path (see the /^([^/]+)\/([^/]+?)(?:\.git)?$/ .exec(pathStr) usage) —
not child_process.exec(); leave that as-is. Separately, replace the manual cast
of the request body ((await c.req.json().catch(() => null)) as { host?: string;
path?: string } | null) with a Zod-based runtime validation: define a
gitCredentialRequestSchema (z.object({ host: z.string().optional(), path:
z.string().optional() })), use safeParse on the parsed JSON, and then read
parseResult.data.host/path and return the same error responses if validation
fails or host !== 'github.com'; keep the rest of the handler (owner/repo
extraction, github.getInstallationToken(owner, repo), error handling) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.env.example:
- Line 94: The .env.example has an inline trailing comment on the empty
GITHUB_TOKEN assignment which can confuse dotenv parsers; update the file so
GITHUB_TOKEN= is on its own line with no inline comment and move the explanatory
comment into a separate line above (or below) the GITHUB_TOKEN entry so the
value is unambiguously empty (referencing the GITHUB_TOKEN variable in the
file).

In `@packages/core/src/github-auth/auth.ts`:
- Around line 107-137: The invalidateToken function only deletes tokenCache but
must also evict the per-installation Octokit so retries create a fresh auth
context; update invalidateToken to remove the Octokit instance from octokitCache
(e.g., octokitCache.delete(installationId)) and add a small log entry (using
getLog()) indicating the octokit cache was evicted for the installationId,
ensuring getOctokitForInstallation will construct a new Octokit on subsequent
calls.

In `@packages/docs-web/src/content/docs/adapters/github-app-setup.md`:
- Around line 170-173: Update the "### 401 loop in logs" section to replace the
webhook-secret explanation with authentication/installation causes: explain that
repeated GitHub API 401s are typically due to installation-token or auth issues
(app not installed on the target repo/org, missing or insufficient app
permissions, revoked/expired credentials, or incorrect token handling) and
provide steps to verify the app installation and token validity; move the
existing note about the webhook secret/WEBHOOK_SECRET mismatch into the
signature verification/delivery failures section (or a new subsection) and
clearly label it as relevant to webhook signature verification rather than API
401 errors.

In `@scripts/git-credential-archon.sh`:
- Around line 39-40: The curl call that assigns resp can hang indefinitely and
should include timeouts; update the curl invocation (the line that sets resp) to
add sensible timeout flags such as --connect-timeout 2 and --max-time 5 (or
other project-appropriate values) so the request fails fast if the Archon
endpoint is unreachable, keeping the existing || exit 0 behavior intact; ensure
the modified command still uses -fsS and the same headers/data and captures the
response into resp.

---

Nitpick comments:
In `@packages/core/src/github-auth/auth.test.ts`:
- Around line 219-255: Add a regression test that ensures invalidateToken(...)
triggers creation of a new Octokit instance via getOctokitForInstallation(...):
create a provider with makeProvider (or default), obtain an Octokit instance
with getOctokitForInstallation('o','r'), call invalidateToken(installationId)
for that installation, then call getOctokitForInstallation('o','r') again and
assert the second instance is not the same as the first; use/mock mockRequest
token responses as needed to simulate token rotation and reference
invalidateToken, getOctokitForInstallation, makeProvider and mockRequest to
locate the relevant code under test.

In `@packages/server/src/index.ts`:
- Around line 629-663: The scan flagged a false positive: the code in the
handler registered by app.post('/internal/git-credential') uses
RegExp.prototype.exec() to parse the owner/repo path (see the
/^([^/]+)\/([^/]+?)(?:\.git)?$/ .exec(pathStr) usage) — not
child_process.exec(); leave that as-is. Separately, replace the manual cast of
the request body ((await c.req.json().catch(() => null)) as { host?: string;
path?: string } | null) with a Zod-based runtime validation: define a
gitCredentialRequestSchema (z.object({ host: z.string().optional(), path:
z.string().optional() })), use safeParse on the parsed JSON, and then read
parseResult.data.host/path and return the same error responses if validation
fails or host !== 'github.com'; keep the rest of the handler (owner/repo
extraction, github.getInstallationToken(owner, repo), error handling) unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 225c5038-2897-4cf2-b499-72d12f4eadc3

📥 Commits

Reviewing files that changed from the base of the PR and between 50beeec and 817ce8c.

⛔ Files ignored due to path filters (1)
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (22)
  • .env.example
  • CLAUDE.md
  • packages/adapters/src/forge/github/adapter.test.ts
  • packages/adapters/src/forge/github/adapter.ts
  • packages/adapters/src/forge/github/context.test.ts
  • packages/adapters/src/forge/github/types.ts
  • packages/core/package.json
  • packages/core/src/github-auth/auth.test.ts
  • packages/core/src/github-auth/auth.ts
  • packages/core/src/github-auth/credential-helper-install.ts
  • packages/core/src/github-auth/errors.ts
  • packages/core/src/github-auth/index.ts
  • packages/core/src/github-auth/private-key.ts
  • packages/core/src/github-auth/types.ts
  • packages/core/src/index.ts
  • packages/core/src/workflows/store-adapter.ts
  • packages/docs-web/src/content/docs/adapters/github-app-setup.md
  • packages/docs-web/src/content/docs/adapters/github.md
  • packages/server/src/index.ts
  • packages/workflows/src/deps.ts
  • packages/workflows/src/executor.ts
  • scripts/git-credential-archon.sh

Comment thread .env.example Outdated
Comment thread packages/core/src/github-auth/auth.ts
Comment thread packages/docs-web/src/content/docs/adapters/github-app-setup.md Outdated
Comment thread scripts/git-credential-archon.sh Outdated
@Wirasm

Wirasm commented May 28, 2026

Copy link
Copy Markdown
Collaborator Author

PR Review Summary — 7 specialized agents

Targeted review focused on the 5 concerns raised on PR-B plus general quality. Findings cross-validated across agents; consensus items flagged.


Direct answers to the 5 questions you flagged

Q1. /internal/git-credential loopback guard — defensive enough?
No. WARN is the wrong choice — it should fail-fast. Both code-reviewer (92/100 confidence) and silent-failure-hunter independently flagged this as critical. A misconfigured operator binding to 0.0.0.0 with App mode active leaks live installation tokens to anyone who can reach the port. WARN-level logs are routinely filtered or scrolled past in production. Recommended pattern:

if (githubAppAuthProvider && hostname !== '127.0.0.1' && hostname !== 'localhost') {
  if (!process.env.ARCHON_ALLOW_INTERNAL_ON_PUBLIC_BIND) {
    getLog().fatal({ hostname }, 'github_app.internal_endpoint_public_bind_rejected');
    process.exit(1);
  }
  getLog().warn({ hostname }, 'github_app.internal_endpoint_exposed_check_reverse_proxy');
}

Defense-in-depth alternative: bind /internal/* to a separate loopback-only listener instance, or add a request-level x-forwarded-for check inside the handler (any forwarded request → 403).

Q2. withTokenRefresh 401 retry — infinite-loop risk?
No infinite loop is possible. The second call fn(fresh) at adapter.ts:172 is not wrapped in try/catch — a second consecutive 401 propagates directly. A genuine App-uninstall produces 403, not 401, so the retry path doesn't engage at all. No circuit breaker needed for the 401 case itself, BUT two related defects surfaced:

  • (critical, code-reviewer) lookupCache is never evicted on 401. invalidateToken only touches tokenCache. After an App reinstall (which assigns a new installationId), the lookup cache serves the stale ID for up to 1h, causing every API call against that repo to fail in a loop until TTL expiry. Self-heal won't kick in until the 1h lookup TTL runs out.
  • (critical, silent-failure-hunter) The retry fn(fresh) has no surrounding try/catch with an ERROR log. Retry failures bubble up with zero context — debugging requires reconstructing the call stack from unrelated upstream logs.

Q3. PEM \n-escape handling — common .env pitfalls?
Mostly correct. The inline.includes('\\n') ? replace : pass-through heuristic handles all realistic .env quoting styles correctly (double-quoted → dotenv pre-processes; single-quoted/unquoted → literal \n survives; replace only matches the two-char sequence, not real newlines). Edge cases worth fixing:

  • (important, pr-test-analyzer + code-reviewer) Windows \r\n in a .pem file loaded via PRIVATE_KEY_PATH fails at first JWT signing, not at startup. Fix: extend regex to replace(/\\r\\n|\\n/g, '\n') and add a .replace(/\r\n/g, '\n') for file-loaded PEMs.
  • (suggestion, code-simplifier) The includes('\\n') ? ... : inline ternary is dead branching — replace with no match is a no-op. Drop the conditional.
  • (minor) Whitespace-only GITHUB_APP_PRIVATE_KEY= falls through to _PATH lookup rather than surfacing "the inline key you set is empty"; produces a confusing error message.

Q4. Cache TTLs — 55-minute mark behavior, correctness?
At 55 min the refresh fires correctly. Date.now() + REFRESH_BUFFER_MS (5min) < cached.expiresAt evaluates to false at now + 55min (since expiresAt = now + 60min), so the token is re-issued. The 5-min buffer is reasonable; no off-by-one.

Lookup cache 1h TTL is the right call — but only if Q2's lookupCache invalidation gap is fixed first. Infinite TTL + 401-invalidated would actually be worse today, because a reinstall would permanently serve a stale ID until process restart. Fix the eviction first, then consider extending TTL.

Q5. Credential helper shell script — injection risk?
No direct exploit, but two defense-in-depth gaps:

  • (important, code-reviewer 80/100) $path is unvalidated in the shell. $host is checked ([ "$host" = "github.com" ]) before use, so host injection is blocked. $path flows verbatim into the curl JSON body. The server's path regex (^([^/]+)\/([^/]+?)(?:\.git)?$) blocks crafted paths end-to-end, so the present attack surface is empty — but the structural guarantee belongs in the script too. Add a case "$path" in */*) ;; *) exit 0 ;; esac check.
  • (important, silent-failure-hunter) curl -fsS ... || exit 0 produces no stderr on connection failure. Combined with the [ -n "$token" ] || exit 0 fallthrough, an unattended workflow gets "Authentication failed" with no diagnostic. Fix: stderr-echo a one-line reason before each exit 0.
  • (positive) No set -x token leak. $token is correctly quoted in the final printf. CRLF input from Windows git produces silent fallthrough (not exploitable, but worth noting). The sed -n 's/.*"token":"\([^"]*\)".*/\1/p' JSON extractor is fragile — a token containing / or & would silently produce empty on some sed implementations.

Critical issues (must fix before merge)

ID Issue Location Source
C1 /internal/git-credential non-loopback bind only emits WARN; should fail-fast packages/server/src/index.ts:772-774 code-reviewer + silent-failure-hunter
C2 lookupCache never evicted on 401 → 1h dead period after App reinstall packages/core/src/github-auth/auth.ts:134-137, packages/adapters/src/forge/github/adapter.ts:162-170 code-reviewer (82/100)
C3 withTokenRefresh retry has no try/catch around fn(fresh) — second-attempt failures silently lose context packages/adapters/src/forge/github/adapter.ts:171-172 silent-failure-hunter
C4 /internal/git-credential failure logs at WARN, not ERROR; live token resolution failure is filtered as noise packages/server/src/index.ts:657-659 silent-failure-hunter

Important issues

ID Issue Location Source
I1 installCredentialHelper returns silently in binary builds; caller logs credential_helper_installed (false success). No discriminated result type. credential-helper-install.ts:40-47, adapter.ts:651-660 silent-failure-hunter
I2 getInstallationTokenById POST has no error logging; on failure installationId is lost before reaching outer handlers auth.ts:91-94 silent-failure-hunter
I3 Shell script $path lacks client-side validation (server-side regex is the only defense) scripts/git-credential-archon.sh:40 code-reviewer (80/100)
I4 Shell script swallows curl stderr on connection failure — unattended workflows surface "Authentication failed" with no root cause scripts/git-credential-archon.sh:39-40, 44-45 silent-failure-hunter
I5 PEM \r\n (Windows-edited file) not normalized for PRIVATE_KEY_PATH branch private-key.ts:18-44 pr-test-analyzer
I6 auth.ts file header claims lookupCache: refreshed on 401 — false. Compounds C2. auth.ts:5 comment-analyzer
I7 Inline comment at adapter.ts:888 references "PR-C" — will rot post-merge adapter.ts:888 comment-analyzer
I8 credential-helper-install.ts:12-13 "Source-builds only for this PR" — will rot post-merge; convert to tracked issue credential-helper-install.ts:12-13 comment-analyzer

Test coverage gaps

ID Gap Crit. Source
T1 /internal/git-credential endpoint has zero tests — path-parse regex regression could leak tokens to wrong installations 9/10 pr-test-analyzer
T2 Dual-mode fail-fast guard untested (hasGitHubApp && hasGitHubPat) 8/10 pr-test-analyzer
T3 withTokenRefresh retry-on-retry-401 propagation unproven 8/10 pr-test-analyzer
T4 scripts/git-credential-archon.sh has no shellcheck or functional tests 7/10 pr-test-analyzer
T5 PEM \r\n normalization not tested 6/10 pr-test-analyzer
T6 1h lookupCache TTL boundary not tested with fake timers (off-by-one risk) 6/10 pr-test-analyzer

Suggestions

  • Type design (rated 8/6/8/7): CachedInstallationToken.expiresAt → rename to expiresAtMs for unit clarity. Validate appId and slug non-empty at createGitHubAppAuthProvider entry (consistent with loadAppPrivateKey's eager validation). (type-design-analyzer)
  • Simplifications: drop octokitCache — Octokit construction is cheap and createAppAuth already handles internal token refresh; tokenCache carries the real cache value for non-Octokit callers (clone embed, credential helper). Remove dead if (!appId || !webhookSecret) guards inside hasGitHubApp/hasGitHubPat branches — use ! non-null assertion. Collapse the no-op try/catch in ensureRepoReady clone path. (code-simplifier)
  • API ergonomics: consider invalidateTokenForRepo(owner, repo) collapsing resolveInstallationId + invalidateToken into one interface method — removes async from the 401 hot path and removes one method from IGitHubAppAuthProvider. (code-simplifier; aligns with C2 fix)
  • Docs: GITHUB_APP_SLUG is documented as required in .env.example:101 and github-app-setup.md:83-93 but defaults to 'archon' (index.ts:364). github-app-setup.md:130-132 doesn't mention that the credential helper silently no-ops in compiled binary builds. github.md:20 still implies PAT is the only path. (docs-impact)

Strengths acknowledged

  • Discriminated union (GitHubAuth) is exhaustively narrowed at all 5 callsites (resolveOctokit, withTokenRefresh, botLogin, ensureRepoReady, handleWebhook priming). No accidental fallback to "the other branch." (type-design-analyzer)
  • Dual-mode fail-fast bootstrap (index.ts:341-347) is the correct enforcement of CLAUDE.md's Fail Fast principle — clear "pick one" error, no silent preference. (code-reviewer)
  • ensureRepoReady clone path distinguishes AppNotInstalledError from generic auth failure and maps to user-actionable messages per auth mode. (silent-failure-hunter)
  • postComment retry exhaustion log carries attempt, maxRetries, wasRetryable, messageLength — excellent observability. (silent-failure-hunter)
  • All three webhook handlers (GitHub, Gitea, GitLab) attach .catch() with ERROR log — no silent rejection. (silent-failure-hunter)
  • 401 retry mechanism is structurally bounded — the second call has no surrounding catch, so a second 401 propagates rather than looping. (code-reviewer)
  • private-key.ts file header is honest about validation scope ("catches obviously-bad shapes; crypto validation deferred to @octokit/auth-app"). (comment-analyzer)

Verdict — NEEDS FIXES

C1 and C2 are the load-bearing concerns:

  • C1 is a security defect that violates CLAUDE.md's Fail Fast principle ("Never silently broaden permissions or capabilities"). A misconfigured deployment leaks installation tokens to the network.
  • C2 is a correctness bug that causes silent 1h outages after a routine operational event (App reinstall).

C3 and C4 are observability defects that will burn debugging time on real production incidents. The test gaps T1 + T2 leave the riskiest new surface (the token-vending endpoint, the dual-mode guard) unverified — both are short tests to add and would catch regressions in the security-critical paths.

The structural design is sound: the discriminated union, the fail-fast on dual config, the two-level cache architecture, and the 401-evict-retry shape are all the right calls. Fixes are localized and don't require rework.

🤖 Generated by /prp-core:prp-review-agents (7 specialized agents)

…, lookupCache eviction, retry observability

PR-B follow-up addressing the multi-agent review on #1788. Fixes 4 critical
defects + 7 important defects + 5 test gaps. Non-blocking suggestions (drop
octokitCache, split /internal/* to separate listener) skipped as YAGNI.

Critical:
  • C1 — /internal/git-credential bind safety: replace startup WARN with
    fail-fast exit when App mode is active and hostname != loopback. Opt-out
    via ARCHON_ALLOW_INTERNAL_ON_PUBLIC_BIND=1 for deployments where the
    reverse proxy already drops /internal/*.
  • C2 — lookupCache eviction on 401: add invalidateRepo(owner, repo) that
    drops BOTH caches in one call. Adapter's withTokenRefresh uses it instead
    of the by-id invalidator. An App reinstall (which assigns a NEW
    installation id) used to be served from the stale lookupCache for up
    to 1h; now the next call re-resolves the install.
  • C3 — withTokenRefresh retry observability: wrap fn(fresh) in try/catch
    with an explicit ERROR log carrying owner/repo + firstStatus + retryStatus.
    Bounded retry (no third attempt) — second consecutive failure propagates.
  • C4 — /internal/git-credential failure log upgraded WARN→ERROR. Live
    credential vending failure should not be filtered as noise.

Important:
  • I1 — installCredentialHelper returns discriminated result
    {installed|skipped|failed}. Adapter logs the actual outcome instead of
    false-success in compiled binary builds that ship without scripts/.
  • I2 — getInstallationTokenById wraps POST in try/catch with ERROR log
    carrying installationId + HTTP status so the dead installation is
    greppable.
  • I3 — git-credential-archon.sh adds explicit case "$path" in */*) check
    so malformed paths fall through cleanly.
  • I4 — script echoes one-line stderr diagnostics before each exit 0
    fall-through. Unattended workflow now sees the root cause (non-github
    host, malformed path, curl failure, empty response) instead of "auth
    failed" with no context.
  • I5 — PEM CRLF normalization extended to both the inline branch
    (\\r\\n escape) and the file-load branch. Windows-edited .pem files work.
  • I6 — auth.ts header no longer lies about lookupCache being refreshed
    on 401 (paired with C2).
  • I7 — adapter.ts secondary self-filter comment drops "PR-C" reference;
    states the actual invariant.
  • I8 — credential-helper-install.ts header drops "this PR" reference;
    states the actual condition (source script must exist on disk).

Refactor + suggestions taken:
  • CachedInstallationToken.expiresAt renamed → expiresAtMs.
  • createGitHubAppAuthProvider validates appId + slug non-empty at the
    factory boundary, matching loadAppPrivateKey's eager validation.
  • Dual-mode env detection + /internal path parsing extracted into pure
    helpers in server/src/github-auth-bootstrap.ts (so the security-critical
    decisions are testable in isolation).

Tests:
  • T1 — parseGitCredentialPath unit tests (12 cases): owner/repo, .git
    suffix, single-segment rejection, deep route rejection, .. traversal,
    null bytes, leading slash, case preservation, GitHub-valid identifiers.
  • T2 — selectGitHubAuthMode unit tests (8 cases): app, pat, none,
    incomplete-config rejection, conflict-with-actionable-message.
  • T3 — withTokenRefresh second-401 propagation: single retry, no infinite
    loop, single invalidateRepo call.
  • T5 — PEM CRLF normalization (2 cases: inline \\r\\n, file CRLF).
  • T6 — lookupCache 1h TTL boundary with fake Date.now.
  • Existing 401-retry test updated to assert against invalidateRepo
    (C2 regression guard).

Suggestions not adopted:
  • Drop octokitCache — currently working + tested; reviewer cites no
    measured cost. KISS keep.
  • Split /internal/* to a separate loopback-only listener — overengineering
    when fail-fast bind check + reverse-proxy doc cover the threat model.
  • shellcheck on the credential helper — would require CI infra for a
    30-line POSIX script with one branch. Defer.
  • Remove !appId/!webhookSecret guards via ! non-null assertions — lint
    forbids ! assertions; my defensive guards are appropriate.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/server/src/index.ts (1)

747-777: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Move the public-bind guard before Bun.serve to make this truly fail-fast.

Right now the process binds and logs server_listening before the guard can throw, so /internal/git-credential may be briefly reachable on startup in rejected configs. Validate hostname and App mode before creating the server.

Based on learnings: "Prefer throwing early with a clear error for unsupported/unsafe states — never silently swallow errors" and "Never silently broaden permissions or capabilities. ... otherwise throw (Fail Fast + Explicit Errors)".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/server/src/index.ts` around lines 747 - 777, The public-bind safety
check is executed after creating the server (Bun.serve) and logging
server_listening, which can briefly expose /internal/git-credential; move the
guard that checks githubAppAuthProvider and hostname (and the
ARCHON_ALLOW_INTERNAL_ON_PUBLIC_BIND logic that calls
getLog().warn/getLog().fatal and throws) to run before calling Bun.serve and
before getLog().info so the process fails fast and never binds to a non-loopback
interface in rejected configurations.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/core/src/github-auth/auth.test.ts`:
- Around line 324-329: The test currently overrides Date.now directly and uses
an inline eslint-disable comment; instead remove the disable and replace the
direct assignment with a lint-safe preserve/restore pattern (e.g., use
jest.spyOn(Date, 'now').mockImplementation(() => mockedNow) or a small typed
helper like preserveAndMockDateNow that captures realNow, sets the mock, and
restores it in afterEach). Update the code referencing realNow, baseTime, and
mockedNow to use that spy/helper and ensure the spy is restored (mockRestore or
helper cleanup) after the test so no inline eslint-disable is needed.

---

Outside diff comments:
In `@packages/server/src/index.ts`:
- Around line 747-777: The public-bind safety check is executed after creating
the server (Bun.serve) and logging server_listening, which can briefly expose
/internal/git-credential; move the guard that checks githubAppAuthProvider and
hostname (and the ARCHON_ALLOW_INTERNAL_ON_PUBLIC_BIND logic that calls
getLog().warn/getLog().fatal and throws) to run before calling Bun.serve and
before getLog().info so the process fails fast and never binds to a non-loopback
interface in rejected configurations.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 75ce516e-a8a8-49b8-8bd8-4c09ffe38803

📥 Commits

Reviewing files that changed from the base of the PR and between 817ce8c and b35d9bd.

📒 Files selected for processing (15)
  • .env.example
  • CLAUDE.md
  • packages/adapters/src/forge/github/adapter.test.ts
  • packages/adapters/src/forge/github/adapter.ts
  • packages/core/src/github-auth/auth.test.ts
  • packages/core/src/github-auth/auth.ts
  • packages/core/src/github-auth/credential-helper-install.ts
  • packages/core/src/github-auth/private-key.ts
  • packages/core/src/github-auth/types.ts
  • packages/docs-web/src/content/docs/adapters/github-app-setup.md
  • packages/server/package.json
  • packages/server/src/github-auth-bootstrap.test.ts
  • packages/server/src/github-auth-bootstrap.ts
  • packages/server/src/index.ts
  • scripts/git-credential-archon.sh
✅ Files skipped from review due to trivial changes (2)
  • packages/server/package.json
  • .env.example
🚧 Files skipped from review as they are similar to previous changes (7)
  • packages/core/src/github-auth/types.ts
  • CLAUDE.md
  • scripts/git-credential-archon.sh
  • packages/core/src/github-auth/auth.ts
  • packages/adapters/src/forge/github/adapter.test.ts
  • packages/core/src/github-auth/private-key.ts
  • packages/adapters/src/forge/github/adapter.ts

Comment thread packages/core/src/github-auth/auth.test.ts Outdated
CR1 — .env.example: move inline comment off GITHUB_TOKEN= line so dotenv
parsers don't misread the value as `# Same as GH_TOKEN...`.

CR2 — auth.ts: invalidateToken AND invalidateRepo now also evict
octokitCache. Per-installation Octokit holds its own internal
createAppAuth token state; without this eviction the "fresh" Octokit
handed to the retry path could keep serving the dead token from the
SDK's private cache. Paired with cascading lookupCache eviction
already added in b35d9bd.

CR3 — github-app-setup.md: "401 loop" section reworked. API 401s are
caused by installation / token / permission issues, not webhook secret.
Webhook-secret mismatch causes signature-verification failures at the
inbound webhook endpoint — split into its own section under "Webhook
deliveries fail signature verification".

CR4 — git-credential-archon.sh: add `curl --connect-timeout 2
--max-time 5` so git doesn't block indefinitely when Archon isn't
listening. Existing fail-soft (`exit 0` on any error) preserved.

CR5 — auth.test.ts T6: replace direct Date.now reassignment +
eslint-disable with spyOn(Date, 'now').mockImplementation pattern;
restore via mockRestore in finally. Lint-clean.

CR6 — server/index.ts: move the public-bind fail-fast guard ABOVE
Bun.serve so a rejected config never opens the listening socket. Was
previously after Bun.serve + server_listening log, briefly exposing
/internal/git-credential during the throw window.

CR7 — server/index.ts: replace manual type cast on /internal/git-
credential request body with a Zod schema (`gitCredentialRequestSchema
= z.object({ host: z.string().optional(), path: z.string().optional() })`)
and safeParse. Per CLAUDE.md's "always Zod for request validation"
convention. Inline declaration because the endpoint is a one-off
internal surface (not OpenAPI-published).

CR8 — auth.test.ts: regression tests covering CR2. invalidateToken and
invalidateRepo both verified to evict the Octokit instance — the next
getOctokitForInstallation() returns a brand-new Octokit. Without CR2
these tests fail.

All findings addressed; no items skipped.
@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown

Actionable comments posted: 0

@Wirasm

Wirasm commented May 28, 2026

Copy link
Copy Markdown
Collaborator Author

Re-review after b35d9bd + 429997f

Reviewed against the 4 critical + 8 important + 6 test-gap findings from the previous comment. Verified each fix against the actual code on disk, not just commit messages.

Critical findings — all resolved

  • C1 (fail-fast on public bind): index.ts:766-780 now throws (not WARNs) when App mode is active and bind is non-loopback, with ARCHON_ALLOW_INTERNAL_ON_PUBLIC_BIND=1 opt-out. CR6 also moved the guard above Bun.serve — the listening socket never opens on rejected config, so there's no window where /internal/git-credential is briefly reachable before the throw. Solid.
  • C2 (lookupCache eviction): new invalidateRepo(owner, repo) at auth.ts:183-207 evicts all three caches (token, lookup, octokit) in one call. Adapter uses it at adapter.ts:167. CR2 follow-up correctly added octokitCache.delete() — the per-installation Octokit holds its own createAppAuth token state internally, so without that eviction the "fresh" Octokit handed to the retry could keep serving the dead token from the SDK's private cache. Both invalidateToken and invalidateRepo cascade through all three caches now. Stronger than I suggested.
  • C3 (retry observability): adapter.ts:169-187 wraps fn(fresh) in try/catch with explicit ERROR log carrying owner, repo, firstStatus, retryStatus. Comment documents "Bound by design: no third retry" — addresses the original Q2 about infinite-loop risk as documented intent. Clean.
  • C4 (ERROR not WARN): index.ts:654 now logs internal.git_credential_resolve_failed at ERROR, with a comment explaining why ("live credential-vending failure…operators need this surfaced loudly"). Done.

Important findings — all resolved

  • I1: installCredentialHelper now returns CredentialHelperInstallResult discriminated union (installed | skipped | failed) — credential-helper-install.ts:45-48. Adapter logs the actual outcome instead of false-success in binary builds.
  • I2: getInstallationTokenById wraps the POST in try/catch with ERROR log carrying installationId + HTTP statusauth.ts:117-126. Dead installations are now greppable.
  • I3: Shell script has explicit case "$path" in */*) ;; *) exit 0 ;; esac at git-credential-archon.sh:46-52. Defense-in-depth in place.
  • I4: Stderr diagnostics on every fall-through path: non-github host (line 38), malformed path (49), curl failure with captured stderr + exit code (66-67), empty token response with response length (77-78). Unattended workflows now see root cause.
  • I5: PEM normalization now covers \\n, \\r\\n, and raw \r\n (file-loaded CRLF) — private-key.ts:24, 35. Tested in auth.test.ts:129, 147.
  • I6: auth.ts:5-9 header now accurately states lookupCache is "evicted on 401 via invalidateRepo."
  • I7: adapter.ts comment no longer references "PR-C" (verified via grep).
  • I8: credential-helper-install.ts:11-16 header drops the "this PR" framing; now states the structural condition (source script must exist on disk for source builds).

Test gaps — all closed

  • T1: parseGitCredentialPath extracted to github-auth-bootstrap.ts:57-73, tested with 12 cases covering owner/repo, .git suffix, single-segment, three-segment, .. traversal, literal null-byte injection ('alice\x00bad/widgets' — confirmed real null bytes in the test source), leading slash, leading dot, case preservation. The null-byte test is a substantively stronger security test than my original "validate $path" suggestion.
  • T2: selectGitHubAuthMode extracted + tested with 8 cases including the dual-mode conflict path with actionable-message assertion.
  • T3: adapter.test.ts:1302 proves the second consecutive 401 propagates with exactly one invalidateRepo call (no infinite loop). Regression guard at line 1328 also asserts invalidateRepo is used, not the old invalidate(id) shape — locks in the C2 fix.
  • T5: Both PEM CRLF paths tested: file-loaded \r\n (line 129) and inline literal \\r\\n (line 147).
  • T6: lookupCache 1h TTL boundary tested at auth.test.ts:323 using spyOn(Date, 'now').mockImplementation (CR5: replaced direct Date.now reassignment + eslint-disable with a lint-clean pattern). Tests "just inside" (59m59s) and "just outside" (1h+1ms) the boundary.
  • T4 (shellcheck): Skipped per author's stated YAGNI — acceptable; the 30-line POSIX script with one HTTP-call branch is well within manual-review surface, especially now with the stderr-diagnostics paths + connect-timeout.

CodeRabbit follow-ups — verified

  • CR1: .env.example inline-comment relocation prevents dotenv from misreading GITHUB_TOKEN= value as # Same as GH_TOKEN….
  • CR4: curl --connect-timeout 2 --max-time 5 at git-credential-archon.sh:61 — git no longer blocks indefinitely when Archon isn't listening.
  • CR7: /internal/git-credential request body validated via inline gitCredentialRequestSchema = z.object({...}) + safeParse (index.ts:632-640) rather than manual type cast. Matches CLAUDE.md's "always Zod for request validation."

Suggestions not adopted — agree these are YAGNI

  • Dropping octokitCache: keep, especially now that it's correctly invalidated alongside the token cache. The double-eviction logic earns the cache's keep.
  • Splitting /internal/* to a separate loopback-only listener: fail-fast bind check + reverse-proxy doc cover the threat model adequately.
  • !appId/!webhookSecret non-null assertions: lint forbids ! post-narrowing; defensive guards are the right call.

One minor observation (not blocking)

The gitCredentialRequestSchema at index.ts:632 is inline-declared inside the conditional, so it's re-created on server startup but never per-request — fine. Worth noting it could move to module scope alongside the imports for consistency with the rest of the file's schema usage pattern, but the comment explains why it's local. Up to you.


Verdict — READY TO MERGE

All 4 critical findings resolved with stronger fixes than I suggested (octokitCache cascade, pre-listen guard placement, defense-in-depth at three layers — adapter / endpoint / shell). All 8 important findings resolved. 5 of 6 test gaps closed (T4 reasonably deferred). CodeRabbit's 8 follow-ups addressed. CI green (ubuntu, windows, docker-build all pass).

The structural design is unchanged from PR-B's original shape; the fixes are localized to the auth module, adapter, server bootstrap, and shell helper. Discriminated union remains exhaustively narrowed. Fail-fast principle now correctly applied at every load-bearing decision point.

🤖 Generated by /prp-core:prp-review-agents re-review pass

…Octokit)

Updates the auth.ts header to acknowledge that the cache is effectively
three-level, not two. The third layer — `octokitCache: installationId →
Octokit` — was already there, but its load-bearing role for 401 retry
correctness wasn't documented. Each Octokit instance owns its own
private createAppAuth token state, opaque to us; without an
octokitCache.delete(id) the "fresh" Octokit handed to the retry path
keeps serving the dead token from the SDK's hidden cache, and our
visible cache evictions do nothing.

References the CodeRabbit comment on PR #1788 that surfaced the issue,
so the next reader can trace why invalidateToken / invalidateRepo
delete from three caches rather than two.

Comment-only change — type-check + tests green.
@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown

Actionable comments posted: 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant