Skip to content

fix(clone): authenticate non-GitHub forge URLs via GITLAB_TOKEN / GITEA_TOKEN (fixes #1655)#1658

Merged
Wirasm merged 2 commits into
coleam00:devfrom
kagura-agent:fix/clone-multi-forge-auth
May 15, 2026
Merged

fix(clone): authenticate non-GitHub forge URLs via GITLAB_TOKEN / GITEA_TOKEN (fixes #1655)#1658
Wirasm merged 2 commits into
coleam00:devfrom
kagura-agent:fix/clone-multi-forge-auth

Conversation

@kagura-agent

@kagura-agent kagura-agent commented May 13, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Problem: Web UI clone handler only injects auth tokens for github.com URLs — private repos on GitLab, Gitea, or Forgejo silently fail to clone
  • Why it matters: The URL form is the most discoverable entry point for registering projects; non-GitHub users hit a dead end
  • What changed: Replaced the github.com-specific string substitution with a forge-aware resolveForgeAuth() resolver that picks the right env var + URL scheme per hostname
  • What did not change: No changes to SSH clone paths, registerRepository(), or the chat orchestrator clone flow

UX Journey

Before

  User                   Archon                        Git
  ────                   ──────                        ───
  pastes GitLab URL ──▶  cloneRepository()
                         checks github.com → false
                         clones unauthenticated ──────▶ 401 Unauthorized
  sees "Failed to clone" ◀─

After

  User                   Archon                        Git
  ────                   ──────                        ───
  pastes GitLab URL ──▶  cloneRepository()
                         resolveForgeAuth() → GITLAB_TOKEN + oauth2: scheme
                         clones authenticated ────────▶ 200 OK
  repo registered ◀──────

Architecture Diagram

N/A — no new services/components. Single function addition in clone.ts.

Label Snapshot

fix, core

Change Metadata

Key Value
Risk Low — isolated to clone auth path
Lines changed ~57 src + ~137 test
Packages touched @archon/core
New dependencies None

Linked Issue

Closes #1655

Validation Evidence

  • 53 tests pass (bun test packages/core/src/handlers/clone.test.ts)
  • bun run type-check clean across all 10 packages
  • lint-staged (eslint + prettier) clean on commit

New tests added:

  • GitLab.com HTTPS auth (oauth2: scheme)
  • Self-hosted GitLab auth
  • Gitea auth
  • Forgejo auth
  • Unknown forge (no auth injected)
  • resolveForgeAuth() unit tests (4)

Security Impact

  • Auth tokens are only injected when the corresponding env var is set (same pattern as existing GH_TOKEN)
  • Token sanitization via existing sanitizeError() already covers clone failures
  • No new env vars introduced — reuses GITLAB_TOKEN and GITEA_TOKEN already used by forge adapters

Compatibility/Migration

  • Backward compatible: Existing GH_TOKEN + github.com behavior is unchanged
  • No config/schema/DB changes

Human Verification

  • Review resolveForgeAuth() mapping table matches forge adapter expectations
  • Confirm oauth2: prefix is correct for GitLab personal access tokens

Risks and Mitigations

Risk Mitigation
Self-hosted GitLab hostname detection too broad (gitlab substring) Ordered by specificity: github.com first, then gitlab.com, then gitlab
Missing forge (Bitbucket, etc.) Falls through to unauthenticated clone (same as before), easy to extend

Side Effects/Blast Radius

None — only the cloneRepository() auth block is touched.

Rollback Plan

Revert single commit — previous GH_TOKEN-only logic restored.

Summary by CodeRabbit

  • New Features

    • Authenticate HTTPS clone URLs across multiple forges (GitHub, GitLab, Gitea, Forgejo), including GitLab oauth2 handling, self‑hosted detection, and SSH→HTTPS normalization.
  • Tests

    • Added comprehensive multi‑forge authentication tests covering token injection, negative cases, SSH→HTTPS behavior, and hostname/port parsing.
  • Documentation

    • Documented GH_TOKEN, GITLAB_TOKEN, and GITEA_TOKEN and added adapter/quick‑start guidance.

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 13, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Clone handler authentication is extended from GitHub-only to multi-forge support. New resolveForgeAuth maps repository hosts (GitHub, GitLab, Gitea, Forgejo and self-hosted patterns) to environment tokens/schemes; clone URL construction and SSH normalization now use this resolver. Tests and docs updated accordingly.

Changes

Multi-forge authentication support

Layer / File(s) Summary
Forge auth resolution implementation
packages/core/src/handlers/clone.ts
Adds safeParseUrl, forge auth config and exports resolveForgeAuth(url) that maps GitHub/GitLab/Gitea/Forgejo host patterns (including self-hosted) to env token variables and auth scheme prefixes, returning { token, scheme }.
SSH→HTTPS normalization and registerRepository update
packages/core/src/handlers/clone.ts
Adjusts SSH-style normalization to apply the git@host:owner/repo regex against the normalized/cleaned URL when converting to https://{host}/{rest} for normalizeRepoUrl and registerRepository.
Clone URL construction with resolved forge auth
packages/core/src/handlers/clone.ts
Updates cloneRepository to use resolveForgeAuth(workingUrl) instead of GitHub-only GH_TOKEN substitution; embeds resolved {scheme}{token}@ into clone URLs for parsed HTTPS or bare host/owner/repo inputs.
Multi-forge authentication test suite and validation
packages/core/src/handlers/clone.test.ts
Adjusts existing GH_TOKEN test to delete GITLAB_TOKEN; adds multi-forge authentication tests covering GitLab (including oauth2:), self-hosted, Gitea/Forgejo, unknown forges and SSH→HTTPS injection cases; adds resolveForgeAuth unit tests.
Documentation: token tips and configuration
packages/docs-web/src/content/docs/adapters/community/gitea.md, .../gitlab.md, .../configuration.md, .../quick-start.md, .../guides/script-nodes.md, .../reference/variables.md
Adds tips that GITEA_TOKEN/GITLAB_TOKEN enable authenticated cloning, adds GH_TOKEN, GITLAB_TOKEN, GITEA_TOKEN to env var docs, updates Quick Start prerequisites to mention tokens for private repos, and adds a note on script-node environment handling.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • coleam00/Archon#1656: Overlaps on packages/core/src/handlers/clone.ts SSH→HTTPS parsing and normalization changes.

Poem

🐰 I nudge tokens into doors of forge and host,

oauth2 and user@ glide where needed most,
Gitea, GitLab, GitHub — all find their key,
Clones hum quiet, ports parsed properly,
A rabbit hops away, git trees safe and free.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: extending clone authentication beyond GitHub to support GitLab and Gitea/Forgejo via environment variables, and references the closed issue.
Description check ✅ Passed The description provides a well-structured summary covering problem, why it matters, what changed, UX journey before/after, security impact, compatibility, validation evidence, human verification checklist, risks/mitigations, and rollback plan—substantially matching the template requirements.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/core/src/handlers/clone.ts`:
- Around line 63-74: resolveForgeAuth currently matches FORGE_AUTH entries
against the whole URL string; instead parse the URL's hostname (e.g., with new
URL(url).hostname.toLowerCase()) and compare the hostname to each entry's
hostPattern using exact label boundaries (hostname === entry.hostPattern ||
hostname.endsWith('.' + entry.hostPattern)) so only the correct host or its
subdomains match; handle invalid URLs by falling back to returning { token:
undefined, scheme: '' } and keep returning the matched entry's token and scheme
when found (refer to resolveForgeAuth and FORGE_AUTH to locate the fix).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3ebefae8-5057-47f2-acd5-b6d9504bcb90

📥 Commits

Reviewing files that changed from the base of the PR and between dffefd6 and beee784.

📒 Files selected for processing (2)
  • packages/core/src/handlers/clone.test.ts
  • packages/core/src/handlers/clone.ts

Comment thread packages/core/src/handlers/clone.ts
@kagura-agent

Copy link
Copy Markdown
Contributor Author

Addressed in d2d65bd — thanks for catching the security issue!

Changes:

  • resolveForgeAuth() now parses the URL hostname first and matches against that (exact match for known hosts like github.com, label-based match for self-hosted like gitlab.mycompany.com)
  • URLs with forge names only in the path (e.g. https://evil.example.com/gitlab/mirror) no longer resolve a token
  • Added 2 security regression tests for the path-based injection vector

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/core/src/handlers/clone.test.ts (1)

304-309: ⚡ Quick win

Harden env cleanup to keep tests order-independent.

process.env cleanup is distributed inside tests; if one test exits early, later tests can inherit stale forge tokens. Reset forge env vars in beforeEach/afterEach for this describe block instead of per-test deletes (Line 306-309 and per-test cleanup lines).

♻️ Suggested change
 describe('multi-forge authentication', () => {
-  afterAll(() => {
+  beforeEach(() => {
+    delete process.env.GH_TOKEN;
+    delete process.env.GITLAB_TOKEN;
+    delete process.env.GITEA_TOKEN;
+  });
+
+  afterEach(() => {
+    delete process.env.GH_TOKEN;
     delete process.env.GITLAB_TOKEN;
     delete process.env.GITEA_TOKEN;
   });
@@
-      delete process.env.GITLAB_TOKEN;
@@
-      delete process.env.GITLAB_TOKEN;
@@
-      delete process.env.GITEA_TOKEN;
@@
-      delete process.env.GITEA_TOKEN;
@@
-      delete process.env.GITLAB_TOKEN;
As per coding guidelines, "**/*.{ts,tsx,test.ts}: ... keep tests deterministic with no flaky timing or network dependence without guardrails".

Also applies to: 311-425

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/handlers/clone.test.ts` around lines 304 - 309, The
describe block "multi-forge authentication" uses per-test env cleanup which can
leave stale GITLAB_TOKEN/GITEA_TOKEN values if a test exits early; change the
block to explicitly clear and restore those environment variables in beforeEach
and afterEach so each test runs with a deterministic env. Locate the
"describe('multi-forge authentication')" block and add a beforeEach that deletes
process.env.GITLAB_TOKEN and process.env.GITEA_TOKEN (and any other forge token
envs used by these tests) and an afterEach that also deletes or restores them to
their original values to make tests order-independent and avoid relying on
per-test deletes inside individual tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/core/src/handlers/clone.test.ts`:
- Line 430: Replace the untyped require of resolveForgeAuth inside the describe
block with a top-level typed ESM import: move the import of resolveForgeAuth out
of the test body and add an `import { resolveForgeAuth } from '...'` at the top
of the module (ensuring it uses the same module specifier as other imports),
keeping existing module mocks setup before that import so tests still mock
dependencies correctly; update any references in the describe to use the
imported resolveForgeAuth.

---

Nitpick comments:
In `@packages/core/src/handlers/clone.test.ts`:
- Around line 304-309: The describe block "multi-forge authentication" uses
per-test env cleanup which can leave stale GITLAB_TOKEN/GITEA_TOKEN values if a
test exits early; change the block to explicitly clear and restore those
environment variables in beforeEach and afterEach so each test runs with a
deterministic env. Locate the "describe('multi-forge authentication')" block and
add a beforeEach that deletes process.env.GITLAB_TOKEN and
process.env.GITEA_TOKEN (and any other forge token envs used by these tests) and
an afterEach that also deletes or restores them to their original values to make
tests order-independent and avoid relying on per-test deletes inside individual
tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fa8bdf6f-08ea-4191-ab69-167adbf62556

📥 Commits

Reviewing files that changed from the base of the PR and between beee784 and d2d65bd.

📒 Files selected for processing (2)
  • packages/core/src/handlers/clone.test.ts
  • packages/core/src/handlers/clone.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/core/src/handlers/clone.ts

Comment thread packages/core/src/handlers/clone.test.ts
@Wirasm

Wirasm commented May 13, 2026

Copy link
Copy Markdown
Collaborator

Review Summary

Verdict: minor-fixes-needed

This PR extends clone authentication to support GitLab and Gitea/Forgejo in addition to GitHub. The implementation is clean and well-scoped — one genuine bug needs fixing and documentation for the new environment variables is missing before merge.

Blocking issues

  • packages/core/src/handlers/clone.ts:342: Using parsed.host in the clone URL injects the port when present, producing malformed URLs like https://oauth2:TOKEN@gitlab.com:8443/owner/repo (port appears twice). Fix:

    cloneUrl = `https://${authScheme}${forgeToken}@${parsed.hostname}${parsed.pathname}${parsed.search}${parsed.hash}`;
  • packages/docs-web/src/content/docs/getting-started/quick-start.md (token setup section): GITLAB_TOKEN and GITEA_TOKEN are not documented for clone auth. Users setting these for webhook adapters have no indication they also enable private-repo cloning. Add notes for both tokens in the token setup section.

  • packages/docs-web/src/content/docs/adapters/community/gitlab.md and .../gitea.md: Both adapter pages describe auto-clone but don't mention that GITLAB_TOKEN / GITEA_TOKEN is required for private repos. Add a Prerequisites note.

Suggested fixes

  • packages/core/src/handlers/clone.ts:35-42: The ForgeAuthEntry docstring references non-existent exact: true/false properties. The matching logic is handled by two separate arrays (FORGE_AUTH vs. SELF_HOSTED_FORGE). Remove those lines and replace with:

    /** Known exact-hostname → env-var + scheme mappings. */
  • packages/core/src/handlers/clone.ts (new exported function): resolveForgeAuth(url) is a public API with no documentation. Add to reference/configuration.md or guides/authoring-workflows.md.

  • packages/core/src/handlers/clone.test.ts: Add tests for SSH URLs with GitLab/Gitea tokens (e.g. git@gitlab.com:group/repo.githttps://gitlab.com/group/repo + token injection). The current flow works, but a regression test would guard against future changes.

Minor / nice-to-have

  • packages/core/src/handlers/clone.test.ts:307-311: afterAll cleanup for GITLAB_TOKEN / GITEA_TOKEN may not run if the suite is interrupted early. The existing GH_TOKEN tests use beforeEach for clean state — align the cleanup pattern.
  • resolveForgeAuth unit tests: Add tests for bare host/path form (github.com/owner/repo), self-hosted GitLab oauth2: scheme detection, and empty/invalid input.
  • CHANGELOG.md: Add entry under [Unreleased] for this fix (closes Web UI clone handler only injects auth token for github.com URLs #1655): "Clone authentication now works for GitLab (GITLAB_TOKEN) and Gitea/Forgejo (GITEA_TOKEN)."
  • oauth2: scheme and self-hosted detection notes in quick-start or adapter pages would help users debugging auth failures.

Compliments

  • The safeParseUrl defensive pattern with explicit null return and caller branching is a clean, maintainable approach — and well-documented in the inline comment.
  • The security regression test confirming tokens don't leak when forge names appear only in the URL path is a strong addition.
  • PR template compliance is excellent — all sections filled, CodeRabbit pre-merge checks pass, direction alignment confirmed.

Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, error-handling, test-coverage, comment-quality, docs-impact.

kagura-agent added a commit to kagura-agent/Archon that referenced this pull request May 13, 2026
…cleanup

- Fix parsed.host → parsed.hostname to avoid port duplication in clone URLs
- Generalize SSH→HTTPS conversion from GitHub-only to all forges (git@host:path)
- Fix ForgeAuthEntry docstring (remove non-existent exact: true/false refs)
- Add GITLAB_TOKEN/GITEA_TOKEN docs to quick-start, configuration, adapter pages
- Add SSH URL + port duplication regression tests
- Harden test env cleanup with beforeEach/afterEach, import afterEach from bun:test
- Remove redundant per-test env cleanup (beforeEach/afterEach handles it)

Addresses review by @Wirasm on coleam00#1658.
@kagura-agent

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review @Wirasm! All feedback addressed in 51f0c0a:

Blocking fixes:

  • parsed.hostparsed.hostname — fixes port duplication in clone URLs (added regression test for URLs with custom ports)
  • ✅ Documentation added for GITLAB_TOKEN and GITEA_TOKEN clone auth:
    • Quick-start prerequisites
    • Configuration env var table (GH_TOKEN, GITLAB_TOKEN, GITEA_TOKEN)
    • GitLab + Gitea adapter pages (:::tip in Prerequisites)

Suggested fixes:

  • ✅ Fixed ForgeAuthEntry docstring — removed references to non-existent exact: true/false
  • ✅ Generalized SSH→HTTPS conversion from GitHub-only to all forges (git@host:path pattern)
  • ✅ Added SSH URL tests for GitLab and Gitea token injection
  • ✅ Hardened test env cleanup with beforeEach/afterEach (CodeRabbit suggestion)

All 58 clone tests passing.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
packages/core/src/handlers/clone.test.ts (2)

480-510: 💤 Low value

Consider beforeEach/afterEach for env var cleanup consistency.

These unit tests set and delete env vars inline within each test, which works correctly. However, using beforeEach/afterEach (like the multi-forge suite at lines 306-316) would improve consistency and ensure cleanup even if a test assertion fails before reaching the delete statement.

♻️ Optional cleanup pattern alignment
   describe('resolveForgeAuth', () => {
     const { resolveForgeAuth } = require('./clone');
+
+    beforeEach(() => {
+      delete process.env.GH_TOKEN;
+      delete process.env.GITLAB_TOKEN;
+      delete process.env.GITEA_TOKEN;
+    });
+
+    afterEach(() => {
+      delete process.env.GH_TOKEN;
+      delete process.env.GITLAB_TOKEN;
+      delete process.env.GITEA_TOKEN;
+    });
 
     test('returns GH_TOKEN for github.com', () => {
       process.env.GH_TOKEN = 'ghp_abc';
       const result = resolveForgeAuth('https://github.com/owner/repo');
       expect(result).toEqual({ token: 'ghp_abc', scheme: '' });
-      delete process.env.GH_TOKEN;
     });
 
     test('returns GITLAB_TOKEN with oauth2: scheme for gitlab.com', () => {
       process.env.GITLAB_TOKEN = 'glpat-xyz';
       const result = resolveForgeAuth('https://gitlab.com/owner/repo');
       expect(result).toEqual({ token: 'glpat-xyz', scheme: 'oauth2:' });
-      delete process.env.GITLAB_TOKEN;
     });
 
     // ... similar changes for remaining tests
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/handlers/clone.test.ts` around lines 480 - 510, The tests
around resolveForgeAuth currently set and delete process.env variables inline;
replace those per-test inline mutations with a beforeEach that ensures relevant
env vars (GH_TOKEN, GITLAB_TOKEN) are cleared or set to known defaults and an
afterEach that restores the original process.env entries to guarantee cleanup on
failure; update the tests that currently call process.env.GH_TOKEN /
process.env.GITLAB_TOKEN and delete them to rely on the shared
beforeEach/afterEach setup so each test isolates side effects and mirrors the
multi-forge suite pattern.

260-262: ⚡ Quick win

Align cleanup with the new multi-forge suite pattern.

The GH_TOKEN suite sets the token in beforeEach (line 257) but cleans up in afterAll. If a test fails or if only some tests in this suite run, the env var may leak to other tests. The new multi-forge suite (lines 312-316) demonstrates the correct pattern using afterEach for cleanup, ensuring determinism even when tests are interrupted.

♻️ Align env var cleanup with beforeEach/afterEach pattern
-    afterAll(() => {
+    afterEach(() => {
       delete process.env.GH_TOKEN;
     });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/handlers/clone.test.ts` around lines 260 - 262, The
GH_TOKEN test suite sets process.env.GH_TOKEN in the beforeEach block but
currently removes it in afterAll, which can leak the env var; change the cleanup
to an afterEach that clears process.env.GH_TOKEN to match the multi-forge suite
pattern and ensure deterministic teardown even if tests fail or run
partially—update the suite surrounding the beforeEach that sets GH_TOKEN to use
afterEach for deletion instead of afterAll.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/core/src/handlers/clone.test.ts`:
- Around line 470-472: The authenticated URL rebuild in clone.ts currently omits
parsed.port so non-standard ports are stripped; update the reconstruction (where
parsed, authScheme and forgeToken are used) to append parsed.port when present
(e.g., include `:${parsed.port}` conditionally between parsed.hostname and
parsed.pathname) and then update the test in clone.test.ts that asserts the
authenticated URL (the expectation using cloneCall?.[1]?.[1]) to expect the URL
with the port preserved (e.g., include :8443).

In `@packages/core/src/handlers/clone.ts`:
- Around line 327-336: The authenticated-clone logic in
packages/core/src/handlers/clone.ts uses parsed.hostname which drops ports, so
change it to use parsed.host (hostname[:port]) when rebuilding cloneUrl inside
the forgeToken branch of the code that sets cloneUrl (refer to safeParseUrl,
workingUrl, parsed, and cloneUrl) to preserve custom ports; also add a
regression test (e.g., clone.test.ts) using a URL like
https://gitlab.example.com:8443/group/repo.git to assert the resulting
authenticated URL includes the :8443 port.

---

Nitpick comments:
In `@packages/core/src/handlers/clone.test.ts`:
- Around line 480-510: The tests around resolveForgeAuth currently set and
delete process.env variables inline; replace those per-test inline mutations
with a beforeEach that ensures relevant env vars (GH_TOKEN, GITLAB_TOKEN) are
cleared or set to known defaults and an afterEach that restores the original
process.env entries to guarantee cleanup on failure; update the tests that
currently call process.env.GH_TOKEN / process.env.GITLAB_TOKEN and delete them
to rely on the shared beforeEach/afterEach setup so each test isolates side
effects and mirrors the multi-forge suite pattern.
- Around line 260-262: The GH_TOKEN test suite sets process.env.GH_TOKEN in the
beforeEach block but currently removes it in afterAll, which can leak the env
var; change the cleanup to an afterEach that clears process.env.GH_TOKEN to
match the multi-forge suite pattern and ensure deterministic teardown even if
tests fail or run partially—update the suite surrounding the beforeEach that
sets GH_TOKEN to use afterEach for deletion instead of afterAll.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bb34a98d-6149-4c08-a139-fe53ca414024

📥 Commits

Reviewing files that changed from the base of the PR and between d2d65bd and 51f0c0a.

📒 Files selected for processing (6)
  • packages/core/src/handlers/clone.test.ts
  • packages/core/src/handlers/clone.ts
  • packages/docs-web/src/content/docs/adapters/community/gitea.md
  • packages/docs-web/src/content/docs/adapters/community/gitlab.md
  • packages/docs-web/src/content/docs/getting-started/configuration.md
  • packages/docs-web/src/content/docs/getting-started/quick-start.md
✅ Files skipped from review due to trivial changes (4)
  • packages/docs-web/src/content/docs/adapters/community/gitlab.md
  • packages/docs-web/src/content/docs/adapters/community/gitea.md
  • packages/docs-web/src/content/docs/getting-started/quick-start.md
  • packages/docs-web/src/content/docs/getting-started/configuration.md

Comment on lines +470 to +472
expect(cloneCall?.[1]?.[1]).toBe(
'https://oauth2:glpat-porttest@gitlab.example.com/owner/repo'
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Extract the URL construction logic from clone.ts to verify port handling

rg -n -A 10 -B 5 'resolveForgeAuth|parsed\.hostname|parsed\.host' packages/core/src/handlers/clone.ts | head -60

Repository: coleam00/Archon

Length of output: 1813


🏁 Script executed:

# Read the test file around lines 464-472 to see the full test context
sed -n '450,480p' packages/core/src/handlers/clone.test.ts

Repository: coleam00/Archon

Length of output: 1254


🏁 Script executed:

# Check if there are other port-related tests in clone.test.ts
rg -n -B 3 -A 3 ':8443|8443|port' packages/core/src/handlers/clone.test.ts

Repository: coleam00/Archon

Length of output: 2145


🏁 Script executed:

# Verify if parsed.port is available and check the full URL reconstruction logic
rg -n -B 2 -A 2 'parsed\.port' packages/core/src/handlers/clone.ts

Repository: coleam00/Archon

Length of output: 41


🏁 Script executed:

# Find safeParseUrl function to see what properties are available
rg -n -A 20 'function safeParseUrl|export.*safeParseUrl' packages/core/src/handlers/clone.ts

Repository: coleam00/Archon

Length of output: 792


🏁 Script executed:

# Check if there's any URL parsing utility in the codebase
fd -e ts -e tsx | xargs rg -l 'safeParseUrl' | head -10

Repository: coleam00/Archon

Length of output: 95


🏁 Script executed:

# Look for any URL.parse or similar in clone.ts
rg -n 'new URL|URL\.parse|safeParseUrl' packages/core/src/handlers/clone.ts

Repository: coleam00/Archon

Length of output: 227


🏁 Script executed:

# Verify URL object has port property and how it should be used
python3 << 'EOF'
# Simulate URL parsing behavior in JavaScript
from urllib.parse import urlparse

test_urls = [
    'https://gitlab.example.com:8443/owner/repo',
    'https://gitlab.example.com/owner/repo',
    'https://oauth2:token@gitlab.example.com:8443/owner/repo',
]

for url in test_urls:
    parsed = urlparse(url)
    print(f"URL: {url}")
    print(f"  hostname: {parsed.hostname}")
    print(f"  port: {parsed.port}")
    print(f"  path: {parsed.path}")
    
    # Show what correct reconstruction should look like
    if parsed.port:
        reconstructed = f"https://oauth2:token@{parsed.hostname}:{parsed.port}{parsed.path}"
    else:
        reconstructed = f"https://oauth2:token@{parsed.hostname}{parsed.path}"
    print(f"  reconstructed: {reconstructed}")
    print()
EOF

Repository: coleam00/Archon

Length of output: 620


🏁 Script executed:

# Check if there are other tests that verify port handling
rg -n 'port|:80|:443|:8443|:8080' packages/core/src/handlers/clone.test.ts | grep -v node_modules

Repository: coleam00/Archon

Length of output: 999


🏁 Script executed:

# Verify the implementation actually needs fixing by checking the full reconstruction logic
sed -n '320,340p' packages/core/src/handlers/clone.ts

Repository: coleam00/Archon

Length of output: 869


Port is stripped from authenticated clone URL, breaking non-standard port cloning.

The implementation at line 330 of clone.ts reconstructs the URL using only parsed.hostname and parsed.pathname, completely omitting parsed.port. For git clone to connect to non-standard ports, the port must be included in the URL (e.g., https://oauth2:token@gitlab.example.com:8443/owner/repo).

The test validates this buggy behavior: with input https://gitlab.example.com:8443/owner/repo, it expects https://oauth2:glpat-porttest@gitlab.example.com/owner/repo (no port). The test comment "Should NOT contain double port" is misleading—the correct goal is to preserve a single port, not strip it entirely. This breaks cloning from self-hosted GitLab/Gitea instances on non-standard ports.

Fix required:

  • Modify line 330 to conditionally include the port: `https://${authScheme}${forgeToken}@${parsed.hostname}${parsed.port ? `:${parsed.port}` : ''}${parsed.pathname}`
  • Update test expectation to include the port: 'https://oauth2:glpat-porttest@gitlab.example.com:8443/owner/repo'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/handlers/clone.test.ts` around lines 470 - 472, The
authenticated URL rebuild in clone.ts currently omits parsed.port so
non-standard ports are stripped; update the reconstruction (where parsed,
authScheme and forgeToken are used) to append parsed.port when present (e.g.,
include `:${parsed.port}` conditionally between parsed.hostname and
parsed.pathname) and then update the test in clone.test.ts that asserts the
authenticated URL (the expectation using cloneCall?.[1]?.[1]) to expect the URL
with the port preserved (e.g., include :8443).

Comment thread packages/core/src/handlers/clone.ts
kagura-agent added a commit to kagura-agent/Archon that referenced this pull request May 13, 2026
…cleanup

- Fix parsed.host → parsed.hostname to avoid port duplication in clone URLs
- Generalize SSH→HTTPS conversion from GitHub-only to all forges (git@host:path)
- Fix ForgeAuthEntry docstring (remove non-existent exact: true/false refs)
- Add GITLAB_TOKEN/GITEA_TOKEN docs to quick-start, configuration, adapter pages
- Add SSH URL + port duplication regression tests
- Harden test env cleanup with beforeEach/afterEach, import afterEach from bun:test
- Remove redundant per-test env cleanup (beforeEach/afterEach handles it)

Addresses review by @Wirasm on coleam00#1658.
@kagura-agent kagura-agent force-pushed the fix/clone-multi-forge-auth branch from 51f0c0a to f21784d Compare May 13, 2026 12:20

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
packages/core/src/handlers/clone.ts (1)

327-336: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Custom port on self-hosted forges is still dropped from the authenticated clone URL.

Line 330 still reads parsed.hostname, so an input like https://gitlab.example.com:8443/group/repo.git rebuilds to https://oauth2:<TOKEN>@gitlab.example.com/group/repo.git — the :8443 is lost and git clone will hit the default HTTPS port. This was flagged previously and is one of the explicit PR goals (self-hosted support with custom ports). The PR notes mention a "port-duplication regression" test, but per the current code the failure mode is port omission, not duplication.

Use parsed.host (which is hostname[:port] with no duplication when the URL doesn't carry a port) and add a regression test that asserts the port survives the rewrite for a URL like https://gitlab.example.com:8443/group/repo.git.

🔧 Proposed fix
   if (forgeToken) {
     const parsed = safeParseUrl(workingUrl);
     if (parsed) {
-      cloneUrl = `https://${authScheme}${forgeToken}@${parsed.hostname}${parsed.pathname}`;
+      cloneUrl = `https://${authScheme}${forgeToken}@${parsed.host}${parsed.pathname}`;
     } else if (!workingUrl.startsWith('http')) {
       // Bare host/path form (e.g. github.com/owner/repo)
       cloneUrl = `https://${authScheme}${forgeToken}@${workingUrl}`;
     }
     getLog().debug('clone_authenticated');
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/handlers/clone.ts` around lines 327 - 336, The
authenticated clone URL builder in clone.ts loses custom ports because it uses
parsed.hostname; change it to use parsed.host (which includes hostname:port)
when constructing cloneUrl in the block that sets cloneUrl =
`https://${authScheme}${forgeToken}@${parsed.hostname}${parsed.pathname}` so it
becomes `https://${authScheme}${forgeToken}@${parsed.host}${parsed.pathname}`;
also add a regression test exercising the code path that parses an input like
`https://gitlab.example.com:8443/group/repo.git` (using safeParseUrl/clone logic
or the function that returns cloneUrl) and assert the resulting authenticated
URL preserves `:8443`.
🧹 Nitpick comments (1)
packages/core/src/handlers/clone.ts (1)

54-63: ⚡ Quick win

Move the resolveForgeAuth JSDoc next to the function it documents.

The docstring on lines 54-57 sits before SELF_HOSTED_FORGE (which has its own JSDoc on line 58), so IDE tooling and tsdoc/API extractors will attach the description to SELF_HOSTED_FORGE (or drop it entirely) instead of resolveForgeAuth on line 65. Relocate the block to immediately above the function.

📝 Proposed reorder
-/**
- * Resolve forge-specific authentication token and URL scheme for a repository URL.
- * Returns the token and auth scheme prefix, or empty values if no token is available.
- */
 /** Well-known self-hosted hostname label patterns → env var + scheme. */
 const SELF_HOSTED_FORGE: { label: string; envVar: string; scheme: string }[] = [
   { label: 'gitlab', envVar: 'GITLAB_TOKEN', scheme: 'oauth2:' },
   { label: 'gitea', envVar: 'GITEA_TOKEN', scheme: '' },
   { label: 'forgejo', envVar: 'GITEA_TOKEN', scheme: '' },
 ];

+/**
+ * Resolve forge-specific authentication token and URL scheme for a repository URL.
+ * Returns the token and auth scheme prefix, or empty values if no token is available.
+ */
 export function resolveForgeAuth(url: string): { token: string | undefined; scheme: string } {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/core/src/handlers/clone.ts` around lines 54 - 63, The JSDoc for
resolveForgeAuth is currently placed above SELF_HOSTED_FORGE so tools will
associate it with the wrong symbol; move the comment block currently on lines
describing "Resolve forge-specific authentication token and URL scheme..." so it
appears immediately above the resolveForgeAuth function declaration (reference:
resolveForgeAuth) and leave the SELF_HOSTED_FORGE JSDoc with the
SELF_HOSTED_FORGE constant; ensure only the resolveForgeAuth description sits
directly above the resolveForgeAuth function signature so tsdoc/IDE pick it up
correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@packages/core/src/handlers/clone.ts`:
- Around line 327-336: The authenticated clone URL builder in clone.ts loses
custom ports because it uses parsed.hostname; change it to use parsed.host
(which includes hostname:port) when constructing cloneUrl in the block that sets
cloneUrl =
`https://${authScheme}${forgeToken}@${parsed.hostname}${parsed.pathname}` so it
becomes `https://${authScheme}${forgeToken}@${parsed.host}${parsed.pathname}`;
also add a regression test exercising the code path that parses an input like
`https://gitlab.example.com:8443/group/repo.git` (using safeParseUrl/clone logic
or the function that returns cloneUrl) and assert the resulting authenticated
URL preserves `:8443`.

---

Nitpick comments:
In `@packages/core/src/handlers/clone.ts`:
- Around line 54-63: The JSDoc for resolveForgeAuth is currently placed above
SELF_HOSTED_FORGE so tools will associate it with the wrong symbol; move the
comment block currently on lines describing "Resolve forge-specific
authentication token and URL scheme..." so it appears immediately above the
resolveForgeAuth function declaration (reference: resolveForgeAuth) and leave
the SELF_HOSTED_FORGE JSDoc with the SELF_HOSTED_FORGE constant; ensure only the
resolveForgeAuth description sits directly above the resolveForgeAuth function
signature so tsdoc/IDE pick it up correctly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 68c81bbe-31e9-43f0-9169-943d396b4696

📥 Commits

Reviewing files that changed from the base of the PR and between 51f0c0a and f21784d.

📒 Files selected for processing (6)
  • packages/core/src/handlers/clone.test.ts
  • packages/core/src/handlers/clone.ts
  • packages/docs-web/src/content/docs/adapters/community/gitea.md
  • packages/docs-web/src/content/docs/adapters/community/gitlab.md
  • packages/docs-web/src/content/docs/getting-started/configuration.md
  • packages/docs-web/src/content/docs/getting-started/quick-start.md
✅ Files skipped from review due to trivial changes (2)
  • packages/docs-web/src/content/docs/adapters/community/gitlab.md
  • packages/docs-web/src/content/docs/adapters/community/gitea.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/docs-web/src/content/docs/getting-started/configuration.md
  • packages/core/src/handlers/clone.test.ts

@Wirasm

Wirasm commented May 14, 2026

Copy link
Copy Markdown
Collaborator

Review Summary

Verdict: minor-fixes-needed

This PR adds multi-forge clone authentication for GitLab, Gitea, and Forgejo — a solid, well-tested extension of the existing GitHub-only auth flow. The implementation is clean and follows project conventions. Two issues need attention before merge: a security concern in hostname detection and missing docs for the new env vars.

Blocking issues

  • packages/core/src/handlers/clone.ts:72-75 — Self-hosted forge detection matches labels anywhere in the hostname string, including after the TLD (e.g., gitlab.example.com.gitlab.attacker.com would trigger injection). Fix: only check hostname labels that appear before the TLD boundary.

  • packages/docs-web/src/content/docs/getting-started/configuration.md — The primary environment-variable table is missing GH_TOKEN, GITLAB_TOKEN, and GITEA_TOKEN. These are the vars users first look for when setting up private-repo clones.

Suggested fixes

  • packages/core/src/handlers/clone.ts:51 — Remove gitea.com from FORGE_AUTH; it's dead code since the gitea label in SELF_HOSTED_FORGE already handles self-hosted cases.

  • packages/core/src/handlers/clone.ts:~325-332 — Add a test for the bare host/path URL form (github.com/owner/repo) to exercise the else branch where safeParseUrl returns null.

Minor / nice-to-have

  • packages/core/src/handlers/clone.ts:47 — Move the orphaned JSDoc comment above the SELF_HOSTED_FORGE declaration as a block comment.
  • packages/docs-web/src/content/docs/getting-started/quick-start.md — Consider moving the private-repo token step from a numbered prerequisite to a tip callout under "What's Next?".
  • packages/docs-web/src/content/docs/reference/troubleshooting.md — Add GITLAB_TOKEN and GITEA_TOKEN variants to the 401/403 clone-failure section.
  • packages/core/src/handlers/clone.ts:~406 — Add isolated test for registerRepository SSH→HTTPS URL normalization.

Compliments

  • The resolveForgeAuth function is a clean, pure abstraction that handles all input forms without silent failures — good architecture.
  • Adapter docs (gitlab.md, gitea.md) got excellent tip boxes explaining dual-purpose token usage.
  • Error-handling patterns are consistent throughout; the safeParseUrl → explicit null-check pattern is well-applied.

Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, error-handling, test-coverage, comment-quality, docs-impact.

@kagura-agent

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review @Wirasm! 🙏

I've pushed a follow-up commit (acc36f12) addressing the remaining feedback:

Must-fix (blocking):

  1. Hostname boundary check — Changed labels.includes(entry.label)labels[0] === entry.label so self-hosted forge detection only matches the leftmost subdomain. Prevents token leakage to crafted hostnames like gitlab.example.com.gitlab.attacker.com.
  2. Docs: forge tokens in env tableGH_TOKEN, GITLAB_TOKEN, and GITEA_TOKEN added to the configuration.md environment variable table (done in previous commit).

Should-fix:
3. ✅ Removed dead gitea.com from FORGE_AUTH — it was redundant since the gitea label in SELF_HOSTED_FORGE already handles Gitea instances.
4. ✅ Bare host/path test — Added test for github.com/owner/repo form (no protocol) to exercise the else branch in safeParseUrl.
5. ✅ Moved orphaned JSDocresolveForgeAuth documentation now sits directly above the function declaration.

All 61 tests pass.

…cleanup

- Fix parsed.host → parsed.hostname to avoid port duplication in clone URLs
- Generalize SSH→HTTPS conversion from GitHub-only to all forges (git@host:path)
- Fix ForgeAuthEntry docstring (remove non-existent exact: true/false refs)
- Add GITLAB_TOKEN/GITEA_TOKEN docs to quick-start, configuration, adapter pages
- Add SSH URL + port duplication regression tests
- Harden test env cleanup with beforeEach/afterEach, import afterEach from bun:test
- Remove redundant per-test env cleanup (beforeEach/afterEach handles it)

Addresses review by @Wirasm on coleam00#1658.
- Only match first hostname label for self-hosted forge detection,
  preventing token leakage to crafted hostnames like
  gitlab.example.com.gitlab.attacker.com
- Remove dead gitea.com entry from FORGE_AUTH (handled by SELF_HOSTED_FORGE)
- Add test for bare host/path URL form (github.com/owner/repo)
- Move orphaned JSDoc comment above resolveForgeAuth function
@Wirasm Wirasm force-pushed the fix/clone-multi-forge-auth branch from acc36f1 to 47d88bb Compare May 15, 2026 13:41
@Wirasm Wirasm merged commit c94803e into coleam00:dev May 15, 2026
3 of 4 checks passed
cropse pushed a commit to cropse/Archon that referenced this pull request May 19, 2026
…EA_TOKEN (fixes coleam00#1655) (coleam00#1658)

* fix: address review feedback — hostname bug, docs, SSH support, test cleanup

- Fix parsed.host → parsed.hostname to avoid port duplication in clone URLs
- Generalize SSH→HTTPS conversion from GitHub-only to all forges (git@host:path)
- Fix ForgeAuthEntry docstring (remove non-existent exact: true/false refs)
- Add GITLAB_TOKEN/GITEA_TOKEN docs to quick-start, configuration, adapter pages
- Add SSH URL + port duplication regression tests
- Harden test env cleanup with beforeEach/afterEach, import afterEach from bun:test
- Remove redundant per-test env cleanup (beforeEach/afterEach handles it)

Addresses review by @Wirasm on coleam00#1658.

* fix(clone): hostname boundary check, remove dead code, add bare URL test

- Only match first hostname label for self-hosted forge detection,
  preventing token leakage to crafted hostnames like
  gitlab.example.com.gitlab.attacker.com
- Remove dead gitea.com entry from FORGE_AUTH (handled by SELF_HOSTED_FORGE)
- Add test for bare host/path URL form (github.com/owner/repo)
- Move orphaned JSDoc comment above resolveForgeAuth function
@Wirasm Wirasm mentioned this pull request May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Web UI clone handler only injects auth token for github.com URLs

2 participants