fix(core): match SSH URL host generically, not just github.com (closes #1614)#1656
Conversation
The SSH-to-HTTPS converter in normalizeRepoUrl() and registerRepository() only matched `git@github.com:` literally. Custom SSH host aliases, GitHub Enterprise, Gitea, GitLab, and Bitbucket SSH URLs were left unchanged, which produced workspace paths containing literal `git@<host>:` segments. On Windows the colon makes mkdir fail with ENOTDIR; on Unix the owner extraction is malformed. Replace the literal-host check with an SCP-style regex `/^git@([^:]+):(.+)$/` at both call sites. The github.com case still converts identically; new hosts (custom aliases, GHE, GitLab, Bitbucket) now produce path-safe HTTPS URLs. Closes coleam00#1614.
📝 WalkthroughWalkthroughRepository clone handlers now detect and convert generic SSH-form URLs ( ChangesSSH URL Normalization for Repository Registration
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/core/src/handlers/clone.test.ts (1)
253-285: ⚡ Quick winAdd matching non-github/custom-host coverage for
registerRepositorytoo.Great additions for
cloneRepository; please add one analogous test forregisterRepository(remoteoriginasgit@<host>:<owner>/<repo>.git) so both changed normalization call sites are protected.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/core/src/handlers/clone.test.ts` around lines 253 - 285, Add a new unit test in packages/core/src/handlers/clone.test.ts that mirrors the two SSH->HTTPS cases for cloneRepository but calls registerRepository instead; specifically mock createCodebase (mockCreateCodebase) to return a Codebase with repository_url set to the expected HTTPS form, call registerRepository('git@host:owner/repo.git') for both a custom-host alias case and a non-github host case, then inspect spyExecFileAsync.mock.calls to find the git command that sets origin (the call where args[0] === 'git' and args[1]?.[0] === 'remote' or similar) and assert the remote URL contains the HTTPS URL (e.g., 'https://gh-work/owner/repo' or 'https://gitlab.example.com/team/project') and does not contain 'git@'; keep the test structure and mocks consistent with existing cloneRepository tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/core/src/handlers/clone.ts`:
- Around line 178-180: The current SSH-to-HTTPS normalization uses
/^git@([^:]+):(.+)$/.exec(workingUrl) and rewrites workingUrl, which allows
lookalike hosts (e.g., github.com.evil) to pass a later
workingUrl.includes('github.com') check and receive GH_TOKEN; to fix, extract
the SSH host (sshMatch[1]) into a variable and either only rewrite to HTTPS when
that host is exactly 'github.com' or replace the later includes('github.com')
gate with a strict hostname check using URL parsing (e.g., new
URL(workingUrl).hostname === 'github.com'); update the code around the sshMatch
handling and the GH_TOKEN conditional to use this exact-host check so tokens are
only sent to the real github.com.
---
Nitpick comments:
In `@packages/core/src/handlers/clone.test.ts`:
- Around line 253-285: Add a new unit test in
packages/core/src/handlers/clone.test.ts that mirrors the two SSH->HTTPS cases
for cloneRepository but calls registerRepository instead; specifically mock
createCodebase (mockCreateCodebase) to return a Codebase with repository_url set
to the expected HTTPS form, call registerRepository('git@host:owner/repo.git')
for both a custom-host alias case and a non-github host case, then inspect
spyExecFileAsync.mock.calls to find the git command that sets origin (the call
where args[0] === 'git' and args[1]?.[0] === 'remote' or similar) and assert the
remote URL contains the HTTPS URL (e.g., 'https://gh-work/owner/repo' or
'https://gitlab.example.com/team/project') and does not contain 'git@'; keep the
test structure and mocks consistent with existing cloneRepository tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5cc34aab-64ba-4d55-b1f9-b349abcbc595
📒 Files selected for processing (2)
packages/core/src/handlers/clone.test.tspackages/core/src/handlers/clone.ts
| const sshMatch = /^git@([^:]+):(.+)$/.exec(workingUrl); | ||
| if (sshMatch) { | ||
| workingUrl = `https://${sshMatch[1]}/${sshMatch[2]}`; |
There was a problem hiding this comment.
Prevent GH_TOKEN leakage to lookalike hosts after generic SSH normalization.
Line 178-Line 180 now normalizes any git@<host>:... to HTTPS, which makes the later workingUrl.includes('github.com') gate unsafe. A host like git@github.com.evil:org/repo.git becomes https://github.com.evil/... and will receive GH_TOKEN.
Suggested fix
- if (ghToken && workingUrl.includes('github.com')) {
+ let isGithubHost = false;
+ try {
+ const parsed = new URL(workingUrl.startsWith('http') ? workingUrl : `https://${workingUrl}`);
+ isGithubHost = parsed.hostname === 'github.com';
+ } catch {
+ isGithubHost = false;
+ }
+
+ if (ghToken && isGithubHost) {🧰 Tools
🪛 OpenGrep (1.20.0)
[ERROR] 178-178: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.
(coderabbit.command-injection.exec-js)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/handlers/clone.ts` around lines 178 - 180, The current
SSH-to-HTTPS normalization uses /^git@([^:]+):(.+)$/.exec(workingUrl) and
rewrites workingUrl, which allows lookalike hosts (e.g., github.com.evil) to
pass a later workingUrl.includes('github.com') check and receive GH_TOKEN; to
fix, extract the SSH host (sshMatch[1]) into a variable and either only rewrite
to HTTPS when that host is exactly 'github.com' or replace the later
includes('github.com') gate with a strict hostname check using URL parsing
(e.g., new URL(workingUrl).hostname === 'github.com'); update the code around
the sshMatch handling and the GH_TOKEN conditional to use this exact-host check
so tokens are only sent to the real github.com.
Review SummaryVerdict: ready-to-merge This is a clean, well-scoped bug fix. The PR replaces hardcoded Blocking issuesNone. Suggested fixesNone. Minor / nice-to-haveNone. Compliments
Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review. |
…m00#1656) The SSH-to-HTTPS converter in normalizeRepoUrl() and registerRepository() only matched `git@github.com:` literally. Custom SSH host aliases, GitHub Enterprise, Gitea, GitLab, and Bitbucket SSH URLs were left unchanged, which produced workspace paths containing literal `git@<host>:` segments. On Windows the colon makes mkdir fail with ENOTDIR; on Unix the owner extraction is malformed. Replace the literal-host check with an SCP-style regex `/^git@([^:]+):(.+)$/` at both call sites. The github.com case still converts identically; new hosts (custom aliases, GHE, GitLab, Bitbucket) now produce path-safe HTTPS URLs. Closes coleam00#1614.
Summary
normalizeRepoUrl()andregisterRepository()only matchedgit@github.com:literally. SSH URLs with any other host (custom aliases, GHE, Gitea, GitLab, Bitbucket) were left asgit@<host>:owner/repo, producing workspace paths with literalgit@<host>:segments. On Windows the colon makes mkdir fail with ENOTDIR; on Unix the owner extraction is malformed.git@host:ownerdirectory on Unix.cleaned.startsWith('git@github.com:')for the SCP-style regex/^git@([^:]+):(.+)$/. The github.com path is identical; non-github SSH hosts now convert to path-safe HTTPS.workingUrl.includes('github.com')so non-github HTTPS URLs are not authenticated with a GH token. Local-path and HTTPS code paths are untouched.UX Journey
Before
After
Architecture Diagram
Before
After
Connection inventory:
cloneRepositorynormalizeRepoUrlregisterRepository(inline ssh→https)normalizeRepoUrl(inline ssh→https)cloneRepositoryincludes('github.com')Label Snapshot
risk: lowsize: XScorecore:handlers/cloneChange Metadata
bugcoreLinked Issue
Validation Evidence (required)
The full-monorepo
bun run lintOOMs the Node ESLint process on this branch and on a clean checkout ofupstream/main(pre-existing on this machine), so it is run scoped to the two touched files. The same OOM also blocks running the full test suite to completion; the file-scopedclone.test.tsrun passes all 46 tests including the two new ones for custom-alias and non-github SSH hosts.Security Impact (required)
NoNoNo(GH_TOKEN injection logic is untouched; still gates onworkingUrl.includes('github.com'), so agit@gh-work:o/rURL converted tohttps://gh-work/o/rwill not receive a GitHub token; a custom alias whose name contains the substringgithub.comwould, but that is pre-existing behavior in this file)NoCompatibility / Migration
Yes. The github.com case produces the byte-identical URLhttps://github.com/owner/repoit did before.NoNoHuman Verification (required)
Verified scenarios in
clone.test.ts:git@github.com:owner/repo.git→https://github.com/owner/repo(regression test, unchanged behavior)git@github.com:owner/repo.gitwithGH_TOKEN=ghp_…→ token-injected HTTPS URL (regression test, unchanged behavior)git@gh-work:owner/repo.git→https://gh-work/owner/repo(new test)git@gitlab.example.com:team/project.git→https://gitlab.example.com/team/project(new test)Edge cases checked:
not.toContain('git@')) still hold under the regex.git@github.com:still hits the github.com substring guard.What was not verified:
[^:], so the output cannot contain a colon.git@[::1]:owner/repo). Out of scope and not part of any existing call site.Side Effects / Blast Radius (required)
cloneRepository(url)orregisterRepository(localPath)whose remote is an SSH URL. github.com remotes are byte-identical.github.com(e.g.git@github.com-work:o/r) would now producehttps://github.com-work/o/r, which on the existingworkingUrl.includes('github.com')check would still trigger GH_TOKEN injection. This is the same loose-substring behavior the file has today on HTTPS URLs; not a regression.Rollback Plan (required)
git revert <merge-sha>ondev. The twomatch()blocks are local to this commit.Risks and Mitigations
git@but is not an SSH URL (e.g. a username happens to begitin a non-SSH context) would match the regex. Mitigation: SCP-style SSH grammar is<user>@<host>:<path>, and thegit@prefix plus colon plus non-empty path is the canonical form for git SSH remotes; the previous code also matched only on thegit@prefix, so the input class accepted here is strictly a superset of what the previous code accepted.Summary by CodeRabbit