
fix(install): fetch hosted git deps over https, not ssh#394

Merged
jdx merged 4 commits into main from fix/hosted-git-https on Apr 30, 2026


@jdx jdx commented Apr 30, 2026

Summary

  • For github / gitlab / bitbucket deps, re-derive an HTTPS URL from (host, owner, repo, sha) at fetch time instead of dialing whatever scheme the lockfile recorded. Matches npm pacote and pnpm gitHostedTarballFetcher.
  • SHA-pinned hosted reads go through https://codeload.github.com/<owner>/<repo>/tar.gz/<sha> (no git binary, no SSH key); on any HTTP error, fall back to a shallow git clone over the rewritten HTTPS URL so a system git credential helper (gh CLI etc.) keeps private repos working.
  • Branch / tag committishes are pinned via git ls-remote on the same rewritten HTTPS URL before reaching the codeload path. Non-hosted hosts (self-hosted GitLab / Gitea / arbitrary) still use the URL as written, preserving SSH-only setups.
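The routing described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Hosted` struct and the `parse_hosted`, `https_url`, and `codeload_url` helpers are hypothetical names, only github.com is handled, and only the codeload URL shape is taken from the summary.

```rust
/// Hypothetical (host, owner, repo) identity; not the PR's actual `HostedGit` type.
#[derive(Debug, PartialEq)]
pub struct Hosted {
    pub host: String,
    pub owner: String,
    pub repo: String,
}

/// Parse a few common clone-URL forms. Assumption: github.com only.
pub fn parse_hosted(url: &str) -> Option<Hosted> {
    // Drop any `#<committish>` fragment and a leading `git+`.
    let url = url.split('#').next()?;
    let url = url.strip_prefix("git+").unwrap_or(url);
    // Reduce every supported form to "owner/repo".
    let path = if let Some(rest) = url.strip_prefix("github:") {
        rest // shorthand form
    } else if let Some(rest) = url.strip_prefix("git@github.com:") {
        rest // scp-style form
    } else {
        let rest = url
            .strip_prefix("https://")
            .or_else(|| url.strip_prefix("ssh://"))?;
        let rest = rest.strip_prefix("git@").unwrap_or(rest);
        rest.strip_prefix("github.com/")?
    };
    let path = path.strip_suffix(".git").unwrap_or(path);
    let (owner, repo) = path.split_once('/')?;
    if owner.is_empty() || repo.is_empty() || repo.contains('/') {
        return None;
    }
    Some(Hosted {
        host: "github.com".into(),
        owner: owner.into(),
        repo: repo.into(),
    })
}

/// HTTPS clone URL re-derived from the identity, regardless of the recorded scheme.
pub fn https_url(h: &Hosted) -> String {
    format!("https://{}/{}/{}.git", h.host, h.owner, h.repo)
}

/// Tarball URL in the shape quoted in the summary.
pub fn codeload_url(h: &Hosted, sha: &str) -> String {
    format!("https://codeload.github.com/{}/{}/tar.gz/{}", h.owner, h.repo, sha)
}
```

A URL on any other host returns `None` from `parse_hosted`, which is where a caller would keep using the lockfile URL as written.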

Why

Reported in discussion #335 as a follow-up to #338. After #338 made package-lock.json git deps parse correctly, aube was dialing the lockfile-canonical git+ssh://git@github.com/…#<sha> directly, which fails for users with HTTPS-only git auth (e.g. gh CLI's git credential helper, no ~/.ssh/). npm/pnpm never dial that SSH URL — it's an identity form, not a fetch URL.

LocalSource::Git.url in the lockfile is preserved verbatim so cross-tool round-trip with pnpm / npm / yarn is unaffected.

Test plan

  • cargo test -p aube-lockfile -p aube-store -p aube-resolver --lib (459 unit tests, all green)
  • cargo clippy --all-targets -- -D warnings
  • cargo fmt --check
  • mise run test:bats test/git_deps.bats test/install.bats test/package_lock_write.bats (87 / 87 pass — clone fallback path unchanged)
  • 5 new unit tests covering: parse_hosted_git across all clone-URL forms (https / ssh / git+ / scp / github:), per-provider codeload URL synthesis, extract_codeload_tarball wrapper-strip + caching, defense against ../ and absolute symlink targets in tarball entries, short-commit rejection.
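Two of the checks listed above (short-commit rejection and the symlink-target defense) could look something like the sketch below. Both function names are hypothetical, and the 40-hex-character rule assumes SHA-1 object names; the PR's real checks may differ.

```rust
use std::path::{Component, Path};

/// True only for a full 40-character hex commit id (assumption: SHA-1 repos).
pub fn is_full_sha(s: &str) -> bool {
    s.len() == 40 && s.bytes().all(|b| b.is_ascii_hexdigit())
}

/// True if a symlink stored at `link` (a relative path inside the extract
/// root) with the given `target` cannot resolve outside that root.
pub fn symlink_stays_inside(link: &Path, target: &Path) -> bool {
    if target.is_absolute() {
        return false;
    }
    // Depth of the directory that contains the link, relative to the root.
    let mut depth: i64 = 0;
    for c in link.parent().unwrap_or(Path::new("")).components() {
        match c {
            Component::Normal(_) => depth += 1,
            Component::CurDir => {}
            _ => return false, // `..`, root, or prefix in the link path itself
        }
    }
    // Walk the target; dropping below depth 0 means it escaped the root.
    for c in target.components() {
        match c {
            Component::Normal(_) => depth += 1,
            Component::CurDir => {}
            Component::ParentDir => {
                depth -= 1;
                if depth < 0 {
                    return false;
                }
            }
            _ => return false, // root or drive prefix
        }
    }
    true
}
```

The walk is purely lexical, so it does not account for other symlinks already present in the tree; a real extractor would need to consider that case separately.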

🤖 Generated with Claude Code


Note

Medium Risk
Changes git dependency fetching/materialization and introduces new tarball extraction logic, which can affect install reliability and security if edge cases slip through. Mitigated by strict SHA gating, cache reuse, and explicit unsafe-path/symlink defenses with clone fallback.

Overview
Git dependencies on GitHub/GitLab/Bitbucket are now resolved/installed using npm/pnpm-like hosted routing: derive a canonical (host, owner, repo) identity from the lockfile URL, pin refs via git ls-remote over derived HTTPS, then prefer fetching a provider-specific codeload-style tarball for full-SHA commits (with fallback to shallow git clone).

This adds a new codeload extraction cache and hardened tarball extraction path in aube-store (wrapper-dir stripping, atomic cache population, entry/path/symlink validation, and Windows symlink handling), and wires the new flow through both the resolver (resolve_git_source now takes an optional RegistryClient) and installer while keeping the original lockfile url bytes unchanged for cross-tool round-tripping.
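As a rough illustration of the wrapper-dir stripping and entry path validation mentioned above, the sketch below drops the first path component of each tarball entry and rejects anything that could land outside the extract root. `strip_wrapper` is a hypothetical helper, and the assumption that the archive wraps everything in a single top-level directory is mine, not confirmed by the PR text.

```rust
use std::path::{Component, Path, PathBuf};

/// Map an archive entry path to its path inside the extract root.
/// Returns `None` for the wrapper directory itself and for unsafe entries.
pub fn strip_wrapper(entry: &Path) -> Option<PathBuf> {
    let mut comps = entry.components();
    // The first component must be a plain name: the wrapper directory.
    match comps.next()? {
        Component::Normal(_) => {}
        _ => return None,
    }
    let mut out = PathBuf::new();
    for c in comps {
        match c {
            Component::Normal(part) => out.push(part),
            Component::CurDir => {}
            // `..`, a root, or a drive prefix: refuse the entry.
            _ => return None,
        }
    }
    if out.as_os_str().is_empty() {
        None // the entry was the wrapper directory itself
    } else {
        Some(out)
    }
}
```

Returning `None` conflates "skip" and "reject"; a stricter version would use an error type so that unsafe entries abort the extraction and trigger the clone fallback.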

Reviewed by Cursor Bugbot for commit 8bd9198.

Match npm pacote / pnpm gitHostedTarballFetcher: for github / gitlab /
bitbucket deps, the lockfile-canonical `git+ssh://git@…` URL is
identity-only — re-derive an HTTPS URL from `(host, owner, repo, sha)`
each install. Public reads go through `codeload.github.com/<owner>/<repo>/tar.gz/<sha>`
(no `git` binary, no SSH key); on any HTTP error, fall back to a
shallow `git clone` over the rewritten HTTPS URL so a system git
credential helper (gh CLI etc.) keeps private repos working.

Branch / tag committishes are pinned via `git ls-remote` on the same
rewritten HTTPS URL before reaching the codeload path, so users with
no SSH key configured can install non-SHA-pinned hosted git deps too.

Non-hosted hosts (self-hosted GitLab / Gitea / arbitrary) keep using
the URL as written in the lockfile, preserving SSH-only setups.

Closes the follow-up reported in discussion #335 after #338: a
`package-lock.json` recording `git+ssh://git@github.com/…#<sha>` no
longer requires an SSH key to install.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

greptile-apps Bot commented Apr 30, 2026

Greptile Summary

This PR re-derives HTTPS fetch URLs at install/resolve time for GitHub, GitLab, and Bitbucket deps instead of dialling the lockfile-canonical git+ssh:// form, matching npm pacote / pnpm gitHostedTarballFetcher semantics. SHA-pinned deps go through a codeload HTTPS tarball with a new atomic-rename, tar-hardened extract cache, falling back to shallow git clone over HTTPS on any HTTP or extraction error; non-hosted and self-hosted URLs are unchanged.

Confidence Score: 5/5

Safe to merge; only P2 findings present, both in edge cases that fall back gracefully to git clone.

All P1-level concerns (codeload extraction failure falling back to clone, SSH→HTTPS rewrite consistency, cache key agreement between resolver and installer, tar safety) are handled correctly. Two P2 items found: dead trim_end_matches code and a missing second is_dir() guard in a concurrent rename edge case.

crates/aube-store/src/lib.rs — concurrent rename recovery path around line 1971

Important Files Changed

| Filename | Overview |
| --- | --- |
| crates/aube-lockfile/src/lib.rs | Adds HostedGit/HostedGitHost structs and parse_hosted_git for normalising lockfile URLs across all SSH/HTTPS/scp variants; trim_end_matches('/') on repo is unreachable dead code but otherwise logic is solid and well-tested. |
| crates/aube-resolver/src/local_source.rs | Rewrites resolve_git_source to: (1) rewrite SSH lockfile URLs to HTTPS for hosted providers, (2) attempt codeload tarball fetch, (3) fall back to git clone. The codeload extraction failure now correctly falls through to clone (matching the installer). Cache hit path, fallback symmetry, and client=None handling all look correct. |
| crates/aube-store/src/lib.rs | Adds extract_codeload_tarball, codeload_cache_lookup, and codeload_cache_paths; extraction has solid tar safety (path traversal, symlink escapes, entry/size caps, atomic rename). Minor gap: concurrent rename recovery misses a second is_dir() guard after the retry rename. |
| crates/aube-resolver/src/resolve.rs | One-line change threads Some(self.client.as_ref()) into resolve_git_source; straightforward and correct. |
| crates/aube/src/commands/install/mod.rs | Install path mirrors the resolver's codeload-first then clone-fallback logic; cache reuse key is consistent with the resolver (original_url + resolved), and HTTPS URL rewrite for shallow-clone git_host_in_list check is intentional. |
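For readers unfamiliar with the atomic-rename cache population and the `is_dir()` guard discussed in the review, here is a generic sketch of the pattern. It is not the code in crates/aube-store/src/lib.rs: `publish_dir` is a hypothetical name, and the recovery policy shown (treat an existing target directory after a failed rename as a concurrent winner) is an assumption about the intent.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Publish a fully populated `staging` directory at `target` with one rename,
/// so readers never observe a half-extracted tree.
pub fn publish_dir(staging: &Path, target: &Path) -> io::Result<()> {
    if target.is_dir() {
        // Someone already populated the cache; discard our copy.
        let _ = fs::remove_dir_all(staging);
        return Ok(());
    }
    if let Some(parent) = target.parent() {
        fs::create_dir_all(parent)?;
    }
    match fs::rename(staging, target) {
        Ok(()) => Ok(()),
        Err(err) => {
            // The rename can lose a race with another process. Re-check the
            // target before deciding whether this is a real failure.
            let _ = fs::remove_dir_all(staging);
            if target.is_dir() {
                Ok(())
            } else {
                Err(err)
            }
        }
    }
}
```

The re-check after the failed rename is the kind of guard the review refers to: without it, a benign lost race is reported as an error even though the cache entry is present.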


Reviews (4): Last reviewed commit: "fix(resolver): fall back to clone on cod..."

Comment thread crates/aube-store/src/lib.rs Outdated
Two review-flagged issues:

1. The three new codeload tests sandboxed `cache_dir()` by mutating
   `XDG_CACHE_HOME` from inside `unsafe { set_var }`. cargo test runs
   in parallel by default, so tests racing on the env-var read each
   other's tempdir — Linux happened to schedule us out of it but
   Windows surfaced it as a PermissionDenied on a sibling test's
   already-dropped tempdir, breaking CI on that runner.

   Factor a private `extract_codeload_tarball_at(cache_root, …)` and
   have tests pass their own `tempfile::tempdir()` directly. The
   public wrapper still resolves the cache root via `dirs::cache_dir()`,
   so production callers are unchanged.

2. The Windows symlink branch of `extract_codeload_tarball` was a
   silent `let _ = …` — packages that ship symlinks would extract
   into a half-populated tree and the linker would later fail on a
   missing entry several layers down with no breadcrumb back to the
   git dep. Surface it as `Error::Tar` with a hint to remove the
   cached extract so the next install attempt falls through to
   `git clone` (which materializes symlinks via git's own write path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jdx commented Apr 30, 2026

Addressed both findings + the Windows CI failure (same root cause as the test set_var race) in 0eaf662:

  • Test set_var race: factored a private extract_codeload_tarball_at(cache_root, …) and updated all three tests to pass their own tempfile::tempdir() directly. No more env-var mutation under parallel cargo test. Public API is unchanged — the wrapper still resolves the cache root via dirs::cache_dir().
  • Windows symlink silently dropped: now returns Error::Tar with a hint to remove the cached extract so the next install falls through to git clone (which can materialize symlinks via git's own admin-aware write path).

Verified locally: 75/75 store unit tests + 11/11 git_deps bats green. CI should be clean now.
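The test-isolation fix amounts to passing the cache root in as a parameter instead of reading it from process-global state. A minimal sketch of that shape follows, with assumptions labeled: `extract_at`, `extract`, and `default_cache_root` are hypothetical stand-ins, the environment lookup replaces the `dirs::cache_dir()` call the real wrapper uses, and `extract_at` is `pub` here only so the sketch is easy to exercise (the PR's `_at` function is private).

```rust
use std::path::{Path, PathBuf};
use std::{env, fs, io};

/// Stand-in for the real cache-root lookup (assumption: the actual wrapper
/// calls `dirs::cache_dir()`, which is not available in a std-only sketch).
fn default_cache_root() -> PathBuf {
    env::var_os("XDG_CACHE_HOME")
        .map(PathBuf::from)
        .unwrap_or_else(env::temp_dir)
}

/// Core logic takes the cache root explicitly, so each test can hand in its
/// own temporary directory and never touch shared environment variables.
pub fn extract_at(cache_root: &Path, key: &str) -> io::Result<PathBuf> {
    let target = cache_root.join("codeload").join(key);
    fs::create_dir_all(&target)?;
    Ok(target)
}

/// Production entry point: same behavior, root resolved from the environment.
pub fn extract(key: &str) -> io::Result<PathBuf> {
    extract_at(&default_cache_root(), key)
}
```

Because nothing in `extract_at` reads global state, tests that call it with distinct directories can run in parallel without racing on the cache location.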

Comment thread crates/aube/src/commands/install/mod.rs
Both the resolver and the install path were calling
`fetch_tarball_bytes` *before* `extract_codeload_tarball`'s top-of-
function `target.is_dir()` check. On the resolver→installer reuse path
that meant the installer paid a full HTTPS round-trip for the codeload
tarball only for the extract to immediately short-circuit and discard
the bytes. The `git_shallow_clone` fallback already has this fast
path at its entry — codeload should match.

Add `aube_store::codeload_cache_lookup(url, commit)` (a public,
network-free `target.is_dir()` probe sharing the same cache key
derivation as the extract path) and consult it before the fetch in
both call sites. Cache miss → unchanged behavior. Cache hit → skip
the download entirely.

Found by Cursor Bugbot review on PR #394.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jdx commented Apr 30, 2026

Cursor Bugbot's redundant-download finding addressed in b3a5dad: added aube_store::codeload_cache_lookup(url, commit) (network-free is_dir probe sharing the same cache key derivation as extract_codeload_tarball) and consult it before fetch_tarball_bytes in both the resolver and installer. Resolver→installer reuse no longer pays a wasted HTTPS round-trip — matches what git_shallow_clone's top-of-function fast path already does.
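The ordering change can be pictured as "probe the cache, then fetch", rather than "fetch, then discover the cache hit". The sketch below uses hypothetical names (`cache_lookup`, `materialize`) and a closure in place of the real `fetch_tarball_bytes` call; the single `payload` file stands in for actual tarball extraction.

```rust
use std::path::{Path, PathBuf};
use std::{fs, io};

/// Network-free probe: is there already an extracted directory for `key`?
pub fn cache_lookup(cache_root: &Path, key: &str) -> Option<PathBuf> {
    let target = cache_root.join(key);
    target.is_dir().then_some(target)
}

/// Consult the cache before paying for the download.
pub fn materialize<F>(cache_root: &Path, key: &str, fetch: F) -> io::Result<PathBuf>
where
    F: FnOnce() -> io::Result<Vec<u8>>,
{
    if let Some(hit) = cache_lookup(cache_root, key) {
        return Ok(hit); // cache hit: `fetch` is never called
    }
    let bytes = fetch()?;
    let target = cache_root.join(key);
    fs::create_dir_all(&target)?;
    // Placeholder for extraction: just persist the bytes.
    fs::write(target.join("payload"), bytes)?;
    Ok(target)
}
```

Sharing one key derivation between the probe and the extract path matters here: if the two disagreed, the probe would miss entries the extractor had already written.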

Comment thread crates/aube-resolver/src/local_source.rs Outdated

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit b3a5dad.

Comment thread crates/aube-resolver/src/local_source.rs Outdated
Mirror the installer: a corrupt or unexpectedly-shaped codeload tarball
(CDN hiccup, unsafe-path rejection, Windows symlink) now falls through
to the shallow `git clone` path instead of hard-failing the resolve.
Previously the `??` on `spawn_blocking` propagated the inner extract
error past the clone fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jdx commented Apr 30, 2026

Addressed in 8bd9198: the resolver's codeload extract path now matches on Ok / Err and falls through to git clone on extract failure, symmetric with the installer at crates/aube/src/commands/install/mod.rs:498-520. The previous ?? was eating the clone fallback for corrupt / unsafe-path / Windows-symlink tarballs.

Verified locally: cargo clippy --all-targets -- -D warnings clean, cargo test -p aube-resolver -p aube-store -p aube-lockfile --lib 462/462 green.
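The difference between `??` and an explicit match can be shown with a nested `Result`, which is the shape a blocking task wrapping a fallible extract produces. Everything in this sketch is illustrative: the error types, the `Source` enum, and `pick_source` are hypothetical, and treating a failed join as fatal is an assumption the PR text does not confirm.

```rust
/// Outer failure: the blocking task itself did not complete.
#[derive(Debug)]
pub struct JoinFailed;

/// Inner failure: the tarball was corrupt, unsafe, or otherwise rejected.
#[derive(Debug)]
pub struct ExtractFailed;

#[derive(Debug, PartialEq)]
pub enum Source {
    Codeload,
    Clone,
}

/// With `blocking??`, an inner `ExtractFailed` would return early and the
/// clone fallback would never run. Unwrapping only the outer layer with `?`
/// and matching on the inner result keeps the fallback reachable.
pub fn pick_source(
    blocking: Result<Result<(), ExtractFailed>, JoinFailed>,
) -> Result<Source, JoinFailed> {
    match blocking? {
        Ok(()) => Ok(Source::Codeload),
        // Extraction failed: fall through to the shallow clone path.
        Err(_extract_err) => Ok(Source::Clone),
    }
}
```

A real implementation would probably log the extract error before falling back, so that a persistently failing tarball path remains visible.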

Written with Claude.

@jdx jdx merged commit 78e5db3 into main Apr 30, 2026
17 checks passed
@jdx jdx deleted the fix/hosted-git-https branch April 30, 2026 12:58