Skip to content

Add support for Git dependencies#283

Merged
charliermarsh merged 6 commits intomainfrom
charlie/vcs
Nov 2, 2023
Merged

Add support for Git dependencies#283
charliermarsh merged 6 commits intomainfrom
charlie/vcs

Conversation

@charliermarsh
Copy link
Copy Markdown
Member

@charliermarsh charliermarsh commented Nov 2, 2023

Summary

This PR adds support for Git dependencies, like:

flask @ git+https://github.com/pallets/flask.git

Right now, they're only supported in the resolver (and not the installer), since the installer doesn't yet support source distributions at all.

The general approach here is based on Cargo's Git implementation. Specifically, I adapted Cargo's git module to perform the cloning, which is based on libgit2.

As compared to Cargo's implementation, I made the following changes:

  • Removed any unnecessary code.
  • Fixed any Clippy errors for our stricter ruleset.
  • Removed the dependency on curl, in favor of reqwest which we use elsewhere.
  • Removed the ability to use gix. Cargo allows the use of gix as an experimental flag, but it only supports a small subset of the operations. When Cargo fully adopts gix, we should plan to do the same.
  • Removed Cargo's host key checking. We need to re-add this! I'll do it shortly.
  • Removed Cargo's progress bars. We should re-add this too, but we use indicatif and Cargo had their own thing.

There are a few follow-ups to consider:

  • Adding support in the installer.
  • When we lock, we should write out the Git URL that includes the exact SHA. This lets us cache in perpetuity and avoids dependencies changing without re-locking.
  • When we resolve, we should always try to refresh Git dependencies. (Right now, we skip if the wheel was already built.)

I'll work on the latter two in follow-up PRs.

Closes #202.

@charliermarsh charliermarsh force-pushed the charlie/vcs branch 3 times, most recently from bf0a2c8 to 2519334 Compare November 2, 2023 01:31
Comment thread crates/puffin-vcs/src/git.rs Outdated
@@ -0,0 +1,1367 @@
/// Git support is derived from Cargo's implementation.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file does the actual Git operations. It's largely derived from Cargo.

.join(short_id.as_str());
db.copy_to(actual_rev, &checkout_path, self.strategy, &self.client)?;

Ok(checkout_path)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cool thing about Cargo's implementation is that they maintain a "database" of repositories. So every repository that you ever check out is included in there. Then, peer to it, they have a directory of checkouts, which are created by checking out the commit in the database and hardlinking from the database to the checkout. So it's fast to switch between commits of the same repository.

@charliermarsh charliermarsh force-pushed the charlie/vcs branch 7 times, most recently from e075b5f to bed2256 Compare November 2, 2023 04:08
@charliermarsh charliermarsh marked this pull request as ready for review November 2, 2023 04:09

debug!("Performing a Git fetch for: {remote_url}");
match strategy {
FetchStrategy::Cli => fetch_with_cli(repo, remote_url, &refspecs, tags),
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are so many issues about Cargo not working as expected with SSH dependencies that I'm wondering if we should just make this the default (rust-lang/cargo#2078, rust-lang/cargo#3381, etc.).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This would not remove our dependency on libgit2 since this is just the fetch.)

Copy link
Copy Markdown
Member

@konstin konstin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review the parts copied from cargo.

It's a pity that we copy the code and there's no reusable crate.

Comment thread crates/puffin-build/src/lib.rs
Comment thread crates/puffin-build/src/lib.rs Outdated
Comment thread crates/puffin-resolver/src/resolver.rs Outdated
@charliermarsh
Copy link
Copy Markdown
Member Author

@konstin - I agree although I'm actually happy with how little code we had to copy in the end. It may not feel that way from the review, but I was able to pare it down a lot. Now, we're really only borrowing the actual Git operations and logic around when to refresh, when to reset, recursively cloning submodules, etc., which at least feels somewhat domain agnostic.

@charliermarsh charliermarsh merged commit 62c474d into main Nov 2, 2023
@charliermarsh charliermarsh deleted the charlie/vcs branch November 2, 2023 15:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support VCS dependencies

2 participants