Conversation
|
r? @epage (rustbot has picked a reviewer for you, use r? to override) |
0c9d0c6 to
a305af5
Compare
62af2de to
1759263
Compare
|
Very early results for the MVP of fetch (a lot is missing, but it works already).
And for additional reference:
MeasurementsGix
|
2552fec to
e8c4201
Compare
e8c4201 to
fb11423
Compare
b903e2d to
759c264
Compare
|
☔ The latest upstream changes (presumably #10771) made this pull request unmergeable. Please resolve the merge conflicts. |
|
Will admit at this point in the review, my eyes are starting to glaze over the more complex parts (progress), particularly @weihanglo could you serve as a second pair of eyes on that? |
|
Thanks for the swift review! I am working on resolving the issues one by one, probably creating a commit for each of them while pushing from time to time while adding replies to the conversations. Apologies for the notifications that this might generate - I wish there was a way to bundle that but don't know of any. I am collecting a comments on conversations on what GitHub calls 'outdated' items as it won't let me answer there directly.
I probably won't manage to address all comments today but hope to get it done by Monday. Review ReportAll comments have been addressed with a commit except for the following, which have received a response instead.
If I am not missing anything, you can re-check, @epage. Thanks for your patience - I am rarely as fast as I hope to be. |
|
☔ The latest upstream changes (presumably #11699) made this pull request unmergeable. Please resolve the merge conflicts. |
|
☔ The latest upstream changes (presumably #11646) made this pull request unmergeable. Please resolve the merge conflicts. |
|
Regarding the merge-conflict: I have already rebased onto master locally but am holding back the commits to not break the commit references in the comments. |
|
Thanks @weihanglo for the review! I think I have addressed all issues, with only one question remaining here. |
There was a problem hiding this comment.
Finished the review of the progress bar part. Looking great in general! At least #9444 will be resolved after merge.
Just post the result of cargo tree -p prodash here for people to check the dependency tree of the crate gathering progress information. Cargo may not need some of those features (e.g. human_format) but it is fine with me btw.
cargo tree -p prodash
prodash v23.0.0
├── bytesize v1.1.0
├── dashmap v5.4.0
│ ├── cfg-if v1.0.0
│ ├── hashbrown v0.12.3
│ ├── lock_api v0.4.9
│ │ └── scopeguard v1.1.0
│ │ [build-dependencies]
│ │ └── autocfg v1.1.0
│ ├── once_cell v1.17.1
│ └── parking_lot_core v0.9.7
│ ├── cfg-if v1.0.0
│ ├── libc v0.2.139
│ └── smallvec v1.10.0
├── human_format v1.0.3
└── parking_lot v0.11.2
├── instant v0.1.12
│ └── cfg-if v1.0.0
├── lock_api v0.4.9 (*)
└── parking_lot_core v0.8.6
├── cfg-if v1.0.0
├── instant v0.1.12 (*)
├── libc v0.2.139
└── smallvec v1.10.0
| let url_for_authentication = &mut *url_for_authentication; | ||
| let remote = repo | ||
| .remote_at(orig_url)? | ||
| .with_refspecs( |
There was a problem hiding this comment.
Should we call with_fetch_tags here to align what Cargo does now?
cargo/src/cargo/sources/git/utils.rs
Lines 881 to 883 in ddfe87f
There was a problem hiding this comment.
That's a great catch! Indeed downloading included tags is the default, but users could override this setting with their git configuration, so better set it explicitly. It's done in ef04a36 (not pushed yet, didn't want to destroy previous references after rebase).
There was a problem hiding this comment.
Is gix::remote::fetch::Tags::Included equivalent to git2::AutotagOption::All? I think you're correct. It makes sense not to fetch unreachable refs, though I don't know if anyone depends on the behaviour.
There was a problem hiding this comment.
That's an amazing catch! And I don't even know why I put this there but can imagine that I was so focussed on the Included case after implementing it, that I put it there without re-checking how it's used. c585dc4 offers an alternative implementation which should match the one in git2.
|
☔ The latest upstream changes (presumably #11725) made this pull request unmergeable. Please resolve the merge conflicts. |
|
Thanks for your review notes, @weihanglo, I think I have addressed them all and left comments accordingly. Furthermore I spoke to @epage and gathered more tasks to address. They are also related to dependencies, so @weihanglo's comment on review tasks
Out of scopeThe idea is to review dependencies in
|
| .to_owned(); | ||
| let connection = | ||
| remote.connect(gix::remote::Direction::Fetch, &mut progress)?; | ||
| let mut authenticate = connection.configured_credentials(url)?; |
There was a problem hiding this comment.
with_authentication prioritize SSH over credential.helper. Maybe I understand it wrongly but it seems that the same rule doesn't hold in gix default authentication callback.
In addition, I have no idea if it works but git::Cred::default() seems to respect NTLM and Kerberos. Do we have that via the default credential helper?
It's totally reasonable to follow what git does, yet this is undoubtedly a breaking change if the old rule doesn't hold. I do feel when we stabilize gioxide feature in Cargo, we can advertise it follows what git does and ditch custom rules.
There was a problem hiding this comment.
Indeed, by nature these implementations will be different, even though I hope ultimately they have the same result. Due to a lack of tests (due to the incredible difficulty of setting them up) it's not easy to say what behaviour would actually occour and I am not able to run these test cases by hand either.
Due to gitoxide being by default like git but flexible enough to probably be like git2 as well, I believe we can get close to what git2 does once it's a bit clearer what that is.
In addition, I have no idea if it works but git::Cred::default() seems to respect NTLM and Kerberos. Do we have that via the default credential helper?
That depends on the credentials helper that is ultimately invoked, but I don't know. It seems these are specifically added by git2 and I don't know what it takes to add them to gitoxide - it mostly delegates to the credentials helper and that seems like the right thing to do.
In any case, I think it's worth keeping an eye on this topic and get to the trial stage to have users report the cases they need to work, and then we make them work as needed. If ultimately that means to setup the credential helper correctly, that seems like a fair thing to do as it's something most people have to do once for git anyway.
|
Once again, I believe I have addressed all review notes :). Here is an overview of the broad strokes taken, and many commits have been added for the individual steps. a4492ce contains an overview of how the dependency tree was cleared. Thanks everybody :). |
weihanglo
left a comment
There was a problem hiding this comment.
Thank you again, @Byron! I believe this PR is good to merge. Some regressions about progress bar could be resolved afterward.
@epage, do you have any concern? I don't feel like the merge needs an FCP but if you like.
Just notice that gix depends on crossbeam-utils and crossbeam-channel. For channel (1.67) and scoped thread (1.63) they have been ported to std in recent. We could track them and perhaps someday remove them from gix.
| let (rate, unit) = human_readable_bytes(counter.rate() as u64); | ||
| let msg = format!(", {rate:.2}{unit}/s"); | ||
|
|
||
| progress_bar.tick(objects, total_objects, &msg)?; |
There was a problem hiding this comment.
(Not really a blocker. We can fix this in follow-ups I believe.)
There are still some subtle changes/regressions regarding the progress bar.
- In the old implementation, we didn't split download and resolving deltas into two steps; they shared the one progress. In this PR we have different upper bounds for each task. Is it possible to follow the old behavior?
- It seems like the
delta_index_objectstask always gets stuck in 50%. Is that a bug in gix or just we display the wrong thing?
I don't mind if we eventually get two progresses but the progress reports should be correct and make sense to users.
Here is a record you can take as a reference: https://asciinema.org/a/563786?t=3
There was a problem hiding this comment.
A great catch, I just (tried) to go by keeping the code similar and called it a day, but didn't actually compare if it is the same.
In the old implementation, we didn't split download and resolving deltas into two steps; they shared the one progress. In this PR we have different upper bounds for each task. Is it possible to follow the old behavior?
That should be possible, I will give it a another shot.
It seems like the delta_index_objects task always gets stuck in 50%. Is that a bug in gix or just we display the wrong thing?
I think it's a display issue, as it doesn't get stuck when cloning with gix as far as I can tell. Maybe it's fixed as soon as I take a closer look at the above issue.
Here is a record you can take as a reference: https://asciinema.org/a/563786?t=3
Whoop whoop, that will be so helpful, thanks a lot!
I will be back once this is fixed, with a video for comparison :).
|
Thanks so much! I will definitely take a look at the progress bar issue now that I know it exists.
A great catch indeed! With an MSRV of 1.64 I have just ditched the direct |
Forget it. I was wrong. Only |
I'm ready to merge. Just waiting to hear back if we should block the progress bar improvements on it. |
|
☔ The latest upstream changes (presumably #11764) made this pull request unmergeable. Please resolve the merge conflicts. |
|
@weihanglo and @epage and I agree it's for the better to merge right away and provide fixes to the progress bar (and everything else) in follow-up PRs accordingly. Thus I have now squashed all commits, hopefully with a useful commit-message as well. Thanks everyone for lending your expertise to help me get this far. |
This allows to use `gitoxide` for all fetch operations, boosting performance for fetching the `crates.io` index by a factor of 2.2x, while being consistent on all platforms. For trying it, nightly builds of `cargo` can specify `-Zgitoxide=fetch`. It's also possible to set the `__CARGO_USE_GITOXIDE_INSTEAD_OF_GIT2=1` environment variable (value matters), which is when `-Zgitoxide=none` can be used to use `git2` instead. Limitations ----------- Note that what follows are current shortcomings that will be addressed in future PRs. - it's likely that authentication around the `ssh` protocol will work differently in practice as it uses the `ssh` program. - clones from `file://` based crates indices will need the `git` binary to serve the locatl repository. - the progress bar shown when fetching doesn't work like the orgiinal, but should already feel 'faster'.
|
This is soon to be merged. @Byron wrote a decent commit message cfffda9, calling out important stuff to follow up. Let me repost them here:
One thing also needs attentions is the behavior change of git credential helper mentioned in #11448 (comment). Thank you to everyone involved in this pull request! @bors r+ |
|
☀️ Test successful - checks-actions |
This PR is the first step towards resolving #1171.
In order to get there, we integrate
gitoxideintocargoin such a way that one can control its usage in nightly via-ZgitoxideorZgitoxide=<feature>[,featureN].Planned features are:
gitxide(this PR)gitoxide(planned)The above list is a prediction and might change as we understand the requirements better.
Testing and Transitioning
By default, everything stays as is. However, relevant tests can be re-runwith
gitoxideusingThere are about 200 tests with 'git' in their name and I plan to enable them one by one. That way the costs for CI stay managable (my first measurement with one test was 2min 30s), while allowing to take one step at a time.
Custom tests shall be added once we realize that more coverage is needed.
That way we should be able to maintain running
git2andgitoxideside by side until we are willing to switch over togitoxideentirely on stable cargo. Then turning ongit2might be a feature toggle for a while until we finally remove it from the codebase.Please see the above paragraph as invitation for discussion, it's merely a basis to explore from and improve upon.
Tasks
gitoxide(MVP)cargospecific values are passed togitoxidegit2code can handle crates-index clones created withgitoxideand vice-versagitoxideenabled testing - it's not used anymoreremove all TODOs and use crates-index version ofThe remaining 2 TODO's are more like questions for the reviewer.git-repositorygitoxiderelease and refer to the latest version from crates.io (instead of git-dependency)Postponed Tasks
I suggest to go breadth-first and implement the most valuable features first, and then aim for a broad replacement of
git2. What's left is details and improved compatibility with thegit2implementation that will be required oncegitoxideshould become the default implementation on stable to complete the transition.fileprotocol (i.e. without usinggit). Simple cases likeclonecan probably be supported quickly,fetchneeds more work though due to negotiation.libssh(avoid LGPL)libssh2based) transport. Look at this issue for some history.Proposed Workflow
I am now using stacked git to keep commits meaningful during development. This will also mean that before reviews I will force-push a lot as changes will be bucketed into their respective commits.
Once review officially begins I will stop force-pushing and create small commits to address review comments. That way it should be easier to understand how things change over time.
Those review-comments can certainly be squashed into one commit before merging.
Please let me know if this is feasible or if there are other ways of working you prefer.
Development notes
curl.gitoxideas it simply usesssh(the program) and calls it a day. No authentication flows are supported there yet and the goal would be to matchgitthere at least (which it might already do by just callingssh). Needs investigation. Once en-par withgitI thinkcargocan restart the whole fetch operation to try different user names like before.ssh-program based transport can now understand permission-denied errors, but the capability isn't used after all since a builtin ssh transport is required.git::Progressand just ignore most of the calls, but that's known to be too slow as the implementation assumes aProgress::inc()call is as fast as an atomic increment and makes no attempt to reduce its calls to it.thiserrorcould make spurious error checks nicer and less error prone during maintenance. It's not a problem though.RUSTFLAGS=--cfgto influence the entire build and unit-tests as environment variables didn't get through to the binary built and run for tests.Questions
gitoxideis configured the user has the opportunity to override these values using more specific git options, for example using url specific http settings. This looks like a feature to me, but if it's notgitoxideneeds to provide a way to disable applying these overrides. Please let me know what's desired here - my preference is to allow overrides.gitoxidecurrently opens repositories similar to howgitdoes which respects git specific environment variables. This might be a deviation from how it was before and can be turned off. My preference is to see it as a feature.Prerequisite PRs
util::Progressthread-safe as prerequisite of #11448 #11602