Skip to content

Conversation

@geofft
Copy link
Contributor

@geofft geofft commented Jul 17, 2025

No description provided.

@geofft geofft requested review from jtfmumm, konstin and zanieb July 17, 2025 18:54
@geofft geofft force-pushed the geofft/python-download-old-versions branch from 05f956e to af86992 Compare July 17, 2025 18:58
@geofft geofft temporarily deployed to uv-test-registries July 17, 2025 19:01 — with GitHub Actions Inactive
@geofft
Copy link
Contributor Author

geofft commented Jul 17, 2025

I'll fix up docs and I can add some test coverage.

This lets you do things like

target/debug/uv python list --python-downloads-json-url https://github.com/astral-sh/uv/raw/0.7.12/crates/uv-python/download-metadata.json

The high-level goal here is making it a little bit easier for users to roll back if we unexpectedly break something in python-build-standalone. I also intend to add, likely in separate PRs,

  • a --uv-tag 0.7.12 option that is syntax sugar for the above
  • maybe some form of explicit versioning to the JSON, retroactively calling the current format version 1 but printing a nice message if we see {"version": "2", ...} (we do not necessarily have to ever move to version 2, but having the error is nice)
  • something to astral-sh/setup-uv to let you use this option

&self,
downloader: &'a ManagedPythonDownloader,
) -> impl Iterator<Item = &'a ManagedPythonDownload> {
let s = self.clone();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to clone here? That seems weird?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I needed this to get past some sort of lifetiime issue because the iterator captures self, but it seems fine now. I must have fixed something else better. I'll get rid of it, thanks!

@zanieb
Copy link
Member

zanieb commented Jul 17, 2025

I might need a couple sentence summary in the PR description about the high-level approach here.

@geofft
Copy link
Contributor Author

geofft commented Jul 17, 2025

I might need a couple sentence summary in the PR description about the high-level approach here.

In terms of implementation, or something more like design/intended semantics? In terms of implementation, this is broken down into two commits which you might prefer for reviewing, but the change is basically:

  • Instead of having static methods on ManagedPythonDownload that populate a once_cell, make an explicit ManagedPythonDownloader object to hold that state, and adjust a handful of 'static lifetimes as needed
  • Make the ManagedPythonDownloader constructor async, so we can do HTTP requests from it
  • Also have its callers pass in a client
  • Actually do the HTTP request

Oh, one note which I forgot to mention is that per serde::from_reader's docs, you definitely shouldn't call it on an unbuffered reader (I ran strace and confirmed it's read(2)ing one byte at a time) and you might still get better results calling it on a big string than on a buffered reader. So I'm reading the whole thing into memory regardless of source, which should be fine because it's under two megabytes at the moment.

In terms of semantics, this breaks the exciting corner case of parsing irc://localhost/path/to/file.json as the local file /path/to/file.json and instead treats that as a filename. An http/https URL is handled (hopefully with the normal set of network configuration for uv?), a valid file URL is handled, an invalid file URL (with a hostname, except on Windows where you get a UNC path back, or with %00 or something similarly exciting) is an error, and everything else is treated as a path. There should be no other changes to semantics.

konstin added a commit that referenced this pull request Jul 18, 2025
Reviewing #14687, I noticed that we had implemented a `Url::from_url_or_path`-like function, but it wasn't reusable. This change `Verbatim::from_url_or_path` so we can use it in other places too.

The PEP 508 parser is an odd place for this, but that's where `VerbatimUrl` and `Scheme` are already living.
konstin added a commit that referenced this pull request Jul 18, 2025
Reviewing #14687, I noticed that we had implemented a `Url::from_url_or_path`-like function, but it wasn't reusable. This change `Verbatim::from_url_or_path` so we can use it in other places too.

The PEP 508 parser is an odd place for this, but that's where `VerbatimUrl` and `Scheme` are already living.
Copy link
Member

@konstin konstin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a test as mentioned, the code is looking good already.

})
.into_iter()
.map(|a| InstallRequest::new(a, python_downloads_json_url.as_deref()))
.map(|a| InstallRequest::new(a, &downloader))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not your fault but I'm only now seeing it.

Suggested change
.map(|a| InstallRequest::new(a, &downloader))
.map(|request| InstallRequest::new(request, &downloader))

.iter()
.map(|target| PythonRequest::parse(target.as_str()))
.map(|a| InstallRequest::new(a, python_downloads_json_url.as_deref()))
.map(|a| InstallRequest::new(a, &downloader))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.map(|a| InstallRequest::new(a, &downloader))
.map(|request| InstallRequest::new(request, &downloader))

Comment on lines 373 to 376
let s = self.clone();
downloader
.iter_all()
.filter(move |download| s.satisfied_by_download(download))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let s = self.clone();
downloader
.iter_all()
.filter(move |download| s.satisfied_by_download(download))
downloader
.iter_all()
.filter(move |download| self.satisfied_by_download(download))

.as_ref()
.map(|a| PythonDownloadRequest::iter_downloads(a, python_downloads_json_url.as_deref()))
.transpose()?
.map(|a| PythonDownloadRequest::iter_downloads(a, &downloader))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.map(|a| PythonDownloadRequest::iter_downloads(a, &downloader))
.map(|request| request.iter_downloads(&downloader))

) -> Result<&'static ManagedPythonDownload, Error> {
if let Some(download) = request.iter_downloads(python_downloads_json_url)?.next() {
) -> Result<&ManagedPythonDownload, Error> {
if let Some(download) = request.iter_downloads(self).next() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call looks backwards, flipping the method so that the call is self.iter_downloads(request) would look more natural


let file = fs_err::File::open(json_source.as_ref())?;
) -> Result<Self, Error> {
// Although read_url() handles file:// URLs and converts them to local file reads, here we
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried improving the URL parsing and reusability situation in #14712, but I didn't get anything that's noticeable better than the current logic

konstin added a commit that referenced this pull request Jul 18, 2025
Reviewing #14687, I noticed that we had implemented a
`Url::from_url_or_path`-like function, but it wasn't reusable. This
change `Verbatim::from_url_or_path` so we can use it in other places
too.

The PEP 508 parser is an odd place for this, but that's where
`VerbatimUrl` and `Scheme` are already living.
///
/// If there is no stable version matching the request, a compatible pre-release version will
/// be searched for — even if a pre-release was not explicitly requested.
pub fn from_request(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the rename, this is a little weird? Like ManagedPythonDownloader::from_request -> ManagedPythonDownload when I'd expect from_.. to return Self.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, for this and Konsti's comment on "this call looks backwards" I was trying to minimize the amount of textual churn, but probably better to not optimize for that (especially now that reviewers have seen the commits so far :) ), I can rework things a bit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be resolved by #14687 (comment)

Makes sense. In general, don't be afraid to refactor in uv.

}

impl ManagedPythonDownload {
impl ManagedPythonDownloader {
Copy link
Member

@zanieb zanieb Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might rename this to ManagedPythonDownloads then have download_for_requestand iter_downloads methods (or just find and iter)? This doesn't actually download the versions, which is a little misleading.

Copy link
Member

@zanieb zanieb Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the rename is also a little confusing because the client you pass to this struct won't be used by ManagedPythonDownload for the actual download

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that occurred to me too. Maybe ManagedPythonDownloadList for a little more visual distinction than the plural?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I initially suggested that too and edited it out. I don't mind either way — ManagedPythonDownloads is a little closer to our typical naming in the project.

/// Note: The list is generated on the first call to this function.
/// so `python_downloads_json_url` is only used in the first call to this function.
pub fn iter_all(
pub async fn new(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should have a docstring. What's the tldr on why this can't just take a BaseClientBuilder? I'm not worried about the performance, but it'd be a little clearer that it doesn't necessarily need a materialized client.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the callsites and it seems like you have a BaseClientBuilder you could provide instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could, the only reason is that two of its callers construct a client shortly after calling this, and so if we passed a builder, we'd be calling .build() twice.. If that's not a problem I can definitely change that (and lazy-construct the client if we're in the HTTP(S) case only). But I was worried I'd have to do something like pass a &mut Option<BaseClient> to pass the constructed client back to the caller to reuse it, or something.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's probably fine if we build twice.

Comment on lines 675 to 679
if let Ok(path) = url.to_file_path() {
Source::Path(Cow::Owned(path))
} else {
return Err(Error::InvalidUrlFormat(url));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd probably write this as a let Ok(...) else return instead of if let Ok(...) else return so you can early return

@konstin konstin added the enhancement New feature or improvement to existing functionality label Jul 18, 2025
geofft added 7 commits July 18, 2025 23:30
In preparation for supporting download sources over HTTP, change the
implicit creation of the download list to an explicit async constructor.
Also pass it a BaseClient which will be used in the next commit.
…wnload list

If the JSON does not deserialize, check to see if any key is named
"version" and print a different error message hinting at upgrading uv.

The intention is that we will add this field on backwards-incompatible
changes (and set it to an integer or string or something else that does
not deserialize into a JsonPythonDownload).
@geofft geofft force-pushed the geofft/python-download-old-versions branch 2 times, most recently from e7ab6bc to 25dd584 Compare August 4, 2025 17:50
@konstin konstin changed the title Support http/https URLs in uv python --python-downloads-json-url Support http/https URLs in uv python --python-downloads-json-url Aug 4, 2025
Copy link
Member

@konstin konstin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need a (wiremock) test, otherwise r+ me on the code.

/// URL pointing to JSON of custom Python installations.
///
/// Note that currently, only local paths are supported.
/// This can be a local path or file://, http://, or https:// URL.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reads like there is a particle missing

konstin added a commit that referenced this pull request Aug 6, 2025
From reviewing #14687, I was reminded that every base client operation requires manually destructuring the `NetworkSettings`. We can avoid that by passing the global network settings directly.
konstin added a commit that referenced this pull request Aug 6, 2025
From reviewing #14687, I was reminded that every base client operation requires manually destructuring the `NetworkSettings`. We can avoid that by passing the global network settings directly.
konstin added a commit that referenced this pull request Aug 6, 2025
From reviewing #14687, I was reminded that every base client operation requires manually destructuring the `NetworkSettings`. We can avoid that by passing the global network settings directly.
@MeitarR
Copy link
Contributor

MeitarR commented Oct 14, 2025

Hi @geofft , this feature is something I'm very interested in as well.

I noticed there hasn't been any recent activity - are you still planning to work on it? If not, I'd be happy to help move it forward or open a continuation PR based on your work.

@konstin konstin removed the request for review from jtfmumm October 14, 2025 11:59
Gankra added a commit that referenced this pull request Nov 14, 2025
…16542)

continuation PR based on #14687

---------

Co-authored-by: Geoffrey Thomas <geofft@ldpreload.com>
Co-authored-by: Aria Desires <aria.desires@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or improvement to existing functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants