Skip to content

Support proxy index urls#11782

Closed
jtfmumm wants to merge 8 commits intomainfrom
jtfm/url-proxy
Closed

Support proxy index urls#11782
jtfmumm wants to merge 8 commits intomainfrom
jtfm/url-proxy

Conversation

@jtfmumm
Copy link
Contributor

@jtfmumm jtfmumm commented Feb 25, 2025

[This draft PR still has a couple of minor FIXMEs and a number of FIXMEs for documenting code. I also plan to add more tests]

These changes enable users to supply a proxy index url for an index/registry via an --index-proxy-url argument or UV_INDEX_PROXY_URL env var (both take values of the form <index-name>=<proxy-url>). The "canonical" URL for that index is defined via the normal url field in [[tool.uv.index]].

This canonical URL is written to the lockfile. But when a proxy url is passed in for that index, it will actually be used in place of the canonical URL (though never written to the lockfile). This way, if different users or machines have private proxy urls, these will never be checked into source control. Proxy URLs are supported for uv add, uv sync, uv lock, and uv run.

There is currently a test for adding a package with a proxy index and then removing .venv and syncing.

Closes #6349

@jtfmumm jtfmumm force-pushed the jtfm/url-proxy branch 2 times, most recently from 7d8690b to d219a7b Compare February 25, 2025 21:02
@zanieb zanieb self-requested a review February 25, 2025 21:03
@jtfmumm jtfmumm force-pushed the jtfm/url-proxy branch 2 times, most recently from 0aab145 to 0e06041 Compare February 25, 2025 21:25
#[error("Proxy index requires a name, but the name was empty")]
MissingName,
// #[error(transparent)]
// UrlParse(#[from] url::ParseError),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Nit: dead code)

@charliermarsh
Copy link
Member

Nice job figuring out all the plumbing here. As a higher-level comment, we should definitely get feedback from the users that participated in #6349 to understand whether the proposed API solves their problem (or not). I'd probably suggesting commenting there with a clear outline on the proposed design and a link to this PR sooner rather than later, so that we can make sure we're on the right track. In that comment, you could also ask if users are willing / able to test this branch out on their own projects.

pub extra_index_url: Option<Vec<Maybe<PipExtraIndex>>>,

/// FIXME Document
#[arg(long, env = EnvVars::UV_INDEX_PROXY_URL, value_delimiter = ' ', value_parser = parse_index_proxy_url, help_heading = "Index options")]
Copy link
Member

@zanieb zanieb Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry about UV_INDEX_PROXY_URL not matching the UV_INDEX_<name>_USERNAME and UV_INDEX_<name>_PASSWORD options. Presumably, the reason the environment variables include the index name is it makes it easier to set variables for multiple indexes? We also may have just copied Poetry here. But, otherwise you have to pass them as separated values (with spaces here, commas in some other interfaces) — which is fine in the general case, but probably problematic for credentials where separators could be valid characters.

I'm not entirely sure this should change, but consistency is important. You could argue we should support UV_INDEX_USERNAME=<name>=<username> instead, but I don't know how we'd deal with the aforementioned case where the separator is a valid character in the value.

Another argument is that we don't have CLI options for usernames and passwords — they're part of the --index <...> value. It could be sufficiently distinct for that reason. We could weigh the usage of --index-proxy-url directly matching UV_INDEX_PROXY_URL over consistency with the other environment variables.

Another option is to support both, but I don't think there's strong justification for that.

Copy link
Contributor Author

@jtfmumm jtfmumm Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can index names contain a combination of dashes and underscores (e.g., alpha-1_b)? If there were a name like this, you couldn't specify it following the UV_INDEX_PROXY_URL_<name> approach.

My thinking was to keep the values of the command line arg and env var the same, but I'm open to alternatives.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already support index names in environment variables, so there should be no "new" problems here.

The supported characters are at

'a'..='z' | 'A'..='Z' | '0'..='9' | '-' | '_' | '.' => {}

and here's the environment variable key conversion

/// Converts the index name to an environment variable name.
///
/// For example, given `IndexName("foo-bar")`, this will return `"FOO_BAR"`.
pub fn to_env_var(&self) -> String {
self.0
.chars()
.map(|c| {
if c.is_ascii_alphanumeric() {
c.to_ascii_uppercase()
} else {
'_'
}
})
.collect::<String>()
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm still undecided what's best for users here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an edge case, but what happens if a user names one index alpha-b and another alpha_b? Do we treat those as the same name?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a good question — I think we should require unique normalized names and error if they do that. I'm not sure if we do today.

Copy link
Contributor Author

@jtfmumm jtfmumm Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that's the right behavior, which means it's not a reason against your env var proposal. Though I'm still not sure whether it's better to have the values be the same as for the cli arg.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On that note, can you check if we error and if not open a ticket to do so?

I'm leaning towards the <name> in the variable. I think consistency across our environment variables probably makes more sense than consistency between the CLI and environment; so:

UV_INDEX_<name>_USERNAME
UV_INDEX_<name>_PASSWORD
UV_INDEX_<name>_PROXY_URL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. And yeah, I agree with you. Let's change it to your proposal.

uv_snapshot!(context.filters(), context.add()
.arg("anyio>=4.3.0")
.arg("--index-proxy-url")
.arg("alpha=https://public:heron@pypi-proxy.fly.dev/basic-auth/simple"), @r"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be preferable not to use our proxy here, it's a bit flaky and I'd only use it if you're trying to test for proper credential management. You should just be able to use the normal PyPI?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speaking of testing credentials, it may be worth having a separate test that uses the UV_INDEX... variables to set credentials and ensure they're propagated to the proxy URL correctly.

Copy link
Contributor Author

@jtfmumm jtfmumm Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting question. The proxy replacement only works if the url bases for sdists/wheels are the same as the canonical url. As @charliermarsh mentioned in the original issue, this isn't true for PyPi, which uses urls like https://files.pythonhosted.org/packages/94/49/26a7b0f3f35da4b5a65f081943b7bcd22d7002f5f0fb8098ec1ff21cb6ef/black-25.1.0.tar.gz.

Testing locally, I've been using devpi, which on its own writes your localhost URL as the base for everything. The PyPi proxy I'm using in this test has the same problem as PyPi. Do we have anything like devpi we've used for other tests?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, interesting. That's.. tricky. I worry this will be a problem for users, but we can discuss that separately — be sure to call this out in the pull request summary and documentation.

We don't have anything like that for other tests. You can use the un-authenticated proxy at https://pypi-proxy.fly.dev instead. See https://github.com/astral-sh/pypi-proxy

Comment on lines +304 to +306
} else {
Err(ProxyIndexSourceError::MissingName)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it makes sense to allow --proxy-url <url> without a name? Like we do for --index <url>?

If so, I think it'd override the "default" index URL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could make sense, but my concern is that since PyPi is the default out of the box, it will increase the chances of users hitting confusing behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems indicative that we'll need to find a way to fix the confusing behavior or provide a dedicated error message — but we can consider that separately. It's fine to not support in the initial version, as long as we've considered how it would work if we supported it in the future.

@jtfmumm jtfmumm added the enhancement New feature or improvement to existing functionality label Mar 12, 2025
@abhiaagarwal
Copy link

Hey @jtfmumm, are you still working on this? This has become necessary for me, I'd be happy to take it over if you're too busy :D

@zanieb
Copy link
Member

zanieb commented Apr 1, 2025

@abhiaagarwal this is blocked by a need for more feedback on the feature and use-cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or improvement to existing functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Request for uv.lock to support different index urls across different developer machines and CI environments

4 participants