`url_exists` related improvements by haampie · Pull Request #34095 · spack/spack

haampie · 2022-11-23T15:51:32Z

For reasons beyond me Python thinks it's a great idea to upgrade HEAD
requests to GET requests when following redirects. So, this PR adds a
better HTTPRedirectHandler, and also moves some ad-hoc logic around
for dealing with disabling SSL certs verification.

Also, I'm stumped by the fact that Spack's url_exists does not use
HEAD requests at all, so in certain cases Spack awkwardly downloads
something first to see if it can download it, and then downloads it
again because it knows it can download it. So, this PR ensures that both
urllib and botocore use HEAD requests.

Finally, it also removes some things that were there to support currently
unsupported Python versions.

Notice that the HTTP spec section 10.3.2 just talks about how to deal
with POST request on redirect (whether to follow or not):

If the 301 status code is received in response to a request other
than GET or HEAD, the user agent MUST NOT automatically redirect the
request unless it can be confirmed by the user, since this might
change the conditions under which the request was issued.

Note: When automatically redirecting a POST request after
receiving a 301 status code, some existing HTTP/1.0 user agents
will erroneously change it into a GET request.

Python has a comment about this, they choose to go with the "erroneous change".
But they then mess up the HEAD request while following the redirect, probably
because they were too busy discussing how to deal with POST.

See python/cpython#99731

For reasons beyond me Python thinks it's a great idea to upgrade HEAD requests to GET requests when following redirects. So, this PR adds a better `HTTPRedirectHandler`, and also moves some ad-hoc logic around for dealing with disable ssl cert verification. Also, I'm stumped by the fact that Spack's `url_exists` does not use HEAD requests at all, so in certain cases Spack awkwardly downloads something first to see if it can download it, and then downloads it because it know it can download it. So, this PR ensures that both urllib and botocore use HEAD requests.

haampie · 2022-11-23T17:46:13Z

@spackbot fix style

spackbot-app · 2022-11-23T17:46:16Z

Let me see if I can fix that for you!

spackbot-app · 2022-11-23T17:48:14Z

I was able to run spack style --fix for you!

spack style --fix

==> Running style checks on spack
  selected: isort, mypy, black, flake8
==> Modified files
  lib/spack/spack/test/web.py
  lib/spack/spack/util/web.py
==> Running isort checks
Fixing /tmp/tmp3yze3ko_/spack/lib/spack/spack/util/web.py
  isort checks were clean
==> Running mypy checks
lib/spack/llnl/util/lang.py:1064: note: By default the bodies of untyped functions are not checked, consider using --check-untyped-defs  [annotation-unchecked]
lib/spack/llnl/util/lock.py:84: note: By default the bodies of untyped functions are not checked, consider using --check-untyped-defs  [annotation-unchecked]
lib/spack/spack/util/timer.py:68: note: By default the bodies of untyped functions are not checked, consider using --check-untyped-defs  [annotation-unchecked]
lib/spack/spack/bootstrap.py:87: note: By default the bodies of untyped functions are not checked, consider using --check-untyped-defs  [annotation-unchecked]
Success: no issues found in 565 source files
  mypy checks were clean
==> Running black checks
All done! ✨ 🍰 ✨
2 files left unchanged.
  black checks were clean
==> Running flake8 checks
  flake8 checks were clean
==> spack style checks were clean

Keep in mind that I cannot fix your flake8 or mypy errors, so if you have any you'll need to fix them and update the pull request. If I was able to push to your branch, if you make further changes you will need to pull from your updated branch before pushing again.

I've updated the branch with style fixes.

This reverts commit d06fd26.

This reverts commit d06fd26. The problem is that Bitbucket's API forwards download requests to an S3 bucket using a temporary URL. This URL includes a signature for the request, which embeds the HTTP verb. That means only GET requests are allowed, and HEAD requests would fail verification, leading to 403 erros. The same is observed when using `curl -LI ...`

For reasons beyond me Python thinks it's a great idea to upgrade HEAD requests to GET requests when following redirects. So, this PR adds a better `HTTPRedirectHandler`, and also moves some ad-hoc logic around for dealing with disabling SSL certs verification. Also, I'm stumped by the fact that Spack's `url_exists` does not use HEAD requests at all, so in certain cases Spack awkwardly downloads something first to see if it can download it, and then downloads it again because it knows it can download it. So, this PR ensures that both urllib and botocore use HEAD requests. Finally, it also removes some things that were there to support currently unsupported Python versions. Notice that the HTTP spec [section 10.3.2](https://datatracker.ietf.org/doc/html/rfc2616.html#section-10.3.2) just talks about how to deal with POST request on redirect (whether to follow or not): > If the 301 status code is received in response to a request other > than GET or HEAD, the user agent MUST NOT automatically redirect the > request unless it can be confirmed by the user, since this might > change the conditions under which the request was issued. > Note: When automatically redirecting a POST request after > receiving a 301 status code, some existing HTTP/1.0 user agents > will erroneously change it into a GET request. Python has a comment about this, they choose to go with the "erroneous change". But they then mess up the HEAD request while following the redirect, probably because they were too busy discussing how to deal with POST. See python/cpython#99731

This reverts commit d06fd26. The problem is that Bitbucket's API forwards download requests to an S3 bucket using a temporary URL. This URL includes a signature for the request, which embeds the HTTP verb. That means only GET requests are allowed, and HEAD requests would fail verification, leading to 403 erros. The same is observed when using `curl -LI ...`

spackbot-app bot added core PR affects Spack core functionality fetching tests General test capability(ies) utilities labels Nov 23, 2022

haampie changed the title ~~urlopen related improvements~~ url_exists related improvements Nov 23, 2022

haampie force-pushed the fix/url-things branch 5 times, most recently from 76aaa85 to a1a2f98 Compare November 23, 2022 16:33

haampie force-pushed the fix/url-things branch from a1a2f98 to 5e637d0 Compare November 23, 2022 16:35

haampie added 2 commits November 23, 2022 17:47

Add a test

9a22f14

style

55c69eb

[@spackbot] updating style on behalf of haampie

799884f

tgamblin approved these changes Nov 23, 2022

View reviewed changes

tgamblin enabled auto-merge (squash) November 23, 2022 19:00

tgamblin merged commit d06fd26 into spack:develop Nov 23, 2022

haampie deleted the fix/url-things branch November 23, 2022 21:37

haampie added a commit that referenced this pull request Nov 26, 2022

Revert "url_exists related improvements (#34095)"

514162a

This reverts commit d06fd26.

greenc-FNAL mentioned this pull request Dec 1, 2022

Sync with upstream FNALssi/spack#72

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`url_exists` related improvements#34095

`url_exists` related improvements#34095
tgamblin merged 4 commits intospack:developfrom
haampie:fix/url-things

haampie commented Nov 23, 2022 •

edited

Loading

Uh oh!

haampie commented Nov 23, 2022

Uh oh!

spackbot-app bot commented Nov 23, 2022

Uh oh!

spackbot-app bot commented Nov 23, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

haampie commented Nov 23, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

haampie commented Nov 23, 2022

Uh oh!

spackbot-app bot commented Nov 23, 2022

Uh oh!

spackbot-app bot commented Nov 23, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

haampie commented Nov 23, 2022 •

edited

Loading