buildcache: Tell servers not to cache index or hash by scottwittenburg · Pull Request #40339 · spack/spack

scottwittenburg · 2023-10-05T19:30:37Z

If the remote URL is S3, extra information (a Cache-Control: no-cache setting) is associated with the index and index hash objects, so that a CloudFront distribution with an appropriate cache policy attached will never cache the mirror index or index hash. The goal is to avoid generating pipelines from a stale index.
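
As a rough illustration of the idea (not the code from this PR), the sketch below picks upload arguments per object. The helper name `cache_args_for` and the file names `index.json` and `index.json.hash` are assumptions made for the example; the `CacheControl` and `ContentType` keys mirror the `extra_args` used in the test script later in this thread.

```python
import posixpath

# Assumed names of the mirror index and its hash file (illustrative only).
NEVER_CACHE = {"index.json", "index.json.hash"}


def cache_args_for(object_key):
    """Return extra upload arguments for an object key (hypothetical helper)."""
    name = posixpath.basename(object_key)
    if name not in NEVER_CACHE:
        # Everything else keeps the default caching behavior.
        return {}
    args = {"CacheControl": "no-cache"}
    if name.endswith(".json"):
        args["ContentType"] = "application/json"
    return args


if __name__ == "__main__":
    for key in ("build_cache/index.json", "build_cache/index.json.hash"):
        print(key, cache_args_for(key))
```

The returned dictionary could then be passed as `extra_args` to an upload call such as the `push_to_url` calls in the test script.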

scottwittenburg · 2023-10-10T21:15:01Z

Since it's hard to know whether this would work, I tested with the following script once I had set up cloudfront for the s3://spack-binaries-prs bucket, and configured the minimum TTL = 0:

test script

import codecs
import json
import os

import spack.util.web as web_util
import spack.util.url as url_util


fetch_base_url = "https://binaries-prs.spack.io"
push_base_url = "s3://spack-binaries-prs"
object_prefix = "test/staletesting/myobject.json"
object_push_url = url_util.join(push_base_url, object_prefix)
object_fetch_url = url_util.join(fetch_base_url, object_prefix)

object_version_1 = {"version": 5}
local_object_path_1 = os.path.join(os.getcwd(), "myobject_local_v1.json")
with open(local_object_path_1, "w") as fd:
  fd.write(json.dumps(object_version_1))

object_version_2 = {"version": 6}
local_object_path_2 = os.path.join(os.getcwd(), "myobject_local_v2.json")
with open(local_object_path_2, "w") as fd:
  fd.write(json.dumps(object_version_2))

# Push version 1 to the single object prefix
web_util.push_to_url(
  local_object_path_1,
  object_push_url,
  keep_original=True,
  extra_args={
      "ContentType": "application/json",
      "CacheControl": "no-cache",
  },
)

# Read version 1 from the single prefix using the cloudfront url
_, _, remote_file_obj = web_util.read_from_url(object_fetch_url)
remote_contents_v1 = codecs.getreader("utf-8")(remote_file_obj).read()
local_fetch_path_v1 = os.path.join(os.getcwd(), "myobject_fetched_v1.json")
with open(local_fetch_path_v1, "w") as fd:
  fd.write(remote_contents_v1)

# Push version 2 to the single object prefix
web_util.push_to_url(
  local_object_path_2,
  object_push_url,
  keep_original=True,
  extra_args={
      "ContentType": "application/json",
      "CacheControl": "no-cache",
  },
)

# Read version 2 (let's see about that though) from the single object
# using the cloudfront url
_, _, remote_file_obj = web_util.read_from_url(object_fetch_url)
remote_contents_v2 = codecs.getreader("utf-8")(remote_file_obj).read()
local_fetch_path_v2 = os.path.join(os.getcwd(), "myobject_fetched_v2.json")
with open(local_fetch_path_v2, "w") as fd:
  fd.write(remote_contents_v2)

# Now we can:
#     diff myobject_local_v1.json myobject_fetched_v1.json
#     diff myobject_local_v2.json myobject_fetched_v2.json

And that's how I learned that setting the CacheControl: no-cache header does tell CloudFront not to cache. However, if CloudFront has already cached the object without that header and you then overwrite it in S3 with the header, you will need to wait until the original TTL expires (or invalidate the cache manually) before the script reads back the value written by the second push.
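
For the manual invalidation mentioned above, one option is the CloudFront `create_invalidation` call in boto3. This is a minimal sketch under stated assumptions, not something that was run as part of this PR: the helper names are hypothetical, the distribution ID is a placeholder you would need to supply, and the path is derived from the `object_prefix` used in the test script.

```python
import time


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure for create_invalidation."""
    items = sorted(set(paths))  # de-duplicate; CloudFront paths start with "/"
    if caller_reference is None:
        # CallerReference must be unique per request; a timestamp is one choice.
        caller_reference = "invalidate-{0}".format(int(time.time()))
    return {
        "Paths": {"Quantity": len(items), "Items": items},
        "CallerReference": caller_reference,
    }


def invalidate(distribution_id, paths):
    """Ask CloudFront to drop its cached copies of the given paths."""
    import boto3  # imported lazily so the batch builder works without boto3

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )


if __name__ == "__main__":
    # Only prints the request body; calling invalidate() needs AWS credentials.
    print(build_invalidation_batch(["/test/staletesting/myobject.json"]))
```

Calling `invalidate("<distribution-id>", ["/test/staletesting/myobject.json"])` would submit the request; the invalidation then takes some time to complete on the CloudFront side.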

kwryankrattiger

This looks good to me and matches the docs as well. The testing looks sufficiently thorough; I can't think of anything else that needs to be tried.

buildcache: Tell servers not to cache index or hash

7d795b3

spackbot-app bot added the binary-packages and core (PR affects Spack core functionality) labels Oct 5, 2023

scottwittenburg mentioned this pull request Oct 5, 2023

gitlab ci: Rework how mirrors are configured #39939

Merged

5 tasks

alalazo assigned haampie and kwryankrattiger Oct 9, 2023

kwryankrattiger approved these changes Oct 10, 2023

scottwittenburg merged commit d9cb1a1 into spack:develop Oct 12, 2023

scottwittenburg deleted the buildcache-do-not-cache-index branch October 12, 2023 00:14

mtaillefumier pushed a commit to mtaillefumier/spack that referenced this pull request Oct 23, 2023

buildcache: Tell servers not to cache index or hash (spack#40339)

0595813

RikkiButler20 pushed a commit to RikkiButler20/spack that referenced this pull request Nov 2, 2023

buildcache: Tell servers not to cache index or hash (spack#40339)

eb66587

mtaillefumier pushed a commit to mtaillefumier/spack that referenced this pull request Dec 14, 2023

buildcache: Tell servers not to cache index or hash (spack#40339)

99c490d

scottwittenburg merged 1 commit into spack:develop from scottwittenburg:buildcache-do-not-cache-index

3 participants