
buildcache: Tell servers not to cache index or hash#40339

Merged
scottwittenburg merged 1 commit into spack:develop from scottwittenburg:buildcache-do-not-cache-index
Oct 12, 2023


@scottwittenburg
Contributor

If the remote URL is S3, this do-not-cache information is attached to the uploaded object so that a CloudFront distribution (with an appropriate cache policy attached) will never cache the mirror index or the index hash. The goal is to avoid generating pipelines from a stale index.
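As a rough illustration of the idea (not the code in this PR), the decision could be sketched as below. The helper name `upload_extra_args`, the file names `index.json` and `index.json.hash`, and the `build_cache/` path are assumptions made for the example; only the `CacheControl: no-cache` argument comes from the test script in this conversation.

```python
import posixpath
from typing import Dict
from urllib.parse import urlparse

# Assumed names of the mirror index and its hash file (hypothetical here).
NO_CACHE_NAMES = ("index.json", "index.json.hash")


def upload_extra_args(push_url: str) -> Dict[str, str]:
    """Return extra upload arguments for a push URL (hypothetical helper).

    Only S3 destinations get extra arguments, and only the index and
    index-hash objects are marked as not cacheable.
    """
    parsed = urlparse(push_url)
    if parsed.scheme != "s3":
        return {}
    if posixpath.basename(parsed.path) not in NO_CACHE_NAMES:
        return {}
    return {"CacheControl": "no-cache"}


index_args = upload_extra_args("s3://spack-binaries-prs/build_cache/index.json")
other_args = upload_extra_args("s3://spack-binaries-prs/build_cache/some.spack")
```

Everything else pushed to the mirror keeps whatever caching the distribution applies by default, so only the two small, frequently rewritten files bypass the cache.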

@spackbot-app bot added the binary-packages and core (PR affects Spack core functionality) labels on Oct 5, 2023
@scottwittenburg
Contributor Author

Since it's hard to know whether this would work, I tested with the following script once I had set up cloudfront for the s3://spack-binaries-prs bucket, and configured the minimum TTL = 0:

Test script:
import codecs
import json
import os

import spack.util.web as web_util
import spack.util.url as url_util


fetch_base_url = "https://binaries-prs.spack.io"
push_base_url = "s3://spack-binaries-prs"
object_prefix = "test/staletesting/myobject.json"
object_push_url = url_util.join(push_base_url, object_prefix)
object_fetch_url = url_util.join(fetch_base_url, object_prefix)

object_version_1 = {"version": 5}
local_object_path_1 = os.path.join(os.getcwd(), "myobject_local_v1.json")
with open(local_object_path_1, "w") as fd:
    fd.write(json.dumps(object_version_1))

object_version_2 = {"version": 6}
local_object_path_2 = os.path.join(os.getcwd(), "myobject_local_v2.json")
with open(local_object_path_2, "w") as fd:
    fd.write(json.dumps(object_version_2))

# Push version 1 to the single object prefix
web_util.push_to_url(
    local_object_path_1,
    object_push_url,
    keep_original=True,
    extra_args={
        "ContentType": "application/json",
        "CacheControl": "no-cache",
    },
)

# Read version 1 from the single prefix using the cloudfront url
_, _, remote_file_obj = web_util.read_from_url(object_fetch_url)
remote_contents_v1 = codecs.getreader("utf-8")(remote_file_obj).read()
local_fetch_path_v1 = os.path.join(os.getcwd(), "myobject_fetched_v1.json")
with open(local_fetch_path_v1, "w") as fd:
    fd.write(remote_contents_v1)

# Push version 2 to the single object prefix
web_util.push_to_url(
    local_object_path_2,
    object_push_url,
    keep_original=True,
    extra_args={
        "ContentType": "application/json",
        "CacheControl": "no-cache",
    },
)

# Read version 2 (let's see about that though) from the single object
# using the cloudfront url
_, _, remote_file_obj = web_util.read_from_url(object_fetch_url)
remote_contents_v2 = codecs.getreader("utf-8")(remote_file_obj).read()
local_fetch_path_v2 = os.path.join(os.getcwd(), "myobject_fetched_v2.json")
with open(local_fetch_path_v2, "w") as fd:
    fd.write(remote_contents_v2)

# Now we can:
#     diff myobject_local_v1.json myobject_fetched_v1.json
#     diff myobject_local_v2.json myobject_fetched_v2.json

And that's how I learned that setting the CacheControl: no-cache header does tell CloudFront not to cache the object. However, if CloudFront has already cached the object without that header, and you then overwrite the object in S3 with the header set, you need to wait until the original TTL expires (or manually invalidate the cache) before the script reads back the updated value after the second write/read.
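To make that behavior concrete, here is a toy model of an edge cache. It is a deliberately simplified sketch, not CloudFront's actual algorithm: the `ToyEdgeCache` and `OriginObject` names, the 86400-second default TTL, and the explicit `now` timestamps are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class OriginObject:
    body: str
    cache_control: Optional[str] = None  # e.g. "no-cache"


class ToyEdgeCache:
    """Simplified stand-in for a CDN edge in front of an origin bucket."""

    def __init__(self, origin: Dict[str, OriginObject], default_ttl: int):
        self.origin = origin
        self.default_ttl = default_ttl
        self._cached: Dict[str, Tuple[str, int]] = {}  # key -> (body, expires_at)

    def get(self, key: str, now: int) -> str:
        entry = self._cached.get(key)
        if entry is not None and now < entry[1]:
            # Still fresh: the origin (and any new header on it) is never consulted.
            return entry[0]
        obj = self.origin[key]
        if obj.cache_control == "no-cache":
            # Do not keep a copy; every request goes back to the origin.
            self._cached.pop(key, None)
            return obj.body
        self._cached[key] = (obj.body, now + self.default_ttl)
        return obj.body

    def invalidate(self, key: str) -> None:
        """Model a manual cache invalidation."""
        self._cached.pop(key, None)


# Version 5 is first uploaded WITHOUT the header, so the edge caches it.
origin = {"myobject.json": OriginObject('{"version": 5}')}
edge = ToyEdgeCache(origin, default_ttl=86400)
first = edge.get("myobject.json", now=0)

# Version 6 is uploaded WITH the header, but the edge still holds version 5.
origin["myobject.json"] = OriginObject('{"version": 6}', cache_control="no-cache")
stale = edge.get("myobject.json", now=60)
fresh = edge.get("myobject.json", now=86400)  # original TTL has expired
```

In this model, `stale` is still version 5 because the edge never asks the origin while its copy is fresh; calling `edge.invalidate("myobject.json")` before the second read would play the role of the manual invalidation.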

Contributor

@kwryankrattiger kwryankrattiger left a comment


This looks good to me and matches the docs as well. The testing looks sufficiently thorough; I can't think of anything else that needs to be tried.

@scottwittenburg scottwittenburg merged commit d9cb1a1 into spack:develop Oct 12, 2023
@scottwittenburg scottwittenburg deleted the buildcache-do-not-cache-index branch October 12, 2023 00:14
mtaillefumier pushed a commit to mtaillefumier/spack that referenced this pull request Oct 23, 2023
RikkiButler20 pushed a commit to RikkiButler20/spack that referenced this pull request Nov 2, 2023
mtaillefumier pushed a commit to mtaillefumier/spack that referenced this pull request Dec 14, 2023
