Skip to content

Mirror CMake dependencies along with Bazel dependencies#42172

Closed
murgatroid99 wants to merge 9 commits into
grpc:masterfrom
murgatroid99:cmake_deps_mirror
Closed

Mirror CMake dependencies along with Bazel dependencies#42172
murgatroid99 wants to merge 9 commits into
grpc:masterfrom
murgatroid99:cmake_deps_mirror

Conversation

@murgatroid99

Copy link
Copy Markdown
Member

The only github.com links in build_autogenerated.yaml are the ones that correspond to packages that CMakeLists.txt downloads, so we mirror those links in GCS the same way we do bazel dependencies. The goal is to use this mirror as an alternative download target in CMake to minimize flakes from having our downloads throttled by GitHub.

I also switched from gsutil to gcloud storage because gsutil wasn't working for me and the command line output recommended switching.

@sergiitk

Copy link
Copy Markdown
Member

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request migrates the mirror update script from gsutil to gcloud storage and introduces mirroring for GitHub archives found in build_autogenerated.yaml. The review identifies a critical issue where the existence check using gcloud storage objects list fails to return a non-zero exit code for missing files, suggesting the use of the describe command instead. Additionally, the regex patterns used in the script should be updated to use POSIX-compliant character classes to ensure portability across different operating systems.

Comment thread bazel/update_mirror.sh
Comment thread bazel/update_mirror.sh Outdated
done

# Collect the github archives to mirror from build_autogenerated.yaml
grep -o '\s*- https://github.com/.*' build_autogenerated.yaml | sed 's/^\s*- https:\/\///' | while read -r line ; do

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The use of \s in grep and sed is not POSIX-compliant and may not be supported by all versions (e.g., it will fail on BSD/macOS). Additionally, using .* at the end of the regex is risky as it can capture trailing whitespace or comments from the YAML file, leading to invalid URLs being passed to curl.

It is safer to use [[:space:]] for portability and [^[:space:]]* to ensure only the URL is captured.

Suggested change
grep -o '\s*- https://github.com/.*' build_autogenerated.yaml | sed 's/^\s*- https:\/\///' | while read -r line ; do
grep -o '[[:space:]]*- https://github.com/[^[:space:]]*' build_autogenerated.yaml | sed 's|^[[:space:]]*- https://||' | while read -r line ; do

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't care about full POSIX compliance. GNU works, and BSD works just fine too:

$ /usr/bin/grep -Eo '\s*- https://github.com/(google|envoyproxy|bufbuild|cncf)/.+\.tar\.gz' build_autogenerated.yaml
  - https://github.com/google/cel-spec/archive/refs/tags/v0.25.1.tar.gz
  - https://github.com/envoyproxy/data-plane-api/archive/6ef568cf4a67362849911d1d2a546fd9f35db2ff.tar.gz
  - https://github.com/bufbuild/protoc-gen-validate/archive/refs/tags/v1.2.1.tar.gz
  - https://github.com/cncf/xds/archive/ee656c7534f5d7dc23d44dd611689568f72017a6.tar.gz
$ /usr/bin/grep --version
grep (BSD grep, GNU compatible) 2.6.0-FreeBSD

@sergiitk sergiitk Apr 17, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, sed doesn't work with \s. Here's what works:

$ /usr/bin/grep -Eo '\s*- https://github.com/.+(.tar.gz|.zip)' build_autogenerated.yaml 
  | /usr/bin/sed 's@^[[:space:]]*- https://@@'

Comment thread bazel/update_mirror.sh Outdated
done

# Collect the github archives to mirror from build_autogenerated.yaml
grep -o '\s*- https://github.com/.*' build_autogenerated.yaml | sed 's/^\s*- https:\/\///' | while read -r line ; do

@sergiitk sergiitk Apr 17, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
grep -o '\s*- https://github.com/.*' build_autogenerated.yaml | sed 's/^\s*- https:\/\///' | while read -r line ; do
grep -Eo '\s*- https://github.com/.+\.(gz|zip)' build_autogenerated.yaml | sed 's/^\s*- https:\/\///' | while read -r line ; do

Since the script is not checking the exact property external_proto_libraries:, let's make it a bit safer so we don't download web pages accidentally from any other place where we may have github links in the future.

We also could restrict it even more with an allowlist like so (google|envoyproxy|bufbuild|cncf):

grep -Eo '\s*- https://github.com/(google|envoyproxy|bufbuild|cncf)/.+\.tar\.gz' build_autogenerated.yaml

and add a logic to print out those we skipped - but this is probably fine to start without an allow-list

@sergiitk

Copy link
Copy Markdown
Member

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the dependency mirroring process by switching to gcloud storage and adding support for CMake dependencies. It also introduces a fallback download mechanism for CMake archives to use a gRPC-hosted mirror. However, the implementation of the fallback logic via an external shell script introduces significant portability concerns for macOS and Windows environments due to dependencies on curl, sha256sum, and bash-specific syntax. It is recommended to refactor this logic to use native CMake file(DOWNLOAD) commands to ensure cross-platform compatibility.

Comment thread cmake/download_archive.cmake Outdated
Comment thread cmake/download_with_fallback.sh Outdated
Comment thread bazel/update_mirror.sh
Comment thread bazel/update_mirror.sh

# Collect the github archives to mirror from build_autogenerated.yaml
echo "Updating CMake dependency mirrors"
grep -Eo '\s*- https://github.com/.+\.(tar\.gz|zip)' build_autogenerated.yaml | sed 's@^[[:space:]]*- https://@@' | while read -r line ; do

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirming this works with Mac's sed and grep

$ /usr/bin/grep -Eo '\s*- https://github.com/.+\.(tar\.gz|zip)' build_autogenerated.yaml | /usr/bin/sed 's@^[[:space:]]*- https://@@'
github.com/google/cel-spec/archive/refs/tags/v0.25.1.tar.gz
github.com/envoyproxy/data-plane-api/archive/6ef568cf4a67362849911d1d2a546fd9f35db2ff.tar.gz
github.com/googleapis/googleapis/archive/2193a2bfcecb92b92aad7a4d81baa428cafd7dfd.zip
github.com/bufbuild/protoc-gen-validate/archive/refs/tags/v1.2.1.tar.gz
github.com/cncf/xds/archive/ee656c7534f5d7dc23d44dd611689568f72017a6.tar.gz

@sergiitk

Copy link
Copy Markdown
Member

Internal ref b/505233243

asheshvidyut pushed a commit to asheshvidyut/grpc that referenced this pull request Apr 23, 2026
The only github.com links in `build_autogenerated.yaml` are the ones that correspond to packages that `CMakeLists.txt` downloads, so we mirror those links in GCS the same way we do bazel dependencies. The goal is to use this mirror as an alternative download target in CMake to minimize flakes from having our downloads throttled by GitHub.

I also switched from `gsutil` to `gcloud storage` because `gsutil` wasn't working for me and the command line output recommended switching.

Closes grpc#42172

COPYBARA_INTEGRATE_REVIEW=grpc#42172 from murgatroid99:cmake_deps_mirror 63468be
PiperOrigin-RevId: 903662332
copybara-service Bot pushed a commit that referenced this pull request May 29, 2026
…2513)

The `str.removeprefix` method was introduced in Python 3.9. Some CI environments still use older versions of Python, causing template rendering to fail. Replacing it with a slice-based approach for compatibility.

We should revert this once we upgrade `grpc-ubuntu20-large` Kokoro base VM image.

The issue was introduced in #42172 - but it's a CI tech debt issue. Normally we should be assuming that we run a modern python.

Closes #42513

COPYBARA_INTEGRATE_REVIEW=#42513 from sergiitk:fix/ci/cmake-old-python-removeprefix eefdf17
PiperOrigin-RevId: 923156519
asheshvidyut pushed a commit to asheshvidyut/grpc that referenced this pull request May 29, 2026
…pc#42513)

The `str.removeprefix` method was introduced in Python 3.9. Some CI environments still use older versions of Python, causing template rendering to fail. Replacing it with a slice-based approach for compatibility.

We should revert this once we upgrade `grpc-ubuntu20-large` Kokoro base VM image.

The issue was introduced in grpc#42172 - but it's a CI tech debt issue. Normally we should be assuming that we run a modern python.

Closes grpc#42513

COPYBARA_INTEGRATE_REVIEW=grpc#42513 from sergiitk:fix/ci/cmake-old-python-removeprefix eefdf17
PiperOrigin-RevId: 923156519
asheshvidyut pushed a commit to a-detiste/grpc that referenced this pull request Jun 10, 2026
The only github.com links in `build_autogenerated.yaml` are the ones that correspond to packages that `CMakeLists.txt` downloads, so we mirror those links in GCS the same way we do bazel dependencies. The goal is to use this mirror as an alternative download target in CMake to minimize flakes from having our downloads throttled by GitHub.

I also switched from `gsutil` to `gcloud storage` because `gsutil` wasn't working for me and the command line output recommended switching.

Closes grpc#42172

COPYBARA_INTEGRATE_REVIEW=grpc#42172 from murgatroid99:cmake_deps_mirror 63468be
PiperOrigin-RevId: 903662332
asheshvidyut pushed a commit to a-detiste/grpc that referenced this pull request Jun 10, 2026
…pc#42513)

The `str.removeprefix` method was introduced in Python 3.9. Some CI environments still use older versions of Python, causing template rendering to fail. Replacing it with a slice-based approach for compatibility.

We should revert this once we upgrade `grpc-ubuntu20-large` Kokoro base VM image.

The issue was introduced in grpc#42172 - but it's a CI tech debt issue. Normally we should be assuming that we run a modern python.

Closes grpc#42513

COPYBARA_INTEGRATE_REVIEW=grpc#42513 from sergiitk:fix/ci/cmake-old-python-removeprefix eefdf17
PiperOrigin-RevId: 923156519
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants