chore: Add ca-certificates to the worker's runtime image to support reading signed S3 URLs in the CLP connector (fixes #56).#55
Conversation
WalkthroughInstalls ca-certificates alongside tzdata in the runtime stage of Changes
Sequence Diagram(s)(omitted — changes are a simple Dockerfile package installation and do not affect runtime control flow) Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related issues
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
ca-certificates dependency of the worker's runtime image.ca-certificates dependency of the worker's runtime image to support CLP S3.
ca-certificates dependency of the worker's runtime image to support CLP S3.ca-certificates dependency of the worker's runtime image to support CLP S3.
There was a problem hiding this comment.
Actionable comments posted: 3
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
presto-native-execution/scripts/dockerfiles/prestissimo-runtime.dockerfile(1 hunks)
🧰 Additional context used
🪛 Hadolint (2.12.0)
presto-native-execution/scripts/dockerfiles/prestissimo-runtime.dockerfile
[info] 50-50: Avoid additional packages by specifying --no-install-recommends
(DL3015)
[warning] 50-50: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>
(DL3008)
| # CLP S3 support currently needs ca-certificates package to access the presigned URL | ||
| RUN apt-get update && \ | ||
| DEBIAN_FRONTEND=noninteractive apt-get install -y ca-certificates && \ | ||
| apt-get clean && \ | ||
| rm -rf /var/lib/apt/lists/* |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
If keeping a Debian/Ubuntu base, align with hadolint: use --no-install-recommends and consider pinning
Should you intentionally use a Debian/Ubuntu runtime image, incorporate minimal install flags and (optionally) version pinning to satisfy DL3015/DL3008 and reduce image size.
Minimal change:
-RUN apt-get update && \
- DEBIAN_FRONTEND=noninteractive apt-get install -y ca-certificates && \
+RUN apt-get update && \
+ DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y ca-certificates && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*Optional: pin the package version if you need full reproducibility (at the cost of more maintenance):
# Example only; verify the exact version available in your base image repos.
DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y ca-certificates=20240203🧰 Tools
🪛 Hadolint (2.12.0)
[info] 50-50: Avoid additional packages by specifying --no-install-recommends
(DL3015)
[warning] 50-50: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>
(DL3008)
🤖 Prompt for AI Agents
In presto-native-execution/scripts/dockerfiles/prestissimo-runtime.dockerfile
around lines 49-53, the apt-get install omits --no-install-recommends (and
optionally a version pin) which triggers hadolint DL3015/DL3008 and increases
image size; change the install line to use DEBIAN_FRONTEND=noninteractive
apt-get install --no-install-recommends -y ca-certificates and keep the existing
apt-get clean and rm -rf /var/lib/apt/lists/*, and if you require strict
reproducibility optionally pin the ca-certificates package to a specific known
version available in the base image repos.
🧹 Nitpick (assertive)
Reduce layers and duplicate metadata downloads by combining the two package steps
You have two separate package-install RUNs (tzdata and ca-certificates). Combining them into a single RUN reduces layers and avoids repeating metadata downloads.
For CentOS base (recommended for your default BASE_IMAGE):
RUN dnf -y install tzdata ca-certificates && \
dnf clean all && \
rm -rf /var/cache/dnfIf you intentionally use Debian/Ubuntu:
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y tzdata ca-certificates && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*🧰 Tools
🪛 Hadolint (2.12.0)
[info] 50-50: Avoid additional packages by specifying --no-install-recommends
(DL3015)
[warning] 50-50: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>
(DL3008)
🤖 Prompt for AI Agents
In presto-native-execution/scripts/dockerfiles/prestissimo-runtime.dockerfile
around lines 49 to 53, there are separate RUN steps installing tzdata and
ca-certificates which duplicate package metadata downloads and create extra
layers; combine those installs into a single RUN so package lists are fetched
once and both packages are installed together, then clean caches in the same
layer (use apt-get update then DEBIAN_FRONTEND=noninteractive apt-get install
--no-install-recommends -y tzdata ca-certificates && apt-get clean && rm -rf
/var/lib/apt/lists/*), or if the base image is CentOS use a single dnf install
line and clean dnf cache in the same RUN.
💡 Verification agent
🧩 Analysis chain
Switch to dnf/microdnf: current apt-get step will fail with CentOS Stream 9 base
Your BASE_IMAGE defaults to CentOS Stream 9 (Line 14), which does not provide apt-get. This RUN will fail unless BASE_IMAGE is overridden to a Debian/Ubuntu image at build time.
If you intend to keep CentOS Stream 9 as the default, replace apt with dnf/microdnf. Suggested change within this hunk:
-RUN apt-get update && \
- DEBIAN_FRONTEND=noninteractive apt-get install -y ca-certificates && \
- apt-get clean && \
- rm -rf /var/lib/apt/lists/*
+RUN dnf -y install ca-certificates && \
+ dnf clean all && \
+ rm -rf /var/cache/dnfAnd for consistency, you should also update the earlier tzdata step (outside this hunk) to use dnf on CentOS:
# Replace Lines 44-47 with:
RUN dnf -y install tzdata && \
dnf clean all && \
rm -rf /var/cache/dnfIf, instead, CI/build scripts override BASE_IMAGE to a Debian/Ubuntu image, please confirm. You can run this script to search for such overrides and docker build args:
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Searching for BASE_IMAGE overrides and docker build invocations…"
rg -n -C3 -i 'BASE_IMAGE' || true
rg -n -C3 -i 'prestissimo-runtime\.dockerfile' || true
rg -n -C3 -i 'docker\s+build(x)?\b.*(prestissimo-runtime\.dockerfile|--build-arg\s+BASE_IMAGE)' || trueLength of output: 14741
Switch package manager to DNF for CentOS Stream 9 base
The default BASE_IMAGE=quay.io/centos/centos:stream9 in prestissimo-runtime.dockerfile doesn’t include apt-get, so the current install steps will fail in CI (Jenkins builds don’t override the base). Please update the two install blocks as follows:
• In scripts/dockerfiles/prestissimo-runtime.dockerfile (around line 49):
-RUN apt-get update && \
- DEBIAN_FRONTEND=noninteractive apt-get install -y ca-certificates && \
- apt-get clean && \
- rm -rf /var/lib/apt/lists/*
+RUN dnf -y install ca-certificates && \
+ dnf clean all && \
+ rm -rf /var/cache/dnf• And replace the earlier tzdata step (around line 44):
-RUN apt-get update && \
- DEBIAN_FRONTEND=noninteractive apt-get install -y tzdata && \
- apt-get clean && \
- rm -rf /var/lib/apt/lists/*
+RUN dnf -y install tzdata && \
+ dnf clean all && \
+ rm -rf /var/cache/dnfNote: Maven-plugin and Docker Compose builds explicitly override BASE_IMAGE=ubuntu:22.04, so apt-get remains valid there; this change ensures the CentOS Stream 9 default works too.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # CLP S3 support currently needs ca-certificates package to access the presigned URL | |
| RUN apt-get update && \ | |
| DEBIAN_FRONTEND=noninteractive apt-get install -y ca-certificates && \ | |
| apt-get clean && \ | |
| rm -rf /var/lib/apt/lists/* | |
| # Replace the tzdata install step with DNF | |
| RUN dnf -y install tzdata && \ | |
| dnf clean all && \ | |
| rm -rf /var/cache/dnf | |
| # CLP S3 support currently needs ca-certificates package to access the presigned URL | |
| RUN dnf -y install ca-certificates && \ | |
| dnf clean all && \ | |
| rm -rf /var/cache/dnf |
🧰 Tools
🪛 Hadolint (2.12.0)
[info] 50-50: Avoid additional packages by specifying --no-install-recommends
(DL3015)
[warning] 50-50: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>
(DL3008)
kirkrodrigues
left a comment
There was a problem hiding this comment.
Can you also open an issue describing the problem? That way if we have similar issues in the future or we need to justify this solution to the upstream devs, we can look back at the issue.
| apt-get clean && \ | ||
| rm -rf /var/lib/apt/lists/* | ||
|
|
||
| # CLP S3 support currently needs ca-certificates package to access the presigned URL |
There was a problem hiding this comment.
How about replacing this and the previous block with:
# NOTE:
# - We need `ca-certificates` to support reads from signed S3 URLs.
# - We need `tzdata` as a workaround temporary for https://github.com/prestodb/presto/issues/25531
RUN apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y \
ca-certificates \
tzdata \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
Do you mean open an issue in OSS repo or our fork? |
I mean open an issue in our fork. Eventually, when we try to PR the CLP connector to the OSS repo, we can refer back to the issue in our fork. |
kirkrodrigues
left a comment
There was a problem hiding this comment.
For the PR title, how about:
chore: Add `ca-certificates` to the worker's runtime image to support reading signed S3 URLs in the CLP connector.
ca-certificates dependency of the worker's runtime image to support CLP S3.ca-certificates to the worker's runtime image to support reading signed S3 URLs in the CLP connector.
…e.dockerfile Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
presto-native-execution/scripts/dockerfiles/prestissimo-runtime.dockerfile (1)
45-50: Use dnf (or detect the package manager) — apt-get will fail with the CentOS Stream 9 default baseThe default BASE_IMAGE is quay.io/centos/centos:stream9, which doesn’t provide apt-get. This RUN will fail unless the base is overridden to Debian/Ubuntu at build time. Make the step portable by detecting the package manager (apt vs dnf), or switch to dnf for the CentOS default.
Apply this diff to make the install step work across both Debian/Ubuntu and CentOS Stream 9:
-RUN apt-get update \ - && DEBIAN_FRONTEND=noninteractive apt-get install -y \ - ca-certificates \ - tzdata \ - && apt-get clean \ - && rm -rf /var/lib/apt/lists/* +RUN set -eux; \ + if command -v apt-get >/dev/null 2>&1; then \ + apt-get update; \ + DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \ + ca-certificates \ + tzdata; \ + apt-get clean; \ + rm -rf /var/lib/apt/lists/*; \ + elif command -v dnf >/dev/null 2>&1; then \ + dnf -y install --setopt=install_weak_deps=False \ + ca-certificates \ + tzdata; \ + dnf clean all; \ + rm -rf /var/cache/dnf; \ + else \ + echo "Unsupported base image (no apt-get or dnf)"; exit 1; \ + fiIf you prefer to keep separate images per OS family, alternatively replace this block with a dnf-only install for the CentOS default and rely on a different Dockerfile when BASE_IMAGE is Ubuntu.
Run this script to verify whether CI/build tooling overrides BASE_IMAGE to a Debian/Ubuntu image (in which case apt-get would work today, but the CentOS default would still be broken):
#!/bin/bash set -euo pipefail echo "Searching for BASE_IMAGE overrides and docker build invocations…" rg -n -C3 -i '\bBASE_IMAGE\b' rg -n -C3 -i 'prestissimo-runtime\.dockerfile' rg -n -C3 -i 'docker\s+build(x)?\b.*(prestissimo-runtime\.dockerfile|--build-arg\s+BASE_IMAGE)'
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
presto-native-execution/scripts/dockerfiles/prestissimo-runtime.dockerfile(1 hunks)
🧰 Additional context used
🪛 Hadolint (2.12.0)
presto-native-execution/scripts/dockerfiles/prestissimo-runtime.dockerfile
[info] 45-45: Avoid additional packages by specifying --no-install-recommends
(DL3015)
[warning] 45-45: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>
(DL3008)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-non-hash-gen)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-local-queries)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-aggregation-queries)
- GitHub Check: test (17.0.13, :presto-main)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-local-queries)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (8.0.442, :presto-tests -P presto-tests-general)
- GitHub Check: test (17.0.13, :presto-main-base)
- GitHub Check: test (17.0.13, :presto-tests -P presto-tests-general)
- GitHub Check: test (8.0.442, :presto-main)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-tpch-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P presto-tests-execution-memory)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-resource-manager)
- GitHub Check: test (17.0.13, :presto-tests -P ci-only-distributed-queries)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-plan-determinism)
- GitHub Check: test (8.0.442, :presto-tests -P ci-only-aggregation-queries)
- GitHub Check: prestocpp-linux-build-for-test
- GitHub Check: maven-checks (8.0.442)
- GitHub Check: maven-checks (17.0.13)
- GitHub Check: prestissimo-worker-images-build
🔇 Additional comments (1)
presto-native-execution/scripts/dockerfiles/prestissimo-runtime.dockerfile (1)
42-44: Good rationale in the NOTE blockClear explanation of why ca-certificates and tzdata are present; thanks for capturing the Presto issue context.
| RUN apt-get update \ | ||
| && DEBIAN_FRONTEND=noninteractive apt-get install -y \ | ||
| ca-certificates \ | ||
| tzdata \ | ||
| && apt-get clean \ | ||
| && rm -rf /var/lib/apt/lists/* |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Hadolint: add --no-install-recommends (apt) and consider version pinning (optional)
- For Debian/Ubuntu, prefer --no-install-recommends to reduce image size (DL3015). Included in the portable snippet above.
- Optional: pin package versions for full reproducibility (DL3008). Only do this if you’re prepared to maintain version bumps; otherwise, leave unpinned.
🧰 Tools
🪛 Hadolint (2.12.0)
[info] 45-45: Avoid additional packages by specifying --no-install-recommends
(DL3015)
[warning] 45-45: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>
(DL3008)
🤖 Prompt for AI Agents
In presto-native-execution/scripts/dockerfiles/prestissimo-runtime.dockerfile
around lines 45 to 50, the apt-get install invocation should add
--no-install-recommends to avoid pulling recommended packages and reduce image
size; update the command to call apt-get install -y --no-install-recommends
ca-certificates tzdata while keeping DEBIAN_FRONTEND=noninteractive, apt-get
clean and removal of /var/lib/apt/lists/*, and optionally pin package versions
(e.g. ca-certificates=<version> tzdata=<version>) only if you are prepared to
maintain them for reproducibility.
ca-certificates to the worker's runtime image to support reading signed S3 URLs in the CLP connector.ca-certificates to the worker's runtime image to support reading signed S3 URLs in the CLP connector (fixes #56).
kirkrodrigues
left a comment
There was a problem hiding this comment.
Added the issue it fixes to the PR's title.
b00627f
into
y-scope:release-0.293-clp-connector
Description
CLP S3 support needs this to access the presign URL.
Change the source image in CLP's
tools/deployment/presto-clp/docker-compose.ymlfromy-scopetoanloweeand it works.Checklist
breaking change.
Validation performed
Download the image and E2E test w/ and w/o S3 CLP package.
Summary by CodeRabbit
Bug Fixes
Chores