
chore(docker): Add --link flags to COPY/ADD operations for improved build performance (fixes #1408). #1411

Merged
junhaoliao merged 6 commits into y-scope:main from junhaoliao:docker-copy on Oct 20, 2025

Conversation

@junhaoliao

@junhaoliao junhaoliao commented Oct 13, 2025

Member

Description

As the title suggests, this PR ensures the --link argument is added to all COPY commands in all Dockerfiles in this repo.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  1. Observed all build CIs passed.

Summary by CodeRabbit

  • Chores
    • Optimized container image builds across multiple base environments to improve build speed, caching and layer deduplication.
    • Streamlined file incorporation during image construction, which may reduce image size and pull/update times.
    • Applies to Ubuntu, CentOS Stream, manylinux, musllinux (x86_64, aarch64), and package images.
    • No changes to runtime behaviour or user-facing functionality.

@coderabbitai

coderabbitai Bot commented Oct 13, 2025

Contributor

Walkthrough

Multiple Dockerfiles were edited to add the BuildKit syntax directive and convert various ADD/COPY instructions to use COPY --link (including stage-flattening COPY --from=base / /, ./tools/scripts/lib_install, and packaging COPYs). No build stage ordering or exported/public interfaces were changed.

Changes

Cohort: clp-core Dockerfile updates
File(s): components/core/tools/docker-images/clp-core-ubuntu-jammy/Dockerfile
Summary: Replaced several ADD with COPY --link for directories/files (clg, clp, clp-s, glt, make-dictionaries-readable) and changed final-stage COPY --from=base / / to COPY --link --from=base / /. Added Dockerfile syntax directive.

Cohort: Stage-flatten COPY uses --link
File(s): components/core/tools/docker-images/clp-env-base-centos-stream-9/Dockerfile, components/core/tools/docker-images/clp-env-base-ubuntu-jammy/Dockerfile
Summary: Replaced COPY --from=base / / with COPY --link --from=base / /.

Cohort: lib_install COPY uses --link
File(s): components/core/tools/docker-images/clp-env-base-manylinux_2_28-aarch64/Dockerfile, components/core/tools/docker-images/clp-env-base-manylinux_2_28-x86_64/Dockerfile, components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/Dockerfile, components/core/tools/docker-images/clp-env-base-musllinux_1_2-x86_64/Dockerfile, components/core/tools/docker-images/clp-env-base-ubuntu-jammy/Dockerfile
Summary: Replaced COPY ./tools/scripts/lib_install ./tools/scripts/lib_install (or ADD) with COPY --link ./tools/scripts/lib_install ./tools/scripts/lib_install. Added Dockerfile syntax directive in some files.

Cohort: clp-package COPYs add --link
File(s): tools/docker-images/clp-package/Dockerfile
Summary: Added --link to COPYs for /setup-scripts, build/clp-package/opt/clp, and final-stage COPY --from=base / /.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "chore(docker): Use COPY --link for improved file copying efficiency (resolves #1408)" accurately and specifically describes the main change in the changeset. The title clearly identifies what is being modified (COPY commands in Dockerfiles), how they are being modified (adding the --link flag), and the intended benefit (improved file copying efficiency). The title is concise, uses proper conventional commit formatting with a clear scope (docker), and avoids vague or misleading language. A teammate reviewing the repository history would immediately understand that this pull request addresses Docker build optimization through file-copying improvements across multiple Dockerfiles.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9ba1140 and d9d9413.

📒 Files selected for processing (3)
  • components/core/tools/docker-images/clp-core-ubuntu-jammy/Dockerfile (2 hunks)
  • components/core/tools/docker-images/clp-env-base-centos-stream-9/Dockerfile (2 hunks)
  • components/core/tools/docker-images/clp-env-base-ubuntu-jammy/Dockerfile (2 hunks)
🧰 Additional context used
🪛 Checkov (3.2.334)
components/core/tools/docker-images/clp-env-base-ubuntu-jammy/Dockerfile

[low] 1-23: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)


[low] 1-23: Ensure that a user for the container has been created

(CKV_DOCKER_3)



components/core/tools/docker-images/clp-core-ubuntu-jammy/Dockerfile

[low] 1-31: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)


[low] 1-31: Ensure that a user for the container has been created

(CKV_DOCKER_3)

components/core/tools/docker-images/clp-env-base-centos-stream-9/Dockerfile

[low] 3-3: Ensure the base image uses a non latest version tag

(CKV_DOCKER_7)


[low] 1-23: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)


[low] 1-23: Ensure that a user for the container has been created

(CKV_DOCKER_3)



🪛 Hadolint (2.14.0)
components/core/tools/docker-images/clp-env-base-centos-stream-9/Dockerfile

[warning] 3-3: Always tag the version of an image explicitly

(DL3006)

🔇 Additional comments (9)
components/core/tools/docker-images/clp-env-base-ubuntu-jammy/Dockerfile (3)

1-1: BuildKit syntax directive added; verify build environment compatibility.

The # syntax=docker/dockerfile:1 directive enables BuildKit features (including --link support). Ensure that all build environments (CI/CD, local developer machines, production image builds) have BuildKit enabled or default to it.
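As a minimal sketch of the point above (the base image and paths are placeholders, not taken from this repository), the directive has to be the very first line of the Dockerfile to be recognised as a parser directive:

```dockerfile
# syntax=docker/dockerfile:1
# The line above is a parser directive. It only takes effect when it appears
# before any other comment, blank line, or instruction in the file.

# Assumed base image, for illustration only.
FROM ubuntu:jammy

# Hypothetical source and destination paths.
# --link needs Dockerfile syntax 1.4 or later, which the 1.x channel provides.
COPY --link ./scripts /opt/scripts
```

Docker Engine 23.0 and later use BuildKit by default; on older engines the build would likely need `DOCKER_BUILDKIT=1` set in the environment for the directive and `--link` to be honoured.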


8-8: Appropriate conversion from ADD to COPY --link for local file copying.

The change from ADD to COPY --link is correct: ADD is unnecessary for copying local files, and COPY --link copies the files into their own independent layer, so that layer can be reused from cache even when earlier layers change. The preceding mkdir (line 7) ensures the destination directory exists.
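The conversion described here might look roughly like the following. This is a sketch only: the base image, the WORKDIR, and the exact form of the mkdir step are assumptions, since the actual Dockerfile is not reproduced in this conversation.

```dockerfile
# syntax=docker/dockerfile:1
# Assumed base image and working directory, for illustration only.
FROM ubuntu:jammy
WORKDIR /root

# Assumed shape of the preceding mkdir step mentioned in the review.
RUN mkdir -p ./tools/scripts/lib_install

# Before (assumed): ADD ./tools/scripts/lib_install ./tools/scripts/lib_install
# After: a plain local copy, placed in its own layer by --link.
COPY --link ./tools/scripts/lib_install ./tools/scripts/lib_install
```

COPY creates missing destination directories on its own, so the mkdir step mainly documents intent rather than being strictly required.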


22-23: COPY --link --from=base / / correctly adds link flag to stage flattening.

The multi-stage pattern (base → scratch with full-filesystem copy) is sound, and the --link flag lets BuildKit build the copied files as an independent layer that does not depend on the layers beneath it.
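For readers unfamiliar with the stage-flattening pattern, a minimal sketch follows. The base image and the RUN step are hypothetical stand-ins for the real dependency installation; only the final two lines mirror what the review describes.

```dockerfile
# syntax=docker/dockerfile:1
# Assumed base image, for illustration only.
FROM ubuntu:jammy AS base

# Hypothetical setup step standing in for the real dependency installation.
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Flatten: start from an empty image and copy the entire filesystem of the
# `base` stage into it, so the final image carries a single layer.
FROM scratch
COPY --link --from=base / /
```

Because the final stage starts from scratch, nothing exists beneath the copied layer, which is the situation where an independent `--link` layer is least likely to change the result.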

Verify that all Dockerfiles in the repository have been updated consistently with --link flags as described in the PR summary.

components/core/tools/docker-images/clp-core-ubuntu-jammy/Dockerfile (3)

1-1: Syntax directive enables BuildKit features for --link flag.

The # syntax=docker/dockerfile:1 directive selects the latest stable 1.x Dockerfile frontend, which supports the --link flag (available since syntax 1.4) used throughout this Dockerfile. This is the correct approach for leveraging BuildKit optimisations.


19-23: COPY --link replacements for local source files are correct.

Converting ADD directives to COPY --link for local files (not remote URLs or archives) aligns with the PR objectives and Docker best practices. With --link, BuildKit places the copied files in their own layer, which improves cache reuse and build efficiency.


27-27: Verify multi-stage COPY --link handles stage flattening correctly.

Adding --link to the COPY --from=base / / command (entire root filesystem copy) is consistent with the PR approach. Since all build CIs passed, this appears functional, but ensure that the broad scope of this copy operation does not introduce unexpected layer behaviour in your build environment.

components/core/tools/docker-images/clp-env-base-centos-stream-9/Dockerfile (3)

1-1: BuildKit syntax directive correctly added.

The # syntax=docker/dockerfile:1 directive enables BuildKit features required for the --link flag on COPY commands. Placement at the beginning of the file is correct.


8-8: LGTM—--link flag correctly applied to lib_install COPY.

The --link optimization is appropriately applied to the local directory copy. Syntax and flag placement are correct.


23-23: Verify --link behaviour with stage-flattening copy.

The --link flag has been applied to the COPY --from=base / / command, which copies the entire filesystem from the base stage. Ensure that the copy works as expected when copying a full root directory across build stages, particularly regarding symbolic links in destination paths (which --link does not follow), permissions, and other filesystem metadata.

You may want to validate that the resulting image behaves identically to builds without --link, especially:

  • Any symbolic links that should be preserved
  • File permissions and ownership across copied files
  • Total build time improvement with this optimization


@junhaoliao junhaoliao marked this pull request as ready for review October 13, 2025 01:26
@junhaoliao junhaoliao requested a review from a team as a code owner October 13, 2025 01:26

@Bill-hbrhbr Bill-hbrhbr left a comment

Contributor


Two comments on out-of-range lines:

  1. It's recommended to add
    # syntax=docker/dockerfile:1
    at the top of every Dockerfile that uses COPY --link. Or to be direct, just add it to every Dockerfile.
    The rationale is that COPY --link requires syntax v1.4, and by specifying v1, we always get the latest syntax version:
    https://docs.docker.com/build/buildkit/frontend/#stable-channel
    https://hub.docker.com/r/docker/dockerfile#linked-copies-copy---link-add---link
  2. We can also add ADD --link, and I believe if we are not dealing with URL downloads, we can change all the ADD --link to COPY --link
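To illustrate the distinction drawn in the second comment, here is a hedged sketch of where ADD --link still makes sense versus where COPY --link is enough. The URL, base image, and destination paths are placeholders and not files used by this repository.

```dockerfile
# syntax=docker/dockerfile:1
# Assumed base image, for illustration only.
FROM ubuntu:jammy

# ADD remains the right instruction for remote sources, and it accepts --link.
# Placeholder URL; nothing in this repository is known to download from it.
ADD --link https://example.com/archive.tar.gz /tmp/archive.tar.gz

# Local files and directories need none of ADD's extras (URL fetching,
# local tar auto-extraction), so COPY --link is sufficient.
COPY --link ./tools/scripts/lib_install /opt/lib_install
```

Preferring COPY for local sources also makes the intent explicit, since a reader does not have to check whether a source happens to be an archive that ADD would unpack.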

@junhaoliao

Member Author

We can also add ADD --link, and I believe if we are not dealing with URL downloads, we can change all the ADD --link to COPY --link

I believe none of the ADD directives are intended to download remote files or extract archives, so I replaced them all with COPY --link

@Bill-hbrhbr Bill-hbrhbr linked an issue Oct 19, 2025 that may be closed by this pull request
5 tasks
@Bill-hbrhbr

Bill-hbrhbr commented Oct 19, 2025

Contributor

For title, how about:

chore(docker): Add `--link` flags to COPY/ADD operations for improved build performance (fixes #1408).

@junhaoliao junhaoliao changed the title chore(docker): Use COPY --link for improved file copying efficiency (resolves #1408). chore(docker): Add --link flags to COPY/ADD operations for improved build performance (fixes #1408). Oct 20, 2025
@junhaoliao junhaoliao merged commit 11d73fb into y-scope:main Oct 20, 2025
27 checks passed
@junhaoliao junhaoliao deleted the docker-copy branch October 20, 2025 04:08
LinZhihao-723 added a commit to LinZhihao-723/clp that referenced this pull request Oct 22, 2025
* feat(webui): Add drawer to display guided query and errors. (y-scope#1421)

* docs: Add Slack community invite badge to home page README. (y-scope#1418)

* refactor(clp-package): Simplify StrEnum and Path serialization via Annotated serializers. (y-scope#1417)

Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* build(clp-package): Adopt uv + hatchling as the build and packaging backend for Python components (resolves y-scope#1396); Upgrade dependencies for Python components. (y-scope#1405)

* chore(docker): Add `--link` flags to COPY/ADD operations for improved build performance (fixes y-scope#1408). (y-scope#1411)

* fix(ci): Correctly update and restore cache of `lint:check-cpp-lint-static-full`'s generated files (fixes y-scope#1419): (y-scope#1430)

- Save cache entries using unique key per entry.
- Restore latest entries using key prefix.
- Avoid using outputs from optionally-run `restore` task.

* fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client. (y-scope#1445)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* refactor(clp-package): Remove unused `python-dotenv` dependency and related imports (fixes y-scope#1443). (y-scope#1444)

* fix(webui): Submit queries that failed ANTLR validation to Presto.  (y-scope#1450)

* feat(clp-s): Explicitly reject unstructured log inputs during compression. (y-scope#1434)

* feat(webui): Show query speed in native search status. (y-scope#1429)

* fix(job-orchestration): Make `tag_ids` a required `list[int]` for compatibility with Spider compressor. (y-scope#1453)

* feat(clp-mcp-server): Add log viewer links to query results for displaying in LLM output. (y-scope#1454)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* feat(ci): Add tasks for checking and updating Rust lock file (`Cargo.lock`); Add check to GH workflow. (y-scope#1448)

* feat(webui): Trigger submit action when pressing Enter on Monaco single line editor. (y-scope#1459)

* Add filters.

* Update cargo lock.

---------

Co-authored-by: davemarco <83603688+davemarco@users.noreply.github.com>
Co-authored-by: Abigail Matthews <abigail.v.matthews@gmail.com>
Co-authored-by: sitaowang1998 <sitaowang1998@outlook.com>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao.liao@yscope.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com>
Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com>
Co-authored-by: Huangshi Tian <All-less@users.noreply.github.com>
LinZhihao-723 added a commit to LinZhihao-723/clp that referenced this pull request Oct 22, 2025
* feat(webui): Add drawer to display guided query and errors. (y-scope#1421)

* docs: Add Slack community invite badge to home page README. (y-scope#1418)

* refactor(clp-package): Simplify StrEnum and Path serialization via Annotated serializers. (y-scope#1417)

Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* build(clp-package): Adopt uv + hatchling as the build and packaging backend for Python components (resolves y-scope#1396); Upgrade dependencies for Python components. (y-scope#1405)

* chore(docker): Add `--link` flags to COPY/ADD operations for improved build performance (fixes y-scope#1408). (y-scope#1411)

* fix(ci): Correctly update and restore cache of `lint:check-cpp-lint-static-full`'s generated files (fixes y-scope#1419): (y-scope#1430)

- Save cache entries using unique key per entry.
- Restore latest entries using key prefix.
- Avoid using outputs from optionally-run `restore` task.

* fix(clp-rust-utils): Use AWS SDK default configuration with latest behavior version for S3 client. (y-scope#1445)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* refactor(clp-package): Remove unused `python-dotenv` dependency and related imports (fixes y-scope#1443). (y-scope#1444)

* fix(webui): Submit queries that failed ANTLR validation to Presto.  (y-scope#1450)

* feat(clp-s): Explicitly reject unstructured log inputs during compression. (y-scope#1434)

* feat(webui): Show query speed in native search status. (y-scope#1429)

* fix(job-orchestration): Make `tag_ids` a required `list[int]` for compatibility with Spider compressor. (y-scope#1453)

* feat(clp-mcp-server): Add log viewer links to query results for displaying in LLM output. (y-scope#1454)

Co-authored-by: Junhao Liao <junhao.liao@yscope.com>

* feat(ci): Add tasks for checking and updating Rust lock file (`Cargo.lock`); Add check to GH workflow. (y-scope#1448)

* feat(webui): Trigger submit action when pressing Enter on Monaco single line editor. (y-scope#1459)

* Add filters.

* Update cargo lock.

* Stupid fix

---------

Co-authored-by: davemarco <83603688+davemarco@users.noreply.github.com>
Co-authored-by: Abigail Matthews <abigail.v.matthews@gmail.com>
Co-authored-by: sitaowang1998 <sitaowang1998@outlook.com>
Co-authored-by: Junhao Liao <junhao@junhao.ca>
Co-authored-by: Junhao Liao <junhao.liao@yscope.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com>
Co-authored-by: hoophalab <200652805+hoophalab@users.noreply.github.com>
Co-authored-by: Huangshi Tian <All-less@users.noreply.github.com>
junhaoliao added a commit to junhaoliao/clp that referenced this pull request May 17, 2026

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Adopt COPY --link in Dockerfiles for improved build performance

2 participants