chore: Make llama.cpp a default engine, speedup CI#1177
Conversation
8514ef5 to
59bcf14
Compare
364c121 to
2ab8eb2
Compare
It does not need any external libraries.
Also fix two docker lints (undefined variables).
## Walkthrough
The changes update several Dockerfiles to streamline and unify Rust build steps, shifting from selective crate builds to workspace-wide builds and focusing final build steps on C bindings. The Rust toolchain installation profile is set to "default" instead of "minimal." Documentation and configuration are updated to reflect new default features and clarify build instructions.
## Changes
| Files/Paths | Change Summary |
|-----------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------|
| container/Dockerfile.none, container/Dockerfile.tensorrt_llm, <br>container/Dockerfile.vllm | Rust toolchain installation profile changed from "minimal" to "default" in rustup-init. |
| container/Dockerfile.sglang, container/Dockerfile.tensorrt_llm, <br>container/Dockerfile.vllm | Rust build steps simplified: replaced multi-crate selective builds with workspace-wide builds; C bindings built in dedicated step. |
| container/Dockerfile.vllm | Environment variables `CPATH` and `PKG_CONFIG_PATH` set explicitly; artifact copying consolidated; build process unified. |
| docs/guides/dynamo_run.md | Documentation updated: clarified default llama.cpp build behavior and recommended GPU feature flags for optimized builds. |
| launch/dynamo-run/Cargo.toml | License header removed; default features expanded to include "llamacpp" alongside "mistralrs". |
## Sequence Diagram(s)
```mermaid
sequenceDiagram
participant Dockerfile
participant Rustup
participant Cargo
participant Workspace
participant CBinder
Dockerfile->>Rustup: Install Rust (profile: default)
Dockerfile->>Cargo: Build workspace (--workspace, features, release)
Cargo->>Workspace: Compile all crates
Dockerfile->>CBinder: Build C bindings crate (release)
CBinder->>Workspace: Generate C include files and shared librariesSuggested labels
Suggested reviewers
Poem
|
There was a problem hiding this comment.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
docs/guides/dynamo_run.md (1)
26-28:⚠️ Potential issueUpdate default engine documentation
The introduction still states thatmistralrsis the default engine, but the PR makesllamacppa default feature. Please revise to reflect that llama.cpp is now included (or re-order to showllamacppas default).
🧹 Nitpick comments (3)
docs/guides/dynamo_run.md (2)
308-308: Add missing comma for clarity
Consider writing “built for CPU by default, for an optimized build…” to improve readability.🧰 Tools
🪛 LanguageTool
[uncategorized] ~308-~308: Possible missing comma found.
Context: ...lt for CPU by default. For an optimized build pass the appropriate feature flag (high...(AI_HYDRA_LEO_MISSING_COMMA)
311-311: Specify language for code block
Add a language tag (e.g., ```bash) after the opening backticks to enable syntax highlighting.container/Dockerfile.sglang (1)
323-327: Consider slimming down CI artifact copies.
Copying the entire/workspacedirectory fromwheel_builderbrings over source code and all build artifacts, potentially bloating the CI image. As an optional refactor, you could copy only the necessary C bindings source (lib/bindings/c) and the compiled artifacts (e.g.,target/release/libdynamo_llm_capi.so) to reduce image size.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
container/Dockerfile.none(1 hunks)container/Dockerfile.sglang(2 hunks)container/Dockerfile.tensorrt_llm(3 hunks)container/Dockerfile.vllm(4 hunks)docs/guides/dynamo_run.md(1 hunks)launch/dynamo-run/Cargo.toml(1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/guides/dynamo_run.md
[uncategorized] ~308-~308: Possible missing comma found.
Context: ...lt for CPU by default. For an optimized build pass the appropriate feature flag (high...
(AI_HYDRA_LEO_MISSING_COMMA)
🪛 markdownlint-cli2 (0.17.2)
docs/guides/dynamo_run.md
310-310: Fenced code blocks should have a language specified
null
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: pre-merge-rust (lib/runtime/examples)
- GitHub Check: Build and Test - vllm
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: pre-merge-rust (.)
🔇 Additional comments (12)
launch/dynamo-run/Cargo.toml (2)
1-2: Inconsistent AI summary: License header still present
The AI-generated summary indicates the license header was removed, but lines 1–2 show it remains in this file.Likely an incorrect or invalid review comment.
17-17: Enablellamacppas a default feature
Adding"llamacpp"alongside"mistralrs"in thedefaultarray correctly makes the llama.cpp engine build by default.container/Dockerfile.none (1)
35-35: Switch Rust installation profile to default
Changing--profile minimalto--profile defaultensures the full toolchain is installed, aligning with other container builds.container/Dockerfile.tensorrt_llm (4)
207-207: Switch Rust installation profile to default
Using--profile defaulthere matches the change in other Dockerfiles and ensures a complete Rust toolchain.
276-281: Simplify workspace build
Replacing selective builds withcargo build --workspace --features dynamo-llm/block-managerreduces complexity and guarantees consistency across all crates.
299-299: Copy full workspace from wheel_builder
Consolidating artifact copying by bringing in the entire/workspacedirectory simplifies maintenance and prevents missing files.
303-307: Build C bindings in dev stage
Explicitly runningcargo buildin/workspace/lib/bindings/censures the C API and headers are compiled and available for packaging.container/Dockerfile.vllm (4)
262-262: Switch Rust installation profile to default
Aligning with other container builds by installing Rust with--profile default.
375-380: Consolidate workspace build
Usingcargo build --workspace --features dynamo-llm/block-managersimplifies the build process and ensures all workspace crates are included.
401-402: Copy full workspace and NIXL artifacts
Bringing in the complete/workspaceand NIXL library paths fromwheel_builderreduces Docker layer complexity.
406-408: Build C API bindings in CI minimum stage
Compiling the C bindings in/workspace/lib/bindings/censures the shared library and headers are produced for downstream images.container/Dockerfile.sglang (1)
297-301: Approve consolidated Rust workspace build.
Switching to a singlecargo build --workspacestep simplifies compilation across all crates, leverages incremental caching, and maintains reproducibility with--releaseand--locked.
Co-authored-by: Anant Sharma <anants@nvidia.com> Signed-off-by: Graham King <graham@gkgk.org>
Co-authored-by: Anant Sharma <anants@nvidia.com> Signed-off-by: Graham King <graham@gkgk.org>
Co-authored-by: Anant Sharma <anants@nvidia.com> Signed-off-by: Graham King <graham@gkgk.org>
Co-authored-by: Anant Sharma <anants@nvidia.com> Signed-off-by: Graham King <graham@gkgk.org>
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
container/Dockerfile.tensorrt_llm (1)
303-304: Implement C bindings build as previously suggested
This exactly matches the earlier recommendation to runcd lib/bindings/c && cargo build --release --lockedin the dev image to generate C include files and the shared library.
🧹 Nitpick comments (2)
container/Dockerfile.tensorrt_llm (2)
207-207: Consider image size implications of switching Rust profile
You’ve changed the rustup-init profile fromminimaltodefault, which pulls in extra components (docs, source, rustfmt, etc.). While this unifies the Dockerfiles, it may significantly increase image size. Please verify that this change doesn’t bloat the layer beyond acceptable limits or evaluate if you can selectively install only the necessary components.
299-299: Refine workspace copy to limit to necessary artifacts
Copying the entire/workspacedirectory fromwheel_buildermay include source files and intermediate artifacts you don’t need at runtime. Consider restricting this to only the built artifacts (e.g.,/workspace/dist,/workspace/target) to reduce image size and surface area.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
container/Dockerfile.sglang(2 hunks)container/Dockerfile.tensorrt_llm(3 hunks)container/Dockerfile.vllm(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- container/Dockerfile.sglang
- container/Dockerfile.vllm
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: Build Documentation
- GitHub Check: Build and Test - vllm
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: pre-merge-rust (lib/runtime/examples)
- GitHub Check: pre-merge-rust (.)
🔇 Additional comments (2)
container/Dockerfile.tensorrt_llm (2)
276-280: Consolidated workspace build improves maintainability
Switching tocargo build --workspacewith thedynamo-llm/block-managerfeature simplifies the build steps and ensures all crates are built in one go. This aligns well with your goal of unifying Rust build logic across images.
298-298: No substantive code on this line.
It does not need any external libraries.
Summary by CodeRabbit