Feat/UI math rendering#175
Merged
The engine will start consuming llama.cpp via a git submodule at
engine/vendor/llama-cpp-rs/llama-cpp-sys-2/llama.cpp (see next commit).
Every job that compiles the engine therefore needs `submodules: recursive`
under `with:` on its checkout step; otherwise llama-cpp-sys-2's build.rs
cannot find the C++/CUDA source tree.
Touched jobs (those that run cargo build on the engine):
* ci.yml: engine
* release-engine.yml: build (Linux/macOS standard matrix),
build-cuda (Linux CUDA container), build-windows, build-windows-cuda
* try-windows-ninja.yml: build-windows-cuda-ninja
Untouched (correctly): path-filter detector, hub, forge,
installer-preflight, the final release aggregator, and
build-llama-dll.yml (standalone llama.cpp DLL build that does not pass
through llama-cpp-sys-2).
…6.0-beta.7
Why this exists
---------------
EuLLM Engine 0.6.0-beta.{1..6} were pre-releases bumped specifically to
ship Gemma 4 12B Unified vision via llama.cpp's mtmd path with the new
`gemma4uv` projector (PRs ggml-org/llama.cpp#24077, #24082, #24091,
merged 3-4 Jun 2026). The blocker turned out to be that the upstream
Rust binding `llama-cpp-2` (utilityai/llama-cpp-rs) pins its llama.cpp
submodule to a late-April commit and has not bumped since, so the new
projector type is silently absent from bindgen output and Gemma 4 12B
Unified cannot load on a stock 0.1.146 binary.
On 5 Jun 2026 ysimonson opened utilityai/llama-cpp-rs#1034, which does
exactly the bump we need ("support-gemma4-12b" branch,
c491763bcd42eb742287afd5612883d3b6e5e3a8). The PR has no reviews so
far. Rather than wait on an open-ended merge window, we vendor the
sources in-tree and become the first public adopter of that PR, which
also lets us comment on it with end-to-end evidence (Linux CUDA,
Windows CUDA, macOS Metal) once the beta.7 builds run.
Vendor strategy
---------------
* engine/vendor/llama-cpp-rs/ holds llama-cpp-2 0.1.147 and
llama-cpp-sys-2 0.1.147 copied verbatim from ysimonson's HEAD
c491763b. No source edits — the manifests use `workspace = true` and
resolve cleanly because the EuLLM root workspace mirrors upstream's
`[workspace.dependencies]` / `[workspace.lints]` tables byte for byte.
* Both vendor crates become members of the EuLLM root workspace. Cargo
rejected every narrower scoping attempt (`[patch.crates-io]` path,
direct path-dep with `[workspace] exclude`, `package.workspace = ".."`
pin) because path-deps of a workspace member are unconditionally
absorbed into the patching workspace. Merging the workspaces sidesteps
this entirely, and it does not leak upstream's lints: engine and hub
don't opt into `[lints] workspace = true`, so the upstream pedantic
lint set applies only to the vendor crates.
* llama.cpp is a git submodule of THIS repo (not nested inside the
vendor tree) pinned to 7c158fbb4aec1bdc9c81d6ca0e785139f4826fae —
the same SHA ysimonson chose, and the first commit including the
gemma4uv projector. .gitignore gets a `!engine/vendor/**` exception
so the broad `vendor/` rule doesn't swallow the vendored crates.
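As a rough illustration of the merged-workspace layout described above, the root Cargo.toml could look something like the sketch below. The member paths are assumptions: `engine` and `hub` are guessed from the crate names mentioned here, and the llama-cpp-2 directory is assumed to sit next to llama-cpp-sys-2 under engine/vendor/llama-cpp-rs/. The mirrored tables are left empty on purpose; their real contents come from upstream's root manifest at c491763b.

```toml
# Root Cargo.toml: illustrative sketch, not the actual manifest.
[workspace]
members = [
    "engine",                                      # assumed path
    "hub",                                         # assumed path
    "engine/vendor/llama-cpp-rs/llama-cpp-2",      # assumed directory name
    "engine/vendor/llama-cpp-rs/llama-cpp-sys-2",
]

# Copied from upstream's root manifest so that `workspace = true` entries in
# the vendored manifests resolve without editing them. Entries omitted here.
[workspace.dependencies]

[workspace.lints]
```

Since engine and hub carry no `[lints] workspace = true` table of their own, nothing placed under `[workspace.lints]` reaches them.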
Engine wiring
-------------
* engine/Cargo.toml: llama-cpp-2 = { path = "vendor/llama-cpp-rs/...",
version = "0.1.147", features = ["sampler"] }. Version bump
0.1.146 -> 0.1.147 matches the vendor manifests.
* engine version 0.5.20 -> 0.6.0-beta.7. The 0.5.20 stable commit on
this branch (81010eb) stays valid and can be tagged independently for
the Latest release; this commit re-opens the pre-release line on top
of it, this time actually capable of loading Gemma 4 12B Unified.
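For orientation, the two changes to engine/Cargo.toml would sit together roughly as follows. This is only a fragment: other `[package]` keys are left out, and the dependency path is truncated in the commit message, so it is reproduced as written rather than guessed.

```toml
# engine/Cargo.toml: fragment; remaining [package] keys omitted.
[package]
version = "0.6.0-beta.7"   # was 0.5.20

[dependencies]
# The path is elided ("...") in the original text and is kept that way here.
llama-cpp-2 = { path = "vendor/llama-cpp-rs/...", version = "0.1.147", features = ["sampler"] }
```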
Cleanup plan when upstream lands
--------------------------------
When utilityai/llama-cpp-rs#1034 merges and llama-cpp-2 0.1.147+ is
published to crates.io:
1. drop engine/vendor/llama-cpp-rs
2. drop the workspace members + workspace.dependencies + workspace.lints
that this commit adds to the root Cargo.toml
3. drop the llama.cpp submodule entry and the .gitignore exception
4. restore engine/Cargo.toml to `llama-cpp-2 = { version = "...",
features = ["sampler"] }` (registry, no path)
Known scope of impact
---------------------
* CI checkout steps now need `submodules: recursive` for engine-building
jobs — covered by the previous commit.
* Two upstream API renames in 0.1.147 (`llama_memory_breakdown_print` ->
`llama_rs_memory_breakdown_print`, `llama_params_fit` ->
`llama_rs_params_fit`) do not affect the engine — neither symbol is
referenced anywhere in our source.