Skip to content

Fix llama.cpp CMake build detection in save.py#5957

Open
ashzak wants to merge 7 commits into
unslothai:mainfrom
ashzak:fix/quick-wins-5832-5821
Open

Fix llama.cpp CMake build detection in save.py#5957
ashzak wants to merge 7 commits into
unslothai:mainfrom
ashzak:fix/quick-wins-5832-5821

Conversation

@ashzak

@ashzak ashzak commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #5832: when llama.cpp migrated from Makefile to CMake, the build detection in unsloth/save.py tried make clean first and failed with a confusing "Build system changed: The Makefile build has been replaced by CMake" error.

  • Add _is_cmake_only_llama_cpp() helper that detects the post-migration Makefile deprecation stub (or a missing Makefile)
  • Detect the build system before attempting make clean in install_llama_cpp_make_non_blocking(), install_llama_cpp_old(), and install_llama_cpp_blocking()
  • Suppress stderr when probing make clean as a fallback check

The modern GGUF path goes through unsloth_zoo.llama_cpp.install_llama_cpp(); the same guard for that path is in unslothai/unsloth-zoo#763.

A docs-download endpoint originally bundled in this PR was removed; #5821 should be solved by bundling real docs at build time in its own PR.

Testing

  • Helper verified against the actual deprecation stub from llama.cpp master (detected), a missing Makefile (detected), and a legacy buildable Makefile (not detected)
  • py_compile on unsloth/save.py passes

Fixes unslothai#5832: llama.cpp build fails due to CMake migration
- Add _is_cmake_only_llama_cpp() helper to detect CMake-only builds
- Check Makefile for deprecation notice before trying `make clean`
- Suppress confusing "Build system changed" error by detecting early
- Update all three build functions to use the new detection

Fixes unslothai#5821: Add documentation download for offline RAG
- Add /api/settings/docs/download endpoint to serve bundled docs
- Generate comprehensive markdown quick-reference guide
- Add download button to About tab in settings
- Add i18n translations (English and Chinese)

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new documentation download endpoint and a corresponding "Download" button in the frontend "About" tab to support offline use and local RAG systems. Additionally, it improves the llama.cpp build system detection by checking for CMake-only configurations before executing make. Feedback on the changes highlights a potential crash in unsloth/save.py if the make executable is missing from the system, suggesting wrapping the subprocess.run call in a try-except block to handle FileNotFoundError and fallback gracefully to CMake.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread unsloth/save.py Outdated
Comment on lines +1176 to +1181
result = subprocess.run(
["make", "clean", "-C", "llama.cpp"],
stdout = subprocess.DEVNULL,
stderr = subprocess.DEVNULL,
)
IS_CMAKE = result.returncode != 0

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using subprocess.run with shell=False (the default) to execute make will raise a FileNotFoundError if the make executable is not installed on the system (e.g., on Windows or minimal Docker containers). This will crash the GGUF export process.

To prevent this, wrap the execution in a try...except FileNotFoundError block and fallback to IS_CMAKE = True if make is unavailable.

Suggested change
result = subprocess.run(
["make", "clean", "-C", "llama.cpp"],
stdout = subprocess.DEVNULL,
stderr = subprocess.DEVNULL,
)
IS_CMAKE = result.returncode != 0
try:
result = subprocess.run(
["make", "clean", "-C", "llama.cpp"],
stdout=subprocess.DEVNULL,
stderr=subprocess.DEVNULL,
)
IS_CMAKE = result.returncode != 0
except FileNotFoundError:
IS_CMAKE = True

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7181d9c4fc

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread studio/backend/routes/settings.py Outdated


# Documentation export endpoint for RAG and offline use
_DOCS_CACHE: str | None = None

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid Python 3.10-only union syntax

In Python 3.9 environments, which are allowed by the root pyproject.toml (requires-python = ">=3.9,<3.15"), importing this route evaluates str | None immediately and raises TypeError: unsupported operand type(s) for |: 'type' and 'NoneType' before the Studio server can start. Use Optional[str] or add postponed annotations so the new settings route remains importable on the supported Python range.

Useful? React with 👍 / 👎.

if (token) headers.set("Authorization", `Bearer ${token}`);

try {
const res = await fetch(apiUrl("/api/settings/docs/download"), { headers });

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use authenticated fetch for the protected download

Because /api/settings/docs/download is protected by Depends(get_current_subject), this raw fetch fails whenever the access token has expired but the refresh token is still valid: it receives a 401 and only logs an error, unlike the existing settings API helpers that use authFetch to refresh/retry and handle Tauri auto-auth. In that common expired-token scenario the new Download button does nothing until the user refreshes or logs in again.

Useful? React with 👍 / 👎.

ashzak added 2 commits June 2, 2026 12:23
Address Gemini review feedback - wrap subprocess.run in try-except
to handle FileNotFoundError when make is not installed, falling back
to CMake build system.
- Use Optional[str] instead of str | None for Python 3.9 compatibility
- Use authFetch instead of raw fetch for token refresh handling
@danielhanchen danielhanchen self-assigned this Jun 11, 2026
@danielhanchen

Copy link
Copy Markdown
Member

Thanks for the PR. I synced the branch with main and made two changes.

Kept the CMake detection fix (#5832). The _is_cmake_only_llama_cpp helper is the right approach. I verified it against the actual deprecation stub on llama.cpp master (detects it correctly), a missing Makefile, and a legacy buildable Makefile. Also trimmed the comments a bit.

Removed the docs download endpoint (#5821). The endpoint shipped a large hand-written markdown summary hardcoded inside routes/settings.py. That content duplicates docs.unsloth.ai and will silently drift out of date, and stale numbers presented as official documentation are worse than no file. The right shape for #5821 is bundling the real docs into the app at build time (or generating the file from the docs source) so there is a single source of truth. Happy to review that as its own PR.

Retitled the PR to match the remaining scope.

@danielhanchen danielhanchen changed the title fix: llama.cpp CMake build detection + docs download feature Fix llama.cpp CMake build detection in save.py Jun 11, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] llama.cpp build fails during the saving of the model to GGUF

2 participants