
chore: llama.cpp as submodule#819

Merged
njbrake merged 23 commits into main from brake/llamacpp_submodule
Nov 4, 2025
Conversation

Collaborator

@njbrake njbrake commented Nov 3, 2025

To test

```shell
# Check out this branch, then run `make setup` (this sets up all the
# submodules and applies the patches)
make setup
# Git clone another copy
git clone git@github.com:mozilla-ai/llamafile.git tmp_copy
rm -rf tmp_copy/llama.cpp
cp -r llama.cpp tmp_copy
cd tmp_copy
git status
```

You should see

```
*[main][~/scm/llamafile/tmp]$ git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   llama.cpp/ggml-cuda.cu
        modified:   llama.cpp/ggml-metal.m
        modified:   llama.cpp/server/public/index.html
        modified:   llama.cpp/server/server.cpp
```

These are just whitespace changes from the autoformat rules; no content changes.
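One way to double-check the "whitespace-only" claim is `git diff -w`, which ignores whitespace when comparing: if its output is empty while the plain diff is not, no content changed. A self-contained sketch in a throwaway repo (the file and contents are made up for the demo):

```shell
# Demo: a change that `git diff` reports but `git diff -w` ignores.
repo=$(mktemp -d)
git -C "$repo" init -q
printf 'int main() { return 0; }\n' > "$repo/main.c"
git -C "$repo" add main.c
git -C "$repo" -c user.email=a@b -c user.name=t commit -qm init
printf 'int main() {  return 0; }\n' > "$repo/main.c"   # extra space only
git -C "$repo" diff --stat      # shows main.c as modified
git -C "$repo" diff -w --stat   # empty: the change is whitespace-only
```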

@njbrake njbrake requested a review from aittalam November 3, 2025 17:39
Base automatically changed from brake/sd_submodule to main November 3, 2025 18:48
@github-actions github-actions bot added the devops label Nov 3, 2025
@njbrake njbrake linked an issue Nov 3, 2025 that may be closed by this pull request
Member

@aittalam aittalam left a comment


Thanks for the PR @njbrake! Aside from a couple of minor comments, there's a more important one IMHO to be addressed: unlike whisper and sd, the process this time does not tell patched files apart from newly introduced ones (see my longer comment in renames.sh).

I think getting a better understanding of what is a change to llama.cpp's code vs. what has been newly introduced would be beneficial to us. Even with the obvious caveat that there are a lot of changes, removing patched files from llamafile-files will leave us with a much smaller, cleaner set of files we know are not originally from llama.cpp. WDYT? Happy to do a quick sync on that.
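The split described above can be sketched with `diff -rq` against a pristine upstream checkout: "Only in" lines name newly introduced files, while "Files ... differ" lines name patched upstream files. The directories and file names below are throwaway examples, not the real repo layout:

```shell
# Hypothetical sketch: classify files against a pristine upstream tree.
up=$(mktemp -d); ours=$(mktemp -d)
echo 'upstream version' > "$up/common.c"
echo 'patched version'  > "$ours/common.c"   # upstream file we patched
echo 'brand new code'   > "$ours/extra.c"    # file we introduced

diff -rq "$up" "$ours" | grep "^Only in $ours"   # newly introduced files
diff -rq "$up" "$ours" | grep " differ$"         # patched upstream files
```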

@njbrake
Collaborator Author

njbrake commented Nov 4, 2025

Thanks for the review @aittalam! I appreciate your in-depth review, and it was helpful to me as I think about the philosophy of the AI assisted coding world we're entering. In the case of this PR, I used Claude to help with the refactoring and took the time to verify that my refactor was technically correct, but I didn't take the time to actually review how it was doing it. What this resulted in was that I effectively shifted the cognitive burden of thinking about the PR off of me and onto you, the reviewer.

This is bad practice, and not good for me as the author or for you as the reviewer. IMO the author (me) should remain primarily responsible for thinking about the design of the changes, not the reviewer. In this case you had to both think about the design and act as the backstop, which isn't ideal 🙈. Anyway, thank you for the solid review, sorry for my spaghetti code, and let me dig in to clean up the code (as David says, "humanify" it 😆).

@njbrake njbrake requested a review from aittalam November 4, 2025 14:04
Member

@aittalam aittalam left a comment


Thank you @njbrake! I think seeing what's left is very interesting.
I took the liberty of suggesting two more directories to remove (I spotted them with a diff -r llama.cpp /tmp/llamafile/llama.cpp, perhaps git diff ignores empty dirs?), and I also got the following:

```
diff -r llama.cpp /tmp/llamafile/llama.cpp

Only in llama.cpp: .git
diff -r llama.cpp/ggml-cuda.cu /tmp/llamafile/llama.cpp/ggml-cuda.cu
366c366
<                 kernelVersion % 100000);
---
>                 kernelVersion % 100000);
diff -r llama.cpp/ggml-metal.m /tmp/llamafile/llama.cpp/ggml-metal.m
3381c3381
<
---
>
diff -r llama.cpp/server/public/index.html /tmp/llamafile/llama.cpp/server/public/index.html
882c882
<     // such as "strings" or /* comments */. These regexps are then utilizied by the
---
>     // such as "strings" or /* comments */. These regexps are then utilizied by the
1085,1086c1085,1086
<     // This transforms _some_ markdown to html by replacing code blocks and
<     // urls with a placeholder, so that any markdown within these already
---
>     // This transforms _some_ markdown to html by replacing code blocks and
>     // urls with a placeholder, so that any markdown within these already
diff -r llama.cpp/server/server.cpp /tmp/llamafile/llama.cpp/server/server.cpp
3196c3196
<         int written = snprintf(url, sizeof(url), "http://%s:%d%s/",
---
>         int written = snprintf(url, sizeof(url), "http://%s:%d%s/",
3233c3233
<                     exit(1);
---
>                     exit(1);
```
... which I believe we could keep (I think these are the same lines you were referring to).

I pre-approved the PR so you can merge it immediately after the (minor) fix!
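As an aside, whether the remaining diffs are whitespace-only can be confirmed mechanically: `diff -r -w` ignores whitespace, so a clean exit on directories that plain `diff -r` flags means no content changed. A self-contained sketch with throwaway directories (paths and contents are made up):

```shell
# Demo: directories that differ only in whitespace.
a=$(mktemp -d); b=$(mktemp -d)
printf 'exit(1);\n'     > "$a/server.cpp"
printf '    exit(1);\n' > "$b/server.cpp"   # indentation change only
diff -r "$a" "$b" >/dev/null    || echo "directories differ"
diff -r -w "$a" "$b" >/dev/null && echo "only whitespace differs"
```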

Co-authored-by: Davide Eynard <davide.eynard@gmail.com>
@njbrake njbrake merged commit 9d975d0 into main Nov 4, 2025
2 checks passed
@njbrake njbrake deleted the brake/llamacpp_submodule branch November 4, 2025 15:42

Development

Successfully merging this pull request may close these issues.

Llama.cpp as submodule
