
chore: llama.cpp as submodule#819

Merged
njbrake merged 23 commits into main from brake/llamacpp_submodule
Nov 4, 2025
Conversation

Collaborator

@njbrake njbrake commented Nov 3, 2025

To test

```shell
# Check out this branch, then run `make setup` (this sets up all the
# submodules and applies the patches)
make setup
# Git clone another copy
git clone git@github.com:mozilla-ai/llamafile.git tmp_copy
rm -rf tmp_copy/llama.cpp
cp -r llama.cpp tmp_copy
cd tmp_copy
git status
```

You should see

```
*[main][~/scm/llamafile/tmp]$ git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   llama.cpp/ggml-cuda.cu
        modified:   llama.cpp/ggml-metal.m
        modified:   llama.cpp/server/public/index.html
        modified:   llama.cpp/server/server.cpp
```

These are just whitespace changes from the autoformat rules; no content changes.
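One way to double-check the "whitespace-only" claim is `git diff -w`, which ignores whitespace when comparing: if its output is empty while the plain diff is not, no content changed. A self-contained sketch in a throwaway repo (the file and contents are made up for the demo):

```shell
# Demo: a change that `git diff` reports but `git diff -w` ignores.
repo=$(mktemp -d)
git -C "$repo" init -q
printf 'int main() { return 0; }\n' > "$repo/main.c"
git -C "$repo" add main.c
git -C "$repo" -c user.email=a@b -c user.name=t commit -qm init
printf 'int main() {  return 0; }\n' > "$repo/main.c"   # extra space only
git -C "$repo" diff --stat      # shows main.c as modified
git -C "$repo" diff -w --stat   # empty: the change is whitespace-only
```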

@njbrake njbrake requested a review from aittalam November 3, 2025 17:39
Base automatically changed from brake/sd_submodule to main November 3, 2025 18:48
@github-actions github-actions bot added the devops label Nov 3, 2025
@njbrake njbrake linked an issue Nov 3, 2025 that may be closed by this pull request
Member

@aittalam aittalam left a comment


Thanks for the PR @njbrake! Aside from a couple of minor comments, there's a more important one IMHO to be addressed: unlike whisper and sd, the process this time does not tell patched files apart from newly introduced ones (see my longer comment in renames.sh).

I think getting a better understanding of what is a change to llama.cpp's code vs. what has been newly introduced would be beneficial to us. Even with the obvious caveat that there are a lot of changes, removing patched files from llamafile-files will leave us with a much smaller, cleaner set of files we know are not originally from llama.cpp. WDYT? Happy to do a quick sync on that.
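The split described above can be sketched with `diff -rq` against a pristine upstream checkout: "Only in" lines name newly introduced files, while "Files ... differ" lines name patched upstream files. The directories and file names below are throwaway examples, not the real repo layout:

```shell
# Hypothetical sketch: classify files against a pristine upstream tree.
up=$(mktemp -d); ours=$(mktemp -d)
echo 'upstream version' > "$up/common.c"
echo 'patched version'  > "$ours/common.c"   # upstream file we patched
echo 'brand new code'   > "$ours/extra.c"    # file we introduced

diff -rq "$up" "$ours" | grep "^Only in $ours"   # newly introduced files
diff -rq "$up" "$ours" | grep " differ$"         # patched upstream files
```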

@njbrake
Collaborator Author

njbrake commented Nov 4, 2025

Thanks for the review @aittalam! I appreciate your in-depth review, and it was helpful to me as I think about the philosophy of the AI assisted coding world we're entering. In the case of this PR, I used Claude to help with the refactoring and took the time to verify that my refactor was technically correct, but I didn't take the time to actually review how it was doing it. What this resulted in was that I effectively shifted the cognitive burden of thinking about the PR off of me and onto you, the reviewer.

This is bad practice, and not good for me as the author or for you as the reviewer. IMO the author (me) should remain primarily responsible for thinking about the design of the changes, not the reviewer. In this case you had to both think about the design and act as the backstop, which isn't ideal 🙈. Anyway, thank you for the solid review, sorry for my spaghetti code, and let me dig in to clean up the code (as David says, "humanify" it 😆).

@njbrake njbrake requested a review from aittalam November 4, 2025 14:04
Member

@aittalam aittalam left a comment


Thank you @njbrake! I think seeing what's left is very interesting.
I took the liberty of suggesting two more directories to remove (I spotted them with a diff -r llama.cpp /tmp/llamafile/llama.cpp, perhaps git diff ignores empty dirs?), and I also got the following:

```
diff -r llama.cpp /tmp/llamafile/llama.cpp

Only in llama.cpp: .git
diff -r llama.cpp/ggml-cuda.cu /tmp/llamafile/llama.cpp/ggml-cuda.cu
366c366
<                 kernelVersion % 100000);
---
>                 kernelVersion % 100000);
diff -r llama.cpp/ggml-metal.m /tmp/llamafile/llama.cpp/ggml-metal.m
3381c3381
<
---
>
diff -r llama.cpp/server/public/index.html /tmp/llamafile/llama.cpp/server/public/index.html
882c882
<     // such as "strings" or /* comments */. These regexps are then utilizied by the
---
>     // such as "strings" or /* comments */. These regexps are then utilizied by the
1085,1086c1085,1086
<     // This transforms _some_ markdown to html by replacing code blocks and
<     // urls with a placeholder, so that any markdown within these already
---
>     // This transforms _some_ markdown to html by replacing code blocks and
>     // urls with a placeholder, so that any markdown within these already
diff -r llama.cpp/server/server.cpp /tmp/llamafile/llama.cpp/server/server.cpp
3196c3196
<         int written = snprintf(url, sizeof(url), "http://%s:%d%s/",
---
>         int written = snprintf(url, sizeof(url), "http://%s:%d%s/",
3233c3233
<                     exit(1);
---
>                     exit(1);
```
... which I believe we could keep (I think these are the same lines you were referring to).

I pre-approved the PR so you can merge it immediately after the (minor) fix!
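As an aside, whether the remaining diffs are whitespace-only can be confirmed mechanically: `diff -r -w` ignores whitespace, so a clean exit on directories that plain `diff -r` flags means no content changed. A self-contained sketch with throwaway directories (paths and contents are made up):

```shell
# Demo: directories that differ only in whitespace.
a=$(mktemp -d); b=$(mktemp -d)
printf 'exit(1);\n'     > "$a/server.cpp"
printf '    exit(1);\n' > "$b/server.cpp"   # indentation change only
diff -r "$a" "$b" >/dev/null    || echo "directories differ"
diff -r -w "$a" "$b" >/dev/null && echo "only whitespace differs"
```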

Co-authored-by: Davide Eynard <davide.eynard@gmail.com>
@njbrake njbrake merged commit 9d975d0 into main Nov 4, 2025
2 checks passed
@njbrake njbrake deleted the brake/llamacpp_submodule branch November 4, 2025 15:42

Development

Successfully merging this pull request may close these issues.

Llama.cpp as submodule
