ggml-webgpu: update WebGPU support and add link to blog/demo by reeselevine · Pull Request #23483 · ggml-org/llama.cpp

reeselevine · 2026-05-21T15:39:16Z

Overview

Adds the blog post I wrote introducing WebGPU to the hot topics; the post also links to a paper we wrote.
Updates the WebGPU docs to specify that support is no longer just in progress.

Let me know if there are any edits or updates I should make to the blog post too!

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: no

ggerganov

Well done!

CISC

Awesome!

reeselevine · 2026-05-21T18:00:02Z

thank you, and to you for all the hard work reviewing and building this project :)

CISC · 2026-05-21T18:15:40Z

> thank you, and to you for all the hard work reviewing and building this project :)

I honestly did not expect to be credited in your blog, it has been an honor sir! :)

ngxson · 2026-05-21T18:23:27Z

Nice! Thanks for adding the blog post!

Btw I'm rethinking not supporting mem64 now, so do you think Safari is worth supporting (i.e. what's the perf compared to Chrome)? If it's worth keeping, I can probably distribute another build without jspi/mem64.

reeselevine · 2026-05-21T20:02:33Z

@CISC of course, without you all this code would have taken a lot longer to be merged!

@ngxson yeah I think Safari is worth supporting. For example, on my M3 with Llama 3.2 1B Q4_K_M I see ~30 t/s decode on Safari vs ~50 t/s on Chrome. So not great, but much better than Firefox, plus Safari is used by more people than Firefox and is the only way to run things on iOS.

* origin/master:
  server: only parse empty msg if continuing an assistant msg (ggml-org#23506)
  perplexity : fix integer overflow (ggml-org#23496)
  SYCL: improve MoE prefill throughput (ggml-org#23142)
  sycl : Level Zero detection in ggml_sycl_init (ggml-org#23097)
  SYCL : gated_delta_net K>1 (ggml-org#23174)
  SYCL: add BF16 to DMMV kernel path (~4x tg speedup on Intel Arc) (ggml-org#21580)
  docs: Update documentation with Granite 4.0/4.1 (ggml-org#23404)
  ggml-zendnn : add Q8_0 quantization support (ggml-org#23414)
  cmake : build router app only during standalone builds (ggml-org#23521)
  vocab : fix HybridDNA tokenizer (ggml-org#23466)
  cmake : add install() for impl libraries + fix apple builds (ggml-org#23511)
  CUDA: fix PDL CC check for JIT compilation (ggml-org#23471)
  cmake : remove STATIC from impl libraries, enable LLAMA_BUILD_APP by default (ggml-org#23462)
  Update WebGPU support and add link to blog/demo (ggml-org#23483)
  vulkan: fuse snake activation (mul, sin, sqr, mul, add) (ggml-org#22855)

Update WebGPU support and add link to blog/demo

e4d1ea4

reeselevine requested a review from ggerganov as a code owner May 21, 2026 15:39

reeselevine requested review from CISC, ggerganov and ngxson and removed request for ggerganov May 21, 2026 15:40

ggerganov approved these changes May 21, 2026

CISC approved these changes May 21, 2026

reeselevine merged commit ee7c305 into ggml-org:master May 21, 2026
2 of 3 checks passed

github-actions Bot added the documentation Improvements or additions to documentation label May 21, 2026

ProTekk pushed a commit to ProTekk/buun-llama-cpp that referenced this pull request May 22, 2026

Update WebGPU support and add link to blog/demo (ggml-org#23483)

eb41d1a

Alex7MV pushed a commit to Alex7MV/claude_llama.cpp that referenced this pull request May 22, 2026

Update WebGPU support and add link to blog/demo (ggml-org#23483)

121bc08

baramofme pushed a commit to baramofme/llama-cpp-turboquant that referenced this pull request May 23, 2026

Update WebGPU support and add link to blog/demo (ggml-org#23483)

7f50a48

srossitto79 pushed a commit to srossitto79/llama.cpp that referenced this pull request May 23, 2026

Update WebGPU support and add link to blog/demo (ggml-org#23483)

2d4e444

fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026

Update WebGPU support and add link to blog/demo (ggml-org#23483)

80d1319

turbo-tan pushed a commit to turbo-tan/llama.cpp-tq3 that referenced this pull request Jun 2, 2026

Update WebGPU support and add link to blog/demo (ggml-org#23483)

1c48d6e

reeselevine merged 1 commit into ggml-org:master from reeselevine:webgpu-support


4 participants
