ggml-webgpu: update WebGPU support and add link to blog/demo #23483
Thank you, and thank you too for all the hard work reviewing and building this project :)
I honestly did not expect to be credited in your blog. It has been an honor, sir! :)
Nice! Thanks for adding the blog post! Btw, I'm rethinking not supporting mem64 now, so do you think Safari is worth supporting (i.e. what's the perf compared to Chrome)? If it's worth keeping, I can probably distribute another build without jspi/mem64.
@CISC of course, without you all this code would have taken a lot longer to be merged!
@ngxson yeah, I think Safari is worth supporting. For example, on my M3 with Llama 3.2 1B Q4_K_M I see ~30 t/s decode on Safari vs ~50 t/s on Chrome. So not great, but much better than Firefox. Plus, Safari is used by more people than Firefox and is the only way to run things on iOS.
* origin/master:
  server: only parse empty msg if continuing an assistant msg (ggml-org#23506)
  perplexity : fix integer overflow (ggml-org#23496)
  SYCL: improve MoE prefill throughput (ggml-org#23142)
  sycl : Level Zero detection in ggml_sycl_init (ggml-org#23097)
  SYCL : gated_delta_net K>1 (ggml-org#23174)
  SYCL: add BF16 to DMMV kernel path (~4x tg speedup on Intel Arc) (ggml-org#21580)
  docs: Update documentation with Granite 4.0/4.1 (ggml-org#23404)
  ggml-zendnn : add Q8_0 quantization support (ggml-org#23414)
  cmake : build router app only during standalone builds (ggml-org#23521)
  vocab : fix HybridDNA tokenizer (ggml-org#23466)
  cmake : add install() for impl libraries + fix apple builds (ggml-org#23511)
  CUDA: fix PDL CC check for JIT compilation (ggml-org#23471)
  cmake : remove STATIC from impl libraries, enable LLAMA_BUILD_APP by default (ggml-org#23462)
  Update WebGPU support and add link to blog/demo (ggml-org#23483)
  vulkan: fuse snake activation (mul, sin, sqr, mul, add) (ggml-org#22855)
Overview
Let me know if there are any edits or updates I should make to the blog post too!
Requirements