llama : more robust logic for determining Meta devices by ggerganov · Pull Request #16 · JohannesGaessler/llama.cpp

ggerganov · 2026-03-26T12:54:58Z

With this patch, RPC devices can be combined into the Meta device to achieve tensor parallelism over the network. The loader also prints some info about the constructed Meta device.

Sample output after the change:

0.00.006.276 W llama_model_load_from_file_impl: skipping BLAS (Accelerate) for tensor parallelism
0.00.006.279 W llama_model_load_from_file_impl: skipping CPU (Apple M2 Ultra) for tensor parallelism
0.00.007.490 I llama_model_load_from_file_impl: creating a Meta device for tensor parallelism from 4 devices:
0.00.007.492 I llama_model_load_from_file_impl: - device 0: MTL0 (Apple M2 Ultra)
0.00.007.493 I llama_model_load_from_file_impl: - device 1: MTL1 (Apple M2 Ultra)
0.00.007.493 I llama_model_load_from_file_impl: - device 2: RPC0 (127.0.0.1:50052)
0.00.007.493 I llama_model_load_from_file_impl: - device 3: RPC1 (127.0.0.1:50053)
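The selection behavior visible in the log above can be sketched roughly as follows. This is a standalone illustration, not the code from `src/llama.cpp`: `dev_kind`, `device_info`, `is_meta_candidate` and `select_meta_devices` are hypothetical names, and the rule that only GPU-like and RPC devices are kept is an assumption inferred from the sample output (BLAS and CPU skipped, MTL and RPC kept).

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the ggml types.
enum class dev_kind { cpu, accel, gpu, rpc };

struct device_info {
    std::string name;        // e.g. "MTL0", "RPC0"
    std::string description; // e.g. "Apple M2 Ultra", "127.0.0.1:50052"
    dev_kind    kind;
};

// Assumed rule (inferred from the sample log): only GPU-like and RPC devices
// take part in tensor parallelism; CPU and accelerator devices are skipped.
static bool is_meta_candidate(const device_info & dev) {
    return dev.kind == dev_kind::gpu || dev.kind == dev_kind::rpc;
}

// Filter the device list and report what was skipped and what was kept,
// mirroring the shape of the log messages in the sample output.
static std::vector<device_info> select_meta_devices(const std::vector<device_info> & all) {
    std::vector<device_info> devs;
    for (const auto & dev : all) {
        if (!is_meta_candidate(dev)) {
            std::fprintf(stderr, "skipping %s (%s) for tensor parallelism\n",
                         dev.name.c_str(), dev.description.c_str());
            continue;
        }
        devs.push_back(dev);
    }

    std::fprintf(stderr, "creating a Meta device for tensor parallelism from %zu devices:\n",
                 devs.size());
    for (size_t i = 0; i < devs.size(); ++i) {
        std::fprintf(stderr, " - device %zu: %s (%s)\n",
                     i, devs[i].name.c_str(), devs[i].description.c_str());
    }
    return devs;
}
```

Feeding this sketch the six devices from the sample output (BLAS, CPU, MTL0, MTL1, RPC0, RPC1) would keep the last four in their original order. The actual patch may apply additional checks that are not shown here.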

src/llama.cpp

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* llama : more robust logic for determining Meta devices
* cont : fix devs size check
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* cont : fix log type
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

llama : more robust logic for determining Meta devices

d703e39

JohannesGaessler approved these changes Mar 26, 2026


ggerganov and others added 2 commits March 26, 2026 15:03

cont : fix devs size check

964195c

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

cont : fix log type

717d449

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

JohannesGaessler merged commit f1a9b87 into JohannesGaessler:ggml-meta-backend-8 Mar 26, 2026

ggerganov deleted the pr/19378-gg-2 branch March 26, 2026 13:14

JohannesGaessler merged 3 commits into JohannesGaessler:ggml-meta-backend-8 from ggml-org:pr/19378-gg-2