llama : more robust logic for determining Meta devices#16

Merged
JohannesGaessler merged 3 commits into JohannesGaessler:ggml-meta-backend-8 from ggml-org:pr/19378-gg-2
Mar 26, 2026
@ggerganov

With this patch, we can combine RPC devices into the Meta device to achieve tensor parallelism over the network. The patch also logs which devices are skipped and which devices make up the constructed Meta device.

Sample output after the change:

0.00.006.276 W llama_model_load_from_file_impl: skipping BLAS (Accelerate) for tensor parallelism
0.00.006.279 W llama_model_load_from_file_impl: skipping CPU (Apple M2 Ultra) for tensor parallelism
0.00.007.490 I llama_model_load_from_file_impl: creating a Meta device for tensor parallelism from 4 devices:
0.00.007.492 I llama_model_load_from_file_impl: - device 0: MTL0 (Apple M2 Ultra)
0.00.007.493 I llama_model_load_from_file_impl: - device 1: MTL1 (Apple M2 Ultra)
0.00.007.493 I llama_model_load_from_file_impl: - device 2: RPC0 (127.0.0.1:50052)
0.00.007.493 I llama_model_load_from_file_impl: - device 3: RPC1 (127.0.0.1:50053)
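The selection behavior visible in the log above can be sketched roughly as follows. This is a minimal illustration, not the code from this patch: `dev_type`, `device_info`, `usable_for_tensor_parallelism`, `select_meta_devices`, and `print_meta_devices` are hypothetical names, and the rule "skip CPU and BLAS-style accelerator devices, keep GPU and RPC devices" is an assumption inferred only from the sample output.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the ggml/llama.cpp types.
enum class dev_type { cpu, accel, gpu, rpc };

struct device_info {
    std::string name;        // e.g. "MTL0", "RPC0"
    std::string description; // e.g. "Apple M2 Ultra", "127.0.0.1:50052"
    dev_type    type;
};

// Assumed rule, inferred only from the sample log: CPU and accelerator (BLAS)
// devices are skipped, GPU and RPC devices are kept.
bool usable_for_tensor_parallelism(const device_info & dev) {
    return dev.type == dev_type::gpu || dev.type == dev_type::rpc;
}

// Returns the devices that would be combined into the Meta device,
// logging every device that is left out.
std::vector<device_info> select_meta_devices(const std::vector<device_info> & all) {
    std::vector<device_info> selected;
    for (const auto & dev : all) {
        if (usable_for_tensor_parallelism(dev)) {
            selected.push_back(dev);
        } else {
            std::printf("skipping %s (%s) for tensor parallelism\n",
                        dev.name.c_str(), dev.description.c_str());
        }
    }
    return selected;
}

// Prints a summary of the selected devices, one line per device.
void print_meta_devices(const std::vector<device_info> & devs) {
    std::printf("creating a Meta device for tensor parallelism from %zu devices:\n",
                devs.size());
    for (std::size_t i = 0; i < devs.size(); ++i) {
        std::printf(" - device %zu: %s (%s)\n",
                    i, devs[i].name.c_str(), devs[i].description.c_str());
    }
}
```

Feeding this sketch the six devices from the log (BLAS, CPU, two Metal devices, two RPC endpoints) would leave the four Metal and RPC devices, in their original order.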

ggerganov and others added 2 commits March 26, 2026 15:03
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
@JohannesGaessler merged commit f1a9b87 into JohannesGaessler:ggml-meta-backend-8 Mar 26, 2026
@ggerganov deleted the pr/19378-gg-2 branch March 26, 2026 13:14
JohannesGaessler added a commit that referenced this pull request Mar 29, 2026
* llama : more robust logic for determining Meta devices

* cont : fix devs size check

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

* cont : fix log type

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>