metal : add env var to trigger graph capture#20398

Merged
ggerganov merged 1 commit into master from gg/metal-capture-env
Mar 11, 2026

ggerganov commented Mar 11, 2026

Quality-of-life improvement: adds an environment variable that triggers a capture of a Metal compute graph's execution for profiling purposes.

Usage:

# capture the 8th compute graph
METAL_CAPTURE_ENABLED=1 GGML_METAL_CAPTURE_COMPUTE=8 llama-completion ...

0.00.707.463 W ggml_metal_graph_compute: capturing graph in /tmp/perf-metal-47193.gputrace

Inspect in Xcode:

open /tmp/perf-metal-47193.gputrace
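When scripting the capture, the trace path can be read from the log line shown above rather than typed by hand. The following is a minimal sketch, not part of this PR: `extract_gputrace_path` is a hypothetical helper name, and it assumes the log message keeps the "capturing graph in <path>.gputrace" wording shown above.

```shell
# Hypothetical helper (not part of llama.cpp): reads log text on stdin and
# prints the first path announced by a "capturing graph in ..." line.
# Assumes the message format matches the example log line in this PR.
extract_gputrace_path() {
    sed -n 's/.*capturing graph in \(.*\.gputrace\).*/\1/p' | head -n 1
}

# Possible usage (assumes the log line reaches stdout or stderr):
#   trace=$(METAL_CAPTURE_ENABLED=1 GGML_METAL_CAPTURE_COMPUTE=8 llama-completion ... 2>&1 | extract_gputrace_path)
#   [ -n "$trace" ] && open "$trace"
```

The helper prints nothing when no capture line is present, so the `[ -n "$trace" ]` guard avoids calling `open` with an empty argument.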

Useful for detecting register spills:

[image]

Or for finding hot spots in the kernel implementations:

[image]

github-actions bot added the ggml and Apple Metal labels on Mar 11, 2026
ggerganov merged commit c363256 into master on Mar 11, 2026
16 of 75 checks passed
ggerganov deleted the gg/metal-capture-env branch on March 11, 2026 14:25
ProgenyAlpha pushed a commit to ProgenyAlpha/llama.cpp that referenced this pull request Mar 12, 2026