graph : reduce topology branching #18548

Merged: ggerganov merged 1 commit into master from gg/graph-avoid-branches on Jan 2, 2026
@ggerganov ggerganov commented Jan 2, 2026

ref #18547

Reduce the number of graph topology changes when switching between token and embedding inputs by always adding the scale node and using a factor of 1.0f for embedding inputs:

// before
if (ubatch.token) {
    inpL = ggml_scale(ctx0, inpL, sqrtf(n_embd));
    cb(inpL, "inp_scaled", -1);
}

// after
inpL = ggml_scale(ctx0, inpL, ubatch.token ? sqrtf(n_embd) : 1.0f);
cb(inpL, "inp_scaled", -1);
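
To illustrate the effect on topology, here is a minimal standalone sketch that does not use ggml. `ToyGraph`, `build_branching`, `build_uniform`, and `same_topology` are hypothetical names made up for this example, and the "graph" is only an ordered list of op names. It shows that the conditional version yields different op lists for token and embedding inputs, while the unconditional version yields the same op list and differs only in the scale factor.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical stand-in for a compute graph: the ordered list of op names
// plus the scale factor (a parameter of the scale op, not part of the topology).
struct ToyGraph {
    std::vector<std::string> ops;
    float scale = 1.0f;
};

// "before": the scale node is added only for token inputs.
ToyGraph build_branching(bool has_tokens, int n_embd) {
    ToyGraph g;
    g.ops.push_back("inp_embd");
    if (has_tokens) {
        g.ops.push_back("scale");
        g.scale = std::sqrt((float) n_embd);
    }
    g.ops.push_back("layer_0");
    return g;
}

// "after": the scale node is always added; only its factor depends on the input type.
ToyGraph build_uniform(bool has_tokens, int n_embd) {
    ToyGraph g;
    g.ops.push_back("inp_embd");
    g.ops.push_back("scale");
    g.scale = has_tokens ? std::sqrt((float) n_embd) : 1.0f;
    g.ops.push_back("layer_0");
    return g;
}

// Two toy graphs share a topology when their op lists are identical.
bool same_topology(const ToyGraph & a, const ToyGraph & b) {
    return a.ops == b.ops;
}

// Self-check: the uniform builder produces the same op list for both input types.
void check_uniform_is_stable(int n_embd) {
    assert(same_topology(build_uniform(true, n_embd), build_uniform(false, n_embd)));
}
```

In this toy model the cost of the "after" form is one extra scale-by-1.0f op for embedding inputs, in exchange for an op list that no longer depends on the input type.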

@github-actions github-actions bot added the model Model specific label Jan 2, 2026
@ggerganov ggerganov merged commit af1e8e1 into master Jan 2, 2026
71 checks passed
@ggerganov ggerganov deleted the gg/graph-avoid-branches branch January 2, 2026 17:02
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
3 participants