Be able to use imatrix computed with merged ffn_gate_up_exps #1419

Merged
ikawrakow merged 2 commits into main from ik/quantize_fused_up_gate on Mar 13, 2026
Conversation

@ikawrakow (Owner) commented Mar 13, 2026

This PR is a sibling of #1418.

If one has an imatrix that was computed using a model with merged ffn_gate_up_exps tensors, but the model to be quantized has separate ffn_up_exps and ffn_gate_exps tensors, this PR allows the imatrix to still be used. The ffn_up_exps, ffn_gate_exps, and merged ffn_gate_up_exps tensors all "see" exactly the same activations, so the imatrix data collected for ffn_gate_up_exps can also be applied to ffn_up_exps and ffn_gate_exps. The reverse case is supported as well: an imatrix computed with separate ffn_up_exps and ffn_gate_exps tensors can now be used to quantize a model with merged ffn_gate_up_exps tensors.
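The idea above can be sketched as a name remapping over the imatrix entries. This is a minimal illustrative sketch, not the actual ik_llama.cpp implementation: the tensor-name patterns follow the usual GGUF naming convention, and the dict-of-lists representation of the imatrix is an assumption made for illustration.

```python
def remap_imatrix(imatrix: dict, want_merged: bool) -> dict:
    """Reuse imatrix activation data across merged/split FFN expert tensors.

    Since ffn_up_exps, ffn_gate_exps, and the merged ffn_gate_up_exps
    all multiply the same input activations, their imatrix rows are
    interchangeable; we only need to duplicate entries under the names
    the target model expects.
    """
    out = dict(imatrix)
    for name, data in imatrix.items():
        if want_merged and ".ffn_up_exps." in name:
            # imatrix was computed with split tensors, model is merged:
            # reuse the ffn_up_exps activations for the merged tensor
            out[name.replace(".ffn_up_exps.", ".ffn_gate_up_exps.")] = data
        elif not want_merged and ".ffn_gate_up_exps." in name:
            # imatrix was computed with a merged tensor, model is split:
            # the same activations serve both split tensors
            out[name.replace(".ffn_gate_up_exps.", ".ffn_up_exps.")] = data
            out[name.replace(".ffn_gate_up_exps.", ".ffn_gate_exps.")] = data
    return out
```

For example, an entry for `blk.0.ffn_gate_up_exps.weight` would be copied to both `blk.0.ffn_up_exps.weight` and `blk.0.ffn_gate_exps.weight` when quantizing a split-tensor model.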

@ubergarm (Contributor) commented:

Thanks for holding the world together a little longer!!
