Add AfmoeForCausalLM support #16477

Merged
CISC merged 10 commits into ggml-org:master from bartowski1182:master
Nov 14, 2025

Conversation

@bartowski1182
Contributor

Adds support for upcoming AfmoeForCausalLM

Tokenizer is public ahead of model launch to avoid breaking conversion code

@bartowski1182 bartowski1182 marked this pull request as ready for review November 13, 2025 17:36
bartowski1182 and others added 3 commits November 13, 2025 18:03
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Comment on lines +14 to +15
inpL = ggml_scale(ctx0, inpL, sqrtf(float(n_embd)));
cb(inpL, "inp_embd_scaled", -1);
Contributor

not very important to fix right now, but if the model supports multimodal in the future, you may need to skip scaling if the input is non-text:

// important: do not normalize weights for raw embeddings input (i.e. encoded image embeddings)
if (ubatch.token) {
    inpL = ggml_scale(ctx0, inpL, sqrtf(n_embd));
    cb(inpL, "inp_scaled", -1);
}
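
To illustrate the suggestion above outside of ggml, here is a minimal standalone sketch. It is not the llama.cpp implementation: `ToyBatch`, `has_tokens`, and `scale_input_embeddings` are hypothetical names introduced only for this example, with `has_tokens` standing in for the `ubatch.token` check. It only shows the idea that the `sqrt(n_embd)` scaling applies to embeddings looked up from token ids and is skipped for raw embedding input.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a batch (not a llama.cpp type).
// has_tokens plays the role of the `ubatch.token` check in the review comment:
// true when the input came from token ids, false for raw embeddings.
struct ToyBatch {
    bool has_tokens;
    std::vector<float> embd; // flattened input embeddings
};

// Scale embeddings by sqrt(n_embd), but only for token-derived input.
void scale_input_embeddings(ToyBatch & batch, int n_embd) {
    assert(n_embd > 0);
    if (!batch.has_tokens) {
        return; // raw embeddings (e.g. encoded image embeddings) stay untouched
    }
    const float scale = std::sqrt(static_cast<float>(n_embd));
    for (float & v : batch.embd) {
        v *= scale;
    }
}
```

With `n_embd = 4`, a token-derived batch is multiplied by 2, while a batch flagged as raw embeddings passes through unchanged.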

Contributor Author

Interesting, good to know, thanks!

@CISC
Member

CISC commented Nov 14, 2025

@bartowski1182 @ggerganov Merging in a little while unless you have anything more to add.

@CISC CISC merged commit e1fcf8b into ggml-org:master Nov 14, 2025
76 checks passed
Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026
* Add AFMOE model support

* Update to vocab

* Add model sizing

* Undo Rope change for ARCEE model

* Address review comments

* Update modeling code is_sliding -> use_rope, replace hard-coded logic

* Fix AFMOE tokenizer

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update AFMoE tokenizer class identification to be more unique

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
Labels

model (Model specific), python (python script changes)

3 participants