Add support for Tiny Aya Models #19611
Conversation
// tiny_aya digit grouping pattern from tokenizer.json:
// {"type": "Split", "pattern": {"Regex": "\\d{1,3}(?=(?:\\d{3})*\\b)"}, "behavior": "Isolated"}
// Splits digits into groups of 3 from the right (e.g., 1234567 -> 1, 234, 567)
bpe_offsets = unicode_regex_split_custom_afmoe(text, offsets);
These are not exactly the same, though; there may be subtle tokenization differences.
Hmm, right. @saurabhdash2512 I didn't notice this regex is different. If you prefer fixing this later, we can leave a TODO here and come back after the model is released.
I tested this with random strings and didn't see any differences, but added a comment in case there are any edge cases I missed.
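For anyone wanting to sanity-check the grouping behavior outside llama.cpp, here is a minimal Python sketch of the tokenizer.json pattern quoted above. This is not the llama.cpp implementation (that lives in the custom C++ split function); it only illustrates how the lookahead anchors the 1-to-3-digit groups to the right end of a digit run.

```python
import re

# Pattern from the tiny_aya tokenizer.json quoted in this thread:
# \d{1,3} matches 1-3 digits, and the lookahead (?=(?:\d{3})*\b)
# requires that the remaining digits form complete groups of 3,
# so the grouping is counted from the right.
PATTERN = re.compile(r"\d{1,3}(?=(?:\d{3})*\b)")

print(PATTERN.findall("1234567"))    # -> ['1', '234', '567']
print(PATTERN.findall("12"))         # -> ['12']
print(PATTERN.findall("price 1000")) # -> ['1', '000']
```

Note that Python's `re` lookahead semantics may not match the Rust `regex`-based engine used by Hugging Face tokenizers in every edge case, which is exactly the concern raised above.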
* upstream/master: (88 commits)
  * ci : bump komac version (ggml-org#19682)
  * build : link ws2_32 as PUBLIC on Windows (ggml-org#19666)
  * build : cleanup library linking logic (ggml-org#19665)
  * convert : add JoyAI-LLM-Flash (ggml-org#19651)
  * perplexity: add proper batching (ggml-org#19661)
  * common : inline functions (ggml-org#18639)
  * ggml : make `ggml_is_view` as API (ggml-org#19539)
  * model: Add support for Tiny Aya Models (ggml-org#19611)
  * build : rework llama_option_depr to handle LLAMA_CURL (ggml-org#19658)
  * Adjust workaround for ROCWMMA_FATTN/GFX9 to only newer ROCm versions (ggml-org#19591)
  * models : deduplicate delta-net graphs for Qwen family (ggml-org#19597)
  * graph : fix KQ mask, lora, cvec reuse checks (ggml-org#19644)
  * ggml: aarch64: Implement SVE in Gemm q4_k 8x8 q8_k Kernel (ggml-org#19132)
  * sync : ggml
  * ggml : bump version to 0.9.7 (ggml/1425)
  * ggml : bump version to 0.9.6 (ggml/1423)
  * cuda: optimize iq2xxs/iq2xs/iq3xxs dequantization (ggml-org#19624)
  * docs: update s390x build docs (ggml-org#19643)
  * build : remove LLAMA_HTTPLIB option (ggml-org#19623)
  * cmake : check if KleidiAI API has been fetched (ggml-org#19640)
  * ...
@saurabhdash2512 this only works for the base model; all the others use a different tokenizer.
shasum tokenizer.json CohereLabs/tiny-aya-earth |
Looks like it wasn't a problem for Cohere nor DevQuasar to make GGUFs? The
@CISC good to know! Unfortunately, it does throw an error: |
Forgot that it's gated, here they are: tokenizer-base.json. It's strange, because it does appear that they match.
Yep, those are for all purposes identical, check
You're right, those are quite different. |
Oh dear, that would do it:
- "tokenizer_class": "CohereTokenizer",
+ "tokenizer_class": "CohereTokenizerFast",
* changes for tiny aya
* changes to hash
* changes to vocab
* fix some tokenizer regex edge cases
* update comment
* add some comments for regex
* Apply suggestion from @ngxson

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Summary
This PR adds native support for the CohereLabs/tiny-aya family of models in llama.cpp. These models use a distinct BPE pre-tokenizer (tiny_aya) with a custom digit-grouping regex.
Tagging @ngxson for visibility.