model : Add tokenizer from LFM2.5-Audio-1.5B by tdakhran · Pull Request #19687 · ggml-org/llama.cpp

tdakhran · 2026-02-17T14:21:52Z

LFM2.5-Audio-1.5B introduced a lightweight audio tokenizer.
It is based on the LFM2 architecture and serves as an embedding model with different input n_embd and output n_embd_out.

To be used in #18641.

To convert, use:

    python3 convert_hf_to_gguf.py /path/to/LFM2.5-Audio-1.5B/audio_detokenizer
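For readers unfamiliar with the idea of an "embedding" model whose input and output widths differ, the following is a toy sketch of that shape contract only. It is not the LFM2 architecture: the `project` helper, the sizes, and the weight matrix are all hypothetical, chosen so the example is easy to check by hand.

```python
from typing import List

Vector = List[float]


def project(embeddings: List[Vector], weight: List[Vector]) -> List[Vector]:
    """Map each input vector of width n_embd to an output vector of width n_embd_out.

    `weight` has n_embd_out rows and n_embd columns (hypothetical layout).
    """
    n_embd = len(weight[0])
    outputs = []
    for vec in embeddings:
        if len(vec) != n_embd:
            raise ValueError(f"expected width {n_embd}, got {len(vec)}")
        # One dot product per output dimension.
        outputs.append([sum(w * x for w, x in zip(row, vec)) for row in weight])
    return outputs


# Hypothetical sizes; the real values come from the checkpoint config.
n_embd, n_embd_out = 4, 2
weight = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
frames = [[0.5, -1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]]
outputs = project(frames, weight)
```

The point is only that the model consumes vectors of one width and emits vectors of another, with no token IDs involved at either end.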

CISC · 2026-02-17T14:52:10Z

I see the layer_types content differs in the config; shouldn't this change be included?
https://github.com/ggml-org/llama.cpp/pull/18641/files#diff-ec77d8003b92ff283179456d36b8b56abf635e7b1232e70daf16676e8920ccf1L10299-R10299

tdakhran · 2026-02-17T15:14:01Z

> I see layer_types content differs in the config, should not this change be included? https://github.com/ggml-org/llama.cpp/pull/18641/files#diff-ec77d8003b92ff283179456d36b8b56abf635e7b1232e70daf16676e8920ccf1L10299-R10299

You are absolutely right! I had multiple checkpoints and tested with the wrong one. Will fix ASAP.

tdakhran · 2026-02-17T15:17:31Z

Thank you, @CISC. I pushed the fix, and it now works correctly with the upstream HF checkpoint:

INFO:hf-to-gguf:gguf: key-value head count = [0, 0, 8, 0, 8, 0, 8, 0]
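A plausible reading of this log line is that layers without attention report zero key-value heads, while attention layers report the configured count. The sketch below shows how such a per-layer list could be derived from `layer_types`. The layer type names, the `kv_heads_per_layer` helper, and the example config are assumptions for illustration, not taken from the checkpoint or the converter.

```python
from typing import List, Sequence


def kv_heads_per_layer(
    layer_types: Sequence[str],
    n_kv_heads: int,
    attention_types: Sequence[str] = ("full_attention", "sliding_attention"),
) -> List[int]:
    """Return the KV head count for each layer: n_kv_heads for attention layers, 0 otherwise."""
    return [n_kv_heads if t in attention_types else 0 for t in layer_types]


# Hypothetical config values, chosen to reproduce the pattern in the log line.
layer_types = [
    "conv", "conv", "sliding_attention", "conv",
    "sliding_attention", "conv", "sliding_attention", "conv",
]
head_counts = kv_heads_per_layer(layer_types, n_kv_heads=8)
print(head_counts)  # [0, 0, 8, 0, 8, 0, 8, 0]
```

If a check like this only recognised one attention type name, a checkpoint using a different name would yield all zeros, which is the kind of mismatch a `layer_types` difference between checkpoints could cause.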

tdakhran · 2026-02-17T15:18:45Z

Tokenizer in this checkpoint is for conversion compatibility only. If there is a way to convert without a tokenizer, I can remove it.

tdakhran · 2026-02-17T16:33:17Z

Will test it a bit more.

CISC · 2026-02-17T18:16:17Z

> Tokenizer in this checkpoint is for conversion compatibility only. If there is a way to convert without a tokenizer, I can remove it.

I'm not sure I follow. Truthfully I haven't looked into the other PR, so I'm sure I'm missing some details. :)

tdakhran · 2026-02-17T19:29:27Z

> > Tokenizer in this checkpoint is for conversion compatibility only. If there is a way to convert without a tokenizer, I can remove it.
>
> I'm not sure I follow. Truthfully I haven't looked into the other PR, so I'm sure I'm missing some details. :)

The audio tokenizer is a model that takes embeddings as inputs and produces embeddings as output. I mean that it doesn't need vocab and tokenizer, because it doesn't have "tokens".

@CISC, thanks for your patience. I hadn't pushed the whole set of required runtime changes; I somehow thought they were already merged.

I pushed them in commit e3ae935; please re-review.

CISC · 2026-02-17T20:55:38Z

> The audio tokenizer is a model that takes embeddings as inputs and produces embeddings as output. I mean that it doesn't need vocab and tokenizer, because it doesn't have "tokens".

I see. It may require a few changes to allow such a model to load; the conversion part should be simple, though. Maybe @ggerganov has some ideas?

tdakhran · 2026-02-17T21:22:30Z

looks like this did the job

    def set_vocab(self):
        self._set_vocab_none()
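As a rough sketch of what this override achieves: the assumption here is that `_set_vocab_none` records the tokenizer model as "none" under the GGUF key `tokenizer.ggml.model`, so no vocabulary has to be written. The stub classes below are hypothetical stand-ins, not the converter's real classes; they only record what would be written.

```python
class StubWriter:
    """Hypothetical stand-in for the converter's GGUF writer; it only records key/value pairs."""

    def __init__(self) -> None:
        self.kv = {}

    def add_tokenizer_model(self, name: str) -> None:
        self.kv["tokenizer.ggml.model"] = name


class StubAudioTokenizerModel:
    """Hypothetical model class showing the override pattern from the comment above."""

    def __init__(self) -> None:
        self.gguf_writer = StubWriter()

    def _set_vocab_none(self) -> None:
        # Assumption: mark the file as having no tokenizer instead of exporting a vocab.
        self.gguf_writer.add_tokenizer_model("none")

    def set_vocab(self) -> None:
        self._set_vocab_none()


model = StubAudioTokenizerModel()
model.set_vocab()
```

After the call, the only tokenizer metadata recorded by the stub is the "none" marker, which matches the idea that a model without "tokens" needs no vocab.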

CISC · 2026-02-18T08:00:25Z

> looks like this did the job
>
>     def set_vocab(self):
>         self._set_vocab_none()

Oh, right, forgot that was already a thing. :D

CISC · 2026-02-18T08:09:27Z

-                                               ggml_tensor *             inp_pos,
-                                               llm_graph_input_attn_kv * inp_attn,
-                                               int                       il) const {
+template <bool iswa> ggml_tensor * llm_build_lfm2<iswa>::build_attn_block(ggml_tensor *   cur,


Can we perhaps move all these out into a base class in a separate file so we don't build duplicates?

tdakhran · 2026-02-18

I went the lazy way and just moved them to inline lambdas in 0b6e909. Please let me know if this works. Also rebased.

CISC · 2026-02-18

Hmmm, ok, I missed that they both used the conditional types so I guess it's unavoidable.

tdakhran · 2026-02-19T08:32:02Z

@CISC , please let me know if anything is needed to proceed with the merge.

CISC · 2026-02-19T08:53:59Z

> @CISC , please let me know if anything is needed to proceed with the merge.

Sorry, should be good, just juggling a lot of PRs. :)

* model : Add tokenizer from LFM2.5-Audio-1.5B

  [LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) introduced lightweight audio tokenizer. Tokenizer based on LFM2 architecture and acts as "embedding" model with different input `n_embd` and output `n_embd_out`.

  To be used in ggml-org#18641.

  To convert use

  ```shell
  python3 convert_hf_to_gguf.py /path/to/LFM2.5-Audio-1.5B/audio_detokenizer
  ```

* Update convert_hf_to_gguf.py

  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Formatting
* Rework check for attention layers
* Add LFM2 SWA model support
* Address PR feedback
* Set vocab to none
* Move helper function definitions to cpp file

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

tdakhran requested a review from CISC as a code owner February 17, 2026 14:21

tdakhran mentioned this pull request Feb 17, 2026

[Do Not Merge] model : LFM2.5-Audio-1.5B #18641

Draft

6 tasks

CISC approved these changes Feb 17, 2026

Comment thread convert_hf_to_gguf.py Outdated

github-actions Bot added the python python script changes label Feb 17, 2026

tdakhran marked this pull request as draft February 17, 2026 15:14

tdakhran marked this pull request as ready for review February 17, 2026 15:16

tdakhran marked this pull request as draft February 17, 2026 16:32

github-actions Bot added the model Model specific label Feb 17, 2026

tdakhran marked this pull request as ready for review February 17, 2026 19:29

tdakhran requested a review from CISC February 17, 2026 19:29

CISC reviewed Feb 17, 2026

Comment thread src/models/lfm2.cpp Outdated

Comment thread src/models/lfm2.cpp

CISC reviewed Feb 18, 2026

tdakhran and others added 8 commits February 18, 2026 10:08

Update convert_hf_to_gguf.py

ff8c4df

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

Formatting

8ee35bf

Rework check for attention layers

e6efc19

Add LFM2 SWA model support

1ceca7c

Address PR feedback

f6c250a

Set vocab to none

b38df66

Move helper function definitions to cpp file

0b6e909

tdakhran force-pushed the tarek/feat/upstream-audio-tokenizer branch from 7f1114b to 0b6e909 Compare February 18, 2026 09:08

CISC merged commit 8004f3a into ggml-org:master Feb 19, 2026
81 of 82 checks passed

tdakhran deleted the tarek/feat/upstream-audio-tokenizer branch February 19, 2026 09:17
