model : Add tokenizer from LFM2.5-Audio-1.5B #19687

Merged
CISC merged 8 commits into ggml-org:master from tdakhran:tarek/feat/upstream-audio-tokenizer on Feb 19, 2026

Conversation

@tdakhran

Contributor

LFM2.5-Audio-1.5B introduced a lightweight audio tokenizer.
It is based on the LFM2 architecture and serves as an embedding model whose input embedding size (`n_embd`) differs from its output embedding size (`n_embd_out`).
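
As a rough illustration of that shape contract, here is a minimal Python sketch. It is not the LFM2 architecture: a single random linear map stands in for the whole model, and the sizes 4 and 6 as well as the function names are assumptions made up for this example. The only point is that each input vector of length `n_embd` comes out as a vector of length `n_embd_out`, with no token IDs or vocabulary involved.

```python
import random


def make_projection(n_embd, n_embd_out, seed=0):
    """Build a random (n_embd_out x n_embd) matrix; a stand-in for the real model."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_embd)] for _ in range(n_embd_out)]


def embed_to_embed(frames, weights):
    """Map each input embedding (length n_embd) to an output embedding (length n_embd_out)."""
    return [
        [sum(w * x for w, x in zip(row, frame)) for row in weights]
        for frame in frames
    ]


# Hypothetical sizes, chosen only for illustration.
n_embd, n_embd_out = 4, 6
weights = make_projection(n_embd, n_embd_out)

# Three input embeddings in, three output embeddings out.
frames = [[0.1 * (i + j) for j in range(n_embd)] for i in range(3)]
outputs = embed_to_embed(frames, weights)

print(len(frames), len(frames[0]))    # 3 4
print(len(outputs), len(outputs[0]))  # 3 6
```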

To be used in #18641.

To convert use

python3 convert_hf_to_gguf.py /path/to/LFM2.5-Audio-1.5B/audio_detokenizer

github-actions bot added the python (python script changes) label on Feb 17, 2026
@CISC

CISC commented Feb 17, 2026

Member

I see that the `layer_types` content differs in the config. Shouldn't this change be included?
https://github.com/ggml-org/llama.cpp/pull/18641/files#diff-ec77d8003b92ff283179456d36b8b56abf635e7b1232e70daf16676e8920ccf1L10299-R10299

@tdakhran

Contributor Author

> I see that the `layer_types` content differs in the config. Shouldn't this change be included? https://github.com/ggml-org/llama.cpp/pull/18641/files#diff-ec77d8003b92ff283179456d36b8b56abf635e7b1232e70daf16676e8920ccf1L10299-R10299

You are absolutely right! I had multiple checkpoints and tested with the wrong one. Will fix ASAP.

tdakhran marked this pull request as draft on February 17, 2026 at 15:14
tdakhran marked this pull request as ready for review on February 17, 2026 at 15:16
@tdakhran

Contributor Author

Thank you, @CISC. I pushed the fix, and conversion now works correctly with the upstream HF checkpoint:

INFO:hf-to-gguf:gguf: key-value head count = [0, 0, 8, 0, 8, 0, 8, 0]
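
The logged list appears to hold one entry per layer, with a non-zero head count on attention layers and 0 elsewhere. A minimal sketch of how such a list can be derived from `layer_types` follows. The layer type names and the example `layer_types` value are assumptions chosen to reproduce the log line above, not values read from the actual checkpoint config, and `kv_heads_per_layer` is a hypothetical helper rather than a function from `convert_hf_to_gguf.py`.

```python
def kv_heads_per_layer(layer_types, n_kv_heads,
                       attention_types=("full_attention", "sliding_attention")):
    """Return n_kv_heads for attention layers and 0 for every other layer type."""
    return [n_kv_heads if t in attention_types else 0 for t in layer_types]


# Hypothetical config values, picked so the result matches the log line above.
layer_types = [
    "conv", "conv", "sliding_attention", "conv",
    "sliding_attention", "conv", "sliding_attention", "conv",
]

print(kv_heads_per_layer(layer_types, n_kv_heads=8))
# [0, 0, 8, 0, 8, 0, 8, 0]
```

If the wrong checkpoint's `layer_types` is used, the zeros and non-zeros land on the wrong layers, which is the kind of mismatch the review comment above caught.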

@tdakhran

Contributor Author

The tokenizer in this checkpoint is for conversion compatibility only. If there is a way to convert without a tokenizer, I can remove it.

tdakhran marked this pull request as draft on February 17, 2026 at 16:32
@tdakhran

Contributor Author

Will test it a bit more.

@CISC

CISC commented Feb 17, 2026

Member

> The tokenizer in this checkpoint is for conversion compatibility only. If there is a way to convert without a tokenizer, I can remove it.

I'm not sure I follow. Truthfully I haven't looked into the other PR, so I'm sure I'm missing some details. :)

github-actions bot added the model (Model specific) label on Feb 17, 2026
@tdakhran

Contributor Author

> > The tokenizer in this checkpoint is for conversion compatibility only. If there is a way to convert without a tokenizer, I can remove it.
>
> I'm not sure I follow. Truthfully I haven't looked into the other PR, so I'm sure I'm missing some details. :)

The audio tokenizer is a model that takes embeddings as input and produces embeddings as output. It doesn't need a vocab or a tokenizer because it doesn't have "tokens".

@CISC, thanks for your patience. I hadn't pushed the whole set of required runtime changes; I somehow thought they were already merged.

I pushed them in commit e3ae935; please re-review.

tdakhran marked this pull request as ready for review on February 17, 2026 at 19:29
tdakhran requested a review from CISC on February 17, 2026 at 19:29
@CISC

CISC commented Feb 17, 2026

Member

> The audio tokenizer is a model that takes embeddings as input and produces embeddings as output. It doesn't need a vocab or a tokenizer because it doesn't have "tokens".

I see. That may require you to make a few changes to allow such a model to load; the conversion part should be simple, though. Maybe @ggerganov has some ideas?

@tdakhran

Contributor Author

Looks like this did the job:

    def set_vocab(self):
        self._set_vocab_none()
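
For context, here is a self-contained sketch of what this override amounts to. `FakeWriter`, `BaseConverter`, and `AudioTokenizerConverter` are hypothetical stand-ins, not classes from `convert_hf_to_gguf.py`, and the assumption that `_set_vocab_none` simply records the tokenizer model as `"none"` under the `tokenizer.ggml.model` key should be checked against the real script.

```python
class FakeWriter:
    """Hypothetical stand-in for a GGUF writer: just records key/value metadata."""

    def __init__(self):
        self.kv = {}

    def add_tokenizer_model(self, name):
        self.kv["tokenizer.ggml.model"] = name


class BaseConverter:
    """Hypothetical stand-in for the converter base class."""

    def __init__(self):
        self.gguf_writer = FakeWriter()

    def _set_vocab_none(self):
        # Assumption: no vocabulary is written, only a "none" tokenizer marker.
        self.gguf_writer.add_tokenizer_model("none")

    def set_vocab(self):
        raise NotImplementedError("text models must provide a vocabulary")


class AudioTokenizerConverter(BaseConverter):
    """Embedding-in, embedding-out model: it has no tokens, so no vocab is needed."""

    def set_vocab(self):
        self._set_vocab_none()


converter = AudioTokenizerConverter()
converter.set_vocab()
print(converter.gguf_writer.kv)  # {'tokenizer.ggml.model': 'none'}
```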

@CISC

CISC commented Feb 18, 2026

Member

> Looks like this did the job:
>
>     def set_vocab(self):
>         self._set_vocab_none()

Oh, right, forgot that was already a thing. :D

Comment thread src/models/lfm2.cpp Outdated
ggml_tensor * inp_pos,
llm_graph_input_attn_kv * inp_attn,
int il) const {
template <bool iswa> ggml_tensor * llm_build_lfm2<iswa>::build_attn_block(ggml_tensor * cur,

Member

Can we perhaps move all these out into a base class in a separate file so we don't build duplicates?

Contributor Author

I went the lazy way and just moved them to inline lambdas in 0b6e909. Please let me know if this works. I also rebased.

Member

Hmmm, ok, I missed that they both used the conditional types so I guess it's unavoidable.

tdakhran and others added 8 commits on February 18, 2026 at 10:08
[LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) introduced a lightweight audio tokenizer.

The tokenizer is based on the LFM2 architecture and acts as an "embedding" model with
different input `n_embd` and output `n_embd_out`.

To be used in ggml-org#18641.

To convert use

```shell
python3 convert_hf_to_gguf.py /path/to/LFM2.5-Audio-1.5B/audio_detokenizer
```
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
tdakhran force-pushed the tarek/feat/upstream-audio-tokenizer branch from 7f1114b to 0b6e909 on February 18, 2026 at 09:08
@tdakhran

Contributor Author

@CISC, please let me know if anything is needed to proceed with the merge.

@CISC

CISC commented Feb 19, 2026

Member

> @CISC, please let me know if anything is needed to proceed with the merge.

Sorry, should be good, just juggling a lot of PRs. :)

CISC merged commit 8004f3a into ggml-org:master on Feb 19, 2026
81 of 82 checks passed
tdakhran deleted the tarek/feat/upstream-audio-tokenizer branch on February 19, 2026 at 09:17
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
* model : Add tokenizer from LFM2.5-Audio-1.5B

[LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) introduced a lightweight audio tokenizer.

The tokenizer is based on the LFM2 architecture and acts as an "embedding" model with
different input `n_embd` and output `n_embd_out`.

To be used in ggml-org#18641.

To convert use

```shell
python3 convert_hf_to_gguf.py /path/to/LFM2.5-Audio-1.5B/audio_detokenizer
```

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Formatting

* Rework check for attention layers

* Add LFM2 SWA model support

* Address PR feedback

* Set vocab to none

* Move helper function definitions to cpp file

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Mar 3, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request May 15, 2026
my-other-github-account pushed a commit to my-other-github-account/llama.cpp that referenced this pull request May 15, 2026
fewtarius pushed a commit to fewtarius/llama.cpp that referenced this pull request May 30, 2026
Labels: model (Model specific), python (python script changes)
