llama : the WPM vocabs use the CLS token as BOS #10930

Merged
ggerganov merged 2 commits into master from gg/wpm-fix-bos on Dec 24, 2024
Conversation

@ggerganov (Member) commented Dec 21, 2024

https://github.com/ggerganov/llama.cpp/blob/9d5c7115879fe33828edf3a1892b597b61c1cd7d/src/llama-vocab.cpp#L1529-L1535

Not sure if this separation of BOS and CLS tokens is really necessary. The alternative would be to remove the notion of CLS tokens from the codebase and work only with BOS tokens, which seems much simpler. The potential problem is if there are models whose BOS and CLS tokens are both present and distinct.

@ggerganov ggerganov merged commit 30caac3 into master Dec 24, 2024
@ggerganov ggerganov deleted the gg/wpm-fix-bos branch December 24, 2024 07:44
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Feb 26, 2025
* llama : the WPM vocabs use the CLS token as BOS

ggml-ci

* llama : add comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
