Deepseek R1 tokenization support by pcuenca · Pull Request #159 · huggingface/swift-transformers

pcuenca · 2025-01-24T13:38:35Z

The new test does not pass for some reason. Is there anything you think I'm doing wrong @DePasqualeOrg?

DePasqualeOrg · 2025-01-24T15:19:03Z

The encoded tokens from the test look like this:

[4913, 70398, 788, 895, 11, 13265, 1313, 788, 330, 19337, 3323, 497, 330, 38460, 788, 830, 11, 330, 75, 13105, 788, 895, 11, 330, 1796, 788, 330, 151646, 497, 330, 15338, 13533, 788, 895, 92, 151644, 74785, 279, 23670, 15473, 4128, 13, 151645]

And the test target looks like this:

[151646, 151644, 74785, 279, 23670, 15473, 4128, 13, 151645]

But even when changing the test target to the actually encoded tokens, it still crashes, so I still need to investigate. In any case, I have already verified that the DeepSeek models work with the latest Jinja.
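To make the mismatch between the two arrays concrete, here is a small sketch that compares them from the end. The helper name `commonSuffixLength` is made up for this illustration and is not part of swift-transformers. The token IDs are copied from the comment above.

```swift
// Token IDs copied from the comment above.
let encoded: [Int] = [
    4913, 70398, 788, 895, 11, 13265, 1313, 788, 330, 19337, 3323, 497, 330,
    38460, 788, 830, 11, 330, 75, 13105, 788, 895, 11, 330, 1796, 788, 330,
    151646, 497, 330, 15338, 13533, 788, 895, 92,
    151644, 74785, 279, 23670, 15473, 4128, 13, 151645,
]
let target: [Int] = [151646, 151644, 74785, 279, 23670, 15473, 4128, 13, 151645]

/// Hypothetical helper: length of the longest common suffix of two arrays.
func commonSuffixLength(_ a: [Int], _ b: [Int]) -> Int {
    var count = 0
    for (x, y) in zip(a.reversed(), b.reversed()) {
        guard x == y else { break }
        count += 1
    }
    return count
}

let shared = commonSuffixLength(encoded, target)
// Whatever precedes the shared suffix is where the two sequences disagree.
let encodedPrefix = Array(encoded.dropLast(shared))
let targetPrefix = Array(target.dropLast(shared))

print("shared suffix:", shared)                       // 8
print("encoded prefix length:", encodedPrefix.count)  // 35
print("target prefix:", targetPrefix)                 // [151646]
```

The two sequences agree on the last 8 tokens. They differ only at the start, where the target has the single ID 151646 and the encoded output has 35 tokens (which themselves include 151646).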

pcuenca · 2025-01-24T16:30:23Z

The test targets were obtained from the Python tokenizer; they correspond to <｜begin▁of▁sentence｜><｜User｜>Describe the Swift programming language.<｜Assistant｜>. The problem here is that the bos_token is passed in the template context as a dictionary, not a String, so the result of applyChatTemplate is incorrect, as the test shows.

Looking into it.
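As a rough illustration of the dictionary-versus-String issue described above, the sketch below normalizes a special-token entry that may be either a plain string or a serialized AddedToken dictionary. The function `tokenString(from:)` is hypothetical and is not the swift-transformers implementation. The dictionary keys follow the usual serialized AddedToken layout in `tokenizer_config.json`, which is assumed here rather than taken from this PR.

```swift
/// Hypothetical helper (not the library's API): returns the token text
/// whether the config stores it as a String or as a serialized AddedToken.
func tokenString(from value: Any?) -> String? {
    if let string = value as? String {
        return string
    }
    if let dictionary = value as? [String: Any] {
        // Assumption: the token text lives under the "content" key.
        return dictionary["content"] as? String
    }
    return nil
}

// A plain string entry.
let plain: Any = "<｜begin▁of▁sentence｜>"

// A serialized AddedToken entry (assumed layout).
let serialized: [String: Any] = [
    "__type": "AddedToken",
    "content": "<｜begin▁of▁sentence｜>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false,
]

print(tokenString(from: plain) ?? "nil")
print(tokenString(from: serialized) ?? "nil")
```

Both calls yield the same token text. If a chat template only ever received a value normalized this way, it would see a String regardless of how the config serialized the token.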

DePasqualeOrg · 2025-01-24T16:31:52Z

As you can see here, the prompt from the test is being encoded correctly, and there are no problems interacting with the model, but as soon as you call decode on the encoded tokens, it crashes. I think the problem must be somewhere in swift-transformers. Perhaps it's a text encoding issue. I noticed that spaces are getting encoded as an unusual character.

@pcuenca, since you're more familiar with the library's internals than me, perhaps you have a better intuition about how to approach the solution. My initial attempts at solutions with Sonnet and the entire library as context were unsuccessful.
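On the "unusual character" for spaces: the thread does not say which character was observed. One possible explanation, assuming the tokenizer is a GPT-2 style byte-level BPE, is the byte-to-character table such tokenizers use, which maps the space byte to `Ġ` (U+0120). The sketch below rebuilds that table. The function name `byteToUnicode` is chosen for this illustration and is not taken from swift-transformers.

```swift
/// Sketch of the GPT-2 style byte-to-character table used by byte-level BPE.
/// Printable bytes map to themselves; the rest are shifted to U+0100 and up.
func byteToUnicode() -> [UInt8: Character] {
    var table: [UInt8: Character] = [:]
    var next: UInt32 = 256
    for byte in UInt8.min...UInt8.max {
        let printable = (33...126).contains(byte)
            || (161...172).contains(byte)
            || (174...255).contains(byte)
        if printable {
            table[byte] = Character(Unicode.Scalar(byte))
        } else {
            // Non-printable bytes get the next unused code point from 256 on.
            table[byte] = Character(Unicode.Scalar(next)!)
            next += 1
        }
    }
    return table
}

let table = byteToUnicode()
print(table[32]!)  // space byte
print(table[10]!)  // newline byte
```

Under this mapping a space inside a token appears as `Ġ` and a newline as `Ċ`. That is expected for byte-level vocabularies and would not by itself indicate a text encoding bug.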

Serialized AddedToken class partially supported (in addition to String values)

pcuenca · 2025-01-24T18:53:16Z

Package.swift

    dependencies: [
        .package(url: "https://github.com/apple/swift-argument-parser.git", from: "1.4.0"),
-        .package(url: "https://github.com/maiqingqiang/Jinja", from: "1.0.6")
+        .package(url: "https://github.com/johnmai-dev/Jinja", from: "1.1.0")


I'm going to move this and the chat template test to a new PR, since the rest of the fixes here are more general and unrelated to the new jinja engine.

pcuenca · 2025-01-24T19:11:02Z

Merging this. As explained, the Jinja upgrade will follow shortly in a separate PR, since the changes here are general.

Update Jinja, add Qwen R1 test

1b81b02

pcuenca mentioned this pull request Jan 24, 2025

Update LLMModelFactory.swift ml-explore/mlx-swift-examples#183

Merged

More robust added token support

d76f589

Serialized AddedToken class partially supported (in addition to String values)

pcuenca changed the title ~~Update Jinja, add Qwen R1 test~~ Qwen R1 tokenization support Jan 24, 2025

pcuenca changed the title ~~Qwen R1 tokenization support~~ Deepseek R1 tokenization support Jan 24, 2025

Actually pass added tokens to decoder

a107089


Temporarily revert jinja upgrade

91acea5

pcuenca merged commit 1fab24c into main Jan 24, 2025
1 check passed

pcuenca deleted the jinja-upgrade branch January 24, 2025 19:11

pcuenca mentioned this pull request Jan 26, 2025

Enable tool use #151

Merged

pcuenca merged 4 commits into main from jinja-upgrade

pcuenca commented Jan 24, 2025 •

edited

Loading

Uh oh!

DePasqualeOrg commented Jan 24, 2025 •

edited

Loading

Uh oh!

pcuenca commented Jan 24, 2025

Uh oh!

DePasqualeOrg commented Jan 24, 2025

Uh oh!

pcuenca Jan 24, 2025

Uh oh!

pcuenca commented Jan 24, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

pcuenca commented Jan 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DePasqualeOrg commented Jan 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pcuenca commented Jan 24, 2025

Uh oh!

DePasqualeOrg commented Jan 24, 2025

Uh oh!

pcuenca Jan 24, 2025

Choose a reason for hiding this comment

Uh oh!

pcuenca commented Jan 24, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pcuenca commented Jan 24, 2025 •

edited

Loading

DePasqualeOrg commented Jan 24, 2025 •

edited

Loading