PyO3 0.22 by diliop · Pull Request #1665 · huggingface/tokenizers

diliop · 2024-10-23T18:21:18Z

Upgrade from PyO3 0.21 to 0.22

HuggingFaceDocBuilderDev · 2024-10-24T14:30:46Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

ArthurZucker

Nice! 🔥
Just approved the workflows, let's go 🚀
Feel free to ping me, though I think you are off to a great start!

yookoala · 2024-10-24T14:36:16Z

Sadly, I ran into a numpy compilation issue on 32-bit Windows.

Details:
PyO3/rust-numpy#448

Hopefully, the fix is on the way:
PyO3/rust-numpy#463

yookoala · 2024-10-28T03:10:32Z

The fix has been merged into rust-numpy. Now we need to wait for a new release that includes it.

yookoala · 2024-10-30T11:29:01Z

@diliop: rust-numpy v0.22.1 has been released and it should have the Win32 compilation issue fixed. Please update to use it and try again.

diliop · 2024-10-30T14:47:56Z

There is no need to change anything in the code. Re-running the checks on this PR is sufficient, since the numpy dependency is already set to 0.22, a requirement that 0.22.1 satisfies. Looking at the checks above, you can see the 0.22.1 download here. That said, everything looks green :)
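To illustrate why a `0.22` requirement already admits 0.22.1, here is a minimal sketch of Cargo's default (caret) rule, restricted to plain `major.minor[.patch]` strings. The helper names `parse` and `satisfies` are hypothetical and exist only for this example; real resolution is done by Cargo itself, and pre-release tags and other operators are ignored here.

```rust
/// Parses "0.22" or "0.22.1" into (major, minor, optional patch).
/// Hypothetical helper for this sketch; not part of Cargo or any crate.
fn parse(s: &str) -> Option<(u64, u64, Option<u64>)> {
    let mut parts = s.split('.');
    let major = parts.next()?.parse::<u64>().ok()?;
    let minor = parts.next()?.parse::<u64>().ok()?;
    let patch = match parts.next() {
        Some(p) => Some(p.parse::<u64>().ok()?),
        None => None,
    };
    // Reject anything with more than three components.
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Simplified caret rule: a bare requirement such as "0.22" allows any
/// version at or above it that keeps the left-most non-zero component.
fn satisfies(requirement: &str, candidate: &str) -> bool {
    let Some((rmaj, rmin, rpatch)) = parse(requirement) else {
        return false;
    };
    // The candidate must be a full major.minor.patch version.
    let Some((cmaj, cmin, Some(cpatch))) = parse(candidate) else {
        return false;
    };
    let lower = (rmaj, rmin, rpatch.unwrap_or(0));
    let upper = if rmaj > 0 {
        (rmaj + 1, 0, 0) // ^1.2.3 means < 2.0.0
    } else if rmin > 0 || rpatch.is_none() {
        (0, rmin + 1, 0) // ^0.22 means < 0.23.0
    } else {
        (0, 0, rpatch.unwrap_or(0) + 1) // ^0.0.3 means < 0.0.4
    };
    let cand = (cmaj, cmin, cpatch);
    lower <= cand && cand < upper
}
```

Under this rule `satisfies("0.22", "0.22.1")` holds while `satisfies("0.22", "0.23.0")` does not, which is why re-running the checks was enough to pick up the patch release.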

yookoala · 2024-10-30T15:04:11Z

@diliop: I think you probably should add "Closes #1639" to the PR's description so it will automatically close it when merged.

ArthurZucker

Thanks a TON! Let's remove the name arg and we're good to go

bindings/python/src/models.rs

ArthurZucker · 2024-11-01T09:17:32Z

🎉 thanks for the PR 🤗

ArthurZucker · 2024-11-05T10:56:42Z

bindings/python/src/tokenizer.rs

        &self,
        py: Python<'_>,
-        input: Vec<&PyAny>,
+        input: Bound<'_, PyList>,


actually this broke tokenizers because it only supports PyList now 😓 looking into a fix!

@diliop we probably need this to be PySequence, but I am not sure about the fix

diliop · 2024-11-05

Hey, I just saw this :( Yeah, I "assumed" that Vec only accepted a list from Python, hence PyList, but if you need tuples then PySequence is indeed the way to go. The Py* types give you the benefit of the Python type check at almost zero cost, as you mentioned in your PR, so they should be preferred where possible. That said, there should be tests covering this, so I can pick this up over the weekend and make sure anything else I changed is also covered.

ArthurZucker · 2024-11-06

Cool, yeah, I was in a rush to fix this and forgot about the tests. Super nice if you want to add them 🤗

diliop · 2024-11-07

Took a quick stab at adding tests, but from the looks of it I will need to spend a bit more time to do this right, since PySequence is not the right solution after all. The TL;DR is that the change I made was not caught by tests because the test covering this line was turned off (here). Turning the test back on now fails on parsing ndarray as an additional input type. So encode_batch and encode_batch_fast need to support list, tuple and ndarray. I think I can support all three by changing the input arg type to something like Vec<Bound<'_, PyAny>>, with some changes in PreTokenizedEncodeInput and TextEncodeInput. I will have some time to work on this over the weekend, so hopefully I will have a fix soon.
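To make the regression concrete, here is a toy model in plain Rust, not PyO3 code. The `Input` enum and both extractor functions are hypothetical stand-ins invented for this sketch; they only mimic the difference between a parameter typed as a concrete list and one that accepts any batch-like container. Rejecting a bare string in the permissive version is an assumption of the sketch.

```rust
/// Toy stand-in for a Python object reaching the binding. NOT a PyO3 type.
#[derive(Debug, Clone, PartialEq)]
enum Input {
    List(Vec<String>),
    Tuple(Vec<String>),
    NdArray(Vec<String>),
    Str(String),
}

/// Human-readable name of the variant, used in error messages.
fn kind(input: &Input) -> &'static str {
    match input {
        Input::List(_) => "list",
        Input::Tuple(_) => "tuple",
        Input::NdArray(_) => "ndarray",
        Input::Str(_) => "str",
    }
}

/// Mimics a parameter typed as a concrete list: everything else is rejected.
fn extract_list_only(input: &Input) -> Result<Vec<String>, String> {
    match input {
        Input::List(items) => Ok(items.clone()),
        other => Err(format!("expected a list, got {}", kind(other))),
    }
}

/// Mimics a permissive parameter: list, tuple and ndarray are all accepted.
/// A single string is still treated as an error (assumption of this sketch).
fn extract_any_sequence(input: &Input) -> Result<Vec<String>, String> {
    match input {
        Input::List(items) | Input::Tuple(items) | Input::NdArray(items) => {
            Ok(items.clone())
        }
        Input::Str(_) => Err("expected a batch of inputs, got str".to_string()),
    }
}
```

In this model a tuple or ndarray batch fails `extract_list_only` but passes `extract_any_sequence`, which is the behaviour encode_batch and encode_batch_fast need to restore.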

diliop · 2024-11-08

#1679 should be the fix. I am still planning to add more tests, but my hope is that this will be enough to restore the previous functionality.

ArthurZucker · 2024-11-15

Thanks a lot for diving into it!

PyO3 0.22

30ad235

diliop mentioned this pull request Oct 23, 2024

WIP: Bump pyo3 version to v0.22 #1646

Closed

ArthurZucker reviewed Oct 24, 2024


yookoala mentioned this pull request Oct 28, 2024

Numpy2 support fails when running with 32-bit windows PyO3/rust-numpy#448

Closed

yookoala mentioned this pull request Oct 28, 2024

[BUG] Unable to install with pip because PyO3 cannot be installed under Python 3.13 ModelTC/LightLLM#575

Closed

1 task

ArthurZucker mentioned this pull request Oct 29, 2024

Failing to build bindings with 0.19.1 #1505

Closed

diliop and others added 2 commits October 29, 2024 12:40

Merge branch 'huggingface:main' into main

e94fd67

Fix python stubs

bceb231

diliop requested a review from ArthurZucker October 30, 2024 14:50

cdce8p mentioned this pull request Oct 30, 2024

Add support for Python 3.13 home-assistant/core#129442

Merged

44 tasks

ArthurZucker approved these changes Oct 31, 2024


bindings/python/src/models.rs (outdated)

Remove name arg from PyModel::save Python signature

9aef5d7

ArthurZucker merged commit 6ade8c2 into huggingface:main Nov 1, 2024

cecilyen mentioned this pull request Nov 1, 2024

[v3-future] Python 3.13 support jupyterlab/jupyter-ai#1023

Open

OyvindTafjord mentioned this pull request Nov 4, 2024

Tokenizers v0.20.2 fails on batches as tuples #1672

Closed

ArthurZucker reviewed Nov 5, 2024


ArthurZucker mentioned this pull request Nov 5, 2024

fix pylist #1673

Merged

diliop mentioned this pull request Nov 8, 2024

Fix encode_batch and encode_batch_fast to accept ndarrays again #1679

Merged


