Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 #33456
Conversation
2a29247 to
8db30bd
Compare
SunMarc
left a comment
There was a problem hiding this comment.
Thanks for your work @jerryzh168 to enable serialization ! Really appreciate that you are doing the PRs on huggingface hub and transformers ! This looks pretty good ! I left a few comments
|
@SunMarc thanks for your thoughtful reviews! I have addressed all the comments I think, please take a look again, also not sure if the CI failure is relevant or not |
|
btw, current pytorch nightly has a perf regression: pytorch/ao#898 and we hope to fix this before 2.5 cherry-pick deadline |
SunMarc
left a comment
There was a problem hiding this comment.
Thanks for iterating! LGTM! Just a nit. Let me know when the regression is fixed. I've pinged a core maintainer to review the PR
|
To fix the failing test, can you rebase on main ? |
4f44f40 to
dc76271
Compare
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
|
@ArthurZucker can you take a look at the PR? |
|
I've merged recently PR adding a new quantizer @jerryzh168. Sorry for that but could you rebase on main and update the |
f4d7c51 to
93f8c89
Compare
…nfig quantized model Summary: After huggingface/huggingface_hub#2440 we added non-safetensor serialization and deserialization in huggingface, with this we can now add the support in transformers Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor see README for more details Test Plan: tested locally Reviewers: Subscribers: Tasks: Tags:
93f8c89 to
b9543c8
Compare
|
@SunMarc @ArthurZucker updated, please take a look again |
ArthurZucker
left a comment
There was a problem hiding this comment.
Thanks for this PR, super sorry for the delay!
Super important to have serialization!
| @property | ||
| def is_serializable(self): | ||
| return False | ||
| def is_serializable(self, safe_serialization=None): |
There was a problem hiding this comment.
changing the property is a tad breaking, so let's just put the 🔴 on the PR!
|
Thanks a lot @jerryzh168 🤗 great contributions and I love that we can upload serialized quantized weights to the hub now! |
…uggingface#33456) * Enable non-safetensor serialization and deserialization for TorchAoConfig quantized model Summary: After huggingface/huggingface_hub#2440 we added non-safetensor serialization and deserialization in huggingface, with this we can now add the support in transformers Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor see README for more details Test Plan: tested locally Reviewers: Subscribers: Tasks: Tags: * formatting * formatting * minor fix * formatting * address comments * comments * minor fix * update doc * refactor compressed tensor quantizer
Summary:
After huggingface/huggingface_hub#2440 we added non-safetensor serialization and deserialization in huggingface, with this we can now add the support in transformers
Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor see README for more details
Test Plan:
tested locally
https://gist.github.com/jerryzh168/965ccdbd595c9210d49cfbe31dc6705f
Reviewers:
Subscribers:
Tasks:
Tags: