Skip to content

Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 #33456

Merged
ArthurZucker merged 10 commits into
huggingface:mainfrom
jerryzh168:enable-torchao-ser
Sep 30, 2024
Merged

Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 #33456
ArthurZucker merged 10 commits into
huggingface:mainfrom
jerryzh168:enable-torchao-ser

Conversation

@jerryzh168

@jerryzh168 jerryzh168 commented Sep 13, 2024

Copy link
Copy Markdown
Contributor

Summary:
After huggingface/huggingface_hub#2440 we added non-safetensor serialization and deserialization in huggingface, with this we can now add the support in transformers

Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor see README for more details

Test Plan:
tested locally
https://gist.github.com/jerryzh168/965ccdbd595c9210d49cfbe31dc6705f

Reviewers:

Subscribers:

Tasks:

Tags:

@jerryzh168 jerryzh168 changed the title Enable non-safetensor serialization and deserialization for TorchAoCo… Enable non-safetensor ser/deser for TorchAoConfig quantized model Sep 13, 2024
@jerryzh168

Copy link
Copy Markdown
Contributor Author

cc @SunMarc @Wauplin can you take a look?

@SunMarc SunMarc left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work @jerryzh168 to enable serialization ! Really appreciate that you are doing the PRs on huggingface hub and transformers ! This looks pretty good ! I left a few comments

Comment thread docs/source/en/quantization/torchao.md Outdated
Comment thread docs/source/en/quantization/torchao.md Outdated
Comment thread src/transformers/modeling_utils.py Outdated
Comment thread src/transformers/modeling_utils.py Outdated
Comment thread src/transformers/quantizers/quantizer_torchao.py Outdated
Comment thread src/transformers/utils/quantization_config.py Outdated
Comment thread src/transformers/quantizers/quantizer_torchao.py Outdated
Comment thread src/transformers/quantizers/quantizer_torchao.py Outdated
@jerryzh168

Copy link
Copy Markdown
Contributor Author

@SunMarc thanks for your thoughtful reviews! I have addressed all the comments I think, please take a look again, also not sure if the CI failure is relevant or not

@jerryzh168

Copy link
Copy Markdown
Contributor Author

btw, current pytorch nightly has a perf regression: pytorch/ao#898 and we hope to fix this before 2.5 cherry-pick deadline

@jerryzh168 jerryzh168 requested a review from SunMarc September 19, 2024 03:04

@SunMarc SunMarc left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating! LGTM! Just a nit. Let me know when the regression is fixed. I've pinged a core maintainer to review the PR

Comment thread src/transformers/modeling_utils.py Outdated
@SunMarc

SunMarc commented Sep 19, 2024

Copy link
Copy Markdown
Member

To fix the failing test, can you rebase on main ?

@HuggingFaceDocBuilderDev

Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@jerryzh168

Copy link
Copy Markdown
Contributor Author

@ArthurZucker can you take a look at the PR?

@SunMarc

SunMarc commented Sep 25, 2024

Copy link
Copy Markdown
Member

I've merged recently PR adding a new quantizer @jerryzh168. Sorry for that but could you rebase on main and update the is_serializable method ?

…nfig quantized model

Summary:
After huggingface/huggingface_hub#2440 we added non-safetensor serialization and deserialization
in huggingface, with this we can now add the support in transformers

Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor
see README for more details

Test Plan:
tested locally

Reviewers:

Subscribers:

Tasks:

Tags:
@jerryzh168

Copy link
Copy Markdown
Contributor Author

@SunMarc @ArthurZucker updated, please take a look again

@ArthurZucker ArthurZucker left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR, super sorry for the delay!
Super important to have serialization!

@property
def is_serializable(self):
return False
def is_serializable(self, safe_serialization=None):

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changing the property is a tad breaking, so let's just put the 🔴 on the PR!

@ArthurZucker ArthurZucker changed the title Enable non-safetensor ser/deser for TorchAoConfig quantized model Enable non-safetensor ser/deser for TorchAoConfig quantized model 🔴 Sep 30, 2024
@ArthurZucker ArthurZucker merged commit 4bb49d4 into huggingface:main Sep 30, 2024
@ArthurZucker

Copy link
Copy Markdown
Collaborator

Thanks a lot @jerryzh168 🤗 great contributions and I love that we can upload serialized quantized weights to the hub now!

BernardZach pushed a commit to BernardZach/transformers that referenced this pull request Dec 5, 2024
…uggingface#33456)

* Enable non-safetensor serialization and deserialization for TorchAoConfig quantized model

Summary:
After huggingface/huggingface_hub#2440 we added non-safetensor serialization and deserialization
in huggingface, with this we can now add the support in transformers

Note that we don't plan to add safetensor serialization due to different goals of wrapper tensor subclass and safetensor
see README for more details

Test Plan:
tested locally

Reviewers:

Subscribers:

Tasks:

Tags:

* formatting

* formatting

* minor fix

* formatting

* address comments

* comments

* minor fix

* update doc

* refactor compressed tensor quantizer
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants