Patch for FlanT5-XXL 8bit support#20760
Conversation
Force-pushed from f90b269 to 55f8fcc (Compare)
The documentation is not available anymore as the PR was closed or merged.
Thanks so much for the fix @larsmennen!
I would personally advocate focusing only on T5; we can add these patches later if we figure out that the same issue occurs for all subsidiary models. Can you revert the changes for longt5/perceiver & switch? Ideally also keep the copy mechanism, so maybe add the `# Copied from` statements but use another model than t5 as reference (e.g. for perceiver: `# Copied from transformers.src.models.longt5. ...`).
Also don't forget to run the styling changes ;) (`make fixup`)
Thanks again!
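For context, the `# Copied from` convention mentioned above lets the repo's consistency tooling (run via `make fixup`) keep subsidiary models in sync with a reference implementation instead of letting them diverge silently. A rough illustration (the path and class below are hypothetical stand-ins, not lines from this PR):

```python
# Illustrative sketch of the "# Copied from" convention used in the
# transformers codebase. The comment below marks this class as a copy of a
# reference class; the repo's check-copies tooling flags any divergence.

# Copied from transformers.models.t5.modeling_t5.T5DenseActDense with T5->LongT5
class LongT5DenseActDense:
    """Placeholder body; in the real repo this mirrors the reference class."""
    pass

print(LongT5DenseActDense.__name__)  # LongT5DenseActDense
```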
sgugger left a comment
Thanks a lot! This looks good to me apart from the unrelated change in perceiver.
```diff
 ...     trainable_position_encoding_kwargs=dict(
 ...         num_channels=256,
-...         index_dims=config.image_size**2,
+...         index_dims=config.image_size ** 2,
```
Let's leave this as is, it's not linked to this PR.
Reverted, thanks (I think that somehow came with one of the make scripts, but my env may not have been fully set up properly).
That makes sense! Done
What does this PR do?
Fixes #20287.
In #20287, 3 patches were proposed here: #20287 (comment)
This PR builds on the `keep_in_fp32_modules` support added in #20683, which keeps `self.wo` in `float32`, and adds a cast for the case where `hidden_states` actually has a different `dtype` from the `wo` weights.

Before submitting

- [ ] Did you read the contributor guideline, Pull Request section?
- [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@younesbelkada @sgugger