Closed
Conversation
…ingface#16717) * [deepspeed / m2m_100] make deepspeed 3 work with layerdrop * fix * revert last
* Improve README * Make dataset_name argument optional * Improve local data * Fix bug * Improve README some more * Apply suggestions from code review * Improve README Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* [trainer / deepspeed] fix hyperparameter_search * require optuna * style * oops * add dep in the right place * create deepspeed-testing dep group * Trigger CI
…` + tests (huggingface#16657) * add low_cpu_mem_usage tests * wip: revamping * wip * install /usr/bin/time * wip * cleanup * cleanup * cleanup * cleanup * cleanup * fix assert * put the wrapper back * cleanup; switch to bert-base-cased * Trigger CI * Trigger CI
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
* Pin Jax to last working release * Try lower * Try lower
… or Deepspeed (huggingface#16786) * optimizer issues related to saving * remove the "optimizer saving" option * reformat using make style
* Improve code * Fix bugs * Fix another bug * Clean up DTP as well * Update DPT model outputs Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* save intermediate * add vision * add vision * save * finish models * finish models * continue * finish * up * up * up * tests all pass * clean up * up * up * fix bugs in beit * correct docs * finish * finish docs * make style * up * more fixes * fix type hint * make style * Apply suggestions from code review Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/data2vec/test_modeling_data2vec_vision.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * fix test Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
…ingface#16806) * use base_version * make is_torch_less_than_1_8 match 1_11 Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
…ngface#16814) * Add passing encoder_outputs as tuple to existing test * Add check for tuple * Add check for tuple also for speech and vision Co-authored-by: jsnfly <jsnfly@gmx.de>
* Refactor issues with yaml * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> * Update .github/ISSUE_TEMPLATE/feature-request.yml Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update .github/ISSUE_TEMPLATE/bug-report.yml Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
…e in build (huggingface#16821) * fix _setup_devices in case where there is not torch.distributed * in training_args_sm.py as well
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
* Add first draft from previous PR * First draft * Improve README and remove num_labels * Make script more aligned with other scripts * Improve README and apply suggestion from code review
* Fix docstrings * Fix up * Fix
* Solved href rendering issue in heading Markdown references in headings such as '####' don't render well. Replaced it with <h4>...<a></a></h> banners. * PhonemeTokenizer optimization using phonemizer lib The backend should only be initialized once, otherwise it is reloaded. Added `init_backend` function, intializes a backend attribute. Phonemize re-uses self.backend. Should give ~10 times faster phonemization. * formatted file with make style * Documentation suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update /tokenization_wav2vec2_phoneme.py based on PR suggestion Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update CONTRIBUTING.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* begin do_init * add params_shape_tree * raise error if params are accessed when do_init is False * don't allow do_init=False when keys are missing * make shape tree a property * assign self._params at the end * add test for do_init * add do_init arg to all flax models * fix param setting * disbale do_init for composite models * update test * add do_init in FlaxBigBirdForMultipleChoice * better names and errors * improve test * style * add a warning when do_init=False * remove extra if * set params after _required_params * add test for from_pretrained * do_init => _do_init * chage warning to info * fix typo * add params in init_weights * add params to gpt neo init * add params to init_weights * update do_init test * Trigger CI * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * update template * trigger CI * style * style * fix template Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix issue avoid-misusing-assert-true found at https://codereview.doctor * fix tests * fix tf Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
- change new operations - finally add scaled softmax - added new args in the config
- save sharded models - final changes on the modeling script
- comment on alibi - add TODO on seq length
- added a text to test the commit Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>
- attention mask change - generation works on BS176b Co-authored-by: thomasw21 <24695242+thomasw21@users.noreply.github.com>
904cced to
cc38b40
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?