Trainer.repo.push_to_hub returns None, causing raised exception #23712

@XekriRedmane

System Info

  • transformers version: 4.28.1
  • Platform: Windows-10-10.0.22621-SP0
  • Python version: 3.11.2
  • Huggingface_hub version: 0.14.1
  • Safetensors version: not installed
  • PyTorch version (GPU?): 2.0.1+cu117 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: NO
  • Using distributed or parallel set-up in script?: NO

Who can help?

@sgugger

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

For a reason I have not been able to identify, Trainer.repo.push_to_hub can return None. Trainer._push_from_checkpoint then raises a TypeError, because it unpacks the return value as a two-element tuple.

Traceback (most recent call last):
  File "F:\eo-reco\run_speech_recognition_ctc.py", line 810, in <module>
    main()
  File "F:\eo-reco\run_speech_recognition_ctc.py", line 756, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\eo-reco\.env\Lib\site-packages\transformers\trainer.py", line 1664, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "F:\eo-reco\.env\Lib\site-packages\transformers\trainer.py", line 2019, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
  File "F:\eo-reco\.env\Lib\site-packages\transformers\trainer.py", line 2308, in _maybe_log_save_evaluate
    self._save_checkpoint(model, trial, metrics=metrics)
  File "F:\eo-reco\.env\Lib\site-packages\transformers\trainer.py", line 2462, in _save_checkpoint
    self._push_from_checkpoint(output_dir)
  File "F:\eo-reco\.env\Lib\site-packages\transformers\trainer.py", line 3649, in _push_from_checkpoint
    _, self.push_in_progress = self.repo.push_to_hub(
    ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: cannot unpack non-iterable NoneType object

(Note: the line numbers in run_speech_recognition_ctc.py will not match the official example script, because I copied and modified it.)

repo.push_to_hub can return None if the repo is clean, which would trigger this error. However, that may not be what happened in my case, since I saw no corresponding log message about it (assuming log messages are emitted immediately rather than buffered).
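
To illustrate the failure mode without touching the Hub, here is a minimal sketch that reproduces the same unpacking error. `stub_push_to_hub` is a hypothetical stand-in, not the real `huggingface_hub` method; it only assumes that a push can yield either a two-element tuple or None, as described above.

```python
from typing import Optional, Tuple


def stub_push_to_hub(repo_is_clean: bool) -> Optional[Tuple[str, object]]:
    """Hypothetical stand-in for a push call: returns None when there is nothing to push."""
    if repo_is_clean:
        return None
    # Placeholder URL and handle; the real return values are not reproduced here.
    return ("https://example.invalid/commit/abc123", object())


def unpack_like_trainer(repo_is_clean: bool) -> object:
    # Same shape as the failing statement in the traceback: a bare tuple unpack.
    _, push_in_progress = stub_push_to_hub(repo_is_clean)
    return push_in_progress


try:
    unpack_like_trainer(repo_is_clean=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The unpack succeeds when the stub returns a tuple and raises TypeError when it returns None, which matches the last frame of the traceback.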

Expected behavior

No exception, maybe just a warning.
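
One possible shape for that behavior, as a sketch only: check the return value before unpacking and log a warning when it is None. The helper name `push_or_warn` and the `push` callable are hypothetical and are not part of transformers or huggingface_hub; an actual fix in the Trainer might look different.

```python
import logging
from typing import Callable, Optional, Tuple

logger = logging.getLogger(__name__)


def push_or_warn(push: Callable[[], Optional[Tuple[str, object]]]) -> Optional[object]:
    """Call `push` and return its in-progress handle, or None if nothing was pushed."""
    result = push()
    if result is None:
        # Nothing came back from the push, so warn instead of unpacking None.
        logger.warning("push_to_hub returned None; skipping this checkpoint push.")
        return None
    _, in_progress = result
    return in_progress
```

With a guard like this, a None result leaves training running and only produces a log line.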
