[tune](deps): Bump pytorch-lightning from 1.0.3 to 1.3.1 in /python/requirements by dependabot[bot] · Pull Request #13 · architkulkarni/ray

dependabot · 2021-05-11T18:32:37Z

Bumps pytorch-lightning from 1.0.3 to 1.3.1.

Release notes

Sourced from pytorch-lightning's releases.

Standard weekly patch release

[1.3.1] - 2021-05-11

Fixed

Fixed DeepSpeed with IterableDatasets (#7362)

Fixed Trainer.current_epoch not getting restored after tuning (#7434)

Fixed local rank displayed in console log (#7395)

Contributors

@akihironitta @awaelchli @leezu

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Lightning CLI, PyTorch Profiler, Improved Early Stopping

[1.3.0] - 2021-05-06

Added

Added support for the EarlyStopping callback to run at the end of the training epoch (#6944)

Added synchronization points before and after setup hooks are run (#7202)

Added a teardown hook to ClusterEnvironment (#6942)

Added utils for metrics to scalar conversions (#7180)

Added utils for NaN/Inf detection for gradients and parameters (#6834)

Added more explicit exception message when trying to execute trainer.test() or trainer.validate() with fast_dev_run=True (#6667)

Added LightningCLI class to provide simple reproducibility with minimum boilerplate training CLI (#4492, #6862, #7156, #7299)

Added gradient_clip_algorithm argument to Trainer for gradient clipping by value (#6123).

Added a way to print to terminal without breaking up the progress bar (#5470)

Added support to checkpoint after training steps in ModelCheckpoint callback (#6146)

Added TrainerStatus.{INITIALIZING,RUNNING,FINISHED,INTERRUPTED} (#7173)

Added Trainer.validate() method to perform one evaluation epoch over the validation set (#4948)

Added LightningEnvironment for Lightning-specific DDP (#5915)

Added teardown() hook to LightningDataModule (#4673)

Added auto_insert_metric_name parameter to ModelCheckpoint (#6277)

Added arg to self.log that enables users to give custom names when dealing with multiple dataloaders (#6274)

Added teardown method to BaseProfiler to enable subclasses defining post-profiling steps outside of __del__ (#6370)

Added setup method to BaseProfiler to enable subclasses defining pre-profiling steps for every process (#6633)

Added no return warning to predict (#6139)

Added Trainer.predict config validation (#6543)

Added AbstractProfiler interface (#6621)

Added support for including module names for forward in the autograd trace of PyTorchProfiler (#6349)

Added support for the PyTorch 1.8.1 autograd profiler (#6618)

Added outputs parameter to callback's on_validation_epoch_end & on_test_epoch_end hooks (#6120)

Added configure_sharded_model hook (#6679)

Added support for precision=64, enabling training with double precision (#6595)

Added support for DDP communication hooks (#6736)

Added artifact_location argument to MLFlowLogger which will be passed to the MlflowClient.create_experiment call (#6677)

Added model parameter to precision plugins' clip_gradients signature (#6764, #7231)

Added is_last_batch attribute to Trainer (#6825)

... (truncated)

Changelog

Sourced from pytorch-lightning's changelog.

[1.3.1] - 2021-05-11

Fixed

Fixed DeepSpeed with IterableDatasets (#7362)

Fixed Trainer.current_epoch not getting restored after tuning (#7434)

Fixed local rank displayed in console log (#7395)

[1.3.0] - 2021-05-06

Added

Added support for the EarlyStopping callback to run at the end of the training epoch (#6944)

Added synchronization points before and after setup hooks are run (#7202)

Added a teardown hook to ClusterEnvironment (#6942)

Added utils for metrics to scalar conversions (#7180)

Added utils for NaN/Inf detection for gradients and parameters (#6834)

Added more explicit exception message when trying to execute trainer.test() or trainer.validate() with fast_dev_run=True (#6667)

Added LightningCLI class to provide simple reproducibility with minimum boilerplate training CLI (#4492, #6862, #7156, #7299)

Added gradient_clip_algorithm argument to Trainer for gradient clipping by value (#6123).

Added a way to print to terminal without breaking up the progress bar (#5470)

Added support to checkpoint after training steps in ModelCheckpoint callback (#6146)

Added TrainerStatus.{INITIALIZING,RUNNING,FINISHED,INTERRUPTED} (#7173)

Added Trainer.validate() method to perform one evaluation epoch over the validation set (#4948)

Added LightningEnvironment for Lightning-specific DDP (#5915)

Added teardown() hook to LightningDataModule (#4673)

Added auto_insert_metric_name parameter to ModelCheckpoint (#6277)

Added arg to self.log that enables users to give custom names when dealing with multiple dataloaders (#6274)

Added teardown method to BaseProfiler to enable subclasses defining post-profiling steps outside of __del__ (#6370)

Added setup method to BaseProfiler to enable subclasses defining pre-profiling steps for every process (#6633)

Added no return warning to predict (#6139)

Added Trainer.predict config validation (#6543)

Added AbstractProfiler interface (#6621)

Added support for including module names for forward in the autograd trace of PyTorchProfiler (#6349)

Added support for the PyTorch 1.8.1 autograd profiler (#6618)

Added outputs parameter to callback's on_validation_epoch_end & on_test_epoch_end hooks (#6120)

Added configure_sharded_model hook (#6679)

Added support for precision=64, enabling training with double precision (#6595)

Added support for DDP communication hooks (#6736)

Added artifact_location argument to MLFlowLogger which will be passed to the MlflowClient.create_experiment call (#6677)

Added model parameter to precision plugins' clip_gradients signature (#6764, #7231)

Added is_last_batch attribute to Trainer (#6825)

Added LightningModule.lr_schedulers() for manual optimization (#6567)

... (truncated)

Commits

b9b3ec5 v1.3.1
86e827e Pin Sphinx<4.0 (#7456)
f933d7a fix display bug (#7395)
85b71c8 fix 1.9 test (#7441)
fcd6be1 Restore trainer.current_epoch after tuning (#7434)
5ababc4 update ngc for 1.3 (#7414)
a4d676d Fix DeepSpeedPlugin with IterableDataset (#7362)
fbc8b20 update versions (#7409)
b181b8c release 1.3.0 (#7404)
d4d959b Call super().__init__() in MilestonesFinetuning example (#7398)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot merge will merge this PR after your CI passes on it
@dependabot squash and merge will squash and merge this PR after your CI passes on it
@dependabot cancel merge will cancel a previously requested merge and block automerging
@dependabot reopen will reopen this PR if it is closed
@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually

Bumps [pytorch-lightning](https://github.com/PyTorchLightning/pytorch-lightning) from 1.0.3 to 1.3.1.
- [Release notes](https://github.com/PyTorchLightning/pytorch-lightning/releases)
- [Changelog](https://github.com/PyTorchLightning/pytorch-lightning/blob/master/CHANGELOG.md)
- [Commits](Lightning-AI/pytorch-lightning@1.0.3...1.3.1)

Signed-off-by: dependabot[bot] <support@github.com>

dependabot · 2021-05-22T07:02:44Z

Superseded by #17.

We encountered a SIGSEGV when running the Python test `python/ray/tests/test_failure_2.py::test_list_named_actors_timeout`. The stack is:

```
#0 0x00007fffed30f393 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&) () from /lib64/libstdc++.so.6
#1 0x00007fffee707649 in ray::RayLog::GetLoggerName() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#2 0x00007fffee70aa90 in ray::SpdLogMessage::Flush() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#3 0x00007fffee70af28 in ray::RayLog::~RayLog() () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#4 0x00007fffee2b570d in ray::asio::testing::(anonymous namespace)::DelayManager::Init() [clone .constprop.0] () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#5 0x00007fffedd0d95a in _GLOBAL__sub_I_asio_chaos.cc () from /home/admin/dev/Arc/merge/ray/python/ray/_raylet.so
#6 0x00007ffff7fe282a in call_init.part () from /lib64/ld-linux-x86-64.so.2
#7 0x00007ffff7fe2931 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#8 0x00007ffff7fe674c in dl_open_worker () from /lib64/ld-linux-x86-64.so.2
#9 0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#10 0x00007ffff7fe5ffe in _dl_open () from /lib64/ld-linux-x86-64.so.2
#11 0x00007ffff7d5f39c in dlopen_doit () from /lib64/libdl.so.2
#12 0x00007ffff7b82e79 in _dl_catch_exception () from /lib64/libc.so.6
#13 0x00007ffff7b82f13 in _dl_catch_error () from /lib64/libc.so.6
#14 0x00007ffff7d5fb09 in _dlerror_run () from /lib64/libdl.so.2
#15 0x00007ffff7d5f42a in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#16 0x00007fffef04d330 in py_dl_open (self=<optimized out>, args=<optimized out>) at /tmp/python-build.20220507135524.257789/Python-3.7.11/Modules/_ctypes/callproc.c:1369
```

The root cause is that when `_raylet.so` is loaded, `static DelayManager _delay_manager` is initialized and `RAY_LOG(ERROR) << "RAY_testing_asio_delay_us is set to " << delay_env;` is executed. However, the static variables declared in `logging.cc` are not initialized yet (in this case, `std::string RayLog::logger_name_ = "ray_log_sink"`). It's better not to rely on the initialization order of static variables in different compilation units, because that order is not guaranteed.

I propose to change all `RAY_LOG`s to `std::cerr` in `DelayManager::Init()`.

The crash happens in Ant's internal codebase. Not sure why this test case passes in the community version though.

BTW, I've tried different approaches:

1. Using a static local variable in `get_delay_us` and removing the global variable. This doesn't work because `init()` needs to access the variable as well.
2. Defining the global variable as type `std::unique_ptr<DelayManager>` and initializing it in `get_delay_us`. This works but it requires a lock to be thread-safe.
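To make the initialization-order problem concrete, here is a minimal single-file sketch, not the actual Ray code. The `DelayManager` below is a simplified, hypothetical stand-in that treats the environment variable as one integer (the real format may differ), and `LoggerName()` is a hypothetical helper showing the construct-on-first-use idiom. The sketch assumes that writing to `std::cerr` from a static initializer is safe because `<iostream>` is included earlier in the same translation unit.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

// Hypothetical helper: a function-local static is constructed on first use,
// so callers never observe it uninitialized, even during static
// initialization of other translation units. Since C++11 this
// initialization is also thread-safe.
const std::string &LoggerName() {
  static const std::string name = "ray_log_sink";
  return name;
}

// Simplified, hypothetical DelayManager. Its constructor runs during static
// initialization when an instance is declared at namespace scope.
class DelayManager {
 public:
  DelayManager() { Init(); }

  void Init() {
    const char *delay_env = std::getenv("RAY_testing_asio_delay_us");
    if (delay_env == nullptr) {
      delay_us_ = 0;
      return;
    }
    // std::cerr instead of a logging macro: it does not depend on statics
    // defined in another translation unit.
    std::cerr << "RAY_testing_asio_delay_us is set to " << delay_env
              << std::endl;
    // Assumption for this sketch only: the value is a plain integer.
    delay_us_ = std::strtol(delay_env, nullptr, 10);
  }

  long delay_us() const { return delay_us_; }

 private:
  long delay_us_ = 0;
};

// Global instance, initialized when the object file or shared library loads.
static DelayManager delay_manager;

long get_delay_us() { return delay_manager.delay_us(); }
```

With this layout, `get_delay_us()` needs no lock, and nothing executed during static initialization touches a static owned by another compilation unit.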

dependabot bot added the `dependencies` label (Pull requests that update a dependency file) May 11, 2021

dependabot bot closed this May 22, 2021

dependabot bot deleted the dependabot/pip/python/requirements/pytorch-lightning-1.3.1 branch May 22, 2021 07:02

dependabot[bot] wants to merge 1 commit into master from dependabot/pip/python/requirements/pytorch-lightning-1.3.1