Revert "Revert "[tune] PB2 (#11466)" (#11795)" #11812

Merged
richardliaw merged 2 commits into ray-project:master from richardliaw:revert-2-pb2
Nov 5, 2020

@richardliaw
Contributor

@richardliaw richardliaw commented Nov 4, 2020

This reverts commit 7248d5f.

Introduces a lazy import for GPy and PB2 in general.
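
For context, a lazy import defers loading a heavy optional dependency until the code path that needs it actually runs, so importing the package alone does not pay the import cost. The sketch below illustrates the general pattern only; it is not the code in this PR. `LazyModule` is a hypothetical helper, and the stdlib `json` module stands in for a heavy package such as GPy so that the example runs without extra installs.

```python
import importlib


class LazyModule:
    """Hypothetical helper: defer importing `name` until first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None  # filled in on first use

    def _load(self):
        if self._module is None:
            try:
                self._module = importlib.import_module(self._name)
            except ImportError as exc:
                # Fail only when the feature is actually used, with a clear hint.
                raise ImportError(
                    f"{self._name} is required for this feature; "
                    "install it to continue."
                ) from exc
        return self._module

    @property
    def loaded(self):
        return self._module is not None

    def __getattr__(self, attr):
        # __getattr__ runs only when normal lookup fails; never proxy
        # private names, to avoid recursing on our own attributes.
        if attr.startswith("_"):
            raise AttributeError(attr)
        return getattr(self._load(), attr)


# `json` is a stand-in for a heavy optional dependency (an assumption here).
heavy = LazyModule("json")
print(heavy.loaded)            # no import attempted yet
print(heavy.dumps({"a": 1}))   # first attribute access triggers the import
print(heavy.loaded)
```

With this shape, a missing optional package raises `ImportError` only when the dependent feature is used, rather than when the parent package is imported.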

Why are these changes needed?

cc @amogkam

Do not approve yet, as there will be an attempt to repro existing issues.

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
@richardliaw richardliaw merged commit efa07d5 into ray-project:master Nov 5, 2020
@richardliaw richardliaw deleted the revert-2-pb2 branch November 5, 2020 04:47
@richardliaw
Contributor Author

1085.36s$ ./ci/keep_alive bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=-tf,pytorch,-py37 python/ray/util/sgd/...
(03:21:23) WARNING: The following configs were expanded more than once: [ci]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
(03:21:23) INFO: Invocation ID: cf8cd7ff-7cb4-4f71-bd60-197d640c61f2
(03:21:24) INFO: Current date is 2020-11-05
(03:21:24) Loading: 
(03:21:24) Loading: 0 packages loaded
(03:21:24) DEBUG: /home/travis/build/ray-project/ray/bazel/ray_deps_setup.bzl:63:14: No implicit mirrors used because urls were explicitly provided
(03:21:24) Analyzing: 17 targets (0 packages loaded, 0 targets configured)
(03:21:24) INFO: Analyzed 17 targets (0 packages loaded, 21 targets configured).
(03:21:24) INFO: Found 17 test targets...
(03:21:24) [0 / 64] [Prepa] Expanding template python/ray/util/sgd/tune_example_2
(03:21:54) [53 / 70] 1 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_example_1; 17s local
(03:22:12) [53 / 70] 1 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_example_1; 36s local
(03:22:39) [54 / 71] 2 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_example_2; 26s local
(03:22:54) [55 / 72] 3 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_pbt; 5s local
(03:23:09) [55 / 72] 3 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_pbt; 20s local
(running, 2m total)
(03:23:40) [55 / 72] 3 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_pbt; 52s local
(03:24:01) [55 / 72] 3 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_pbt; 72s local
(03:24:25) [57 / 74] 5 / 17 tests; Testing //python/ray/util/sgd:image_models; 4s local
(03:24:54) [58 / 75] 6 / 17 tests; Testing //python/ray/util/sgd:mnist-ptl; 23s local
(running, 4m total)
(03:25:51) [60 / 77] 8 / 17 tests; Testing //python/ray/util/sgd:test_ptl; 47s local
(03:26:31) [60 / 77] 8 / 17 tests; Testing //python/ray/util/sgd:test_ptl; 87s local
(03:27:17) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 23s local
(running, 6m total)
(03:29:03) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 129s local
(running, 8m total)
(03:30:12) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 198s local
(running, 10m total)
(03:31:31) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 277s local
(running, 12m total)
(03:34:34) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 460s local
(running, 14m total)
(running, 16m total)
(03:37:54) [62 / 79] 10 / 17 tests; Testing //python/ray/util/sgd:test_torch_failure; 132s local
(running, 18m total)
(03:39:28) INFO: Elapsed time: 1085.156s, Critical Path: 529.03s
(03:39:28) INFO: 34 processes: 34 local.
(03:39:28) INFO: Build completed successfully, 69 total actions
//python/ray/util/sgd:benchmark                                          PASSED in 11.7s
//python/ray/util/sgd:cifar_pytorch_example_1                            PASSED in 36.2s
//python/ray/util/sgd:cifar_pytorch_example_2                            PASSED in 36.2s
//python/ray/util/sgd:cifar_pytorch_pbt                                  PASSED in 78.6s
//python/ray/util/sgd:dcgan                                              PASSED in 13.8s
//python/ray/util/sgd:image_models                                       PASSED in 10.3s
//python/ray/util/sgd:mnist-ptl                                          PASSED in 23.6s
//python/ray/util/sgd:raysgd_torch_signatures                            PASSED in 8.5s
//python/ray/util/sgd:test_ptl                                           PASSED in 110.4s
  WARNING: //python/ray/util/sgd:test_ptl: Test execution time (110.4s excluding execution overhead) outside of range for LONG tests. Consider setting timeout="moderate" or size="medium".
//python/ray/util/sgd:test_torch                                         PASSED in 528.9s
//python/ray/util/sgd:test_torch_failure                                 PASSED in 132.1s
  WARNING: //python/ray/util/sgd:test_torch_failure: Test execution time (132.1s excluding execution overhead) outside of range for LONG tests. Consider setting timeout="moderate" or size="medium".
//python/ray/util/sgd:test_torch_runner                                  PASSED in 1.4s
//python/ray/util/sgd:train_example_1                                    PASSED in 8.1s
//python/ray/util/sgd:train_example_2                                    PASSED in 8.6s
//python/ray/util/sgd:tune_example_1                                     PASSED in 19.4s
//python/ray/util/sgd:tune_example_2                                     PASSED in 28.0s
//python/ray/util/sgd:tune_example_3                                     PASSED in 28.1s
Executed 17 out of 17 tests: 17 tests pass.
(03:39:28) INFO: Build Event Protocol files produced successfully.
(03:39:28) INFO: Build completed successfully, 69 total actions
(03:39:28) INFO: Build completed successfully, 69 total actions
real	18m5.315s

@richardliaw
Contributor Author

Compared to the previous run, with the sklearn/GPy imports:

2330.79s$ ./ci/keep_alive bazel test --config=ci $(./scripts/bazel_export_options) --build_tests_only --test_tag_filters=-tf,pytorch,-py37 python/ray/util/sgd/...
(01:57:17) WARNING: The following configs were expanded more than once: [ci]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
(01:57:17) INFO: Invocation ID: a159bc38-c502-44d1-9ce2-fec92aef69f4
(01:57:17) INFO: Current date is 2020-11-04
(01:57:17) Loading: 
(01:57:18) Loading: 0 packages loaded
(01:57:18) DEBUG: /home/travis/build/ray-project/ray/bazel/ray_deps_setup.bzl:63:14: No implicit mirrors used because urls were explicitly provided
(01:57:18) Analyzing: 17 targets (0 packages loaded, 0 targets configured)
(01:57:18) INFO: Analyzed 17 targets (0 packages loaded, 21 targets configured).
(01:57:18) INFO: Found 17 test targets...
(01:57:18) [0 / 64] [Prepa] Expanding template python/ray/util/sgd/tune_example_2
(01:57:35) [52 / 69] Testing //python/ray/util/sgd:benchmark; 16s local
(01:58:03) [53 / 70] 1 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_example_1; 27s local
(01:58:18) [54 / 71] 2 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_example_2; 2s local
(01:58:33) [54 / 71] 2 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_example_2; 17s local
(01:58:56) [54 / 71] 2 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_example_2; 40s local
(running, 2m total)
(01:59:18) [55 / 72] 3 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_pbt; 21s local
(01:59:54) [55 / 72] 3 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_pbt; 57s local
(02:00:17) [55 / 72] 3 / 17 tests; Testing //python/ray/util/sgd:cifar_pytorch_pbt; 81s local
(02:00:44) [56 / 73] 4 / 17 tests; Testing //python/ray/util/sgd:dcgan; 15s local
(running, 4m total)
(02:01:28) [58 / 75] 6 / 17 tests; Testing //python/ray/util/sgd:mnist-ptl; 28s local
(02:02:24) [60 / 77] 8 / 17 tests; Testing //python/ray/util/sgd:test_ptl; 44s local
(02:03:10) [60 / 77] 8 / 17 tests; Testing //python/ray/util/sgd:test_ptl; 90s local
(running, 6m total)
(02:04:03) [60 / 77] 8 / 17 tests; Testing //python/ray/util/sgd:test_ptl; 143s local
(running, 8m total)
(02:06:04) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 113s local
(running, 10m total)
(02:07:24) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 192s local
(02:08:55) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 283s local
(running, 12m total)
(02:10:39) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 388s local
(running, 14m total)
(02:12:40) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 508s local
(running, 16m total)
(02:14:58) [61 / 78] 9 / 17 tests; Testing //python/ray/util/sgd:test_torch; 647s local
(running, 18m total)
(running, 20m total)
(02:17:37) [62 / 79] 10 / 17 tests; Testing //python/ray/util/sgd:test_torch_failure; 102s local
(running, 22m total)
(02:20:40) [62 / 79] 10 / 17 tests; Testing //python/ray/util/sgd:test_torch_failure; 285s local
(running, 24m total)
(running, 26m total)
(02:24:10) [62 / 79] 10 / 17 tests; Testing //python/ray/util/sgd:test_torch_failure; 495s local
(running, 28m total)
(running, 30m total)
(02:28:13) [62 / 79] 10 / 17 tests; Testing //python/ray/util/sgd:test_torch_failure; 737s local
(running, 32m total)
(running, 34m total)
(02:32:51) [62 / 79] 10 / 17 tests; Testing //python/ray/util/sgd:test_torch_failure; 100s local
(running, 36m total)
FLAKY: //python/ray/util/sgd:test_torch_failure (Summary)
      /home/travis/.cache/bazel/_bazel_travis/b88c129a127452fc94033a29d9f90e20/execroot/com_github_ray_project_ray/bazel-out/k8-opt/testlogs/python/ray/util/sgd/test_torch_failure/test_attempts/attempt_1.log
(02:33:52) INFO: From Testing //python/ray/util/sgd:test_torch_failure:
==================== Test output for //python/ray/util/sgd:test_torch_failure:
============================= test session starts ==============================
platform linux -- Python 3.6.10, pytest-5.4.3, py-1.9.0, pluggy-0.13.1 -- /opt/miniconda/bin/python3
cachedir: .pytest_cache
rootdir: /home/travis/.cache/bazel/_bazel_travis/b88c129a127452fc94033a29d9f90e20/execroot/com_github_ray_project_ray/bazel-out/k8-opt/bin/python/ray/util/sgd/test_torch_failure.runfiles/com_github_ray_project_ray
plugins: rerunfailures-9.1.1, sugar-0.9.4, timeout-1.4.2, remotedata-0.3.2, asyncio-0.14.0
collecting ... collected 6 items
::test_resize[False] PASSED                                              [ 16%]
::test_resize[True] PASSED                                               [ 33%]
::test_fail_twice[False] PASSED                                          [ 50%]
::test_fail_twice[True] PASSED                                           [ 66%]
::test_fail_with_recover[False] PASSED                                   [ 83%]
::test_fail_with_recover[True] -- Test timed out at 2020-11-04 02:30:55 UTC --
================================================================================
(running, 38m total)
(02:36:08) INFO: Elapsed time: 2330.514s, Critical Path: 1077.41s
(02:36:08) INFO: 36 processes: 36 local.
(02:36:08) INFO: Build completed successfully, 69 total actions
//python/ray/util/sgd:benchmark                                          PASSED in 16.7s
//python/ray/util/sgd:cifar_pytorch_example_1                            PASSED in 40.7s
//python/ray/util/sgd:cifar_pytorch_example_2                            PASSED in 40.6s
//python/ray/util/sgd:cifar_pytorch_pbt                                  PASSED in 93.2s
//python/ray/util/sgd:dcgan                                              PASSED in 16.8s
//python/ray/util/sgd:image_models                                       PASSED in 12.9s
//python/ray/util/sgd:mnist-ptl                                          PASSED in 28.7s
//python/ray/util/sgd:raysgd_torch_signatures                            PASSED in 12.1s
//python/ray/util/sgd:test_ptl                                           PASSED in 150.9s
//python/ray/util/sgd:test_torch                                         PASSED in 704.1s
//python/ray/util/sgd:test_torch_runner                                  PASSED in 2.2s
//python/ray/util/sgd:train_example_1                                    PASSED in 10.8s
//python/ray/util/sgd:train_example_2                                    PASSED in 12.1s
//python/ray/util/sgd:tune_example_1                                     PASSED in 29.2s
//python/ray/util/sgd:tune_example_2                                     PASSED in 40.4s
//python/ray/util/sgd:tune_example_3                                     PASSED in 40.5s
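
To put rough numbers on the comparison, the elapsed times and a few per-target times from the two logs can be divided directly. This is only a back-of-the-envelope sketch: it assumes the first log is the run without the sklearn/GPy imports, and the names `without_imports` and `with_imports` are labels chosen here. The slower run also retried `test_torch_failure` after a timeout, so its total elapsed time reflects more than import overhead; the per-target ratios are the more telling figures.

```python
# Times in seconds, copied from the two bazel logs in this thread.
without_imports = {
    "elapsed": 1085.156,
    "benchmark": 11.7,
    "cifar_pytorch_pbt": 78.6,
    "test_ptl": 110.4,
    "test_torch": 528.9,
    "tune_example_1": 19.4,
}
with_imports = {
    "elapsed": 2330.514,
    "benchmark": 16.7,
    "cifar_pytorch_pbt": 93.2,
    "test_ptl": 150.9,
    "test_torch": 704.1,
    "tune_example_1": 29.2,
}


def slowdown(name):
    """Ratio of the run with the imports to the run without them."""
    return with_imports[name] / without_imports[name]


for name in without_imports:
    print(f"{name:20s} {slowdown(name):.2f}x")
```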
