[RLlib] Speedup A3C up to 3x (new `training_iteration` function instead of `execution_plan`) and re-instate Pong learning test. by sven1977 · Pull Request #22126 · ray-project/ray

sven1977 · 2022-02-04T20:36:16Z

This PR:

- Provides a new `training_iteration` function for A3C as an alternative to the existing `execution_plan`.
- Uses the new iteration function by default (`_disable_execution_plan_api=True`).
- Gives a ~3x speedup on `tuned_examples/a3c/pong-a3c.yaml` (16 workers, LSTM+CNN Atari problem).
- As a consequence of this speedup, A3C learns Pong with an LSTM again, so the previously commented-out weekly learning test case (similar to `tuned_examples/a3c/pong-a3c.yaml`) is re-instated.
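
The asynchronous update pattern that A3C relies on can be sketched without Ray. This is a toy illustration under stated assumptions, not the code from this PR: `ToyWorker`, `compute_gradient`, `set_weight`, and the signature of this `training_iteration` are hypothetical, and a thread pool stands in for remote rollout workers.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


class ToyWorker:
    """Hypothetical stand-in for a rollout worker (not an RLlib class)."""

    def __init__(self, worker_id, weight=0.0):
        self.worker_id = worker_id
        self.weight = weight

    def compute_gradient(self):
        # Toy objective (weight - 1)^2, whose gradient is 2 * (weight - 1).
        return 2.0 * (self.weight - 1.0)

    def set_weight(self, weight):
        self.weight = weight


def training_iteration(local_weight, workers, pool, in_flight, lr=0.05):
    """Apply whichever gradients are ready, then re-sync only those workers."""
    # Keep exactly one gradient request in flight per worker.
    busy = {id(w) for w in in_flight.values()}
    for worker in workers:
        if id(worker) not in busy:
            in_flight[pool.submit(worker.compute_gradient)] = worker

    # Block until at least one worker has returned a gradient.
    done, _ = wait(list(in_flight), return_when=FIRST_COMPLETED)
    for future in done:
        worker = in_flight.pop(future)
        # Apply the (possibly stale) gradient to the local weight.
        local_weight -= lr * future.result()
        # Send the updated weight back to that particular worker only.
        worker.set_weight(local_weight)
    return local_weight


workers = [ToyWorker(i) for i in range(4)]
weight = 0.0
in_flight = {}
with ThreadPoolExecutor(max_workers=4) as pool:
    for _ in range(200):
        weight = training_iteration(weight, workers, pool, in_flight)
```

Workers that have not finished keep computing on their older weights, which is the trade-off of the asynchronous scheme: no worker waits for the others, at the cost of somewhat stale gradients.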

Why are these changes needed?

Related issue number

Checks

- I've run scripts/format.sh to lint the changes in this PR.
- I've included any doc changes needed for https://docs.ray.io/en/master/.
- I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
  - Unit tests
  - Release tests
  - This PR is not tested :(

avnishn · 2022-02-04T20:39:11Z

rllib/agents/a3c/a3c.py

+
+            # Synch updated weights back to the particular worker.
+            with self._timers[SYNCH_WORKER_WEIGHTS_TIMER]:
+                weights = local_worker.get_weights(local_worker.get_policies_to_train())
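
The hunk above wraps the weight sync in a timer used as a context manager. A minimal sketch of that pattern follows; the `Timer` class, the key string, and the dict-based "weights" are assumptions for illustration and may differ from RLlib's own timer utility.

```python
import time
from collections import defaultdict


class Timer:
    """Hypothetical minimal timer that accumulates time spent inside `with`."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.total += time.perf_counter() - self._start
        self.count += 1
        return False  # never swallow exceptions

    @property
    def mean(self):
        return self.total / self.count if self.count else 0.0


SYNCH_WORKER_WEIGHTS_TIMER = "synch_weights"  # key name is an assumption
timers = defaultdict(Timer)

local_weights = {"default_policy": [0.1, 0.2]}
worker_weights = {}
for _ in range(3):
    # Everything inside the block is attributed to the weight-sync timer.
    with timers[SYNCH_WORKER_WEIGHTS_TIMER]:
        worker_weights = dict(local_weights)
```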


avnishn · 2022-02-04T20:40:33Z

rllib/agents/a3c/a3c.py

+        if global_vars:
+            local_worker.set_global_vars(global_vars)
+
+        # TODO: If we have processed more than one gradients


So, to be clear, we haven't written to `result` in this PR, right? But we want to, for logging purposes.

sven1977 · 2022-02-05

This is still WIP. I need to add proper compilation of the results dict. The only thing missing is combining the learner stats from all workers that returned something from the `async_parallel_requests` call further above. The current shim implementation returns only the last one.
Let me finish this before merging, of course.
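
One way to combine learner stats from several workers, instead of keeping only the last one, is to average each key over the workers that reported it. This is only a sketch of that idea: `merge_learner_stats` and the stat names are hypothetical, and the PR does not say that averaging is the method it ends up using.

```python
from collections import defaultdict
from statistics import mean


def merge_learner_stats(per_worker_stats):
    """Average each numeric stat over all workers that reported it."""
    collected = defaultdict(list)
    for stats in per_worker_stats:
        for key, value in stats.items():
            collected[key].append(value)
    return {key: mean(values) for key, values in collected.items()}


# Stats from three workers; the third one did not report `vf_loss`.
results = [
    {"policy_loss": 0.5, "vf_loss": 1.0},
    {"policy_loss": 0.25, "vf_loss": 2.0},
    {"policy_loss": 0.75},
]
merged = merge_learner_stats(results)
```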

avnishn · 2022-02-04T20:41:33Z

Merge pending results dict.


sven1977 added 4 commits February 4, 2022 11:37

- 54f608c wip.
- 7ff2576 wip.
- 91a84e3 Merge branch 'master' of https://github.com/ray-project/ray into test_ac
- a9fcc6f wip.

sven1977 requested review from avnishn and gjoliver as code owners February 4, 2022 20:36

avnishn approved these changes Feb 4, 2022


sven1977 added 8 commits February 5, 2022 15:26

- dd2dc18 test.
- 3d3bd00 wip
- e5418e2 Merge branch 'master' of https://github.com/ray-project/ray into test_ac
- 82bd57c wip
- bdac9aa Merge branch 'master' of https://github.com/ray-project/ray into test_ac
- d1758c3 wip
- a7303fa Merge branch 'master' of https://github.com/ray-project/ray into test_ac
- 89eb20e wip

avnishn mentioned this pull request Feb 7, 2022

[WIP, RLlib] Prototype of a3c training iteration fn #22094 (closed)

sven1977 added 5 commits February 8, 2022 10:12

- 261710b wip
- bd01069 LINT.
- a7b51f8 wip
- b03ad9e Merge branch 'master' of https://github.com/ray-project/ray into test_ac
- a3943ea wip

sven1977 merged commit ac3e6ab into ray-project:master Feb 8, 2022

wuisawesome added a commit that referenced this pull request Feb 9, 2022

- 880a4ef Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of `execution_plan`) and re-instate Pong learning test. (#22126)" This reverts commit ac3e6ab.

wuisawesome mentioned this pull request Feb 9, 2022

Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test." #22250 (merged)

wuisawesome added a commit that referenced this pull request Feb 9, 2022

- b122f09 Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of `execution_plan`) and re-instate Pong learning test." (#22250) Reverts #22126 Breaks rllib:tests/test_io

simonsays1980 pushed a commit to simonsays1980/ray that referenced this pull request Feb 27, 2022

- 12f0364 [RLlib] Speedup A3C up to 3x (new training_iteration function instead of `execution_plan`) and re-instate Pong learning test. (ray-project#22126)

sven1977 merged 17 commits into ray-project:master from sven1977:test_ac
