Skip to content

[RLlib] Add Exploration API documentation.#7373

Merged
ericl merged 4 commits intoray-project:masterfrom
sven1977:exploration_api_add_documentation_to_docs
Mar 2, 2020
Merged

[RLlib] Add Exploration API documentation.#7373
ericl merged 4 commits intoray-project:masterfrom
sven1977:exploration_api_add_documentation_to_docs

Conversation

@sven1977
Copy link
Copy Markdown
Contributor

@sven1977 sven1977 commented Feb 28, 2020

How to use the new Exploration API needs to be explained in our documentation.
This PR adds it under RLlib->Training->Advanced Python APIs.

Related issue number

Checks

@sven1977 sven1977 requested a review from ericl February 28, 2020 11:00
@AmplabJenkins
Copy link
Copy Markdown

Can one of the admins verify this patch?

@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/22555/
Test FAILed.

@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/22556/
Test FAILed.

@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/22558/
Test FAILed.

@ericl ericl self-assigned this Feb 28, 2020

.. image:: custom_metric.png

Customized Exploration Behavior (Training and Evaluation)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add a link in rllib-toc.rst to this too (the TOC has to be manually updated).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

a deterministic action.
timestep (int): The current sampling time step. If None, the
component should try to use an internal counter, which it
then increments by 1. If provided, will set the internal
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this different from timesteps_total / num_steps_sampled?? We should make sure to use the same timestep counter / terminology everywhere to avoid confusion.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The policy always passes self.global_timestep into the call to get_exploration_action. It's the same behavior across all Policy types. This prop is updated in Policy.on_global_var_update -> global_vars["timestep"], so it should always be the total number of sample steps.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added that explanation to the docs as well.

"explore": True,
"exploration_config": {
"type": "EpsilonGreedy", # <- Exploration sub-class by name or full path to module+class
# (e.g. “ray.rllib.utils.exploration.epsilon_greedy.EpsilonGreedy”)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the code block overflow horizontally on my machine, consider trimming it down by a few words.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

The following table lists all existing Exploration sub-classes and where they
are currently used (by default):

.. image:: images/rllib-exploration-api-table.png
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table is informative. Can it be shown right after introducing the exploration config, but before the large blocks of code for Exploration class?

Also,

  • should include a google drawings link as a comment (shared public, view only) so that this can be edited in the future.
  • can you export the table as a SVG?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@sven1977
Copy link
Copy Markdown
Contributor Author

sven1977 commented Feb 29, 2020

@ericl : All requested changes have been made.

@AmplabJenkins
Copy link
Copy Markdown

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/22581/
Test FAILed.

Copy link
Copy Markdown
Contributor

@ericl ericl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.

@ericl ericl merged commit 2d97650 into ray-project:master Mar 2, 2020
@ericl
Copy link
Copy Markdown
Contributor

ericl commented Mar 2, 2020

LINT test looks ok, merging.

Maybe we can skip the C++ build if no source components have changed, it looks like the build failures are contained to src.

@sven1977 sven1977 deleted the exploration_api_add_documentation_to_docs branch March 3, 2020 10:18
ffbin pushed a commit to antgroup/ant-ray that referenced this pull request Mar 20, 2020
* Add Exploration API documentation.

* Add Exploration API documentation.

* Add Exploration API documentation.

* Update exporation docs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants