[RLlib] Preparatory PR: Make EnvRunners use (enhanced) Connector API (#01: mostly cleanups and small fixes) #41074
| for agent_id, ob in observations.items(): | ||
| worker = self.workers.local_worker() | ||
| preprocessed = worker.preprocessors[policy_id].transform(ob) | ||
| if worker.preprocessors.get(policy_id) is not None: |
This is a bug fix.
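A minimal sketch of the guard this fix adds, using a hypothetical stand-in for the worker's `preprocessors` mapping. The names `Doubler` and `maybe_preprocess` are illustrative and not part of RLlib, and passing the raw observation through unchanged when no preprocessor exists is an assumption:

```python
from typing import Any, Dict, Optional


class Doubler:
    """Hypothetical preprocessor exposing a `transform()` method."""

    def transform(self, ob: float) -> float:
        return ob * 2


def maybe_preprocess(
    preprocessors: Dict[str, Optional[Doubler]], policy_id: str, ob: Any
) -> Any:
    # Only transform if a preprocessor is actually registered for this policy.
    # Calling `.transform()` unconditionally fails if the entry is None or missing.
    preprocessor = preprocessors.get(policy_id)
    if preprocessor is not None:
        return preprocessor.transform(ob)
    # Assumption: without a preprocessor, the observation is used as-is.
    return ob


preprocessors = {"p0": Doubler(), "p1": None}
processed = {
    pid: maybe_preprocess(preprocessors, pid, 1.5) for pid in ("p0", "p1", "p2")
}
```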
| # If not specified, we will try to auto-detect this. | ||
| self._is_atari = None | ||
|
|
||
| # TODO (sven): Rename this method into `AlgorithmConfig.sampling()` |
Now that we are aiming for the EnvRunner API as the default, we should rename/clarify some of these config settings and methods.

Please consider loading a checkpoint here. Are these renamings backward compatible?

Is there even a story around this? Can people even move from RLlib 2+ to 3?
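One common way to keep a method rename backward compatible is to leave the old name in place as a thin, deprecated alias. The sketch below illustrates only that pattern: `ConfigSketch` and `old_sampling_name()` are hypothetical names, not RLlib's `AlgorithmConfig` or its current method, and it does not address the checkpoint-loading question.

```python
import warnings


class ConfigSketch:
    """Hypothetical config class; not RLlib's `AlgorithmConfig`."""

    def __init__(self) -> None:
        self.settings = {}

    def sampling(self, **kwargs) -> "ConfigSketch":
        # New, preferred method name.
        self.settings.update(kwargs)
        return self

    def old_sampling_name(self, **kwargs) -> "ConfigSketch":
        # Old name kept as an alias so existing user code keeps working,
        # while a warning nudges callers toward the new name.
        warnings.warn(
            "`old_sampling_name()` is deprecated; use `sampling()` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.sampling(**kwargs)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    config = ConfigSketch().old_sampling_name(num_envs=4)
```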
rllib/core/models/torch/encoder.py
Outdated
| bias=config.use_bias, | ||
| ) | ||
|
|
||
| self.state_in_out_spec = { |
Simplified (repetitive) code.

Make this a private attribute?
| return self._getattr_by_index("observations", indices, global_ts) | ||
|
|
||
| def get_actions( | ||
| def get_infos( |
Reordered:
- obs, infos (<- env.reset data)
- action, reward, terminated/truncated (<- other env.step results)
- extra model outs
rllib/env/single_agent_env_runner.py
Outdated
| gym.register( | ||
| "custom-env-v0", | ||
| partial( | ||
| if ( |
| if local_worker and self.local_worker() is not None: | ||
| local_result = [func(self.local_worker())] | ||
|
|
||
| if not self.__worker_manager.actor_ids(): |
Shortcut for local-worker only case.
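A rough sketch of what such a shortcut can look like. `WorkerSetSketch` is a hypothetical stand-in, not RLlib's actual worker set, and the "remote" workers here are plain in-process objects rather than actors:

```python
from typing import Any, Callable, List, Optional


class WorkerSetSketch:
    """Hypothetical stand-in for a worker set; not RLlib's actual class."""

    def __init__(self, local_worker: Optional[Any], remote_workers: List[Any]):
        self._local_worker = local_worker
        self._remote_workers = remote_workers

    def foreach_worker(
        self, func: Callable[[Any], Any], local_worker: bool = True
    ) -> List[Any]:
        local_result = []
        if local_worker and self._local_worker is not None:
            local_result = [func(self._local_worker)]

        # Shortcut: with no remote workers there is nothing to dispatch,
        # so return the local result right away.
        if not self._remote_workers:
            return local_result

        # In this sketch, "remote" workers are simply called synchronously.
        remote_results = [func(w) for w in self._remote_workers]
        return local_result + remote_results


local_only = WorkerSetSketch("local", []).foreach_worker(str.upper)
mixed = WorkerSetSketch("local", ["r1", "r2"]).foreach_worker(str.upper)
```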
| restart_failed_sub_environments: true | ||
|
|
||
| # Switch on evaluation workers being managed by AsyncRequestsManager object. | ||
| # Switch on asynchronous handling of evaluation workers. |
AsyncRequestsManager doesn't exist anymore.
| return input_ | ||
|
|
||
|
|
||
| @DeveloperAPI |
Very useful new utility. Inverse of already existing unbatch utility.
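To illustrate the batch/unbatch relationship, here is a simplified, standard-library-only sketch. It is not the implementation in `space_utils.py`: the function names `batch_sketch` and `unbatch_sketch` are hypothetical, and leaves are collected into plain Python lists instead of arrays.

```python
from typing import Any, List


def batch_sketch(list_of_structs: List[Any]) -> Any:
    """Turns a list of (possibly nested) structs into one struct of lists."""
    first = list_of_structs[0]
    if isinstance(first, dict):
        return {k: batch_sketch([s[k] for s in list_of_structs]) for k in first}
    if isinstance(first, tuple):
        return tuple(
            batch_sketch([s[i] for s in list_of_structs]) for i in range(len(first))
        )
    # Leaf: collect the primitives along a new leading "batch" axis.
    return list(list_of_structs)


def unbatch_sketch(batched: Any) -> List[Any]:
    """Inverse of `batch_sketch`: one struct of lists -> list of structs."""
    if isinstance(batched, dict):
        per_key = {k: unbatch_sketch(v) for k, v in batched.items()}
        size = len(next(iter(per_key.values())))
        return [{k: v[i] for k, v in per_key.items()} for i in range(size)]
    if isinstance(batched, tuple):
        per_slot = [unbatch_sketch(v) for v in batched]
        return [tuple(slot[i] for slot in per_slot) for i in range(len(per_slot[0]))]
    return list(batched)


structs = [{"a": 1, "b": (2, 3.0)}, {"a": 4, "b": (5, 6.0)}]
batched = batch_sketch(structs)
restored = unbatch_sketch(batched)
```

Here `batched` holds one list per leaf component, and `restored` equals the original `structs`, which is the round-trip property the two utilities are meant to have.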
rllib/utils/spaces/space_utils.py
Outdated
|
|
||
|
|
||
| @DeveloperAPI | ||
| def batch(list_of_structs, individual_items_already_have_batch_1: bool = False): |
Data types please (for input and output).

Can we have a unit test for this?

Also enhanced the docstring to make the example and explanations clearer.
| flat = [[] for _ in range(len(flattened_item))] | ||
| for i, value in enumerate(flattened_item): | ||
| flat[i].append(value) | ||
|
|
Add:
if item is None:
    raise ValueError("Input list_of_structs does not contain valid structs.")
| in this struct represents the batch for a single component | ||
| (in case struct is tuple/dict). Alternatively, a simple batch of | ||
| primitives (non tuple/dict) might be returned. | ||
| """ |
Add:
if not list_of_structs:
    raise ValueError("Input list_of_structs is empty.")
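The two suggested checks could be combined roughly as follows. This is a standalone sketch, not the code in `batch()`: `validate_structs` and `error_message` are hypothetical helpers, and where exactly the checks would sit inside the real function is left open.

```python
from typing import Any, List


def validate_structs(list_of_structs: List[Any]) -> None:
    # Reject an empty input up front.
    if not list_of_structs:
        raise ValueError("Input list_of_structs is empty.")
    # Reject inputs that contain missing (None) items.
    for item in list_of_structs:
        if item is None:
            raise ValueError("Input list_of_structs does not contain valid structs.")


def error_message(list_of_structs: List[Any]) -> str:
    """Helper for the usage example: returns the error text, or "" if valid."""
    try:
        validate_structs(list_of_structs)
    except ValueError as e:
        return str(e)
    return ""


empty_msg = error_message([])
none_msg = error_message([{"a": 1}, None])
ok_msg = error_message([{"a": 1}])
```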
Thanks for the review, @kouroshHakha! Waiting for tests to pass ...
Preparatory PR: Make EnvRunners use (enhanced) Connector API (#1: mostly cleanups and small fixes)