Refactor train/play and create uv run workflow without dedicated virtual environments#5623
Conversation
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR delivers a well-structured refactoring of the RL train/play scripts into unified entrypoints with library-specific dispatch, while adding a convenient uv run workflow for Newton training. The architecture maintains backward compatibility through deprecation warnings on legacy scripts.
Design Assessment
Architecture: Well-designed ✅
The dispatch pattern in common.py via dispatch_library_entrypoint() cleanly separates concerns:
- Unified entrypoints (
train.py,play.py) handle library selection - Library-specific implementations (
train_rsl_rl.py, etc.) contain RL-specific logic - Shared utilities reduce code duplication across libraries
The pyproject.toml changes for uv support are comprehensive, with proper platform markers for PyTorch CUDA indices.
Findings
🔴 Critical (1)
| File | Issue |
|---|---|
scripts/reinforcement_learning/rsl_rl/train.py:53 |
RSL-RL version downgrade: Changed from RSL_RL_VERSION = "5.0.1" to "3.0.1". This appears to be a significant regression that conflicts with the new train_rsl_rl.py which also specifies 3.0.1. Please clarify if this is intentional and update both files consistently. |
🟡 Medium (2)
| File | Issue |
|---|---|
pyproject.toml:270 |
Missing newline at EOF: The file ends without a trailing newline, which will cause pre-commit to fail (as seen in CI). |
scripts/reinforcement_learning/common.py:57-67 |
Exception handling in dispatch_library_entrypoint: When run_as_script=True, exceptions from runpy.run_path() propagate without cleanup of modified sys.argv/sys.path. Consider wrapping in try/except to ensure cleanup. |
🟢 Minor (2)
| File | Issue |
|---|---|
source/isaaclab/isaaclab/cli/__init__.py:35-36 |
Missing return codes: train() and play() functions call run_python_command() but don't return its exit code to the caller. |
scripts/reinforcement_learning/common.py:168 |
Docstring accuracy: import_local_module docstring could mention that ImportError is raised on failure to help callers handle errors appropriately. |
Test Coverage
The PR includes manual validation steps but lacks automated tests for the new unified entrypoints. Consider adding:
- Unit tests for
dispatch_library_entrypoint()with mocked library modules - Integration tests verifying deprecation warnings are emitted from legacy scripts
Existing tests in test_install.py are run and pass, which validates the Python discovery improvements.
CI Status
| Check | Status |
|---|---|
| pre-commit | ❌ Failing (likely due to missing EOF newline) |
| Check changelog fragments | ❌ Failing |
| Build Wheel | ✅ Passing |
| Installation Tests | ✅ Passing |
| license-check | ⏳ Pending |
Verdict
Minor fixes needed
The core refactoring is solid and well-designed. Please address:
- The RSL-RL version discrepancy (clarify or fix)
- Add trailing newline to
pyproject.toml - Add changelog fragment per CI requirements
Once these are resolved, this is ready to ship.
Update (ba80967):
✅ RSL-RL version fixed - Both files now correctly use 5.0.1
✅ Changelog fragment added - CI now passes
pyproject.toml EOF newline issue persists
Additional changes in this commit:
--rl_libraryis nowrequired=True(good change - explicit is better than implicit)- Added
uv_run.rstdocumentation for the experimental workflow - Updated deprecation warning in
sb3/train.pyto reflect new command format
Remaining: Fix the trailing newline in pyproject.toml to pass pre-commit.
Update (ac13a40):
✅ Documentation improvement: TOC entry now shows "uv run (experimental)" with clearer display name
pyproject.toml EOF newline still not fixed - pre-commit CI continues to fail. This is the last remaining blocker.
Update (9188599):
✅ EOF newline fixed - pyproject.toml now has proper trailing newline. Pre-commit should pass now.
🎉 All issues addressed! This PR is ready to merge.
Update (ca0a704):
New commits add release preparation changes and significant improvements to the LEAPP export flow:
Changes reviewed:
- ✅ LEAPP export.py refactored - The export script now exposes a modular API (
parse_export_args(),export_rsl_rl_agent(),run_export_with_hydra(),main_cli()) instead of module-level initialization. This is a well-designed change that:- Enables test batching (multiple exports per Kit process)
- Defers runtime imports until after Isaac Sim launches
- Improves cleanup with proper try/finally blocks and
contextlib.suppress()
- ✅ Test improvements -
test_rsl_rl_export_flow.pynow batches tasks to reduce CI runtime; pure utility tests (test_noise.py,test_wrench_composer.py) no longer require Isaac Sim - ✅ Import path fixes - Changed
from isaaclab.utils import configclasstofrom isaaclab.utils.configclass import configclassin physics configs (avoids lazy-load issues) - ✅ CI action enhancement - Added LEAPP package resolution logging and
unset HUB__ARGS__DETECT_ONLYfor OmniHub reliability - ✅ Changelog consolidation - Merged changelog.d fragments into CHANGELOG.rst with version bumps (isaaclab 5.2.1, isaaclab_newton 0.9.1, isaaclab_physx 0.7.1)
No new issues found. This is solid release/CI preparation work.
Greptile SummaryThis PR refactors the reinforcement learning train/play scripts into unified entrypoints (
Confidence Score: 3/5The refactoring is well-structured overall, but two defects in the new canonical files will break real workflows before the fix. The
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["./isaaclab.sh train or 'train' console script"] --> B["cli/__init__.py train()"]
B --> C["run_python_command(train.py, args)"]
C --> D["scripts/reinforcement_learning/train.py main()"]
D --> E["dispatch_library_entrypoint() parse --rl_library"]
E -->|run_as_script=False| F["import_local_module() train_lib.py"]
F --> G["module.run(library_args)"]
A2["./isaaclab.sh play or 'play' console script"] --> B2["cli/__init__.py play()"]
B2 --> C2["run_python_command(play.py, args)"]
C2 --> D2["scripts/reinforcement_learning/play.py main()"]
D2 --> E2["dispatch_library_entrypoint() parse --rl_library"]
E2 -->|run_as_script=True| F2["runpy.run_path() play_lib.py"]
|
| import isaaclab_tasks_experimental # noqa: F401 | ||
|
|
||
| RSL_RL_VERSION = "5.0.1" | ||
| RSL_RL_VERSION = "3.0.1" |
There was a problem hiding this comment.
Unintended version downgrade in deprecated script
The PR description says deprecated scripts are kept intact (only a deprecation warning is added), but RSL_RL_VERSION is also changed here from 5.0.1 down to 3.0.1. Users who still invoke this deprecated script will no longer receive a version check error when running an incompatible RSL-RL version.
| RSL_RL_VERSION = "3.0.1" | |
| RSL_RL_VERSION = "5.0.1" |
|
|
||
| def configure_io_descriptors(env_cfg: Any, args_cli: argparse.Namespace, logger: logging.Logger) -> None: | ||
| """Configure IO descriptor export on supported environment configs. | ||
|
|
||
| Args: | ||
| env_cfg: Isaac Lab environment config. | ||
| args_cli: Parsed command-line arguments. | ||
| logger: Logger used for unsupported environment warnings. | ||
| """ | ||
| if isinstance(env_cfg, ManagerBasedRLEnvCfg): |
There was a problem hiding this comment.
Dead code:
args_cli.rl_library is None can never be true
--rl_library is declared with default="rsl_rl", so parse_known_args always produces a non-None value and the if args_cli.rl_library is None branch is unreachable. A user who runs ./isaaclab.sh train without --rl_library silently falls through to rsl_rl instead of seeing the unified help page. Removing the default (or using default=None) would let the help branch trigger.
| warnings.warn( | ||
| "scripts/reinforcement_learning/sb3/train.py is deprecated. Use " | ||
| "`./isaaclab.sh -p scripts/reinforcement_learning/train.py --rl_library sb3 --task <TASK>` instead. " | ||
| "Example: `./isaaclab.sh -p scripts/reinforcement_learning/train.py --rl_library sb3 --task Isaac-Cartpole-v0`.", | ||
| DeprecationWarning, | ||
| stacklevel=1, | ||
| ) |
There was a problem hiding this comment.
Inconsistent deprecation message format
All other per-library train.py deprecation warnings direct users to ./isaaclab.sh train --rl_library <lib>, but this one references the older form ./isaaclab.sh -p scripts/reinforcement_learning/train.py --rl_library sb3. This is confusing for users migrating from sb3.
| warnings.warn( | |
| "scripts/reinforcement_learning/sb3/train.py is deprecated. Use " | |
| "`./isaaclab.sh -p scripts/reinforcement_learning/train.py --rl_library sb3 --task <TASK>` instead. " | |
| "Example: `./isaaclab.sh -p scripts/reinforcement_learning/train.py --rl_library sb3 --task Isaac-Cartpole-v0`.", | |
| DeprecationWarning, | |
| stacklevel=1, | |
| ) | |
| warnings.warn( | |
| "scripts/reinforcement_learning/sb3/train.py is deprecated. Use " | |
| "`./isaaclab.sh train --rl_library sb3 --task <TASK>` instead. " | |
| "Example: `./isaaclab.sh train --rl_library sb3 --task Isaac-Cartpole-v0`.", | |
| DeprecationWarning, | |
| stacklevel=1, | |
| ) |
The typed preset selectors (physics=NAME, renderer=NAME, presets=NAME) from isaac-sim#5587 fail on the unified `scripts/reinforcement_learning/train.py` dispatcher with `Key 'physics' is not in struct`, because the new `train_<library>.py` / `play_<library>.py` scripts introduced by isaac-sim#5623 forward the argparse remainder straight to Hydra without folding the typed tokens into the canonical `presets=<csv>` form. They also don't register the preset-selection help text, so `--help` output doesn't list the syntax. isaac-sim#5623 landed at 2026-05-15 13:03 PDT, isaac-sim#5587 at 14:31 PDT the same day. isaac-sim#5587's diff applied `setup_preset_cli` + `fold_preset_tokens` to the older `*/train.py` / `*/play.py` scripts that existed when the PR was opened; the post-isaac-sim#5623 rebase pulled in the new entrypoint files but didn't apply the same integration because they were not in the PR's diff context. Apply the canonical pattern from `isaaclab_tasks.utils.preset_cli` to the 4 unified train scripts and the 4 unified play scripts. `setup_preset_cli` registers the preset-selection help text on each library's parser and runs `parse_known_args`; `fold_preset_tokens` rewrites the typed selectors into the canonical form post-argparse. Both help-text discovery and the fold work whether invoked through the unified dispatcher or directly. `train_rlinf.py` / `play_rlinf.py` are out of scope -- they manage their own GlobalHydra instance and have never been integrated with the typed-preset CLI. Verified locally on develop: python scripts/reinforcement_learning/train.py --rl_library rsl_rl \ --task=Isaac-Ant-v0 physics=newton_mjwarp --headless reproduces the "Key 'physics' is not in struct" error before the fix and launches Newton training successfully after. Same command via direct invocation (`python scripts/reinforcement_learning/rsl_rl/play_rsl_rl.py ... physics=newton_mjwarp ...`) and `--help` discoverability both verified.
…y scripts (#5715) (#5724) Cherry-pick of #5715 to `release/3.0.0-beta2`. ## 1. Summary - Typed preset selectors (`physics=NAME`, `renderer=NAME`, `presets=NAME`) from #5587 silently fail on the unified `scripts/reinforcement_learning/train.py` dispatcher with `Key 'physics' is not in struct`. - Root cause is merge-ordering between #5587 (preset CLI, 2026-05-15 14:31 PDT) and #5623 (unified train/play refactor, 2026-05-15 13:03 PDT): the new `train_<library>.py` / `play_<library>.py` files were authored without the `setup_preset_cli` + `fold_preset_tokens` integration that #5587 had applied to the older scripts. - Wires the integration into the 4 unified train scripts and the 4 unified play scripts. Adds a regression test (`scripts/reinforcement_learning/test/test_typed_preset_cli_train_play.py`) that drives `physics=does_not_exist` through the unified dispatcher for all 8 entrypoints and asserts the Hydra struct error is absent and the resolver's `Unknown preset(s)` error is present. ## 2. Cherry-picked commits | Commit on `develop` | Cherry-pick | |---|---| | `4e9ebc75595` – Wire preset CLI into unified train/play entrypoints | `debaf15404b` | | `0dc06846b18` – Add subprocess test that typed preset selectors reach the resolver | `c239fbf2e27` | No conflicts; clean cherry-pick. ## 3. Test plan - [x] Repro confirmed before the fix. - [x] All 4 train scripts smoke-tested with `physics=newton_mjwarp` on Isaac-Ant-v0 — each reaches Newton sim init (no Hydra struct error). Both invocation paths (dispatcher + direct) verified. - [x] All 4 play scripts smoke-tested — typed selector resolves; downstream failure is the expected fake-checkpoint `FileNotFoundError`. - [x] `--help` output includes `physics=NAME` / `renderer=NAME` / `presets=NAME[,NAME,...]` with task variant lists. - [x] Invalid value (`physics=does_not_exist`) produces a clear `Unknown preset(s)` error from the resolver. - [x] `scripts/reinforcement_learning/test/test_typed_preset_cli_train_play.py` — 8 parametrized cases pass in ~20 s on CPU. - [x] Pre-commit clean. - [ ] CI green on `Docker + Tests`. ## 4. Out of scope `train_rlinf.py` / `play_rlinf.py` — manage their own `GlobalHydra` instance, never integrated with the typed preset CLI. Adding support is a feature, not a regression fix. ## 5. Related - Develop PR: #5715 - Source PR for the typed preset CLI: #5587 - Source PR for the unified train/play refactor: #5623
Description
This PR refactors the reinforcement learning train/play scripts into unified entry points while preserving the existing library folder structure and adding a lightweight
uvworkflow for fresh source checkouts.The main changes are:
scripts/reinforcement_learning/train.py --library <library>scripts/reinforcement_learning/play.py --library <library>scripts/reinforcement_learning/rsl_rl/train_rsl_rl.pyscripts/reinforcement_learning/rsl_rl/play_rsl_rl.pytrain.pyandplay.pyscripts intact, with deprecation warnings and migration examples.scripts/reinforcement_learning/common.py../isaaclab.sh train --library <library> ..../isaaclab.sh play --library <library> ..../isaaclab.sh -p train.py ...and./isaaclab.sh -p play.py ....train --library <library> ...play --library <library> ...pyproject.tomlproject so a fresh clone can run kitless Newton training with:uv run train --library rsl_rl --task Isaac-Cartpole-Direct-v0 presets=newton_mjwarp --num_envs 4096uvenvironment to the same PyTorch family used by the Isaac Lab installer to avoid CUDA stack churn when users switch betweenuv runand./isaaclab.sh../isaaclab.shprefers an active or repo-local virtual environment before falling back to system Python../isaaclab.sh --installin uv-created virtual environments that do not include thepipmodule by usinguv pipfor venv-targeted pip operations.Motivation:
The previous RL scripts duplicated substantial train/play setup logic across libraries. This made behavior harder to keep consistent and increased maintenance cost when updating shared functionality. The new structure keeps library-specific logic in each library folder while centralizing shared dispatch and common helpers.
The
uvworkflow gives users a fast path from a fresh clone to kitless Newton training without manually creating an environment first. Isaac Sim / Kit workflows, including PhysX, continue to use the existing full installation path.Dependencies:
No new required runtime dependencies are added to Isaac Lab packages. The root source-checkout
pyproject.tomldescribes the local development environment used byuv run.Fixes # N/A
Type of change
Screenshots
Not applicable.
Validation
Ran:
uv lockuv lock --checkuv run --frozen train --help./isaaclab.sh --install./isaaclab.sh -p -m pytest source/isaaclab/test/cli/test_install.py -q./isaaclab.sh -p -m pytest source/isaaclab/test/cli/test_install.py::TestGetPipCommand source/isaaclab/test/cli/test_install.py::TestExtractPythonExe -q./isaaclab.sh -p train.py --help/./isaaclab.sh -p play.py --helpChecklist
pre-commitchecks with./isaaclab.sh --formatCONTRIBUTORS.mdor my name already exists there