Refactor train/play and create uv run workflow without dedicated virtual environments by StafaH · Pull Request #5623 · isaac-sim/IsaacLab

StafaH · 2026-05-15T02:35:37Z

Description

This PR refactors the reinforcement learning train/play scripts into unified entry points while preserving the existing library folder structure and adding a lightweight uv workflow for fresh source checkouts.

The main changes are:

Added unified RL entrypoints:
- scripts/reinforcement_learning/train.py --library <library>
- scripts/reinforcement_learning/play.py --library <library>
Added library-specific implementation files under the existing library folders, for example:
- scripts/reinforcement_learning/rsl_rl/train_rsl_rl.py
- scripts/reinforcement_learning/rsl_rl/play_rsl_rl.py
Kept the old per-library train.py and play.py scripts intact, with deprecation warnings and migration examples.
Added shared RL entrypoint utilities in scripts/reinforcement_learning/common.py.
Added direct Isaac Lab CLI commands:
- ./isaaclab.sh train --library <library> ...
- ./isaaclab.sh play --library <library> ...
Kept bare script aliases for ./isaaclab.sh -p train.py ... and ./isaaclab.sh -p play.py ....
Added Python package entry points so installed environments can run:
- train --library <library> ...
- play --library <library> ...
Added a root source-checkout pyproject.toml project so a fresh clone can run kitless Newton training with:
- uv run train --library rsl_rl --task Isaac-Cartpole-Direct-v0 presets=newton_mjwarp --num_envs 4096
Pinned the source-checkout uv environment to the same PyTorch family used by the Isaac Lab installer to avoid CUDA stack churn when users switch between uv run and ./isaaclab.sh.
Updated docs, tests, tools, and pretrained checkpoint helpers to use the unified train/play entrypoints.
Fixed CLI Python discovery so ./isaaclab.sh prefers an active or repo-local virtual environment before falling back to system Python.
Fixed ./isaaclab.sh --install in uv-created virtual environments that do not include the pip module by using uv pip for venv-targeted pip operations.

Motivation:

The previous RL scripts duplicated substantial train/play setup logic across libraries. This made behavior harder to keep consistent and increased maintenance cost when updating shared functionality. The new structure keeps library-specific logic in each library folder while centralizing shared dispatch and common helpers.

The uv workflow gives users a fast path from a fresh clone to kitless Newton training without manually creating an environment first. Isaac Sim / Kit workflows, including PhysX, continue to use the existing full installation path.

Dependencies:

No new required runtime dependencies are added to Isaac Lab packages. The root source-checkout pyproject.toml describes the local development environment used by uv run.

Fixes # N/A

Type of change

New feature (non-breaking change which adds functionality)
Documentation update

Screenshots

Not applicable.

Validation

Ran:

uv lock
uv lock --check
uv run --frozen train --help
./isaaclab.sh --install
./isaaclab.sh -p -m pytest source/isaaclab/test/cli/test_install.py -q
./isaaclab.sh -p -m pytest source/isaaclab/test/cli/test_install.py::TestGetPipCommand source/isaaclab/test/cli/test_install.py::TestExtractPythonExe -q
Help smoke tests for unified train/play entrypoints and ./isaaclab.sh -p train.py --help / ./isaaclab.sh -p play.py --help

Checklist

I have read and understood the contribution guidelines
I have run the pre-commit checks with ./isaaclab.sh --format
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I have updated the changelog
I have added my name to the CONTRIBUTORS.md or my name already exists there

isaaclab-review-bot

🤖 Isaac Lab Review Bot

Summary

This PR delivers a well-structured refactoring of the RL train/play scripts into unified entrypoints with library-specific dispatch, while adding a convenient uv run workflow for Newton training. The architecture maintains backward compatibility through deprecation warnings on legacy scripts.

Design Assessment

Architecture: Well-designed ✅

The dispatch pattern in common.py via dispatch_library_entrypoint() cleanly separates concerns:

Unified entrypoints (train.py, play.py) handle library selection
Library-specific implementations (train_rsl_rl.py, etc.) contain RL-specific logic
Shared utilities reduce code duplication across libraries

The pyproject.toml changes for uv support are comprehensive, with proper platform markers for PyTorch CUDA indices.

Findings

🔴 Critical (1)

File	Issue
`scripts/reinforcement_learning/rsl_rl/train.py:53`	RSL-RL version downgrade: Changed from `RSL_RL_VERSION = "5.0.1"` to `"3.0.1"`. This appears to be a significant regression that conflicts with the new `train_rsl_rl.py` which also specifies `3.0.1`. Please clarify if this is intentional and update both files consistently.

🟡 Medium (2)

File	Issue
`pyproject.toml:270`	Missing newline at EOF: The file ends without a trailing newline, which will cause pre-commit to fail (as seen in CI).
`scripts/reinforcement_learning/common.py:57-67`	Exception handling in `dispatch_library_entrypoint`: When `run_as_script=True`, exceptions from `runpy.run_path()` propagate without cleanup of modified `sys.argv`/`sys.path`. Consider wrapping in try/except to ensure cleanup.

🟢 Minor (2)

File	Issue
`source/isaaclab/isaaclab/cli/__init__.py:35-36`	Missing return codes: `train()` and `play()` functions call `run_python_command()` but don't return its exit code to the caller.
`scripts/reinforcement_learning/common.py:168`	Docstring accuracy: `import_local_module` docstring could mention that `ImportError` is raised on failure to help callers handle errors appropriately.

Test Coverage

The PR includes manual validation steps but lacks automated tests for the new unified entrypoints. Consider adding:

Unit tests for dispatch_library_entrypoint() with mocked library modules
Integration tests verifying deprecation warnings are emitted from legacy scripts

Existing tests in test_install.py are run and pass, which validates the Python discovery improvements.

CI Status

Check	Status
pre-commit	❌ Failing (likely due to missing EOF newline)
Check changelog fragments	❌ Failing
Build Wheel	✅ Passing
Installation Tests	✅ Passing
license-check	⏳ Pending

Verdict

Minor fixes needed

The core refactoring is solid and well-designed. Please address:

The RSL-RL version discrepancy (clarify or fix)
Add trailing newline to pyproject.toml
Add changelog fragment per CI requirements

Once these are resolved, this is ready to ship.

Update (ba80967):
✅ RSL-RL version fixed - Both files now correctly use 5.0.1
✅ Changelog fragment added - CI now passes
⚠️ pre-commit still failing - pyproject.toml EOF newline issue persists

Additional changes in this commit:

--rl_library is now required=True (good change - explicit is better than implicit)
Added uv_run.rst documentation for the experimental workflow
Updated deprecation warning in sb3/train.py to reflect new command format

Remaining: Fix the trailing newline in pyproject.toml to pass pre-commit.

Update (ac13a40):
✅ Documentation improvement: TOC entry now shows "uv run (experimental)" with clearer display name

⚠️ pyproject.toml EOF newline still not fixed - pre-commit CI continues to fail. This is the last remaining blocker.

Update (9188599):
✅ EOF newline fixed - pyproject.toml now has proper trailing newline. Pre-commit should pass now.

🎉 All issues addressed! This PR is ready to merge.

Update (ca0a704):
New commits add release preparation changes and significant improvements to the LEAPP export flow:

Changes reviewed:

✅ LEAPP export.py refactored - The export script now exposes a modular API (parse_export_args(), export_rsl_rl_agent(), run_export_with_hydra(), main_cli()) instead of module-level initialization. This is a well-designed change that:
- Enables test batching (multiple exports per Kit process)
- Defers runtime imports until after Isaac Sim launches
- Improves cleanup with proper try/finally blocks and contextlib.suppress()
✅ Test improvements - test_rsl_rl_export_flow.py now batches tasks to reduce CI runtime; pure utility tests (test_noise.py, test_wrench_composer.py) no longer require Isaac Sim
✅ Import path fixes - Changed from isaaclab.utils import configclass to from isaaclab.utils.configclass import configclass in physics configs (avoids lazy-load issues)
✅ CI action enhancement - Added LEAPP package resolution logging and unset HUB__ARGS__DETECT_ONLY for OmniHub reliability
✅ Changelog consolidation - Merged changelog.d fragments into CHANGELOG.rst with version bumps (isaaclab 5.2.1, isaaclab_newton 0.9.1, isaaclab_physx 0.7.1)

No new issues found. This is solid release/CI preparation work.

greptile-apps · 2026-05-15T02:40:54Z

Greptile Summary

This PR refactors the reinforcement learning train/play scripts into unified entrypoints (scripts/reinforcement_learning/train.py and play.py) that dispatch to library-specific implementations, adds a uv run source-checkout workflow via a new root pyproject.toml, and improves isaaclab.sh Python discovery and pip command handling.

Unified dispatch — new common.py helpers route --rl_library <lib> to per-library train_<lib>.py / play_<lib>.py files; old per-library train.py/play.py emit DeprecationWarning and remain functional.
uv workflow — root pyproject.toml gains a [project] table with pinned PyTorch wheels indexed by CUDA version and platform, enabling uv run train from a fresh clone.
CLI improvements — extract_python_exe now prefers an active or repo-local venv before falling back to kit Python; get_pip_command sets UV_PYTHON for venv-targeted uv pip installs.

Confidence Score: 3/5

The refactoring is well-structured overall, but two defects in the new canonical files will break real workflows before the fix.

The train_rl_games.py file imports distutils.util.strtobool, which is absent from Python 3.12+ — the minimum version this project targets — so any attempt to train with RL-Games via the new unified entrypoint will fail immediately on import. Separately, rsl_rl/train.py has its minimum RSL-RL version silently rolled back from 5.0.1 to 3.0.1 as a side effect of the refactor, bypassing a version guard that was there for a reason. The rest of the PR — the dispatch mechanism, uv workflow, CLI Python discovery, and pip improvements — looks solid and ready.

scripts/reinforcement_learning/rl_games/train_rl_games.py (distutils import) and scripts/reinforcement_learning/rsl_rl/train.py (RSL_RL_VERSION change) need attention before merge.

Important Files Changed

Filename	Overview
scripts/reinforcement_learning/rl_games/train_rl_games.py	New canonical RL-Games training implementation; imports `distutils.util.strtobool` which is removed in Python 3.12+, causing an immediate ImportError.
scripts/reinforcement_learning/rsl_rl/train.py	Deprecated wrapper for RSL-RL training; inadvertently downgrades RSL_RL_VERSION from 5.0.1 to 3.0.1 beyond the stated change of adding only a deprecation warning.
scripts/reinforcement_learning/common.py	New shared dispatch utilities; dead code in `dispatch_library_entrypoint` (the `rl_library is None` help branch never fires due to `default="rsl_rl"`).
scripts/reinforcement_learning/sb3/train.py	Deprecated SB3 training wrapper; deprecation message uses the old verbose form instead of the short `./isaaclab.sh train` form used by all other libraries.
source/isaaclab/isaaclab/cli/utils.py	Improves Python discovery and updates `get_pip_command` to detect venv Python for `uv pip` even when not activated; logic looks correct.
source/isaaclab/isaaclab/cli/init.py	Adds `train`/`play` console-script entry points and CLI shortcuts; early dispatch logic is clean and correct.
pyproject.toml	Adds `[project]` table for the `uv run` source-checkout workflow with pinned PyTorch family and per-platform CUDA index routing; looks well-structured.
source/isaaclab/setup.py	Adds `train`, `play`, and `isaaclab` console scripts as package entry points; straightforward and correct.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["./isaaclab.sh train or 'train' console script"] --> B["cli/__init__.py train()"]
    B --> C["run_python_command(train.py, args)"]
    C --> D["scripts/reinforcement_learning/train.py main()"]
    D --> E["dispatch_library_entrypoint() parse --rl_library"]
    E -->|run_as_script=False| F["import_local_module() train_lib.py"]
    F --> G["module.run(library_args)"]
    A2["./isaaclab.sh play or 'play' console script"] --> B2["cli/__init__.py play()"]
    B2 --> C2["run_python_command(play.py, args)"]
    C2 --> D2["scripts/reinforcement_learning/play.py main()"]
    D2 --> E2["dispatch_library_entrypoint() parse --rl_library"]
    E2 -->|run_as_script=True| F2["runpy.run_path() play_lib.py"]

Comments Outside Diff (1)

scripts/reinforcement_learning/rl_games/train_rl_games.py, line 750 (link)

Python 3.12 incompatibility: distutils removed

distutils was removed from the standard library in Python 3.12 (PEP 594), yet pyproject.toml declares requires-python = ">=3.12". Loading this module will immediately raise ModuleNotFoundError: No module named 'distutils'. A suitable drop-in is a small inline helper — for example type=lambda x: x.lower() in ("1", "true", "yes") — or setuptools' re-export of distutils if setuptools is guaranteed to be present.

_{Reviews (1): Last reviewed commit: "Add train play entrypoints" | Re-trigger Greptile}

greptile-apps · 2026-05-15T02:41:01Z

    import isaaclab_tasks_experimental  # noqa: F401

-RSL_RL_VERSION = "5.0.1"
+RSL_RL_VERSION = "3.0.1"


Unintended version downgrade in deprecated script

The PR description says deprecated scripts are kept intact (only a deprecation warning is added), but RSL_RL_VERSION is also changed here from 5.0.1 down to 3.0.1. Users who still invoke this deprecated script will no longer receive a version check error when running an incompatible RSL-RL version.

Suggested change

RSL_RL_VERSION = "3.0.1"

RSL_RL_VERSION = "5.0.1"

greptile-apps · 2026-05-15T02:41:02Z

+
+def configure_io_descriptors(env_cfg: Any, args_cli: argparse.Namespace, logger: logging.Logger) -> None:
+    """Configure IO descriptor export on supported environment configs.
+
+    Args:
+        env_cfg: Isaac Lab environment config.
+        args_cli: Parsed command-line arguments.
+        logger: Logger used for unsupported environment warnings.
+    """
+    if isinstance(env_cfg, ManagerBasedRLEnvCfg):


Dead code: args_cli.rl_library is None can never be true

--rl_library is declared with default="rsl_rl", so parse_known_args always produces a non-None value and the if args_cli.rl_library is None branch is unreachable. A user who runs ./isaaclab.sh train without --rl_library silently falls through to rsl_rl instead of seeing the unified help page. Removing the default (or using default=None) would let the help branch trigger.

greptile-apps · 2026-05-15T02:41:03Z

+warnings.warn(
+    "scripts/reinforcement_learning/sb3/train.py is deprecated. Use "
+    "`./isaaclab.sh -p scripts/reinforcement_learning/train.py --rl_library sb3 --task <TASK>` instead. "
+    "Example: `./isaaclab.sh -p scripts/reinforcement_learning/train.py --rl_library sb3 --task Isaac-Cartpole-v0`.",
+    DeprecationWarning,
+    stacklevel=1,
+)


Inconsistent deprecation message format

All other per-library train.py deprecation warnings direct users to ./isaaclab.sh train --rl_library <lib>, but this one references the older form ./isaaclab.sh -p scripts/reinforcement_learning/train.py --rl_library sb3. This is confusing for users migrating from sb3.

Suggested change

warnings.warn(

"scripts/reinforcement_learning/sb3/train.py is deprecated. Use "

"`./isaaclab.sh -p scripts/reinforcement_learning/train.py --rl_library sb3 --task <TASK>` instead. "

"Example: `./isaaclab.sh -p scripts/reinforcement_learning/train.py --rl_library sb3 --task Isaac-Cartpole-v0`.",

DeprecationWarning,

stacklevel=1,

)

warnings.warn(

"scripts/reinforcement_learning/sb3/train.py is deprecated. Use "

"`./isaaclab.sh train --rl_library sb3 --task <TASK>` instead. "

"Example: `./isaaclab.sh train --rl_library sb3 --task Isaac-Cartpole-v0`.",

DeprecationWarning,

stacklevel=1,

)

The typed preset selectors (physics=NAME, renderer=NAME, presets=NAME) from isaac-sim#5587 fail on the unified `scripts/reinforcement_learning/train.py` dispatcher with `Key 'physics' is not in struct`, because the new `train_<library>.py` / `play_<library>.py` scripts introduced by isaac-sim#5623 forward the argparse remainder straight to Hydra without folding the typed tokens into the canonical `presets=<csv>` form. They also don't register the preset-selection help text, so `--help` output doesn't list the syntax. isaac-sim#5623 landed at 2026-05-15 13:03 PDT, isaac-sim#5587 at 14:31 PDT the same day. isaac-sim#5587's diff applied `setup_preset_cli` + `fold_preset_tokens` to the older `*/train.py` / `*/play.py` scripts that existed when the PR was opened; the post-isaac-sim#5623 rebase pulled in the new entrypoint files but didn't apply the same integration because they were not in the PR's diff context. Apply the canonical pattern from `isaaclab_tasks.utils.preset_cli` to the 4 unified train scripts and the 4 unified play scripts. `setup_preset_cli` registers the preset-selection help text on each library's parser and runs `parse_known_args`; `fold_preset_tokens` rewrites the typed selectors into the canonical form post-argparse. Both help-text discovery and the fold work whether invoked through the unified dispatcher or directly. `train_rlinf.py` / `play_rlinf.py` are out of scope -- they manage their own GlobalHydra instance and have never been integrated with the typed-preset CLI. Verified locally on develop: python scripts/reinforcement_learning/train.py --rl_library rsl_rl \ --task=Isaac-Ant-v0 physics=newton_mjwarp --headless reproduces the "Key 'physics' is not in struct" error before the fix and launches Newton training successfully after. Same command via direct invocation (`python scripts/reinforcement_learning/rsl_rl/play_rsl_rl.py ... physics=newton_mjwarp ...`) and `--help` discoverability both verified.

…y scripts (#5715) (#5724) Cherry-pick of #5715 to `release/3.0.0-beta2`. ## 1. Summary - Typed preset selectors (`physics=NAME`, `renderer=NAME`, `presets=NAME`) from #5587 silently fail on the unified `scripts/reinforcement_learning/train.py` dispatcher with `Key 'physics' is not in struct`. - Root cause is merge-ordering between #5587 (preset CLI, 2026-05-15 14:31 PDT) and #5623 (unified train/play refactor, 2026-05-15 13:03 PDT): the new `train_<library>.py` / `play_<library>.py` files were authored without the `setup_preset_cli` + `fold_preset_tokens` integration that #5587 had applied to the older scripts. - Wires the integration into the 4 unified train scripts and the 4 unified play scripts. Adds a regression test (`scripts/reinforcement_learning/test/test_typed_preset_cli_train_play.py`) that drives `physics=does_not_exist` through the unified dispatcher for all 8 entrypoints and asserts the Hydra struct error is absent and the resolver's `Unknown preset(s)` error is present. ## 2. Cherry-picked commits | Commit on `develop` | Cherry-pick | |---|---| | `4e9ebc75595` – Wire preset CLI into unified train/play entrypoints | `debaf15404b` | | `0dc06846b18` – Add subprocess test that typed preset selectors reach the resolver | `c239fbf2e27` | No conflicts; clean cherry-pick. ## 3. Test plan - [x] Repro confirmed before the fix. - [x] All 4 train scripts smoke-tested with `physics=newton_mjwarp` on Isaac-Ant-v0 — each reaches Newton sim init (no Hydra struct error). Both invocation paths (dispatcher + direct) verified. - [x] All 4 play scripts smoke-tested — typed selector resolves; downstream failure is the expected fake-checkpoint `FileNotFoundError`. - [x] `--help` output includes `physics=NAME` / `renderer=NAME` / `presets=NAME[,NAME,...]` with task variant lists. - [x] Invalid value (`physics=does_not_exist`) produces a clear `Unknown preset(s)` error from the resolver. - [x] `scripts/reinforcement_learning/test/test_typed_preset_cli_train_play.py` — 8 parametrized cases pass in ~20 s on CPU. - [x] Pre-commit clean. - [ ] CI green on `Docker + Tests`. ## 4. Out of scope `train_rlinf.py` / `play_rlinf.py` — manage their own `GlobalHydra` instance, never integrated with the typed preset CLI. Adding support is a feature, not a regression fix. ## 5. Related - Develop PR: #5715 - Source PR for the typed preset CLI: #5587 - Source PR for the unified train/play refactor: #5623

StafaH added 2 commits May 14, 2026 19:20

Add refactored train/play scripts

b43d300

Add train play entrypoints

5f18709

StafaH requested a review from ooctipus as a code owner May 15, 2026 02:35

github-actions Bot added isaac-lab Related to Isaac Lab team infrastructure labels May 15, 2026

isaaclab-review-bot Bot reviewed May 15, 2026

View reviewed changes

greptile-apps Bot reviewed May 15, 2026

View reviewed changes

Add docs entry and changelog

5753e36

StafaH requested review from Mayankm96, jtigue-bdai and kellyguo11 as code owners May 15, 2026 02:42

github-actions Bot added the documentation Improvements or additions to documentation label May 15, 2026

StafaH added 4 commits May 14, 2026 19:44

Fix rsl_rl version and rl_library arg required

8713eb2

Fix incorrect warning for sb3

ba80967

Add experimental to uv run tab

ac13a40

Fix end of file

9188599

hujc7 mentioned this pull request May 15, 2026

[CI] Set OMP_/OPENBLAS_/MKL_NUM_THREADS=1 in test containers #5625

Closed

3 tasks

kellyguo11 approved these changes May 15, 2026

View reviewed changes

Merge branch 'develop' into mh/uv_train

ca0a704

kellyguo11 added this to Isaac Lab May 15, 2026

kellyguo11 moved this to In review in Isaac Lab May 15, 2026

kellyguo11 moved this from In review to Ready to merge in Isaac Lab May 15, 2026

kellyguo11 merged commit 7015be2 into isaac-sim:develop May 15, 2026
33 of 34 checks passed

github-project-automation Bot moved this from Ready to merge to Done in Isaac Lab May 15, 2026

isaaclab-review-bot Bot mentioned this pull request May 15, 2026

Implements passive fixed tendons with mjwarp #5522

Merged

4 tasks

hujc7 mentioned this pull request May 21, 2026

[PresetCLI] Make physics=/renderer=/presets= work on the unified train/play scripts #5715

Merged

8 tasks

hujc7 mentioned this pull request May 21, 2026

[PresetCLI] Cherry-pick: wire typed preset CLI into unified train/play scripts (#5715) #5724

Merged

8 tasks

StafaH deleted the mh/uv_train branch May 28, 2026 19:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor train/play and create uv run workflow without dedicated virtual environments#5623

Refactor train/play and create uv run workflow without dedicated virtual environments#5623
kellyguo11 merged 8 commits into
isaac-sim:developfrom
StafaH:mh/uv_train

StafaH commented May 15, 2026

Uh oh!

isaaclab-review-bot Bot left a comment •

edited

Loading

Uh oh!

greptile-apps Bot commented May 15, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

greptile-apps Bot May 15, 2026

Uh oh!

greptile-apps Bot May 15, 2026

Uh oh!

greptile-apps Bot May 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

StafaH commented May 15, 2026

Description

Type of change

Screenshots

Validation

Checklist

Uh oh!

isaaclab-review-bot Bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

🤖 Isaac Lab Review Bot

Summary

Design Assessment

Findings

🔴 Critical (1)

🟡 Medium (2)

🟢 Minor (2)

Test Coverage

CI Status

Verdict

Uh oh!

greptile-apps Bot commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Flowchart

Comments Outside Diff (1)

Uh oh!

greptile-apps Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

isaaclab-review-bot Bot left a comment •

edited

Loading

greptile-apps Bot commented May 15, 2026 •

edited

Loading