Skip to content

Cherry-picks fixes for SKRL, multi-GPU docs, LEAPP imports#5876

Merged
kellyguo11 merged 4 commits into
isaac-sim:release/3.0.0-beta2from
kellyguo11:cherry-pick-0529
May 30, 2026
Merged

Cherry-picks fixes for SKRL, multi-GPU docs, LEAPP imports#5876
kellyguo11 merged 4 commits into
isaac-sim:release/3.0.0-beta2from
kellyguo11:cherry-pick-0529

Conversation

@kellyguo11

Copy link
Copy Markdown
Contributor

AntoineRichard and others added 3 commits May 29, 2026 18:47
Raises the `skrl` optional dependency floor to `>=2.1.0` in both the
`isaaclab-rl` package metadata and wheel-builder metadata. This avoids
installing older `skrl` releases that reference the removed `wp.context`
API when used with `warp-lang` 1.13.

No runtime code changes are included.

Fixes: N/A

- Bug fix (non-breaking change which fixes an issue)

N/A - dependency metadata change.

- [x] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [ ] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [x] I have added a changelog fragment under
`source/<pkg>/changelog.d/` for every touched package (do **not** edit
`CHANGELOG.rst` or bump `extension.toml` — CI handles that)
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there
# Description

Update to docs to highlight how to solve a potential multigpu training
issue (not a bug but a known issue)

## Checklist

- [x] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there
Fixes # (issue)

Following isaac-sim#5633, the CLI path
broke due to an error in import order. That resulted in a failure inside
the simulation setup which would cause a silent failure when running the
CLI. This fix conforms to the new `_load_runtime_dependencies()` pattern
introduced by the previous PR and fixes the runtime for the export. New
export script follows these steps:

1. Parse minimal launcher args.
2. Start AppLauncher / Kit.
3. Import torch, leapp, RSL-RL, task utilities, and Hydra helpers.
4. Resolve task config.
5. Create the env with gym.make().
6. Patch for LEAPP and export.

## Type of change

<!-- As you go through the list, delete the ones that are not
applicable. -->

- Bug fix (non-breaking change which fixes an issue)


## Checklist

- [x] I have read and understood the [contribution
guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html)
- [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with
`./isaaclab.sh --format`
- [x] I have made corresponding changes to the documentation
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my
feature works
- [x] I have updated the changelog and the corresponding version in the
extension's `config/extension.toml` file
- [x] I have added my name to the `CONTRIBUTORS.md` or my name already
exists there
@github-actions github-actions Bot added documentation Improvements or additions to documentation isaac-lab Related to Isaac Lab team infrastructure labels May 30, 2026

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Isaac Lab Review Bot

Summary

This PR cherry-picks three bug fixes from develop to the release/3.0.0-beta2 branch: SKRL version bump to 2.1.0 for warp-lang 1.13 compatibility, LEAPP export import restructuring, and multi-GPU NCCL documentation. The SKRL version bump and documentation changes are correct, but the new test added for SKRL version validation has a compatibility issue with the target branch.

Design Assessment

Partially sound — The core changes (SKRL version bump, LEAPP import deferral, NCCL docs) are well-designed. However, one cherry-picked test assumes a pyproject.toml structure that doesn't exist on the target release branch.

Findings

Detailed findings posted as inline comments below.

Test Coverage

  • Bug fixes: No regression tests needed for version bumps or documentation
  • New test: Adds test for SKRL version consistency, but it won't work on this branch
  • Gaps: The test needs adaptation for the target branch's pyproject.toml structure

CI Status

Checks in progress at time of review.

Verdict

Significant concerns

The new test test_skrl_scripts_min_version_matches_source_package will fail on this branch because it assumes the newer pyproject.toml format from develop (with [project.optional-dependencies]), but release/3.0.0-beta2 uses the legacy setup.py for dependencies. The test function _source_skrl_min_version() will raise KeyError: 'project' when trying to access data["project"]["optional-dependencies"]["skrl"].

Options to resolve:

  1. Skip this test for the release branch cherry-pick
  2. Modify the test to read from setup.py instead for this branch
  3. Backport the pyproject.toml restructuring from develop (likely undesirable for a release branch)

Update (commit 7246084): ✅ Previous critical concern fully addressed.

The _source_skrl_min_version() function has been rewritten to parse setup.py (via AST) instead of pyproject.toml, which is the correct approach for the release/3.0.0-beta2 branch. The tomllib import was also removed. This resolves the KeyError: 'project' issue that would have caused the test to fail.

No new issues found in the incremental changes.

raise AssertionError(f"Could not find string constant {name}.")


def _source_skrl_min_version() -> Version:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Critical: Test incompatible with target branch pyproject.toml structure

This function reads data["project"]["optional-dependencies"]["skrl"] from source/isaaclab_rl/pyproject.toml, but on the target branch release/3.0.0-beta2, that file only contains:

[build-system]
requires = ["setuptools<82.0.0", "wheel", "toml"]
build-backend = "setuptools.build_meta"

There is no [project] section. The skrl dependency is defined in setup.py, not pyproject.toml, on this release branch.

This will cause the test to fail with KeyError: 'project' when run.

Suggested fix: Either skip this test for the release branch, or modify it to read from setup.py:

def _source_skrl_min_version() -> Version:
    # On release branches, skrl dep is in setup.py; on develop it's in pyproject.toml
    pyproject_path = REPO_ROOT / "source/isaaclab_rl/pyproject.toml"
    with pyproject_path.open("rb") as f:
        data = tomllib.load(f)
    
    if "project" in data and "optional-dependencies" in data.get("project", {}):
        # develop branch: pyproject.toml has full project metadata
        for dep in data["project"]["optional-dependencies"].get("skrl", []):
            req = Requirement(dep)
            if req.name == "skrl":
                # ... extract version
    else:
        # release branch: fall back to setup.py
        import ast
        setup_path = REPO_ROOT / "source/isaaclab_rl/setup.py"
        # parse setup.py to find EXTRAS_REQUIRE["skrl"]

Or simply skip this test if the structure doesn't match:

def _source_skrl_min_version() -> Version | None:
    with (REPO_ROOT / "source/isaaclab_rl/pyproject.toml").open("rb") as f:
        data = tomllib.load(f)
    
    try:
        deps = data["project"]["optional-dependencies"]["skrl"]
    except KeyError:
        return None  # Skip on branches without pyproject.toml project section
    # ...

expected_version = _source_skrl_min_version()
skrl_scripts = [
"scripts/reinforcement_learning/skrl/play.py",
"scripts/reinforcement_learning/skrl/play_skrl.py",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Critical: Test will fail due to _source_skrl_min_version() incompatibility

This test calls _source_skrl_min_version() which will raise KeyError on the target branch because the pyproject.toml structure doesn't match what the function expects.

Suggested fix: Add a skip condition:

def test_skrl_scripts_min_version_matches_source_package():
    """skrl runtime guards must match the package metadata lower bound."""
    expected_version = _source_skrl_min_version()
    if expected_version is None:
        pytest.skip("pyproject.toml does not contain skrl version metadata (legacy setup.py branch)")
    # ... rest of test

@@ -176,7 +182,7 @@ def get_actor_memory_module(policy_nn):
return None

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Suggestion: Consider adding type annotation comment

The type hints were removed from device and dtype parameters because torch isn't imported at module level anymore. This is correct, but it might be worth adding a type comment for documentation purposes:

def ensure_actor_hidden_state_initialized(policy_nn, batch_size: int, device, dtype):
    # type: (..., torch.device, torch.dtype) -> ...
    """Initialize and return the actor hidden state when a recurrent policy has not created it yet."""

Alternatively, you could use TYPE_CHECKING to preserve type hints:

from typing import TYPE_CHECKING
if TYPE_CHECKING:
    import torch

def ensure_actor_hidden_state_initialized(
    policy_nn, batch_size: int, device: "torch.device", dtype: "torch.dtype"
):

This is minor — the current approach works fine.

@greptile-apps

greptile-apps Bot commented May 30, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This cherry-pick bundles three fixes from develop: bumping the skrl minimum version to 2.1.0 (for warp-lang 1.13 compatibility), refactoring the LEAPP export script to defer all heavy imports after Isaac Sim launches, and adding multi-GPU NCCL P2P hang documentation.

  • skrl 2.1.0: SKRL_VERSION in all four play/train scripts, the setup.py extras entry, and the wheel-builder manifest are updated; a new regression test ensures scripts and package metadata stay in sync. The wheel-builder entries were also aligned from the old >=1.4.3 floor.
  • LEAPP export refactor: torch, leapp, and annotate are now lazily imported inside _load_runtime_dependencies(); simulation lifecycle is delegated to a launch_simulation context manager, removing the explicit AppLauncher call from main_cli.
  • Multi-GPU docs: New section explains the NCCL_P2P_DISABLE=1 workaround for hangs caused by restricting visible GPUs with CUDA_VISIBLE_DEVICES.

Confidence Score: 3/5

The skrl version bump and documentation changes are safe, but the new regression test will always raise KeyError before performing any assertions.

The new test_skrl_scripts_min_version_matches_source_package test reads pyproject.toml expecting a [project.optional-dependencies.skrl] table, but that file only contains a [build-system] block — the skrl requirement is in setup.py. The KeyError fires every run, breaking CI for this test.

source/isaaclab_tasks/test/test_train_scripts_deterministic.py — _source_skrl_min_version() reads the wrong file for the skrl lower bound.

Important Files Changed

Filename Overview
source/isaaclab_tasks/test/test_train_scripts_deterministic.py Adds test_skrl_scripts_min_version_matches_source_package, but _source_skrl_min_version() reads pyproject.toml which lacks a [project] table, so the test always raises KeyError: 'project'.
scripts/reinforcement_learning/leapp/rsl_rl/export.py Defers torch, leapp, and annotate imports to _load_runtime_dependencies(); moves simulation lifecycle into a launch_simulation context manager; makes simulation_app optional.
source/isaaclab_rl/setup.py Raises the skrl optional-dependency floor from >=2.0.0 to >=2.1.0.
tools/wheel_builder/res/python_packages.toml Aligns wheel-builder skrl entries (previously >=1.4.3) with the new package-metadata floor of >=2.1.0.
docs/source/features/multi_gpu.rst Adds troubleshooting guidance for NCCL hangs when CUDA_VISIBLE_DEVICES restricts visible GPUs; documents NCCL_P2P_DISABLE=1 workaround.
scripts/reinforcement_learning/skrl/play.py Bumps SKRL_VERSION guard from 2.0.0 to 2.1.0.
scripts/reinforcement_learning/skrl/train.py Bumps SKRL_VERSION guard from 2.0.0 to 2.1.0.

Reviews (1): Last reviewed commit: "use lazy runtime imports to import leapp..." | Re-trigger Greptile

Comment on lines +46 to +61
with (REPO_ROOT / "source/isaaclab_rl/pyproject.toml").open("rb") as f:
data = tomllib.load(f)

for dependency in data["project"]["optional-dependencies"]["skrl"]:
requirement = Requirement(dependency)
if requirement.name != "skrl":
continue
lower_bounds = [
Version(specifier.version)
for specifier in requirement.specifier
if specifier.operator in {">=", "==", "~="}
]
if lower_bounds:
return max(lower_bounds)

raise AssertionError("Could not find skrl lower bound in source/isaaclab_rl/pyproject.toml")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 _source_skrl_min_version reads from a pyproject.toml that has no [project] table

source/isaaclab_rl/pyproject.toml contains only a [build-system] section; it does not have a [project] or [project.optional-dependencies] table. The skrl floor bound (skrl>=2.1.0) lives in source/isaaclab_rl/setup.py under EXTRAS_REQUIRE["skrl"]. As written, data["project"] will raise KeyError: 'project' every time test_skrl_scripts_min_version_matches_source_package runs, making the test permanently broken.

@kellyguo11 kellyguo11 merged commit 6292fe3 into isaac-sim:release/3.0.0-beta2 May 30, 2026
59 of 61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation infrastructure isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants