Skip to content

Refresh README, add AGENTS.md, add --view flag to benchmarks/run.py#1359

Merged
erikfrey merged 4 commits into
google-deepmind:mainfrom
erikfrey:readme-refresh
May 18, 2026
Merged

Refresh README, add AGENTS.md, add --view flag to benchmarks/run.py#1359
erikfrey merged 4 commits into
google-deepmind:mainfrom
erikfrey:readme-refresh

Conversation

@erikfrey

@erikfrey erikfrey commented May 15, 2026

Copy link
Copy Markdown
Collaborator

Improves the README and first-time user experience.

  • Rewrites the README to lead with a quickstart example and reorganizes sections for clarity.
  • Replaces the compatibility table with a concise exceptions list.
  • Adds an animated examples gallery showcasing benchmark scenes.
  • Adds AGENTS.md with development workflow conventions and a .agent/rules pointer for AI coding assistants.
  • Adds a --view flag to benchmarks/run.py that launches mjwarp-viewer on a benchmark scene with nworld=1, so the examples in the README work out of the box.

Rewrites the README to lead with a quickstart example, reorganizes sections for clarity, replaces the compatibility table with a concise exceptions list, and adds an animated examples gallery showcasing benchmark scenes.

Adds AGENTS.md with development workflow conventions and a .agent/rules pointer for AI coding assistants.

Adds a --view flag to benchmarks/run.py that launches mjwarp-viewer on a benchmark scene with nworld=1, so the examples in the README work out of the box.
@erikfrey erikfrey requested a review from thowell May 15, 2026 00:18
@thowell

thowell commented May 15, 2026

Copy link
Copy Markdown
Collaborator

awesome pr! thanks @erikfrey

Comment thread README.md
Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
- **Integrator**: `IMPLICIT` not yet supported
- **Solver**: `PGS` and `noslip` not yet supported
- **Actuator / Sensors**: `PLUGIN` types not yet supported
- **Flex**: flex-flex collisions, `selfcollide`, `mjEQ_FLEXVERT`, and `mjEQ_FLEXSTRAIN` not yet supported

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wondering if we should just say that flex is experimental right now and that not all features are implemented or optimized yet?
@quagla thoughts?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — simplified to "experimental — not all features are implemented or optimized yet". @quagla please review the wording.

Comment thread README.md Outdated
@thowell

thowell commented May 15, 2026

Copy link
Copy Markdown
Collaborator

left a few comments, additionally:

🐛 Identified Bugs & Structural Issues

1. Broken Exception Logging in _view_benchmark

  • File & Lines: benchmarks/run.py (Lines 223–224)
  • Code Block:
    except subprocess.CalledProcessError as e:
      log.error("Viewer for %s failed:\n%s", name, e.stderr)
  • The Bug: The subprocess.run(...) call on line 178 does not specify capture_output=True or redirect stderr. In Python, if output streams are not captured, the CalledProcessError.stderr attribute is None.
  • Impact: If mjwarp-viewer fails or crashes, the script will print Viewer for <name> failed:\nNone, completely hiding the traceback or terminal output that would diagnose the issue.
  • Suggested Fix: Log the error code instead, or omit e.stderr (since the output is already printed directly to the terminal):
    except subprocess.CalledProcessError as e:
      log.error("Viewer for %s failed with exit code %d", name, e.returncode)

💡 Recommended Potential Improvements

1. Avoid Benchmark "Window Flood" when Viewing

  • File & Lines: benchmarks/run.py (Lines 218–225)
  • The Issue: If a user runs python benchmarks/run.py --view with the default filter .* (or any filter matching multiple scenes), the script iterates over all matched benchmarks and opens them in mjwarp-viewer sequentially. When the user exits one window, the next one immediately launches. This forces the user to close up to 9 windows in a row.
  • Recommendation: If --view is specified and multiple benchmarks match the filter, either raise an error, issue a warning, or default to launching only the first match:
    if _ARGS.view and len(benchmarks) > 1:
      log.warning("Multiple benchmarks matched. Forcing view of the first match: %s", list(benchmarks.keys())[0])
      benchmarks = {list(benchmarks.keys())[0]: list(benchmarks.values())[0]}

2. Check for uv Availability Gracefully

  • File & Lines: benchmarks/run.py (Line 178)
  • The Issue: The script runs subprocess.run(("uv", "run") + tuple(cmd), ...) assuming uv is installed and globally available in the system path. If a developer runs this script without uv installed, it crashes with a generic FileNotFoundError: [Errno 2] No such file or directory: 'uv'.
  • Recommendation: Add a quick check to verify uv is installed, and output a friendly error message instructing how to install it:
    import shutil
    if not shutil.which("uv"):
      log.error("Astral 'uv' is required but was not found in PATH. Please install uv (https://astral.sh/uv) and try again.")
      sys.exit(1)

3. Ensure Onboarding Environment Consistency in README Quickstart

  • File: README.md
  • The Issue: The updated README provides a quickstart command:
    git clone https://github.com/google-deepmind/mujoco_warp.git && cd mujoco_warp
    python benchmarks/run.py -f unitree_g1_flat --view
    If run in a standard environment without uv or any virtual environment activated, the user's system Python may lack required dependencies or crash.
  • Recommendation: Change the quickstart to run with uv run, guaranteeing the on-demand creation and synchronization of the correct virtual environment automatically:
    git clone https://github.com/google-deepmind/mujoco_warp.git && cd mujoco_warp
    uv run python benchmarks/run.py -f unitree_g1_flat --view
    This is extremely robust and fully aligns with the standards defined in AGENTS.md.

🚀 Advanced Onboarding & First-Time Run Enhancements (Taking the PR to the Next Level)

To create a truly world-class onboarding experience, the following enhancements could be incorporated into this PR:

4. Early CUDA & GPU Driver Diagnostics Check

  • File: benchmarks/run.py
  • Concept: Since MuJoCo Warp relies heavily on hardware acceleration via NVIDIA Warp, running it on systems without a compatible NVIDIA GPU or misconfigured CUDA paths will lead to cryptic runtime/C++ crashes.
  • Recommendation: Add a fast, lightweight diagnostic check at script startup to verify if CUDA is initialized and available. If it's missing, fail-fast with highly actionable instructions:
    try:
      import warp as wp
      # Fast check of Warp CUDA availability
      if not wp.is_cuda_available():
         log.warning("NVIDIA GPU / CUDA acceleration is not active in Warp. Simulating on CPU.")
    except ImportError:
      pass

5. CLI Discovered Benchmark Lister (--list)

  • File: benchmarks/run.py
  • Concept: A first-time user may not know what benchmark scenes exist (e.g., unitree_g1_flat, humanoid, cloth). Currently, they have to inspect the source directory structure or look at the README.
  • Recommendation: Add a --list or -l flag to the CLI that lists all discovered benchmarks and their descriptions, so users can explore the available options directly from their terminal:
    $ uv run benchmarks/run.py --list
    Discovered benchmarks:
    - unitree_g1_flat: Rollout of Unitree G1 on flat ground
    - unitree_g1_hfield: Rollout of Unitree G1 on heightfield terrain
    - humanoid: Standard humanoid balancing and walking
    - cloth: Soft body cloth simulation
    ...

6. Interactive Selection Menu

  • File: benchmarks/run.py
  • Concept: If a user runs uv run benchmarks/run.py --view without specifying a filter, instead of flooding them by launching all windows sequentially, we can prompt them with a simple interactive text-based list to select which scene they wish to inspect.
  • Recommendation:
    if _ARGS.view and len(benchmarks) > 1:
        print("Available benchmark scenes:")
        for i, name in enumerate(benchmarks.keys()):
            print(f"[{i}] {name}")
        choice = input("Select a scene to view (default 0): ").strip()
        # Default to first if empty or invalid
        selected_name = list(benchmarks.keys())[int(choice)] if choice.isdigit() else list(benchmarks.keys())[0]
        benchmarks = {selected_name: benchmarks[selected_name]}

- Use high-throughput instead of fast in README intro (thowell)
- Simplify --event_trace flag usage, drop =True (thowell)
- Remove Jacobian format exception, sparse Jacobians now supported (thowell)
- Simplify Flex to experimental status (thowell)
- Move differentiability out of exceptions list, link to google-deepmind#500 (thowell)
- Link implicit integrator exception to PR google-deepmind#1339 (thowell)
- Fix e.stderr logging in _view_benchmark, use e.returncode instead (thowell)
- Warn and show only first match when --view matches multiple benchmarks (thowell)
@erikfrey

Copy link
Copy Markdown
Collaborator Author

Thanks for the thorough review @thowell! Addressed the inline comments and the follow-up in 00ef296:

Applied:

  • "fast" → "high-throughput" in intro
  • Simplified --event_trace flag (dropped =True)
  • Removed Jacobian format exception (sparse Jacobians now supported)
  • Simplified Flex to "experimental" status — @quagla please review the wording
  • Moved differentiability out of exceptions list into its own paragraph, linked to Differentiability #500
  • Linked implicit integrator exception to implicit integrator #1339
  • Fixed e.stderr bug in _view_benchmark — now logs e.returncode instead (stderr is None since we don't capture output)
  • Added window flood protection: --view now warns and shows only the first match when multiple benchmarks match the filter

Not applying (intentionally):

  • uv availability check: pre-existing issue across all of run.py, not specific to this PR. The FileNotFoundError is self-explanatory.
  • uv run python in quickstart: run.py is intentionally designed to use only stdlib imports so it works with bare system Python. It delegates heavy lifting to mjwarp-testspeed/mjwarp-viewer via uv run subprocess calls.
  • CUDA diagnostics, --list flag, interactive menu: great ideas for follow-up PRs.

erikfrey added 2 commits May 15, 2026 11:58
Restructure --view to avoid loop+break pattern, add empty benchmarks check with helpful error message. Unwrap hard-wrapped lines in AGENTS.md and soften PR body guidelines.
@erikfrey erikfrey merged commit 19d4a45 into google-deepmind:main May 18, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants