Refresh README, add AGENTS.md, add --view flag to benchmarks/run.py by erikfrey · Pull Request #1359 · google-deepmind/mujoco_warp

erikfrey · 2026-05-15T00:03:47Z

Improves the README and first-time user experience.

Rewrites the README to lead with a quickstart example and reorganizes sections for clarity.
Replaces the compatibility table with a concise exceptions list.
Adds an animated examples gallery showcasing benchmark scenes.
Adds AGENTS.md with development workflow conventions and a .agent/rules pointer for AI coding assistants.
Adds a --view flag to benchmarks/run.py that launches mjwarp-viewer on a benchmark scene with nworld=1, so the examples in the README work out of the box.

Rewrites the README to lead with a quickstart example, reorganizes sections for clarity, replaces the compatibility table with a concise exceptions list, and adds an animated examples gallery showcasing benchmark scenes. Adds AGENTS.md with development workflow conventions and a .agent/rules pointer for AI coding assistants. Adds a --view flag to benchmarks/run.py that launches mjwarp-viewer on a benchmark scene with nworld=1, so the examples in the README work out of the box.

thowell · 2026-05-15T08:38:05Z

awesome pr! thanks @erikfrey

thowell · 2026-05-15T09:01:15Z

+- **Integrator**: `IMPLICIT` not yet supported
+- **Solver**: `PGS` and `noslip` not yet supported
+- **Actuator / Sensors**: `PLUGIN` types not yet supported
+- **Flex**: flex-flex collisions, `selfcollide`, `mjEQ_FLEXVERT`, and `mjEQ_FLEXSTRAIN` not yet supported


wondering if we should just say that flex is experimental right now and that not all features are implemented or optimized yet?
@quagla thoughts?

Done — simplified to "experimental — not all features are implemented or optimized yet". @quagla please review the wording.

thowell · 2026-05-15T09:07:38Z

left a few comments, additionally:

🐛 Identified Bugs & Structural Issues

1. Broken Exception Logging in `_view_benchmark`

File & Lines: benchmarks/run.py (Lines 223–224)

Code Block:

except subprocess.CalledProcessError as e:
  log.error("Viewer for %s failed:\n%s", name, e.stderr)

The Bug: The subprocess.run(...) call on line 178 does not specify capture_output=True or redirect stderr. In Python, if output streams are not captured, the CalledProcessError.stderr attribute is None.
Impact: If mjwarp-viewer fails or crashes, the script will print Viewer for <name> failed:\nNone, completely hiding the traceback or terminal output that would diagnose the issue.

Suggested Fix: Log the error code instead, or omit e.stderr (since the output is already printed directly to the terminal):

except subprocess.CalledProcessError as e:
  log.error("Viewer for %s failed with exit code %d", name, e.returncode)

💡 Recommended Potential Improvements

1. Avoid Benchmark "Window Flood" when Viewing

File & Lines: benchmarks/run.py (Lines 218–225)
The Issue: If a user runs python benchmarks/run.py --view with the default filter .* (or any filter matching multiple scenes), the script iterates over all matched benchmarks and opens them in mjwarp-viewer sequentially. When the user exits one window, the next one immediately launches. This forces the user to close up to 9 windows in a row.

Recommendation: If --view is specified and multiple benchmarks match the filter, either raise an error, issue a warning, or default to launching only the first match:

if _ARGS.view and len(benchmarks) > 1:
  log.warning("Multiple benchmarks matched. Forcing view of the first match: %s", list(benchmarks.keys())[0])
  benchmarks = {list(benchmarks.keys())[0]: list(benchmarks.values())[0]}

2. Check for `uv` Availability Gracefully

File & Lines: benchmarks/run.py (Line 178)
The Issue: The script runs subprocess.run(("uv", "run") + tuple(cmd), ...) assuming uv is installed and globally available in the system path. If a developer runs this script without uv installed, it crashes with a generic FileNotFoundError: [Errno 2] No such file or directory: 'uv'.

Recommendation: Add a quick check to verify uv is installed, and output a friendly error message instructing how to install it:

import shutil
if not shutil.which("uv"):
  log.error("Astral 'uv' is required but was not found in PATH. Please install uv (https://astral.sh/uv) and try again.")
  sys.exit(1)

3. Ensure Onboarding Environment Consistency in README Quickstart

File: README.md
The Issue: The updated README provides a quickstart command:
```
git clone https://github.com/google-deepmind/mujoco_warp.git && cd mujoco_warp
python benchmarks/run.py -f unitree_g1_flat --view
```
If run in a standard environment without uv or any virtual environment activated, the user's system Python may lack required dependencies or crash.
Recommendation: Change the quickstart to run with uv run, guaranteeing the on-demand creation and synchronization of the correct virtual environment automatically:
```
git clone https://github.com/google-deepmind/mujoco_warp.git && cd mujoco_warp
uv run python benchmarks/run.py -f unitree_g1_flat --view
```
This is extremely robust and fully aligns with the standards defined in AGENTS.md.

🚀 Advanced Onboarding & First-Time Run Enhancements (Taking the PR to the Next Level)

To create a truly world-class onboarding experience, the following enhancements could be incorporated into this PR:

4. Early CUDA & GPU Driver Diagnostics Check

File: benchmarks/run.py
Concept: Since MuJoCo Warp relies heavily on hardware acceleration via NVIDIA Warp, running it on systems without a compatible NVIDIA GPU or misconfigured CUDA paths will lead to cryptic runtime/C++ crashes.

Recommendation: Add a fast, lightweight diagnostic check at script startup to verify if CUDA is initialized and available. If it's missing, fail-fast with highly actionable instructions:

try:
  import warp as wp
  # Fast check of Warp CUDA availability
  if not wp.is_cuda_available():
     log.warning("NVIDIA GPU / CUDA acceleration is not active in Warp. Simulating on CPU.")
except ImportError:
  pass

5. CLI Discovered Benchmark Lister (`--list`)

File: benchmarks/run.py
Concept: A first-time user may not know what benchmark scenes exist (e.g., unitree_g1_flat, humanoid, cloth). Currently, they have to inspect the source directory structure or look at the README.

Recommendation: Add a --list or -l flag to the CLI that lists all discovered benchmarks and their descriptions, so users can explore the available options directly from their terminal:

$ uv run benchmarks/run.py --list
Discovered benchmarks:
- unitree_g1_flat: Rollout of Unitree G1 on flat ground
- unitree_g1_hfield: Rollout of Unitree G1 on heightfield terrain
- humanoid: Standard humanoid balancing and walking
- cloth: Soft body cloth simulation
...

6. Interactive Selection Menu

File: benchmarks/run.py
Concept: If a user runs uv run benchmarks/run.py --view without specifying a filter, instead of flooding them by launching all windows sequentially, we can prompt them with a simple interactive text-based list to select which scene they wish to inspect.

Recommendation:

if _ARGS.view and len(benchmarks) > 1:
    print("Available benchmark scenes:")
    for i, name in enumerate(benchmarks.keys()):
        print(f"[{i}] {name}")
    choice = input("Select a scene to view (default 0): ").strip()
    # Default to first if empty or invalid
    selected_name = list(benchmarks.keys())[int(choice)] if choice.isdigit() else list(benchmarks.keys())[0]
    benchmarks = {selected_name: benchmarks[selected_name]}

- Use high-throughput instead of fast in README intro (thowell) - Simplify --event_trace flag usage, drop =True (thowell) - Remove Jacobian format exception, sparse Jacobians now supported (thowell) - Simplify Flex to experimental status (thowell) - Move differentiability out of exceptions list, link to google-deepmind#500 (thowell) - Link implicit integrator exception to PR google-deepmind#1339 (thowell) - Fix e.stderr logging in _view_benchmark, use e.returncode instead (thowell) - Warn and show only first match when --view matches multiple benchmarks (thowell)

erikfrey · 2026-05-15T18:53:48Z

Thanks for the thorough review @thowell! Addressed the inline comments and the follow-up in 00ef296:

Applied:

"fast" → "high-throughput" in intro
Simplified --event_trace flag (dropped =True)
Removed Jacobian format exception (sparse Jacobians now supported)
Simplified Flex to "experimental" status — @quagla please review the wording
Moved differentiability out of exceptions list into its own paragraph, linked to Differentiability #500
Linked implicit integrator exception to implicit integrator #1339
Fixed e.stderr bug in _view_benchmark — now logs e.returncode instead (stderr is None since we don't capture output)
Added window flood protection: --view now warns and shows only the first match when multiple benchmarks match the filter

Not applying (intentionally):

uv availability check: pre-existing issue across all of run.py, not specific to this PR. The FileNotFoundError is self-explanatory.
uv run python in quickstart: run.py is intentionally designed to use only stdlib imports so it works with bare system Python. It delegates heavy lifting to mjwarp-testspeed/mjwarp-viewer via uv run subprocess calls.
CUDA diagnostics, --list flag, interactive menu: great ideas for follow-up PRs.

Restructure --view to avoid loop+break pattern, add empty benchmarks check with helpful error message. Unwrap hard-wrapped lines in AGENTS.md and soften PR body guidelines.

erikfrey force-pushed the readme-refresh branch from 5be8291 to bf2fa74 Compare May 15, 2026 00:14

erikfrey force-pushed the readme-refresh branch from bf2fa74 to f6a9949 Compare May 15, 2026 00:15

erikfrey requested a review from thowell May 15, 2026 00:18

thowell reviewed May 15, 2026

View reviewed changes

Comment thread README.md

thowell reviewed May 15, 2026

View reviewed changes

Comment thread README.md Outdated

thowell reviewed May 15, 2026

View reviewed changes

Comment thread README.md Outdated

thowell reviewed May 15, 2026

View reviewed changes

Comment thread README.md Outdated

erikfrey added 2 commits May 15, 2026 11:58

Add PR review etiquette and amend guidelines to AGENTS.md

73ab9c9

Clean up run.py view logic and fix AGENTS.md formatting

0e28c54

Restructure --view to avoid loop+break pattern, add empty benchmarks check with helpful error message. Unwrap hard-wrapped lines in AGENTS.md and soften PR body guidelines.

thowell approved these changes May 18, 2026

View reviewed changes

erikfrey merged commit 19d4a45 into google-deepmind:main May 18, 2026
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refresh README, add AGENTS.md, add --view flag to benchmarks/run.py#1359

Refresh README, add AGENTS.md, add --view flag to benchmarks/run.py#1359
erikfrey merged 4 commits into
google-deepmind:mainfrom
erikfrey:readme-refresh

erikfrey commented May 15, 2026 •

edited

Loading

Uh oh!

thowell commented May 15, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

thowell May 15, 2026

Uh oh!

erikfrey May 15, 2026

Uh oh!

Uh oh!

thowell commented May 15, 2026

Uh oh!

erikfrey commented May 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

erikfrey commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thowell commented May 15, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

thowell May 15, 2026

Choose a reason for hiding this comment

Uh oh!

erikfrey May 15, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

thowell commented May 15, 2026

🐛 Identified Bugs & Structural Issues

1. Broken Exception Logging in _view_benchmark

💡 Recommended Potential Improvements

1. Avoid Benchmark "Window Flood" when Viewing

2. Check for uv Availability Gracefully

3. Ensure Onboarding Environment Consistency in README Quickstart

🚀 Advanced Onboarding & First-Time Run Enhancements (Taking the PR to the Next Level)

4. Early CUDA & GPU Driver Diagnostics Check

5. CLI Discovered Benchmark Lister (--list)

6. Interactive Selection Menu

Uh oh!

erikfrey commented May 15, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

erikfrey commented May 15, 2026 •

edited

Loading

1. Broken Exception Logging in `_view_benchmark`

2. Check for `uv` Availability Gracefully

5. CLI Discovered Benchmark Lister (`--list`)