Problem
The agent has run_tests as an internal tool but users can't drive it directly. Test failures during agent work get surfaced as raw cargo test / pytest / npm test output, which is hard to parse in a transcript.
Proposed solution
/test [path] slash command:
- Auto-detect the project's test runner from manifest (
Cargo.toml → cargo, package.json → npm/pnpm/yarn, pyproject.toml → pytest, etc.).
- Run the test command. Stream output live.
- Parse failure output structurally:
- File:line of the assertion
- Expected vs actual values
- Backtrace (folded by default, expandable)
- Render failures as cards (like tool results) with file:line as a clickable opener (Ctrl+Click → open in $EDITOR /
code -g <file>:<line>).
- Summary line at the end: "12 passed, 1 failed, 0 ignored — 4.2s".
Companion: file:line clickability everywhere
Stack traces in tool errors, sandbox denials, etc. all get parsed for path:line patterns and rendered as openable links.
Acceptance criteria
Related
crates/tui/src/tools/run_tests.rs — existing internal tool.
crates/tui/src/commands/ — where the new slash command lives.
Problem
The agent has
run_testsas an internal tool but users can't drive it directly. Test failures during agent work get surfaced as rawcargo test/pytest/npm testoutput, which is hard to parse in a transcript.Proposed solution
/test [path]slash command:Cargo.toml→ cargo,package.json→ npm/pnpm/yarn,pyproject.toml→ pytest, etc.).code -g <file>:<line>).Companion: file:line clickability everywhere
Stack traces in tool errors, sandbox denials, etc. all get parsed for
path:linepatterns and rendered as openable links.Acceptance criteria
/testruns the right command per project type (Rust / Node / Python at minimum).$EDITOR,code -g,idea --line, etc.).Related
crates/tui/src/tools/run_tests.rs— existing internal tool.crates/tui/src/commands/— where the new slash command lives.