feat(cargo): aggregate test output into single line #85
pszymkowiak merged 1 commit into rtk-ai:master
Problem: `cargo test` shows 24+ summary lines even when all pass. An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes rtk-ai#83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Pull request overview
This PR implements compact aggregation of cargo test output to make it more suitable for LLM consumption. Instead of showing 24+ identical "ok" summary lines when all tests pass, it now shows a single aggregated line. The implementation is clean, well-tested, and includes graceful fallback behavior.
Changes:
- Added `AggregatedTestResult` struct to parse and aggregate test summary lines across multiple test suites
- Modified `filter_cargo_test` function to use aggregation when all tests pass, preserving detailed output when failures occur
- Added 6 new comprehensive tests and updated 1 existing test to validate the new behavior
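The behavior summarized above (aggregate only when every suite passes, otherwise keep the detailed output) could be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SuiteSummary` and `compact_line` are hypothetical names, and returning `None` for an empty input is an assumption.

```rust
/// Hypothetical, simplified view of one parsed `test result:` line.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SuiteSummary {
    pub passed: u32,
    pub failed: u32,
}

/// Returns the single compact line when every suite passed, or `None`
/// to signal that the caller should fall back to the detailed output.
/// (Treating an empty slice as "fall back" is an assumption.)
pub fn compact_line(suites: &[SuiteSummary]) -> Option<String> {
    if suites.is_empty() || suites.iter().any(|s| s.failed > 0) {
        return None;
    }
    let passed: u32 = suites.iter().map(|s| s.passed).sum();
    // Singular "suite" for exactly one suite, plural otherwise.
    let noun = if suites.len() == 1 { "suite" } else { "suites" };
    Some(format!(
        "✓ cargo test: {} passed ({} {})",
        passed,
        suites.len(),
        noun
    ))
}
```

Returning an `Option` keeps the "no aggregation on failure" rule in one place: the caller prints the compact line if it gets `Some`, and the original output otherwise.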
```rust
if self.ignored > 0 {
    parts.push(format!("{} ignored", self.ignored));
}
```
The `measured` field is parsed and tracked in the struct but never displayed in the compact output format. If benchmark tests are run, the measured count would be silently omitted from the output. Consider adding logic to include measured tests in the output when `self.measured > 0`, similar to how `ignored` and `filtered_out` are handled.
Suggested change:
```diff
- }
+ }
+ if self.measured > 0 {
+     parts.push(format!("{} measured", self.measured));
+ }
```
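To see how the suggestion would affect the final line, here is a minimal sketch of a formatter that appends each non-zero count, including `measured`. The `Totals` struct and `format_compact` function are hypothetical stand-ins rather than the PR's actual types, and the position of `measured` between `ignored` and `filtered out` is an assumption based on where the suggestion was placed.

```rust
/// Hypothetical aggregate of all suites; field names mirror the review snippets.
#[derive(Debug, Default)]
pub struct Totals {
    pub passed: usize,
    pub ignored: usize,
    pub measured: usize,
    pub filtered_out: usize,
    pub suites: usize,
    pub duration_secs: f64,
}

/// Builds the compact line, listing a count only when it is non-zero
/// (`passed` is always shown).
pub fn format_compact(t: &Totals) -> String {
    let mut parts = vec![format!("{} passed", t.passed)];
    if t.ignored > 0 {
        parts.push(format!("{} ignored", t.ignored));
    }
    if t.measured > 0 {
        parts.push(format!("{} measured", t.measured));
    }
    if t.filtered_out > 0 {
        parts.push(format!("{} filtered out", t.filtered_out));
    }
    let noun = if t.suites == 1 { "suite" } else { "suites" };
    format!(
        "✓ cargo test: {} ({} {}, {:.2}s)",
        parts.join(", "),
        t.suites,
        noun,
        t.duration_secs
    )
}
```

With `measured` left at zero the output matches the format examples in the PR description; a non-zero value simply adds one more comma-separated part.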
```rust
self.filtered_out += other.filtered_out;
self.suites += other.suites;
self.duration_secs += other.duration_secs;
self.has_duration = self.has_duration && other.has_duration;
```
The `has_duration` flag uses AND logic during merge, which means if any single suite lacks duration information, the aggregated result will not display duration even though most suites have it. This could result in losing timing information unnecessarily. Consider using OR logic (`self.has_duration || other.has_duration`) and only including partial timing information in the output, or tracking which suites have duration separately.
Suggested change:
```diff
- self.has_duration = self.has_duration && other.has_duration;
+ self.has_duration = self.has_duration || other.has_duration;
```
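The last alternative in the comment, tracking how many suites reported a duration instead of keeping a single boolean, might look like the sketch below. It is only one possible design under stated assumptions: `Timing`, `timed_suites`, and the "across N timed" wording are all hypothetical and do not appear in the PR.

```rust
/// Hypothetical timing accumulator that counts timed suites instead of
/// collapsing the information into one `has_duration` flag.
#[derive(Debug, Default, Clone, Copy)]
pub struct Timing {
    pub suites: usize,
    pub timed_suites: usize,
    pub duration_secs: f64,
}

impl Timing {
    /// Timing for a single suite; `None` means no "finished in" field was found.
    pub fn one(duration: Option<f64>) -> Self {
        Timing {
            suites: 1,
            timed_suites: usize::from(duration.is_some()),
            duration_secs: duration.unwrap_or(0.0),
        }
    }

    /// Adds another accumulator into this one.
    pub fn merge(&mut self, other: &Timing) {
        self.suites += other.suites;
        self.timed_suites += other.timed_suites;
        self.duration_secs += other.duration_secs;
    }

    /// Renders the parenthesized part of the compact line.
    pub fn label(&self) -> String {
        let noun = if self.suites == 1 { "suite" } else { "suites" };
        if self.timed_suites == 0 {
            // No suite reported a duration: omit timing entirely.
            format!("{} {}", self.suites, noun)
        } else if self.timed_suites == self.suites {
            format!("{} {}, {:.2}s", self.suites, noun, self.duration_secs)
        } else {
            // Partial timing: say how many suites the total covers.
            format!(
                "{} {}, {:.2}s across {} timed",
                self.suites, noun, self.duration_secs, self.timed_suites
            )
        }
    }
}
```

Compared with plain AND or OR, a counter never drops timing that exists and never presents a partial total as if it covered every suite, at the cost of one extra field.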
```rust
regex::Regex::new(
    r"test result: (\w+)\.\s+(\d+) passed;\s+(\d+) failed;\s+(\d+) ignored;\s+(\d+) measured;\s+(\d+) filtered out(?:;\s+finished in ([\d.]+)s)?"
).unwrap()
```
The `regex` crate is used via the fully qualified path `regex::Regex::new()` without an explicit import statement. While this works, it's inconsistent with other files in the codebase (e.g., `src/deps.rs`, `src/filter.rs`, `src/grep_cmd.rs`) which use `use regex::Regex;`. Consider adding `use regex::Regex;` at the top of the file for consistency.
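To make the fields captured by the pattern above concrete, here is a rough std-only approximation that extracts the same values with string splitting. It deliberately swaps the `regex` crate for `str::split_once` so the example has no dependencies, so it is not the PR's parser and is stricter about whitespace than the regex; `Summary` and `parse_summary` are hypothetical names.

```rust
/// Hypothetical parsed form of one `test result:` summary line.
#[derive(Debug, PartialEq)]
pub struct Summary {
    pub status: String,
    pub passed: u32,
    pub failed: u32,
    pub ignored: u32,
    pub measured: u32,
    pub filtered_out: u32,
    /// `None` when the optional "finished in Ns" field is absent.
    pub duration_secs: Option<f64>,
}

/// Parses a summary line, returning `None` for anything that does not match
/// so the caller can fall back to the unfiltered output.
pub fn parse_summary(line: &str) -> Option<Summary> {
    let (_, rest) = line.split_once("test result: ")?;
    let (status, counts) = rest.split_once(". ")?;
    let mut s = Summary {
        status: status.to_string(),
        passed: 0,
        failed: 0,
        ignored: 0,
        measured: 0,
        filtered_out: 0,
        duration_secs: None,
    };
    let mut seen = 0;
    for field in counts.split(';') {
        let field = field.trim();
        if let Some(t) = field.strip_prefix("finished in ") {
            s.duration_secs = Some(t.strip_suffix('s')?.parse().ok()?);
            continue;
        }
        let (n, label) = field.split_once(' ')?;
        let n: u32 = n.parse().ok()?;
        match label {
            "passed" => s.passed = n,
            "failed" => s.failed = n,
            "ignored" => s.ignored = n,
            "measured" => s.measured = n,
            "filtered out" => s.filtered_out = n,
            _ => return None,
        }
        seen += 1;
    }
    // Require all five counters, as the regex does.
    (seen == 5).then_some(s)
}
```

Returning `None` on any mismatch mirrors the "fallback to original behavior if parsing fails" rule from the commit message.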
Fixes #83
Problem
`cargo test` currently shows 24+ summary lines even when all tests pass. For LLM consumption, we only need to know IF something failed, not see 24 identical "ok" lines.

Before (24 lines): `✓ test result: ok. 2 passed; 0 failed; ...` (x24)
After (1 line): `✓ cargo test: 137 passed (24 suites, 1.45s)`

Solution
Add `AggregatedTestResult` struct that:
- Parses test summaries with a regex (`OnceLock` for performance)
- Formats the result as `N passed, M ignored, P filtered out (X suites, Ys)`

Format examples
- `✓ cargo test: 268 passed (1 suite, 0.03s)`
- `✓ cargo test: 63 passed, 5 ignored (2 suites, 0.70s)`
- `✓ cargo test: 0 passed, 268 filtered out (1 suite, 0.00s)`
- `FAILURES (1):\n═══\n[full details preserved]`

Implementation details
- `src/cargo_cmd.rs` (+278 lines)
- `regex` crate
- `OnceLock` for regex compilation

Tests

Edge cases handled
- `--nocapture` flag works correctly
- `cargo test specific_test` shows filtered count

cc @bdarcus - this addresses your request for more compact test output 🎯

Checklist
- `cargo fmt --all`
- `cargo clippy --all-targets`
- `cargo test`