feat(cargo): aggregate test output into single line by FlorianBruniaux · Pull Request #85 · rtk-ai/rtk

FlorianBruniaux · 2026-02-12T17:07:10Z

Fixes #83

Problem

cargo test currently shows 24+ summary lines even when all tests pass. For LLM consumption, we only need to know IF something failed, not see 24 identical "ok" lines.

Before (24 lines):

✓ test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
✓ test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
... (x24)

After (1 line):

✓ cargo test: 137 passed (24 suites, 1.45s)

Solution

Add AggregatedTestResult struct that:

Parses test summary lines with regex (OnceLock for performance)
Merges multiple summaries when all tests pass
Formats compactly: N passed, M ignored, P filtered out (X suites, Ys)
Falls back gracefully if parsing fails
Preserves full details when failures occur (no aggregation)
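
A minimal sketch of the parse-and-merge idea described above. The PR parses with a regex; this version swaps in plain string splitting so it runs with the standard library only. The field names follow those visible in the review excerpts (`ignored`, `filtered_out`, `suites`, `duration_secs`); the method names `parse_line` and `merge` and the exact struct layout are assumptions, not the PR's actual code.

```rust
/// Totals taken from one or more `test result:` summary lines.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct AggregatedTestResult {
    pub passed: u64,
    pub failed: u64,
    pub ignored: u64,
    pub measured: u64,
    pub filtered_out: u64,
    pub suites: u64,
    pub duration_secs: f64,
}

impl AggregatedTestResult {
    /// Parse one summary line. Returns `None` when the line does not have
    /// the expected shape, so the caller can fall back to the raw output.
    /// (Hypothetical helper: string splitting stands in for the PR's regex.)
    pub fn parse_line(line: &str) -> Option<Self> {
        let rest = line.split("test result: ").nth(1)?;
        // Drop the status word ("ok" or "FAILED"), which ends with ". ".
        let (_status, counts) = rest.split_once(". ")?;
        let mut result = Self { suites: 1, ..Self::default() };
        for field in counts.split(';') {
            let field = field.trim();
            if let Some(secs) = field.strip_prefix("finished in ") {
                result.duration_secs = secs.strip_suffix('s')?.parse().ok()?;
                continue;
            }
            let (number, label) = field.split_once(' ')?;
            let n: u64 = number.parse().ok()?;
            match label {
                "passed" => result.passed = n,
                "failed" => result.failed = n,
                "ignored" => result.ignored = n,
                "measured" => result.measured = n,
                "filtered out" => result.filtered_out = n,
                _ => return None,
            }
        }
        Some(result)
    }

    /// Add another suite's totals into this one.
    pub fn merge(&mut self, other: &Self) {
        self.passed += other.passed;
        self.failed += other.failed;
        self.ignored += other.ignored;
        self.measured += other.measured;
        self.filtered_out += other.filtered_out;
        self.suites += other.suites;
        self.duration_secs += other.duration_secs;
    }
}
```

Parsing each summary line into its own value and merging afterwards keeps the fallback decision simple: a single `None` is enough to skip aggregation.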

Format examples

| Case | Output |
| --- | --- |
| All pass | `✓ cargo test: 268 passed (1 suite, 0.03s)` |
| With ignored | `✓ cargo test: 63 passed, 5 ignored (2 suites, 0.70s)` |
| With filtered | `✓ cargo test: 0 passed, 268 filtered out (1 suite, 0.00s)` |
| Failures | `FAILURES (1):\n═══\n[full details preserved]` |
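
The passing rows of the table can be reproduced with a small formatter. This is a sketch under assumptions: `format_compact` is a hypothetical free function (the PR presumably implements this as a method on `AggregatedTestResult`), and the two-decimal duration is inferred from the examples.

```rust
/// Build the one-line summary for an all-pass run.
/// Zero-valued `ignored` and `filtered_out` counts are left out, and
/// "suite" is pluralized only when the count is not 1.
pub fn format_compact(
    passed: u64,
    ignored: u64,
    filtered_out: u64,
    suites: u64,
    duration_secs: f64,
) -> String {
    let mut parts = vec![format!("{} passed", passed)];
    if ignored > 0 {
        parts.push(format!("{} ignored", ignored));
    }
    if filtered_out > 0 {
        parts.push(format!("{} filtered out", filtered_out));
    }
    let suite_word = if suites == 1 { "suite" } else { "suites" };
    format!(
        "✓ cargo test: {} ({} {}, {:.2}s)",
        parts.join(", "),
        suites,
        suite_word,
        duration_secs
    )
}
```

For example, `format_compact(63, 5, 0, 2, 0.70)` yields the "With ignored" row.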

Implementation details

Single file changed: src/cargo_cmd.rs (+278 lines)
No new dependencies: Uses existing regex crate
One-time regex compilation: OnceLock caches the compiled regex so it is built only once
Backward compatible: Fallback to original behavior if regex fails
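
To illustrate what `OnceLock` provides here: the initializer runs at most once and every later call reuses the stored value. The sketch below uses only the standard library, with a `Vec` of labels standing in for the compiled regex; `summary_labels` and `BUILD_COUNT` are hypothetical names introduced for the demonstration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the initializer below actually runs.
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an expensive-to-build value such as a compiled regex.
/// The closure passed to `get_or_init` runs on the first call only.
fn summary_labels() -> &'static Vec<&'static str> {
    static LABELS: OnceLock<Vec<&'static str>> = OnceLock::new();
    LABELS.get_or_init(|| {
        BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
        vec!["passed", "failed", "ignored", "measured", "filtered out"]
    })
}
```

Calling `summary_labels()` repeatedly returns the same reference while `BUILD_COUNT` stays at 1, which is the behavior the PR relies on for its regex.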

Tests

✅ 6 new tests + 1 modified
✅ All 268 tests pass
✅ Covers: multi-suite, failures, zero tests, ignored/filtered, singular/plural, regex fallback

Edge cases handled

✅ --nocapture flag works correctly
✅ cargo test specific_test shows filtered count
✅ Doc-tests + unit tests + integration tests all aggregate
✅ Malformed output falls back gracefully
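
The fallback and failure rules can be sketched as a function that returns `None` whenever aggregation should be skipped. All names here (`try_aggregate`, `parse_passed_failed`, `count_before`) are hypothetical, the parsing uses string search rather than the PR's regex, and the duration is omitted to keep the example short.

```rust
/// Returns the compact line, or `None` to signal that the caller should
/// print the original output unchanged.
fn try_aggregate(output: &str) -> Option<String> {
    let mut passed = 0u64;
    let mut suites = 0u64;
    for line in output.lines().filter(|l| l.contains("test result:")) {
        // A malformed summary line aborts aggregation (fallback).
        let (p, f) = parse_passed_failed(line)?;
        // Any failure keeps the full details (no aggregation).
        if f > 0 {
            return None;
        }
        passed += p;
        suites += 1;
    }
    if suites == 0 {
        return None;
    }
    let suite_word = if suites == 1 { "suite" } else { "suites" };
    Some(format!("✓ cargo test: {} passed ({} {})", passed, suites, suite_word))
}

/// Extract the passed and failed counts from one summary line.
fn parse_passed_failed(line: &str) -> Option<(u64, u64)> {
    let rest = line.split("test result: ").nth(1)?;
    Some((count_before(rest, " passed")?, count_before(rest, " failed")?))
}

/// Parse the run of digits that immediately precedes `label`.
fn count_before(text: &str, label: &str) -> Option<u64> {
    let head = &text[..text.find(label)?];
    head.rsplit(|c: char| !c.is_ascii_digit()).next()?.parse().ok()
}
```

Returning `Option` makes the three non-aggregating cases (failures, malformed lines, no summary lines at all) share a single code path in the caller.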

cc @bdarcus - this addresses your request for more compact test output 🎯

Checklist

Code formatted (cargo fmt --all)
Clippy clean (cargo clippy --all-targets)
All tests pass (cargo test)
Manual testing on RTK project itself
Edge cases verified

Problem: `cargo test` shows 24+ summary lines even when all pass. An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):

```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):

```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes rtk-ai#83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Copilot

Pull request overview

This PR implements compact aggregation of cargo test output to make it more suitable for LLM consumption. Instead of showing 24+ identical "ok" summary lines when all tests pass, it now shows a single aggregated line. The implementation is clean, well-tested, and includes graceful fallback behavior.

Changes:

Added AggregatedTestResult struct to parse and aggregate test summary lines across multiple test suites
Modified filter_cargo_test function to use aggregation when all tests pass, preserving detailed output when failures occur
Added 6 new comprehensive tests and updated 1 existing test to validate the new behavior

Copilot · 2026-02-12T17:11:54Z

src/cargo_cmd.rs

+
+        if self.ignored > 0 {
+            parts.push(format!("{} ignored", self.ignored));
+        }


The measured field is parsed and tracked in the struct but never displayed in the compact output format. If benchmark tests are run, the measured count would be silently omitted from the output. Consider adding logic to include measured tests in the output when self.measured > 0, similar to how ignored and filtered_out are handled.

Suggested change

-        }
+        }
+        if self.measured > 0 {
+            parts.push(format!("{} measured", self.measured));
+        }
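
A sketch of how the summary parts might look with the suggestion applied. `summary_parts` is a hypothetical free function, and placing `measured` between `ignored` and `filtered out` is an assumption based on where the suggested lines sit in the diff.

```rust
/// Collect the comma-separated pieces of the compact summary.
/// Only `passed` is always shown; the other counts appear when non-zero.
fn summary_parts(passed: u64, ignored: u64, measured: u64, filtered_out: u64) -> Vec<String> {
    let mut parts = vec![format!("{} passed", passed)];
    if ignored > 0 {
        parts.push(format!("{} ignored", ignored));
    }
    // Suggested addition: surface benchmark counts instead of dropping them.
    if measured > 0 {
        parts.push(format!("{} measured", measured));
    }
    if filtered_out > 0 {
        parts.push(format!("{} filtered out", filtered_out));
    }
    parts
}
```

With this change a benchmark-only run would read `0 passed, 3 measured` rather than just `0 passed`.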

Copilot · 2026-02-12T17:11:55Z

src/cargo_cmd.rs

+        self.filtered_out += other.filtered_out;
+        self.suites += other.suites;
+        self.duration_secs += other.duration_secs;
+        self.has_duration = self.has_duration && other.has_duration;


The has_duration flag uses AND logic during merge, which means if any single suite lacks duration information, the aggregated result will not display duration even though most suites have it. This could result in losing timing information unnecessarily. Consider using OR logic (self.has_duration || other.has_duration) and only including partial timing information in the output, or tracking which suites have duration separately.

Suggested change

-        self.has_duration = self.has_duration && other.has_duration;
+        self.has_duration = self.has_duration || other.has_duration;
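
The trade-off between the two merge rules can be seen in isolation. In this sketch, `Timing`, `merge_and`, and `merge_or` are hypothetical names; only the `duration_secs` and `has_duration` fields come from the diff.

```rust
/// The timing-related part of an aggregated result.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Timing {
    duration_secs: f64,
    has_duration: bool,
}

/// Current behavior: the total is shown only if every suite reported a time.
fn merge_and(a: Timing, b: Timing) -> Timing {
    Timing {
        duration_secs: a.duration_secs + b.duration_secs,
        has_duration: a.has_duration && b.has_duration,
    }
}

/// Suggested behavior: the total is shown if any suite reported a time,
/// so it may cover only part of the run.
fn merge_or(a: Timing, b: Timing) -> Timing {
    Timing {
        duration_secs: a.duration_secs + b.duration_secs,
        has_duration: a.has_duration || b.has_duration,
    }
}
```

With OR, the displayed duration is a partial sum whenever a suite lacks timing, so it should be read as a lower bound; with AND, the duration is either complete or hidden.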

Copilot · 2026-02-12T17:11:55Z

src/cargo_cmd.rs

+            regex::Regex::new(
+                r"test result: (\w+)\.\s+(\d+) passed;\s+(\d+) failed;\s+(\d+) ignored;\s+(\d+) measured;\s+(\d+) filtered out(?:;\s+finished in ([\d.]+)s)?"
+            ).unwrap()


The regex crate is used via fully qualified path regex::Regex::new() without an explicit import statement. While this works, it's inconsistent with other files in the codebase (e.g., src/deps.rs, src/filter.rs, src/grep_cmd.rs) which use use regex::Regex;. Consider adding use regex::Regex; at the top of the file for consistency.

Copilot AI review requested due to automatic review settings February 12, 2026 17:07

Copilot started reviewing on behalf of FlorianBruniaux February 12, 2026 17:07

FlorianBruniaux requested a review from pszymkowiak February 12, 2026 17:07

Copilot AI reviewed Feb 12, 2026

pszymkowiak merged commit 06b1049 into rtk-ai:master Feb 12, 2026
8 checks passed

github-actions bot mentioned this pull request Feb 12, 2026

chore(master): release 0.15.0 #90

Merged

pszymkowiak merged 1 commit into rtk-ai:master from FlorianBruniaux:feat/cargo-test-aggregate

3 participants
