fix: improve CI reliability and hook coverage by FlorianBruniaux · Pull Request #95 · rtk-ai/rtk

FlorianBruniaux · 2026-02-12T21:50:55Z

Summary

Two independent fixes to improve RTK developer experience:

CI Reliability: Prevent Python/Go benchmark sections from being silently skipped
Hook Coverage: Add 8 missing command rewrites to auto-rewrite hooks

Changes

1. CI Benchmark Fixtures (`fb5b442`)

Problem: Python and Go benchmark sections could silently skip if fixture creation failed, hiding real issues.

Solution: Enhanced error handling in benchmark workflow to catch fixture failures early.

Impact: More reliable CI with explicit failures instead of silent skips.

2. Hook Command Coverage (`c335f82`)

Problem: rtk discover revealed ~18.2K tokens wasted on commands RTK supports but hooks don't rewrite.

Solution: Added 8 missing rewrites to rtk-rewrite.sh and rtk-suggest.sh:

cargo check/install/fmt
tree, find, diff
head → rtk read --max-lines (smart transformation)
wget

Impact: ~18.2K additional tokens saved automatically through hook rewrites.

Technical Note: Fixed BSD sed compatibility for head transformation (macOS).
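
As a rough illustration of the `head` rewrite, the sketch below maps `head -N FILE` to an `rtk read` invocation using POSIX character classes, which both GNU and BSD sed accept under `-E`. The function name `rewrite_head` and the argument order of the output (`rtk read FILE --max-lines N`) are assumptions made for illustration; the actual hook code may differ.

```shell
# Hypothetical helper: rewrite "head -N FILE" into an rtk read command.
# [[:space:]] is used instead of \s, which BSD sed (macOS) does not support.
rewrite_head() {
  printf '%s\n' "$1" |
    sed -E 's/^head[[:space:]]+-([0-9]+)[[:space:]]+(.+)$/rtk read \2 --max-lines \1/'
}

rewrite_head 'head -10 Cargo.toml'
```

Commands that do not match the pattern pass through unchanged, so a rule like this can be chained with other rewrites.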

Testing

✅ Hook syntax validation (3/3 files)
✅ Hook transformation tests (9/9 passing)
✅ Build verification: cargo test → 292 passed
✅ Manual testing on sample commands

Verification

# Test hook rewrites
echo '{"tool_input":{"command":"tree ."}}' | bash .claude/hooks/rtk-rewrite.sh
echo '{"tool_input":{"command":"head -10 Cargo.toml"}}' | bash .claude/hooks/rtk-rewrite.sh

# Run tests
cargo test

Problem: `cargo test` shows 24+ summary lines even when all pass. An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes rtk-ai#83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
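
The aggregation idea can be sketched in shell with awk. This is not the Rust `AggregatedTestResult` implementation from the commit; it is a simplified illustration that sums only the "passed" counts, omits the timing and ignored/filtered fields, and falls back to the unmodified input when any suite fails or no summary line is found. The function name `aggregate_cargo_test` is hypothetical.

```shell
# Hypothetical sketch: collapse raw `cargo test` summary lines into one line.
aggregate_cargo_test() {
  awk '
    { lines[NR] = $0 }                     # keep the input for the fallback path
    /^test result: / {
      suites++
      if ($3 != "ok.") failed = 1          # any non-ok suite disables aggregation
      passed += $4                         # field 4 is the "N" in "N passed;"
    }
    END {
      if (failed || suites == 0) {
        for (i = 1; i <= NR; i++) print lines[i]
      } else {
        printf "cargo test: %d passed (%d suite%s)\n", passed, suites, (suites == 1 ? "" : "s")
      }
    }
  '
}

printf '%s\n' \
  'test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.01s' \
  'test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.02s' |
  aggregate_cargo_test
```

Keeping the failure path as a plain passthrough mirrors the stated design goal: detail is dropped only when everything passed.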

…pped

**Problem:** Python and Go benchmark sections were silently skipped in CI because the RTK repository doesn't contain pyproject.toml or go.mod files. The sections only ran when these project files existed.

**Solution:**
1. Create temporary fixtures with minimal project structure:
   - Python: pyproject.toml + sample.py + test_sample.py
   - Go: go.mod + main.go + main_test.go
2. Resolve RTK to absolute path to work after cd into temp dirs
3. Install required tools in CI workflow:
   - Python: ruff, pytest
   - Go: stable version + golangci-lint

**Impact:**
- Python/Go sections now appear in CI benchmark output
- Self-contained fixtures ensure consistent benchmarking
- No dependency on RTK project structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
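
A minimal sketch of the fixture approach described above, assuming a POSIX shell. The helper names `abs_path` and `make_python_fixture` and the file contents are illustrative assumptions; only the file names (pyproject.toml, sample.py, test_sample.py) come from the commit message.

```shell
# Hypothetical helper: resolve a possibly relative path to an absolute one,
# so a binary can still be invoked after cd-ing into a temp directory.
abs_path() {
  d=$(cd "$(dirname "$1")" && pwd) || return 1
  printf '%s/%s\n' "${d%/}" "$(basename "$1")"
}

# Hypothetical helper: create a throwaway Python project and print its path.
make_python_fixture() {
  dir=$(mktemp -d) || return 1
  printf '[project]\nname = "fixture"\nversion = "0.0.0"\n' > "$dir/pyproject.toml"
  printf 'def add(a, b):\n    return a + b\n' > "$dir/sample.py"
  printf 'from sample import add\n\n\ndef test_add():\n    assert add(1, 2) == 3\n' > "$dir/test_sample.py"
  printf '%s\n' "$dir"
}

fixture=$(make_python_fixture)
ls "$fixture"
rm -rf "$fixture"
```

Because the fixture is generated on every run, the benchmark no longer depends on which project files happen to exist in the repository.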

Copilot

Pull request overview

This PR aims to ensure Python and Go benchmark sections are exercised in CI by making scripts/benchmark.sh self-contained (temporary fixtures) and by installing the required tooling in the benchmark workflow.

Changes:

Add temporary Python and Go fixtures to scripts/benchmark.sh so benchmarks don’t depend on repository pyproject.toml / go.mod.
Update .github/workflows/benchmark.yml to install Python (ruff/pytest) and Go (setup-go + golangci-lint) tooling.
Enhance src/cargo_cmd.rs to aggregate multi-suite cargo test summaries into a compact single-line output (with expanded unit tests).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File	Description
`src/cargo_cmd.rs`	Adds aggregated/compact formatting for `cargo test` summaries and new tests.
`scripts/benchmark.sh`	Adds self-contained Python/Go benchmark sections using temporary fixtures and an absolute RTK path.
`.github/workflows/benchmark.yml`	Installs Python + Go tooling needed for the new benchmark sections in CI.


Copilot · 2026-02-12T22:00:21Z

scripts/benchmark.sh

+  PYTHON_FIXTURE=$(mktemp -d)
+  cd "$PYTHON_FIXTURE"



Temp fixture cleanup for the Python section only runs on the normal path. If the script exits early (e.g., Ctrl-C, unexpected error before the final rm -rf), the temp dir will be left behind and the working directory may stay changed. Consider using a trap to always cd back and remove $PYTHON_FIXTURE on EXIT/INT/TERM (or use pushd/popd).
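
One way to get the cleanup behavior this comment asks for is to run each section in a subshell with an EXIT trap. The sketch below is an assumption about how that could look, not code from the PR; `run_in_fixture` is a hypothetical name.

```shell
# Hypothetical helper: run a command inside a throwaway fixture directory.
# The subshell keeps the caller's working directory untouched, and the EXIT
# trap removes the directory on success, failure, or interruption.
run_in_fixture() {
  (
    fixture=$(mktemp -d) || exit 1
    trap 'rm -rf "$fixture"' EXIT
    trap 'exit 130' INT     # route Ctrl-C through the EXIT trap
    trap 'exit 143' TERM    # same for SIGTERM
    cd "$fixture" || exit 1
    "$@"
    exit "$?"
  )
}

run_in_fixture pwd
```

The subshell makes an explicit `cd` back unnecessary, which is the same effect the comment suggests achieving with pushd/popd.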

Copilot · 2026-02-12T22:00:21Z

scripts/benchmark.sh

+if command -v go &> /dev/null && command -v golangci-lint &> /dev/null; then
+  section "go"



The Go benchmark section is still silent when prerequisites are missing (go/golangci-lint); it will simply not appear in output. To fully address "silently skipped" sections, consider adding an else branch that prints an explicit "go skipped" message with the missing tool(s).

Suggested change

-if command -v go &> /dev/null && command -v golangci-lint &> /dev/null; then
-  section "go"
+section "go"
+if ! command -v go &> /dev/null || ! command -v golangci-lint &> /dev/null; then
+  echo "go skipped (missing: go and/or golangci-lint)" >&2
+else

Copilot · 2026-02-12T22:00:22Z

scripts/benchmark.sh

-  section "Go stack"
+if command -v go &> /dev/null && command -v golangci-lint &> /dev/null; then
+  section "go"



Temp fixture cleanup for the Go section only runs on the normal path. If the script exits early (interrupt/unexpected error before the final rm -rf), the temp dir will be left behind and the working directory may stay changed. Consider using a trap to always cd back and remove $GO_FIXTURE on EXIT/INT/TERM (or use pushd/popd).

Suggested change

GO_ORIG_PWD="$(pwd)"

trap 'if [ -n "$GO_FIXTURE" ] && [ -d "$GO_FIXTURE" ]; then cd "$GO_ORIG_PWD" >/dev/null 2>&1 || true; rm -rf "$GO_FIXTURE"; fi' EXIT INT TERM

Copilot · 2026-02-12T22:00:22Z

.github/workflows/benchmark.yml


+      - name: Install Python tools
+        run: |
+          pip install ruff pytest


pip install ruff pytest without version pins makes CI benchmarks non-deterministic (a new ruff/pytest release can change output/behavior and cause flaky diffs). Consider pinning versions (or using a constraints file) to keep benchmark output stable over time.

Suggested change

-          pip install ruff pytest
+          pip install ruff==0.6.9 pytest==8.3.3

Copilot · 2026-02-12T22:00:22Z

.github/workflows/benchmark.yml

+
+      - name: Install Go tools
+        run: |
+          go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest


Installing golangci-lint with @latest is non-deterministic and can break CI unexpectedly when upstream releases. Consider pinning to a specific golangci-lint version (and updating it intentionally) to keep the benchmark workflow stable.

Suggested change

-          go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest
+          go install github.com/golangci/golangci-lint/cmd/golangci-lint@v1.59.1

Copilot · 2026-02-12T22:00:22Z

scripts/benchmark.sh

 # ===================
-if [ -f "pyproject.toml" ] || [ -f "requirements.txt" ] || [ -f "setup.py" ]; then
-  section "Python stack"
+if command -v python3 &> /dev/null && command -v ruff &> /dev/null && command -v pytest &> /dev/null; then


The Python benchmark section is still silent when prerequisites are missing (python3/ruff/pytest); it will simply not appear in output. Since the goal is to prevent silent skips, consider adding an else branch that prints an explicit "python skipped" message with the missing tool(s).

Suggested change

-if command -v python3 &> /dev/null && command -v ruff &> /dev/null && command -v pytest &> /dev/null; then
+# Check Python prerequisites and print an explicit message if any are missing
+missing_python_tools=()
+command -v python3 &> /dev/null || missing_python_tools+=("python3")
+command -v ruff &> /dev/null || missing_python_tools+=("ruff")
+command -v pytest &> /dev/null || missing_python_tools+=("pytest")
+if [ ${#missing_python_tools[@]} -eq 0 ]; then
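
The suggestion above stops before the `else` branch. A portable sketch of the full pattern, assuming a plain POSIX shell without arrays, might look like the following; `missing_tools` is a hypothetical helper and the skip message wording is illustrative.

```shell
# Hypothetical helper: print the space-separated names of tools not on PATH.
missing_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  printf '%s' "${missing# }"   # strip the leading separator
}

python_missing=$(missing_tools python3 ruff pytest)
if [ -z "$python_missing" ]; then
  echo "python prerequisites found"
else
  echo "python skipped (missing: $python_missing)" >&2
fi
```

The same helper could serve the Go section by passing `go golangci-lint`, so both sections report skips in the same format.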

Add 8 missing command rewrites to rtk-rewrite.sh and rtk-suggest.sh:
- cargo check/install/fmt
- tree, find, diff
- head → rtk read (with --max-lines transformation)
- wget

Fixes BSD sed compatibility for head transformation by using literal spaces instead of \s+ (which doesn't work on macOS).

Impact: ~18.2K tokens saved on previously missed commands discovered by `rtk discover`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


Resolved conflicts:
- Version bumped to 0.15.4 (Cargo.toml, Cargo.lock, .release-please-manifest.json)
- CHANGELOG.md: Added upstream releases (0.15.4, 0.15.3, 0.15.2)
- Hooks: Adopted POSIX character classes ([[:space:]]) from upstream
- src/parser/mod.rs: Added multibyte UTF-8 tests from upstream
- src/ruff_cmd.rs: Kept functions public for lint/format dispatcher feature

Upstream changes integrated:
- rtk-ai#120: git status fix for non-repo folders
- rtk-ai#93: UTF-8 panic prevention on multibyte chars
- rtk-ai#98: POSIX grep compatibility in hooks
- rtk-ai#95, rtk-ai#92: CI reliability and hook coverage improvements

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(cargo): aggregate test output into single line (#83)

* feat: add Python and Go support (#88)

  * feat(cargo): aggregate test output into single line (#83)

  * feat: add Python and Go language support

    Implements comprehensive support for Python and Go development tooling with 70-90% token reduction across all commands.

    Python commands (3):
    - rtk ruff: Linter/formatter with JSON (check) and text (format) parsing (80%+)
    - rtk pytest: Test runner with state machine text parser (90%+)
    - rtk pip: Package manager with auto-detect uv (70-85%)

    Go commands (4):
    - rtk go test: NDJSON streaming parser for interleaved test events (90%+)
    - rtk go build: Text filter showing errors only (80%)
    - rtk go vet: Text filter for issues (75%)
    - rtk golangci-lint: JSON parser grouped by rule (85%)

    Architecture:
    - Standalone Python commands (mirror lint/prettier pattern)
    - Go sub-enum (mirror git/cargo pattern)
    - 5 new modules: ruff_cmd, pytest_cmd, pip_cmd, go_cmd, golangci_cmd
    - Hook integration in rtk-rewrite.sh for transparent rewrites
    - Comprehensive tests (47 new tests, all passing)

    Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

  * feat(benchmark): add Python and Go commands

    Add benchmark sections for Python (ruff, pytest, pip) and Go (go test/build/vet, golangci-lint) to validate >80% token savings in CI pipeline. Sections conditionally execute based on project markers (pyproject.toml, go.mod) and tool availability.

    Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

  Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: make install-local.sh self-contained (#89)

  - Build from source automatically instead of requiring a pre-built binary
  - Default install dir to ~/.cargo/bin
  - Skip rebuild when binary is up to date
  - Warn if install dir is not in PATH

* chore(master): release 0.15.0 (#90)

  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(vitest): robust JSON extraction for pnpm/dotenv prefixes (#92)

  * fix(vitest): robust JSON extraction for pnpm/dotenv prefixes

    Problem: RTK's vitest parser forces --reporter=json but pnpm/dotenv prepend non-JSON text to stdout (banners, env messages), causing 100% Tier 1 failure and useless 500-char passthrough.

    Solution:
    - Add extract_json_object() to parser/mod.rs (shared utility)
    - Algorithm: find "numTotalTests" or first standalone {, brace-balance forward
    - VitestParser now tries direct parse → extract+parse → regex → passthrough
    - Replace hardcoded Command::new("pnpm") with package_manager_exec("vitest")
    - Delete orphan doc comment on line 203

    Impact:
    - Before: 100% Tier 3 passthrough with pnpm workflows
    - After: Tier 1 success with prefixes, maintains 99.5% token savings

    Tests:
    - 6 tests for extract_json_object (clean, pnpm, dotenv, nested, no-json, strings)
    - 3 tests for VitestParser with prefixes
    - All 277 tests pass

    Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

  * chore(benchmark): add vitest, pnpm, and gh commands

    Add benchmarks for recently implemented commands:
    - vitest run (PR #92 - JSON extraction fix)
    - pnpm list/outdated (PR #6)
    - gh pr list/run list (existing gh support)

    These commands are now tested in CI to ensure token savings are maintained.

    Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

  Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: improve CI reliability and hook coverage (#95)

  * feat(cargo): aggregate test output into single line (#83)
  * fix(ci): prevent Python/Go benchmark sections from being silently skipped
  * fix(hooks): add missing RTK command rewrites

  Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* chore(master): release 0.15.1 (#96)

  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* feat(python): add lint dispatcher + universal format command

  Phase 1: Enhanced rtk lint
  - Add pylint JSON2 parser (80-85% token savings)
  - Add mypy text parser (75-80% token savings)
  - Smart dispatcher: Python tools (pip) vs JS tools (npm)
  - Reuse ruff_cmd JSON parser for rtk lint ruff

  Phase 2: New rtk format command
  - Universal formatter: black/ruff/prettier
  - Auto-detect from pyproject.toml/package.json
  - Implement black output parser (70-85% savings)
  - Reuse existing prettier/ruff formatters

  Phase 3: Hook integration
  - Auto-rewrite: pylint → rtk lint pylint
  - Auto-rewrite: mypy → rtk lint mypy
  - Auto-rewrite: black --check → rtk format black

  Files changed:
  - src/lint_cmd.rs: +454 lines (pylint/mypy parsers, dispatcher)
  - src/format_cmd.rs: +386 lines (NEW - universal formatter)
  - src/ruff_cmd.rs: Export filter functions as pub
  - src/prettier_cmd.rs: Export filter_prettier_output as pub
  - src/main.rs: Add Commands::Format + routing
  - hooks/rtk-rewrite.sh: Add Python tool rewrite rules

  Testing: 10 new unit tests, all 313 tests passing

  Impact: 80-90% token savings on Python workflows

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Michael Coen <mhcoen@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>


Copilot AI review requested due to automatic review settings February 12, 2026 21:50

Copilot started reviewing on behalf of FlorianBruniaux February 12, 2026 21:51

FlorianBruniaux force-pushed the fix/python-go-benchmark-fixtures branch from 3e02aaa to fb5b442 February 12, 2026 21:56

Copilot AI reviewed Feb 12, 2026

FlorianBruniaux and others added 2 commits February 12, 2026 23:05

Merge branch 'master' into fix/python-go-benchmark-fixtures

7689fcf

FlorianBruniaux changed the title ~~fix(ci): prevent Python/Go benchmark sections from being silently skipped~~ fix: improve CI reliability and hook coverage Feb 12, 2026

pszymkowiak merged commit ac80bfa into rtk-ai:master Feb 12, 2026
2 checks passed

github-actions bot mentioned this pull request Feb 12, 2026

chore(master): release 0.15.1 #96

Merged

FlorianBruniaux deleted the fix/python-go-benchmark-fixtures branch February 13, 2026 07:46

pszymkowiak merged 4 commits into rtk-ai:master from FlorianBruniaux:fix/python-go-benchmark-fixtures


3 participants
