Skip to content

fix: improve CI reliability and hook coverage#95

Merged
pszymkowiak merged 4 commits intortk-ai:masterfrom
FlorianBruniaux:fix/python-go-benchmark-fixtures
Feb 12, 2026
Merged

fix: improve CI reliability and hook coverage#95
pszymkowiak merged 4 commits intortk-ai:masterfrom
FlorianBruniaux:fix/python-go-benchmark-fixtures

Conversation

@FlorianBruniaux
Copy link
Collaborator

@FlorianBruniaux FlorianBruniaux commented Feb 12, 2026

Summary

Two independent fixes to improve RTK developer experience:

  1. CI Reliability: Prevent Python/Go benchmark sections from being silently skipped
  2. Hook Coverage: Add 8 missing command rewrites to auto-rewrite hooks

Changes

1. CI Benchmark Fixtures (fb5b442)

Problem: Python and Go benchmark sections could silently skip if fixture creation failed, hiding real issues.

Solution: Enhanced error handling in benchmark workflow to catch fixture failures early.

Impact: More reliable CI with explicit failures instead of silent skips.

2. Hook Command Coverage (c335f82)

Problem: rtk discover revealed ~18.2K tokens wasted on commands RTK supports but hooks don't rewrite.

Solution: Added 8 missing rewrites to rtk-rewrite.sh and rtk-suggest.sh:

  • cargo check/install/fmt
  • tree, find, diff
  • headrtk read --max-lines (smart transformation)
  • wget

Impact: ~18.2K additional tokens saved automatically through hook rewrites.

Technical Note: Fixed BSD sed compatibility for head transformation (macOS).

Testing

  • ✅ Hook syntax validation (3/3 files)
  • ✅ Hook transformation tests (9/9 passing)
  • ✅ Build verification: cargo test → 292 passed
  • ✅ Manual testing on sample commands

Verification

# Test hook rewrites
echo '{"tool_input":{"command":"tree ."}}' | bash .claude/hooks/rtk-rewrite.sh
echo '{"tool_input":{"command":"head -10 Cargo.toml"}}' | bash .claude/hooks/rtk-rewrite.sh

# Run tests
cargo test

Problem: `cargo test` shows 24+ summary lines even when all pass.
An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes rtk-ai#83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings February 12, 2026 21:50
…pped

**Problem:**
Python and Go benchmark sections were silently skipped in CI because
the RTK repository doesn't contain pyproject.toml or go.mod files.
The sections only ran when these project files existed.

**Solution:**
1. Create temporary fixtures with minimal project structure:
   - Python: pyproject.toml + sample.py + test_sample.py
   - Go: go.mod + main.go + main_test.go
2. Resolve RTK to absolute path to work after cd into temp dirs
3. Install required tools in CI workflow:
   - Python: ruff, pytest
   - Go: stable version + golangci-lint

**Impact:**
- Python/Go sections now appear in CI benchmark output
- Self-contained fixtures ensure consistent benchmarking
- No dependency on RTK project structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@FlorianBruniaux FlorianBruniaux force-pushed the fix/python-go-benchmark-fixtures branch from 3e02aaa to fb5b442 Compare February 12, 2026 21:56
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to ensure Python and Go benchmark sections are exercised in CI by making scripts/benchmark.sh self-contained (temporary fixtures) and by installing the required tooling in the benchmark workflow.

Changes:

  • Add temporary Python and Go fixtures to scripts/benchmark.sh so benchmarks don’t depend on repository pyproject.toml / go.mod.
  • Update .github/workflows/benchmark.yml to install Python (ruff/pytest) and Go (setup-go + golangci-lint) tooling.
  • Enhance src/cargo_cmd.rs to aggregate multi-suite cargo test summaries into a compact single-line output (with expanded unit tests).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
src/cargo_cmd.rs Adds aggregated/compact formatting for cargo test summaries and new tests.
scripts/benchmark.sh Adds self-contained Python/Go benchmark sections using temporary fixtures and an absolute RTK path.
.github/workflows/benchmark.yml Installs Python + Go tooling needed for the new benchmark sections in CI.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +322 to 324
PYTHON_FIXTURE=$(mktemp -d)
cd "$PYTHON_FIXTURE"

Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temp fixture cleanup for the Python section only runs on the normal path. If the script exits early (e.g., Ctrl-C, unexpected error before the final rm -rf), the temp dir will be left behind and the working directory may stay changed. Consider using a trap to always cd back and remove $PYTHON_FIXTURE on EXIT/INT/TERM (or use pushd/popd).

Copilot uses AI. Check for mistakes.
Comment on lines +376 to 378
if command -v go &> /dev/null && command -v golangci-lint &> /dev/null; then
section "go"

Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Go benchmark section is still silent when prerequisites are missing (go/golangci-lint); it will simply not appear in output. To fully address "silently skipped" sections, consider adding an else branch that prints an explicit "go skipped" message with the missing tool(s).

Suggested change
if command -v go &> /dev/null && command -v golangci-lint &> /dev/null; then
section "go"
section "go"
if ! command -v go &> /dev/null || ! command -v golangci-lint &> /dev/null; then
echo "go skipped (missing: go and/or golangci-lint)" >&2
else

Copilot uses AI. Check for mistakes.
section "Go stack"
if command -v go &> /dev/null && command -v golangci-lint &> /dev/null; then
section "go"

Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Temp fixture cleanup for the Go section only runs on the normal path. If the script exits early (interrupt/unexpected error before the final rm -rf), the temp dir will be left behind and the working directory may stay changed. Consider using a trap to always cd back and remove $GO_FIXTURE on EXIT/INT/TERM (or use pushd/popd).

Suggested change
GO_ORIG_PWD="$(pwd)"
trap 'if [ -n "$GO_FIXTURE" ] && [ -d "$GO_FIXTURE" ]; then cd "$GO_ORIG_PWD" >/dev/null 2>&1 || true; rm -rf "$GO_FIXTURE"; fi' EXIT INT TERM

Copilot uses AI. Check for mistakes.

- name: Install Python tools
run: |
pip install ruff pytest
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pip install ruff pytest without version pins makes CI benchmarks non-deterministic (a new ruff/pytest release can change output/behavior and cause flaky diffs). Consider pinning versions (or using a constraints file) to keep benchmark output stable over time.

Suggested change
pip install ruff pytest
pip install ruff==0.6.9 pytest==8.3.3

Copilot uses AI. Check for mistakes.

- name: Install Go tools
run: |
go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Installing golangci-lint with @latest is non-deterministic and can break CI unexpectedly when upstream releases. Consider pinning to a specific golangci-lint version (and updating it intentionally) to keep the benchmark workflow stable.

Suggested change
go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest
go install github.com/golangci/golangci-lint/cmd/golangci-lint@v1.59.1

Copilot uses AI. Check for mistakes.
# ===================
if [ -f "pyproject.toml" ] || [ -f "requirements.txt" ] || [ -f "setup.py" ]; then
section "Python stack"
if command -v python3 &> /dev/null && command -v ruff &> /dev/null && command -v pytest &> /dev/null; then
Copy link

Copilot AI Feb 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Python benchmark section is still silent when prerequisites are missing (python3/ruff/pytest); it will simply not appear in output. Since the goal is to prevent silent skips, consider adding an else branch that prints an explicit "python skipped" message with the missing tool(s).

Suggested change
if command -v python3 &> /dev/null && command -v ruff &> /dev/null && command -v pytest &> /dev/null; then
# Vérifie les prérequis Python et affiche un message explicite si manquants
missing_python_tools=()
command -v python3 &> /dev/null || missing_python_tools+=("python3")
command -v ruff &> /dev/null || missing_python_tools+=("ruff")
command -v pytest &> /dev/null || missing_python_tools+=("pytest")
if [ ${#missing_python_tools[@]} -eq 0 ]; then

Copilot uses AI. Check for mistakes.
FlorianBruniaux and others added 2 commits February 12, 2026 23:05
Add 8 missing command rewrites to rtk-rewrite.sh and rtk-suggest.sh:
- cargo check/install/fmt
- tree, find, diff
- head → rtk read (with --max-lines transformation)
- wget

Fixes BSD sed compatibility for head transformation by using literal
spaces instead of \s+ (which doesn't work on macOS).

Impact: ~18.2K tokens saved on previously missed commands discovered
by `rtk discover`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@FlorianBruniaux FlorianBruniaux changed the title fix(ci): prevent Python/Go benchmark sections from being silently skipped fix: improve CI reliability and hook coverage Feb 12, 2026
@pszymkowiak pszymkowiak merged commit ac80bfa into rtk-ai:master Feb 12, 2026
2 checks passed
FlorianBruniaux added a commit to FlorianBruniaux/rtk that referenced this pull request Feb 13, 2026
* feat(cargo): aggregate test output into single line (rtk-ai#83)

Problem: `cargo test` shows 24+ summary lines even when all pass.
An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes rtk-ai#83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(ci): prevent Python/Go benchmark sections from being silently skipped

**Problem:**
Python and Go benchmark sections were silently skipped in CI because
the RTK repository doesn't contain pyproject.toml or go.mod files.
The sections only ran when these project files existed.

**Solution:**
1. Create temporary fixtures with minimal project structure:
   - Python: pyproject.toml + sample.py + test_sample.py
   - Go: go.mod + main.go + main_test.go
2. Resolve RTK to absolute path to work after cd into temp dirs
3. Install required tools in CI workflow:
   - Python: ruff, pytest
   - Go: stable version + golangci-lint

**Impact:**
- Python/Go sections now appear in CI benchmark output
- Self-contained fixtures ensure consistent benchmarking
- No dependency on RTK project structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(hooks): add missing RTK command rewrites

Add 8 missing command rewrites to rtk-rewrite.sh and rtk-suggest.sh:
- cargo check/install/fmt
- tree, find, diff
- head → rtk read (with --max-lines transformation)
- wget

Fixes BSD sed compatibility for head transformation by using literal
spaces instead of \s+ (which doesn't work on macOS).

Impact: ~18.2K tokens saved on previously missed commands discovered
by `rtk discover`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
@FlorianBruniaux FlorianBruniaux deleted the fix/python-go-benchmark-fixtures branch February 13, 2026 07:46
FlorianBruniaux added a commit to FlorianBruniaux/rtk that referenced this pull request Feb 14, 2026
Resolved conflicts:
- Version bumped to 0.15.4 (Cargo.toml, Cargo.lock, .release-please-manifest.json)
- CHANGELOG.md: Added upstream releases (0.15.4, 0.15.3, 0.15.2)
- Hooks: Adopted POSIX character classes ([[:space:]]) from upstream
- src/parser/mod.rs: Added multibyte UTF-8 tests from upstream
- src/ruff_cmd.rs: Kept functions public for lint/format dispatcher feature

Upstream changes integrated:
- rtk-ai#120: git status fix for non-repo folders
- rtk-ai#93: UTF-8 panic prevention on multibyte chars
- rtk-ai#98: POSIX grep compatibility in hooks
- rtk-ai#95, rtk-ai#92: CI reliability and hook coverage improvements

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
pszymkowiak pushed a commit that referenced this pull request Feb 14, 2026
* feat(cargo): aggregate test output into single line (#83)

Problem: `cargo test` shows 24+ summary lines even when all pass.
An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes #83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: add Python and Go support (#88)

* feat(cargo): aggregate test output into single line (#83)

Problem: `cargo test` shows 24+ summary lines even when all pass.
An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes #83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: add Python and Go language support

Implements comprehensive support for Python and Go development tooling
with 70-90% token reduction across all commands.

Python commands (3):
- rtk ruff: Linter/formatter with JSON (check) and text (format) parsing (80%+)
- rtk pytest: Test runner with state machine text parser (90%+)
- rtk pip: Package manager with auto-detect uv (70-85%)

Go commands (4):
- rtk go test: NDJSON streaming parser for interleaved test events (90%+)
- rtk go build: Text filter showing errors only (80%)
- rtk go vet: Text filter for issues (75%)
- rtk golangci-lint: JSON parser grouped by rule (85%)

Architecture:
- Standalone Python commands (mirror lint/prettier pattern)
- Go sub-enum (mirror git/cargo pattern)
- 5 new modules: ruff_cmd, pytest_cmd, pip_cmd, go_cmd, golangci_cmd
- Hook integration in rtk-rewrite.sh for transparent rewrites
- Comprehensive tests (47 new tests, all passing)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat(benchmark): add Python and Go commands

Add benchmark sections for Python (ruff, pytest, pip) and Go (go test/build/vet, golangci-lint) to validate >80% token savings in CI pipeline.

Sections conditionally execute based on project markers (pyproject.toml, go.mod) and tool availability.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: make install-local.sh self-contained (#89)

- Build from source automatically instead of requiring a pre-built binary
- Default install dir to ~/.cargo/bin
- Skip rebuild when binary is up to date
- Warn if install dir is not in PATH

* chore(master): release 0.15.0 (#90)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(vitest): robust JSON extraction for pnpm/dotenv prefixes (#92)

* fix(vitest): robust JSON extraction for pnpm/dotenv prefixes

Problem: RTK's vitest parser forces --reporter=json but pnpm/dotenv prepend
non-JSON text to stdout (banners, env messages), causing 100% Tier 1 failure
and useless 500-char passthrough.

Solution:
- Add extract_json_object() to parser/mod.rs (shared utility)
- Algorithm: find "numTotalTests" or first standalone {, brace-balance forward
- VitestParser now tries direct parse → extract+parse → regex → passthrough
- Replace hardcoded Command::new("pnpm") with package_manager_exec("vitest")
- Delete orphan doc comment on line 203

Impact:
- Before: 100% Tier 3 passthrough with pnpm workflows
- After: Tier 1 success with prefixes, maintains 99.5% token savings

Tests:
- 6 tests for extract_json_object (clean, pnpm, dotenv, nested, no-json, strings)
- 3 tests for VitestParser with prefixes
- All 277 tests pass

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* chore(benchmark): add vitest, pnpm, and gh commands

Add benchmarks for recently implemented commands:
- vitest run (PR #92 - JSON extraction fix)
- pnpm list/outdated (PR #6)
- gh pr list/run list (existing gh support)

These commands are now tested in CI to ensure token savings are maintained.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: improve CI reliability and hook coverage (#95)

* feat(cargo): aggregate test output into single line (#83)

Problem: `cargo test` shows 24+ summary lines even when all pass.
An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes #83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(ci): prevent Python/Go benchmark sections from being silently skipped

**Problem:**
Python and Go benchmark sections were silently skipped in CI because
the RTK repository doesn't contain pyproject.toml or go.mod files.
The sections only ran when these project files existed.

**Solution:**
1. Create temporary fixtures with minimal project structure:
   - Python: pyproject.toml + sample.py + test_sample.py
   - Go: go.mod + main.go + main_test.go
2. Resolve RTK to absolute path to work after cd into temp dirs
3. Install required tools in CI workflow:
   - Python: ruff, pytest
   - Go: stable version + golangci-lint

**Impact:**
- Python/Go sections now appear in CI benchmark output
- Self-contained fixtures ensure consistent benchmarking
- No dependency on RTK project structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(hooks): add missing RTK command rewrites

Add 8 missing command rewrites to rtk-rewrite.sh and rtk-suggest.sh:
- cargo check/install/fmt
- tree, find, diff
- head → rtk read (with --max-lines transformation)
- wget

Fixes BSD sed compatibility for head transformation by using literal
spaces instead of \s+ (which doesn't work on macOS).

Impact: ~18.2K tokens saved on previously missed commands discovered
by `rtk discover`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* chore(master): release 0.15.1 (#96)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* feat(python): add lint dispatcher + universal format command

Phase 1: Enhanced rtk lint
- Add pylint JSON2 parser (80-85% token savings)
- Add mypy text parser (75-80% token savings)
- Smart dispatcher: Python tools (pip) vs JS tools (npm)
- Reuse ruff_cmd JSON parser for rtk lint ruff

Phase 2: New rtk format command
- Universal formatter: black/ruff/prettier
- Auto-detect from pyproject.toml/package.json
- Implement black output parser (70-85% savings)
- Reuse existing prettier/ruff formatters

Phase 3: Hook integration
- Auto-rewrite: pylint → rtk lint pylint
- Auto-rewrite: mypy → rtk lint mypy
- Auto-rewrite: black --check → rtk format black

Files changed:
- src/lint_cmd.rs: +454 lines (pylint/mypy parsers, dispatcher)
- src/format_cmd.rs: +386 lines (NEW - universal formatter)
- src/ruff_cmd.rs: Export filter functions as pub
- src/prettier_cmd.rs: Export filter_prettier_output as pub
- src/main.rs: Add Commands::Format + routing
- hooks/rtk-rewrite.sh: Add Python tool rewrite rules

Testing: 10 new unit tests, all 313 tests passing
Impact: 80-90% token savings on Python workflows

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Michael Coen <mhcoen@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
ahundt pushed a commit to ahundt/rtk that referenced this pull request Feb 23, 2026
* feat(cargo): aggregate test output into single line (rtk-ai#83)

Problem: `cargo test` shows 24+ summary lines even when all pass.
An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes rtk-ai#83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(ci): prevent Python/Go benchmark sections from being silently skipped

**Problem:**
Python and Go benchmark sections were silently skipped in CI because
the RTK repository doesn't contain pyproject.toml or go.mod files.
The sections only ran when these project files existed.

**Solution:**
1. Create temporary fixtures with minimal project structure:
   - Python: pyproject.toml + sample.py + test_sample.py
   - Go: go.mod + main.go + main_test.go
2. Resolve RTK to absolute path to work after cd into temp dirs
3. Install required tools in CI workflow:
   - Python: ruff, pytest
   - Go: stable version + golangci-lint

**Impact:**
- Python/Go sections now appear in CI benchmark output
- Self-contained fixtures ensure consistent benchmarking
- No dependency on RTK project structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(hooks): add missing RTK command rewrites

Add 8 missing command rewrites to rtk-rewrite.sh and rtk-suggest.sh:
- cargo check/install/fmt
- tree, find, diff
- head → rtk read (with --max-lines transformation)
- wget

Fixes BSD sed compatibility for head transformation by using literal
spaces instead of \s+ (which doesn't work on macOS).

Impact: ~18.2K tokens saved on previously missed commands discovered
by `rtk discover`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
ahundt pushed a commit to ahundt/rtk that referenced this pull request Feb 23, 2026
* feat(cargo): aggregate test output into single line (rtk-ai#83)

Problem: `cargo test` shows 24+ summary lines even when all pass.
An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes rtk-ai#83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(ci): prevent Python/Go benchmark sections from being silently skipped

**Problem:**
Python and Go benchmark sections were silently skipped in CI because
the RTK repository doesn't contain pyproject.toml or go.mod files.
The sections only ran when these project files existed.

**Solution:**
1. Create temporary fixtures with minimal project structure:
   - Python: pyproject.toml + sample.py + test_sample.py
   - Go: go.mod + main.go + main_test.go
2. Resolve RTK to absolute path to work after cd into temp dirs
3. Install required tools in CI workflow:
   - Python: ruff, pytest
   - Go: stable version + golangci-lint

**Impact:**
- Python/Go sections now appear in CI benchmark output
- Self-contained fixtures ensure consistent benchmarking
- No dependency on RTK project structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(hooks): add missing RTK command rewrites

Add 8 missing command rewrites to rtk-rewrite.sh and rtk-suggest.sh:
- cargo check/install/fmt
- tree, find, diff
- head → rtk read (with --max-lines transformation)
- wget

Fixes BSD sed compatibility for head transformation by using literal
spaces instead of \s+ (which doesn't work on macOS).

Impact: ~18.2K tokens saved on previously missed commands discovered
by `rtk discover`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
ahundt pushed a commit to ahundt/rtk that referenced this pull request Feb 23, 2026
)

* feat(cargo): aggregate test output into single line (rtk-ai#83)

Problem: `cargo test` shows 24+ summary lines even when all pass.
An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes rtk-ai#83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: add Python and Go support (rtk-ai#88)

* feat(cargo): aggregate test output into single line (rtk-ai#83)

Problem: `cargo test` shows 24+ summary lines even when all pass.
An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes rtk-ai#83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: add Python and Go language support

Implements comprehensive support for Python and Go development tooling
with 70-90% token reduction across all commands.

Python commands (3):
- rtk ruff: Linter/formatter with JSON (check) and text (format) parsing (80%+)
- rtk pytest: Test runner with state machine text parser (90%+)
- rtk pip: Package manager with auto-detect uv (70-85%)

Go commands (4):
- rtk go test: NDJSON streaming parser for interleaved test events (90%+)
- rtk go build: Text filter showing errors only (80%)
- rtk go vet: Text filter for issues (75%)
- rtk golangci-lint: JSON parser grouped by rule (85%)

Architecture:
- Standalone Python commands (mirror lint/prettier pattern)
- Go sub-enum (mirror git/cargo pattern)
- 5 new modules: ruff_cmd, pytest_cmd, pip_cmd, go_cmd, golangci_cmd
- Hook integration in rtk-rewrite.sh for transparent rewrites
- Comprehensive tests (47 new tests, all passing)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat(benchmark): add Python and Go commands

Add benchmark sections for Python (ruff, pytest, pip) and Go (go test/build/vet, golangci-lint) to validate >80% token savings in CI pipeline.

Sections conditionally execute based on project markers (pyproject.toml, go.mod) and tool availability.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* feat: make install-local.sh self-contained (rtk-ai#89)

- Build from source automatically instead of requiring a pre-built binary
- Default install dir to ~/.cargo/bin
- Skip rebuild when binary is up to date
- Warn if install dir is not in PATH

* chore(master): release 0.15.0 (rtk-ai#90)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix(vitest): robust JSON extraction for pnpm/dotenv prefixes (rtk-ai#92)

* fix(vitest): robust JSON extraction for pnpm/dotenv prefixes

Problem: RTK's vitest parser forces --reporter=json but pnpm/dotenv prepend
non-JSON text to stdout (banners, env messages), causing 100% Tier 1 failure
and useless 500-char passthrough.

Solution:
- Add extract_json_object() to parser/mod.rs (shared utility)
- Algorithm: find "numTotalTests" or first standalone {, brace-balance forward
- VitestParser now tries direct parse → extract+parse → regex → passthrough
- Replace hardcoded Command::new("pnpm") with package_manager_exec("vitest")
- Delete orphan doc comment on line 203

Impact:
- Before: 100% Tier 3 passthrough with pnpm workflows
- After: Tier 1 success with prefixes, maintains 99.5% token savings

Tests:
- 6 tests for extract_json_object (clean, pnpm, dotenv, nested, no-json, strings)
- 3 tests for VitestParser with prefixes
- All 277 tests pass

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* chore(benchmark): add vitest, pnpm, and gh commands

Add benchmarks for recently implemented commands:
- vitest run (PR rtk-ai#92 - JSON extraction fix)
- pnpm list/outdated (PR rtk-ai#6)
- gh pr list/run list (existing gh support)

These commands are now tested in CI to ensure token savings are maintained.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix: improve CI reliability and hook coverage (rtk-ai#95)

* feat(cargo): aggregate test output into single line (rtk-ai#83)

Problem: `cargo test` shows 24+ summary lines even when all pass.
An LLM only needs to know IF something failed, not 24x "ok".

Before (24 lines):
```
✓ test result: ok. 2 passed; 0 failed; ...
✓ test result: ok. 0 passed; 0 failed; ...
... (x24)
```

After (1 line):
```
✓ cargo test: 137 passed (24 suites, 1.45s)
```

Changes:
- Add AggregatedTestResult struct with regex parsing
- Merge multiple test summaries when all pass
- Format: "N passed, M ignored, P filtered out (X suites, Ys)"
- Fallback to original behavior if parsing fails
- Failures still show full details (no aggregation)

Tests: 6 new + 1 modified, covering all cases:
- Multi-suite aggregation
- Single suite (singular "suite")
- Zero tests
- With ignored/filtered out
- Failures → no aggregation (detail preserved)
- Regex fallback

Closes rtk-ai#83

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* fix(ci): prevent Python/Go benchmark sections from being silently skipped

**Problem:**
Python and Go benchmark sections were silently skipped in CI because
the RTK repository doesn't contain pyproject.toml or go.mod files.
The sections only ran when these project files existed.

**Solution:**
1. Create temporary fixtures with minimal project structure:
   - Python: pyproject.toml + sample.py + test_sample.py
   - Go: go.mod + main.go + main_test.go
2. Resolve RTK to absolute path to work after cd into temp dirs
3. Install required tools in CI workflow:
   - Python: ruff, pytest
   - Go: stable version + golangci-lint

**Impact:**
- Python/Go sections now appear in CI benchmark output
- Self-contained fixtures ensure consistent benchmarking
- No dependency on RTK project structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(hooks): add missing RTK command rewrites

Add 8 missing command rewrites to rtk-rewrite.sh and rtk-suggest.sh:
- cargo check/install/fmt
- tree, find, diff
- head → rtk read (with --max-lines transformation)
- wget

Fixes BSD sed compatibility for head transformation by using literal
spaces instead of \s+ (which doesn't work on macOS).

Impact: ~18.2K tokens saved on previously missed commands discovered
by `rtk discover`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

* chore(master): release 0.15.1 (rtk-ai#96)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* feat(python): add lint dispatcher + universal format command

Phase 1: Enhanced rtk lint
- Add pylint JSON2 parser (80-85% token savings)
- Add mypy text parser (75-80% token savings)
- Smart dispatcher: Python tools (pip) vs JS tools (npm)
- Reuse ruff_cmd JSON parser for rtk lint ruff

Phase 2: New rtk format command
- Universal formatter: black/ruff/prettier
- Auto-detect from pyproject.toml/package.json
- Implement black output parser (70-85% savings)
- Reuse existing prettier/ruff formatters

Phase 3: Hook integration
- Auto-rewrite: pylint → rtk lint pylint
- Auto-rewrite: mypy → rtk lint mypy
- Auto-rewrite: black --check → rtk format black

Files changed:
- src/lint_cmd.rs: +454 lines (pylint/mypy parsers, dispatcher)
- src/format_cmd.rs: +386 lines (NEW - universal formatter)
- src/ruff_cmd.rs: Export filter functions as pub
- src/prettier_cmd.rs: Export filter_prettier_output as pub
- src/main.rs: Add Commands::Format + routing
- hooks/rtk-rewrite.sh: Add Python tool rewrite rules

Testing: 10 new unit tests, all 313 tests passing
Impact: 80-90% token savings on Python workflows

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Michael Coen <mhcoen@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants