
fix(tools): throttle ripgrep CPU usage with thread limits and concurrency control #2009

Merged
code-yeongyu merged 2 commits into code-yeongyu:dev from JiHongKim98:fix/ripgrep-cpu-throttle
Feb 22, 2026


@JiHongKim98
Contributor

@JiHongKim98 JiHongKim98 commented Feb 20, 2026

Summary

Fixes #2008 — ripgrep (rg) causes sustained CPU spikes (700%+) on macOS during agent workflows, making the system unresponsive for extended periods (1h+).

Related: #674 (comment-checker CPU spikes), #1722 (rg CPU spikes on Windows — closed as "working as designed")


Changes

  • Thread limiting: Add --threads=4 to all rg invocations in both grep and glob tools, preventing a single process from saturating all CPU cores
  • Concurrency control: Add a global Semaphore(2) that limits concurrent rg processes to 2, queuing additional requests
  • Timeout fix: Reduce DEFAULT_TIMEOUT_MS from 300s to 60s in grep/constants.ts — the tool description already claims "60s timeout" but the actual value was 5 minutes
  • Output buffer reduction: Reduce DEFAULT_MAX_OUTPUT_BYTES from 10MB to 256KB — 10MB is excessive for LLM context consumption
  • New output_mode parameter: Supports content, files_with_matches (default), and count modes — inspired by Claude Code's approach of fetching file lists first, then reading specific files
  • New head_limit parameter: Limits result count for incremental fetching instead of retrieving everything at once
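The concurrency control item can be illustrated with a minimal counting semaphore. This is a sketch under assumptions, not the code in src/tools/shared/semaphore.ts: the method names `acquire`, `release`, and `run`, and the `rgSemaphore` instance, are hypothetical.

```typescript
// Hypothetical counting semaphore; the PR's actual implementation may differ.
class Semaphore {
  private available: number;
  private readonly waiters: Array<() => void> = [];

  constructor(permits: number) {
    if (!Number.isInteger(permits) || permits < 1) {
      throw new RangeError("permits must be a positive integer");
    }
    this.available = permits;
  }

  // Resolves once the caller holds a permit.
  acquire(): Promise<void> {
    if (this.available > 0) {
      this.available -= 1;
      return Promise.resolve();
    }
    return new Promise<void>((resolve) => this.waiters.push(resolve));
  }

  // Hands the permit directly to the oldest waiter, or returns it to the pool.
  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next();
    } else {
      this.available += 1;
    }
  }

  // Runs `task` while holding a permit; the permit is released even if the task throws.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}

// Assumed module-level instance mirroring the "global Semaphore(2)" described above.
const rgSemaphore = new Semaphore(2);
```

Wrapping each rg spawn in `rgSemaphore.run(...)` would queue a third request until one of the two running searches finishes, in first-come order.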

Files Changed (8)

| File | Change |
| --- | --- |
| src/tools/shared/semaphore.ts | New: counting semaphore for concurrent process limiting |
| src/tools/grep/constants.ts | DEFAULT_RG_THREADS=4, timeout 300s→60s, output 10MB→256KB |
| src/tools/grep/types.ts | Add threads, outputMode, headLimit to GrepOptions |
| src/tools/grep/cli.ts | --threads flag, semaphore wrapping, headLimit/outputMode support |
| src/tools/grep/tools.ts | Expose output_mode, head_limit params, count mode support |
| src/tools/glob/types.ts | Add threads to GlobOptions |
| src/tools/glob/constants.ts | Re-export DEFAULT_RG_THREADS |
| src/tools/glob/cli.ts | --threads flag, semaphore wrapping |
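As a rough illustration of how the options listed for grep/types.ts and grep/cli.ts might turn into an rg command line, here is a sketch. `buildRgArgs`, `applyHeadLimit`, `isOutputMode`, and the exact shape of `GrepOptions` are assumptions made for this example. The rg flags used (`--threads`, `--files-with-matches`, `--count`, `--line-number`, `--no-heading`, `--with-filename`, `--color`) are real ripgrep options, but whether the PR passes exactly this set is not stated in the description.

```typescript
// Allowed modes as a const tuple so the union type is derived without a cast.
const OUTPUT_MODES = ["content", "files_with_matches", "count"] as const;
type OutputMode = (typeof OUTPUT_MODES)[number];

// Hypothetical option shape; field names follow the table above.
interface GrepOptions {
  pattern: string;
  paths?: string[];
  threads?: number;
  outputMode?: OutputMode;
  headLimit?: number;
}

const DEFAULT_RG_THREADS = 4;

// Runtime guard for values arriving from an untyped tool call.
function isOutputMode(value: string): value is OutputMode {
  return (OUTPUT_MODES as readonly string[]).includes(value);
}

// Builds the argument list for one rg invocation.
function buildRgArgs(options: GrepOptions): string[] {
  const threads = options.threads ?? DEFAULT_RG_THREADS;
  const mode = options.outputMode ?? "files_with_matches";
  const args = [`--threads=${threads}`, "--color=never"];
  if (mode === "files_with_matches") {
    args.push("--files-with-matches");
  } else if (mode === "count") {
    args.push("--count");
  } else {
    args.push("--line-number", "--no-heading", "--with-filename");
  }
  // "--" keeps a pattern that starts with "-" from being read as a flag.
  args.push("--", options.pattern, ...(options.paths ?? ["."]));
  return args;
}

// Assumed post-processing step: keep only the first `headLimit` output lines.
function applyHeadLimit(lines: string[], headLimit?: number): string[] {
  return headLimit !== undefined && headLimit > 0 ? lines.slice(0, headLimit) : lines;
}
```

Applying the limit after the process returns is one possible design; it keeps the cap global across files, whereas rg's own per-file limits would not.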

Impact

Before: Two rg processes × 12 threads each = 24 threads at 100% = system freeze
After: Two rg processes (max) × 4 threads each = 8 threads at 100% = smooth operation
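The arithmetic behind these two lines is a product of process count and per-process thread count. The helper name below is hypothetical, and the 12-thread figure is taken from the description above.

```typescript
// Upper bound on simultaneously busy rg worker threads.
function worstCaseRgThreads(concurrentProcesses: number, threadsPerProcess: number): number {
  return concurrentProcesses * threadsPerProcess;
}

const before = worstCaseRgThreads(2, 12); // 24: two uncapped processes at 12 threads each
const after = worstCaseRgThreads(2, 4);   // 8: Semaphore(2) combined with --threads=4
```

Note that the "before" case had no cap on the number of processes, so 24 is the figure for the observed two-process case rather than a hard ceiling.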

Non-breaking

  • All changes are additive with sensible defaults
  • Existing tool calls work identically (no required new parameters)
  • output_mode defaults to files_with_matches, which is more efficient than the previous behavior of always returning full content

Test plan

  • bun test src/tools/grep src/tools/glob: 24 pass, 0 fail
  • bun run typecheck — No new errors (3 pre-existing errors in unrelated files)
  • Manual verification: run agent workflow and confirm CPU stays under control

Summary by cubic

Throttle ripgrep to prevent CPU spikes by capping to 4 threads per process and allowing at most 2 concurrent runs. Tightens timeouts/output and adds lighter output modes to make searches faster and safer.

  • Bug Fixes

  • New Features

    • Add output_mode: content, files_with_matches (default), and count (validated via enum).
    • Add head_limit to cap results for incremental fetching.

Written for commit 02017a1. Summary will update on new commits.

I have read the CLA Document and I hereby sign the CLA

…ency control

- Add --threads=4 flag to all rg invocations (grep and glob)
- Add global semaphore limiting concurrent rg processes to 2
- Reduce grep timeout from 300s to 60s (matches tool description)
- Reduce max output from 10MB to 256KB (prevents excessive memory usage)
- Add output_mode parameter (content/files_with_matches/count)
- Add head_limit parameter for incremental result fetching

Closes code-yeongyu#2008

Ref: code-yeongyu#674, code-yeongyu#1722
@github-actions
Contributor

github-actions bot commented Feb 20, 2026

All contributors have signed the CLA. Thank you! ✅
Posted by the CLA Assistant Lite bot.

@JiHongKim98
Contributor Author

I have read the CLA Document and I hereby sign the CLA

github-actions bot added a commit that referenced this pull request Feb 20, 2026

@cubic-dev-ai cubic-dev-ai bot left a comment


3 issues found across 8 files

Confidence score: 3/5

  • src/tools/grep/cli.ts has a concrete behavior bug: outputMode: "files_with_matches" yields empty results due to parseOutput expecting line-numbered output, which can directly affect users relying on that mode.
  • Two compatibility issues in src/tools/grep/tools.ts around schema enum usage and an unsafe type assertion could cause agent/tool schema mismatches at runtime.
  • The score reflects a real user-facing regression risk in the CLI path plus schema compatibility concerns, though the fixes are localized.
  • Pay close attention to src/tools/grep/cli.ts and src/tools/grep/tools.ts - output parsing for files-only mode and enum schema/type safety.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/tools/grep/tools.ts">

<violation number="1" location="src/tools/grep/tools.ts:24">
P1: Custom agent: **Opencode Compatibility**

Use `tool.schema.enum` instead of a generic `string()` for `output_mode` to ensure the agent receives the exact allowed values in its JSON schema and to enforce strict runtime validation.</violation>

<violation number="2" location="src/tools/grep/tools.ts:40">
P1: Custom agent: **Opencode Compatibility**

Remove the unsafe type assertion, as the type should be automatically inferred from a correctly defined `tool.schema.enum`.</violation>
</file>

<file name="src/tools/grep/cli.ts">

<violation number="1" location="src/tools/grep/cli.ts:57">
P1: `outputMode: "files_with_matches"` produces empty results due to incompatible regex parsing in `parseOutput`. When using `--files-with-matches`, ripgrep outputs only file paths (e.g., "src/file.ts") without line numbers or content, but `parseOutput` uses regex `/^(.+?):(\d+):(.*)$/` expecting `file:line:content` format. This causes all lines to fail matching, returning an empty `matches` array.</violation>
</file>


- Use tool.schema.enum() for output_mode instead of generic string()
- Remove unsafe type assertion for output_mode
- Fix files_with_matches mode returning empty results by adding
  filesOnly flag to parseOutput for --files-with-matches rg output
@JiHongKim98
Contributor Author

Thanks for the review, @cubic-dev-ai! All 3 issues have been addressed in 02017a1:

  1. tool.schema.enum() for output_mode — Fixed. Now uses tool.schema.enum(["content", "files_with_matches", "count"]) instead of a generic string(), ensuring proper JSON schema validation and LLM visibility of allowed values.

  2. Unsafe type assertion removed — Fixed. The as "content" | "files_with_matches" | "count" cast is no longer needed since the enum schema provides correct type inference.

  3. files_with_matches empty results bug — Fixed. Added a filesOnly parameter to parseOutput() that handles --files-with-matches output format (file paths only, no file:line:content). When outputMode === "files_with_matches", the parser now correctly extracts file paths without expecting line numbers.

All existing tests pass (24/24).
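For readers following the third fix, here is a minimal sketch of a parser with a files-only branch. It is an illustration under assumptions, not the merged code: the `GrepMatch` shape and the use of `line: 0` as a sentinel for files-only results are hypothetical, and only the `filesOnly` parameter name and the `file:line:content` regex come from the discussion above.

```typescript
// Hypothetical result shape for one parsed line of rg output.
interface GrepMatch {
  file: string;
  line: number; // 0 when parsed in files-only mode (assumed sentinel)
  text: string;
}

function parseOutput(output: string, filesOnly = false): GrepMatch[] {
  const matches: GrepMatch[] = [];
  for (const raw of output.split("\n")) {
    const line = raw.replace(/\r$/, ""); // tolerate CRLF line endings
    if (line.length === 0) continue;

    if (filesOnly) {
      // --files-with-matches prints one path per line, with no line number or content.
      matches.push({ file: line, line: 0, text: "" });
      continue;
    }

    // Default format: file:line:content
    const m = /^(.+?):(\d+):(.*)$/.exec(line);
    if (m) {
      matches.push({ file: m[1], line: Number(m[2]), text: m[3] });
    }
  }
  return matches;
}
```

Calling `parseOutput("src/a.ts\n")` without the flag returns an empty array, which reproduces the reported bug; passing `true` as the second argument returns one entry for that path.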

@code-yeongyu code-yeongyu merged commit b175c11 into code-yeongyu:dev Feb 22, 2026
3 checks passed

Development

Successfully merging this pull request may close these issues.

[Bug]: ripgrep (rg) causes sustained CPU spikes on macOS during agent workflows
