fix: reduce API rate limit pressure in train-drain3-weights workflow #24392

Merged: pelikhan merged 1 commit into main from copilot/add-cache-store-logs on Apr 3, 2026

Copilot AI commented Apr 3, 2026

The daily train-drain3-weights workflow hit GitHub API rate limits when processing 1000 runs, making ~400 API calls (downloads plus job status fetches). It also flooded CI logs with hundreds of \r-based progress bar lines, because \r does not overwrite the current line in non-TTY environments.

Changes

train-drain3-weights.yml

  • Reduce count 1000 → 100: cuts API call volume by 10×
  • Add actions/cache for /tmp/drain3-logs: restores previously downloaded run artifacts on subsequent daily runs; the in-code cache (run_summary.json) then skips re-downloading already-seen runs

pkg/cli/logs_orchestrator.go

  • Suppress progress bar in CI: \r doesn't overwrite the current line in non-TTY output, so an IsRunningInCI() guard now skips the progress bar whenever the command runs in CI
  • 500ms cooldown between batch iterations: inserted time.Sleep(APICallCooldown) before each iteration after the first to reduce burst API pressure during paginated fetches

pkg/cli/logs_models.go

  • Added APICallCooldown = 500 * time.Millisecond constant

…n, limit to 100 logs

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d3933290-6e8e-4ccb-a118-7bdc43be5ad0

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
pelikhan marked this pull request as ready for review April 3, 2026 22:16
Copilot AI review requested due to automatic review settings April 3, 2026 22:16
pelikhan merged commit 755e2d6 into main Apr 3, 2026
63 of 64 checks passed
pelikhan deleted the copilot/add-cache-store-logs branch April 3, 2026 22:17

Copilot AI left a comment

Pull request overview

Reduces GitHub API rate-limit pressure and CI log noise in the daily train-drain3-weights workflow by lowering run volume, adding caching, and adjusting CLI behavior for CI/non-interactive environments.

Changes:

  • Throttle gh-aw logs pagination by adding a fixed cooldown between batch iterations.
  • Suppress the \r-based progress bar in CI to avoid flooding workflow logs.
  • Update the training workflow to process fewer runs and cache downloaded logs between daily executions.
| File | Description |
| --- | --- |
| .github/workflows/train-drain3-weights.yml | Lowers run count and adds an Actions cache for /tmp/drain3-logs to reduce repeated downloads. |
| pkg/cli/logs_orchestrator.go | Adds API-call cooldown between pagination iterations; disables progress bar output in CI. |
| pkg/cli/logs_models.go | Introduces APICallCooldown constant used to pace API pagination. |
| pkg/agentdrain/data/default_weights.json | Updates embedded default drain3 weights content/format from training output. |

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 3

// The first iteration (iteration == 0) runs immediately; subsequent iterations
// pause briefly to give the API rate limit window time to recover.
if iteration > 0 {
	time.Sleep(APICallCooldown)
Copilot AI Apr 3, 2026

time.Sleep(APICallCooldown) does not respect ctx cancellation. If the user cancels (Ctrl+C) during the cooldown, the loop will block until the sleep completes before returning. Consider making the cooldown context-aware (e.g., select on ctx.Done() vs a time.NewTimer(APICallCooldown)), so cancellation is immediate while still throttling API calls.

Suggested change
- time.Sleep(APICallCooldown)
+ timer := time.NewTimer(APICallCooldown)
+ select {
+ case <-ctx.Done():
+ 	if !timer.Stop() {
+ 		<-timer.C
+ 	}
+ 	fmt.Fprintln(os.Stderr, console.FormatWarningMessage("Operation cancelled"))
+ 	return ctx.Err()
+ case <-timer.C:
+ }

Comment on lines +581 to 586
  // Create progress bar for tracking run processing (only in non-verbose, non-CI mode)
  // In CI environments \r is treated as a newline, producing excessive output for each update.
  var progressBar *console.ProgressBar
- if !verbose {
+ if !verbose && !IsRunningInCI() {
  	progressBar = console.NewProgressBar(int64(totalRuns))
  	fmt.Fprintf(os.Stderr, "Processing runs: %s\r", progressBar.Update(0))
Copilot AI Apr 3, 2026

The comment mentions “non-TTY” behavior, but the guard only checks !IsRunningInCI(). In other non-TTY contexts (e.g., piping stderr to a file locally), the \r updates will still emit noisy output. Consider gating progress rendering on tty.IsStderrTerminal() (or equivalent) rather than CI detection, since the underlying issue is terminal capability, not CI specifically.

Comment on lines +31 to +38
- name: Restore logs cache
  id: cache-logs
  uses: actions/cache@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
  with:
    path: /tmp/drain3-logs
    key: drain3-logs-${{ runner.os }}-${{ github.run_id }}
    restore-keys: |
      drain3-logs-${{ runner.os }}-
Copilot AI Apr 3, 2026

With key: drain3-logs-${{ runner.os }}-${{ github.run_id }}, every workflow run will save a new cache entry (even when a previous cache is restored via restore-keys), which can churn the Actions cache and increase eviction pressure if /tmp/drain3-logs is large. If the goal is to reuse the same cache across daily runs, consider using a more stable key (e.g., per-OS + date/week) or switching to explicit actions/cache/restore + actions/cache/save so you only write a new cache when you intentionally rotate it.
