fix(gateway): add ThrottleInterval to prevent launchd restart loop #27650

Merged
steipete merged 2 commits into openclaw:main from kevinWangSheng:fix/gateway-restart-loop-backoff on Feb 26, 2026

Conversation

@kevinWangSheng

Summary

  • Adds ThrottleInterval: 60 to the launchd plist to prevent rapid-fire gateway restarts

The launchd plist had KeepAlive=true with no ThrottleInterval, defaulting to macOS's 10-second interval. When the gateway crashes and leaves a stale lock file (30-second stale threshold), the restarted process can't acquire the lock within its 5-second timeout, exits with code 1, and launchd restarts it 10 seconds later. At that point the lock is only ~15 seconds old — still under the 30-second stale threshold — so it loops indefinitely (250 restarts in 42 minutes as reported).

With ThrottleInterval: 60, launchd waits at least 60 seconds before relaunching the gateway. By then the lock file is well past the 30-second stale threshold, so it gets cleaned up automatically and the gateway starts successfully on the first retry.
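
The timing argument above can be checked with a few lines of arithmetic. This is a minimal sketch that follows the description's own numbers (5-second lock timeout, 30-second stale threshold); the constant and function names are hypothetical and are not taken from the gateway code.

```typescript
// Values quoted in the description; the names are hypothetical.
const STALE_THRESHOLD_S = 30; // lock files older than this are treated as stale
const LOCK_TIMEOUT_S = 5; // how long a new process waits for the lock before exiting

// Approximate age of the leftover lock when launchd starts the next attempt:
// the failed attempt's lock wait plus launchd's throttle delay.
function lockAgeAtRetry(
  throttleIntervalS: number,
  lockTimeoutS: number = LOCK_TIMEOUT_S,
): number {
  return lockTimeoutS + throttleIntervalS;
}

// True when the next attempt finds a lock old enough to be cleaned up.
function retryClearsStaleLock(
  throttleIntervalS: number,
  staleThresholdS: number = STALE_THRESHOLD_S,
): boolean {
  return lockAgeAtRetry(throttleIntervalS) > staleThresholdS;
}

console.log(lockAgeAtRetry(10), retryClearsStaleLock(10)); // 15 false
console.log(lockAgeAtRetry(60), retryClearsStaleLock(60)); // 65 true
```

Under these assumptions, any throttle interval above 25 seconds would clear the threshold; 60 leaves a wide margin.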

Test plan

  • Existing launchd test updated to expect ThrottleInterval
  • pnpm test -- src/daemon/launchd.test.ts passes
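
For readers unfamiliar with the plist shape, the kind of check the updated test performs might look like the sketch below. It is illustrative only: `renderLaunchAgentPlist`, `LaunchAgentOptions`, and the `com.example.gateway` label are hypothetical and do not reflect the code in src/daemon/launchd.ts. Only the launchd keys (Label, ProgramArguments, KeepAlive, ThrottleInterval) are real.

```typescript
// Hypothetical options type; only the launchd keys themselves are real.
interface LaunchAgentOptions {
  label: string;
  programArguments: string[];
  throttleIntervalS: number;
}

// Escape the characters that are special in XML text nodes.
function escapeXml(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Render a minimal launchd plist with KeepAlive enabled and an explicit throttle.
function renderLaunchAgentPlist(opts: LaunchAgentOptions): string {
  const args = opts.programArguments
    .map((arg) => `    <string>${escapeXml(arg)}</string>`)
    .join("\n");
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">',
    '<plist version="1.0">',
    "<dict>",
    "  <key>Label</key>",
    `  <string>${escapeXml(opts.label)}</string>`,
    "  <key>ProgramArguments</key>",
    "  <array>",
    args,
    "  </array>",
    "  <key>KeepAlive</key>",
    "  <true/>",
    "  <key>ThrottleInterval</key>",
    `  <integer>${opts.throttleIntervalS}</integer>`,
    "</dict>",
    "</plist>",
  ].join("\n");
}

const plist = renderLaunchAgentPlist({
  label: "com.example.gateway", // placeholder label
  programArguments: ["/usr/local/bin/gateway", "run"], // placeholder command
  throttleIntervalS: 60,
});

// The assertion a test would make: the throttle key is present with the expected value.
const hasThrottle = plist.includes("<key>ThrottleInterval</key>\n  <integer>60</integer>");
console.log(hasThrottle); // true
```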

Closes #27590

🤖 Generated with Claude Code

openclaw-barnacle bot added the gateway (Gateway runtime), agents (Agent runtime and tooling), and size: S labels on Feb 26, 2026
@greptile-apps
Contributor

greptile-apps bot commented Feb 26, 2026

Greptile Summary

This PR bundles three unrelated fixes from separate issues: launchd restart throttling (#27590), usage token normalization (#27509), and chokidar v5 compatibility (#27404). According to CLAUDE.md: "Group related changes; avoid bundling unrelated refactors." Each fix addresses a different subsystem and should ideally be in separate PRs for easier review and rollback.

Changes included:

  • Adds ThrottleInterval: 60 to launchd plist to prevent rapid gateway restarts when lock files are stale
  • Fixes token counting for providers that return input_tokens: 0 with actual counts in prompt_tokens (yunwu-openai)
  • Adds glob dependency (v13.0.6) for chokidar v5 compatibility

All three fixes are technically sound with appropriate test coverage. The launchd throttle interval correctly addresses the 10-second restart loop by waiting 60 seconds (exceeding the 30-second stale lock threshold). The usage normalization logic properly prefers non-zero token counts. The glob dependency addition has no visible code changes in this PR.
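
As an illustration of the "prefer non-zero token counts" behavior described above, a minimal sketch follows. The `RawUsage` type and `normalizeInputTokens` function are hypothetical; only the `input_tokens` and `prompt_tokens` field names come from this summary, and the actual implementation in the PR may differ.

```typescript
// Hypothetical shape: only the two field names come from the review summary.
interface RawUsage {
  input_tokens?: number;
  prompt_tokens?: number;
}

// Return the first positive count, checking input_tokens before prompt_tokens.
// Falls back to 0 when neither field carries a positive number.
function normalizeInputTokens(usage: RawUsage): number {
  for (const count of [usage.input_tokens, usage.prompt_tokens]) {
    if (typeof count === "number" && count > 0) {
      return count;
    }
  }
  return 0;
}

console.log(normalizeInputTokens({ input_tokens: 0, prompt_tokens: 42 })); // 42
console.log(normalizeInputTokens({ input_tokens: 7, prompt_tokens: 42 })); // 7
```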

Confidence Score: 4/5

  • Safe to merge - all code changes are correct with proper test coverage
  • Score reflects that the code implementations are sound and well-tested, but the PR bundles three unrelated fixes that would be better as separate PRs. The launchd throttle logic correctly solves the restart loop, usage normalization handles provider quirks properly, and the glob dependency is a straightforward addition. No logical errors or security concerns identified.
  • No files require special attention - all changes are straightforward

Last reviewed commit: 5090556

steipete force-pushed the fix/gateway-restart-loop-backoff branch from 5090556 to db3c5a5 on February 26, 2026 15:28
openclaw-barnacle bot added the size: XS label and removed the agents (Agent runtime and tooling) and size: S labels on Feb 26, 2026
steipete merged commit 7f863e2 into openclaw:main on Feb 26, 2026
25 of 27 checks passed
@steipete
Contributor

Landed via /landpr flow.

  • Gate: pnpm check && pnpm build && pnpm test -- src/daemon/launchd.test.ts
  • Land commits: 431726dff143b5e7998f370204d4ce7cccf6f4a3, db3c5a52b29d6f7f3bd3a5cc5f0e190d658ac4bc
  • Merge commit: 7f863e22b06f7fdcaece417299db7a8fff4b5214

Unified changelog entry included for the restart/orphan loop cluster (#27605, #27590, #26904).

Thanks @kevinWangSheng!

daymade added a commit to daymade/openclaw that referenced this pull request Mar 8, 2026
- Add test ensuring launchd path never returns "failed" status
- Add CHANGELOG.md entry documenting the fix with issue/PR references
- Reference ThrottleInterval evolution (openclaw#27650 → openclaw#29078 → current 1s)
steipete pushed a commit that referenced this pull request Mar 8, 2026
- Add test ensuring launchd path never returns "failed" status
- Add CHANGELOG.md entry documenting the fix with issue/PR references
- Reference ThrottleInterval evolution (#27650 → #29078 → current 1s)
Saitop pushed a commit to NomiciAI/openclaw that referenced this pull request Mar 8, 2026
- Add test ensuring launchd path never returns "failed" status
- Add CHANGELOG.md entry documenting the fix with issue/PR references
- Reference ThrottleInterval evolution (openclaw#27650 → openclaw#29078 → current 1s)
GordonSH-oss pushed a commit to GordonSH-oss/openclaw that referenced this pull request Mar 9, 2026
- Add test ensuring launchd path never returns "failed" status
- Add CHANGELOG.md entry documenting the fix with issue/PR references
- Reference ThrottleInterval evolution (openclaw#27650 → openclaw#29078 → current 1s)
hugs42 pushed a commit to hugs42/openclaw that referenced this pull request Mar 10, 2026
- Add test ensuring launchd path never returns "failed" status
- Add CHANGELOG.md entry documenting the fix with issue/PR references
- Reference ThrottleInterval evolution (openclaw#27650 → openclaw#29078 → current 1s)
jenawant pushed a commit to jenawant/openclaw that referenced this pull request Mar 10, 2026
- Add test ensuring launchd path never returns "failed" status
- Add CHANGELOG.md entry documenting the fix with issue/PR references
- Reference ThrottleInterval evolution (openclaw#27650 → openclaw#29078 → current 1s)

Labels

gateway (Gateway runtime), size: XS

Development

Successfully merging this pull request may close these issues.

Bug: Gateway enters infinite restart loop after agent timeout (250 retries in 42 minutes)

2 participants