docs: update architecture-and-performance.md to reflect AIMD changes#467
Conversation
…ncy control (#466) Account for the AIMD throttle manager throughout the architecture doc: concurrency formula, max_parallel_requests guidance, new ThrottleConfig section, common problems table, and tuning workflow. Add sync-engine caveats noting AIMD is fully active on the async path. Made-with: Cursor
Greptile SummaryThis PR updates Key changes:
Issue found:
|
| Filename | Overview |
|---|---|
| docs/concepts/architecture-and-performance.md | Documentation updated to cover AIMD adaptive concurrency, ThrottleConfig parameters, and the DatasetBuilder rename. One factual inaccuracy found: the tip section incorrectly states the rate-limit ceiling ratchets up on probe success; the implementation only ever lowers the ceiling via min(). |
Prompt To Fix All With AI
This is a comment left during a code review.
Path: docs/concepts/architecture-and-performance.md
Line: 233-236
Comment:
**"Ceiling ratchets up" claim is not supported by the implementation**
The tip states: *"If the probe succeeds, the ceiling ratchets up; if it triggers another 429, the ceiling lowers."*
Looking at `throttle_manager.py`, `rate_limit_ceiling` is only ever written in `release_rate_limited()`:
```python
if state.rate_limit_ceiling == 0:
state.rate_limit_ceiling = prev_limit
else:
state.rate_limit_ceiling = min(state.rate_limit_ceiling, prev_limit)
```
`release_success()` never touches `rate_limit_ceiling`. This means the stored ceiling can only decrease (or stay the same) — it never ratchets up as a result of a successful probe. When the system successfully operates at `ceiling × (1 + ceiling_overshoot)` requests, `rate_limit_ceiling` is unchanged; the soft ceiling returned by `_compute_soft_ceiling()` stays fixed too. If a new 429 fires at `prev_limit = ceiling × (1 + overshoot)` which is greater than the old ceiling, `min(old_ceiling, prev_limit)` still returns `old_ceiling` — no upward movement.
In practice this means a ceiling that was set too low (e.g. during a transient overload) can never self-heal upward; the system is permanently capped at `old_ceiling × (1 + overshoot)` regardless of how well the server performs afterward.
Additionally, "If the server rate-limits **again**, the controller records that level as a ceiling" is also inaccurate — the ceiling is set on the **first** 429 (via the `if state.rate_limit_ceiling == 0` branch), not only on subsequent ones.
How can I resolve this? If you propose a fix, please make it concise.Reviews (3): Last reviewed commit: "docs: address PR #467 review feedback on..." | Re-trigger Greptile
andreatgretel
left a comment
There was a problem hiding this comment.
Looks good, left two minor nits but nothing blocking. 👍
- Clarify that only the first 429 in a burst reduces the concurrency limit; subsequent in-flight cascade 429s hold it steady - Soften sync engine caveats: AIMD engages as a fallback when transport retries exhaust, not "no effect" - Note salvage queue difference between async and sync paths when recommending aggressive max_parallel_requests values Made-with: Cursor
📋 Summary
Update the Architecture & Performance doc to account for AIMD adaptive concurrency control and the
ColumnWiseDatasetBuilder→DatasetBuilderrename.Closes #466
🔄 Changes
🔧 Changed
max_parallel_requestswith AIMD-managedcurrent_throttle_limit; explained decrease/increase behaviormax_parallel_requestssection — reframed as a ceiling; updated guidance; added AIMD-aware tuning tipDatasetBuilder(renamed in chore: rename ColumnWiseDatasetBuilder to DatasetBuilder #437)✨ Added
ThrottleConfigparameters🤖 Generated with AI
Made with Cursor