Replace custom rate limit header parsing with rate-limits crate#2135
Conversation
There was a problem hiding this comment.
Code here looks reasonable. I also took a squiz at mre/rate-limits#11.
1942c4a to
419509a
Compare
thomas-zahner
left a comment
There was a problem hiding this comment.
Thank you for all the work!
- Remove internal header parsing logic in favor of the rate-limits crate - Update dependencies and remove unused httpdate crate - Add integration tests for rate limit handling
Remove redundant status code and paths arguments from run_rate_limit_test. Paths are now fixed internally, and status code is always 200. Update all test cases accordingly for clarity and consistency.
- Centralize rate limit delay and offset constants - Add helper to calculate reset time for absolute timestamps - Simplify test setup and assertions - Increase EPSILON to reduce test flakiness
- Use explicit expected delays for each test case - Improve timing assertions with tolerance and max overhead - Record and assert server-side request times for accuracy - Clarify distinction between absolute and relative reset headers - Add test for Retry-After backoff between different URLs
- Fix clippy lints triggered on newer toolchain:
- Add `# Panics` doc section on `record_cache_hit`
- Collapse nested `if` in `parse_rate_limit_headers`
- Scope MutexGuards in rate-limit tests to avoid holding them
across await points
- Remove unused `Instant` and `Request` imports in cli tests
- Relax TIMING_TOLERANCE in rate-limit tests from 500ms to 1s; the
absolute-timestamp tests have a variable startup delay between
`calculate_reset_time()` and lychee parsing headers that can
exceed 500ms on slower machines.
419509a to
f4d39ae
Compare
thomas-zahner
left a comment
There was a problem hiding this comment.
Thank you! Thrilled to see this mess of rate-limit mechanisms distilled into a dedicated Rust crate and lychee making use of it :)
Oh, wow! Thanks so much for that comment. |
This migrates our rate-limit handling to the rate-limits crate.
This work is based on my rewrite of the rate-limits crate in mre/rate-limits#11. I rewrote it from scratch to be more efficient as well as use a state-machine under the hood for more reliable vendor detection.
I tested it locally as best as I could and so far I haven't seen any regressions.
Once all looks good, I will release a new version of the rate-limits crate based on this new approach and will remove the "draft" setting on this PR. We could also move the crate into the lycheeverse org, which I believe will make shared maintenance / bugfix releases easier in the future. wdyt?
Edit: while refactoring the code, I also decided to remove the 80% threshold for throttling request, which we had before. I don't think it's helpful. Instead, I now believe, we should go to the maximum number of allowed requests and get throttled. AFAICT, there is no penalty for that, and we could squeeze out some more performance this way. Once we get throttled, we just wait until we get unblocked. From what I understand, this is how to work with those headers correctly.
Closes #2027