Conversation
Remove case-sensitive header map and refactor header parsing
- Move vendor definitions and matching logic to new vendors.rs - Replace headers/types.rs with vendors.rs and update imports - Use VendorMask bitmask for candidate tracking instead of Vec - Refactor parser to use VendorMask and new vendor matching - Rename retryafter module to retry_after for consistency - Remove unused integer conversion functions - Update tests and README for new VendorMask usage
- Add `extra_headers` to `VendorSpec` for more accurate vendor detection - Use case-insensitive header matching for vendor identification - Add `from_iter` methods for `RateLimit`, `Headers`, and `retry_after::RateLimit` - Make `RateLimit::new` and `Headers::new` accept `http::HeaderMap` directly - Update docs, tests, and README to reflect new API and detection logic
- Rename Vendor::Unknown to Vendor::Generic for clearer semantics - Replace from_iter with extract in public APIs for header parsing - Improve vendor specificity scoring and candidate selection logic - Add VendorMask::from_iter for ergonomic mask construction - Update documentation and examples to reflect API changes
thomas-zahner
left a comment
There was a problem hiding this comment.
Looks great overall! I've just added a few remarks which might help to improve the code readability.
| let mut current_number = String::new(); | ||
|
|
||
| let mut chars = value.chars().peekable(); | ||
| fn parse_openai_duration(s: &str) -> Option<usize> { |
There was a problem hiding this comment.
This function confuses me a bit. Maybe readability could be improved. But more importantly it seems to be untested.
There was a problem hiding this comment.
Good catch. Added a doc comment explaining the format and 7 unit tests covering seconds, ms rounding, m vs ms disambiguation, compound, fractional, hours/days, and invalid inputs. Also rewrote it to be allocation-free.
katrinafyi
left a comment
There was a problem hiding this comment.
I just had a quick look. I had no idea the rate limiting headers were so non-standard and it's a big project to parse them. You'd think if they wanted people to respect the headers, they would be more consistent...
cf467fe to
0b6e782
Compare
|
@thomas-zahner, @katrinafyi if one of you finds the time, please take another look. I tried to address all of your very good comments. Split up the logic as much as I could and updated the docs. |
- Rename ResetTimeKind::OpenAIDuration to OpenAiDuration - Promote vendor URL comments to public docs on Vendor enum - Document parse_openai_duration and add unit tests - Refactor Parser::parse into smaller, documented helpers - Introduce Specificity newtype with From<&VendorState> impl - Migrate VendorMask to bitflags with public per-vendor constants - Extract parse_header_lines and header_map_str_pairs helpers - Clarify case-sensitive matching in README
|
@mre Thank you for the update! I think it's much more readable. Haven't tested it but it does look like a great improvement. |
The parser's per-vendor state array is sized from `VENDORS.len()`, so no manual length bookkeeping is required anymore.
|
Thanks to both of your feedback! I've made some minor refactorings and finally merged it. :) Will release a new version now and update the lychee PR. |
This PR completely overhauls the internal parsing engine. It transitions the crate to a zero-copy, single-pass state machine.
http::HeaderMapinherently normalizes keys to lowercase, it was destroying the exact casing information needed to distinguish between certain APIs. The parser now uses exact case-sensitive matching for much stronger vendor detection.Vec<Vendor>used for tracking candidates with a lightweightu64-backedVendorMask. This allows O(1) vendor tracking and elimination with zero heap allocations.Vendor::Unknown(while still returning the successfully parsed limit/remaining/reset numbers) and populates thecandidatesmask for the user.These changes make the crate more robust against conflicting vendor APIs, and 100% allocation-free during the parsing phase.