Skip to content

Move to parser-based approach#11

Merged
mre merged 15 commits into
masterfrom
parser
May 30, 2026
Merged

Move to parser-based approach#11
mre merged 15 commits into
masterfrom
parser

Conversation

@mre

@mre mre commented Apr 7, 2026

Copy link
Copy Markdown
Owner

This PR completely overhauls the internal parsing engine. It transitions the crate to a zero-copy, single-pass state machine.

  • Since http::HeaderMap inherently normalizes keys to lowercase, it was destroying the exact casing information needed to distinguish between certain APIs. The parser now uses exact case-sensitive matching for much stronger vendor detection.
  • Replaced the Vec<Vendor> used for tracking candidates with a lightweight u64-backed VendorMask. This allows O(1) vendor tracking and elimination with zero heap allocations.
  • If a set of headers matches multiple vendors equally well, the parser no longer guesses. It safely defaults to Vendor::Unknown (while still returning the successfully parsed limit/remaining/reset numbers) and populates the candidates mask for the user.

These changes make the crate more robust against conflicting vendor APIs, and 100% allocation-free during the parsing phase.

mre added 12 commits February 4, 2026 14:18
Remove case-sensitive header map and refactor header parsing
- Move vendor definitions and matching logic to new vendors.rs
- Replace headers/types.rs with vendors.rs and update imports
- Use VendorMask bitmask for candidate tracking instead of Vec
- Refactor parser to use VendorMask and new vendor matching
- Rename retryafter module to retry_after for consistency
- Remove unused integer conversion functions
- Update tests and README for new VendorMask usage
- Add `extra_headers` to `VendorSpec` for more accurate vendor detection
- Use case-insensitive header matching for vendor identification
- Add `from_iter` methods for `RateLimit`, `Headers`, and
  `retry_after::RateLimit`
- Make `RateLimit::new` and `Headers::new` accept `http::HeaderMap`
  directly
- Update docs, tests, and README to reflect new API and detection logic
- Rename Vendor::Unknown to Vendor::Generic for clearer semantics
- Replace from_iter with extract in public APIs for header parsing
- Improve vendor specificity scoring and candidate selection logic
- Add VendorMask::from_iter for ergonomic mask construction
- Update documentation and examples to reflect API changes

@thomas-zahner thomas-zahner left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall! I've just added a few remarks which might help to improve the code readability.

Comment thread src/reset_time.rs Outdated
Comment thread src/reset_time.rs
let mut current_number = String::new();

let mut chars = value.chars().peekable();
fn parse_openai_duration(s: &str) -> Option<usize> {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function confuses me a bit. Maybe readability could be improved. But more importantly it seems to be untested.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Added a doc comment explaining the format and 7 unit tests covering seconds, ms rounding, m vs ms disambiguation, compound, fractional, hours/days, and invalid inputs. Also rewrote it to be allocation-free.

Comment thread src/vendors.rs Outdated
Comment thread src/vendors.rs Outdated
Comment thread src/vendors.rs
Comment thread src/parser.rs
Comment thread src/parser.rs
Comment thread src/parser.rs Outdated
Comment thread src/parser.rs Outdated

@katrinafyi katrinafyi left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just had a quick look. I had no idea the rate limiting headers were so non-standard and it's a big project to parse them. You'd think if they wanted people to respect the headers, they would be more consistent...

Comment thread README.md
Comment thread src/vendors.rs
@mre mre force-pushed the parser branch 3 times, most recently from cf467fe to 0b6e782 Compare April 24, 2026 23:21
@mre

mre commented Apr 24, 2026

Copy link
Copy Markdown
Owner Author

@thomas-zahner, @katrinafyi if one of you finds the time, please take another look. I tried to address all of your very good comments. Split up the logic as much as I could and updated the docs.

- Rename ResetTimeKind::OpenAIDuration to OpenAiDuration
- Promote vendor URL comments to public docs on Vendor enum
- Document parse_openai_duration and add unit tests
- Refactor Parser::parse into smaller, documented helpers
- Introduce Specificity newtype with From<&VendorState> impl
- Migrate VendorMask to bitflags with public per-vendor constants
- Extract parse_header_lines and header_map_str_pairs helpers
- Clarify case-sensitive matching in README
@thomas-zahner

Copy link
Copy Markdown

@mre Thank you for the update! I think it's much more readable. Haven't tested it but it does look like a great improvement.

mre added 2 commits May 30, 2026 17:33
The parser's per-vendor state array is sized from `VENDORS.len()`, so no manual
length bookkeeping is required anymore.
@mre mre merged commit 70b44c3 into master May 30, 2026
8 checks passed
@mre mre deleted the parser branch May 30, 2026 15:38
@mre

mre commented May 30, 2026

Copy link
Copy Markdown
Owner Author

Thanks to both of your feedback! I've made some minor refactorings and finally merged it. :) Will release a new version now and update the lychee PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants