Improve boundary classification by RobinMalfait · Pull Request #17005 · tailwindlabs/tailwindcss

RobinMalfait · 2025-03-06T23:44:01Z

This PR cleans up the boundary character checking by using the same kind of classification technique we already use for other classification problems.

For starters, this moves the boundary-related items into their own file. Next, we set up the classification enum.
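To illustrate the classification idea, here is a minimal sketch in Rust. The `Class` enum, the `CLASS_TABLE` lookup, the helper names, and the specific bytes placed in each class are assumptions for illustration, not the actual contents of `crates/oxide/src/extractor/boundary.rs`. Only the treatment of `}` as a before-only boundary follows this PR.

```rust
/// Hypothetical classification of a single byte that sits next to a candidate.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Class {
    /// Valid both before and after a candidate.
    Common,
    /// Valid only before a candidate.
    Before,
    /// Valid only after a candidate.
    After,
    /// Not a boundary character.
    Other,
}

/// Illustrative set of bytes that are valid on either side (assumption).
const COMMON: &[u8] = b" \t\n\r\"'`";

/// Build the 256-entry lookup table once, at compile time.
const fn build_table() -> [Class; 256] {
    let mut table = [Class::Other; 256];
    let mut i = 0;
    while i < COMMON.len() {
        table[COMMON[i] as usize] = Class::Common;
        i += 1;
    }
    // Illustrative one-sided boundaries (assumptions, apart from `}`).
    table[b'.' as usize] = Class::Before;
    table[b'}' as usize] = Class::Before;
    table[b'=' as usize] = Class::After;
    table
}

const CLASS_TABLE: [Class; 256] = build_table();

/// A byte may precede a candidate if it is a common or before-only boundary.
fn is_valid_before_boundary(byte: u8) -> bool {
    matches!(CLASS_TABLE[byte as usize], Class::Common | Class::Before)
}

/// A byte may follow a candidate if it is a common or after-only boundary.
fn is_valid_after_boundary(byte: u8) -> bool {
    matches!(CLASS_TABLE[byte as usize], Class::Common | Class::After)
}
```

With a table like this, each boundary check is a single array index plus a `matches!` on the class, rather than a chain of per-character comparisons.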

Last but not least, we removed `}` as an after boundary character and instead handle that situation in the Ruby pre-processor, where we need it. This means that `%w{flex}` will still work in Ruby files.
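One way to keep `%w{…}` working without treating `}` as an after boundary is to rewrite the delimiters to whitespace before extraction. The sketch below assumes that approach. The function name `preprocess_ruby_words` is hypothetical, it ignores nested braces and other `%w` delimiters, and it is not the actual pre-processor from this PR.

```rust
/// Hypothetical pre-processing step: replace the `{` and `}` of a Ruby
/// `%w{...}` literal with spaces so each word ends up bounded by whitespace.
/// The output has the same length as the input, so byte positions are kept.
fn preprocess_ruby_words(input: &[u8]) -> Vec<u8> {
    let mut out = input.to_vec();
    let mut i = 0;

    while i + 2 < out.len() {
        if out[i] == b'%' && out[i + 1] == b'w' && out[i + 2] == b'{' {
            // Blank out the opening delimiter.
            out[i + 2] = b' ';

            // Scan forward to the first closing brace (no nesting support).
            let mut j = i + 3;
            while j < out.len() && out[j] != b'}' {
                j += 1;
            }

            // Blank out the closing delimiter if the literal is terminated.
            if j < out.len() {
                out[j] = b' ';
            }

            i = j;
        } else {
            i += 1;
        }
    }

    out
}
```

For example, `%w{flex px-2}` would become `%w flex px-2 ` under this sketch, while braces outside a `%w` literal are left untouched.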

This PR is a follow-up to #17001. The main goal is to clean up some of the boundary character checking code. The other big improvement is performance: changing the boundary character checking to use a classification instead results in the following throughput.

Best score of 10 runs each:

- CandidateMachine: Throughput: 311.96 MB/s
+ CandidateMachine: Throughput: 333.52 MB/s

So a ~20 MB/s improvement.

Test plan

Existing tests should pass. Due to the removal of `}` as an after boundary character, some tests were updated.
Added new tests to ensure the Ruby pre-processor still works as expected.

move boundary related functions to dedicated file

443652d

RobinMalfait requested a review from a team as a code owner March 6, 2025 23:44

RobinMalfait added 4 commits March 7, 2025 00:47

classify boundary characters

e9f0eda

This is a relatively small change, but with big impact:

```diff
- CandidateMachine: Throughput: 308.62 MB/s
+ CandidateMachine: Throughput: 324.34 MB/s
```

Almost 20 MB/s more throughput from this change alone.

move } to before boundary characters

9d42c7d

We still need it for `%w{…}` in Ruby land, but let's solve that using the pre processor instead.

update tests now that } is not a valid end boundary

8fcbe96

move {…} to Ruby pre-processor

b3c597b

RobinMalfait force-pushed the feat/classify-boundaries branch from 797ef23 to b3c597b Compare March 6, 2025 23:47

thecrypticace reviewed Mar 6, 2025

Comment thread crates/oxide/src/extractor/boundary.rs Outdated

thecrypticace approved these changes Mar 6, 2025

Update crates/oxide/src/extractor/boundary.rs

8408efe

Co-authored-by: Jordan Pittman <jordan@cryptica.me>

RobinMalfait enabled auto-merge (squash) March 6, 2025 23:58

RobinMalfait merged commit d0a9746 into main Mar 7, 2025

RobinMalfait deleted the feat/classify-boundaries branch March 7, 2025 00:00

RobinMalfait merged 6 commits into main from feat/classify-boundaries