Avoid unnecessary parser lookahead for operators #25290

Merged
charliermarsh merged 1 commit into main from charlie/parser-bench-delimiter-depth
May 21, 2026

charliermarsh commented May 21, 2026

Member

Summary

We only need to peek ahead two tokens if the first token is `not` or `in`. We can avoid the `peek` in the majority of cases, which apparently speeds up the parser by 20-30%.
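
The idea can be sketched as follows. This is a minimal illustration, not Ruff's actual code: `TokenKind`, `Operator`, `Tokens`, and `operator_at_cursor` are hypothetical names, and the `peeks` counter exists only to make the saving visible. Python's two-token comparison operators are `is not` and `not in`, so this sketch peeks only when the current token is `is` or `not`; the exact condition used in the PR's diff is not shown on this page.

```rust
// Hypothetical token and operator kinds; these are not Ruff's actual types.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TokenKind { Plus, Less, Is, Not, In, Name }

#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Operator { Add, Lt, Is, IsNot, In, NotIn }

// A toy token cursor that counts how often `peek` is called.
#[allow(dead_code)]
struct Tokens {
    kinds: Vec<TokenKind>,
    pos: usize,
    peeks: usize,
}

#[allow(dead_code)]
impl Tokens {
    fn new(kinds: Vec<TokenKind>) -> Self {
        Tokens { kinds, pos: 0, peeks: 0 }
    }

    // The token at the cursor, if any.
    fn current(&self) -> Option<TokenKind> {
        self.kinds.get(self.pos).copied()
    }

    // The token after the cursor; stands in for the costly lookahead.
    fn peek(&mut self) -> Option<TokenKind> {
        self.peeks += 1;
        self.kinds.get(self.pos + 1).copied()
    }

    // Classify the operator at the cursor, peeking only when the current
    // token could start a two-token operator (`is not`, `not in`).
    fn operator_at_cursor(&mut self) -> Option<Operator> {
        match self.current()? {
            // Single-token operators: no lookahead needed.
            TokenKind::Plus => Some(Operator::Add),
            TokenKind::Less => Some(Operator::Lt),
            TokenKind::In => Some(Operator::In),
            // `is` may be followed by `not`.
            TokenKind::Is => {
                if self.peek() == Some(TokenKind::Not) {
                    Some(Operator::IsNot)
                } else {
                    Some(Operator::Is)
                }
            }
            // `not` is only a binary operator when followed by `in`.
            TokenKind::Not => {
                if self.peek() == Some(TokenKind::In) {
                    Some(Operator::NotIn)
                } else {
                    None
                }
            }
            TokenKind::Name => None,
        }
    }
}
```

In this sketch, classifying `+` or `<` never calls `peek`, while `is not` and `not in` each call it once. Since most operators in real code are single-token, the lookahead is skipped on the common path.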

astral-sh-bot Bot commented May 21, 2026

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

codspeed-hq Bot commented May 21, 2026

Merging this PR will improve performance by 10.75%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 17 improved benchmarks
✅ 100 untouched benchmarks

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | ty_micro[complex_constrained_attributes_1] | 74.6 ms | 70.6 ms | +5.69% |
| Simulation | ty_micro[many_tuple_assignments] | 65.8 ms | 62 ms | +5.99% |
| Simulation | ty_micro[complex_constrained_attributes_2] | 74.4 ms | 70 ms | +6.17% |
| Simulation | ty_micro[pandas_tdd] | 255.3 ms | 238.9 ms | +6.89% |
| Simulation | ty_micro[complex_constrained_attributes_3] | 80.8 ms | 76.9 ms | +5.03% |
| Simulation | ty_micro[many_tuple_assignments] | 74.4 ms | 70.1 ms | +6.18% |
| Simulation | ty_micro[gradual_vararg_call] | 74.9 ms | 70.7 ms | +6.03% |
| Simulation | ty_micro[many_enum_members] | 100.1 ms | 95.3 ms | +4.99% |
| Simulation | ty_micro[vararg_parameter_type_accumulation] | 64.1 ms | 60.4 ms | +6.17% |
| Simulation | ty_micro[many_enum_members_2] | 95.9 ms | 91.5 ms | +4.88% |
| Simulation | ty_micro[very_large_tuple] | 76.7 ms | 72.3 ms | +6.05% |
| Simulation | ty_micro[many_string_assignments] | 85 ms | 81 ms | +5.02% |
| Simulation | parser[large/dataset.py] | 4.7 ms | 3.6 ms | +30.44% |
| Simulation | parser[numpy/ctypeslib.py] | 883.8 µs | 713.6 µs | +23.85% |
| Simulation | parser[numpy/globals.py] | 103.7 µs | 90.4 µs | +14.77% |
| Simulation | parser[pydantic/types.py] | 1.8 ms | 1.4 ms | +27.42% |
| Simulation | parser[unicode/pypinyin.py] | 308.6 µs | 251.3 µs | +22.78% |
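
The Efficiency figures above appear consistent with a speedup measured relative to HEAD, that is, (BASE − HEAD) / HEAD, rather than a reduction relative to BASE. A minimal sketch, assuming that formula; it is inferred from the numbers shown here, not taken from CodSpeed's documentation, and `efficiency_percent` is a hypothetical helper name:

```rust
/// Relative speedup of HEAD over BASE, as a percentage.
/// Assumed formula, inferred from the table: (base - head) / head * 100.
/// Both arguments must be in the same time unit.
#[allow(dead_code)]
fn efficiency_percent(base: f64, head: f64) -> f64 {
    (base - head) / head * 100.0
}
```

For the `parser[numpy/ctypeslib.py]` row, `efficiency_percent(883.8, 713.6)` gives roughly 23.85, matching the table. Rows whose times are displayed with fewer digits (for example 4.7 ms and 3.6 ms) match only approximately, presumably because the displayed times are rounded.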


Comparing charlie/parser-bench-delimiter-depth (9764ee1) with main (6cbd59b)

AlexWaygood added the performance label (Potential performance improvement) May 21, 2026
charliermarsh changed the title from "View CodSpeed benchmarks" to "Avoid unnecessary parser lookahead for operators" May 21, 2026
charliermarsh marked this pull request as ready for review May 21, 2026 13:07

MichaReiser left a comment

Member

Wow

charliermarsh merged commit aea5ed4 into main May 21, 2026
52 checks passed
charliermarsh deleted the charlie/parser-bench-delimiter-depth branch May 21, 2026 13:13
@chadbrewbaker

I would do another /goal round looking at the buffers that `peek()` uses. In my mind's eye, it should load whole cache lines when possible, so that it only does memory fetches or OS calls when absolutely needed.

thejchap pushed a commit to thejchap/ruff that referenced this pull request May 23, 2026
anishgirianish pushed a commit to anishgirianish/ruff that referenced this pull request May 28, 2026