Skip to content

Pre-compute the module range for LALRPOP parser#10879

Merged
dhruvmanila merged 1 commit intodhruv/parserfrom
dhruv/mod-range-lalrpop
Apr 11, 2024
Merged

Pre-compute the module range for LALRPOP parser#10879
dhruvmanila merged 1 commit intodhruv/parserfrom
dhruv/mod-range-lalrpop

Conversation

@dhruvmanila
Copy link
Member

Summary

This PR updates the LALRPOP to use the complete source range for module by looking at the tokens before the parsing begins. This means the module range now corresponds to the actual source even if it only contains trivia tokens. This is mainly so that the fuzzer doesn't crash because the AST doesn't match.

@dhruvmanila dhruvmanila added the internal An internal refactor or improvement label Apr 11, 2024
@MichaReiser
Copy link
Member

MichaReiser commented Apr 11, 2024

Oh no, the formatter fails now. I think we need to fix

if !f.context().comments().has_leading(item)
&& lines_after(range.end(), f.context().source()) != 0
{
empty_line().fmt(f)
} else {

to count the lines after range.start() instead of after range.end()?

@MichaReiser
Copy link
Member

I went ahead and pushed the fix. I hope that's okay with you

@dhruvmanila
Copy link
Member Author

Thank you!

Base automatically changed from dhruv/async-for-panic to dhruv/parser April 11, 2024 16:44
@dhruvmanila dhruvmanila force-pushed the dhruv/mod-range-lalrpop branch from 5c117e7 to d2d4b3e Compare April 11, 2024 16:44
@dhruvmanila dhruvmanila merged commit df9e766 into dhruv/parser Apr 11, 2024
@dhruvmanila dhruvmanila deleted the dhruv/mod-range-lalrpop branch April 11, 2024 16:45
dhruvmanila added a commit that referenced this pull request Apr 11, 2024
## Summary

This PR updates the LALRPOP to use the complete source range for module
by looking at the tokens before the parsing begins. This means the
module range now corresponds to the actual source even if it only
contains trivia tokens. This is mainly so that the fuzzer doesn't crash
because the AST doesn't match.
@codspeed-hq
Copy link

codspeed-hq bot commented Apr 11, 2024

CodSpeed Performance Report

Merging #10879 will degrade performances by 5.27%

Comparing dhruv/mod-range-lalrpop (d2d4b3e) with dhruv/parser (b069358)

Summary

⚡ 2 improvements
❌ 1 regressions
✅ 27 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark dhruv/parser dhruv/mod-range-lalrpop Change
parser[large/dataset.py] 28.8 ms 26.7 ms +8.16%
parser[pydantic/types.py] 11.8 ms 10.9 ms +8.77%
parser[unicode/pypinyin.py] 1.6 ms 1.7 ms -5.27%

@dhruvmanila
Copy link
Member Author

Ugh, I force pushed the branch and removed your change, I'll fix this.

dhruvmanila added a commit that referenced this pull request Apr 15, 2024
## Summary

This PR updates the LALRPOP to use the complete source range for module
by looking at the tokens before the parsing begins. This means the
module range now corresponds to the actual source even if it only
contains trivia tokens. This is mainly so that the fuzzer doesn't crash
because the AST doesn't match.
dhruvmanila added a commit that referenced this pull request Apr 16, 2024
## Summary

This PR updates the LALRPOP to use the complete source range for module
by looking at the tokens before the parsing begins. This means the
module range now corresponds to the actual source even if it only
contains trivia tokens. This is mainly so that the fuzzer doesn't crash
because the AST doesn't match.
dhruvmanila added a commit that referenced this pull request Apr 16, 2024
## Summary

This PR updates the LALRPOP to use the complete source range for module
by looking at the tokens before the parsing begins. This means the
module range now corresponds to the actual source even if it only
contains trivia tokens. This is mainly so that the fuzzer doesn't crash
because the AST doesn't match.
dhruvmanila added a commit that referenced this pull request Apr 17, 2024
## Summary

This PR updates the LALRPOP to use the complete source range for module
by looking at the tokens before the parsing begins. This means the
module range now corresponds to the actual source even if it only
contains trivia tokens. This is mainly so that the fuzzer doesn't crash
because the AST doesn't match.
dhruvmanila added a commit that referenced this pull request Apr 18, 2024
## Summary

This PR updates the LALRPOP to use the complete source range for module
by looking at the tokens before the parsing begins. This means the
module range now corresponds to the actual source even if it only
contains trivia tokens. This is mainly so that the fuzzer doesn't crash
because the AST doesn't match.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

internal An internal refactor or improvement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants