Add "containing range" APIs to QueryCursor #4919
tested Co-authored-by: Kirill Bulatov <mail4score@gmail.com> Co-authored-by: dino <dinojoaocosta@gmail.com> Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com> Co-authored-by: John Tur <john-tur@outlook.com>
Co-authored-by: Kirill Bulatov <mail4score@gmail.com> Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com> Co-authored-by: dino <dinojoaocosta@gmail.com> Co-authored-by: John Tur <john-tur@outlook.com> Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com> Co-authored-by: dino <dinojoaocosta@gmail.com>

Only bugfixes are backported, no API changes. We also want to clean up the

It's fine for this to make it into 0.26 if it's ready.

When this lands, we either need to get 0.26 out, or backport this. Or both. I don’t care which, but it’s not gonna work to just completely block releases at any point.

That is not what's happening here (believe me, I want this feature as much as you do). But it's your project, so you do what you want; I am tired of fighting about this.

For what it's worth I'd also like to release 0.26 soon with any PRs that are ready to ship and/or necessary:
I think we've delayed it long enough.

Ok, sounds good

I'd double check with @amaanq but it should be fine to release 0.26.1 after this lands. @maxbrunsfeld this will have some conflicts with #4908, do you have a preference for which should go in first? I believe the test refactor is good to merge now (unless you have any additional comments), but I'm also fine waiting on this to land first.

Hey @maxbrunsfeld. Do you have a timeline in mind for when this PR will be ready to go? We're hoping to publish 0.26.1 within the next week or two, and would really like to get this in if possible. If that's not enough time, we could always punt to

Edit: I've also copied, rebased, and tried to complete this branch in a separate PR here to keep your history intact. If you'd like to work off of that feel free!

0.27.0, since this is an API addition. CLI consumers of the API don't have Zed's freedom of ignoring distro maintainers insisting on having tree-sitter shipped as a shared system-wide library, which makes proper (semantic) versioning very important. (This is just about the number, not the timing! Nobody is stopping us from releasing 0.27.0 one week after 0.26.1.)

Thank you @WillLillis. Sorry, I have gotten pulled into a different project at Zed that is taking a lot of focus, so I had to step away from this. Feel free to proceed with the release with or without landing your rebased PR. If anyone wants to pick this up, I think the task list in the PR description is still up to date.

Superseded by #5100
…es search (zed-industries#39416) Part of zed-industries#39594 Closes zed-industries#4701 Closes zed-industries#42861 Closes zed-industries#44503 ~Depends on tree-sitter/tree-sitter#4919 Release Notes: - Fixed some performance bottlenecks related to syntax analysis when editing very large files --------- Co-authored-by: Kirill Bulatov <kirill@zed.dev>
Follow-up to #2085
Background
Running a query on a very large syntax tree is generally much faster if you provide a smaller range within which the query cursor should search. But the current QueryCursor::set_byte_range and ::set_point_range APIs are not always guaranteed to speed things up by the same amount for all queries. Those existing methods have the semantics that a match should only be returned if some node in the match intersects the given range. In some cases, in order to find all of the relevant matches, it's still necessary to walk a large portion of the tree, including nodes arbitrarily distant from the given range.

This is a problem when editing very large files in code editors. For example, in zed-industries/zed#4701, users are observing slowness in Zed's enclosing_bracket_ranges and suggested_indents queries when editing a large JSON document. These code paths are slow despite looking only for matches that contain the cursor's position.

Solution
This PR introduces new range-based QueryCursor APIs: set_containing_byte_range and set_containing_point_range, which have different range-filtering semantics. These allow you to search only for matches where all nodes are fully contained within the given range. These APIs are independent of the existing range-filtering methods, and can be used in conjunction with them for more advanced filtering.

For example, if you wanted to search for any matches that intersect line 5000, as long as they are fully contained within lines 4500-5500, you could do this:
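
A minimal, self-contained sketch of how the two filters combine, assuming a match can be modeled as a list of line ranges. It does not call tree-sitter: the Match type and every function name below are hypothetical, chosen only to illustrate the "intersects" and "fully contained" semantics described above. In terms of the PR's naming, the first range would presumably be passed to set_point_range and the second to set_containing_point_range.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a query match: the line ranges of its nodes.
struct Match {
    name: &'static str,
    nodes: Vec<Range<usize>>,
}

/// True if two half-open ranges share at least one line.
fn overlaps(a: &Range<usize>, b: &Range<usize>) -> bool {
    a.start < b.end && b.start < a.end
}

/// Existing semantics: keep the match if some node intersects `range`.
fn intersects(m: &Match, range: &Range<usize>) -> bool {
    m.nodes.iter().any(|n| overlaps(n, range))
}

/// New semantics: keep the match only if every node lies within `range`.
fn contained(m: &Match, range: &Range<usize>) -> bool {
    m.nodes
        .iter()
        .all(|n| range.start <= n.start && n.end <= range.end)
}

/// Apply both filters together and return the names of surviving matches.
fn filter_matches(
    matches: &[Match],
    intersect: Range<usize>,
    containing: Range<usize>,
) -> Vec<&'static str> {
    matches
        .iter()
        .filter(|m| intersects(m, &intersect) && contained(m, &containing))
        .map(|m| m.name)
        .collect()
}

/// Matches that touch line 5000 and fit entirely within lines 4500-5500.
fn demo_filters() -> Vec<&'static str> {
    let matches = vec![
        Match { name: "small_block", nodes: vec![4990..5010] },
        Match { name: "whole_file", nodes: vec![0..100_000] },
        Match { name: "far_away", nodes: vec![4600..4700] },
    ];
    filter_matches(&matches, 5000..5001, 4500..5500)
}
```

In this model, whole_file touches line 5000 but is rejected by the containing range, and far_away fits inside the containing range but never touches line 5000, so only small_block survives.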
The benefit of these new APIs is that they make possible a more aggressive optimization inside the query cursor, where it will definitively not descend into any syntax nodes outside of the given range.
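
To illustrate why a containing range permits this pruning, here is a hedged sketch of a tree walk over a toy tree. The Node type and walk function are hypothetical and are not tree-sitter's internals; the sketch only assumes that a subtree lying entirely outside the containing range cannot contribute a fully contained node, so it never needs to be entered.

```rust
use std::ops::Range;

/// Hypothetical syntax node: a byte range plus child nodes.
struct Node {
    range: Range<usize>,
    children: Vec<Node>,
}

/// Collect nodes fully inside `containing`, counting every node visited.
/// Children that do not overlap `containing` are skipped without descending.
fn walk(
    node: &Node,
    containing: &Range<usize>,
    visited: &mut usize,
    found: &mut Vec<Range<usize>>,
) {
    *visited += 1;
    if containing.start <= node.range.start && node.range.end <= containing.end {
        found.push(node.range.clone());
    }
    for child in &node.children {
        let disjoint =
            child.range.end <= containing.start || child.range.start >= containing.end;
        if !disjoint {
            walk(child, containing, visited, found);
        }
    }
}

fn leaf(range: Range<usize>) -> Node {
    Node { range, children: vec![] }
}

/// Build a 31-node tree (root, 10 children, 20 leaves) and search 400..500.
fn demo_pruned_walk() -> (usize, Vec<Range<usize>>) {
    let children = (0..10usize)
        .map(|i| {
            let s = i * 100;
            Node {
                range: s..s + 100,
                children: vec![leaf(s..s + 50), leaf(s + 50..s + 100)],
            }
        })
        .collect();
    let root = Node { range: 0..1000, children };
    let mut visited = 0;
    let mut found = Vec::new();
    walk(&root, &(400..500), &mut visited, &mut found);
    (visited, found)
}
```

Here only 4 of the 31 nodes are visited: the root, the one child overlapping 400..500, and its two leaves. An ancestor that merely overlaps the range (the root) is still entered, since its descendants may be fully contained.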
Todo