(force-pushed from 7726f46 to 8491a39)
I'll review tomorrow. Can you add a log for how many times we end up reparsing an AST, and do a cold run on some project to see whether we see any reparses, to validate our assumption that this shouldn't happen?
CodSpeed Performance Report: merging #18482 will degrade performance by 4.43%.
The incremental performance regression is expected. I'm not sure if we want less eager garbage collection for the LSP, or if the initial hit is fine and we instead try to keep recently reparsed ASTs around on subsequent runs. The cold check performance regression is pretty minor (~3% both on CodSpeed and locally on mypy_primer runs), even with the extra AST traversal.
Thanks for working on this! I'm going to let @MichaReiser handle review :) I'm curious about the new `NodeIndex` bit. Is the entire purpose of that so that we can use it in …
This is great!
I'd prefer to implement the ID assignment in the parser. Not only does this enable other crates to make use of the index, it also allows us to change `NodeIndex` to a `u32`, which is `Copy`.
Do you think it would be possible to collect the nodes inside the parser (as part of the parsing step)? That would remove the need for an extra traversal. We could add an option to only collect them when requested.
Besides these code changes: it would be great to have some end-to-end benchmarks.
- Can you run `ty_benchmark` on some project and share the before/after runtime?
- Can you add some logging for when an AST gets reparsed because it was cleared before, and run ty across some projects? Do we see any reparses within the same revision? If so, why?
For the LSP case: it probably makes sense for us not to purge the AST for any file that is in `project.open_files`. This should address any LSP concerns, and I think even the perf regression in the benchmark.
Creating a flat index of the AST upfront allows us to avoid both of those, and …
The `ty_benchmark` results match the 3% regression on CodSpeed.
Given that the regression is quite minor, I'm inclined to get this merged. I spent some time trying to assign indices in the parser, but it turned out to be non-trivial. The parser is recursive-descent and we want to assign indices in source order, so we run into problems where the parent node depends on the result of the recursion: we don't know whether or not to claim an outer index before parsing the inner nodes. I think we would end up having to do mini-traversals to adjust the children's indices; it's much easier to do this in a separate pass. If the parser also collected the nodes, then it could do a simple pass over those and assign the indices then, but that could also be addressed in a future PR. I also ran on a number of projects and was unable to trigger a reparse within a revision; I added some logs in case that ever happens.
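To illustrate why the separate pass is simple, here is a minimal sketch with a toy node type (not ty's actual AST): a single pre-order traversal numbers every node in source order, which is exactly the ordering that is hard to guarantee inside a recursive-descent parser, where a parent node is only constructed after its children.

```rust
// Toy tree; `index` starts unset and is filled in source order
// by one pre-order traversal after parsing.
#[derive(Debug)]
struct Node {
    index: Option<u32>,
    children: Vec<Node>,
}

/// Assign indices in source (pre-order) order: a parent always
/// claims its index before any of its children.
fn assign_indices(node: &mut Node, next: &mut u32) {
    node.index = Some(*next);
    *next += 1;
    for child in &mut node.children {
        assign_indices(child, next);
    }
}

fn main() {
    let mut root = Node {
        index: None,
        children: vec![
            Node {
                index: None,
                children: vec![Node { index: None, children: vec![] }],
            },
            Node { index: None, children: vec![] },
        ],
    };
    let mut next = 0;
    assign_indices(&mut root, &mut next);
    // Pre-order: root = 0, first child = 1, grandchild = 2, second child = 3.
    assert_eq!(root.index, Some(0));
    assert_eq!(root.children[0].index, Some(1));
    assert_eq!(root.children[0].children[0].index, Some(2));
    assert_eq!(root.children[1].index, Some(3));
}
```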
Just trying to understand this better. Can you share a specific instance of this? The main thing that I find slightly annoying is that …
Curious - would there be negative consequences to skipping an index in some scenarios? |
Right now all indices are assigned strictly monotonically, so we can use a simple …
Could we update the index when wrapping the node in a tuple? (It may require a traversal of the inner node.)
We could, but I guess I don't really see the benefit of doing that. Collecting the nodes as part of parsing would have a similar problem, I think (we'd have to insert parent nodes into the middle of the list before their children). The separate pass is much simpler and doesn't seem to impact performance noticeably.
Could we increase the index non-monotonically during parsing and then remove the gaps after the parsing is done? Or, would that be equivalent to visiting the parsed tree at the end to assign the index? |
Just double checking, did you do this check with a release build? |
I'm fine with the extra traversal for now, even though a 3% regression is fairly substantial for something that doesn't add new typing features. The memory reduction is probably a good justification for it.
But no longer doing it in the parser invalidates the main reason why we added the NodeIndex to every AST node. I did a quick search and it seems that the only place where we create AstNodeRefs is in the SemanticIndexBuilder. This makes me wonder if an approach closer to AstNodeIds would be a better fit where:
- `SemanticIndex` stores a `nodes: IndexVec<NodeIndex, AnyRootNodeRef>` and an inverse map from `AnyNodeRef` (using ptr equality) to `NodeIndex`.
- `SemanticIndexBuilder` has a `register_node` method that, given a `node`, returns an `AstNodeRef` to it.
This has a couple of advantages:
- No changes to the AST, no size increase of the AST
- We only register exactly the nodes we need and not all nodes.
- It doesn't require an extra traversal
The obvious downside is that it requires an inverse hash map, which might be costly. What are your thoughts on building the index outside the AST?
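A minimal sketch of this alternative, using the names from the comment above (`register_node`, the inverse map) but a stand-in node type rather than ty's AST: only the nodes the semantic index actually touches get registered, keyed by the node's address.

```rust
use std::collections::HashMap;

// Stand-in for an AST node; in ty this would be AnyRootNodeRef / AnyNodeRef.
struct Node;

struct SemanticIndexBuilder<'a> {
    // Plays the role of IndexVec<NodeIndex, AnyRootNodeRef>.
    nodes: Vec<&'a Node>,
    // Inverse map using pointer equality, as proposed above.
    by_ptr: HashMap<*const Node, u32>,
}

impl<'a> SemanticIndexBuilder<'a> {
    /// Return the node's index, registering it on first sight.
    /// Registering the same node twice yields the same index.
    fn register_node(&mut self, node: &'a Node) -> u32 {
        let ptr = node as *const Node;
        if let Some(&index) = self.by_ptr.get(&ptr) {
            return index;
        }
        let index = self.nodes.len() as u32;
        self.nodes.push(node);
        self.by_ptr.insert(ptr, index);
        index
    }
}

fn main() {
    let a = Node;
    let b = Node;
    let mut builder = SemanticIndexBuilder {
        nodes: Vec::new(),
        by_ptr: HashMap::new(),
    };
    assert_eq!(builder.register_node(&a), 0);
    assert_eq!(builder.register_node(&b), 1);
    // Idempotent: the same node maps back to its existing index.
    assert_eq!(builder.register_node(&a), 0);
}
```

The hash-map lookup on every registration is the cost being weighed against the extra traversal and the per-node `node_index` field.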
I wonder if it would make sense to split this PR into a few smaller PRs to, at least, get the most annoying changes landed sooner and also have a chance to better focus on some discussions:
- Add `NodeIndex`: decide on which nodes have a `NodeIndex` and whether it should be an atomic or not
- Add a `SourceOrderMutVisitor` (if we decide not to use atomics)
- The actual AST GC implementation
But I leave this decision up to you.
On the atomic discussion. I'm a bit undecided:
- I think we should remove the atomic and instead introduce a `SourceOrderMutVisitor` (or `SourceOrderTransformer`) if we don't plan on integrating the indexing/numbering into the semantic index step
- Otherwise, keeping the atomic seems fine, but we should follow up with trying to integrate the pass into semantic indexing
```
# ------------------------------------------------------------------------------
# AnyRootNodeRef
```
What this type is isn't clear to me from its name. At first I thought it's about types that can be a "root" of what the parser returns (ModModule or ModExpression), but that's not what it is. Instead, it's a very specific selection of nodes?
Reading through the code, I get the impression that it is related to nodes to which we assign IDs. Which brings up a related question: Why are we only assigning IDs to some nodes but not all? IMO, it's confusing that only some nodes have a NodeId. I'd prefer to just assign an ID to every node instead.
I added some documentation on why we need a new type. The name isn't great, but I'm not sure of a better one. I believe all nodes have a NodeId, unless I missed one.
```rust
/// Create a new [`IndexedModule`] from the given AST.
#[allow(clippy::unnecessary_cast)]
pub fn new(parsed: Parsed<ModModule>) -> Arc<Self> {
    let mut visitor = Visitor {
```
Future: It could be interesting to collect some more numbers about the parsed module here. E.g. how many expressions are there? This could help to correctly initialize the HashMaps in the SemanticIndexBuilder that currently suffer from many resizes.
I explored a number of options here:
I think the current solution, where the flat index is created and managed directly by the …
dhruvmanila left a comment:
I'll leave this to Micha but a few things I noticed.
```
---
source: crates/ruff_python_parser/src/string.rs
expression: suite
snapshot_kind: text
```
I always forget: is this because the snapshots haven't been touched in a while and cargo-insta has removed this field in a recent version, or is it that you have an older version of cargo-insta?
I'm on the latest version of cargo-insta so I assume the former.
```
node_index: NodeIndex(
    "_",
),
```
Should we generate a custom Debug implementation for the nodes to avoid including this field in the snapshot? Or, is it required that we include this field in the Debug implementation?
I don't really see the benefit of removing this field, we may end up setting it in the parser later. I updated the debug implementation of NodeIndex to avoid the extra whitespace.
MichaReiser left a comment:
Nice! This is great and thank you for exploring all my not very fruitful alternatives.
I think it could be nice to have a non-atomic `NodeIndex` that we could use in `AstNodeRef`, `get_by_index`, and `NodeKey`.
Just noticed this had been merged after I made the alpha-10 release. I think a 20-30% peak memory reduction is worth noting for users! So in future I think we could leave off the …

This is awesome!
Summary
Garbage collect ASTs once we are done checking a given file. Queries with a cross-file dependency on the AST will reparse the file on demand. This reduces ty's peak memory usage by ~20-30%.
The primary change of this PR is adding a `node_index` field to every AST node. `ParsedModule` can use this to create a flat index of AST nodes any time the file is parsed (or reparsed). This allows `AstNodeRef` to simply index into the current instance of the `ParsedModule`, instead of storing a pointer directly.

The indices are somewhat hackily (using an atomic integer) assigned by the `parsed_module` query instead of by the parser directly. Assigning the indices in source order in the (recursive) parser turns out to be difficult, and collecting the nodes during semantic indexing is impossible, as `SemanticIndex` does not hold onto a specific `ParsedModuleRef`, which the pointers in the flat AST are tied to. This means that we have to do an extra AST traversal to assign and collect the nodes into a flat index, but the small performance impact (~3% on cold runs) seems worth it for the memory savings.

Part of astral-sh/ty#214.
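The atomic scheme described above can be sketched as follows, with stand-in types rather than ty's actual AST: each node claims the next value from a shared counter during one post-parse traversal. The atomic matters because it lets the numbering pass work through shared `&` references, avoiding the `SourceOrderMutVisitor` discussed earlier in the thread.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Stand-in AST node: the index is interior-mutable so it can be
// written through a shared reference, like ty's atomic NodeIndex.
struct Node {
    index: AtomicU32,
    children: Vec<Node>,
}

/// Number all nodes in source (pre-order) order from a shared counter.
fn number(node: &Node, counter: &AtomicU32) {
    // Relaxed ordering is enough here: this traversal is the only writer.
    node.index
        .store(counter.fetch_add(1, Ordering::Relaxed), Ordering::Relaxed);
    for child in &node.children {
        number(child, counter);
    }
}

fn main() {
    let root = Node {
        index: AtomicU32::new(u32::MAX),
        children: vec![Node {
            index: AtomicU32::new(u32::MAX),
            children: vec![],
        }],
    };
    let counter = AtomicU32::new(0);
    number(&root, &counter);
    assert_eq!(root.index.load(Ordering::Relaxed), 0);
    assert_eq!(root.children[0].index.load(Ordering::Relaxed), 1);
}
```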