Description
Red Knot currently loads and holds on to every analyzed file's AST and source code. For large projects, this can easily take up multiple GB of memory.
Persistent caching, which removes the need to analyze all files, should help with overall memory consumption, but it won't help for the initial run or when many files have changed.
Salsa plans to add support for LRU garbage collection (or already has). It might be able to do exactly what we need, but I'm not sure whether it is limited to collecting stale values between revisions. I also suspect that it won't help in our case: collecting the cached results for parsed_modules isn't sufficient, because DefinitionKind and ExpressionKind hold on to an Arced AST and the tracked structs don't get collected.
One option would be a lazy representation of AstNodeRef (WeakAstNodeRef) that stores an identifier that uniquely identifies the node and a weak Arc to the AST, together with the node's reference. If the Arc is still materialized, resolving the node uses the AST as is. If the Arc has been collected, resolving the node re-parses the file and finds the node in the new tree (which could be expensive).
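To make the idea concrete, here is a minimal sketch of what such a WeakAstNodeRef could look like. It is not Red Knot's actual code: `Ast`, `NodeId`, `parse`, `ResolvedNode`, and the `reparse` callback are hypothetical stand-ins (in practice the re-parse would presumably go through the Salsa query for parsed modules), and the node identifier is assumed to be a plain index that stays stable across re-parses of unchanged source.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Hypothetical stand-in for a parsed module: a flat list of nodes.
#[derive(Debug)]
pub struct Ast {
    pub nodes: Vec<String>,
}

/// Hypothetical stable identifier: the node's index in `Ast::nodes`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NodeId(pub usize);

/// Toy parser (assumption): one node per whitespace-separated token.
pub fn parse(source: &str) -> Arc<Ast> {
    Arc::new(Ast {
        nodes: source.split_whitespace().map(str::to_owned).collect(),
    })
}

/// Refers to a node without keeping its AST alive.
pub struct WeakAstNodeRef {
    id: NodeId,
    ast: Mutex<Weak<Ast>>,
}

/// Keeps the AST alive for as long as the node is being used.
pub struct ResolvedNode {
    ast: Arc<Ast>,
    id: NodeId,
}

impl ResolvedNode {
    pub fn node(&self) -> &str {
        &self.ast.nodes[self.id.0]
    }
}

impl WeakAstNodeRef {
    pub fn new(ast: &Arc<Ast>, id: NodeId) -> Self {
        Self {
            id,
            ast: Mutex::new(Arc::downgrade(ast)),
        }
    }

    /// Uses the AST as is while it is still alive. Calls `reparse`
    /// (the potentially expensive path) only if it has been dropped.
    pub fn resolve(&self, reparse: impl FnOnce() -> Arc<Ast>) -> ResolvedNode {
        let mut weak = self.ast.lock().unwrap();
        let ast = match weak.upgrade() {
            Some(ast) => ast,
            None => {
                let ast = reparse();
                // Remember the fresh AST so later lookups can reuse it
                // for as long as someone else keeps it alive.
                *weak = Arc::downgrade(&ast);
                ast
            }
        };
        ResolvedNode { ast, id: self.id }
    }
}
```

In this sketch, nothing owned by the reference itself keeps the tree in memory: once every strong Arc is dropped, the AST is freed, and the next `resolve` call pays for a re-parse. Finding the node by a stable identifier in a real syntax tree would be more involved than the index lookup shown here.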
We should look into ways to drop ASTs that are no longer needed.