[ty] Use Salsa to cache workspace symbols#20030
Conversation
Diagnostic diff on typing conformance testsNo changes detected when running ty on typing conformance tests ✅ |
|
|
dhruvmanila
left a comment
There was a problem hiding this comment.
Looks great! Do we have any numbers on the difference between a few requests like (a) no query (b) with small and large query string?
MichaReiser
left a comment
There was a problem hiding this comment.
I think there are ways on how we could improve the caching further (or, at least, reduce the caching overhead) and concurrency should make this much faster.
It does hurt us a little that we garbage collect parsed ASTs when workspace diagnostics are enabled, but that's out of scope for this change
This is a pretty naive approach, but it makes cold times for listing workspace symbols in home-assistant under 1s on my machine. Courtesy of Micha: #20030 (comment)
e195b9d to
78d87ea
Compare
Useful for ad hoc debugging, but it's also useful to have permanently to enable serendipitous discovery of performance problems.
This is prep work for turning this into a Salsa query. Specifically, we don't want the Salsa query to be dependent upon the query string.
There is a small amount of subtlety to this matching routine, and it could be implemented in a faster way. So let's right some tests for what we have to ensure we don't break anything when we optimize it.
While this doesn't typically matter, when ty returns a very large list of symbols, this can have an impact. Specifically, when searching `async` in home-assistant, this gets times closer to 500ms versus closer to 600ms before this change. It looks like an overall ~50ms improvement (so around 10%), but variance is all over the place and I didn't do any statistical tests. But this does make intuitive sense. Previously, we were allocating intermediate strings, doing UTF-8 decoding and consulting Unicode casing tables. Now we're just doing what is likely a single DFA scan. In effect, we front load all of the Unicode junk into regex compilation.
This is a pretty naive approach, but it makes cold times for listing workspace symbols in home-assistant under 1s on my machine. Courtesy of Micha: #20030 (comment)
17e7b0f to
fd8109c
Compare
After the initial cold run, requests generally take a few dozen milliseconds. There isn't a ton of variation with respect to query string length (which is what I'd expect I think). The slowest part of the process here is when we return a lot of symbols and VS Code needs to update its drop-down UI. There's a fairly sizeable delay there (on the order of a couple seconds). But to VS Code's credit, if you keep typing, it's pretty good about stopping what it was doing and moving on to the next request. It overall feels pretty good. It'd be nice if VS Code was a little snappier, but I think the only thing we could do there is return fewer symbols. And I'm not sure if that's a good idea. |
MichaReiser
left a comment
There was a problem hiding this comment.
Nice! Thank you for figuring out this tricky representation. This required more work than I expected
| let mut last_end: usize = 0; | ||
| for (tree, child_symbol_ids) in self.symbols.iter().zip(children_ids) { | ||
| let start = last_end; | ||
| let end = start + child_symbol_ids.len(); |
There was a problem hiding this comment.
I'm a bit surprised this works. Is it guaranteed that we add all symbols (children) in breadth-first and not depth-first order? Or am I overlooking something here that makes this possible?
We use a similar representation for Scope in the SemanticIndex where each Scope stores a Range<FileScopeId>. However, that range doesn't capture the Scope's children; it's the Scope's descendants. Finding the Scope's children requires iterating over the descendants and skipping all those with a different parent.
It's not fully clear to me why we don't have the descendants problem here. Maybe it's worth adding a comment why start + child_symbol_ids.len() works.
ruff/crates/ty_python_semantic/src/semantic_index.rs
Lines 612 to 615 in 4ac2b2c
There was a problem hiding this comment.
Yeah I can add a comment. It works because we guarantee all parents are added before their children in the visitor. I think I wrote a comment about that somewhere, but I should make it more prominent.
All this code is doing is taking the Map<SymbolId, Vec<SymbolId>> representation created above and flattening it.
Basically, this splits the implementation into two pieces: the first piece does the traversal and finds *all* symbols across the workspace. The second piece does filtering based on a user provided query string. Only the first piece is cached by Salsa. This brings warm "workspace symbols" requests down from 500-600ms to 100-200ms.
This is a pretty naive approach, but it makes cold times for listing workspace symbols in home-assistant under 1s on my machine. Courtesy of Micha: #20030 (comment)
In effect, we make the Salsa query aspect keyed only on whether we want global symbols. We move everything else (hierarchical and querying) to an aggregate step *after* the query. This was a somewhat involved change since we want to return a flattened list from visiting the source while also preserving enough information to reform the symbols into a hierarchical structure that the LSP expects. But I think overall the API has gotten simpler and we encode more invariants into the type system. (For example, previously you got a runtime assertion if you tried to provide a query string while enabling hierarchical mode. But now that's prevented by construction.)
This makes use of early continue/return to keep rightward drift under control. (I also find it easier to read.)
fd8109c to
428a532
Compare
This is a pretty naive approach, but it makes cold times for listing workspace symbols in home-assistant under 1s on my machine. Courtesy of Micha: #20030 (comment)
This is a pretty naive approach, but it makes cold times for listing workspace symbols in home-assistant under 1s on my machine. Courtesy of Micha: astral-sh#20030 (comment)
This PR pretty much does what astral-sh/ty#998 suggests: it turns the
symbol gathering at the file level into a Salsa query while being
careful not to make the query string a dependency of that query. This
lets Salsa cache the symbols on a per-file basis, while we do the
filtering in a separate aggregation step.
This PR also includes a (small) bonus optimization using a regex for
query filtering. :-)
Fixes astral-sh/ty#998