feat: Add support for precise usagesForSymbol#64126
Conversation
|
Caution License checking failed, please read: how to deal with third parties licensing. |
ac6d4aa to
b3d059c
Compare
bf37c0d to
b5e186d
Compare
b5e186d to
7366fe9
Compare
kritzcreek
left a comment
There was a problem hiding this comment.
Thanks, this is looking really good. Do you have plans for how you're going to test the new PreciseUsages logic?
| Limit: remainingCount, | ||
| RawCursor: currentCursor.PreciseCursor.Encode(), | ||
| Path: args.Path, | ||
| Matcher: shared.NewSCIPBasedMatcher(args.Range, lookupSymbol), |
There was a problem hiding this comment.
I'd prefer Option[string] for lookupSymbol here. AFAICT at the only place we're pulling it back out we immediately check against the empty string.
There was a problem hiding this comment.
This is a weird case where we don't want to allow an empty string as a valid input. So we want something like Option<NonEmptyString>. I've kept this function as-is, but changed the internal representation to use Option[string]. Changing this signature makes the code more cumbersome for callers, and still requires the code internally to fold Some("") to None.
There was a problem hiding this comment.
I've looked around a bit, but I still don't understand why we need special handling for "". Is it because the cursor json encoding ends up using the empty string to signal "None"?
There was a problem hiding this comment.
We can fix this in a follow-up (don't want to hold up a stack of PRs over this).
There was a problem hiding this comment.
Looking up for the empty string doesn't make sense (because it's not a valid symbol string), but the documents we store in the DB may have some occurrences with empty string as the symbol name (the current normalization scheme doesn't strip those out). So if we allow creating Some(""), we need to do a check later during symbol comparison to handle Some("") differently from Some(non-empty string).
There was a problem hiding this comment.
Okay, that sounds to me like None and Some("") are not the same after all. None means I'm not supposed to filter based on symbol, while Some("") means the user asked to match against the empty symbol... which is just not going to produce any matches, but that would be just as true for Some("1234").
but the documents we store in the DB may have some occurrences with empty string as the symbol name (the current normalization scheme doesn't strip those out).
Have you seen these occurrences or did you just infer they exist because of the existing "" checks in the code? I still don't understand why "" is the only invalid symbol string we care about, other than Go null value fears.
There was a problem hiding this comment.
Okay, that sounds to me like None and Some("") are not the same after all. None means I'm not supposed to filter based on symbol, while Some("") means the user asked to match against the empty symbol... which is just not going to produce any matches, but that would be just as true for Some("1234").
It's not terribly obvious here, but the outer loop means that the symbol values being matched against will dynamically change (e.g. during Find impls, we will find other symbol names, and we don't want to accidentally get ""). So even if you start a request, if there is some bad index data, it's possible that we start matching using "".
Have you seen these occurrences or did you just infer they exist because of the existing "" checks in the code? I still don't understand why "" is the only invalid symbol string we care about, other than Go null value fears.
- Ideally, we'd assert that all symbol names that are present in the DB and that we get as inputs are well-formed, but that is a much more complex undertaking.
- Having
""is a very easy mistake to make, and we don't want to accidentally return""in the symbols in the API (so if we don't put the check here, we need to push the check to code later in the pipeline).
It's not a perfect solution for sure, but I think it's better to have the checks that we can have for now, instead of not checking anything at all, or risking checking too much and potentially breaking code nav for a bunch of people.
| path: usage.Path, | ||
| range_: usage.Range, | ||
| }, | ||
| // TODO: Record if we got the results from Searcher or Zoekt |
There was a problem hiding this comment.
I checked with Julie and Stefan and we don't have a way of getting these results. There are some metrics recorded in Honeycomb, but nothing for an individual match. (Especially since hybrid search means a given search command can end up hitting both)
There was a problem hiding this comment.
Then let's file a feature request against Search Platform team? It might be useful to have this information for debugging.
b2aa8f7 to
2c2e4c4
Compare
No, I need to think about it some more and look at the existing tests. But I will be working on that after fixing the perf hole. |
This patch wires up the newly changed APIs in #64118
to the GraphQL API, enabling precise support in the
usagesForSymbol API. It also handles pagination.
Fixes https://linear.app/sourcegraph/issue/GRAPH-573
Test plan
Manually tested. Will add automated tests in follow-up PR.
Changelog
usagesForSymbolAPI,which can be used to implement a reference panel or similar functionality.