Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

chore(search): update logging of search durations#64269

Merged
stefanhengl merged 2 commits into
mainfrom
stefan-splf-182-fix-logging-of-keyword-searches
Aug 6, 2024
Merged

chore(search): update logging of search durations#64269
stefanhengl merged 2 commits into
mainfrom
stefan-splf-182-fix-logging-of-keyword-searches

Conversation

@stefanhengl

@stefanhengl stefanhengl commented Aug 5, 2024

Copy link
Copy Markdown
Member

Relates to slack convo

The current logging is outdated. This PR is not a complete rewrite. It tries to keep semantics mostly as-is but it is more generous when it comes to what we log. (For example we didn't log (AND foo bar) before)

IMO this is a good compromise we can ship quickly and then take some time to figure out how to distinguish suggestions, searches, code-intel and whether it is a good idea to mix search types and result types.

Updates:

  • Prefer to log result types, fall back to pattern type. (This is largely the same logic as before but we don't bail for complex queries)
  • Don't guess repo and file results types if not explicitly specified. We loose a bit of information here but it keeps the code much cleaner.
  • Allow "AND" and "OR" operators.

The updated test case gives a good overview of how we would log things in the future.

Kept as-is:

  • I left "new" and "legacy" logging in place. I am not sure how we use the target stores right now so I wanted to keep this as-is for now. However we should see if we can retire the legacy logging.
  • The previous and current logic translates type:file queries to pattern types. I guess this makes sense from the perspective of how things are implemented in the backend.
  • "type:path" is logged as "file".

Test plan:

Updated unit test

This updates and simplifies our logging as follows:
- Prefer to log result types, fall back to pattern type.
- Don't guess repo and file results types if not explicitly specified.
- Allow "AND" and "OR" operators.

Notes:
- I left "New" and "legacy" logging in place. I am not sure how we use
the target stores right now so I wanted to keep this as-is for now.
However we should update this.
- The previous and current logic translates `type:file` queries to
pattern types. This makes sense from the perspective of how things are
implemented in the backend, but it might be confusing for the consumer of
the reporting.

Test plan:
updated unit test
@cla-bot cla-bot Bot added the cla-signed label Aug 5, 2024
@github-actions github-actions Bot added team/product-platform team/search-platform Issues owned by the search platform team labels Aug 5, 2024
// categories.
if len(types) > 1 {
return
// Prefer the result type if it is a single type, otherwise use the pattern

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly copy&pasted from below. New code from lines 140 on.

return
clients.Logger.Warn("Could not log search latency", log.Error(err))
}
if !query.IsPatternAtom(q) {

@stefanhengl stefanhengl Aug 5, 2024

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means we never logged keyword searches with more than 1 term.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below it says // Not an atomic pattern, can't guarantee accurate reporting., I guess we don't think that's a concern anymore? I can't see a big issue with it, and it's critical to capture these cases for keyword search.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I think we prefer completeness over accuracy here, especially now that keyword search is the default.

@stefanhengl stefanhengl requested a review from a team August 5, 2024 12:41
@stefanhengl stefanhengl marked this pull request as ready for review August 5, 2024 12:41
@jtibshirani jtibshirani requested a review from a team August 5, 2024 14:37

@jtibshirani jtibshirani left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. I like the simplification to not infer response types.

I am not familiar with our approach to new vs. legacy telemetry, and added the data team in case they have an opinion.

Also noting: we discussed offline about rethinking this reporting at some point (like why do we mix result and pattern types? and we shouldn't mix suggestions and API usage together with search bar usage)

return
clients.Logger.Warn("Could not log search latency", log.Error(err))
}
if !query.IsPatternAtom(q) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Below it says // Not an atomic pattern, can't guarantee accurate reporting., I guess we don't think that's a concern anymore? I can't see a big issue with it, and it's critical to capture these cases for keyword search.


// Search.latencies
events.Record(ctx, "search.latencies", telemetry.SafeAction(action), &telemetry.EventParameters{
Metadata: telemetry.EventMetadata{

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to capture more metadata here (like number of clauses). We have been wondering about how often people use multi-term searches, and therefore benefit from keyword search. Not necessary for this PR, just brainstorming!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I will see if it is quick to add. If not I will file a follow up issue.

@stefanhengl stefanhengl merged commit 737e460 into main Aug 6, 2024
@stefanhengl stefanhengl deleted the stefan-splf-182-fix-logging-of-keyword-searches branch August 6, 2024 05:26
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed team/product-platform team/search-platform Issues owned by the search platform team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants