Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

search: Distinguish file content matches from path matches in the UI when select:file is present#41297

Merged
tbliu98 merged 13 commits into
mainfrom
tl/distinguish-file-content-match-from-path-match
Sep 6, 2022
Merged

search: Distinguish file content matches from path matches in the UI when select:file is present#41297
tbliu98 merged 13 commits into
mainfrom
tl/distinguish-file-content-match-from-path-match

Conversation

@tbliu98

@tbliu98 tbliu98 commented Sep 4, 2022

Copy link
Copy Markdown
Contributor

Stacked on #41296

select:file restricts result rendering to only show file paths. These results can be either:

  1. A match on the literal file path string, or
  2. A file containing a content match, but not necessarily a match on the file path

Regardless, select:file will cause all results to be converted to path matches.

Currently, we always label a result with "Path match" when it is rendered as a path match. This is confusing when a result is a content match that has been converted to a path match, because the label "Path match" implies that the file path itself contains a match, when this is not always the case.

Now that path match ranges are available in the client, we can make the distinction between a "natural" path match and a "converted" path match in the UI in a way that's informative to users.

Before:
Screen Shot 2022-09-04 at 6 20 55 PM

After:
Screen Shot 2022-09-06 at 11 43 08 AM

Test plan

Visually confirm the change in the UI

During construction of zoekt.GlobalTextSearchJob and zoekt.RepoSubsetTextSearchJob, compile the query patterns being sent to zoekt to regexps. These regexps are used during match construction to compute path match ranges. These path match ranges are propagated back up to the client (web app) where they are then highlighted in the UI.
During construction of searcher.TextSearchJob, compile the text pattern into a regexp. When converting searcher matches to result.FileMatches, use that regexp to compute path match ranges, which are then propagated back up to the client (web app) where they are highlighted in the UI.
@cla-bot cla-bot Bot added the cla-signed label Sep 4, 2022
@tbliu98 tbliu98 marked this pull request as ready for review September 4, 2022 23:35
@tbliu98 tbliu98 requested a review from a team September 4, 2022 23:35
@sourcegraph-bot

sourcegraph-bot commented Sep 4, 2022

Copy link
Copy Markdown
Contributor

Codenotify: Notifying subscribers in CODENOTIFY files for diff b026def...2b6a03d.

Notify File(s)
@fkling client/search-ui/src/components/FileMatchChildren.tsx
@limitedmage client/search-ui/src/components/FileMatchChildren.tsx

@camdencheek camdencheek left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it's confusing that we call the results of select:file a "path match", but this change doesn't quite fit my mental model of what's happening here.

A couple of statements:

  1. type:file abc will return both content matches and path matches
  2. type:file abc select:file will convert any content matches into path matches, and pass through any path matches unmodified
  3. Path matches, content matches, and symbol matches are all represented by the FileMatch backend type, with len(fm.ChunkMatches) == 0 && len(fm.Symbols) indicating it's a path match.

(1) is confusing because type:file will return two different result types
(2) is confusing because, even though we use type:file to mean "path matches and content matches", select:file means "select the path from any type of file match"
(3) is confusing because we don't have a dedicated type for path matches, which means we have no way distinguish between path matches and content matches with zero matched ranges. I tried to fix this at one point, but there were many edge cases that I didn't have time to track down. Some more history

So...why does this change feel wrong? Because access_token select:file does actually return a "path match" by this definition. It converts content matches into path matches -- those content matches are no longer content matches. Similarly, if you did select:repo, even if we started with content matches, they become repo matches via selection. All that to say, I think it's inaccurate to say that access_token select:file returns content matches.

@tbliu98

tbliu98 commented Sep 6, 2022

Copy link
Copy Markdown
Contributor Author

@camdencheek So, it wasn't my intent to label these matches as being content matches, because what we're presenting is a path match. That hasn't changed. My intent was to indicate more specifically to the user why certain results are included when select:file is used. I realize now I used "path match" and "content match" incorrectly in my PR description, so I'll update that to be more specific 😅

I think that adding path match highlighting is going to affect how users interpret the label "Path match." If I'm a user, and I see a path match result, I see that 1) the matching part of the file path is highlighted, and 2) it's labeled as a "Path match." So I think, "A path match is where the file path itself matches my query and the matching part will be highlighted." I think the label "Path match" implies that the file path itself contains a match on the query, but this is not always true. (I actually have always felt this way, way before working on the highlighting.)

Now if I see a result like CHANGELOG.md in the results when my query was access_token select:file -- clearly CHANGELOG.md doesn't match any part of my query, yet it's still present as a result and labeled as a "Path match." Now this conflicts with my notion of what a path match is. So I want to make it clear to the user why the file path CHANGELOG.md is showing up for the query access_token select:file.

Would it be better to phrase it as something like "File contains matching content"? Instead of "File content match," which implies that the match itself is a content match.

@tbliu98

tbliu98 commented Sep 6, 2022

Copy link
Copy Markdown
Contributor Author

Updated the displayed message to be more specific

@limitedmage

Copy link
Copy Markdown
Contributor

Updated wording seems clear to me and I think would make using select a lot less confusing. Really like this!

…lect:file` is present

`select:file` restricts result rendering to only show file paths. These results can be either 1) a match on the literal file path string, or 2) a file containing a content match, but not necessarily a path match. Currently, we always display the text "Path match" when a result is rendered as file path-only. This is confusing when the result is a content match but not a path match. This commit makes a distinction between a path match and a content match in the UI for `select:file` results to address this issue.
Display the text "File contains matching content" when a path match is in the result set because it contains a content match (i.e. it was a content match that was converted to a path match in the backend due to `select:file`). This is more specific than the initial label "File content match" which implies that the match itself is a content match, when it is in fact still a path match.
@tbliu98 tbliu98 force-pushed the tl/distinguish-file-content-match-from-path-match branch from 1c3477a to 2b6a03d Compare September 6, 2022 20:38
Base automatically changed from tl/implement-path-match-ranges to main September 6, 2022 21:30
@tbliu98 tbliu98 enabled auto-merge (squash) September 6, 2022 21:30
@tbliu98 tbliu98 merged commit 363ab4a into main Sep 6, 2022
@tbliu98 tbliu98 deleted the tl/distinguish-file-content-match-from-path-match branch September 6, 2022 21:30
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants