Search: add regex support to repo:has.meta() #63891
Created a new GIN index on repo KVPs, which makes regex lookups efficient. Tested against sourcegraph.com with 3 million KVPs added for testing: a point query dropped from 1.5 seconds (a surprisingly fast full table scan) to 100ms (using the GIN index).
How long did adding the index take? Any concerns about this running in the migrator for larger customers?
I didn't measure closely, but it was ~30s. That's for 3 million repos with one KVP each ("camdentest":<repo_name>), so long posting lists for the key and many posting lists for the value. I think that's fast enough that we don't need to be too worried.
Unrelated, but I wanted some way to type a regex string. Too many times I've gotten confused about whether a pattern is a string literal or a regex string.
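To illustrate the idea of typing a regex string, here is a minimal Go sketch. The names RegexpPattern, Compile, and FromLiteral are hypothetical and are not taken from this PR's diff; the point is only that a distinct named type lets the compiler flag places where a plain string literal and a regex pattern would otherwise be mixed up.

```go
package main

import (
	"fmt"
	"regexp"
)

// RegexpPattern is a hypothetical named type for a string that is meant to be
// interpreted as a regular expression rather than matched literally.
type RegexpPattern string

// Compile turns the pattern into a compiled regexp.
func (p RegexpPattern) Compile() (*regexp.Regexp, error) {
	return regexp.Compile(string(p))
}

// FromLiteral escapes a literal string so it can be used where a
// RegexpPattern is required. The conversion is explicit at the call site.
func FromLiteral(s string) RegexpPattern {
	return RegexpPattern(regexp.QuoteMeta(s))
}

func main() {
	re, err := FromLiteral("a.b").Compile()
	if err != nil {
		panic(err)
	}
	// The dot was escaped, so only the literal text "a.b" matches.
	fmt.Println(re.MatchString("a.b"), re.MatchString("axb"))
}
```

A function that accepts RegexpPattern cannot be called with a string variable without an explicit conversion, which is where the confusion described above would surface at compile time.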
Unrelated cleanup because every time I touch this code I get confused. This just flattens the access tree so a predicate is just a named method on a filter type.
Unnecessary after flattening the access tree.
keegancsmith left a comment:
backend code LGTM. nice stuff.
This adds support for searching repo metadata with a regex pattern.
Background: repo metadata is a useful feature for shoehorning business-specific information into the search query language. It allows tagging repos with arbitrary metadata (think ownership info, quality info, 3rd-party system IDs, etc.). This ends up being a useful escape hatch to shim in functionality that is not natively supported in Sourcegraph.
However, it's currently limited to searching with an exact key/value pair. We've had a few requests to extend this to allow searching by pattern, because that enables ingesting semi-structured data and making it searchable.
This adds the ability to use a /.../-delimited regex pattern to match against both keys and values. For example, repo:has.meta(team:/^my\/org/)
Fixes SRCH-731
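As a rough illustration of the /.../ syntax described above, the following Go sketch splits a has.meta() argument into key and value and treats either side as a regex when it is slash-delimited. All names here (metaPattern, parsePattern, splitKeyValue) are hypothetical, the in-memory matching stands in for whatever the backend actually does against the database, and none of it is taken from the PR's implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// metaPattern is a hypothetical representation of one side of key:value.
// Regex is non-nil when the input was written as /.../.
type metaPattern struct {
	Literal string
	Regex   *regexp.Regexp
}

// parsePattern treats /.../ as a regex (with \/ unescaped to /) and
// anything else as an exact-match literal.
func parsePattern(s string) (metaPattern, error) {
	if len(s) >= 2 && strings.HasPrefix(s, "/") && strings.HasSuffix(s, "/") {
		body := strings.ReplaceAll(s[1:len(s)-1], `\/`, `/`)
		re, err := regexp.Compile(body)
		if err != nil {
			return metaPattern{}, err
		}
		return metaPattern{Regex: re}, nil
	}
	return metaPattern{Literal: s}, nil
}

// Match reports whether s satisfies the pattern.
func (p metaPattern) Match(s string) bool {
	if p.Regex != nil {
		return p.Regex.MatchString(s)
	}
	return p.Literal == s
}

// splitKeyValue splits "key:value" at the first colon that is not inside
// a leading /.../ key pattern.
func splitKeyValue(arg string) (key, value string, hasValue bool) {
	end := 0
	if strings.HasPrefix(arg, "/") {
		for i := 1; i < len(arg); i++ {
			if arg[i] == '\\' {
				i++ // skip the escaped character
				continue
			}
			if arg[i] == '/' {
				end = i + 1 // first byte after the closing slash
				break
			}
		}
	}
	idx := strings.IndexByte(arg[end:], ':')
	if idx < 0 {
		return arg, "", false
	}
	idx += end
	return arg[:idx], arg[idx+1:], true
}

func main() {
	key, value, _ := splitKeyValue(`team:/^my\/org/`)
	kp, err := parsePattern(key)
	if err != nil {
		panic(err)
	}
	vp, err := parsePattern(value)
	if err != nil {
		panic(err)
	}
	fmt.Println(kp.Match("team"), vp.Match("my/org-platform"), vp.Match("other/my/org"))
}
```

In this sketch the key "team" is matched literally, while the value pattern is anchored, so it matches values that begin with my/org but not values that merely contain it.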
Test plan
Changelog
repo:has.meta() predicate now supports regex patterns for keys and values