index: use a random sample of ngrams when limiting by keegancsmith · Pull Request #797 · sourcegraph/zoekt

keegancsmith · 2024-07-29T09:16:35Z

The first bit of data I am getting back indicates that this strategy of limiting the number of ngrams we look up isn't working. I am still experimenting with different limits, but in the meantime it is easy to implement a strategy that picks a random subset, so that the first N ngrams of a query aren't the only ones consulted.
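To illustrate the idea of consulting a random subset rather than the first N ngrams, here is a minimal sketch. It is not the code from this PR: `ngramOff`, `splitNgrams`, and `sampleNgrams` are hypothetical names, the explicit `seed` parameter exists only to make the example reproducible, and treating a limit of 0 or less as "no limit" is an assumption.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// ngramOff is a hypothetical stand-in for a query ngram and its offset
// in the query string. It is not zoekt's actual type.
type ngramOff struct {
	ngram string
	index int
}

// splitNgrams splits q into overlapping 3-rune ngrams (a simplification
// for illustration).
func splitNgrams(q string) []ngramOff {
	r := []rune(q)
	var out []ngramOff
	for i := 0; i+3 <= len(r); i++ {
		out = append(out, ngramOff{ngram: string(r[i : i+3]), index: i})
	}
	return out
}

// sampleNgrams returns at most limit ngrams chosen uniformly at random
// from all, preserving their original order. A limit <= 0 is treated as
// "no limit" (an assumption of this sketch).
func sampleNgrams(all []ngramOff, limit int, seed int64) []ngramOff {
	if limit <= 0 || len(all) <= limit {
		return all
	}
	rng := rand.New(rand.NewSource(seed))
	// The first limit entries of a random permutation form a uniform
	// sample without replacement.
	picked := rng.Perm(len(all))[:limit]
	sort.Ints(picked)
	out := make([]ngramOff, 0, limit)
	for _, i := range picked {
		out = append(out, all[i])
	}
	return out
}

func main() {
	all := splitNgrams("random sample")
	for _, n := range sampleNgrams(all, 2, 1) {
		fmt.Printf("%d %q\n", n.index, n.ngram)
	}
}
```

Taking the first N ngrams would always probe the prefix of the query; sampling spreads the lookups across the whole query while keeping the same cap on their number.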

Test Plan: ran all tests with the envvar set to 2. I expected tests that assert on stats to fail, but everything else to pass. This was the case.

SRC_EXPERIMENT_ITERATE_NGRAM_LOOKUP_LIMIT=2 go test ./...

Part of https://linear.app/sourcegraph/issue/CODY-3029/investigate-performance-of-guardrails-attribution-endpoint

keegancsmith requested review from a team and eseliger July 29, 2024 09:16

cla-bot Bot added the cla-signed label Jul 29, 2024

eseliger approved these changes Jul 29, 2024

keegancsmith merged commit ebb3ca2 into main Jul 29, 2024

keegancsmith deleted the k/random-sample branch July 29, 2024 12:53

keegancsmith merged 1 commit into main from k/random-sample
