Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

searcher: Modernize entrypoint and gRPC server#63700

Merged
eseliger merged 1 commit into
mainfrom
es/07-09-searchermodernizeentrypointandgrpcserver
Jul 9, 2024
Merged

searcher: Modernize entrypoint and gRPC server#63700
eseliger merged 1 commit into
mainfrom
es/07-09-searchermodernizeentrypointandgrpcserver

Conversation

@eseliger

@eseliger eseliger commented Jul 8, 2024

Copy link
Copy Markdown
Member

This PR does a bit of cleanup / alignment with other services, most notably:

  • Loads config via the config method of the shared routine package plus tests for default config values
  • Listens on HTTP using the existing httpserver package
  • Makes it clear what gitserver methods are used by calling them at the entry function instead of passing the whole client down
  • Adds exhaustive request logging for better auditability (and found an unused field which I removed)
  • Rename log to logger
  • Add the common stack of HTTP middlewares for actor, client etc propagation

Test plan:

Unit tests still pass. CI is green and I was able to run unindexed searches locally.

@cla-bot cla-bot Bot added the cla-signed label Jul 8, 2024

eseliger commented Jul 8, 2024

Copy link
Copy Markdown
Member Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @eseliger and the rest of your teammates on Graphite Graphite

@github-actions github-actions Bot added team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all labels Jul 8, 2024
@eseliger eseliger force-pushed the es/07-09-searchermodernizeentrypointandgrpcserver branch 2 times, most recently from d89b72a to 116ccf3 Compare July 8, 2024 22:42
@eseliger eseliger marked this pull request as ready for review July 8, 2024 22:44
@eseliger eseliger requested review from a team and camdencheek July 8, 2024 22:45
This PR does a bit of cleanup / alignment with other services, most notably:

- Loads config via the config method of the shared routine package plus tests for default config values
- Listens on HTTP using the existing httpserver package
- Makes it clear what gitserver methods are used by calling them at the entry function instead of passing the whole client down
- Adds exhaustive request logging for better auditability (and found an unused field which I removed)
- Rename log to logger
- Add the common stack of HTTP middlewares for actor, client etc propagation

Test plan:

Unit tests still pass. CI is green and I was able to run unindexed searches locally.
@eseliger eseliger force-pushed the es/07-09-searchermodernizeentrypointandgrpcserver branch from 116ccf3 to 84b2249 Compare July 8, 2024 23:32

@camdencheek camdencheek left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

base proto.SearcherServiceServer
logger log.Logger

proto.UnsafeSearcherServiceServer // Consciously opt out of forward compatibility checks to ensure that the go-compiler will catch any breaking changes.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so rather than just responding with an "unimplemented", since this doesn't implement the required interface, it will just fail to compile?

Q: why is it not recommended to do this even for non-wrapper servers? Seems like if we add an RPC to the proto, it would be more useful to fail to compile since these are "canonical" services and we aren't implementing them with multiple backends.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have honestly no idea - I copied this 1:1 from the implementation @ggilmore wrote for gitserver exhaustive logging 😬

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, got it, thanks 😄

Comment on lines +17 to +27
type Config struct {
env.BaseConfig

ListenAddress string
CacheDir string
CacheSizeMB int
BackgroundTimeout time.Duration
MaxTotalGitArchivePathsLength int
DisableHybridSearch bool
ExhaustiveRequestLoggingEnabled bool
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 😎

Comment on lines -23 to -25
// URL specifies the repository's Git remote URL (for gitserver). It is optional. See
// (gitserver.ExecRequest).URL for documentation on what it is used for.
URL string

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice. I didn't realize this was no longer used. That feels good!

Comment on lines +20 to +21
reserved 6;
reserved "url";

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my understanding: we reserve both the number and the name because the number is used in the binary encoding and the name is used in the canonical JSON encoding?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is my understanding - yes!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know! Should we have done this for other old fields too (I see reserved: <number> a lot of places already)?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do decide to at some point expose the APIs via some mechanism like connectrpc that MSP services use to expose gRPC to the internet then this would definitely help - but it's only a real problem when some old service might still reference this field, which inside sourcegraph is usually not the case since all services are kept in lockstep

@eseliger eseliger merged commit fab1281 into main Jul 9, 2024
@eseliger eseliger deleted the es/07-09-searchermodernizeentrypointandgrpcserver branch July 9, 2024 19:10

@jtibshirani jtibshirani left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clean-up with URL :)

Comment on lines +20 to +21
reserved 6;
reserved "url";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know! Should we have done this for other old fields too (I see reserved: <number> a lot of places already)?


filter.CommitIgnore = func(hdr *tar.Header) bool { return false } // default: don't filter
if s.FilterTar != nil {
filter.CommitIgnore, err = s.FilterTar(ctx, s.GitserverClient, repo, commit)
if err != nil {
return nil, errors.Errorf("error while calling FilterTar: %w", err)
}
filter.CommitIgnore, err = s.FilterTar(ctx, repo, commit)
if err != nil {
r.Close()
return nil, errors.Errorf("error while calling FilterTar: %w", err)
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eseliger do you remember why you made this a required field?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see a test that doesn't set it 🤔 So I thought it'd be safe to always call and not have a case here where at runtime we forget to set it and then don't filter by accident. If that breaks anything totally fine to revert

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It broke searcher benchmarks, but no worries fixed forward.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed team/product-platform team/source Tickets under the purview of Source - the one Source to graph it all

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants