Search-as-a-service
**TL;DR** GitLab should provide a single search API that allows developers and the wider community to create delightful search experiences for our users. The Global Search team should be the owner of this API.
## Introduction
Search is a critical user workflow and [finding anything should be simple, fast and intuitive](https://about.gitlab.com/direction/global-search/#overview). GitLab supports [basic search](https://docs.gitlab.com/ee/user/search/), which enables searching various types of data (issues, merge requests) and also allows users to filter search results. Basic search uses PostgreSQL. [Advanced Search](https://docs.gitlab.com/ee/user/search/advanced_search.html) expands on this functionality and is based on Elasticsearch. Basic search is available in the Core tier, whereas Advanced Search is a Premium feature.
Search functionality across the product is not implemented using a single API. Instead, there are at [least three different APIs available](https://docs.gitlab.com/ee/api/search.html) and there is no clearly defined service boundary for search. This means that groups across GitLab need to implement new basic search functionality by themselves, while only the Global Search team has the experience and domain expertise to improve Advanced Search. This can result in duplication of effort, different implementations for similar search functions and suboptimal results for our users.
To solve this problem, this epic proposes creating a single search API that supports multiple search backends and provides a clear service boundary between Search and the rest of GitLab. The search API should be fully owned by the Global Search team, with all GitLab developers and the wider community being the customers of this API.
## Problem to solve
Search is a critical user workflow and it is hard. Creating a delightful search experience requires domain expertise both for the design of frontend components and for the implementation of search strategies in the backend. Today, developers need to be familiar with Elasticsearch or PostgreSQL-based search to create a great search experience. This is a problem because many developers lack those skills, and up-skilling all developers in those areas is expensive and inefficient. For example, the project management group is investigating using [PostgreSQL full-text search for complex sorts](https://gitlab.com/groups/gitlab-org/-/epics/4968) to improve performance. This is a complex task that will require engineers in that area to learn about this technology rather than focusing on driving more direct customer value.
The Global Search team, on the other hand, has the domain expertise and experience with Elasticsearch, which underpins Advanced Search. The absence of a central API makes it hard for others to adopt Advanced Search capabilities, and because Advanced Search is a Premium feature, not all users benefit from improvements to central workflows such as issue search.
In summary, there are two distinct problems to solve:
1. Unclear ownership for a number of search features
1. Exposure of implementation details and a complex sub-system to developers and other contributors
These make it harder to build great search experiences across the product.
## Proposal
This epic proposes the following changes:
### Create a clear service boundary (e.g. an API)
We should create a clear boundary between search internals and the rest of GitLab. The service boundary should make it easy for GitLab developers to create great search experiences without having to understand all of the details of the backend technologies being used. Groups across GitLab should be able to easily implement new search and filtering functionality in a consistent and predictable way.
A possible way to accomplish this is by providing a unified search API, similar to https://gitlab.com/gitlab-org/gitlab/-/issues/347085. The [Geo self-service framework](https://docs.gitlab.com/ee/development/geo/framework.html) is a different approach that doesn't rely on a separate API but provides clear instructions for developers on how to integrate with Geo.
I am not proposing any specific implementation or architecture for this. This is something we should align on, and as the discussion in https://gitlab.com/groups/gitlab-org/-/epics/7660#note_863227543 shows, it is controversial.
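To make the idea of a service boundary more concrete without committing to any design, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the names `SearchService`, `SearchBackend`, `InMemoryBackend` and `SearchResult` are assumptions, not existing GitLab code, and the in-memory backend merely stands in for PostgreSQL or Elasticsearch. The point is only that callers talk to one entry point and never need to know which backend answers the query.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol


@dataclass(frozen=True)
class SearchResult:
    entity_type: str  # e.g. "issue" or "merge_request"
    title: str


class SearchBackend(Protocol):
    """Anything that can answer a query. Callers never depend on a concrete backend."""

    def search(self, query: str, scope: str) -> List[SearchResult]: ...


class InMemoryBackend:
    """Hypothetical stand-in for a real backend (PostgreSQL or Elasticsearch)."""

    def __init__(self, name: str, documents: List[SearchResult]) -> None:
        self.name = name
        self._documents = documents

    def search(self, query: str, scope: str) -> List[SearchResult]:
        # Case-insensitive substring match, restricted to the requested scope.
        needle = query.lower()
        return [
            doc
            for doc in self._documents
            if doc.entity_type == scope and needle in doc.title.lower()
        ]


class SearchService:
    """Hypothetical single entry point: this class is the service boundary."""

    def __init__(
        self, basic: SearchBackend, advanced: Optional[SearchBackend] = None
    ) -> None:
        self._basic = basic
        self._advanced = advanced

    def search(self, query: str, scope: str) -> List[SearchResult]:
        # Use the advanced backend when one is configured, otherwise fall back to basic.
        backend = self._advanced if self._advanced is not None else self._basic
        return backend.search(query, scope)


# Example usage with made-up data.
documents = [
    SearchResult("issue", "Fix login bug"),
    SearchResult("merge_request", "Fix login flow"),
]
service = SearchService(basic=InMemoryBackend("basic", documents))
results = service.search("login", scope="issue")
```

In this sketch, a group adding a new search box would only ever call `service.search(...)`; swapping or adding a backend would not change the calling code.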
### Make Global Search the owner of all Search
The Global Search team should own all search experiences, and its mission should be to empower developers to create great search experiences by using the service it provides. This structure is similar to how [Gitaly](https://docs.gitlab.com/ee/administration/gitaly/) works today. This could be done via an API.
The advantages of this approach are:
* It creates a clearly defined organizational structure and establishes clear collaboration and interaction rules
* It provides ownership. This reduces ambiguity and establishes DRIs
* With ownership comes power and clarity for the business on where to invest to create better search.
* It abstracts away the details of search for most developers
* With a single API it becomes simpler to stratify search across tiers. Rather than using performance, we can position features in tiers using our buyer-based tiering model
* A single API means we can create new features (e.g. code search or federated search) and expose them without developers needing to understand all implementation details
* SLI/SLO targets are much easier to manage when managed by a team that is solely responsible.
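
The tiering point above can be illustrated with a small sketch. This is not a proposed implementation: the feature names and their tier assignments in `REQUIRED_TIER` are invented for the example, and only the Core and Premium tiers mentioned in this epic are modeled. It shows how a single API could decide centrally which search features are exposed for a given tier.

```python
from enum import IntEnum
from typing import Dict, List


class Tier(IntEnum):
    # Only the two tiers named in this epic; ordering encodes "includes lower tiers".
    CORE = 0
    PREMIUM = 1


# Hypothetical assignment of search features to tiers, for illustration only.
REQUIRED_TIER: Dict[str, Tier] = {
    "issue_search": Tier.CORE,
    "code_search": Tier.PREMIUM,
}


def is_available(feature: str, tier: Tier) -> bool:
    """Return True if the feature is exposed by the API for the given tier."""
    required = REQUIRED_TIER.get(feature)
    if required is None:
        return False  # unknown features are never exposed
    return tier >= required


def available_features(tier: Tier) -> List[str]:
    """List every feature the API exposes for the given tier, sorted by name."""
    return sorted(name for name in REQUIRED_TIER if is_available(name, tier))
```

Because the check lives in one place, moving a feature between tiers would mean changing one table entry rather than touching every caller.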
#### What should be owned?
1. Unscoped and scoped searching of entities
1. Filtering entities
1. Listing
1. Code search
1. Code navigation
### Allow other teams to contribute
To avoid creating a single bottleneck, the Search service should be extensible and allow other teams to contribute new entities by themselves. For example, if a new entity (e.g. vulnerabilities) is added, it should be simple for developers to make it searchable. The [Geo self-service framework](https://docs.gitlab.com/ee/development/geo/framework.html) is one example of how to accomplish this.
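
As one possible shape for such self-service extension, the sketch below uses a simple registry in Python. It is an assumption-laden illustration, not the Geo framework's mechanism and not existing GitLab code: `SearchRegistry`, its `register` decorator, and the sample vulnerability data are all hypothetical. The idea is that a team owning a new entity registers a provider for it, and the search service can then serve that entity without the Global Search team writing the integration.

```python
from typing import Callable, Dict, List

# A provider takes a query string and returns the titles of matching records.
Provider = Callable[[str], List[str]]


class SearchRegistry:
    """Hypothetical registry that lets any team make a new entity searchable."""

    def __init__(self) -> None:
        self._providers: Dict[str, Provider] = {}

    def register(self, entity_type: str) -> Callable[[Provider], Provider]:
        """Decorator that registers a provider function for an entity type."""

        def decorator(provider: Provider) -> Provider:
            if entity_type in self._providers:
                raise ValueError(f"{entity_type!r} is already registered")
            self._providers[entity_type] = provider
            return provider

        return decorator

    def searchable_entities(self) -> List[str]:
        return sorted(self._providers)

    def search(self, entity_type: str, query: str) -> List[str]:
        if entity_type not in self._providers:
            raise KeyError(f"no search provider registered for {entity_type!r}")
        return self._providers[entity_type](query)


registry = SearchRegistry()

# Made-up sample data owned by the contributing team.
VULNERABILITIES = ["SQL injection in login form", "Outdated OpenSSL dependency"]


@registry.register("vulnerability")
def search_vulnerabilities(query: str) -> List[str]:
    # Case-insensitive substring match over the sample data.
    needle = query.lower()
    return [title for title in VULNERABILITIES if needle in title.lower()]
```

With this in place, `registry.search("vulnerability", "openssl")` would route to the contributing team's provider, while ownership of the registry and its contract stays with one team.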
## Intended users
* Software developers (GitLab and wider community)
* API users
## Further details
* We have at least three search APIs: https://docs.gitlab.com/ee/api/search.html
* We don't have a matrix that outlines feature parity between Advanced Search and Basic Search
## Documentation
The API needs to be clearly documented and we also should make it simple for others to contribute and extend the API. We likely need some kind of framework.
## What does success look like, and how can we measure that?
* All existing search functionalities in GitLab use the single search API
* New features use the single API by default
* Clear stratification of search by tier based on available API features
## Links / references
* https://teamtopologies.com/key-concepts