This is a meta issue to track the tasks needed to move the state of a search to the coordinating node.
Today the initial phase of any search creates a `SearchContext` on each node that holds a shard selected for the request. This `SearchContext` then serves as the per-shard state for the subsequent phases. This issue proposes replacing the `SearchContext` on each shard with a `ReaderContext` that keeps track of the index reader to use for the entire lifecycle of a search request, and moving all the state of the search to the coordinating node.
To achieve this, we need to re-create the search context for each phase based on the results of the previous phase. The state of the previous phase can be returned in that phase's result and added to the request of the next phase, so that the search state can be rebuilt.
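To make the round trip concrete, here is a minimal sketch of the idea, not the actual implementation. `ShardRequest`, `QueryResult` and `DataNode` are hypothetical, heavily simplified stand-ins rather than the real Elasticsearch classes. The only assumption is that the query result echoes the original request back to the coordinating node, which re-sends it with the fetch phase:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins: none of these are the real Elasticsearch classes.
final class PhaseStateSketch {

    /** Original shard-level request built by the coordinating node. */
    static final class ShardRequest {
        final String term;
        final int size;
        ShardRequest(String term, int size) { this.term = term; this.size = size; }
    }

    /** Query-phase result: echoes the request so the coordinator holds the state. */
    static final class QueryResult {
        final long readerId;
        final ShardRequest request;
        final List<Integer> docIds;
        QueryResult(long readerId, ShardRequest request, List<Integer> docIds) {
            this.readerId = readerId; this.request = request; this.docIds = docIds;
        }
    }

    /** Data node: between phases it only remembers which reader (snapshot) to use. */
    static final class DataNode {
        private final Map<Long, List<String>> readers = new HashMap<>();
        private long nextReaderId = 1;

        /** Query phase: pin a reader and return the matching doc ids. */
        QueryResult query(ShardRequest request, List<String> docs) {
            long readerId = nextReaderId++;
            readers.put(readerId, new ArrayList<>(docs)); // pin a point-in-time copy
            List<Integer> hits = new ArrayList<>();
            for (int i = 0; i < docs.size() && hits.size() < request.size; i++) {
                if (docs.get(i).contains(request.term)) hits.add(i);
            }
            return new QueryResult(readerId, request, hits);
        }

        /** Fetch phase: everything except the reader is rebuilt from the echoed request. */
        List<String> fetch(long readerId, ShardRequest request, List<Integer> docIds) {
            List<String> reader = readers.remove(readerId); // release after the last phase
            if (reader == null) throw new IllegalStateException("no reader " + readerId);
            List<String> out = new ArrayList<>();
            for (int id : docIds) {
                out.add(reader.get(id).replace(request.term, "<em>" + request.term + "</em>"));
            }
            return out;
        }
    }
}
```

In this sketch the data node never stores the request: the fetch phase gets the search term back from the coordinating node and only looks up the pinned reader by id.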
Here is a list (hopefully exhaustive) of the tasks that need to be done to achieve this:
- Replace the `SearchContext` with `QueryShardContext` when building aggregator factories (#46527)
- Replace the `SearchContext` with `QueryShardContext` when building collapse context (#46543)
- […] `SubSearchContext` and add a way to clone a `SearchContext`
- `SearchContext` that can be fully built from a `ShardSearchRequest`
- Add the `ShardSearchRequest` to `QuerySearchResult` and `ShardFetchRequest` (+bwc).
- Re-create the `SearchContext` on each phase in the `SearchService` and register a simple `ReaderContext` in the initial phase that can be used to build the `SearchContext` in the subsequent phase. The bwc layer must handle nodes on previous versions and scrolls, so a special context could be used to reference old-style requests.
- `ReaderContext` that can be used as a point in time reader for multiple search requests.

There are plenty of follow-ups that we could do once the state of the request has moved to the coordinating node. For instance, we could create a single reader context per directory reader, based on sequence numbers. When a replica fails, we could check those sequence numbers and move the search to a different node if another replica is at the same checkpoint (always the case for read-only indices/frozen indices).
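The failover follow-up could look roughly like the sketch below. It assumes, for illustration only, that a reader context records a single checkpoint value and that two shard copies at the same checkpoint can serve the same search. `ShardCopy` and `findFailoverTarget` are hypothetical names, not existing Elasticsearch APIs:

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch: ShardCopy and findFailoverTarget are illustrative names only.
final class CheckpointFailoverSketch {

    /** One copy of a shard and the checkpoint its current reader corresponds to. */
    static final class ShardCopy {
        final String nodeId;
        final long checkpoint;
        ShardCopy(String nodeId, long checkpoint) { this.nodeId = nodeId; this.checkpoint = checkpoint; }
    }

    /**
     * Returns another copy at the same checkpoint as the reader context that was
     * lost on the failed node, or empty if the search cannot be moved.
     */
    static Optional<ShardCopy> findFailoverTarget(String failedNodeId, long readerCheckpoint,
                                                  List<ShardCopy> copies) {
        return copies.stream()
            .filter(copy -> !copy.nodeId.equals(failedNodeId))   // skip the failed node
            .filter(copy -> copy.checkpoint == readerCheckpoint) // assumed to mean an equivalent reader
            .findFirst();
    }
}
```

For read-only or frozen indices every copy would sit at the same checkpoint, so under these assumptions a target is always found as long as another copy exists.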
Closes #26472