Add XContent chunking to SearchResponse by romseygeek · Pull Request #94736 · elastic/elasticsearch

romseygeek · 2023-03-24T14:44:12Z

This commit adds xcontent chunking to SearchResponse and MultiSearchResponse
by making SearchHits implement ChunkedToXContent.

Relates to #89838

elasticsearchmachine · 2023-03-24T14:44:38Z

Pinging @elastic/es-search (Team:Search)

romseygeek · 2023-03-24T14:45:30Z

server/src/main/java/org/elasticsearch/action/search/MultiSearchResponse.java

+        return Iterators.concat(
+            ChunkedToXContentHelper.startObject(),
+            Iterators.single((b, p) -> b.field("took", tookInMillis).startArray(Fields.RESPONSES)),
+            Iterators.flatMap(Arrays.stream(items).iterator(), item -> item.toXContentChunked(params)),


There might be a nicer way to include an array of objects that themselves implement ChunkedToXContent, but I couldn't find one.

This is fine IMO but there's no need for the intermediate stream:

Suggested change

-            Iterators.flatMap(Arrays.stream(items).iterator(), item -> item.toXContentChunked(params)),
+            Iterators.flatMap(Iterators.forArray(items), item -> item.toXContentChunked(params)),
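
To illustrate what the flat-mapping in this thread does, here is a minimal, self-contained sketch in plain Java. It does not use the Elasticsearch `Iterators` helper: `Chunked`, `flatMap`, `render`, and `demo` are hypothetical stand-ins, and `Arrays.asList(items).iterator()` plays the role of walking the array without an intermediate stream.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

class ChunkFlatMapSketch {

    /** Hypothetical stand-in for a chunked object: it yields its output as string fragments. */
    interface Chunked {
        Iterator<String> toChunks();
    }

    /** Lazily concatenates the iterators produced by applying fn to each input element. */
    static <T, U> Iterator<U> flatMap(Iterator<T> input, Function<T, Iterator<U>> fn) {
        return new Iterator<U>() {
            private Iterator<U> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                // Advance to the next non-empty inner iterator, if there is one.
                while (!current.hasNext() && input.hasNext()) {
                    current = fn.apply(input.next());
                }
                return current.hasNext();
            }

            @Override
            public U next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    /** Renders an array of chunked items as one array-like string, one fragment at a time. */
    static String render(Chunked[] items) {
        StringBuilder out = new StringBuilder("[");
        Iterator<String> chunks = flatMap(Arrays.asList(items).iterator(), Chunked::toChunks);
        while (chunks.hasNext()) {
            out.append(chunks.next());
        }
        return out.append("]").toString();
    }

    static String demo() {
        Chunked a = () -> List.of("{a1}", "{a2}").iterator();
        Chunked b = () -> List.of("{b1}").iterator();
        return render(new Chunked[] { a, b });
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [{a1}{a2}{b1}]
    }
}
```

Each inner iterator is only requested when the previous one is exhausted, so no item is serialized before its turn.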

romseygeek · 2023-03-24T14:46:03Z

server/src/main/java/org/elasticsearch/action/search/SearchResponse.java

+        return Iterators.concat(
+            Iterators.single((ToXContent) SearchResponse.this::headerToXContent),
+            Iterators.single(clusters),
+            Iterators.flatMap(Iterators.single(internalResponse), r -> r.toXContentChunked(params))


Similarly I feel there ought to be a nicer way of doing this but I couldn't find one...

I think you're looking for this:

Suggested change

-            Iterators.flatMap(Iterators.single(internalResponse), r -> r.toXContentChunked(params))
+            internalResponse.toXContentChunked(params)

romseygeek · 2023-03-24T16:33:52Z

This is an interesting test failure. There's a composite aggs test that checks that we get a specific exception if keys are being formatted in a lossy fashion, and this exception is thrown during xcontent parsing. But with chunking we don't seem to catch exceptions thrown by parsing code, and so we get a 'failure encoding chunk' message instead. Is this something we've already encountered elsewhere or do I need to add some extra error handling here?

romseygeek · 2023-03-24T16:50:35Z

For a standard RestToXContentListener errors are handled by the onFailure() method, but we never get there when using the chunked handler.

DaveCTurner · 2023-03-24T17:07:20Z

This is a drawback of the chunked encoding: we have to be sure that the serialization will succeed before we start. Once we start sending chunks to the client we can't change our minds about the response code and return an error.
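
The constraint described here can be sketched in plain Java. This is an illustration under stated assumptions, not Elasticsearch code: `FakeChannel`, `stream`, and `demo` are hypothetical names, and the channel simply models a response whose status is fixed once the first chunk has been written.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

class LateFailureSketch {

    /** Hypothetical response channel: the status is committed when the first chunk is sent. */
    static final class FakeChannel {
        final List<String> written = new ArrayList<>();
        int status = 0; // 0 means nothing has been sent yet

        void sendChunk(String chunk) {
            if (status == 0) {
                status = 200; // the status line goes out together with the first chunk
            }
            written.add(chunk);
        }

        void sendError(int code) {
            if (status != 0) {
                throw new IllegalStateException("status already sent: " + status);
            }
            status = code;
        }
    }

    /** Streams lazily produced chunks; a failure after the first chunk can no longer become an error response. */
    static String stream(FakeChannel channel, Iterator<Supplier<String>> chunks) {
        try {
            while (chunks.hasNext()) {
                channel.sendChunk(chunks.next().get());
            }
            return "ok";
        } catch (RuntimeException e) {
            try {
                channel.sendError(400);
                return "error response sent";
            } catch (IllegalStateException tooLate) {
                return "too late: " + tooLate.getMessage();
            }
        }
    }

    static String demo() {
        List<Supplier<String>> chunks = new ArrayList<>();
        chunks.add(() -> "{\"hits\":[");
        chunks.add(() -> {
            // Stands in for a check that only fails while the body is being rendered.
            throw new IllegalArgumentException("lossy key format");
        });
        FakeChannel channel = new FakeChannel();
        return stream(channel, chunks.iterator()) + " / status=" + channel.status;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints too late: status already sent: 200 / status=200
    }
}
```

If the failing check ran before the first `sendChunk` call, `sendError` would still succeed, which is why checks that can fail need to happen before streaming starts.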

romseygeek · 2023-03-24T17:10:32Z

It looks as though we have quite a few places (mainly in aggs) that have this late checking. I can work through the tests and try and fix them, but it blocks making any changes here for the moment, unfortunately.

romseygeek · 2023-03-27T08:31:23Z

I've opened #94760 to discuss moving key format checks earlier in the aggregation process.

...es/lang-mustache/src/main/java/org/elasticsearch/script/mustache/SearchTemplateResponse.java

server/src/main/java/org/elasticsearch/action/search/MultiSearchResponse.java

server/src/main/java/org/elasticsearch/action/search/SearchResponse.java

cbuescher · 2023-03-27T09:32:47Z

server/src/main/java/org/elasticsearch/action/search/SearchResponseSections.java

+        return Iterators.concat(
+            Iterators.flatMap(Iterators.single(hits), r -> r.toXContentChunked(params)),
+            Iterators.single((ToXContent) (b, p) -> {
+                if (aggregations != null) {


I see why moving these checks inside the Iterator helps with enumerating everything in one concat call, but this way we create iterators that we already know will be no-ops. I don't know if it's worth pulling these out; readability might suffer then. I'll leave it up to you to decide.

Yeah I went back and forth on this a bit, but I think this is the most readable way of doing it, and creating no-op iterators is pretty low cost.

cbuescher · 2023-03-27T09:43:34Z

server/src/main/java/org/elasticsearch/search/aggregations/metrics/InternalTopHits.java

    @Override
    public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
-        searchHits.toXContent(builder, params);
+        ChunkedToXContent.wrapAsToXContent(searchHits).toXContent(builder, params);


Question out of curiosity: I thought the final goal would be to convert everything in search to a "chunked" xcontent variant. It looks as though we aren't doing that for aggregations here. Is this reading correct? If so, is this future work, or does it mean aggregations remain larger "chunks"?

Yes, the plan is to do this bit-by-bit. So we start off by chunking search into search hits and aggregations, and we can then look at breaking these chunks into smaller ones in the future.
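
A rough sketch of the bridging idea, in plain Java rather than the Elasticsearch API: a chunked object is adapted back into a single writer by draining its chunk iterator in one go, so callers that are not yet chunked keep working. `Fragment`, `Chunked`, `wrapAsFragment`, and `demo` are hypothetical names, and the behaviour shown is an assumption about what a wrapper such as `wrapAsToXContent` needs to do, not a copy of it.

```java
import java.util.Iterator;
import java.util.List;

class WrapChunkedSketch {

    /** Hypothetical monolithic writer: emits its whole body in a single call. */
    interface Fragment {
        void writeTo(StringBuilder out);
    }

    /** Hypothetical chunked writer: emits its body as a sequence of small fragments. */
    interface Chunked {
        Iterator<Fragment> toChunks();
    }

    /** Adapts a chunked writer to the monolithic interface by writing every chunk in order. */
    static Fragment wrapAsFragment(Chunked chunked) {
        return out -> {
            Iterator<Fragment> chunks = chunked.toChunks();
            while (chunks.hasNext()) {
                chunks.next().writeTo(out);
            }
        };
    }

    static String demo() {
        // The hits are chunked, but the enclosing writer below is still monolithic.
        Chunked hits = () -> List.<Fragment>of(
            sb -> sb.append("\"hits\":["),
            sb -> sb.append("1,2"),
            sb -> sb.append("]")
        ).iterator();

        StringBuilder out = new StringBuilder("{");
        wrapAsFragment(hits).writeTo(out);
        return out.append("}").toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints {"hits":[1,2]}
    }
}
```

The wrapped form gives up the benefit of chunking for that caller, which fits the incremental plan described in the reply.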

cbuescher · 2023-03-27T09:50:38Z

...lugin/core/src/main/java/org/elasticsearch/xpack/core/search/action/AsyncSearchResponse.java

        if (searchResponse != null) {
            builder.field("response");
-            searchResponse.toXContent(builder, params);
+            ChunkedToXContent.wrapAsToXContent(searchResponse).toXContent(builder, params);


Just want to verify if AsyncSearchResponse will be changed over to ChunkedToXContentObject as well at some point and if we already have plans for that?

I've created #95661 to keep track of these.

cbuescher · 2023-03-27T10:00:25Z

@romseygeek thanks, change looks great in general to me, I left a couple of questions

cbuescher · 2023-04-01T20:11:50Z

@romseygeek FYI, I'm out for a couple of days. The questions I left aren't strong asks for changes, so they shouldn't block you from continuing this or others from approving and getting this PR merged.

romseygeek · 2023-04-03T13:58:46Z

@elasticmachine run elasticsearch-ci/part-1

romseygeek · 2023-04-03T14:56:26Z

@elasticmachine run elasticsearch-ci/part-1

romseygeek · 2023-04-03T14:56:43Z

@elasticmachine run elasticsearch-ci/part-2

romseygeek · 2023-05-02T14:02:01Z

@elasticmachine update branch

romseygeek · 2023-05-02T14:55:32Z

This is ready for another round of reviews

romseygeek · 2023-05-12T09:17:00Z

@elasticmachine update branch

cbuescher

Thanks for the update, I left one question regarding how much extra work we expect for the xContent validation happening for the search hits in the FetchPhase.

cbuescher · 2023-05-12T10:05:56Z

server/src/main/java/org/elasticsearch/search/fetch/FetchPhase.java

        SearchHits hits = null;
-        try {
-            hits = buildSearchHits(context, profiler);
+        try (XContentBuilder xContentValidator = new XContentBuilder(XContentType.JSON.xContent(), OutputStream.nullOutputStream())) {


Question for better understanding: this validator builder is used to completely serialize the search hits once on the node before sending them across the wire and then thrown away? Is there an understanding of the extra work this involves? I see why validation is necessary for chunking but I want to make sure I understand the tradeoffs here.

Actually I think #95673 means that I can remove this bit, good catch!
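
For readers following the trade-off raised in this thread, the dry-run idea can be sketched in plain Java as follows. This is an illustration only, not the PR's code (the validation was removed, per the reply above): `Writable`, `serializesCleanly`, and `demo` are hypothetical names, and the only real API used is the JDK's `OutputStream.nullOutputStream()`.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class DryRunValidationSketch {

    /** Hypothetical serializable value: writes itself to a stream and may fail while doing so. */
    interface Writable {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Serializes the value once into a sink that discards every byte.
     * The full serialization cost is paid, and the output is thrown away.
     */
    static boolean serializesCleanly(Writable value) {
        try (OutputStream sink = OutputStream.nullOutputStream()) {
            value.writeTo(sink);
            return true;
        } catch (IOException | RuntimeException e) {
            return false;
        }
    }

    static String demo() {
        Writable good = out -> out.write("{\"_id\":\"1\"}".getBytes(StandardCharsets.UTF_8));
        Writable bad = out -> {
            // Stands in for a formatting check that only fails during rendering.
            throw new IllegalArgumentException("cannot format key");
        };
        return serializesCleanly(good) + "," + serializesCleanly(bad);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints true,false
    }
}
```

The cost is one extra full serialization per value, which is the overhead the question above is asking about.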

cbuescher

Thanks, LGTM

Add XContent chunking to SearchResponse

8770305

romseygeek added the :Search/Search, >refactoring, and v8.8.0 labels Mar 24, 2023

romseygeek requested review from DaveCTurner and cbuescher March 24, 2023 14:44

romseygeek self-assigned this Mar 24, 2023

elasticsearchmachine added the Team:Search label Mar 24, 2023

romseygeek commented Mar 24, 2023

romseygeek added 2 commits March 24, 2023 15:44

fix search template

6bb572d

sake

20fac00

cbuescher requested changes Mar 27, 2023

cbuescher self-requested a review March 27, 2023 09:59

Merge remote-tracking branch 'origin/main' into rest-chunking/search-…

54bac4a

…response

Merge remote-tracking branch 'origin/main' into rest-chunking/search-…

dbe558e

…response

gmarouli added v8.9.0 and removed v8.8.0 labels Apr 26, 2023

romseygeek added 2 commits April 28, 2023 10:13

Merge remote-tracking branch 'origin/main' into rest-chunking/search-…

2dc8dff

…response

tidy

c28e81e

compile

9b3e5b4

romseygeek mentioned this pull request Apr 28, 2023

Add Chunked rest encoding for Search APIs #95661

Open

4 tasks

romseygeek added 2 commits April 28, 2023 12:22

more cleanup

40b5258

Make sure SearchHit is serializable before we serialize it

5bb4694

Merge branch 'main' into rest-chunking/search-response

c7bd1d8

Merge branch 'main' into rest-chunking/search-response

cdae378

cbuescher reviewed May 12, 2023

remove unnecessary validation

c590c8c

romseygeek requested a review from cbuescher May 12, 2023 10:46

cbuescher approved these changes May 12, 2023

romseygeek merged commit a3edf6b into elastic:main May 12, 2023

romseygeek deleted the rest-chunking/search-response branch May 12, 2023 11:45

romseygeek mentioned this pull request Aug 11, 2023

RestChunkedXContentListener always responds with a 200 Status.OK #98389

Closed
