
Allow incremental bulk request execution #111865

Merged
Tim-Brooks merged 9 commits into elastic:partial-rest-requests from Tim-Brooks:add_incremental_bulk_service
Aug 20, 2024

Conversation

@Tim-Brooks (Contributor)

Allow a single bulk request to be passed to Elasticsearch in multiple
parts. Once a certain memory threshold or number of operations have
been received, the request can be split and submitted for processing.
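The accumulate-then-flush behaviour the description outlines can be sketched roughly as follows. This is a hypothetical illustration, not the actual Elasticsearch implementation: the class name, the `MAX_OPS`/`MAX_BYTES` thresholds, and the `flushCount` stand-in for `client.bulk` are all invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: accumulate bulk operations delivered in parts and
// flush once a memory threshold or operation count is reached, mirroring
// the behaviour the PR description states.
public class IncrementalBulkSketch {
    static final int MAX_OPS = 3;        // assumed operation-count threshold
    static final long MAX_BYTES = 1024;  // assumed memory threshold

    private final List<String> pending = new ArrayList<>();
    private long pendingBytes = 0;
    int flushCount = 0;                  // how many partial bulks were submitted

    // Add one operation; flush the accumulated batch if a threshold is crossed.
    public void addItem(String op) {
        pending.add(op);
        pendingBytes += op.length();
        if (pending.size() >= MAX_OPS || pendingBytes >= MAX_BYTES) {
            flush();
        }
    }

    // Final part of the request: flush whatever remains.
    public void lastItems(String op) {
        pending.add(op);
        flush();
    }

    private void flush() {
        if (!pending.isEmpty()) {
            flushCount++;                // stand-in for client.bulk(...)
            pending.clear();
            pendingBytes = 0;
        }
    }
}
```

With three small items the first flush fires on the operation-count threshold, and `lastItems` forces a final flush for the remainder.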

@Tim-Brooks Tim-Brooks added >enhancement :Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. v8.16.0 labels Aug 13, 2024
@Tim-Brooks Tim-Brooks marked this pull request as ready for review August 13, 2024 23:43
@Tim-Brooks Tim-Brooks requested a review from a team as a code owner August 13, 2024 23:43
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed Meta label for distributed team. label Aug 13, 2024

@henningandersen (Contributor) left a comment

LGTM.

handleBulkFailure(isFirstRequest, e);
}
}, nextItems));
incrementalRequestSubmitted = true;

Should we set this before invoking client.bulk (which would require moving the isFirstRequest calculation out)? While this might work in a Netty event loop with a single thread per request, I wonder if we want to rely on that. I.e., once we invoke nextItems.run(), I could see this method being invoked on another thread (or perhaps even the same one, in case the bulk request fails). Though perhaps not in Netty, that seems subtle to rely on.
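The concern above is an ordering race: if the flag is set after the asynchronous call, a callback running on another thread (or re-entering synchronously on failure) can observe the stale value. A hedged sketch of the suggested reordering, with invented names (`SubmitOrderSketch`, `fakeClientBulk`) standing in for the real handler and client:

```java
// Hypothetical sketch of the reviewer's suggestion: capture isFirstRequest
// and mark the request as submitted BEFORE handing the batch to the
// (possibly asynchronous) bulk client, so a callback never sees a stale flag.
public class SubmitOrderSketch {
    private boolean incrementalRequestSubmitted = false;
    boolean flagSeenByCallback;  // what the completion callback observed

    void submitBatch(Runnable nextItems) {
        boolean isFirstRequest = incrementalRequestSubmitted == false;
        incrementalRequestSubmitted = true;  // set before the async call
        fakeClientBulk(isFirstRequest, nextItems);
    }

    // Stand-in for client.bulk(...): invokes its continuation immediately,
    // as can happen when the bulk request fails synchronously.
    private void fakeClientBulk(boolean isFirstRequest, Runnable nextItems) {
        flagSeenByCallback = incrementalRequestSubmitted;
        nextItems.run();
    }
}
```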

@Override
public void onResponse(BulkResponse bulkResponse) {
responses.add(bulkResponse);
releaseCurrentReferences();

It would be nice to null out bulkRequest here. That would also allow us to assert bulkRequest != null in add/lastItems to verify the interaction. I think doing it in releaseCurrentReferences could be beneficial.
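The null-out-and-assert pattern the reviewer proposes can be illustrated with a minimal sketch. The class and method names below mirror those quoted in the thread but the bodies are invented for the example:

```java
// Hypothetical sketch of the suggested invariant: releaseCurrentReferences
// nulls out the in-flight bulkRequest, and add/lastItems assert it is
// non-null, so an out-of-order interaction fails fast under -ea.
public class ReleaseSketch {
    private Object bulkRequest = new Object();

    void addItems(Object item) {
        assert bulkRequest != null : "addItems called after references were released";
        // ... accumulate item into bulkRequest ...
    }

    void releaseCurrentReferences() {
        // ... release buffers backing the current request ...
        bulkRequest = null;  // makes later misuse trip the assertion above
    }

    boolean released() {
        return bulkRequest == null;
    }
}
```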


@Override
public void onFailure(Exception e) {
handleBulkFailure(isFirstRequest, e);

I think a failure here results in us sending back a response, even if there was never an addItems call? I think the after-block below needs to check for a globalFailure and call onFailure in that case.
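The fix the reviewer describes can be sketched as a completion step that checks for a recorded global failure before combining partial responses. This is a hypothetical illustration; `AfterBlockSketch`, `finish`, and the `outcome` field are invented stand-ins for the real after-block and listener:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the "after" block that normally combines partial
// bulk responses first checks whether the whole request already failed
// (e.g. onFailure fired before any addItems), and routes to the failure
// path instead of sending back a combined response.
public class AfterBlockSketch {
    private Exception globalFailure;                 // recorded by onFailure
    private final List<String> responses = new ArrayList<>();
    String outcome;                                  // "response" or "failure"

    void onFailure(Exception e) {
        globalFailure = e;
    }

    void onResponse(String partialResponse) {
        responses.add(partialResponse);
    }

    void finish() {
        if (globalFailure != null) {
            outcome = "failure";   // stand-in for listener.onFailure(globalFailure)
        } else {
            outcome = "response";  // stand-in for combining partial responses
        }
    }
}
```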

@Tim-Brooks Tim-Brooks merged commit a5565f2 into elastic:partial-rest-requests Aug 20, 2024
Tim-Brooks added a commit that referenced this pull request Sep 17, 2024:

Allow a single bulk request to be passed to Elasticsearch in multiple parts. Once a certain memory threshold or number of operations have been received, the request can be split and submitted for processing.

Tim-Brooks added further commits with the same message that referenced this pull request: two on Sep 18, 2024, one to Tim-Brooks/elasticsearch on Sep 19, 2024, and one on Sep 20, 2024.

Labels

:Distributed/CRUD (A catch all label for issues around indexing, updating and getting a doc by id. Not search.), >enhancement, Team:Distributed (Meta label for distributed team.), v8.16.0


5 participants