Our internal usage of BulkProcessor can fail indexing during retry #83927

@martijnvg

When BulkProcessor retries failed item requests, the same IndexRequest instances are reused in a new BulkRequest. This is a problem when BulkProcessor is used from within Elasticsearch (as it is for deprecation index logging, ILM/SLM history and more), because the bulk request transport action may already have set the autoGeneratedTimestamp field on those instances.

If that is the case, errors such as these may appear in the logs and the BulkProcessor retry fails:

Bulk write of 1 deprecation logs failed: autoGeneratedTimestamp should not be set externally
java.lang.IllegalArgumentException: autoGeneratedTimestamp should not be set externally
	at org.elasticsearch.action.bulk.TransportBulkAction.doInternalExecute(TransportBulkAction.java:223) ~[elasticsearch-7.16.1.jar:7.16.1]
	at org.elasticsearch.action.bulk.TransportBulkAction$1.doRun(TransportBulkAction.java:192) ~[elasticsearch-7.16.1.jar:7.16.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.16.1.jar:7.16.1]
	at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:200) [elasticsearch-7.16.1.jar:7.16.1]
	at org.elasticsearch.action.bulk.TransportBulkAction.doExecute(TransportBulkAction.java:91) [elasticsearch-7.16.1.jar:7.16.1]

In the bulk transport action, the process() method is invoked for each bulk item request of the BulkRequest. In the case of an IndexRequest with no _id specified, an id is generated and the IndexRequest#autoGeneratedTimestamp field is set. If the bulk request fails at a later stage, for example because the node that hosts a primary shard had no capacity to index documents, then the BulkResponse will contain an EsRejectedExecutionException for the bulk items this failure occurred for. This is normal behaviour. However, the BulkProcessor has retry logic and EsRejectedExecutionException is deemed retryable. On retry, the BulkProcessor reuses the same bulk item request instances (IndexRequest in this case), which still carry the autoGeneratedTimestamp set during the first attempt. This causes the failure shown in the stack trace.
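
The sequence above can be modelled in a few lines of plain Java. This is only a sketch to make the failure mode concrete: SketchIndexRequest, execute() and freshCopy() are hypothetical stand-ins, not the real IndexRequest, TransportBulkAction or BulkProcessor code, and building fresh copies for the retry is just one possible way to avoid the problem, not necessarily the fix that Elasticsearch should adopt.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class RetryReuseSketch {

    // Hypothetical stand-in for IndexRequest; not the real Elasticsearch class.
    static final class SketchIndexRequest {
        static final long UNSET = -1L;
        final String callerId;   // id supplied by the caller, null if none
        final String source;
        String id;
        long autoGeneratedTimestamp = UNSET;

        SketchIndexRequest(String callerId, String source) {
            this.callerId = callerId;
            this.source = source;
            this.id = callerId;
        }

        // Models the step described above: generate an id and record the timestamp.
        void process() {
            if (id == null) {
                id = UUID.randomUUID().toString();
                autoGeneratedTimestamp = System.currentTimeMillis();
            }
        }

        // New instance that carries only the caller-supplied state.
        SketchIndexRequest freshCopy() {
            return new SketchIndexRequest(callerId, source);
        }
    }

    // Hypothetical stand-in for the bulk transport action.
    // Returns false to model a rejection that happens after process() has run.
    static boolean execute(List<SketchIndexRequest> items, boolean rejectAfterProcess) {
        for (SketchIndexRequest item : items) {
            if (item.autoGeneratedTimestamp != SketchIndexRequest.UNSET) {
                throw new IllegalArgumentException("autoGeneratedTimestamp should not be set externally");
            }
            item.process();
        }
        return !rejectAfterProcess;
    }

    // Retry with the same instances: the second attempt is refused.
    public static boolean retryFailsWhenReusingInstances() {
        List<SketchIndexRequest> items = List.of(new SketchIndexRequest(null, "{\"msg\":\"a\"}"));
        execute(items, true);          // first attempt: processed, then rejected
        try {
            execute(items, false);     // retry reuses the mutated instances
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Retry with fresh copies: the second attempt goes through.
    public static boolean retrySucceedsWithFreshCopies() {
        List<SketchIndexRequest> items = List.of(new SketchIndexRequest(null, "{\"msg\":\"a\"}"));
        execute(items, true);          // first attempt: processed, then rejected
        List<SketchIndexRequest> retry = new ArrayList<>();
        for (SketchIndexRequest item : items) {
            retry.add(item.freshCopy());
        }
        return execute(retry, false);
    }

    public static void main(String[] args) {
        System.out.println("retry fails when reusing instances: " + retryFailsWhenReusingInstances());
        System.out.println("retry succeeds with fresh copies: " + retrySucceedsWithFreshCopies());
    }
}
```

The point of the model is that the request object is mutated by the first attempt, so anything that resubmits it has to either reset that state or build a new request from the caller's original input.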
