Is your feature request related to a problem? Please describe
Ingest pipelines are currently applied inconsistently to single and bulk document update operations, as described in #17742. This leads to inconsistent document processing, particularly when an update request generates multiple index operations (e.g., upsert scenarios or doc_as_upsert cases): certain flag combinations trigger ingest pipelines, while others don't.
System ingest pipelines, introduced in #17817, are intended to apply processor transformations such as embedding generation for semantic fields while abstracting away pipeline setup for users. On top of the update inconsistency described above, this adds more surface area for confusion: for example, semantic field users may bulk update their semantic text field without knowing that it relies on system ingest pipelines to generate embeddings under the hood. Because the pipelines are not triggered, the text field and its underlying embedding fall out of sync, which degrades search quality.
We propose a sub-solution for the general case described in #17742 where we resolve and execute system pipelines for all update requests to make this behavior consistent. Much of this work is also shared with resolving the general case of the original issue.
Note that with this change, bulk update operations will trigger system ingest processors on partial docs, which may not contain all fields expected for documents (fields defined in the index mapping). System ingest processors MUST handle this case gracefully (validate that fields exist before accessing them, and have clearly defined behavior when fields are missing), or else bulk update operations will fail.
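As a rough illustration of what "handling partial docs gracefully" could look like, here is a minimal sketch of a processor-style function that checks for its input field before touching it. It models the document source as a plain Map and does not use the real processor interface; the class name, the field names (body, body_embedding), and the placeholder embedding are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the actual system ingest processor API.
class PartialDocSafeProcessor {
    // Assumed field names, for illustration only.
    static final String TEXT_FIELD = "body";
    static final String EMBEDDING_FIELD = "body_embedding";

    // Adds an embedding only when the text field is present and is a string.
    // A partial doc that does not carry the text field is returned unchanged
    // instead of failing the whole bulk item.
    static Map<String, Object> process(Map<String, Object> source) {
        Object value = source.get(TEXT_FIELD);
        if (!(value instanceof String)) {
            return source;
        }
        source.put(EMBEDDING_FIELD, placeholderEmbedding((String) value));
        return source;
    }

    // Stand-in for a real model call; returns a one-element vector.
    static float[] placeholderEmbedding(String text) {
        return new float[] { (float) text.length() };
    }

    public static void main(String[] args) {
        Map<String, Object> partial = new HashMap<>();
        partial.put("title", "only the title changed");
        System.out.println(process(partial).containsKey(EMBEDDING_FIELD)); // false

        Map<String, Object> full = new HashMap<>();
        full.put(TEXT_FIELD, "hello");
        System.out.println(process(full).containsKey(EMBEDDING_FIELD)); // true
    }
}
```

Skipping silently is only one possible "clearly defined behavior"; a processor could equally choose to fail with a descriptive error, as long as the choice is deliberate.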
Describe the solution you'd like
Support system ingest pipelines for bulk update operations
Update Request Type Classification
- Introduce a method to expose all child index requests associated with an update operation
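One way to picture such a method is sketched below, using simplified stand-in types rather than the real request classes. The names UpdateOp, IndexOp, and children(), and the ordering of doc before upsert, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of an update operation and its child index requests.
class UpdateChildrenSketch {
    static final class IndexOp {
        final String source;

        IndexOp(String source) {
            this.source = source;
        }
    }

    static final class UpdateOp {
        final IndexOp doc;    // partial document, may be null
        final IndexOp upsert; // upsert document, may be null

        UpdateOp(IndexOp doc, IndexOp upsert) {
            this.doc = doc;
            this.upsert = upsert;
        }

        // Exposes every child index request that is present, in a fixed order
        // (doc first, then upsert), so callers can address children by position.
        List<IndexOp> children() {
            List<IndexOp> result = new ArrayList<>();
            if (doc != null) {
                result.add(doc);
            }
            if (upsert != null) {
                result.add(upsert);
            }
            return result;
        }
    }
}
```

A fixed ordering matters here because later steps need a stable way to refer to each child of the same update operation.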
Pipeline Resolution Enhancement
- Use resolveSystemIngestPipeline to enable resolving only the system ingest pipeline while setting the others to NOOP
- Based on update request fields, we extract the update request children and conditionally resolve ALL pipelines, resolve ONLY system ingest pipelines, or no pipelines at all.
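The three outcomes above (all pipelines, system pipeline only, none) could be modeled roughly as follows. This is a sketch under assumptions: the Mode enum, the ResolvedPipelines holder, and the "_none" no-op marker are illustrative, and the rules that pick a mode from the update request fields are deliberately left out because this issue does not spell them out.

```java
// Hypothetical sketch of applying a resolution mode to a child index request's pipelines.
class PipelineResolutionSketch {
    // Assumed marker meaning "run no pipeline here".
    static final String NOOP = "_none";

    enum Mode { ALL, SYSTEM_ONLY, NONE }

    static final class ResolvedPipelines {
        final String defaultPipeline;
        final String finalPipeline;
        final String systemPipeline;

        ResolvedPipelines(String defaultPipeline, String finalPipeline, String systemPipeline) {
            this.defaultPipeline = defaultPipeline;
            this.finalPipeline = finalPipeline;
            this.systemPipeline = systemPipeline;
        }
    }

    // ALL keeps whatever was configured, SYSTEM_ONLY keeps only the system
    // pipeline and sets the others to NOOP, NONE sets every pipeline to NOOP.
    static ResolvedPipelines resolve(Mode mode, ResolvedPipelines configured) {
        switch (mode) {
            case ALL:
                return configured;
            case SYSTEM_ONLY:
                return new ResolvedPipelines(NOOP, NOOP, configured.systemPipeline);
            default:
                return new ResolvedPipelines(NOOP, NOOP, NOOP);
        }
    }
}
```

Keeping the mode decision separate from the act of applying it means the flag-to-mode rules can change without touching how each child request is resolved.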
Slot Management
- Introduce innerSlot to track individual child index requests within an update operation
- Use innerSlot to map pipeline execution results back to the correct child request using (slot, innerSlot) pairs
- Maintain proper error handling and response mapping for both parent and child operations to their original bulk request slot
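The (slot, innerSlot) bookkeeping could look something like the sketch below. The SlotKey class, the string-valued results, and the convention that innerSlot 0 is the doc and innerSlot 1 is the upsert are assumptions made for illustration, not the proposed implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: keying pipeline results by (slot, innerSlot).
class SlotMappingSketch {
    static final class SlotKey {
        final int slot;      // position of the update operation in the bulk request
        final int innerSlot; // position of the child index request within that update

        SlotKey(int slot, int innerSlot) {
            this.slot = slot;
            this.innerSlot = innerSlot;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof SlotKey)) {
                return false;
            }
            SlotKey other = (SlotKey) o;
            return slot == other.slot && innerSlot == other.innerSlot;
        }

        @Override
        public int hashCode() {
            return Objects.hash(slot, innerSlot);
        }
    }

    private final Map<SlotKey, String> results = new HashMap<>();

    // Records the outcome of running pipelines on one child request.
    void record(int slot, int innerSlot, String result) {
        results.put(new SlotKey(slot, innerSlot), result);
    }

    // Returns the recorded outcome, or null if that child was never processed.
    String lookup(int slot, int innerSlot) {
        return results.get(new SlotKey(slot, innerSlot));
    }
}
```

For example, an update at bulk slot 2 that carries both a doc and an upsert would record results under (2, 0) and (2, 1), and a failure on either child can still be reported against bulk slot 2.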
Related component
Indexing
Describe alternatives you've considered
No response
Additional context
No response