ESQL: Add split discovery and distribution for external sources #143114
Merged
costin merged 3 commits into elastic:main on Feb 26, 2026
Introduce split-based parallel processing for external data sources. ExternalSourceExec now carries a list of ExternalSplit instances with version-gated serialization. SplitDiscoveryPhase walks the physical plan tree, collecting per-ancestor filter expressions and discovering splits via SplitProvider for each ExternalSourceExec node.

- ExternalSourceExec: splits field, serialization, equals/hashCode, NodeInfo
- SplitDiscoveryPhase: recursive plan walk with per-source filter scoping
- ComputeService: wire discoverSplits into executePlan
- SourceOperatorContext: add split field for per-split operator execution
- FileSplitProvider: L1 partition pruning via Expression evaluation
- OperatorFactoryRegistry: expose sourceFactories for split discovery
- TransportVersion: ESQL_EXTERNAL_SOURCE_SPLITS for backward compat

Developed using AI-assisted tooling
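To make the "recursive plan walk with per-source filter scoping" concrete, here is a minimal sketch of the idea, not the code from this PR. All types below (`PlanNode`, `FilterNode`, `ForkNode`, `SourceNode`, and this `SplitProvider` interface) are hypothetical stand-ins for the real ESQL plan classes, and filters and splits are modelled as plain strings. The point it illustrates is that each source only sees the filters of its own ancestors, so sibling branches do not leak conditions into each other.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitDiscoverySketch {
    // Hypothetical stand-ins for physical plan nodes; not the real ESQL classes.
    interface PlanNode {}
    record FilterNode(String condition, PlanNode child) implements PlanNode {}
    record ForkNode(List<PlanNode> children) implements PlanNode {}
    record SourceNode(String path, List<String> splits) implements PlanNode {}

    // Hypothetical provider: maps a source path plus the filters above it to splits.
    interface SplitProvider {
        List<String> discover(String path, List<String> filters);
    }

    // Walks the tree top-down, carrying only the filters found on the current path.
    static PlanNode discoverSplits(PlanNode node, List<String> ancestorFilters, SplitProvider provider) {
        if (node instanceof FilterNode filter) {
            // Copy before extending so sibling branches never observe this filter.
            List<String> scoped = new ArrayList<>(ancestorFilters);
            scoped.add(filter.condition());
            return new FilterNode(filter.condition(), discoverSplits(filter.child(), scoped, provider));
        }
        if (node instanceof ForkNode fork) {
            List<PlanNode> rewritten = new ArrayList<>();
            for (PlanNode child : fork.children()) {
                rewritten.add(discoverSplits(child, ancestorFilters, provider));
            }
            return new ForkNode(List.copyOf(rewritten));
        }
        if (node instanceof SourceNode source) {
            // Leaf: ask the provider for splits, scoped to this source's ancestors only.
            List<String> splits = provider.discover(source.path(), List.copyOf(ancestorFilters));
            return new SourceNode(source.path(), List.copyOf(splits));
        }
        throw new IllegalStateException("unexpected node: " + node);
    }
}
```

With a fork over two sources where only the first sits under a filter, the provider is called with that filter for the first source and with an empty filter list for the second.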
Collaborator
Pinging @elastic/es-analytical-engine (Team:Analytics)
Collaborator
Hi @costin, I've created a changelog YAML for you.
bpintea approved these changes Feb 26, 2026
    if (listItem instanceof Literal lit == false) {
        return null;
    }
    if (compareEquals(partitionValue, ((Literal) listItem).value())) {
Contributor
Nit:
Suggested change
-    if (compareEquals(partitionValue, ((Literal) listItem).value())) {
+    if (compareEquals(partitionValue, lit.value())) {
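For context, a self-contained sketch of how an IN-list check against a partition value can reuse the pattern variable instead of casting again. This is an illustration under assumptions, not the FileSplitProvider code: `Expr`, `Literal`, and `FieldRef` are hypothetical stand-ins, `Objects.equals` replaces the PR's `compareEquals`, and the `FALSE` result after the loop is assumed since the diff does not show it. The sketch uses the positive `instanceof` form so the binding `lit` is scoped to the matching branch.

```java
import java.util.List;
import java.util.Objects;

final class InListPruningSketch {
    // Hypothetical stand-ins for expression nodes; not the real ESQL classes.
    interface Expr {}
    record Literal(Object value) implements Expr {}
    record FieldRef(String name) implements Expr {}

    // TRUE: the partition value matches an item; FALSE: every item is a literal
    // and none matches (assumed behaviour); null: undecidable, so keep the split.
    static Boolean evaluateIn(Object partitionValue, List<Expr> items) {
        for (Expr item : items) {
            if (item instanceof Literal lit) {
                // 'lit' is already typed as Literal, so no second cast is needed.
                if (Objects.equals(partitionValue, lit.value())) {
                    return Boolean.TRUE;
                }
            } else {
                // A non-literal item cannot be evaluated during pruning.
                return null;
            }
        }
        return Boolean.FALSE;
    }
}
```

Returning `null` rather than `FALSE` for non-literal items keeps pruning conservative: a split is only dropped when the filter is known not to match.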
Comment on lines 51 to 56
    if (path == null) {
        throw new IllegalArgumentException("path cannot be null");
    }
    if (executor == null) {
        throw new IllegalArgumentException("executor cannot be null");
    }
Contributor
These lines were not changed in this PR, but generally for this pattern we need to use Check.notNull, which throws a 5xx-class exception.
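A rough sketch of what that change looks like in a record's compact constructor. Everything here is a stand-in: `notNull` and `InvariantViolationException` are hypothetical helpers written for this example (they are not the project's `Check` class, whose exact signature and exception type are not shown in this thread), and `Context` with a `String` path and a `Runnable` executor only mirrors the two fields visible in the diff.

```java
final class NullCheckSketch {
    // Hypothetical exception marking a broken internal invariant (a server-side
    // bug) rather than bad user input.
    static final class InvariantViolationException extends RuntimeException {
        InvariantViolationException(String message) {
            super(message);
        }
    }

    // Hypothetical helper, standing in for a shared null-check utility.
    static void notNull(Object value, String message) {
        if (value == null) {
            throw new InvariantViolationException(message);
        }
    }

    // Hypothetical record; the field types are assumptions.
    record Context(String path, Runnable executor) {
        Context {
            // The compact constructor validates before the fields are assigned.
            notNull(path, "path cannot be null");
            notNull(executor, "executor cannot be null");
        }
    }
}
```

The design point is that a null here signals a programming error inside the server, so it should surface as an internal error rather than as an invalid-argument response blamed on the caller.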
costin added a commit to costin/elasticsearch that referenced this pull request Feb 26, 2026
Use pattern variable lit instead of re-casting in FileSplitProvider IN-expression evaluation. Replace manual null checks with Check.notNull in SourceOperatorContext compact constructor. Developed using AI-assisted tooling
Relates #142996