ESQL: Add split discovery and distribution for external sources #143114

Merged
costin merged 3 commits into elastic:main from costin:esql/ds-distributed/stage-2
Feb 26, 2026

@costin (Member) commented Feb 25, 2026

Introduce split-based parallel processing for external data sources.
ExternalSourceExec now carries a list of ExternalSplit instances with
version-gated serialization. SplitDiscoveryPhase walks the physical
plan tree, collecting per-ancestor filter expressions and discovering
splits via SplitProvider for each ExternalSourceExec node.
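
As a rough illustration of what "version-gated serialization" can look like, here is a minimal, self-contained sketch. It does not use the real Elasticsearch stream or `TransportVersion` classes; the class name, the integer `SPLITS_VERSION`, and the use of plain strings for splits are all assumptions made for the example. The idea shown is only that the splits list is written and read when the peer's version is new enough, and skipped otherwise.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class VersionGatedSplitsSketch {
    // Hypothetical wire version at which the splits field was introduced.
    static final int SPLITS_VERSION = 2;

    // Writes the source path always, and the splits only for new-enough peers.
    static byte[] write(String sourcePath, List<String> splits, int peerVersion) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(sourcePath);
            if (peerVersion >= SPLITS_VERSION) {
                out.writeInt(splits.size());
                for (String split : splits) {
                    out.writeUTF(split);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the splits back; older peers never sent them, so default to empty.
    static List<String> readSplits(byte[] data, int peerVersion) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            in.readUTF(); // skip the source path
            if (peerVersion < SPLITS_VERSION) {
                return List.of();
            }
            int count = in.readInt();
            List<String> splits = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                splits.add(in.readUTF());
            }
            return splits;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> splits = List.of("part-0", "part-1");
        System.out.println(readSplits(write("s3://bucket/table", splits, 1), 1));
        System.out.println(readSplits(write("s3://bucket/table", splits, 2), 2));
    }
}
```

Both sides must gate on the same version; otherwise the reader would consume bytes the writer never produced.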

  • ExternalSourceExec: splits field, serialization, equals/hashCode, NodeInfo
  • SplitDiscoveryPhase: recursive plan walk with per-source filter scoping
  • ComputeService: wire discoverSplits into executePlan
  • SourceOperatorContext: add split field for per-split operator execution
  • FileSplitProvider: L1 partition pruning via Expression evaluation
  • OperatorFactoryRegistry: expose sourceFactories for split discovery
  • TransportVersion: ESQL_EXTERNAL_SOURCE_SPLITS for backward compat
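
The recursive walk with per-source filter scoping described above might be sketched roughly as follows. This is not the code from the PR: the `Plan`, `Filter`, `Union`, and `Source` types and the `SplitProvider` signature are simplified, hypothetical stand-ins, and filters are plain strings rather than expressions. The sketch only shows how filters on the path from the root to a source can be collected and handed to the provider for that source alone.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SplitDiscoverySketch {
    // Hypothetical, simplified plan nodes; not the real ESQL physical plan classes.
    interface Plan {}
    record Filter(String condition, Plan child) implements Plan {}
    record Union(Plan left, Plan right) implements Plan {}
    record Source(String path) implements Plan {}

    // Hypothetical stand-in for a split provider.
    interface SplitProvider {
        List<String> discover(Source source, List<String> ancestorFilters);
    }

    // Returns the discovered splits keyed by source path.
    static Map<String, List<String>> discoverSplits(Plan plan, SplitProvider provider) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        walk(plan, new ArrayList<>(), provider, result);
        return result;
    }

    private static void walk(Plan node, List<String> filters, SplitProvider provider, Map<String, List<String>> out) {
        if (node instanceof Filter f) {
            // The filter applies only to the subtree below it: push, recurse, pop.
            filters.add(f.condition());
            walk(f.child(), filters, provider, out);
            filters.remove(filters.size() - 1);
        } else if (node instanceof Union u) {
            walk(u.left(), filters, provider, out);
            walk(u.right(), filters, provider, out);
        } else if (node instanceof Source s) {
            // Hand the provider a snapshot of the filters above this source.
            out.put(s.path(), provider.discover(s, List.copyOf(filters)));
        }
    }

    public static void main(String[] args) {
        Plan plan = new Union(new Filter("year == 2024", new Source("a")), new Source("b"));
        // Toy provider: one split per source, labelled with the filters it saw.
        SplitProvider provider = (source, filters) -> List.of(source.path() + " " + filters);
        System.out.println(discoverSplits(plan, provider));
    }
}
```

The push/pop around the recursion is what keeps a filter from leaking into sibling branches: in the example, source `b` sees no filters even though source `a` sits under one.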

Relates #142996

Developed using AI-assisted tooling

@costin added the >enhancement, Team:Analytics, :Analytics/ES|QL, and v9.4.0 labels Feb 25, 2026
@costin costin requested a review from bpintea February 25, 2026 22:06
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine (Collaborator)

Hi @costin, I've created a changelog YAML for you.

@bpintea (Contributor) left a comment

Lgtm

if (listItem instanceof Literal lit == false) {
    return null;
}
if (compareEquals(partitionValue, ((Literal) listItem).value())) {

Nit:

Suggested change
- if (compareEquals(partitionValue, ((Literal) listItem).value())) {
+ if (compareEquals(partitionValue, lit.value())) {
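
For context, the shape of an IN-list check that reuses the pattern variable could look like the sketch below. It is an illustration only: `Expr`, `Literal`, `FieldRef`, and `matchesIn` are hypothetical stand-ins rather than the `FileSplitProvider` code, `Objects.equals` takes the place of `compareEquals`, and the negation is written as `!(...)`. A `null` result means the check cannot be decided from literals, so the partition should not be pruned.

```java
import java.util.List;
import java.util.Objects;

class InListPruningSketch {
    // Hypothetical stand-ins for expression nodes.
    interface Expr {}
    record Literal(Object value) implements Expr {}
    record FieldRef(String name) implements Expr {}

    // TRUE or FALSE when decidable from literals alone, null when it is not.
    static Boolean matchesIn(Object partitionValue, List<Expr> listItems) {
        for (Expr item : listItems) {
            if (!(item instanceof Literal lit)) {
                return null; // non-literal item: cannot decide, so do not prune
            }
            // 'lit' is in scope here because the branch above always returns.
            if (Objects.equals(partitionValue, lit.value())) {
                return Boolean.TRUE;
            }
        }
        return Boolean.FALSE;
    }

    public static void main(String[] args) {
        List<Expr> years = List.of(new Literal("2023"), new Literal("2024"));
        System.out.println(matchesIn("2024", years));
        System.out.println(matchesIn("2020", years));
        System.out.println(matchesIn("2020", List.of(new FieldRef("other"))));
    }
}
```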

Comment on lines 51 to 56
if (path == null) {
    throw new IllegalArgumentException("path cannot be null");
}
if (executor == null) {
    throw new IllegalArgumentException("executor cannot be null");
}
@bpintea (Contributor) Feb 26, 2026

These lines were not changed in this PR, but for this pattern we generally need to use Check.notNull, which throws a 5xx-class exception.
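
A sketch of what that replacement might look like in a record's compact constructor is below. To stay self-contained it does not import the real helper: the nested `Check` class and `InternalStateException` are hypothetical stand-ins, assumed here to behave like a helper that throws an exception reported as a server-side (5xx) error, and `SourceContext` is a cut-down placeholder for the real record.

```java
import java.util.concurrent.Executor;

class NullCheckSketch {
    // Hypothetical stand-in for an exception type that maps to a 5xx status.
    static class InternalStateException extends RuntimeException {
        InternalStateException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for the Check.notNull helper mentioned above.
    static final class Check {
        private Check() {}

        static void notNull(Object value, String message) {
            if (value == null) {
                throw new InternalStateException(message);
            }
        }
    }

    // Cut-down placeholder record; the compact constructor validates its components.
    record SourceContext(String path, Executor executor) {
        SourceContext {
            Check.notNull(path, "path cannot be null");
            Check.notNull(executor, "executor cannot be null");
        }
    }

    public static void main(String[] args) {
        try {
            new SourceContext(null, Runnable::run);
        } catch (InternalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The distinction matters because a null here signals a programming error inside the server rather than bad user input, which is what `IllegalArgumentException` usually implies.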

@costin costin merged commit 80e4f22 into elastic:main Feb 26, 2026
35 checks passed
@costin costin deleted the esql/ds-distributed/stage-2 branch February 26, 2026 12:25
costin added a commit to costin/elasticsearch that referenced this pull request Feb 26, 2026
Use pattern variable lit instead of re-casting in FileSplitProvider
IN-expression evaluation. Replace manual null checks with Check.notNull
in SourceOperatorContext compact constructor.

Developed using AI-assisted tooling
@tylerperk tylerperk added the ES|QL|DS ES|QL datasources label Mar 18, 2026