Skip to content

Cleanup swapSourceProvider(...) workaround#118480

Merged
martijnvg merged 6 commits intoelastic:mainfrom
martijnvg:revert_esql_eof_workaround
Jan 14, 2025
Merged

Cleanup swapSourceProvider(...) workaround#118480
martijnvg merged 6 commits intoelastic:mainfrom
martijnvg:revert_esql_eof_workaround

Conversation

@martijnvg
Copy link
Copy Markdown
Member

This reverts the workaround that was introduced in #117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. This is now covered by the ReinitializingSourceProvider workaround that covers that and the concurrency problem. With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.

This reverts the workaround that was introduced in elastic#117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic.
This is now covered by the `ReinitializingSourceProvider` workaround that covers that and the concurrency problem.
With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.
@martijnvg martijnvg added >non-issue auto-backport Automatically create backport pull requests when merged :StorageEngine/Mapping The storage related side of mappings v9.0.0 v8.18.0 labels Dec 11, 2024
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@martijnvg
Copy link
Copy Markdown
Member Author

The last commit addresses a test failure that surfaces after running many iterations. For example: repeat 10 ./gradlew ":x-pack:plugin:esql:internalClusterTest" --tests "org.elasticsearch.xpack.esql.action.Esql*IT.testScriptField" -Dtests.iters=256

The issue was the lastSeenDoc field was reused overwritten by different threads, the latest commit moves the lastSeenDoc field to PerThreadSourceProvider so that each thread gets its own place to store the last seen docid.

This problem became visible after reverting #117792 in favor of just relying on ReinitializingSourceProvider. But with the latest fix doesn't can't happen any more. Now the entire compute engine source provider workaround only resides in ReinitializingSourceProvider.

@martijnvg martijnvg requested a review from dnhatn January 13, 2025 19:14
Copy link
Copy Markdown
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks Martijn!

@martijnvg martijnvg enabled auto-merge (squash) January 14, 2025 08:23
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

💚 Backport successful

Status Branch Result
8.x

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jan 14, 2025
This reverts the workaround that was introduced in elastic#117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. This is now covered by the ReinitializingSourceProvider workaround that covers that and the concurrency problem. With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.

Additional another in `ReinitializingSourceProvider` was fixed, the issue was the lastSeenDoc field was reused overwritten by different threads, the latest commit moves the lastSeenDoc field to PerThreadSourceProvider so that each thread gets its own place to store the last seen docid.
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jan 14, 2025
This reverts the workaround that was introduced in elastic#117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. This is now covered by the ReinitializingSourceProvider workaround that covers that and the concurrency problem. With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.

Additional another in `ReinitializingSourceProvider` was fixed, the issue was the lastSeenDoc field was reused overwritten by different threads, the latest commit moves the lastSeenDoc field to PerThreadSourceProvider so that each thread gets its own place to store the last seen docid.
martijnvg added a commit that referenced this pull request Jan 15, 2025
Backporting #118480 to 8.x branch.

This reverts the workaround that was introduced in #117792 to avoid EOF error when an es|ql query uses multiple runtime fields that fallback to source when source mode is synthetic. This is now covered by the ReinitializingSourceProvider workaround that covers that and the concurrency problem. With this change, the main code for the required workarounds are now in isolated in ReinitializingSourceProvider.

Additional another in `ReinitializingSourceProvider` was fixed, the issue was the lastSeenDoc field was reused overwritten by different threads, the latest commit moves the lastSeenDoc field to PerThreadSourceProvider so that each thread gets its own place to store the last seen docid.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

auto-backport Automatically create backport pull requests when merged >non-issue :StorageEngine/Mapping The storage related side of mappings Team:StorageEngine v8.18.0 v9.0.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants