[opt](rowset) Remote fetch rowsets to avoid -230 error when capturing rowsets #52995
TPC-H: Total hot run time: 36123 ms
ClickBench: Total hot run time: 29.84 s
PR approved by at least one committer and no changes requested.
… rowsets (apache#52995)

Related PR: apache#52440

In read-write splitting scenarios, some BE (Backend) nodes may have already merged certain rowset versions, while another BE still attempts to capture or access those rowsets. When this happens, the BE reports error E-230 (versions already merged), causing data access or synchronization to fail.

This PR introduces a remote rowset fetching mechanism, allowing a BE that lacks the required rowset to fetch it from other BE nodes, instead of failing with E-230.

- Added a remote fetch mechanism in the rowset management layer: when a BE detects that a rowset is missing locally but has already been merged, it tries to fetch the rowset from other BE nodes.
- Updated version and state checking logic to correctly identify the “merged but missing” condition.
- Adjusted the rowset access path to trigger remote fetch rather than throwing an immediate error.
- Added tests (unit/integration) to cover the new logic where applicable.
- Ensured backward compatibility: if the BE already has the rowset locally or read-write splitting is not enabled, the behavior remains unchanged.

Introduce a remote rowset fetching mechanism to prevent E-230 (“versions already merged”) errors in read-write splitting scenarios. This improves BE fault tolerance when some nodes have merged versions that others have not yet synchronized.
… rowsets (#52995) (#57271)

Related PR: #52995

### Release note

None

### Check List (For Author)

- Test
    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test.
- Behavior changed:
    - [x] No.
    - [ ] Yes.
- Does this need documentation?
    - [ ] No.
    - [x] Yes.

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
What problem does this PR solve?
Related PR: #52440
Problem Summary:
In read-write splitting scenarios, some BE (Backend) nodes may have already merged certain rowset versions, while another BE still attempts to capture or access those rowsets.
When this happens, the BE reports error E-230 (versions already merged), causing data access or synchronization to fail.
This PR introduces a remote rowset fetching mechanism, allowing a BE that lacks the required rowset to fetch it from other BE nodes, instead of failing with E-230.
When a BE detects that a rowset is missing locally but has already been merged, it will try to fetch the rowset from other BE nodes.
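The fallback described above can be illustrated with a small model. This is a minimal sketch in Python rather than the C++ code of the PR, and every name in it (`Backend`, `Rowset`, `VersionsAlreadyMerged`, `capture_with_remote_fallback`) is hypothetical. It assumes a rowset covers an inclusive version range, and that a capture fails when the requested range ends inside a rowset that has already been merged locally.

```python
from dataclasses import dataclass, field


class VersionsAlreadyMerged(Exception):
    """Hypothetical stand-in for the -230 error (versions already merged)."""


@dataclass(frozen=True)
class Rowset:
    start: int  # first version covered (inclusive)
    end: int    # last version covered (inclusive)


@dataclass
class Backend:
    name: str
    rowsets: list = field(default_factory=list)

    def capture(self, start, end):
        """Return rowsets that tile [start, end] exactly, or raise if none do."""
        by_start = {r.start: r for r in self.rowsets}
        path, cursor = [], start
        while cursor <= end:
            rs = by_start.get(cursor)
            # A missing rowset, or one that overshoots the requested end
            # (for example a merged [2-5] when only [0-3] is wanted), means
            # there is no local version path.
            if rs is None or rs.end > end:
                raise VersionsAlreadyMerged(
                    f"{self.name}: no version path for [{start}-{end}]")
            path.append(rs)
            cursor = rs.end + 1
        return path


def capture_with_remote_fallback(local, peers, start, end):
    """Try the local BE first; on failure, ask each peer in turn.

    Returns (rowsets, name of the BE that served them). Re-raises the local
    error if no peer can serve the range either.
    """
    try:
        return local.capture(start, end), local.name
    except VersionsAlreadyMerged as local_err:
        for peer in peers:
            try:
                return peer.capture(start, end), peer.name
            except VersionsAlreadyMerged:
                continue
        raise local_err


if __name__ == "__main__":
    # be1 has already merged versions 2..5; be2 still holds the fine-grained rowsets.
    be1 = Backend("be1", [Rowset(0, 1), Rowset(2, 5)])
    be2 = Backend("be2", [Rowset(0, 1), Rowset(2, 2), Rowset(3, 3), Rowset(4, 5)])
    path, source = capture_with_remote_fallback(be1, [be2], 0, 3)
    print(source, [(r.start, r.end) for r in path])
```

In this model a request that the local BE can serve never touches a peer, which mirrors the backward-compatibility point: the remote path is only taken when the local capture would otherwise fail.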
Release note
Introduce a remote rowset fetching mechanism to prevent E-230 (“versions already merged”) errors in read-write splitting scenarios.
This improves BE fault tolerance when some nodes have merged versions that others have not yet synchronized.