@Hastyshell (Collaborator) commented Oct 23, 2025

Related PR: #52995

In read-write splitting scenarios, some BE (Backend) nodes may have already merged certain rowset versions, while another BE still attempts to capture or access those rowsets.
When this happens, the BE reports error E-230 (versions already merged), causing data access or synchronization to fail.
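
As a rough illustration of the failure described above, the following sketch models each rowset as an inclusive version range and tries to tile the range from version 0 up to a requested version. It is a simplified model under stated assumptions, not the BE implementation: `capture_consistent_versions`, `VersionsAlreadyMerged`, and the tuple representation are hypothetical names introduced here only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical model: one rowset covers an inclusive version range [start, end].
Version = Tuple[int, int]


class VersionsAlreadyMerged(Exception):
    """Illustrative stand-in for the E-230 condition (not the real BE type)."""


def capture_consistent_versions(rowsets: List[Version], target_end: int) -> List[Version]:
    """Return the rowsets that exactly tile [0, target_end], or raise.

    Assumes the rowsets on one replica do not overlap.
    """
    by_start: Dict[int, Version] = {start: (start, end) for start, end in rowsets}
    path: List[Version] = []
    next_start = 0
    while next_start <= target_end:
        rowset = by_start.get(next_start)
        if rowset is None:
            # A plain gap: the version was never present on this replica.
            raise LookupError(f"version {next_start} is missing")
        if rowset[1] > target_end:
            # The requested boundary falls inside a merged rowset, so the
            # individual versions can no longer be captured on this replica.
            raise VersionsAlreadyMerged(
                f"versions {rowset[0]}-{rowset[1]} already merged, wanted up to {target_end}"
            )
        path.append(rowset)
        next_start = rowset[1] + 1
    return path


# Two replicas of the same data: one still holds versions 2, 3 and 4 as
# separate rowsets, the other has already merged them into a single rowset.
unmerged_replica: List[Version] = [(0, 1), (2, 2), (3, 3), (4, 4)]
merged_replica: List[Version] = [(0, 1), (2, 4)]
```

With these inputs, `capture_consistent_versions(unmerged_replica, 3)` succeeds, while the same call against `merged_replica` raises `VersionsAlreadyMerged`, which is the situation this PR targets.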

This PR introduces a remote rowset fetching mechanism, allowing a BE that lacks the required rowset to fetch it from other BE nodes, instead of failing with E-230.

  • Added a remote fetch mechanism in the rowset management layer: When a BE detects that a rowset is missing locally but has already been merged, it will try to fetch the rowset from other BE nodes.
  • Updated version and state checking logic to correctly identify the “merged but missing” condition.
  • Adjusted the rowset access path to trigger remote fetch rather than throwing an immediate error.
  • Added tests (unit/integration) to cover the new logic where applicable.
  • Ensured backward compatibility: If the BE already has the rowset locally or read-write splitting is not enabled, the behavior remains unchanged.
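
The control flow listed above can be sketched roughly as follows. This is a minimal model under stated assumptions, not the code in this PR: `Backend`, `capture_local`, `capture_with_remote_fallback`, and the `enable_remote_fetch` flag are hypothetical names, and a real BE would transfer rowset data over the network rather than read a peer's in-memory map.

```python
from typing import Dict, List, Tuple

# Hypothetical model: one rowset covers an inclusive version range [start, end].
Version = Tuple[int, int]


class VersionsAlreadyMerged(Exception):
    """Illustrative stand-in for the E-230 condition (not the real BE type)."""


class Backend:
    """Toy BE holding non-overlapping rowsets keyed by start version."""

    def __init__(self, name: str, rowsets: List[Version]) -> None:
        self.name = name
        self.rowsets: Dict[int, Version] = {start: (start, end) for start, end in rowsets}

    def capture_local(self, target_end: int) -> List[Version]:
        """Tile [0, target_end] with local rowsets, or raise."""
        path: List[Version] = []
        next_start = 0
        while next_start <= target_end:
            rowset = self.rowsets.get(next_start)
            if rowset is None:
                raise LookupError(f"{self.name}: version {next_start} is missing")
            if rowset[1] > target_end:
                raise VersionsAlreadyMerged(
                    f"{self.name}: versions {rowset[0]}-{rowset[1]} already merged"
                )
            path.append(rowset)
            next_start = rowset[1] + 1
        return path


def capture_with_remote_fallback(
    local: Backend,
    peers: List[Backend],
    target_end: int,
    enable_remote_fetch: bool = True,
) -> Tuple[str, List[Version]]:
    """Return (name of the BE that served the rowsets, rowset path)."""
    try:
        # Unchanged path: the local BE can serve the request itself.
        return local.name, local.capture_local(target_end)
    except VersionsAlreadyMerged as local_error:
        if not enable_remote_fetch:
            # Backward-compatible behavior: surface the original error.
            raise
        for peer in peers:
            try:
                return peer.name, peer.capture_local(target_end)
            except (VersionsAlreadyMerged, LookupError):
                continue  # this peer cannot help either, try the next one
        # No peer could serve the range, so report the original failure.
        raise local_error
```

For example, if `be_a` holds `[(0, 1), (2, 4)]` and `be_b` holds `[(0, 1), (2, 2), (3, 3), (4, 4)]`, a request on `be_a` for versions up to 3 is served by `be_b`, while a request up to 4 is still served locally. With `enable_remote_fetch=False` the first request fails exactly as before.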

Introduce a remote rowset fetching mechanism to prevent E-230 (“versions already merged”) errors in read-write splitting scenarios. This improves BE fault tolerance when some nodes have merged versions that others have not yet synchronized.

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None


… rowsets (apache#52995)

Related PR: apache#52440

@Hastyshell Hastyshell requested a review from yiguolei as a code owner October 23, 2025 08:06
@Hastyshell (Collaborator, Author) commented:

run buildall

@hello-stephen (Contributor) commented:

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See "How to process your PR".

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yiguolei yiguolei merged commit bd49d2d into apache:branch-4.0 Oct 24, 2025
27 of 31 checks passed