Conversation

@sollhui
Contributor

@sollhui sollhui commented Jul 21, 2025

pick (#53374)

The split offsets for concurrent reads of a file are determined in the plan phase. If a split point falls in the middle of a multi-character line delimiter:

  • The preceding reader reads all of row1 and then reads slightly past the end of its range to consume the complete line delimiter.
  • The following reader starts in the middle of the multi-character line delimiter, so row2 is the first line it sees. Because the first line of a middle range (one that does not start at the beginning of the file) is always discarded, row2 is lost.
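
The scenario above can be reproduced with a small toy model. This is only an illustrative sketch and not the Doris reader: the function names, the `||` delimiter, the sample data, and the rule that a reader keeps reading lines that start at or before the end of its range are all assumptions made for the example. The `back_up` option shows one possible way to avoid the loss in this model (starting the delimiter search a few bytes before the range start); it is not taken from this PR's diff.

```python
# Toy model of range-based line reading. All names here are hypothetical.

def read_range(data: bytes, delim: bytes, start: int, end: int,
               back_up: bool = False) -> list:
    """Return the rows owned by the byte range [start, end)."""
    pos = start
    if start > 0:
        # A middle range always discards its first (possibly partial) line.
        search_from = start
        if back_up:
            # Assumed remedy: look back far enough to notice a delimiter
            # that straddles the range start.
            search_from = max(0, start - (len(delim) - 1))
        hit = data.find(delim, search_from)
        if hit == -1:
            return []
        pos = hit + len(delim)

    rows = []
    # Assumed convention: a reader owns every line that starts at or
    # before `end`, and may read past `end` to finish that line.
    while pos <= end and pos < len(data):
        hit = data.find(delim, pos)
        if hit == -1:
            rows.append(data[pos:])
            break
        rows.append(data[pos:hit])
        pos = hit + len(delim)
    return rows


def read_all(data: bytes, delim: bytes, split: int,
             back_up: bool = False) -> list:
    """Read the file as two ranges, [0, split) and [split, len(data))."""
    first = read_range(data, delim, 0, split, back_up)
    second = read_range(data, delim, split, len(data), back_up)
    return first + second


DATA = b"row1||row2||row3||"
DELIM = b"||"

# Offset 5 is the second byte of the first "||" delimiter.
print(read_all(DATA, DELIM, 5))                # row2 is missing
print(read_all(DATA, DELIM, 5, back_up=True))  # all three rows
```

With the split at offset 5, the second reader's search for a delimiter starts at the trailing `|` of the first delimiter, so the first complete delimiter it finds is the one after row2, and row2 is skipped as if it were a partial line.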

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

@sollhui sollhui requested a review from yiguolei as a code owner July 21, 2025 06:47
@sollhui
Contributor Author

sollhui commented Jul 21, 2025

run buildall

@Thearas
Contributor

Thearas commented Jul 21, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/5) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 39.09% (10349/26473)
Line Coverage 30.09% (86006/285839)
Region Coverage 28.80% (44438/154308)
Branch Coverage 25.52% (22749/89154)

…r line delimiter (apache#53374)

The split offsets for concurrent reads of a file are determined in the
plan phase. If a split point falls in the middle of a multi-character
line delimiter:

- The preceding reader reads all of row1 and then reads slightly past
the end of its range to consume the complete line delimiter.
- The following reader starts in the middle of the multi-character line
delimiter, so row2 is the first line it sees. Because the first line of
a middle range is always discarded, row2 is lost.
@yiguolei yiguolei force-pushed the 2.1_csv_reader_loss_data branch from 060f09b to 2f9223b Compare July 26, 2025 15:34
@yiguolei
Contributor

run buildall

@doris-robot

BE UT Coverage Report

Increment line coverage 0.00% (0/5) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 39.09% (10349/26472)
Line Coverage 30.09% (86015/285838)
Region Coverage 28.81% (44460/154309)
Branch Coverage 25.53% (22758/89156)

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jul 26, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit 3726272 into apache:branch-2.1 Jul 26, 2025
19 of 20 checks passed

Labels

approved Indicates a PR has been approved by one committer. reviewed

5 participants