[Enhancement](json load) Set jsonload's default behavior to be read_json_by_line #55861
…ata loss in read_json_by_line
TPC-H: Total hot run time: 34803 ms
TPC-DS: Total hot run time: 189468 ms
ClickBench: Total hot run time: 30.39 s
run buildall
TPC-H: Total hot run time: 34617 ms
TPC-DS: Total hot run time: 190059 ms
ClickBench: Total hot run time: 29.9 s
run buildall
ClickBench: Total hot run time: 30.98 s
liaoxin01
left a comment
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
dataroaring
left a comment
LGTM
…son_by_line (#55861)

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: [doc-2874](apache/doris-website#2874)

Problem Summary:

This PR changes the default handling of the read_json_by_line and strip_outer_array parameters. read_json_by_line is expected to be gradually deprecated, and some users forget to specify either parameter when importing JSON files. With this change, if the user specifies neither parameter, read_json_by_line defaults to true.

Behavior patterns after this PR:

1. In scenarios such as S3 load, read_json_by_line is hardcoded to true, because it is not only a JSON format option but also the switch for streaming reads of JSON files. JSON formats that require this parameter to be false are therefore not supported in these scenarios.
2. In scenarios such as Stream Load, users are free to specify any combination of values for the two parameters, although we do not normally expect users to explicitly set either to false.

### Release note

None

### Check List (For Author)

- Test
    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason
- Behavior changed:
    - [x] No.
    - [ ] Yes.
- Does this need documentation?
    - [x] No.
    - [ ] Yes.

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
What problem does this PR solve?
Issue Number: close #xxx
Related PR: doc-2874
Problem Summary:
This PR changes the default handling of the read_json_by_line and strip_outer_array parameters. read_json_by_line is expected to be gradually deprecated, and some users forget to specify either parameter when importing JSON files. With this change, if the user specifies neither parameter, read_json_by_line defaults to true.
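For readers unfamiliar with the two parameters, the sketch below illustrates the two input shapes they correspond to: one JSON object per line (read_json_by_line) versus a single top-level JSON array whose elements are rows (strip_outer_array). It is an illustration in plain Python only, not Doris's reader; the function names and sample data are hypothetical.

```python
import json
from typing import Any, List


def parse_by_line(text: str) -> List[Any]:
    """Hypothetical helper: treat every non-empty line as one JSON document (one row)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def parse_outer_array(text: str) -> List[Any]:
    """Hypothetical helper: treat the whole input as one JSON array; each element is a row."""
    rows = json.loads(text)
    if not isinstance(rows, list):
        raise ValueError("expected a top-level JSON array")
    return rows


# Sample data invented for illustration.
by_line_input = '{"id": 1, "city": "Beijing"}\n{"id": 2, "city": "Shanghai"}\n'
array_input = '[{"id": 1, "city": "Beijing"}, {"id": 2, "city": "Shanghai"}]'

# Both shapes describe the same two rows.
assert parse_by_line(by_line_input) == parse_outer_array(array_input)
```

The line-oriented shape can be consumed incrementally, one line at a time, which is consistent with the PR's remark that read_json_by_line also acts as the switch for streaming reads.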
Behavior patterns after this PR:
1. In scenarios such as S3 load, read_json_by_line is hardcoded to true, because it is not only a JSON format option but also the switch for streaming reads of JSON files. JSON formats that require this parameter to be false are therefore not supported in these scenarios.
2. In scenarios such as Stream Load, users are free to specify any combination of values for the two parameters, although we do not normally expect users to explicitly set either to false.
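The two behavior patterns above can be summarized as a small decision function. This is a minimal sketch under stated assumptions, not the code changed in this PR: the function name, the `load_type` strings, and the tuple return shape are hypothetical, and treating a parameter that is left unset (while the other one is set) as false is an assumption about the fallback rather than something the description states.

```python
from typing import Optional, Tuple


def resolve_json_options(
    load_type: str,
    read_json_by_line: Optional[bool] = None,
    strip_outer_array: Optional[bool] = None,
) -> Tuple[bool, bool]:
    """Return the effective (read_json_by_line, strip_outer_array) pair.

    None means "the user did not specify this parameter".
    """
    if load_type == "s3":
        # Pattern 1: read_json_by_line is hardcoded to true,
        # regardless of what the user passed.
        return True, bool(strip_outer_array)

    # Pattern 2 (e.g. Stream Load): the user's choices are honored.
    if read_json_by_line is None and strip_outer_array is None:
        # Neither parameter specified: read_json_by_line defaults to true.
        return True, False

    # Assumption: a parameter left unset falls back to false.
    return bool(read_json_by_line), bool(strip_outer_array)


print(resolve_json_options("stream_load"))
print(resolve_json_options("s3", read_json_by_line=False))
print(resolve_json_options("stream_load", False, True))
```

Using `None` for "not specified" keeps the unset case distinct from an explicit false, which is the distinction the new default depends on.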
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)