@github-actions
Contributor

Cherry-picked from #55861

…son_by_line (#55861)

### What problem does this PR solve?

Issue Number: close #xxx

Related PR:
[doc-2874](apache/doris-website#2874)

Problem Summary:

This PR changes the default behavior of the read_json_by_line and
strip_outer_array parameters. The first parameter, read_json_by_line,
is expected to be deprecated gradually, and some users forget to specify
these two parameters when importing JSON files. After this change, if
the user specifies a value for neither parameter, read_json_by_line
defaults to true.
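
For context, the two parameters correspond to two file layouts: with read_json_by_line, each line of the file is one JSON object; with strip_outer_array, the file is a single JSON array whose elements become rows. The following is a minimal Python sketch of that difference. The function names and sample data are hypothetical and only illustrate the layouts; this is not Doris's reader.

```python
import json


def rows_by_line(text):
    """Parse text in which every non-empty line is one JSON object
    (the layout that read_json_by_line=true expects)."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def rows_from_outer_array(text):
    """Parse text that is one top-level JSON array and return its elements
    (the layout that strip_outer_array=true expects)."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    return data


# Hypothetical sample data: the same two rows in both layouts.
by_line = '{"id": 1, "city": "a"}\n{"id": 2, "city": "b"}\n'
outer_array = '[{"id": 1, "city": "a"}, {"id": 2, "city": "b"}]'

# Both layouts carry the same rows; only the framing differs.
assert rows_by_line(by_line) == rows_from_outer_array(outer_array)
```

A file in the first layout can be consumed one line at a time, which is presumably why the description ties read_json_by_line to streaming reads.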

Behavior patterns after this PR:

1. In scenarios such as S3 load, read_json_by_line is hardcoded to true,
because it is not only a JSON import option but also the switch for
streaming JSON file reading. As a result, JSON formats that require this
parameter to be false are not supported in such scenarios.
2. In scenarios such as Stream Load, users are free to specify any
combination of values for the two parameters (though we do not normally
expect users to explicitly set either one to false).
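
The two behavior patterns above can be sketched as a small resolution function. This is a hedged Python illustration, not the code in this PR: the function name, the `load_kind` strings, and the use of `None` for "not specified" are all hypothetical. It also assumes that when only one of the two parameters is given, the other is treated as false, which the description does not spell out.

```python
from typing import Optional, Tuple


def resolve_json_flags(
    load_kind: str,
    read_json_by_line: Optional[bool] = None,
    strip_outer_array: Optional[bool] = None,
) -> Tuple[bool, bool]:
    """Return the effective (read_json_by_line, strip_outer_array) pair.

    None means the user did not specify the parameter.
    """
    if load_kind == "s3":
        # Pattern 1: read_json_by_line is hardcoded to true, whatever was passed.
        return True, bool(strip_outer_array)
    if load_kind == "stream":
        if read_json_by_line is None and strip_outer_array is None:
            # New default: neither parameter given, so read line by line.
            return True, False
        # Pattern 2: honor whatever the user specified.
        # Assumption: an unspecified parameter counts as false here.
        return bool(read_json_by_line), bool(strip_outer_array)
    raise ValueError(f"unknown load kind: {load_kind!r}")


# Stream Load with no JSON parameters picks up the new default.
assert resolve_json_flags("stream") == (True, False)
# S3 load ignores an explicit false for read_json_by_line.
assert resolve_json_flags("s3", read_json_by_line=False) == (True, False)
```

Keeping "not specified" distinct from an explicit false is what lets the new default apply only when the user has expressed no preference.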

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [x] No.
    - [ ] Yes. <!-- Add document PR link here. eg: apache/doris-website#1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
@Thearas
Contributor

Thearas commented Sep 28, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Sep 28, 2025
@Thearas
Contributor

Thearas commented Sep 28, 2025

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage `` 🎉
Increment coverage report
Complete coverage report

@doris-robot

ClickBench: Total hot run time: 29.88 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 19c8866b3afcf1630bfed0a07c45e25b2e0bbea9, data reload: true

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.03	0.03
query2	0.11	0.04	0.03
query3	0.27	0.07	0.07
query4	1.75	0.11	0.11
query5	0.26	0.26	0.25
query6	1.24	0.66	0.63
query7	0.02	0.03	0.02
query8	0.06	0.04	0.04
query9	0.64	0.51	0.52
query10	0.58	0.58	0.56
query11	0.18	0.11	0.10
query12	0.16	0.11	0.11
query13	0.62	0.61	0.61
query14	0.80	0.89	0.79
query15	0.88	0.85	0.86
query16	0.39	0.41	0.40
query17	1.05	1.02	1.07
query18	0.19	0.18	0.19
query19	1.89	1.86	1.86
query20	0.02	0.01	0.02
query21	15.51	0.94	0.59
query22	0.80	1.00	0.71
query23	15.24	1.36	0.62
query24	16.30	0.43	0.19
query25	0.17	0.08	0.09
query26	0.37	0.15	0.13
query27	0.06	0.05	0.06
query28	10.67	0.97	0.93
query29	12.69	3.86	3.21
query30	0.31	0.13	0.11
query31	3.02	0.58	0.38
query32	3.30	0.56	0.48
query33	3.09	3.12	3.14
query34	17.47	5.79	5.09
query35	5.13	5.14	5.14
query36	0.73	0.55	0.52
query37	0.11	0.07	0.07
query38	0.06	0.04	0.04
query39	0.04	0.03	0.02
query40	0.18	0.16	0.14
query41	0.09	0.03	0.03
query42	0.04	0.04	0.02
query43	0.05	0.04	0.03
Total cold run time: 116.57 s
Total hot run time: 29.88 s

@yiguolei yiguolei merged commit 01e631f into branch-4.0 Sep 29, 2025
20 of 24 checks passed

7 participants