@eldenmoon
Member

Previously, PR #37794 allowed invalid text in variant columns and stored it as a string type. However, when encoding the row store we still CHECK that the JSON is valid and store it as a JSONB binary field. This PR adds support for encoding invalid JSON in the row store.

Proposed changes

Issue Number: close #xxx

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@eldenmoon
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 37983 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 036a1c3b6dabb8a5997468135639d90c153eb7a5, data reload: false

------ Round 1 ----------------------------------
q1	17943	4380	4299	4299
q2	2048	179	173	173
q3	10742	1259	1047	1047
q4	10535	751	733	733
q5	7780	2795	2843	2795
q6	229	140	146	140
q7	983	603	602	602
q8	9602	2055	2075	2055
q9	7404	6688	6669	6669
q10	7761	2277	2165	2165
q11	462	249	245	245
q12	432	223	227	223
q13	17761	3005	2991	2991
q14	287	256	246	246
q15	522	475	469	469
q16	499	392	384	384
q17	975	678	701	678
q18	7427	6849	6871	6849
q19	5141	1067	1129	1067
q20	674	325	339	325
q21	3960	2872	2816	2816
q22	1125	1032	1012	1012
Total cold run time: 114292 ms
Total hot run time: 37983 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4625	4276	4301	4276
q2	375	272	265	265
q3	2843	2660	2627	2627
q4	1918	1689	1676	1676
q5	5667	5634	5594	5594
q6	222	138	134	134
q7	2184	1817	1804	1804
q8	3301	3496	3460	3460
q9	8820	8609	8746	8609
q10	3535	3290	3314	3290
q11	597	523	516	516
q12	808	617	635	617
q13	17008	3173	3256	3173
q14	320	275	288	275
q15	531	486	485	485
q16	490	451	442	442
q17	1825	1560	1543	1543
q18	8176	8094	7729	7729
q19	1771	1519	1750	1519
q20	2151	1820	1803	1803
q21	5529	5365	5449	5365
q22	1157	1057	1091	1057
Total cold run time: 73853 ms
Total hot run time: 56259 ms

@doris-robot

TPC-H: Total hot run time: 37434 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 036a1c3b6dabb8a5997468135639d90c153eb7a5, data reload: false

------ Round 1 ----------------------------------
q1	17616	4348	4231	4231
q2	2035	180	174	174
q3	11877	1033	1092	1033
q4	10500	722	768	722
q5	7748	2823	2773	2773
q6	223	137	137	137
q7	945	599	592	592
q8	9522	2050	2045	2045
q9	8629	6501	6533	6501
q10	7009	2125	2240	2125
q11	503	237	244	237
q12	397	220	219	219
q13	18039	3004	2988	2988
q14	281	234	242	234
q15	511	493	493	493
q16	505	385	385	385
q17	959	681	713	681
q18	7325	6786	6670	6670
q19	6772	1023	992	992
q20	712	327	345	327
q21	3824	2875	2896	2875
q22	1080	1026	1000	1000
Total cold run time: 117012 ms
Total hot run time: 37434 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4491	4262	4272	4262
q2	383	276	261	261
q3	2859	2625	2576	2576
q4	1917	1724	1646	1646
q5	5716	5727	5566	5566
q6	232	131	130	130
q7	2161	1718	1777	1718
q8	3370	3438	3451	3438
q9	8739	8729	8873	8729
q10	3589	3264	3259	3259
q11	603	522	505	505
q12	806	627	621	621
q13	17018	3217	3220	3217
q14	326	311	286	286
q15	530	515	494	494
q16	502	439	452	439
q17	1843	1519	1524	1519
q18	8113	8121	7995	7995
q19	1735	1667	1341	1341
q20	2153	1879	1904	1879
q21	5659	5598	5150	5150
q22	1099	1028	1059	1028
Total cold run time: 73844 ms
Total hot run time: 56059 ms

@doris-robot

TPC-DS: Total hot run time: 189051 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 036a1c3b6dabb8a5997468135639d90c153eb7a5, data reload: false

query1	1237	881	870	870
query2	6436	1960	1912	1912
query3	10621	3947	3726	3726
query4	58777	24291	23215	23215
query5	5700	491	489	489
query6	488	169	162	162
query7	6231	292	286	286
query8	288	205	189	189
query9	8883	2436	2406	2406
query10	500	270	256	256
query11	18014	14922	15142	14922
query12	157	101	105	101
query13	1577	371	372	371
query14	11140	7199	6723	6723
query15	230	177	170	170
query16	7673	511	467	467
query17	1130	583	596	583
query18	2073	305	293	293
query19	294	153	147	147
query20	124	112	113	112
query21	211	115	105	105
query22	4577	4370	4389	4370
query23	34612	33399	33512	33399
query24	5722	2863	2853	2853
query25	514	389	381	381
query26	690	156	160	156
query27	1788	282	278	278
query28	3819	2050	2056	2050
query29	693	394	395	394
query30	240	158	151	151
query31	928	741	734	734
query32	81	52	56	52
query33	438	286	275	275
query34	857	452	464	452
query35	811	731	691	691
query36	1096	902	930	902
query37	135	79	76	76
query38	4007	3856	3797	3797
query39	1449	1372	1433	1372
query40	192	112	117	112
query41	45	44	42	42
query42	123	101	97	97
query43	505	456	465	456
query44	1085	729	731	729
query45	193	165	165	165
query46	1077	771	702	702
query47	1838	1750	1804	1750
query48	360	279	292	279
query49	743	405	411	405
query50	800	401	411	401
query51	6877	6759	6738	6738
query52	107	89	89	89
query53	253	190	178	178
query54	560	451	450	450
query55	76	77	78	77
query56	263	246	243	243
query57	1141	1053	1045	1045
query58	223	227	220	220
query59	2888	2858	2889	2858
query60	294	259	269	259
query61	100	98	98	98
query62	749	635	649	635
query63	202	183	179	179
query64	3148	1702	1686	1686
query65	3191	3125	3149	3125
query66	666	386	337	337
query67	15066	14718	14903	14718
query68	4507	529	549	529
query69	412	273	260	260
query70	1128	1152	1099	1099
query71	355	274	274	274
query72	6395	2289	1993	1993
query73	759	318	319	318
query74	9432	8812	8800	8800
query75	3348	2732	2674	2674
query76	1827	921	995	921
query77	536	314	300	300
query78	9617	9197	8877	8877
query79	2621	518	528	518
query80	2084	491	485	485
query81	546	225	222	222
query82	608	137	134	134
query83	265	151	148	148
query84	271	73	73	73
query85	1174	365	272	272
query86	450	306	259	259
query87	4464	4173	4226	4173
query88	4124	2296	2313	2296
query89	399	288	297	288
query90	1926	197	196	196
query91	123	96	97	96
query92	65	52	53	52
query93	2984	531	529	529
query94	910	295	295	295
query95	349	254	260	254
query96	614	269	274	269
query97	3240	3015	3042	3015
query98	220	203	201	201
query99	1607	1270	1289	1270
Total cold run time: 314366 ms
Total hot run time: 189051 ms

@doris-robot

ClickBench: Total hot run time: 30.39 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 036a1c3b6dabb8a5997468135639d90c153eb7a5, data reload: false

query1	0.05	0.04	0.04
query2	0.08	0.04	0.04
query3	0.23	0.05	0.05
query4	1.68	0.07	0.08
query5	0.51	0.47	0.50
query6	1.12	0.72	0.73
query7	0.02	0.01	0.02
query8	0.05	0.04	0.04
query9	0.55	0.47	0.49
query10	0.54	0.54	0.55
query11	0.15	0.12	0.12
query12	0.14	0.11	0.12
query13	0.59	0.59	0.58
query14	0.76	0.80	0.77
query15	0.86	0.80	0.82
query16	0.36	0.36	0.38
query17	0.97	1.01	0.97
query18	0.22	0.21	0.21
query19	1.77	1.67	1.76
query20	0.01	0.01	0.00
query21	15.40	0.73	0.66
query22	4.33	7.73	1.68
query23	18.26	1.33	1.30
query24	2.06	0.24	0.22
query25	0.13	0.08	0.08
query26	0.30	0.20	0.21
query27	0.45	0.22	0.22
query28	13.32	1.02	1.00
query29	12.63	3.32	3.28
query30	0.23	0.06	0.06
query31	2.89	0.38	0.39
query32	3.28	0.48	0.47
query33	3.00	2.97	2.96
query34	17.14	4.40	4.40
query35	4.45	4.38	4.44
query36	0.64	0.47	0.48
query37	0.19	0.15	0.16
query38	0.15	0.16	0.16
query39	0.05	0.03	0.04
query40	0.15	0.13	0.12
query41	0.08	0.04	0.04
query42	0.06	0.05	0.06
query43	0.05	0.04	0.04
Total cold run time: 109.9 s
Total hot run time: 30.39 s

Contributor

@amorynan amorynan left a comment

LGTM

@github-actions
Contributor

PR approved by anyone and no changes requested.

xiaokang previously approved these changes Aug 15, 2024
Contributor

@xiaokang xiaokang left a comment

LGTM

@github-actions bot added the approved label (indicates a PR has been approved by one committer) on Aug 15, 2024
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

bool succ = json_parser.parse(value_str.data(), value_str.size());
// maybe more graceful, it is ok to check here since data could be parsed
CHECK(succ);
if (!succ) {
Contributor

if (succ) {
    // write binary
} else {
    // write string
}

Member Author

done
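
For readers following the thread, the suggested fallback can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code merged in this PR: `RowStoreEncoding`, `EncodedField`, `JsonToBinary`, `encode_variant`, and `toy_parse` are all hypothetical names, and `toy_parse` is only a stand-in for a real JSON parser. The one point it shows is that a failed parse selects the string path instead of aborting through `CHECK(succ)`.

```cpp
#include <functional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical tag recording how a variant value was encoded for the row store.
enum class RowStoreEncoding { JsonbBinary, RawString };

// Hypothetical container for one encoded field.
struct EncodedField {
    RowStoreEncoding encoding;
    std::string bytes;
};

// Hypothetical parser hook: returns true and fills `out` with a binary form
// when `text` is valid JSON; returns false otherwise.
using JsonToBinary = std::function<bool(std::string_view, std::string&)>;

// Encode one variant value. When parsing fails, fall back to the raw text
// instead of aborting the process the way CHECK(succ) would.
EncodedField encode_variant(std::string_view text, const JsonToBinary& parse) {
    std::string binary;
    const bool succ = parse(text, binary);
    if (succ) {
        // write binary
        return {RowStoreEncoding::JsonbBinary, std::move(binary)};
    }
    // write string
    return {RowStoreEncoding::RawString, std::string(text)};
}

// Toy stand-in for a real JSON parser, for illustration only: it accepts any
// text wrapped in braces and "encodes" it by prefixing a marker byte.
inline bool toy_parse(std::string_view text, std::string& out) {
    if (text.size() < 2 || text.front() != '{' || text.back() != '}') {
        return false;
    }
    out.assign(1, '\x01');
    out.append(text);
    return true;
}
```

Calling `encode_variant("not json", toy_parse)` takes the string branch and keeps the original text, while a brace-wrapped input takes the binary branch. A reader of the row store would need the encoding tag (or an equivalent type marker) to know which form to decode.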

@github-actions
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@github-actions bot removed the approved label (indicates a PR has been approved by one committer) on Aug 15, 2024
@eldenmoon
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 38409 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 323277a7aa016f4a11c9317e1146fcf04808c9af, data reload: false

------ Round 1 ----------------------------------
q1	17891	4541	4340	4340
q2	2079	212	213	212
q3	11101	1121	1121	1121
q4	10545	745	883	745
q5	7795	2825	2878	2825
q6	264	148	149	148
q7	987	636	640	636
q8	9367	2095	2103	2095
q9	7312	6572	6611	6572
q10	7089	2253	2186	2186
q11	478	272	263	263
q12	417	246	243	243
q13	17769	2997	2980	2980
q14	301	272	260	260
q15	565	510	516	510
q16	518	416	405	405
q17	1000	695	683	683
q18	7481	6849	6804	6804
q19	6216	1088	1162	1088
q20	711	361	373	361
q21	3840	2918	3149	2918
q22	1103	1014	1031	1014
Total cold run time: 114829 ms
Total hot run time: 38409 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4471	4372	4316	4316
q2	411	297	301	297
q3	2852	2699	2690	2690
q4	1974	1687	1699	1687
q5	5728	5633	5793	5633
q6	250	150	146	146
q7	2186	1805	1826	1805
q8	3334	3508	3542	3508
q9	8819	8793	8747	8747
q10	3589	3375	3346	3346
q11	666	538	535	535
q12	874	664	674	664
q13	16895	3231	3170	3170
q14	323	296	285	285
q15	559	515	519	515
q16	511	465	458	458
q17	1894	1573	1534	1534
q18	8351	7888	7738	7738
q19	4121	1706	1636	1636
q20	2223	1913	1878	1878
q21	14928	5538	5324	5324
q22	1182	1067	1054	1054
Total cold run time: 86141 ms
Total hot run time: 56966 ms

Contributor

@xiaokang xiaokang left a comment

LGTM

@github-actions bot added the approved label (indicates a PR has been approved by one committer) on Aug 15, 2024
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@doris-robot

TPC-DS: Total hot run time: 196570 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 323277a7aa016f4a11c9317e1146fcf04808c9af, data reload: false

query1	1329	911	897	897
query2	6553	2033	1936	1936
query3	10587	3956	3713	3713
query4	58229	26898	23472	23472
query5	5703	644	635	635
query6	475	185	189	185
query7	5790	322	315	315
query8	447	371	363	363
query9	8992	2526	2491	2491
query10	528	313	328	313
query11	17921	15078	15469	15078
query12	194	129	126	126
query13	1613	443	462	443
query14	10977	7471	7438	7438
query15	262	190	201	190
query16	7553	538	547	538
query17	1167	651	653	651
query18	2080	350	349	349
query19	312	172	175	172
query20	152	137	140	137
query21	256	146	145	145
query22	4614	4278	4428	4278
query23	34573	33835	33780	33780
query24	6030	3012	2983	2983
query25	568	435	429	429
query26	714	187	185	185
query27	1768	307	308	307
query28	4210	2227	2200	2200
query29	714	473	460	460
query30	245	198	210	198
query31	1009	852	830	830
query32	100	82	79	79
query33	531	341	343	341
query34	906	495	503	495
query35	875	784	794	784
query36	1119	957	955	955
query37	153	108	125	108
query38	4024	3879	3847	3847
query39	1540	1483	1494	1483
query40	238	157	152	152
query41	139	136	137	136
query42	142	118	119	118
query43	565	534	546	534
query44	1170	786	818	786
query45	221	194	196	194
query46	1130	777	771	771
query47	1900	1858	1867	1858
query48	410	343	340	340
query49	911	588	578	578
query50	869	493	466	466
query51	6874	6728	6778	6728
query52	124	108	113	108
query53	311	241	226	226
query54	627	527	497	497
query55	91	91	91	91
query56	332	318	315	315
query57	1206	1124	1127	1124
query58	306	319	303	303
query59	3103	2856	2761	2761
query60	345	322	334	322
query61	147	151	143	143
query62	790	687	705	687
query63	265	231	230	230
query64	3265	1835	1916	1835
query65	3235	3186	3201	3186
query66	1023	676	667	667
query67	15413	14926	14838	14838
query68	6505	592	590	590
query69	703	429	335	335
query70	1231	1180	1180	1180
query71	549	328	332	328
query72	6570	2371	2149	2149
query73	832	365	363	363
query74	9475	8986	8893	8893
query75	3967	2789	2830	2789
query76	3765	1057	969	969
query77	842	459	447	447
query78	9983	9205	9047	9047
query79	3924	571	561	561
query80	2184	605	625	605
query81	603	265	265	265
query82	842	159	156	156
query83	355	222	217	217
query84	286	101	106	101
query85	1105	343	392	343
query86	484	325	331	325
query87	4436	4236	4232	4232
query88	4451	2495	2501	2495
query89	449	319	325	319
query90	1977	227	227	227
query91	154	126	126	126
query92	88	77	75	75
query93	5414	560	545	545
query94	887	367	349	349
query95	393	301	294	294
query96	620	294	283	283
query97	3228	3155	3081	3081
query98	244	236	227	227
query99	1637	1342	1331	1331
Total cold run time: 328148 ms
Total hot run time: 196570 ms

@doris-robot

ClickBench: Total hot run time: 31.58 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 323277a7aa016f4a11c9317e1146fcf04808c9af, data reload: false

query1	0.04	0.04	0.04
query2	0.08	0.04	0.04
query3	0.23	0.05	0.05
query4	1.66	0.08	0.08
query5	0.48	0.49	0.49
query6	1.12	0.75	0.73
query7	0.02	0.01	0.02
query8	0.06	0.05	0.05
query9	0.55	0.50	0.49
query10	0.56	0.56	0.53
query11	0.16	0.12	0.12
query12	0.16	0.13	0.13
query13	0.61	0.61	0.60
query14	0.77	0.78	0.80
query15	0.86	0.83	0.82
query16	0.37	0.37	0.36
query17	1.05	1.06	0.98
query18	0.25	0.23	0.22
query19	1.78	1.70	1.75
query20	0.02	0.01	0.01
query21	15.41	0.87	0.68
query22	4.08	7.32	2.19
query23	18.28	1.36	1.34
query24	2.13	0.22	0.23
query25	0.15	0.09	0.09
query26	0.31	0.22	0.21
query27	0.46	0.24	0.23
query28	13.28	1.04	1.03
query29	12.65	3.35	3.33
query30	0.42	0.24	0.23
query31	2.81	0.42	0.40
query32	3.23	0.51	0.50
query33	3.00	3.00	2.93
query34	17.22	4.33	4.38
query35	4.44	4.41	4.40
query36	0.69	0.49	0.50
query37	0.20	0.17	0.18
query38	0.18	0.17	0.17
query39	0.06	0.06	0.06
query40	0.18	0.15	0.15
query41	0.11	0.07	0.08
query42	0.08	0.08	0.07
query43	0.07	0.06	0.07
Total cold run time: 110.27 s
Total hot run time: 31.58 s

@eldenmoon eldenmoon merged commit 1deab37 into apache:master Aug 15, 2024
@eldenmoon eldenmoon deleted the var-invlid-rs branch August 16, 2024 02:08
eldenmoon added a commit to eldenmoon/incubator-doris that referenced this pull request Aug 16, 2024
…pe (apache#39394)

Previous we allow invalid text as variant in PR apache#37794 and store as
string type.But in encoding rowstore we CHECK the json is valid and
store as jsonb binary field.In this PR we support the invalid json
encoding as row store
dataroaring pushed a commit that referenced this pull request Aug 17, 2024
…pe (#39394)

Previous we allow invalid text as variant in PR #37794 and store as
string type.But in encoding rowstore we CHECK the json is valid and
store as jsonb binary field.In this PR we support the invalid json
encoding as row store
@gavinchou gavinchou mentioned this pull request Oct 13, 2024

Labels

approved (indicates a PR has been approved by one committer), dev/2.1.6-merged, dev/3.0.2-merged, reviewed

5 participants