Skip to content

Conversation

@sjyango
Copy link
Contributor

@sjyango sjyango commented Nov 7, 2023

Proposed changes

Issue Number: close #21370

Add unit test codes for IP types. Based on pr 24965

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

}

static void set_to_max(void* buf) {
*reinterpret_cast<uint32_t*>(buf) = 0xFFFFFFFF; // 255.255.255.255
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 0xFFFFFFFF is a magic number; consider replacing it with a named constant [readability-magic-numbers]

        *reinterpret_cast<uint32_t*>(buf) = 0xFFFFFFFF; // 255.255.255.255
                                            ^

Comment on lines 65 to 66
Status DataTypeIPv4SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num, BufferWritable& bw,
FormatOptions& options) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'serialize_one_cell_to_json' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status DataTypeIPv4SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num, BufferWritable& bw,
FormatOptions& options) const {
static Status DataTypeIPv4SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num, BufferWritable& bw,
FormatOptions& options) {

Comment on lines +77 to +79
Status DataTypeIPv4SerDe::deserialize_one_cell_from_json(IColumn& column, Slice& slice,
const FormatOptions& options) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'deserialize_one_cell_from_json' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status DataTypeIPv4SerDe::deserialize_one_cell_from_json(IColumn& column, Slice& slice,
const FormatOptions& options) const {
static Status DataTypeIPv4SerDe::deserialize_one_cell_from_json(IColumn& column, Slice& slice,
const FormatOptions& options) {

Comment on lines 65 to 67
Status DataTypeIPv6SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num, BufferWritable& bw,
FormatOptions& options) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'serialize_one_cell_to_json' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status DataTypeIPv6SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num, BufferWritable& bw,
FormatOptions& options) const {
static Status DataTypeIPv6SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num, BufferWritable& bw,
FormatOptions& options) {

Comment on lines +77 to +79
Status DataTypeIPv6SerDe::deserialize_one_cell_from_json(IColumn& column, Slice& slice,
const FormatOptions& options) const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'deserialize_one_cell_from_json' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status DataTypeIPv6SerDe::deserialize_one_cell_from_json(IColumn& column, Slice& slice,
const FormatOptions& options) const {
static Status DataTypeIPv6SerDe::deserialize_one_cell_from_json(IColumn& column, Slice& slice,
const FormatOptions& options) {

{"k6", FieldType::OLAP_FIELD_TYPE_DECIMAL64, 6, TYPE_DECIMAL64, false},
{"k12", FieldType::OLAP_FIELD_TYPE_DATETIMEV2, 12, TYPE_DATETIMEV2, false},
{"k8", FieldType::OLAP_FIELD_TYPE_IPV4, 8, TYPE_IPV4, false},
{"k9", FieldType::OLAP_FIELD_TYPE_IPV6, 9, TYPE_IPV6, false},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 9 is a magic number; consider replacing it with a named constant [readability-magic-numbers]

                {"k9", FieldType::OLAP_FIELD_TYPE_IPV6, 9, TYPE_IPV6, false},
                                                        ^

{
auto vec = vectorized::ColumnVector<IPv4>::create();
auto& data = vec->get_data();
for (int i = 0; i < 1024; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 1024 is a magic number; consider replacing it with a named constant [readability-magic-numbers]

        for (int i = 0; i < 1024; ++i) {
                            ^

{
auto vec = vectorized::ColumnVector<IPv6>::create();
auto& data = vec->get_data();
for (int i = 0; i < 1024; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 1024 is a magic number; consider replacing it with a named constant [readability-magic-numbers]

        for (int i = 0; i < 1024; ++i) {
                            ^

{
auto vec = vectorized::ColumnVector<IPv4>::create();
auto& data = vec->get_data();
for (int i = 0; i < 1024; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 1024 is a magic number; consider replacing it with a named constant [readability-magic-numbers]

        for (int i = 0; i < 1024; ++i) {
                            ^

{
auto vec = vectorized::ColumnVector<IPv6>::create();
auto& data = vec->get_data();
for (int i = 0; i < 1024; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 1024 is a magic number; consider replacing it with a named constant [readability-magic-numbers]

        for (int i = 0; i < 1024; ++i) {
                            ^

@sjyango
Copy link
Contributor Author

sjyango commented Nov 7, 2023

run buildall

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

return _write_column_to_mysql(column, row_buffer, row_idx, col_const);
}

Status DataTypeIPv4SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'serialize_one_cell_to_json' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status DataTypeIPv4SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num,
static Status DataTypeIPv4SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num,

be/src/vec/data_types/serde/data_type_ipv4_serde.cpp:66:

-                                                      FormatOptions& options) const {
+                                                      FormatOptions& options) {

return _write_column_to_mysql(column, row_buffer, row_idx, col_const);
}

Status DataTypeIPv6SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method 'serialize_one_cell_to_json' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status DataTypeIPv6SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num,
static Status DataTypeIPv6SerDe::serialize_one_cell_to_json(const IColumn& column, int row_num,

be/src/vec/data_types/serde/data_type_ipv6_serde.cpp:66:

-                                                      FormatOptions& options) const {
+                                                      FormatOptions& options) {

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 37.22% (8440/22674)
Line Coverage: 29.58% (68252/230766)
Region Coverage: 28.26% (35321/124977)
Branch Coverage: 25.02% (18041/72108)
Coverage Report: http://coverage.selectdb-in.cc/coverage/2666091be226a370fab873853c5b6310055c2ab6_2666091be226a370fab873853c5b6310055c2ab6/report/index.html

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.26 seconds
stream load tsv: 553 seconds loaded 74807831229 Bytes, about 129 MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17162281103 Bytes

int row_idx, bool col_const) const override;
Status write_column_to_mysql(const IColumn& column, MysqlRowBuffer<false>& row_buffer,
int row_idx, bool col_const) const override;
Status serialize_one_cell_to_json(const IColumn& column, int row_num, BufferWritable& bw,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why here has impl serialize_column_to_json/deserialize_column_from_json

@sjyango
Copy link
Contributor Author

sjyango commented Nov 22, 2023

run PipelineX

@sjyango
Copy link
Contributor Author

sjyango commented Nov 22, 2023

run p0 PipelineX

@sjyango
Copy link
Contributor Author

sjyango commented Nov 22, 2023

run buildall

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.07 seconds
stream load tsv: 573 seconds loaded 74807831229 Bytes, about 124 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.8 seconds inserted 10000000 Rows, about 347K ops/s
storage size: 17097938968 Bytes

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 36.75% (8495/23116)
Line Coverage: 28.99% (69026/238112)
Region Coverage: 27.96% (35702/127705)
Branch Coverage: 24.68% (18200/73748)
Coverage Report: http://coverage.selectdb-in.cc/coverage/87d8e2e0838ed563a6e1a3551f7010120f838415_87d8e2e0838ed563a6e1a3551f7010120f838415/report/index.html

@doris-robot
Copy link

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 87d8e2e0838ed563a6e1a3551f7010120f838415, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4927	4713	4676	4676
q2	356	159	157	157
q3	2029	1930	1900	1900
q4	1385	1266	1219	1219
q5	3974	3938	4013	3938
q6	246	125	138	125
q7	1417	886	893	886
q8	2757	2775	2772	2772
q9	9716	9668	9578	9578
q10	3450	3523	3522	3522
q11	378	244	239	239
q12	434	291	306	291
q13	4572	3796	3802	3796
q14	317	286	287	286
q15	588	528	523	523
q16	660	577	580	577
q17	1128	959	937	937
q18	7751	7237	7371	7237
q19	1676	1678	1673	1673
q20	535	333	287	287
q21	4372	3985	3994	3985
q22	479	370	372	370
Total cold run time: 53147 ms
Total hot run time: 48974 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4569	4553	4574	4553
q2	350	215	237	215
q3	4005	3987	4015	3987
q4	2694	2674	2723	2674
q5	9740	9633	9821	9633
q6	239	128	126	126
q7	3005	2473	2438	2438
q8	4437	4471	4469	4469
q9	13249	13059	13125	13059
q10	4091	4209	4183	4183
q11	759	659	662	659
q12	966	812	810	810
q13	4276	3598	3576	3576
q14	393	349	338	338
q15	568	515	523	515
q16	731	679	666	666
q17	3916	3876	3847	3847
q18	9577	8958	9006	8958
q19	1789	1765	1767	1765
q20	2403	2077	2053	2053
q21	8795	8566	8578	8566
q22	946	793	833	793
Total cold run time: 81498 ms
Total hot run time: 77883 ms

@sjyango
Copy link
Contributor Author

sjyango commented Nov 23, 2023

run p0

@sjyango
Copy link
Contributor Author

sjyango commented Nov 23, 2023

run buildall

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 36.75% (8494/23113)
Line Coverage: 29.00% (69016/237966)
Region Coverage: 27.97% (35693/127626)
Branch Coverage: 24.68% (18201/73758)
Coverage Report: http://coverage.selectdb-in.cc/coverage/5be8b4acc30f247f9d09a83edfb6517c2f66964a_5be8b4acc30f247f9d09a83edfb6517c2f66964a/report/index.html

@doris-robot
Copy link

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 5be8b4acc30f247f9d09a83edfb6517c2f66964a, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4908	4647	4679	4647
q2	357	179	145	145
q3	2049	1928	1901	1901
q4	1393	1273	1277	1273
q5	3979	3953	4045	3953
q6	252	129	133	129
q7	1458	886	884	884
q8	2788	2780	2780	2780
q9	9724	10142	9409	9409
q10	3509	3501	3522	3501
q11	371	240	237	237
q12	435	290	297	290
q13	4574	3858	3789	3789
q14	323	293	303	293
q15	585	538	522	522
q16	657	589	571	571
q17	1141	969	951	951
q18	7834	7407	7852	7407
q19	1660	1681	1657	1657
q20	590	333	320	320
q21	4404	3992	4025	3992
q22	471	376	371	371
Total cold run time: 53462 ms
Total hot run time: 49022 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4596	4564	4597	4564
q2	338	224	262	224
q3	4015	3998	4004	3998
q4	2711	2703	2707	2703
q5	9614	9622	9581	9581
q6	247	123	125	123
q7	3014	2496	2506	2496
q8	4430	4444	4457	4444
q9	12906	12814	12827	12814
q10	4061	4176	4166	4166
q11	768	669	689	669
q12	996	817	815	815
q13	4291	3587	3551	3551
q14	400	339	351	339
q15	574	524	527	524
q16	741	670	683	670
q17	3940	3848	3918	3848
q18	9582	9107	9073	9073
q19	1842	1772	1776	1772
q20	2416	2064	2062	2062
q21	8852	8670	8677	8670
q22	907	759	813	759
Total cold run time: 81241 ms
Total hot run time: 77865 ms

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44 seconds
stream load tsv: 581 seconds loaded 74807831229 Bytes, about 122 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.9 seconds inserted 10000000 Rows, about 346K ops/s
storage size: 17097707751 Bytes

@sjyango
Copy link
Contributor Author

sjyango commented Nov 23, 2023

run p0

Copy link
Contributor

@amorynan amorynan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions
Copy link
Contributor

github-actions bot commented Dec 4, 2023

PR approved by anyone and no changes requested.

@starocean999 starocean999 merged commit 4d1aa13 into apache:master Dec 4, 2023
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 4, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Dec 4, 2023

PR approved by at least one committer and no changes requested.

XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature](datatype) Add IPv4/v6 data type for doris

4 participants