Skip to content

Conversation

@jacktengg
Copy link
Contributor

@jacktengg jacktengg commented Mar 22, 2023

Proposed changes

Issue Number: close #xxx

Problem summary

Result is empty for query select * from person where address like '%\\\\%';, but MySQL can get a line of result.

CREATE TABLE `person` (
  `id` int(11) NULL,
  `name` text NULL,
  `age` int(11) NULL,
  `class` int(11) NULL,
  `address` text NULL
) ENGINE=OLAP
UNIQUE KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
); 

insert into person values (10001,'test1',30,2,'test\\\\,xxx');

Adding logs:

select * from person where address like '%\\\\%';

I0323 10:26:15.907760 2387043 like.cpp:558] arg str: %\\%, size: 4, pattern LIKE_ENDS_WITH_RE: (?:%+)(((\\%)|(\\_)|([^%_]))+), size: 30
I0323 10:26:15.907789 2387043 like.cpp:562] match 0: \\%, size: 3
I0323 10:26:15.907801 2387043 like.cpp:562] match 1: \%, size: 2
I0323 10:26:15.907811 2387043 like.cpp:562] match 2: \%, size: 2
I0323 10:26:15.907821 2387043 like.cpp:562] match 3: , size: 0
I0323 10:26:15.907830 2387043 like.cpp:562] match 4: \, size: 1
I0323 10:26:15.907842 2387043 like.cpp:615] search_string : \\%
I0323 10:26:15.907855 2387043 like.cpp:619] search_string escape removed: \%

It matchs against the LIKE_ENDS_WITH_RE which is wrong, the meaning of the sql should be: match strings that have one backslash in any place.

Checklist(Required)

  • Does it affect the original behavior
  • Has unit tests been added
  • Has document been added or modified
  • Does it need to update dependencies
  • Is this PR support rollback (If NO, please explain WHY)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions github-actions bot added area/sql/function Issues or PRs related to the SQL functions area/vectorization kind/test labels Mar 22, 2023
@jacktengg
Copy link
Contributor Author

run buildall

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@hello-stephen
Copy link
Contributor

hello-stephen commented Mar 23, 2023

TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 35.55 seconds
stream load tsv: 447 seconds loaded 74807831229 Bytes, about 159 MB/s
stream load json: 24 seconds loaded 2358488459 Bytes, about 93 MB/s
stream load orc: 73 seconds loaded 1101869774 Bytes, about 14 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20230329080505_clickbench_pr_122211.html

@yiguolei yiguolei added dev/1.2.4 usercase Important user case type label and removed dev/1.2.3 labels Mar 23, 2023
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

xiaokang
xiaokang previously approved these changes Mar 23, 2023
Copy link
Contributor

@xiaokang xiaokang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@jacktengg
Copy link
Contributor Author

run buildall

yiguolei
yiguolei previously approved these changes Mar 24, 2023
Copy link
Contributor

@yiguolei yiguolei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 24, 2023
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Mar 27, 2023
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@jacktengg
Copy link
Contributor Author

run buildall

@jacktengg
Copy link
Contributor Author

run buildall

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@jacktengg
Copy link
Contributor Author

run buildall

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Copy link
Contributor

@yiguolei yiguolei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yiguolei yiguolei merged commit 9877143 into apache:master Mar 30, 2023
yagagagaga pushed a commit to yagagagaga/doris that referenced this pull request Mar 31, 2023
…8039)

Result is empty for query select * from person where address like '%\\\\%';, but MySQL can get a line of result.

CREATE TABLE `person` (
  `id` int(11) NULL,
  `name` text NULL,
  `age` int(11) NULL,
  `class` int(11) NULL,
  `address` text NULL
) ENGINE=OLAP
UNIQUE KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
); 

insert into person values (10001,'test1',30,2,'test\\\\,xxx');
Adding logs:

select * from person where address like '%\\\\%';

I0323 10:26:15.907760 2387043 like.cpp:558] arg str: %\\%, size: 4, pattern LIKE_ENDS_WITH_RE: (?:%+)(((\\%)|(\\_)|([^%_]))+), size: 30
I0323 10:26:15.907789 2387043 like.cpp:562] match 0: \\%, size: 3
I0323 10:26:15.907801 2387043 like.cpp:562] match 1: \%, size: 2
I0323 10:26:15.907811 2387043 like.cpp:562] match 2: \%, size: 2
I0323 10:26:15.907821 2387043 like.cpp:562] match 3: , size: 0
I0323 10:26:15.907830 2387043 like.cpp:562] match 4: \, size: 1
I0323 10:26:15.907842 2387043 like.cpp:615] search_string : \\%
I0323 10:26:15.907855 2387043 like.cpp:619] search_string escape removed: \%
It matchs against the LIKE_ENDS_WITH_RE which is wrong, the meaning of the sql should be: match strings that have one backslash in any place.
yagagagaga pushed a commit to yagagagaga/doris that referenced this pull request Mar 31, 2023
…8039)

Result is empty for query select * from person where address like '%\\\\%';, but MySQL can get a line of result.

CREATE TABLE `person` (
  `id` int(11) NULL,
  `name` text NULL,
  `age` int(11) NULL,
  `class` int(11) NULL,
  `address` text NULL
) ENGINE=OLAP
UNIQUE KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
); 

insert into person values (10001,'test1',30,2,'test\\\\,xxx');
Adding logs:

select * from person where address like '%\\\\%';

I0323 10:26:15.907760 2387043 like.cpp:558] arg str: %\\%, size: 4, pattern LIKE_ENDS_WITH_RE: (?:%+)(((\\%)|(\\_)|([^%_]))+), size: 30
I0323 10:26:15.907789 2387043 like.cpp:562] match 0: \\%, size: 3
I0323 10:26:15.907801 2387043 like.cpp:562] match 1: \%, size: 2
I0323 10:26:15.907811 2387043 like.cpp:562] match 2: \%, size: 2
I0323 10:26:15.907821 2387043 like.cpp:562] match 3: , size: 0
I0323 10:26:15.907830 2387043 like.cpp:562] match 4: \, size: 1
I0323 10:26:15.907842 2387043 like.cpp:615] search_string : \\%
I0323 10:26:15.907855 2387043 like.cpp:619] search_string escape removed: \%
It matchs against the LIKE_ENDS_WITH_RE which is wrong, the meaning of the sql should be: match strings that have one backslash in any place.
yagagagaga pushed a commit to yagagagaga/doris that referenced this pull request Mar 31, 2023
…8039)

Result is empty for query select * from person where address like '%\\\\%';, but MySQL can get a line of result.

CREATE TABLE `person` (
  `id` int(11) NULL,
  `name` text NULL,
  `age` int(11) NULL,
  `class` int(11) NULL,
  `address` text NULL
) ENGINE=OLAP
UNIQUE KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
); 

insert into person values (10001,'test1',30,2,'test\\\\,xxx');
Adding logs:

select * from person where address like '%\\\\%';

I0323 10:26:15.907760 2387043 like.cpp:558] arg str: %\\%, size: 4, pattern LIKE_ENDS_WITH_RE: (?:%+)(((\\%)|(\\_)|([^%_]))+), size: 30
I0323 10:26:15.907789 2387043 like.cpp:562] match 0: \\%, size: 3
I0323 10:26:15.907801 2387043 like.cpp:562] match 1: \%, size: 2
I0323 10:26:15.907811 2387043 like.cpp:562] match 2: \%, size: 2
I0323 10:26:15.907821 2387043 like.cpp:562] match 3: , size: 0
I0323 10:26:15.907830 2387043 like.cpp:562] match 4: \, size: 1
I0323 10:26:15.907842 2387043 like.cpp:615] search_string : \\%
I0323 10:26:15.907855 2387043 like.cpp:619] search_string escape removed: \%
It matchs against the LIKE_ENDS_WITH_RE which is wrong, the meaning of the sql should be: match strings that have one backslash in any place.
gnehil pushed a commit to gnehil/doris that referenced this pull request Apr 21, 2023
…8039)

Result is empty for query select * from person where address like '%\\\\%';, but MySQL can get a line of result.

CREATE TABLE `person` (
  `id` int(11) NULL,
  `name` text NULL,
  `age` int(11) NULL,
  `class` int(11) NULL,
  `address` text NULL
) ENGINE=OLAP
UNIQUE KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
); 

insert into person values (10001,'test1',30,2,'test\\\\,xxx');
Adding logs:

select * from person where address like '%\\\\%';

I0323 10:26:15.907760 2387043 like.cpp:558] arg str: %\\%, size: 4, pattern LIKE_ENDS_WITH_RE: (?:%+)(((\\%)|(\\_)|([^%_]))+), size: 30
I0323 10:26:15.907789 2387043 like.cpp:562] match 0: \\%, size: 3
I0323 10:26:15.907801 2387043 like.cpp:562] match 1: \%, size: 2
I0323 10:26:15.907811 2387043 like.cpp:562] match 2: \%, size: 2
I0323 10:26:15.907821 2387043 like.cpp:562] match 3: , size: 0
I0323 10:26:15.907830 2387043 like.cpp:562] match 4: \, size: 1
I0323 10:26:15.907842 2387043 like.cpp:615] search_string : \\%
I0323 10:26:15.907855 2387043 like.cpp:619] search_string escape removed: \%
It matchs against the LIKE_ENDS_WITH_RE which is wrong, the meaning of the sql should be: match strings that have one backslash in any place.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants