Allow empty needle in function replace#69918
Conversation
src/Functions/ReplaceStringImpl.h
Outdated
| prev_haystack_offset = haystack_offsets[i]; | ||
| prev_needle_offset = needle_offsets[i]; | ||
| prev_replacement_offset = replacement_offsets[i]; | ||
| continue; |
There was a problem hiding this comment.
Let's simplify the logic a bit. We can treat (cur_needle_length == 0) as if there is no match.
So instead of this change we could check if (cur_needle_length == 0) and skip the cycle while (start_pos < cur_haystack_end) along with the initialization of searcher if so.
Is it said somewhere? I mean I opened https://www.postgresql.org/docs/8.1/functions-string.html and didn't see there that |
https://github.com/postgres/postgres/blob/2ceeb638b7b27da156c10cb9d5ea4f81cabda0d1/src/backend/utils/adt/varlena.c#L3992 Check here pls. |
Ok, I checked https://onecompiler.com/mysql/42tbkn53z and it seems most DBMS handle |
|
This is an automated comment for commit 350b9c0 with description of existing statuses. It's updated for the latest CI running ✅ Click here to open a full report in a separate page Successful checks
|
src/Functions/ReplaceStringImpl.h
Outdated
| if (needle.empty()) | ||
| throw Exception(ErrorCodes::ARGUMENT_OUT_OF_BOUND, "Length of the pattern argument in function {} must be greater than 0.", name); | ||
| { | ||
| assert(input_rows_count == haystack_data.size() / n); |
There was a problem hiding this comment.
chassert() is better because it helps us to find problem when the code is run in our CI builds.
assert() is no-op in any release builds, whereas chassert() still works in release builds with sanitizers which we run in our CI.
| assert(input_rows_count == haystack_data.size() / n); | ||
| /// Since ColumnFixedString does not have a zero byte at the end, while ColumnString does, | ||
| /// we need to split haystack_data into strings of length n, add 1 zero byte to the end of each string | ||
| /// and then copy to res_data, ref: ColumnString.h and ColumnFixedString.h |
There was a problem hiding this comment.
It would be nice to have a test with FixedString too
There was a problem hiding this comment.
Do you know how to construct a ColumnConst with string type without modifying code useDefaultImplementationForConstants = false ? I have tested with changing this code locally, it works expected.
There was a problem hiding this comment.
Executing of the following query
select replace(materialize(toFixedString('abcdef', 6)), '', '01')
will lead to this part of code.
Though it's strange that a more simple query
select replace(toFixedString('abcdef', 6), '', '01')
just throws exception ILLEGAL_COLUMN.
|
Please fix your code to pass the Style check |
|
|
||
|
|
||
| SELECT 'Check that an exception is thrown if the needle is empty'; | ||
| SELECT 'Check that whether an exception is thrown if the needle is empty'; |
There was a problem hiding this comment.
Bad grammar, please replace by "Check what happens if the needle is empty".
| SELECT s, replaceOne(s, '_', 'o') AS a, replaceRegexpOne(s, '_', 'o') AS b, a = b FROM (SELECT arrayJoin(['__.__', '.__']) AS s); | ||
| SELECT s, replaceOne(s, '_', 'o') AS a, replaceRegexpOne(s, '_', 'o') AS b, a = b FROM (SELECT arrayJoin(['__.__', '__.']) AS s); | ||
| SELECT s, replaceOne(s, '_', 'o') AS a, replaceRegexpOne(s, '_', 'o') AS b, a = b FROM (SELECT arrayJoin(['__.__', '__.__']) AS s); | ||
| SELECT replace('ABCabc', '', 'DEF'); |
There was a problem hiding this comment.
This is way too dirty. We
- should not have 101 lines with tests where
_is used as pattern, and then suddenly test empty needles without explanation or comment what is being tested - and, even worse, without any connection to the test name (we are not looping anything),
- and on top, only test the alias "replace" but not test the actual functions "replaceOne" and "replaceAll".
@zhanglistar Please move l. 102 into a new test 00240_replace_with_empty_needle and address these points.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Allow empty needle in function replace, the same behavior with PostgreSQL.
Documentation entry for user-facing changes
CI Settings (Only check the boxes if you know what you are doing):