Skip to content

More convenient usage of big integers and ORDER BY WITH FILL.#46152

Merged
alexey-milovidov merged 12 commits intomasterfrom
with-fill-bigint
Feb 10, 2023
Merged

More convenient usage of big integers and ORDER BY WITH FILL.#46152
alexey-milovidov merged 12 commits intomasterfrom
with-fill-bigint

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov commented Feb 8, 2023

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

More convenient usage of big integers and ORDER BY WITH FILL. Allow using plain integers for start and end points in WITH FILL when ORDER BY big (128-bit and 256-bit) integers. Fix the wrong result for big integers with negative start or end points. This closes #16733

See also the assertion about unexpected type of exception in debug build https://s3.amazonaws.com/clickhouse-test-reports/46149/a69f9c05a981a6c95af496b47413e540c87e2c09/fuzzer_astfuzzerdebug/report.html

@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-improvement Pull request with some product improvements label Feb 8, 2023
@alexey-milovidov
Copy link
Copy Markdown
Member Author

I also noticed that the code around type conversion is stupid.

@yakov-olkhovskiy yakov-olkhovskiy self-assigned this Feb 8, 2023
@yakov-olkhovskiy
Copy link
Copy Markdown
Member

yakov-olkhovskiy commented Feb 9, 2023

@alexey-milovidov
Copy link
Copy Markdown
Member Author

@yakov-olkhovskiy I want WITH FILL to use getLeastSuperType to determine the resulting type.
Currently, the logic is strange.

@alexey-milovidov
Copy link
Copy Markdown
Member Author

But I don't understand - whether we can change the types after WITH FILL.

@alexey-milovidov
Copy link
Copy Markdown
Member Author

Looks like no.

@alexey-milovidov
Copy link
Copy Markdown
Member Author

And there are some complications with STEP.

@yakov-olkhovskiy
Copy link
Copy Markdown
Member

Currently, the logic is strange.

I think it's done this way to simplify step function generation

@alexey-milovidov
Copy link
Copy Markdown
Member Author

Ready for review.


T operator() (const UInt64 & x) const { return T(x); }
T operator() (const Int64 & x) const { return T(x); }
T operator() (const Int128 & x) const { return T(x); }
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here were leftovers from when we used UInt128 for UUID a long time ago. I've cleaned it up. Some bugs can be automatically fixed by this change...

if constexpr (std::is_same_v<T, U>)
return l == r;

if constexpr (std::is_arithmetic_v<T> && std::is_arithmetic_v<U>)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enables comparison for big integers.

size_t pos = 0;

/// Find position we need to increment for generating next row.
for (; pos < size(); ++pos)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a minor change, but the compiler cannot typically optimize size calls out of the loop due to not enough strict-aliasing rules in C++ (in comparison to Rust).

for (auto & column : header)
if (column.column && isColumnConst(*column.column) && !sort_keys.contains(column.name))
column.column = column.type->createColumn();
column.column = column.column->convertToFullColumnIfConst();
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more natural way to do the same.

WhichDataType which_from(descr.fill_from_type);
if ((which_from.isDateOrDate32() || which_from.isDateTime() || which_from.isDateTime64()) &&
!descr.fill_from_type->equals(*type))
!descr.fill_from_type->equals(*removeNullable(type)))
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enables WITH FILL when the type is Nullable, see the test.


/// TODO Wrong results for big integers.
if (isInteger(type) || which.isDate() || which.isDate32() || which.isDateTime())
if (which.isInt128() || which.isUInt128())
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented TODO.
The code remains tricky, maybe it's worth simplifying later.

descr.fill_from = convertFieldToType(descr.fill_from, *to_type);
descr.fill_to = convertFieldToType(descr.fill_to, *to_type);
descr.fill_step = convertFieldToType(descr.fill_step, *to_type);
if (!descr.fill_from.isNull())
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much better - we are explicitly asking if the field is convertible, instead of having a risk it is confused with NULL, which has another meaning for the WITH FILL logic.

auto & descr = filling_row.getFillDescription(i);
const auto & type = header_.getByPosition(block_position).type;

const Block & output_header = getOutputPort().getHeader();
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking it from the output header is unneeded, as it is currently identical to the input header by the data types, but I was not sure about it and added this code. It is more natural, so worth keeping this change.

const auto & type = header_.getByPosition(block_position).type;

const Block & output_header = getOutputPort().getHeader();
const DataTypePtr & type = removeNullable(output_header.getByPosition(block_position).type);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support for Nullable.

"Incompatible types of WITH FILL expression values with column type {}", type->getName());

if (type->isValueRepresentedByUnsignedInteger() &&
if (isUnsignedInteger(type) &&
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support for big integers.


if (!is_fill_column[idx] && !(interpolate_description && interpolate_description->result_columns_set.contains(column.name)))
other_column_positions.push_back(idx);
other_column_positions.push_back(idx);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style.

{
insertFromFillingRow(res_fill_columns, res_interpolate_columns, res_other_columns, filling_row, interpolate_block);
interpolate();
insertFromFillingRow(res_fill_columns, res_interpolate_columns, res_other_columns, filling_row, interpolate_block);
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style.

@alexey-milovidov alexey-milovidov merged commit d5d87bc into master Feb 10, 2023
@alexey-milovidov alexey-milovidov deleted the with-fill-bigint branch February 10, 2023 17:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feature request: WITH FILL expression values compatible with column type Nullable

3 participants