Skip to content

rowexec: fix a bug with distinct on arrays and composite types#44771

Closed
rohany wants to merge 1 commit intocockroachdb:masterfrom
rohany:distinct-null-fix
Closed

rowexec: fix a bug with distinct on arrays and composite types#44771
rohany wants to merge 1 commit intocockroachdb:masterfrom
rohany:distinct-null-fix

Conversation

@rohany
Copy link
Copy Markdown
Contributor

@rohany rohany commented Feb 5, 2020

Fixes #44079.

This PR fixes a bug caused by the table encoding of
ARRAY types. In particular, NULL had the same encoding
as ARRAY[NULL].

Release note (bug fix): Fix a bug where the distinct operation
on ARRAY[NULL] and NULL could sometimes
return an incorrect result and omit some tuples.

@rohany rohany requested review from a team and yuzefovich February 5, 2020 18:25
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Feb 5, 2020

As @solongordon pointed out, this actually affected all types where the component that we sort by is not equal to the real value, such as composite types. I'll update the message to reflect this.

@rohany rohany force-pushed the distinct-null-fix branch from 9139d5d to 56d8ed3 Compare February 5, 2020 18:34
@rohany rohany changed the title rowexec: fix a bug with distinct on arrays rowexec: fix a bug with distinct on arrays and composite types Feb 5, 2020
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@rohany rohany force-pushed the distinct-null-fix branch from 56d8ed3 to 58be03d Compare February 5, 2020 20:30
@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Feb 5, 2020

@yuzefovich can you PTAL? I made a subtle change here to fix some tests that I want another pair of eyes on.

Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me, but I'm not that familiar with encodings, I think you should ping someone else (maybe Solon) to take another quick look. I assume that this PR will also need to be backported, so an extra attention wouldn't hurt.

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rohany)


pkg/sql/sqlbase/encoded_datum.go, line 280 at r2 (raw file):

// the existing value encoding may include the column ID within the encoding. This function is intended
// to be used when a column ID free value encoding is desired.
func (ed *EncDatum) EncodeAsValueWithNoColID(

It'd be nice to unit test this function.

@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Feb 5, 2020

cc @solongordon

@rohany rohany requested a review from solongordon February 5, 2020 22:48
@rohany rohany force-pushed the distinct-null-fix branch from 58be03d to 5ac4568 Compare February 6, 2020 18:41
Copy link
Copy Markdown
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @solongordon and @yuzefovich)


pkg/sql/sqlbase/encoded_datum.go, line 280 at r2 (raw file):

Previously, yuzefovich wrote…

It'd be nice to unit test this function.

Added a simple unit test

@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Feb 10, 2020

hmm, it seems like this is not the correct fix. Looking at postgres:

rohany=# create table t (x decimal);
rohany=# insert into t values (1.0), (1.000);
rohany=# select distinct(x) from t;
  x
-----
 1.0
(1 row)

Switching to the value encoding would not result in this behavior. It seems like this should actually just be related to the key encoding for arrays, and making sure that the key encoding for an array of null is different than the encoding for null.

Fixes cockroachdb#44079.

This PR fixes a bug caused by the table encoding of
ARRAY types. In particular, NULL had the same encoding
as ARRAY[NULL].

Release note (bug fix): Fix a bug where the distinct operation
on ARRAY[NULL] and NULL could sometimes
return an incorrect result and omit some tuples.
@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Feb 17, 2020

PTAL, now that the distinct vec bug has been fixed

Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Reviewed 6 of 6 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @solongordon and @yuzefovich)

@rohany
Copy link
Copy Markdown
Contributor Author

rohany commented Feb 19, 2020

I'm actually going to close this in favor of another solution.

@rohany rohany closed this Feb 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: incorrect grouping of array with NULL

3 participants