Skip to content

Incorrect GROUP BY for JSON values #10467

@breezewish

Description

@breezewish

Description

Bug Report

create table tx2 (col json);
insert into tx2 values (json_array(3.0));
insert into tx2 values (json_array(3));
select col, count(1) from tx2 group by col;

MySQL:

mysql> create table tx2 (col json);
Query OK, 0 rows affected (0.01 sec)

mysql> insert into tx2 values (json_array(3.0));
Query OK, 1 row affected (0.00 sec)

mysql> insert into tx2 values (json_array(3));
Query OK, 1 row affected (0.00 sec)

mysql> select col, count(1) from tx2 group by col;
+-------+----------+
| col   | count(1) |
+-------+----------+
| [3.0] |        2 |
+-------+----------+
1 row in set (0.00 sec)

TiDB:

mysql> create table tx2 (col json);
Query OK, 0 rows affected (0.13 sec)

mysql> insert into tx2 values (json_array(3.0));
Query OK, 1 row affected (0.02 sec)

mysql> insert into tx2 values (json_array(3));
Query OK, 1 row affected (0.01 sec)

mysql> select col, count(1) from tx2 group by col;
+------+----------+
| col  | count(1) |
+------+----------+
| [3]  |        1 |
| [3]  |        1 |
+------+----------+
2 rows in set (0.01 sec)

This simply indicates that generating the group key by using serialized value is incorrect. For JSON values, it is very clear that different memory / serialized values are treated as the same group.

SIG slack channel

#sig-exec

Score

  • 300

Mentor

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions