-
Notifications
You must be signed in to change notification settings - Fork 4.1k
DISTINCT doesn't work with jsonb_build_object #46709
Description
Describe the problem
Goal:
We have a relation from an user to postal addresses and phones. The goal is to get whole net for the user (1 or <=100) as json. We distinct between projection and filtering. If we filter by phone (filtering) or want to show phones (projection), left join clause is "automatically" added.
Because of join and projection we have a problem, if user has for example 2 phones and 1 postal address. We get the same postal two times. It's possible to get rid of this problem with postgres, but I don't found a solution with cockroachdb. See reproduction.
To Reproduce
Data snippet for postgres:
CREATE TABLE profile (id UUID NOT NULL, name TEXT NOT NULL, CONSTRAINT "primary1" PRIMARY KEY (id));
CREATE TABLE phone (id UUID NOT NULL, number TEXT NOT NULL, userid UUID NOT NULL, CONSTRAINT "primary2" PRIMARY KEY (id), CONSTRAINT phone_userid_foreign FOREIGN KEY (userid) REFERENCES profile(id));
CREATE TABLE postal (id UUID NOT NULL, streetname TEXT NOT NULL, userid UUID NOT NULL, CONSTRAINT "primary3" PRIMARY KEY (id), CONSTRAINT postal_userid_foreign FOREIGN KEY (userid) REFERENCES profile(id));
INSERT INTO profile VALUES ('00000000-0000-0000-0000-000000000000', 'Tester');
INSERT INTO phone VALUES ('11111111-1111-1111-1111-111111111111', '11111', '00000000-0000-0000-0000-000000000000');
INSERT INTO phone VALUES ('22222222-2222-2222-2222-222222222222', '22222', '00000000-0000-0000-0000-000000000000');
INSERT INTO postal VALUES ('33333333-3333-3333-3333-333333333333', '33333', '00000000-0000-0000-0000-000000000000');Data snippet for crdb:
CREATE TABLE profile (id UUID NOT NULL, name STRING NOT NULL, CONSTRAINT "primary" PRIMARY KEY (id ASC));
CREATE TABLE phone (id UUID NOT NULL, number STRING NOT NULL, userid UUID NOT NULL, CONSTRAINT "primary" PRIMARY KEY (id ASC), CONSTRAINT phone_userid_foreign FOREIGN KEY (userid) REFERENCES profile(id));
CREATE TABLE postal (id UUID NOT NULL, streetname STRING NOT NULL, userid UUID NOT NULL, CONSTRAINT "primary" PRIMARY KEY (id ASC), CONSTRAINT postal_userid_foreign FOREIGN KEY (userid) REFERENCES profile(id));
INSERT INTO profile VALUES ('00000000-0000-0000-0000-000000000000', 'Tester');
INSERT INTO phone VALUES ('11111111-1111-1111-1111-111111111111', '11111', '00000000-0000-0000-0000-000000000000');
INSERT INTO phone VALUES ('22222222-2222-2222-2222-222222222222', '22222', '00000000-0000-0000-0000-000000000000');
INSERT INTO postal VALUES ('33333333-3333-3333-3333-333333333333', '33333', '00000000-0000-0000-0000-000000000000');This query shows the problem on both dbs (filtering omitted, because not relevant for the output):
select "profile"."id", "profile"."name", COUNT(*) OVER() as "count",
CASE count("postal".id) WHEN 0
THEN NULL
ELSE jsonb_agg(jsonb_build_object('id', "postal"."id",'userid', "postal"."userid",'streetname', "postal"."streetname"))
END as "contactpostal",
CASE count("phone".id) WHEN 0
THEN NULL
ELSE jsonb_agg(jsonb_build_object('id', "phone"."id",'userid', "phone"."userid",'number', "phone"."number"))
END as "contactphone"
from "profile"
left join "postal" on "profile"."id" = "postal"."userid" -- and clause for postal's filtering/permissions
left join "phone" on "profile"."id" = "phone"."userid" -- and clause for phone's filtering/permissions
-- clause for profile's filtering/permissions
group by "profile"."id", "profile"."name";Result:
id | 00000000-0000-0000-0000-000000000000
name | Tester
count | 1
contactpostal | [{"id": "33333333-3333-3333-3333-333333333333", "streetname": "33333", "userid": "00000000-0000-0000-0000-000000000000"}, {"id": "33333333-3333-3333-3333-333333333333", "streetname": "33333", "userid": "00000000-0000-0000-0000-000000000000"}]
contactphone | [{"id": "11111111-1111-1111-1111-111111111111", "number": "11111", "userid": "00000000-0000-0000-0000-000000000000"}, {"id": "22222222-2222-2222-2222-222222222222", "number": "22222", "userid": "00000000-0000-0000-0000-000000000000"}]
The closest workaround/solution is to use DISTINCT before jsonb_build_object, and it works on postgres (see sqlfiddle to play with it):
select "profile"."id", "profile"."name", COUNT(*) OVER() as "count",
CASE count("postal".id) WHEN 0
THEN NULL
ELSE jsonb_agg(DISTINCT jsonb_build_object('id', "postal"."id",'userid', "postal"."userid",'streetname', "postal"."streetname"))
END as "contactpostal",
CASE count("phone".id) WHEN 0
THEN NULL
ELSE jsonb_agg(DISTINCT jsonb_build_object('id', "phone"."id",'userid', "phone"."userid",'number', "phone"."number"))
END as "contactphone"
from "profile"
left join "postal" on "profile"."id" = "postal"."userid" -- and clause for postal's filtering/permissions
left join "phone" on "profile"."id" = "phone"."userid" -- and clause for phone's filtering/permissions
-- clause for profile's filtering/permissions
group by "profile"."id", "profile"."name";but not on crdb:
ERROR: unable to encode table key: *tree.DJSONI've seen that something similar was fixed by @rohany , but even after upgrade from 19.2.0 to unstable:
version | CockroachDB CCL v20.1.0-beta.3 (x86_64-unknown-linux-gnu, built 2020/03/23 17:54:39, go1.13.5)
nothing changes.
Expected behavior
I'm not sure if there is an efficient way to workaround it. It can be achieved if instead of usage of postal/phone in the select, a subselect is fired:
select distinct "profile"."id", "profile"."name", COUNT(*) OVER() as "count",
(
select CASE count("postal".id) WHEN 0
THEN NULL
ELSE jsonb_agg(json_build_object('id', "postal"."id",'userid', "postal"."userid",'streetname', "postal"."streetname"))
END as "contactpostal" FROM "postal" WHERE "postal"."userid" = "profile"."id" -- and clause for postal's filtering/permissions
),
(
select CASE count("phone".id) WHEN 0
THEN NULL
ELSE jsonb_agg(json_build_object('id', "phone"."id",'userid', "phone"."userid",'number', "phone"."number"))
END as "contactphone" FROM "phone" WHERE "phone"."userid" = "profile"."id" -- and clause for phone's filtering/permissions
)
from "profile"
left join "postal" on "profile"."id" = "postal"."userid" -- and clause for postals's filtering/permissions
left join "phone" on "profile"."id" = "phone"."userid" -- and clause for phone's filtering/permissions
group by "profile"."id", "profile"."name", "phone"."id";It doesn't feel right and breaks count:
id | 00000000-0000-0000-0000-000000000000
name | Tester
count | 2
contactpostal | [{"id": "33333333-3333-3333-3333-333333333333", "streetname": "33333", "userid": "00000000-0000-0000-0000-000000000000"}]
contactphone | [{"id": "11111111-1111-1111-1111-111111111111", "number": "11111", "userid": "00000000-0000-0000-0000-000000000000"}, {"id": "22222222-2222-2222-2222-222222222222", "number": "22222", "userid": "00000000-0000-0000-0000-000000000000"}]
May be there is another solution. But it would be nice to have a little bit more alignment to postgres.
Additional data / screenshots
Environment:
- CockroachDB version 19.2.0, v20.1.0-beta.3
- Server OS: official and unstable docker image
- Client app:
cockroach sql
Thanks!