-
Notifications
You must be signed in to change notification settings - Fork 8.3k
Invalid JOIN_ON condition #40976
Description
I'm new with CH and I'm trying to JOIN two tables. My first attemp was:
select
u.counter_id,
r.date_of_visit,
sum(r.sessions) as sessions,
sum(r.pageviews) as pageviews,
u.utm_campaign,
u.utm_source,
u.utm_medium
from
connectors_yandex_metrika.utm_for_collect u
inner join connectors_yandex_metrika.utm_sessions_report r on u.counter_id = r.counter_id
and (
u.utm_campaign = r.utm_campaign
or u.utm_campaign IS NULL
)
and (
u.utm_source = r.utm_source
or u.utm_source IS NULL
)
and (
u.utm_medium = r.utm_medium
or u.utm_medium IS NULL
)
group by
u.counter_id,
r.date_of_visit,
u.utm_campaign,
u.utm_source,
u.utm_medium
I received an error:
DB::Exception: Unsupported JOIN ON conditions. Unexpected '(utm_campaign = r.utm_campaign) OR (utm_campaign IS NULL)': While processing (utm_campaign = r.utm_campaign) OR (utm_campaign IS NULL). (INVALID_JOIN_ON_EXPRESSION) (version 22.7.2.15 (official build))
Then I decided to use coalesce:
select
u.counter_id,
u.utm_campaign,
u.utm_source,
u.utm_medium,
r.date_of_visit,
sum(r.sessions) as sessions,
sum(r.pageviews) as pageviews
from connectors_yandex_metrika.utm_for_collect u
inner join connectors_yandex_metrika.utm_sessions_report r
on (u.counter_id,
coalesce(u.utm_campaign, r.utm_campaign),
coalesce(u.utm_source, r.utm_source),
coalesce(u.utm_medium, r.utm_medium)
) = (r.counter_id, r.utm_campaign, r.utm_source, r.utm_medium)
group by u.counter_id,
u.utm_campaign,
u.utm_source,
u.utm_medium
r.date_of_visit
Error:
DB::Exception: Invalid columns in JOIN ON section. Columns counter_id and r.utm_campaign are from different tables.: While processing (counter_id, coalesce(utm_campaign, r.utm_campaign), coalesce(utm_source, r.utm_source), coalesce(utm_medium, r.utm_medium)) = (r.counter_id, r.utm_campaign, r.utm_source, r.utm_medium). (INVALID_JOIN_ON_EXPRESSION) (version 22.7.2.15 (official build))
And my last attempt was:
with r as (
select
counter_id,
date_of_visit,
sum(sessions) as sessions,
sum(pageviews) as pageviews,
coalesce(utm_campaign, '') as utm_campaign,
coalesce(utm_source, '') as utm_source,
coalesce(utm_medium, '') as utm_medium
from
connectors_yandex_metrika.utm_sessions_report
group by
counter_id,
date_of_visit,
cube(5, 6, 7)
)
select
r.*
from
r
join connectors_yandex_metrika.utm_for_collect u on r.counter_id = u.counter_id
and (
r.utm_campaign, r.utm_source, r.utm_medium
) is not distinct
from
(
u.utm_campaign, u.utm_source, u.utm_medium
)
where
'NA' in (
r.utm_campaign, r.utm_source, r.utm_medium
) is not true
Error:
DB::Exception: Syntax error: failed at position 589 ('distinct') (line 24, col 12): distinct from ( u.utm_campaign, u.utm_source, u.utm_medium ) where 'NA' in ( r.utm_campaign, r.utm_source, r.utm_medium ) is not tru. Expected one of: NULL, end of query. (SYNTAX_ERROR) (version 22.7.2.15 (official build))
All these queries work fine on PostgreSQL. Any tips how to adapt this to CH?