Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel #94095

Merged
azat merged 2 commits into ClickHouse:master from azat:uniqTheta-u8-accuracy-fix
Jan 14, 2026

@azat (Member) commented Jan 13, 2026

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel (max_threads > 1, which is the default)

Test

SELECT
    number % 3 AS key,
    uniqTheta(generateUUIDv4(number)) AS theta,
    uniqCombined64(generateUUIDv4(number)) AS hll
FROM numbers_mt(9160000)
GROUP BY key
SETTINGS max_threads = 8

Before

   ┌─key─┬───theta─┬─────hll─┐
1. │   0 │ 3006052 │ 3051475 │
2. │   1 │ 3077663 │ 3057512 │
3. │   2 │  730659 │ 3049931 │
   └─────┴─────────┴─────────┘

After

   ┌─key─┬───theta─┬─────hll─┐
1. │   0 │ 3068206 │ 3054168 │ -- 3.05 million
2. │   1 │ 3029186 │ 3049882 │ -- 3.05 million
3. │   2 │ 3055119 │ 3047556 │ -- 3.05 million
   └─────┴─────────┴─────────┘

Fixes: #45292

@azat azat requested a review from Avogar January 13, 2026 19:53
@clickhouse-gh bot (Contributor) commented Jan 13, 2026

Workflow [PR], commit [0f1d705]

Summary:

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Jan 13, 2026
@Avogar Avogar self-assigned this Jan 13, 2026
@azat azat force-pushed the uniqTheta-u8-accuracy-fix branch from 125136f to f3afd15 Compare January 13, 2026 20:25
17 10 10 100 100 610 610 766
52 10 10 100 100 608 608 766
5 10 10 100 100 608 608 765
5 10 10 100 100 609 609 765
Member Author

Note: this is the more accurate value; it matches uniqExact.

@azat azat added the pr-must-backport Pull request should be backported intentionally. Use this label with great care! label Jan 14, 2026
@azat azat enabled auto-merge January 14, 2026 09:53
This is the correct value, since uniqExact gives the same (last values
are for uniqExact):

    5      10      10      100     100     609     609     10      10      100     100     609     609     765
@azat azat force-pushed the uniqTheta-u8-accuracy-fix branch from 5310fa8 to 0f1d705 Compare January 14, 2026 12:47
@azat azat added this pull request to the merge queue Jan 14, 2026
Merged via the queue into ClickHouse:master with commit a99f0c1 Jan 14, 2026
132 checks passed
@azat azat deleted the uniqTheta-u8-accuracy-fix branch January 14, 2026 15:57
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Jan 14, 2026
@robot-ch-test-poll robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label Jan 14, 2026
robot-clickhouse-ci-2 added a commit that referenced this pull request Jan 14, 2026
Cherry pick #94095 to 25.12: Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel
robot-clickhouse added a commit that referenced this pull request Jan 14, 2026
robot-ch-test-poll2 added a commit that referenced this pull request Jan 14, 2026
Cherry pick #94095 to 25.8: Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel
robot-clickhouse added a commit that referenced this pull request Jan 14, 2026
robot-ch-test-poll2 added a commit that referenced this pull request Jan 14, 2026
Cherry pick #94095 to 25.10: Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel
robot-clickhouse added a commit that referenced this pull request Jan 14, 2026
robot-ch-test-poll2 added a commit that referenced this pull request Jan 14, 2026
Cherry pick #94095 to 25.11: Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel
robot-clickhouse added a commit that referenced this pull request Jan 14, 2026
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Jan 14, 2026
clickhouse-gh bot added a commit that referenced this pull request Jan 14, 2026
Backport #94095 to 25.10: Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel
azat added a commit that referenced this pull request Jan 14, 2026
Backport #94095 to 25.11: Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel
azat added a commit that referenced this pull request Jan 14, 2026
Backport #94095 to 25.12: Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel
clickhouse-gh bot added a commit that referenced this pull request Jan 14, 2026
Backport #94095 to 25.8: Fix accuracy of uniqTheta when using UInt8 aggregation keys in parallel
@alsugiliazova (Contributor) commented:

This PR also fixed this simpler example:

SELECT number % 2 AS even, uniqTheta(number) FROM numbers(10) GROUP BY even;

After this PR:
https://fiddle.clickhouse.com/5aae58e4-a03d-4706-a52a-e2b8c61e5c5a

0	5
1	5

Before:
https://fiddle.clickhouse.com/2ce3da8c-b9ce-4d02-8484-351dc81c2f7d

0	4
1	4
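
A quick way to check a given build, sketched here as an illustration rather than as part of the PR: run the same grouping with uniqExact next to uniqTheta and compare the two columns. The aliases `theta` and `exact`, the `ORDER BY`, and the `max_threads` value are assumptions chosen for this example.

```sql
-- Illustrative check; aliases and settings are assumptions, not taken from the PR.
-- The key `number % 2` has type UInt8, which is the case this PR addresses.
SELECT
    number % 2 AS even,
    uniqTheta(number) AS theta,
    uniqExact(number) AS exact
FROM numbers(10)
GROUP BY even
ORDER BY even
SETTINGS max_threads = 8
```

On a build that includes the fix, `theta` should agree with `exact` for both groups; a mismatch such as the 4 reported above would suggest the fix is missing.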

Labels

  • pr-backports-created: Backport PRs are successfully created, it won't be processed by CI script anymore
  • pr-bugfix: Pull request with bugfix, not backported by default
  • pr-must-backport: Pull request should be backported intentionally. Use this label with great care!
  • pr-must-backport-synced: The `*-must-backport` labels are synced into the cloud Sync PR
  • pr-synced-to-cloud: The PR is synced to the cloud repo

Development

Successfully merging this pull request may close these issues.

uniqTheta produces random results with multithreaded streams

6 participants