Skip to content

-Sparkbar as an aggregate functions combinator. #59832

@alexey-milovidov

Description

@alexey-milovidov

Use case

SELECT ClientRequestURI, count(), uniq(ClientIP) AS u, formatReadableSize(sum(EdgeResponseBytes)) AS downloaded, bar(u, 0, max(u)over(), 50) AS bar, sumSparkbar(20, toDate('2022-01-01'), toDate('2024-02-10'))(Date, 1) FROM cloudflare_http WHERE EdgeResponseStatus = 200 AND ClientRequestHost = 'clickhouse.com' AND NOT match(ClientRequestURI, '^/(api|cdn-cgi)/|\.(js|css|svg|png|gif|jpe?g|webp|xml|woff2?|ico|txt)') GROUP BY ALL ORDER BY u DESC LIMIT 50

Here the sumSparkbar(20, toDate('2022-01-01'), toDate('2024-02-10'))(Date, 1)
or countSparkbar(20, toDate('2022-01-01'), toDate('2024-02-10'))(Date)
will apply the corresponding aggregate function to the values in every bucket.

Note: the query can be run with additional aggregation:

WITH 
    '2022-05-01'::Date AS start,
    '2023-10-01'::Date AS end,
    (dateDiff('month', start, end) - 1)::UInt8 AS length,
    replaceRegexpOne(ClientRequestURI, '/$', '') AS URI,
    EdgeResponseStatus = 200 AND ClientRequestHost = 'clickhouse.com'
        AND NOT match(ClientRequestURI, '^/(api|cdn-cgi)/|\.(js|css|svg|png|gif|jpe?g|webp|xml|woff2?|ico|txt)') AS filter,
top_urls AS (
    SELECT URI FROM cloudflare_http
    WHERE filter
    GROUP BY ALL ORDER BY uniq(ClientIP) DESC LIMIT 50
),
aggregated AS (
    SELECT URI AS url, toStartOfMonth(Date) AS d, count(), uniqState(ClientIP) AS u FROM cloudflare_http
    WHERE filter AND url IN top_urls
    GROUP BY ALL
)
SELECT url, uniqMerge(u) AS uniqs,
    bar(uniqs, 0, max(uniqs)over(), 50) AS bar,
    sparkbar(length, start, end)(d, finalizeAggregation(u)) AS spark
FROM aggregated
GROUP BY ALL
ORDER BY uniqs DESC

Metadata

Metadata

Assignees

Labels

featurewarmup taskThe task for new ClickHouse team members. Low risk, moderate complexity, no urgency.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions