Add CGroupMemoryUsedWithoutPageCache async metric and clarify CGroupMemoryUsed description#101513
Add CGroupMemoryUsedWithoutPageCache async metric and clarify CGroupMemoryUsed description#101513primeroz wants to merge 7 commits intoClickHouse:masterfrom
Conversation
…emoryUsed description Add a new `CGroupMemoryUsedWithoutPageCache` async metric that subtracts the ClickHouse userspace page cache from `CGroupMemoryUsed`, mirroring what `MemoryResidentWithoutPageCache` does for RSS (added in ClickHouse#81233). Also clarify the `CGroupMemoryUsed` description: it previously said "excluding page cache" without specifying which page cache. It now explicitly states that the kernel page cache (OS-level file cache) is excluded because `memory.stat` does not account for the `file` field, while the ClickHouse userspace page cache is NOT excluded - that is what the new metric is for. Closes ClickHouse/support-escalation#7289 Co-authored-by: Claude <noreply@anthropic.com>
|
Workflow [PR], commit [01e715a] Summary: ❌
AI ReviewSummaryThis PR introduces ClickHouse Rules
Final Verdict
|
src/Common/AsynchronousMetrics.cpp
Outdated
|
|
||
| UInt64 cgroup_page_cache_bytes = 0; | ||
| if (context && context->getPageCache()) | ||
| cgroup_page_cache_bytes = context->getPageCache()->sizeInBytes(); |
There was a problem hiding this comment.
We already have two metrics *WithoutPageCache can we expose size of user-space page cache in asynchronous metrics instead?
There was a problem hiding this comment.
we do expose the size of user-space page cache already but is not usable in clickhouse-scraper
because clickhouse-scraper can only fetch metrics from table and aggregate over 1 minute. so we end up with
max_over_1m(CGroupUsedMemory) - max_over_1m(page_cache) which is not the same as max_over_1m(CgroupUsedMemory-page_cache)
We already tried this before but we failed and that's why we ended up creating MemoryResidentWithoutPageCache in #81233
we already had this conversation in https://github.com/ClickHouse/data-plane-application/pull/19933 which lead to exposing a raw metric that already took the WithoutPageCache into account
There was a problem hiding this comment.
Sorry, maybe I wasn't clear, let me clarify.
Right now we expose PageCacheBytes only in system.metrics, I'm suggesting to expose it also in system.asynchronous_metric that way you will get it in one place and can do what ever arithmetic your need.
There was a problem hiding this comment.
ah , yeah i misunderstood. 👍
indeed right now we do graph the page cache usage from metrics and the max page cache from async metrics
I'm suggesting to expose it also in system.asynchronous_metric that way you will get it in one place and can do what ever arithmetic your need.
if you think is best sure, 💯
do you want to do it or should i throw my claude at it ?
Side note , i just noticed with use the page cache also in a warning about memory usage
There was a problem hiding this comment.
can you check the current version ? i have a feeling you meant something slighlty different because i don't really need to expose it as async metric as well as standard metric this way ...
Add `MemoryUserSpacePageCache` async metric that exposes the ClickHouse userspace page cache size in `system.asynchronous_metrics`, alongside the existing `MemoryResident`, `MemoryResidentWithoutPageCache`, `CGroupMemoryUsed` and `CGroupMemoryUsedWithoutPageCache` metrics. Previously this value was only available in `system.metrics` as `PageCacheBytes`. Having it in `system.asynchronous_metrics` means operators can do arbitrary arithmetic directly in one place, e.g.: CGroupMemoryUsed - MemoryUserSpacePageCache MemoryResident - MemoryUserSpacePageCache The value is computed from the already-fetched `context->getPageCache()->sizeInBytes()` local variable, so no extra call is made. Co-authored-by: Claude <noreply@anthropic.com>
…ved metrics `MemoryUserSpacePageCache` is now populated first (single call to `context->getPageCache()->sizeInBytes()`), and both `MemoryResidentWithoutPageCache` and `CGroupMemoryUsedWithoutPageCache` read back that value instead of calling `getPageCache()->sizeInBytes()` independently. Co-authored-by: Claude <noreply@anthropic.com>
…s test
- Reword `CGroupMemoryUsed` description to attribute the page cache
exclusion to ClickHouse's own field selection from `memory.stat`
(anonymous memory, socket buffers, non-reclaimable kernel memory),
rather than implying it is a kernel accounting guarantee.
- Add stateless test 04075 that:
- Verifies `MemoryUserSpacePageCache` is always present in
`system.asynchronous_metrics`
- When cgroup metrics are available, asserts the invariant
`CGroupMemoryUsedWithoutPageCache <= CGroupMemoryUsed`
Co-authored-by: Claude <noreply@anthropic.com>
tests/queries/0_stateless/04075_memory_userspace_pagecache_metrics.sql
Outdated
Show resolved
Hide resolved
Replace the WHERE-based filter that silently produced no output when CGroupMemoryUsedWithoutPageCache was missing with explicit if()-based assertions that always produce deterministic output. The existence check and invariant check now properly fail when cgroup metrics are available but the expected metric is absent, while gracefully passing in environments without cgroup support. Co-authored-by: Claude <noreply@anthropic.com>
…description Remove the intermediate MemoryUserSpacePageCache async metric — it is not needed since page_cache_bytes is already in scope. CGroupMemoryUsedWithoutPageCache now reads page_cache_bytes directly. Simplify the CGroupMemoryUsed description to say it excludes the kernel OS page cache, without claiming specifics about cgroup v1/v2 memory.stat field accounting. Update the stateless test accordingly. Co-authored-by: Claude <noreply@anthropic.com>
page_cache_bytes is local to an earlier #if block and not visible in the cgroup metrics section. Read the value from context->getPageCache() directly instead. Co-authored-by: Claude <noreply@anthropic.com>
LLVM Coverage Report
Changed lines: 100.00% (18/18) · Uncovered code |
CGroupMemoryUsedis the preferred metric for memory accounting in cgroup environments (as noted in support escalation #7289), where autoscaling and memory overload warnings are moving away fromMemoryResident.However, when the ClickHouse userspace page cache is enabled,
CGroupMemoryUsedincludes that cache's memory in its value — just likeMemoryResidentdoes. PR #81233 addressed this for the RSS-based path by introducingMemoryResidentWithoutPageCache. This PR adds the equivalent for the cgroup path.Additionally, the existing
CGroupMemoryUseddescription said"(excluding page cache)"without specifying which page cache — it actually excludes only the kernel OS page cache, not the ClickHouse userspace page cache. This was confusing.Changes
CGroupMemoryUseddescription to explicitly state it excludes the kernel OS page cache.CGroupMemoryUsedWithoutPageCacheasync metric:max(0, CGroupMemoryUsed - page_cache_bytes)CGroupMemoryUsedMemoryResidentWithoutPageCachein add MemoryResidentWithoutPageCache #81233CGroupMemoryUsedWithoutPageCachepresence and invariant (<= CGroupMemoryUsed).Related: #81233 (added
MemoryResidentWithoutPageCache)Related: #100901 (improves
CGroupMemoryUsedcalculation by subtractingslab_reclaimable)Related: https://github.com/ClickHouse/support-escalation/issues/7289
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Added
CGroupMemoryUsedWithoutPageCacheasync metric that reports cgroup memory usage excluding both the kernel OS page cache and the ClickHouse userspace page cache, mirroringMemoryResidentWithoutPageCache. Also clarified theCGroupMemoryUsedmetric description.Documentation entry for user-facing changes
Co-authored-by: Claude noreply@anthropic.com