
Add CGroupMemoryUsedWithoutPageCache async metric and clarify CGroupMemoryUsed description #101513

Open
primeroz wants to merge 7 commits into ClickHouse:master from primeroz:cgroupmemorywithoutuserspacepagecache

Conversation

@primeroz
Member

@primeroz primeroz commented Apr 1, 2026

CGroupMemoryUsed is the preferred metric for memory accounting in cgroup environments (as noted in support escalation #7289), where autoscaling and memory overload warnings are moving away from MemoryResident.

However, when the ClickHouse userspace page cache is enabled, CGroupMemoryUsed includes that cache's memory in its value — just like MemoryResident does. PR #81233 addressed this for the RSS-based path by introducing MemoryResidentWithoutPageCache. This PR adds the equivalent for the cgroup path.

Additionally, the existing CGroupMemoryUsed description said "(excluding page cache)" without specifying which page cache — it actually excludes only the kernel OS page cache, not the ClickHouse userspace page cache. This was confusing.

Changes

  • Simplified the CGroupMemoryUsed description to explicitly state it excludes the kernel OS page cache.
  • Added CGroupMemoryUsedWithoutPageCache async metric:
    • Formula: max(0, CGroupMemoryUsed - page_cache_bytes)
    • When userspace page cache is disabled, equals CGroupMemoryUsed
    • Mirrors the pattern established by MemoryResidentWithoutPageCache in #81233
  • Added a stateless test verifying CGroupMemoryUsedWithoutPageCache presence and invariant (<= CGroupMemoryUsed).
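
The formula in the list above amounts to a saturating subtraction on unsigned values. The following is a minimal sketch of that idea, not the code from this PR: the helper name `cgroupMemoryUsedWithoutPageCache` and its parameter names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

using UInt64 = std::uint64_t;

// Hypothetical helper (not the PR's code): computes
// max(0, cgroup_memory_used - page_cache_bytes) without unsigned wrap-around.
inline UInt64 cgroupMemoryUsedWithoutPageCache(UInt64 cgroup_memory_used, UInt64 page_cache_bytes)
{
    // A plain subtraction would wrap to a huge value if the cache size
    // momentarily exceeded the cgroup reading, so clamp at zero instead.
    const UInt64 result = cgroup_memory_used > page_cache_bytes
        ? cgroup_memory_used - page_cache_bytes
        : 0;

    // The invariant the stateless test checks: never larger than the input.
    assert(result <= cgroup_memory_used);
    return result;
}
```

With a page cache size of zero (userspace page cache disabled) the helper returns its first argument unchanged, which matches the second sub-bullet.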

Related: #81233 (added MemoryResidentWithoutPageCache)
Related: #100901 (improves CGroupMemoryUsed calculation by subtracting slab_reclaimable)
Related: https://github.com/ClickHouse/support-escalation/issues/7289

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Added CGroupMemoryUsedWithoutPageCache async metric that reports cgroup memory usage excluding both the kernel OS page cache and the ClickHouse userspace page cache, mirroring MemoryResidentWithoutPageCache. Also clarified the CGroupMemoryUsed metric description.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Co-authored-by: Claude <noreply@anthropic.com>

…emoryUsed description

Add a new `CGroupMemoryUsedWithoutPageCache` async metric that subtracts the
ClickHouse userspace page cache from `CGroupMemoryUsed`, mirroring what
`MemoryResidentWithoutPageCache` does for RSS (added in ClickHouse#81233).

Also clarify the `CGroupMemoryUsed` description: it previously said
"excluding page cache" without specifying which page cache. It now explicitly
states that the kernel page cache (OS-level file cache) is excluded because
`memory.stat` does not account for the `file` field, while the ClickHouse
userspace page cache is NOT excluded - that is what the new metric is for.

Closes ClickHouse/support-escalation#7289

Co-authored-by: Claude <noreply@anthropic.com>
@clickhouse-gh
Contributor

clickhouse-gh bot commented Apr 1, 2026

Workflow [PR], commit [01e715a]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| Stateless tests (arm_asan_ubsan, flaky check) | | failure | | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| Stateless tests (amd_asan_ubsan, flaky check) | | failure | | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| Stateless tests (amd_tsan, flaky check) | | failure | | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| Stateless tests (amd_msan, flaky check) | | failure | | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| Stateless tests (amd_debug, flaky check) | | failure | | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| | 04075_memory_userspace_pagecache_metrics | FAIL | cidb | |
| Stress test (arm_msan) | | failure | | |
| | Server died | FAIL | cidb | |
| | MemorySanitizer: use-of-uninitialized-value (STID: 1003-358c) | FAIL | cidb, issue | |

AI Review

Summary

This PR introduces CGroupMemoryUsedWithoutPageCache, clarifies the CGroupMemoryUsed description, and adds a stateless test for presence and invariant checks. After reviewing the diff, surrounding code paths, and existing inline comments from clickhouse-gh[bot], I did not find additional high-confidence correctness, safety, performance, or compatibility issues in the current PR head.

ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh bot added the pr-improvement Pull request with some product improvements label Apr 1, 2026

UInt64 cgroup_page_cache_bytes = 0;
if (context && context->getPageCache())
    cgroup_page_cache_bytes = context->getPageCache()->sizeInBytes();
Member


We already have two *WithoutPageCache metrics. Can we expose the size of the user-space page cache in asynchronous metrics instead?

Member Author

@primeroz primeroz Apr 1, 2026


We already expose the size of the user-space page cache, but it is not usable in clickhouse-scraper,

because clickhouse-scraper can only fetch metrics from a table and aggregate them over 1 minute. So we end up with

max_over_1m(CGroupMemoryUsed) - max_over_1m(page_cache), which is not the same as max_over_1m(CGroupMemoryUsed - page_cache).

We tried this before and it did not work, which is why we ended up creating MemoryResidentWithoutPageCache in #81233.

We already had this conversation in https://github.com/ClickHouse/data-plane-application/pull/19933, which led to exposing a raw metric that already took the WithoutPageCache subtraction into account.
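
To make the aggregation argument in this comment concrete, here is a small sketch comparing the two computations over one window of samples. The function names and the sample values are made up for illustration, and the scraper's real aggregation is assumed to behave like a plain per-metric maximum.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Samples = std::vector<std::int64_t>;

// Maximum of a non-empty window of samples.
std::int64_t maxOver(const Samples & window)
{
    assert(!window.empty());
    return *std::max_element(window.begin(), window.end());
}

// What a scraper can compute when it only sees per-metric aggregates:
// max(used) - max(cache).
std::int64_t differenceOfMaxima(const Samples & used, const Samples & cache)
{
    return maxOver(used) - maxOver(cache);
}

// What is actually wanted: the maximum of the per-sample difference,
// max(used[i] - cache[i]).
std::int64_t maximumOfDifference(const Samples & used, const Samples & cache)
{
    assert(used.size() == cache.size());
    Samples diff(used.size());
    for (std::size_t i = 0; i < used.size(); ++i)
        diff[i] = used[i] - cache[i];
    return maxOver(diff);
}
```

For example, with `used = {10, 8, 9}` and `cache = {2, 6, 1}`, the difference of maxima is 10 - 6 = 4, while the maximum of the per-sample differences {8, 2, 8} is 8. The two peaks occur at different moments, so subtracting after aggregation understates the non-cache memory peak.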

Member


Sorry, maybe I wasn't clear, let me clarify.

Right now we expose PageCacheBytes only in system.metrics. I'm suggesting we also expose it in system.asynchronous_metrics; that way you will get it in one place and can do whatever arithmetic you need.

Member Author


Ah, yeah, I misunderstood. 👍

Indeed, right now we graph the page cache usage from metrics and the max page cache from async metrics.

> I'm suggesting we also expose it in system.asynchronous_metrics; that way you will get it in one place and can do whatever arithmetic you need.

If you think that is best, sure. 💯

Do you want to do it, or should I throw my Claude at it?

Side note: I just noticed that we also use the page cache in a warning about memory usage.

Member

@azat azat Apr 1, 2026


Feel free to do it 👍

Member Author


Can you check the current version? I have a feeling you meant something slightly different, because this way I don't really need to expose it as an async metric as well as a standard metric.

@azat azat self-assigned this Apr 1, 2026
primeroz and others added 2 commits April 1, 2026 18:38
Add `MemoryUserSpacePageCache` async metric that exposes the ClickHouse
userspace page cache size in `system.asynchronous_metrics`, alongside the
existing `MemoryResident`, `MemoryResidentWithoutPageCache`, `CGroupMemoryUsed`
and `CGroupMemoryUsedWithoutPageCache` metrics.

Previously this value was only available in `system.metrics` as `PageCacheBytes`.
Having it in `system.asynchronous_metrics` means operators can do arbitrary
arithmetic directly in one place, e.g.:
  CGroupMemoryUsed - MemoryUserSpacePageCache
  MemoryResident - MemoryUserSpacePageCache

The value is computed from the already-fetched `context->getPageCache()->sizeInBytes()`
local variable, so no extra call is made.

Co-authored-by: Claude <noreply@anthropic.com>
…ved metrics

`MemoryUserSpacePageCache` is now populated first (single call to
`context->getPageCache()->sizeInBytes()`), and both
`MemoryResidentWithoutPageCache` and `CGroupMemoryUsedWithoutPageCache`
read back that value instead of calling `getPageCache()->sizeInBytes()`
independently.

Co-authored-by: Claude <noreply@anthropic.com>
…s test

- Reword `CGroupMemoryUsed` description to attribute the page cache
  exclusion to ClickHouse's own field selection from `memory.stat`
  (anonymous memory, socket buffers, non-reclaimable kernel memory),
  rather than implying it is a kernel accounting guarantee.

- Add stateless test 04075 that:
  - Verifies `MemoryUserSpacePageCache` is always present in
    `system.asynchronous_metrics`
  - When cgroup metrics are available, asserts the invariant
    `CGroupMemoryUsedWithoutPageCache <= CGroupMemoryUsed`

Co-authored-by: Claude <noreply@anthropic.com>
Replace the WHERE-based filter that silently produced no output when
CGroupMemoryUsedWithoutPageCache was missing with explicit if()-based
assertions that always produce deterministic output. The existence
check and invariant check now properly fail when cgroup metrics are
available but the expected metric is absent, while gracefully passing
in environments without cgroup support.
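
The difference between a filter that can silently yield nothing and a check that always yields a verdict can be sketched as follows. The actual test is a stateless SQL test; this C++ version only illustrates the logic, and the function name, the `Metrics` alias, and the returned strings are assumptions rather than the test's real output.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

using Metrics = std::map<std::string, std::uint64_t>;

// Hypothetical check mirroring the if()-style assertions: it always returns
// exactly one verdict, so a missing metric cannot pass unnoticed.
std::string checkCGroupInvariant(const Metrics & metrics)
{
    const auto used = metrics.find("CGroupMemoryUsed");
    if (used == metrics.end())
        return "OK"; // No cgroup support in this environment: pass gracefully.

    const auto without_cache = metrics.find("CGroupMemoryUsedWithoutPageCache");
    if (without_cache == metrics.end())
        return "FAIL: metric missing"; // cgroup metrics exist, derived one does not.

    return without_cache->second <= used->second ? "OK" : "FAIL: invariant violated";
}
```

A WHERE-style filter corresponds to returning nothing at all in the second case, which a reference-output comparison may not distinguish from the no-cgroup case.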

Co-authored-by: Claude <noreply@anthropic.com>
…description

Remove the intermediate MemoryUserSpacePageCache async metric — it is
not needed since page_cache_bytes is already in scope.
CGroupMemoryUsedWithoutPageCache now reads page_cache_bytes directly.

Simplify the CGroupMemoryUsed description to say it excludes the kernel
OS page cache, without claiming specifics about cgroup v1/v2 memory.stat
field accounting.

Update the stateless test accordingly.

Co-authored-by: Claude <noreply@anthropic.com>
page_cache_bytes is local to an earlier #if block and not visible in the
cgroup metrics section. Read the value from context->getPageCache()
directly instead.

Co-authored-by: Claude <noreply@anthropic.com>
@clickhouse-gh
Contributor

clickhouse-gh bot commented Apr 2, 2026

LLVM Coverage Report

| Metric | Baseline | Current | Δ |
|---|---|---|---|
| Lines | 78.00% (685,718/879,053) | 84.00% (738,411/879,065) | +6.00% (+52,693) |
| Functions | 90.40% (794,106/878,414) | 90.90% (798,232/878,432) | +0.50% (+4,126) |
| Branches | 70.60% (220,742/312,660) | 76.60% (239,354/312,666) | +6.00% (+18,612) |

Changed lines: 100.00% (18/18) · Uncovered code

Full report · Diff report


Labels

pr-improvement Pull request with some product improvements

Projects

None yet


3 participants