fix: report registry container working set memory instead of raw cgro…#838

Merged
openshift-merge-bot[bot] merged 1 commit into redhat-performance:main from akrzos:fix/registry-memory-working-set
Jun 11, 2026

Conversation

@akrzos

@akrzos akrzos commented Jun 9, 2026

Member

…up usage

The registry container memory panel was showing cgroup memory.current, which includes reclaimable page cache and so inflated reported usage (~75 GiB) far beyond the actual working set memory (~25 GiB). Add a working_set metric computed as memory.current minus inactive_file (matching the kubectl top / cAdvisor approach) and display it as the primary series on the dashboard.
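The calculation described above can be sketched in a few lines. This is a minimal Python illustration, not the collector itself (the PR's collector is a shell template). It assumes cgroup v2, where `memory.current` is a single integer and `inactive_file` is one of the `key value` lines in `memory.stat`. The function names, the clamp at zero, and the sample numbers are assumptions made for this sketch; the sample values are only chosen to mirror the rough magnitudes quoted in the description.

```python
GIB = 1024 ** 3


def parse_memory_stat(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def working_set_bytes(memory_current: int, memory_stat_text: str) -> int:
    """Return memory.current minus inactive_file.

    The clamp at zero is a defensive choice in this sketch (an assumption),
    covering the case where the two files are read at slightly different times.
    """
    inactive_file = parse_memory_stat(memory_stat_text).get("inactive_file", 0)
    return max(memory_current - inactive_file, 0)


# Illustrative values only: 75 GiB total with 50 GiB of inactive file cache.
sample_stat = f"anon {20 * GIB}\nfile {55 * GIB}\ninactive_file {50 * GIB}\n"
print(working_set_bytes(75 * GIB, sample_stat) // GIB)  # 25
```

In a real collector the two inputs would come from the container's cgroup directory; the exact path depends on the container runtime and is deliberately left out here.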

Summary by CodeRabbit

  • New Features

    • Registry container memory dashboard now displays working set and total memory metrics separately.
    • Added working set memory metric collection for registry containers.
  • Documentation

    • Updated documentation explaining registry memory metrics (working set vs. total) and calculation methods.

@openshift-ci

openshift-ci Bot commented Jun 9, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Review Change Stack

Warning

Review limit reached

@akrzos, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 23 minutes and 28 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6916da4b-35c4-4942-a4a7-920ee492b1e1

📥 Commits

Reviewing files that changed from the base of the PR and between 6a5ede9 and 87af9ae.

📒 Files selected for processing (3)
  • ansible/roles/hv-metrics-server/files/dashboards/bastion-dashboard.json
  • ansible/roles/hv-metrics-server/templates/registry-traffic-collector.sh.j2
  • docs/hv-metrics.md
📝 Walkthrough

The PR adds a working-set memory metric to registry container monitoring. The metrics collector script now computes working-set memory by subtracting inactive file pages from container memory usage and emits it as a new Prometheus gauge. The Grafana dashboard panel is updated to display both working-set and total memory, and documentation is expanded to explain the distinction.
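For readers unfamiliar with the Prometheus text exposition format, the gauge output mentioned in the walkthrough might look roughly like the result of the sketch below. The metric name comes from this PR; the helper `render_gauge`, the HELP wording, and the sample value are hypothetical, and the actual script is a shell template whose exact output is not shown in this conversation.

```python
def render_gauge(name: str, help_text: str, value: int) -> str:
    """Render one unlabeled gauge in Prometheus text exposition format."""
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name} {value}\n"
    )


# Hypothetical HELP text and value, for illustration only.
output = render_gauge(
    "registry_container_memory_working_set_bytes",
    "Registry container working set memory in bytes",
    26843545600,
)
print(output, end="")
```

Each metric gets a `# HELP` line, a `# TYPE` line, and then one sample line per series, which is the shape the walkthrough refers to as "HELP/TYPE declarations".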

Changes

Registry Memory Metrics Expansion

| Layer / File(s) | Summary |
| --- | --- |
| **Working-set metric collection and emission**<br>`ansible/roles/hv-metrics-server/templates/registry-traffic-collector.sh.j2` | Script computes working-set memory from cgroup memory.current minus inactive_file, initializes it conditionally, and extends Prometheus output to emit `registry_container_memory_working_set_bytes` with HELP/TYPE declarations. |
| **Dashboard panel and metric documentation**<br>`ansible/roles/hv-metrics-server/files/dashboards/bastion-dashboard.json`, `docs/hv-metrics.md` | Dashboard panel targets now display `registry_container_memory_working_set_bytes` ("working set") alongside `registry_container_memory_usage_bytes` ("total incl. page cache"). Documentation updates the panel description and adds a detailed "Registry Memory Metrics" section explaining the working-set vs. total calculations and their cgroup derivation. |
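As a rough picture of the dashboard change summarized above, the two panel targets could be expressed as follows. The metric names and legend strings are taken from this summary; the `refId` letters and the overall structure are assumptions based on the usual shape of a Grafana Prometheus target, since the actual contents of `bastion-dashboard.json` are not shown here.

```python
import json

# Two series on one panel: working set first (primary), then the raw total.
targets = [
    {
        "refId": "A",  # assumed identifier
        "expr": "registry_container_memory_working_set_bytes",
        "legendFormat": "working set",
    },
    {
        "refId": "B",  # assumed identifier
        "expr": "registry_container_memory_usage_bytes",
        "legendFormat": "total incl. page cache",
    },
]

panel_fragment = json.dumps({"targets": targets}, indent=2)
print(panel_fragment)
```

Listing the working-set series first keeps it as the primary line in the legend, which matches the intent stated in the PR description.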

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • redhat-performance/jetlag#836: The foundation PR that introduced bastion registry monitoring; this PR extends it with split working-set and total memory metrics.

Suggested labels

lgtm, approved

Suggested reviewers

  • agurenko
  • mcornea

Poem

🐰 A metric split in two, so clear and bright—
Working set and total, each its own light.
From cgroup shadows, memory takes flight,
Dashboard panels glow with dual insight! 📊✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: replacing raw cgroup memory reporting with a working set memory metric for the registry container. It captures the primary intent of the PR. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@akrzos akrzos force-pushed the fix/registry-memory-working-set branch 2 times, most recently from 38beb3f to 6a5ede9 Compare June 10, 2026 17:32
@akrzos akrzos marked this pull request as ready for review June 10, 2026 17:36

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/hv-metrics.md`:
- Line 158: The citation for "working set" has mismatched issue numbers: the
link text shows containers/common#2455 while the URL points to .../issues/2454;
update the reference so both the link text and the URL use the same issue number
(choose either `#2454` or `#2455` consistently) by editing the markdown near the
"working set" line (the link text `containers/common#2455` and the URL
`.../issues/2454`) so they match.
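A mismatch like the one flagged above can be caught mechanically. The following is a small sketch, assuming links of the form `[org/repo#N](…/issues/M)`; the regular expression, the function name, and the `example.invalid` host in the sample are all hypothetical, and the real URL in `docs/hv-metrics.md` is elided in this review and is not reconstructed here.

```python
import re

# Matches markdown links whose text is "org/repo#N" and whose URL ends in /issues/M.
LINK = re.compile(r"\[([\w.-]+/[\w.-]+)#(\d+)\]\((\S*?/issues/(\d+))\)")


def mismatched_issue_links(markdown: str) -> list[tuple[str, str, str]]:
    """Return (repo, number_in_text, number_in_url) for links whose numbers differ."""
    return [
        (m.group(1), m.group(2), m.group(4))
        for m in LINK.finditer(markdown)
        if m.group(2) != m.group(4)
    ]


# Placeholder host; only the two issue numbers mirror the review comment.
sample = "see [containers/common#2455](https://example.invalid/containers/common/issues/2454)"
print(mismatched_issue_links(sample))  # [('containers/common', '2455', '2454')]
```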

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b84edafe-abac-4535-bfbe-3121fe0e526b

📥 Commits

Reviewing files that changed from the base of the PR and between 70cd49c and 6a5ede9.

📒 Files selected for processing (3)
  • ansible/roles/hv-metrics-server/files/dashboards/bastion-dashboard.json
  • ansible/roles/hv-metrics-server/templates/registry-traffic-collector.sh.j2
  • docs/hv-metrics.md

Comment thread docs/hv-metrics.md Outdated
…up usage

The registry container memory panel was reporting cgroup memory.current
which includes reclaimable page cache, inflating reported usage far beyond
actual container memory consumption. Add a working_set metric computed as
memory.current - inactive_file (the standard calculation used by podman
stats and Docker stats per containers/common#2455) and display it as the
primary series on the bastion dashboard.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@akrzos akrzos force-pushed the fix/registry-memory-working-set branch from 6a5ede9 to 87af9ae Compare June 10, 2026 18:13
@akrzos akrzos requested review from agurenko and mcornea and removed request for josecastillolema and jtaleric June 10, 2026 19:55

@agurenko agurenko left a comment

Collaborator

lgtm

@mcornea

mcornea commented Jun 11, 2026

Collaborator

/lgtm

@mcornea

mcornea commented Jun 11, 2026

Collaborator

/approve

@openshift-ci

openshift-ci Bot commented Jun 11, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: agurenko, mcornea

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot Bot merged commit fef075c into redhat-performance:main Jun 11, 2026
2 checks passed
@akrzos akrzos deleted the fix/registry-memory-working-set branch June 11, 2026 13:46
3 participants