Skip to content

Add file permissions for 'cpu,cpuacct' cgroup#20108

Merged
cwperks merged 1 commit intoopensearch-project:mainfrom
andrross:add-cgroup-file-perms
Dec 2, 2025
Merged

Add file permissions for 'cpu,cpuacct' cgroup#20108
cwperks merged 1 commit intoopensearch-project:mainfrom
andrross:add-cgroup-file-perms

Conversation

@andrross
Copy link
Copy Markdown
Member

@andrross andrross commented Nov 26, 2025

This change is required to get CPU stat reporting to work on certain linux distributions.

Related Issues

Resolves #19120

Check List

  • Functionality includes testing.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

  • Chores

    • Broadened system security permissions to allow additional read access for CPU-related cgroup paths, improving resource and CPU usage monitoring.
  • Documentation

    • Added a changelog entry documenting a fix for negative CPU usage values reported in node statistics.

✏️ Tip: You can customize this high-level summary in your review settings.

@andrross andrross requested a review from a team as a code owner November 26, 2025 23:16
@github-actions github-actions bot added bug Something isn't working Clients Clients within the Core repository such as High level Rest client and low level client labels Nov 26, 2025
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Nov 26, 2025

Walkthrough

Added three read FilePermission entries for additional cgroup paths in the security policy and updated the changelog with an Unreleased Fixed entry for negative CPU usage values in node stats (issue #19120).

Changes

Cohort / File(s) Summary
Security Policy Permissions
server/src/main/resources/org/opensearch/bootstrap/security.policy
Added three FilePermission entries granting read access to /sys/fs/cgroup/cpu,cpuacct, /sys/fs/cgroup/cpu,cpuacct/-, and /sys/fs/cgroup/cpuset/-.
Changelog
CHANGELOG.md
Updated an existing Fixed entry reference and added an Unreleased Fixed entry: "Fix negative CPU usage values in node stats ([#19120])".

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

  • Verify the cgroup paths (including the combined cpu,cpuacct syntax) are intentional and correctly formatted.
  • Confirm changelog entry placement and wording follow project conventions.

Poem

🐰
I hopped through cgroup fields at night,
Gave three small reads the gentle light,
Stats no longer slip or frown,
Numbers hop back up, not down,
— a rabbit’s twitchy code delight.

Pre-merge checks and finishing touches

❌ Failed checks (2 inconclusive)
Check name Status Explanation Resolution
Description check ❓ Inconclusive The description provides context for the change and references issue #19120, but omits the Description section from the template that should detail what the change achieves. Add a Description section explaining that the change adds file permissions for cpu,cpuacct cgroup paths to fix CPU stat reporting on certain Linux distributions.
Out of Scope Changes check ❓ Inconclusive The CHANGELOG.md modification documenting the fix is appropriate for this PR, but additional unrelated PR reference fix appears tangential. Clarify whether the EngineConfig toBuilder reference fix in CHANGELOG.md is intentional or should be submitted as a separate PR.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: adding file permissions for the cpu,cpuacct cgroup to enable CPU stat reporting.
Linked Issues check ✅ Passed The PR adds cgroup file permissions for cpu,cpuacct paths as required by issue #19120 to fix negative CPU values in node stats reporting.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
CHANGELOG.md (1)

93-93: LGTM! Clear documentation of the fix.

The changelog entry accurately documents the fix for negative CPU usage values.

Optional: For consistency with some other entries (e.g., line 91), consider adding # before the issue number:

-- Fix negative CPU usage values in node stats ([19120](https://github.com/opensearch-project/OpenSearch/issues/19120))
+- Fix negative CPU usage values in node stats ([#19120](https://github.com/opensearch-project/OpenSearch/issues/19120))
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d69581 and e8d64d6.

📒 Files selected for processing (2)
  • CHANGELOG.md (1 hunks)
  • server/src/main/resources/org/opensearch/bootstrap/security.policy (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: Analyze (java)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: detect-breaking-change
🔇 Additional comments (2)
server/src/main/resources/org/opensearch/bootstrap/security.policy (2)

245-246: LGTM! Essential permissions for cpu,cpuacct combined mount.

These permissions address the reported issue where CPU stats return -1 on distributions (like RHEL 8.10) that mount cpu and cpuacct together as "cpu,cpuacct" instead of separately. The pattern correctly follows the existing convention of granting both base directory and recursive read permissions.


247-247: The cpuset permission is correctly configured — base directory permission is not needed.

The review comment misunderstands Java FilePermission semantics. The pattern /sys/fs/cgroup/cpuset/- grants recursive read access to all files and subdirectories under cpuset, which is exactly what the JVM needs to read cpuset.cpus and cpuset.cpus.effective files.

The base permission /sys/fs/cgroup/cpuset (without the - suffix) grants read access only to the directory itself (metadata/listing), not to files within it. Since OpenSearch only needs the JVM to read specific cpuset files nested in subdirectories, the recursive form alone is both necessary and sufficient.

While other cgroup controllers (cpu, cpuacct) include both base and recursive permissions, this appears to be defensive consistency rather than a functional requirement—the recursive permission already covers the required access. No inconsistency or missing permission exists here.

@github-actions
Copy link
Copy Markdown
Contributor

✅ Gradle check result for e8d64d6: SUCCESS

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Dec 1, 2025

❌ Gradle check result for 623c314: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Dec 1, 2025

❌ Gradle check result for 623c314: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

This change is required to get CPU stat reporting to work on certain
linux distributions.

Signed-off-by: Andrew Ross <andrross@amazon.com>
@andrross andrross force-pushed the add-cgroup-file-perms branch from 623c314 to a979577 Compare December 1, 2025 21:47
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 623c314 and a979577.

📒 Files selected for processing (2)
  • CHANGELOG.md (1 hunks)
  • server/src/main/resources/org/opensearch/bootstrap/security.policy (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • server/src/main/resources/org/opensearch/bootstrap/security.policy
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: Analyze (java)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
🔇 Additional comments (1)
CHANGELOG.md (1)

97-97: Consider referencing the PR instead of the issue for consistency.

While the entry correctly describes the fix and matches the earlier suggestion, it references issue #19120 rather than PR #20108. The CHANGELOG convention in this file is to reference the PR that made the change (as seen in lines 75–96), not the originating issue. This may be intentional, but verify whether the PR reference should be used instead.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Dec 1, 2025

❕ Gradle check result for a979577: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

@codecov
Copy link
Copy Markdown

codecov bot commented Dec 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.30%. Comparing base (0197084) to head (a979577).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #20108      +/-   ##
============================================
+ Coverage     73.23%   73.30%   +0.07%     
- Complexity    71621    71683      +62     
============================================
  Files          5787     5787              
  Lines        327853   327854       +1     
  Branches      47218    47218              
============================================
+ Hits         240090   240335     +245     
+ Misses        68505    68221     -284     
- Partials      19258    19298      +40     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working Clients Clients within the Core repository such as High level Rest client and low level client

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] (v3.2.0) _cat/nodes API shows negative values in cpu stats

3 participants