[Metricbeat] Add memory PSI metrics for cgroupv2 #48054
orestisfl merged 13 commits into elastic:main
Add memory pressure PSI metrics to the system.process.cgroup.memory
metricset, complementing the existing CPU and IO pressure metrics.
New fields added under system.process.cgroup.memory.pressure:
- some.{10,60,300}.pct - Share of time in which at least one task was stalled on memory, averaged over the last 10, 60, and 300 seconds
- some.total - Cumulative time in which at least one task was stalled
- full.{10,60,300}.pct - Share of time in which all non-idle tasks were stalled at once, over the same windows
- full.total - Cumulative time in which all non-idle tasks were stalled
Closes elastic#47604
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
colleenmcginnis left a comment:
Starting with v9.0, there is no longer a new documentation set published with every minor release: the same page stays valid over time and shows version-related evolutions (ref). As a result, we add version information in the fields.yml and it adds version badges to the generated Markdown file. Read more in Contributing to the docs.
Based on the backport labels, I assumed these changes are targeting 9.3.0, but feel free to adjust as needed. After applying the suggestions you'll have to regenerate the docs.
Co-authored-by: Colleen McGinnis <colleen.j.mcginnis@gmail.com>
You'll have to regenerate the docs.
@Mergifyio backport 8.19 9.2 9.3
✅ Backports have been created
Cherry-pick of c3f35a9 has failed: To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
## Proposed commit message
Add memory pressure PSI metrics to the system.process.cgroup.memory
metricset, complementing the existing CPU and IO pressure metrics.
New fields added under system.process.cgroup.memory.pressure:
- some.{10,60,300}.pct - Share of time in which at least one task was stalled on memory, averaged over the last 10, 60, and 300 seconds
- some.total - Cumulative time in which at least one task was stalled
- full.{10,60,300}.pct - Share of time in which all non-idle tasks were stalled at once, over the same windows
- full.total - Cumulative time in which all non-idle tasks were stalled
Closes #47604
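For orientation, the kernel exposes these values in each cgroup v2 directory as a `memory.pressure` file with one `some` line and one `full` line. The following is a minimal Python sketch of how such a file maps onto the field layout listed above. It is not the Metricbeat implementation (which is Go code in elastic/elastic-agent-system-metrics); `parse_pressure` is a hypothetical helper, and keeping the averages exactly as the kernel prints them (0 to 100) is an assumption, since this page does not say how `pct` is scaled.

```python
from typing import Dict


def parse_pressure(text: str) -> Dict[str, dict]:
    """Parse the text of a cgroup v2 `memory.pressure` file (hypothetical helper)."""
    result: Dict[str, dict] = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        kind, fields = parts[0], parts[1:]  # kind is "some" or "full"
        entry: dict = {}
        for field in fields:
            key, _, value = field.partition("=")
            if key.startswith("avg"):
                # avg10 -> {"10": {"pct": ...}}; value kept as printed by the kernel (assumption)
                entry[key[3:]] = {"pct": float(value)}
            elif key == "total":
                # Cumulative stall time; the kernel reports it in microseconds
                entry["total"] = int(value)
        result[kind] = entry
    return result


# Sample content in the kernel's PSI format; the numbers are made up for illustration.
sample = (
    "some avg10=1.50 avg60=0.75 avg300=0.10 total=123456\n"
    "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n"
)
pressure = parse_pressure(sample)
print(pressure["some"]["10"]["pct"], pressure["some"]["total"])
```

The nested result has the same shape as the `pressure` object shown in the verification step of the test instructions.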
## How to test this PR locally
### 1. Build Metricbeat
```bash
cd metricbeat
go build .
```
### 2. Create Test Configuration
Save the following as `/tmp/metricbeat-psi-test.yml`, the path used in step 3:
```yaml
metricbeat.modules:
  - module: system
    period: 5s
    metricsets:
      - process
    processes: ['.*']
    process.cgroups.enabled: true

output.console:
  pretty: true
```
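Since the new fields are specific to cgroup v2, it can help to confirm that the test host actually exposes memory PSI before running Metricbeat. The sketch below checks two well-known kernel paths: `/sys/fs/cgroup/cgroup.controllers`, which exists at the root of a cgroup v2 unified hierarchy, and `/proc/pressure/memory`, which exists when PSI is enabled. `psi_prerequisites` and its `root` parameter are hypothetical names introduced here for illustration; they are not part of this PR.

```python
from pathlib import Path


def psi_prerequisites(root: str = "/") -> dict:
    """Report whether cgroup v2 and memory PSI appear to be available (hypothetical helper).

    `root` exists only so the check can be pointed at a fake filesystem tree in tests.
    """
    base = Path(root)
    return {
        # Present at the root of a cgroup v2 unified hierarchy
        "cgroup_v2": (base / "sys/fs/cgroup/cgroup.controllers").is_file(),
        # Present when the kernel has PSI enabled
        "memory_psi": (base / "proc/pressure/memory").is_file(),
    }


print(psi_prerequisites())
```

If either value is `False`, the memory pressure fields are unlikely to show up in the output, whatever the Metricbeat configuration.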
### 3. Run Metricbeat
```bash
./metricbeat -e -c /tmp/metricbeat-psi-test.yml
```
### 4. Verify Memory Pressure Fields
Look for `system.process.cgroup.memory.pressure` in the output:
```json
"memory": {
  "pressure": {
    "some": {
      "10": { "pct": 0 },
      "60": { "pct": 0 },
      "300": { "pct": 0 },
      "total": 0
    },
    "full": {
      "10": { "pct": 0 },
      "60": { "pct": 0 },
      "300": { "pct": 0 },
      "total": 0
    }
  }
}
```
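Checking the output by eye gets tedious with many processes, so the verification can also be scripted. This is a minimal sketch, assuming each event has already been parsed into a Python dict; `missing_pressure_fields` is a hypothetical helper, not something shipped with Metricbeat, and the example events are synthetic.

```python
from typing import List

BASE = "system.process.cgroup.memory.pressure"
WINDOWS = ("10", "60", "300")


def missing_pressure_fields(event: dict) -> List[str]:
    """Return the expected memory pressure field paths that are absent from one event."""
    node = event
    for key in BASE.split("."):
        if not isinstance(node, dict) or key not in node:
            return [BASE]  # the whole pressure object is missing
        node = node[key]
    if not isinstance(node, dict):
        return [BASE]
    missing = []
    for kind in ("some", "full"):
        group = node.get(kind) or {}
        for window in WINDOWS:
            if "pct" not in (group.get(window) or {}):
                missing.append(f"{BASE}.{kind}.{window}.pct")
        if "total" not in group:
            missing.append(f"{BASE}.{kind}.total")
    return missing


# Synthetic events mirroring the JSON fragment above.
group = {"10": {"pct": 0}, "60": {"pct": 0}, "300": {"pct": 0}, "total": 0}
event = {"system": {"process": {"cgroup": {"memory": {
    "pressure": {"some": dict(group), "full": dict(group)}}}}}}
incomplete = {"system": {"process": {"cgroup": {"memory": {
    "pressure": {"some": dict(group)}}}}}}

print(missing_pressure_fields(event))
print(missing_pressure_fields(incomplete))
```

Events for processes without cgroup information would report the whole `system.process.cgroup.memory.pressure` path as missing, which is expected rather than a failure.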
### 5. Compare Before/After (Optional)
[compare-psi-metrics.sh](https://github.com/user-attachments/files/24191696/compare-psi-metrics.sh)
Use the comparison script to compare output from main vs this PR:
```
compare-psi-metrics.sh
Usage: ./compare-psi-metrics.sh <main-output.ndjson> <pr-output.ndjson>
```
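The contents of the attached script are not reproduced on this page. As a rough illustration of what such a comparison involves, the Python sketch below lists the field paths that appear in the PR output but not in the main output. It is a hypothetical stand-in, not the attached `compare-psi-metrics.sh`; `field_paths`, `all_paths`, and `new_fields` are names made up here, and the input is assumed to be one JSON event per line.

```python
import json
from typing import Iterable, List, Set


def field_paths(obj, prefix: str = "") -> Set[str]:
    """Flatten a nested dict into dotted leaf paths, e.g. 'a.b.c'."""
    paths: Set[str] = set()
    if isinstance(obj, dict):
        for key, value in obj.items():
            paths |= field_paths(value, f"{prefix}.{key}" if prefix else key)
    else:
        paths.add(prefix)
    return paths


def all_paths(lines: Iterable[str]) -> Set[str]:
    """Collect the leaf paths seen across all NDJSON lines."""
    paths: Set[str] = set()
    for line in lines:
        line = line.strip()
        if line:
            paths |= field_paths(json.loads(line))
    return paths


def new_fields(main_lines: Iterable[str], pr_lines: Iterable[str]) -> List[str]:
    """Field paths present in the PR output but absent from the main output."""
    return sorted(all_paths(pr_lines) - all_paths(main_lines))


# Synthetic single-event inputs for illustration.
main_lines = ['{"system": {"process": {"cgroup": {"cpu": {"pressure": {"some": {"total": 1}}}}}}}']
pr_lines = ['{"system": {"process": {"cgroup": {"cpu": {"pressure": {"some": {"total": 1}}}, '
            '"memory": {"pressure": {"some": {"total": 2}}}}}}}']
print(new_fields(main_lines, pr_lines))
```

Because the sketch reads one event per line, it would need console output captured with `pretty: false`; the pretty-printed output from the test configuration spans multiple lines per event.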
## Related issues
- Requires elastic/elastic-agent-system-metrics#274
- Closes #47604
(cherry picked from commit c3f35a9)
# Conflicts:
# NOTICE.txt
# docs/reference/metricbeat/exported-fields-system.md
# docs/reference/metricbeat/metricbeat-metricset-system-process.md
# go.mod
# go.sum
# metricbeat/module/system/fields.go
I'm adding backports due to failures on other backports like #48858
…v2 (#49056)
* [Metricbeat] Add memory PSI metrics for cgroupv2 (#48054)
(cherry picked from commit c3f35a9)
# Conflicts:
# NOTICE.txt
# go.mod
# go.sum
* Resolve conflicts
Co-authored-by: Orestis Floros <orestis.floros@elastic.co>
Co-authored-by: Denis Rechkunov <denis.rechkunov@elastic.co>
…pv2 (#49055)
* [Metricbeat] Add memory PSI metrics for cgroupv2 (#48054)
(cherry picked from commit c3f35a9)
# Conflicts:
# NOTICE.txt
# docs/reference/metricbeat/exported-fields-system.md
# docs/reference/metricbeat/metricbeat-metricset-system-process.md
# go.mod
# go.sum
# metricbeat/module/system/fields.go
* Resolve conflicts
Co-authored-by: Orestis Floros <orestis.floros@elastic.co>
Co-authored-by: Denis Rechkunov <denis.rechkunov@elastic.co>