Comparing changes
base repository: elastic/elastic-agent-system-metrics
base: v0.14.1
head repository: elastic/elastic-agent-system-metrics
compare: v0.14.2
- 6 commits
- 52 files changed
- 1 contributor
Commits on Feb 16, 2026
## What does this PR do?

Adds support for zswap tracking:

1. From `/proc/vmstat`
2. From `/proc/meminfo`
3. From `/sys/kernel/debug/zswap/`
4. From `/sys/fs/cgroup/<cgroup name>/memory.stat`

The testing here is not trivial because:

1. zswap might be completely disabled at compile time: https://github.com/torvalds/linux/blob/1f97d9dcf53649c41c33227b345a36902cbb08ad/mm/memcontrol.c#L1346-L1348
2. debugfs (`/sys/kernel/debug/`) might not be mounted at all
3. Access to `/sys/kernel/debug/` requires root

To deal with this situation, `metric/memory/memory_integration_test.go` decides on the expected result based on `BUILDKITE_STEP_KEY`. This makes it slightly harder to add new images to our pipeline (the `ciExpectations` map needs to be updated) but ensures we are actually reading zswap debug metrics somewhere in our tests.

Reviewer note: Please check my nesting of the `debug` metrics; it could be that we prefer them at the same level.

## Why is it important?

Adds important kernel metrics of the zswap compressed swap backend. See elastic/beats#47605 for more info.

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] ~~I have added an entry in `CHANGELOG.md`~~

## Testing Instructions

### 1. Local integration tests

All tests should run and succeed locally by default. To check for actual access to all metrics, use:

```shell
$ go test -c ./metric/memory -o memory.test
$ sudo BUILDKITE_STEP_KEY=manual PRIVILEGED=1 ./memory.test -test.run TestMemoryFromContainer -test.v
=== RUN   TestMemoryFromContainer
    memory_integration_test.go:64: Total: 33323646976, Free: 4660809728, Used: 28662837248
    memory_integration_test.go:70: Zswap exists: true, Debug exists: true (BUILDKITE_STEP_KEY="manual")
    memory_integration_test.go:72: Zswap: Compressed=557056 bytes, Uncompressed=217088 bytes
    memory_integration_test.go:72: Zswap debug: StoredPages=53, PoolTotalSize=557056
--- PASS: TestMemoryFromContainer (0.00s)
PASS
$ sudo setcap cap_dac_read_search=+ep memory.test
$ BUILDKITE_STEP_KEY=manual PRIVILEGED=1 ./memory.test -test.run TestMemoryFromContainer -test.v
=== RUN   TestMemoryFromContainer
    memory_integration_test.go:67: Total: 33323646976, Free: 19951370240, Used: 13372276736
    memory_integration_test.go:73: Zswap exists: true, Debug exists: true (BUILDKITE_STEP_KEY="manual")
    memory_integration_test.go:75: Zswap: Compressed=430080 bytes, Uncompressed=106496 bytes
    memory_integration_test.go:75: Zswap debug: StoredPages=26, PoolTotalSize=430080
--- PASS: TestMemoryFromContainer (0.00s)
PASS
```

### 2. Metricbeat -- memory metricset

```yaml
metricbeat.modules:
  - module: system
    period: 1s
    metricsets:
      - memory
output.console:
  pretty: true
```

Test this config:

- With old elastic-agent-system-metrics
- Without any extra permissions
- With `sudo`
- With `sudo setcap cap_dac_read_search=+ep ./metricbeat`
- After `sudo umount debugfs` -> debug stats disabled

<details>
<summary>Before</summary>

```json
{
  "@metadata": { "beat": "metricbeat", "type": "_doc", "version": "9.4.0" },
  "@timestamp": "2026-01-28T09:43:47.196Z",
  "agent": {
    "ephemeral_id": "1242f8a9-2768-4582-82e2-f1a605d4f182",
    "id": "8b35a267-5085-44a8-a82e-e43a4af9af11",
    "name": "laptop",
    "type": "metricbeat",
    "version": "9.4.0"
  },
  "ecs": { "version": "8.0.0" },
  "event": { "dataset": "system.memory", "duration": 127644, "module": "system" },
  "host": { "name": "laptop" },
  "metricset": { "name": "memory", "period": 1000 },
  "service": { "type": "system" },
  "system": {
    "memory": {
      "actual": { "free": 18093944832, "used": { "bytes": 15229702144, "pct": 0.457 } },
      "cached": 14415081472,
      "free": 6065573888,
      "swap": { "free": 20558376960, "total": 21474832384, "used": { "bytes": 916455424, "pct": 0.0427 } },
      "total": 33323646976,
      "used": { "bytes": 27258073088, "pct": 0.818 }
    }
  }
}
```

</details>

<details>
<summary>Unprivileged output</summary>

```json
{
  "@metadata": { "beat": "metricbeat", "type": "_doc", "version": "9.4.0" },
  "@timestamp": "2026-01-28T09:41:08.897Z",
  "agent": {
    "ephemeral_id": "9aa025fc-2b19-4e40-82d5-7603b3da8b99",
    "id": "8b35a267-5085-44a8-a82e-e43a4af9af11",
    "name": "laptop",
    "type": "metricbeat",
    "version": "9.4.0"
  },
  "ecs": { "version": "8.0.0" },
  "event": { "dataset": "system.memory", "duration": 127572, "module": "system" },
  "host": { "name": "laptop" },
  "metricset": { "name": "memory", "period": 1000 },
  "service": { "type": "system" },
  "system": {
    "memory": {
      "actual": { "free": 17940357120, "used": { "bytes": 15383289856, "pct": 0.4616 } },
      "cached": 15113895936,
      "free": 5096550400,
      "swap": { "free": 20952903680, "total": 21474832384, "used": { "bytes": 521928704, "pct": 0.0243 } },
      "total": 33323646976,
      "used": { "bytes": 28227096576, "pct": 0.8471 },
      "zswap": { "compressed": 112336896, "uncompressed": 464326656 }
    }
  }
}
```

</details>

<details>
<summary>Privileged output</summary>

```json
{
  "@metadata": { "beat": "metricbeat", "type": "_doc", "version": "9.4.0" },
  "@timestamp": "2026-01-28T09:48:09.372Z",
  "agent": {
    "ephemeral_id": "55e06c15-f7e0-44bf-b361-28e69015ec34",
    "id": "8b35a267-5085-44a8-a82e-e43a4af9af11",
    "name": "laptop",
    "type": "metricbeat",
    "version": "9.4.0"
  },
  "ecs": { "version": "8.0.0" },
  "event": { "dataset": "system.memory", "duration": 181346, "module": "system" },
  "host": { "name": "laptop" },
  "metricset": { "name": "memory", "period": 1000 },
  "service": { "type": "system" },
  "system": {
    "memory": {
      "actual": { "free": 18898096128, "used": { "bytes": 14425550848, "pct": 0.4329 } },
      "cached": 13020823552,
      "free": 7782682624,
      "swap": { "free": 20104949760, "total": 21474832384, "used": { "bytes": 1369882624, "pct": 0.0638 } },
      "total": 33323646976,
      "used": { "bytes": 25540964352, "pct": 0.7665 },
      "zswap": {
        "compressed": 312053760,
        "debug": {
          "pool_limit_hit": 0,
          "pool_total_size": 312053760,
          "reject_alloc_fail": 0,
          "reject_compress_fail": 8665,
          "reject_compress_poor": 0,
          "reject_kmemcache_fail": 0,
          "reject_reclaim_fail": 0,
          "stored_pages": 304752,
          "written_back_pages": 0
        },
        "uncompressed": 1248264192
      }
    }
  }
}
```

</details>

### 3. Metricbeat -- processes metricset

Add:

```yaml
      - process
    processes: ['.*']
    process.cgroups.enabled: true
```

to get zswap stats for the process metricset. Check the output with:

```shell
$ ./metricbeat -c metricbeat.yaml | jq '(.system.process.cgroup.memory.stats // {}) | with_entries(select(.key | IN("zswap","zswapped","zswpin","zswpout","zswpwb"))) | select(length > 0)'
...
{
  "zswpout": 412,
  "zswpwb": 0,
  "zswapped": { "bytes": 1249280 },
  "zswap": { "bytes": 283113 },
  "zswpin": 107
}
...
```

## Related issues

- Relates elastic/beats#47605
Commit 7613049
Commits on Feb 17, 2026
PULL_REQUEST_TEMPLATE: Remove reference to non-existent CHANGELOG.md (#286)

## What does this PR do?

We don't keep a changelog in this repo. [Releases](github.com/elastic/elastic-agent-system-metrics/releases) automatically generate a changelog from commits, and the beats repo, which imports this module, is responsible for the changelog.

## Why is it important?

Contributor clarity

## Checklist

- [ ] My code follows the style guidelines of this project
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have added tests that prove my fix is effective or that my feature works
Commit c5bc299
Commits on Feb 19, 2026
fix: remove shared context timeout from matrix tests (#285)
## What does this PR do?

Remove per-test context timeouts and rely on the `go test -timeout` flag (20m in CI). Bump sleep durations to 1200s to outlast the test run.

## Why is it important?

The container system tests used a shared 5-minute `context.WithTimeout` for the entire permission matrix. On slower CI workers, the 8-case matrix in `TestProcessAllSettings` recently started exceeding this limit, causing "context deadline exceeded" failures: https://buildkite.com/elastic/elastic-agent-system-metrics/builds/929

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] ~~I have added an entry in `CHANGELOG.md`~~

## Related issues

CI failed in main and elastic/beats#48834
Commit 01e4c15
Commits on Feb 20, 2026
Enable modernize linter and apply suggestions (#287)
## What does this PR do?

Add the `modernize` golangci-lint analyzer (with omitzero disabled for now) and fix all findings: strings.Split → SplitSeq iterators, interface{} → any, manual min/contains → builtins and slices package, reflect.TypeOf().Elem() → TypeFor[T](). Also fix a prealloc finding.

Consolidate the two golangci-lint CI jobs into a single matrix job and bump golangci-lint to v2.10.1. This switches to running golangci-lint fully on _all_ files which should be okay given the small-ish size of this repo.

## Why is it important?

Introduces optimizations, consolidates Go syntax into new form, catches omitzero bug.

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] ~~I have added tests that prove my fix is effective or that my feature works~~
Commit d2c71a8
Commits on Feb 26, 2026
Use json omitzero for struct-typed fields in cgroup structs (#288)
## What does this PR do?

encoding/json's omitempty is silently ignored on struct fields: zero-valued structs are always serialized regardless of the tag. Go 1.24's omitzero fixes this by checking IsZero() or the zero value.

## Why is it important?

No downstream production code is affected (consumers use struct tags via go-structform, not json tags), but this corrects a latent bug if these types are ever json.Marshal'd directly.

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have added tests that prove my fix is effective or that my feature works

## Related issues

- Blocks #281
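
The difference between the two tags is easy to see with a struct-typed field. The types below are hypothetical stand-ins, not the repository's cgroup structs, and the example assumes Go 1.24 or newer, where `omitzero` is recognized by `encoding/json`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cpuStats is a hypothetical nested struct used only for this illustration.
type cpuStats struct {
	Usage uint64 `json:"usage"`
}

// withOmitEmpty tags a struct-typed field with omitempty, which has no effect
// on structs: the zero value is still serialized.
type withOmitEmpty struct {
	ID  string   `json:"id"`
	CPU cpuStats `json:"cpu,omitempty"`
}

// withOmitZero uses omitzero, which drops the field when it is the zero value.
type withOmitZero struct {
	ID  string   `json:"id"`
	CPU cpuStats `json:"cpu,omitzero"`
}

// marshal is a small helper that returns the JSON encoding as a string.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshal(withOmitEmpty{ID: "a"})) // {"id":"a","cpu":{"usage":0}}
	fmt.Println(marshal(withOmitZero{ID: "a"}))  // {"id":"a"}
}
```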
Commit 4c778fb
[cgv2] Add CPU CFS quota, period, and weight metrics (#281)
## What does this PR do?

Add support for collecting CPU bandwidth control settings from cgroupv2:

- cpu.max (quota and period in microseconds)
- cpu.weight (relative weight, replaces shares from v1)

## Why is it important?

This brings cgroupv2 to feature parity with cgroupv1 for CPU limit metrics.

## Checklist

- [x] My code follows the style guidelines of this project
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have added tests that prove my fix is effective or that my feature works
- [ ] ~~I have added an entry in `CHANGELOG.md`~~

## Related issues

- Relates elastic/beats#47708
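
To make the two files concrete: in cgroup v2, `cpu.max` holds two fields, the quota (or the literal `max` for no limit) followed by the period, and `cpu.weight` holds a single integer. The sketch below parses both. It is only an illustration under those format assumptions; `cpuMax`, `parseCPUMax`, and `parseCPUWeight` are hypothetical names, not the functions added by this PR.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cpuMax is a hypothetical representation of the contents of cpu.max.
type cpuMax struct {
	QuotaUs   uint64 // CPU time allowed per period, in microseconds
	PeriodUs  uint64 // length of the enforcement period, in microseconds
	Unlimited bool   // true when the quota field is the literal "max"
}

// parseCPUMax parses the "<quota|max> <period>" format of cpu.max.
func parseCPUMax(content string) (cpuMax, error) {
	fields := strings.Fields(content)
	if len(fields) != 2 {
		return cpuMax{}, fmt.Errorf("cpu.max: expected 2 fields, got %d", len(fields))
	}
	var out cpuMax
	var err error
	if fields[0] == "max" {
		out.Unlimited = true
	} else if out.QuotaUs, err = strconv.ParseUint(fields[0], 10, 64); err != nil {
		return cpuMax{}, fmt.Errorf("cpu.max quota: %w", err)
	}
	if out.PeriodUs, err = strconv.ParseUint(fields[1], 10, 64); err != nil {
		return cpuMax{}, fmt.Errorf("cpu.max period: %w", err)
	}
	return out, nil
}

// parseCPUWeight parses the single integer stored in cpu.weight.
func parseCPUWeight(content string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(content), 10, 64)
}

func main() {
	limited, _ := parseCPUMax("50000 100000\n")
	fmt.Printf("%+v\n", limited) // {QuotaUs:50000 PeriodUs:100000 Unlimited:false}

	unlimited, _ := parseCPUMax("max 100000\n")
	fmt.Printf("%+v\n", unlimited) // {QuotaUs:0 PeriodUs:100000 Unlimited:true}

	weight, _ := parseCPUWeight("100\n")
	fmt.Println(weight) // 100
}
```

A quota of 50000 with a period of 100000 corresponds to half of one CPU. Keeping `Unlimited` as a separate flag avoids overloading a quota of zero to mean "no limit".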
Commit 69b8af0
You can try running this command locally to see the comparison on your machine:
git diff v0.14.1...v0.14.2