
Comparing changes

base repository: elastic/elastic-agent-system-metrics
base: v0.14.1
head repository: elastic/elastic-agent-system-metrics
compare: v0.14.2
  • 6 commits
  • 52 files changed
  • 1 contributor

Commits on Feb 16, 2026

  1. Add Zswap Metrics (#279)

    ## What does this PR do?
    
    Adds support for zswap tracking:
    1. From `/proc/vmstat`
    2. From `/proc/meminfo`
    3. From `/sys/kernel/debug/zswap/`
    4. From `/sys/fs/cgroup/<cgroup name>/memory.stat`
    
    The testing here is not trivial because:
    1. zswap might be completely disabled at compile time:
    https://github.com/torvalds/linux/blob/1f97d9dcf53649c41c33227b345a36902cbb08ad/mm/memcontrol.c#L1346-L1348
    2. debugfs (`/sys/kernel/debug/`) might not be mounted at all
    3. Access to `/sys/kernel/debug/` requires root
    
    To deal with this situation, `metric/memory/memory_integration_test.go`
    decides on the expected result based on `BUILDKITE_STEP_KEY`. This makes
    it slightly harder to add new images to our pipeline (the
    `ciExpectations` map needs to be updated) but ensures we are actually
    reading zswap debug metrics somewhere in our tests.
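
The expectation lookup described above could look roughly like the following sketch. Only the `ciExpectations` name and the `BUILDKITE_STEP_KEY` variable come from the PR text; the step keys, the struct, and the helper are made up for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// expectation is a hypothetical description of what a test run should be able to read.
type expectation struct {
	Zswap bool // basic zswap counters are expected
	Debug bool // debugfs counters are expected
}

// ciExpectations maps a CI step key to its expectation. The keys here are invented.
var ciExpectations = map[string]expectation{
	"hypothetical-ubuntu-step":  {Zswap: true, Debug: true},
	"hypothetical-minimal-step": {Zswap: false, Debug: false},
}

// lookupExpectation reports whether a step key has a registered expectation.
func lookupExpectation(stepKey string) (expectation, bool) {
	e, ok := ciExpectations[stepKey]
	return e, ok
}

func main() {
	key := os.Getenv("BUILDKITE_STEP_KEY")
	if e, ok := lookupExpectation(key); ok {
		fmt.Printf("step %q: expect zswap=%v debug=%v\n", key, e.Zswap, e.Debug)
	} else {
		// Local runs without a known key would only check that collection does not fail.
		fmt.Printf("no expectation registered for step %q\n", key)
	}
}
```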
    
    Reviewer note: Please check my nesting of the `debug` metrics; we may
    prefer them at the same level.
    
    ## Why is it important?
    
    Adds important kernel metrics of the zswap compressed swap backend. See
    elastic/beats#47605 for more info.
    
    ## Checklist
    
    - [x] My code follows the style guidelines of this project
    - [x] I have commented my code, particularly in hard-to-understand areas
    - [x] I have added tests that prove my fix is effective or that my
    feature works
    - [ ] ~~I have added an entry in `CHANGELOG.md`~~
    
    ## Testing Instructions
    
    ### 1. Local integration tests
    All tests should run and succeed locally by default. To check that
    all metrics are actually accessible, use:
    ```shell
    $ go test -c ./metric/memory -o memory.test
    $ sudo BUILDKITE_STEP_KEY=manual PRIVILEGED=1 ./memory.test -test.run TestMemoryFromContainer -test.v
    === RUN   TestMemoryFromContainer
        memory_integration_test.go:64: Total: 33323646976, Free: 4660809728, Used: 28662837248
        memory_integration_test.go:70: Zswap exists: true, Debug exists: true (BUILDKITE_STEP_KEY="manual")
        memory_integration_test.go:72: Zswap: Compressed=557056 bytes, Uncompressed=217088 bytes
        memory_integration_test.go:72: Zswap debug: StoredPages=53, PoolTotalSize=557056
    --- PASS: TestMemoryFromContainer (0.00s)
    PASS
    $ sudo setcap cap_dac_read_search=+ep memory.test
    $ BUILDKITE_STEP_KEY=manual PRIVILEGED=1 ./memory.test -test.run TestMemoryFromContainer -test.v
    === RUN   TestMemoryFromContainer
        memory_integration_test.go:67: Total: 33323646976, Free: 19951370240, Used: 13372276736
        memory_integration_test.go:73: Zswap exists: true, Debug exists: true (BUILDKITE_STEP_KEY="manual")
        memory_integration_test.go:75: Zswap: Compressed=430080 bytes, Uncompressed=106496 bytes
        memory_integration_test.go:75: Zswap debug: StoredPages=26, PoolTotalSize=430080
    --- PASS: TestMemoryFromContainer (0.00s)
    PASS
    ```
    
    ### 2. Metricbeat -- memory metricset
    ```yaml
    metricbeat.modules:
      - module: system
        period: 1s
        metricsets:
          - memory
    
    output.console:
      pretty: true
    ```
    
    Test this config:
    - With old elastic-agent-system-metrics
    - Without any extra permissions
    - With `sudo`
    - With `sudo setcap cap_dac_read_search=+ep ./metricbeat`
    - After `sudo umount debugfs` -> debug stats disabled
    
    <details>
      <summary>Before</summary>
    
    ```json
    {
      "@metadata": {
        "beat": "metricbeat",
        "type": "_doc",
        "version": "9.4.0"
      },
      "@timestamp": "2026-01-28T09:43:47.196Z",
      "agent": {
        "ephemeral_id": "1242f8a9-2768-4582-82e2-f1a605d4f182",
        "id": "8b35a267-5085-44a8-a82e-e43a4af9af11",
        "name": "laptop",
        "type": "metricbeat",
        "version": "9.4.0"
      },
      "ecs": {
        "version": "8.0.0"
      },
      "event": {
        "dataset": "system.memory",
        "duration": 127644,
        "module": "system"
      },
      "host": {
        "name": "laptop"
      },
      "metricset": {
        "name": "memory",
        "period": 1000
      },
      "service": {
        "type": "system"
      },
      "system": {
        "memory": {
          "actual": {
            "free": 18093944832,
            "used": {
              "bytes": 15229702144,
              "pct": 0.457
            }
          },
          "cached": 14415081472,
          "free": 6065573888,
          "swap": {
            "free": 20558376960,
            "total": 21474832384,
            "used": {
              "bytes": 916455424,
              "pct": 0.0427
            }
          },
          "total": 33323646976,
          "used": {
            "bytes": 27258073088,
            "pct": 0.818
          }
        }
      }
    }
    ```
    </details>
    
    <details>
      <summary>Unprivileged output</summary>
    
    ```json
    {
      "@metadata": {
        "beat": "metricbeat",
        "type": "_doc",
        "version": "9.4.0"
      },
      "@timestamp": "2026-01-28T09:41:08.897Z",
      "agent": {
        "ephemeral_id": "9aa025fc-2b19-4e40-82d5-7603b3da8b99",
        "id": "8b35a267-5085-44a8-a82e-e43a4af9af11",
        "name": "laptop",
        "type": "metricbeat",
        "version": "9.4.0"
      },
      "ecs": {
        "version": "8.0.0"
      },
      "event": {
        "dataset": "system.memory",
        "duration": 127572,
        "module": "system"
      },
      "host": {
        "name": "laptop"
      },
      "metricset": {
        "name": "memory",
        "period": 1000
      },
      "service": {
        "type": "system"
      },
      "system": {
        "memory": {
          "actual": {
            "free": 17940357120,
            "used": {
              "bytes": 15383289856,
              "pct": 0.4616
            }
          },
          "cached": 15113895936,
          "free": 5096550400,
          "swap": {
            "free": 20952903680,
            "total": 21474832384,
            "used": {
              "bytes": 521928704,
              "pct": 0.0243
            }
          },
          "total": 33323646976,
          "used": {
            "bytes": 28227096576,
            "pct": 0.8471
          },
          "zswap": {
            "compressed": 112336896,
            "uncompressed": 464326656
          }
        }
      }
    }
    ```
    </details>
    
    <details>
      <summary>Privileged output</summary>
    
    ```json
    {
      "@metadata": {
        "beat": "metricbeat",
        "type": "_doc",
        "version": "9.4.0"
      },
      "@timestamp": "2026-01-28T09:48:09.372Z",
      "agent": {
        "ephemeral_id": "55e06c15-f7e0-44bf-b361-28e69015ec34",
        "id": "8b35a267-5085-44a8-a82e-e43a4af9af11",
        "name": "laptop",
        "type": "metricbeat",
        "version": "9.4.0"
      },
      "ecs": {
        "version": "8.0.0"
      },
      "event": {
        "dataset": "system.memory",
        "duration": 181346,
        "module": "system"
      },
      "host": {
        "name": "laptop"
      },
      "metricset": {
        "name": "memory",
        "period": 1000
      },
      "service": {
        "type": "system"
      },
      "system": {
        "memory": {
          "actual": {
            "free": 18898096128,
            "used": {
              "bytes": 14425550848,
              "pct": 0.4329
            }
          },
          "cached": 13020823552,
          "free": 7782682624,
          "swap": {
            "free": 20104949760,
            "total": 21474832384,
            "used": {
              "bytes": 1369882624,
              "pct": 0.0638
            }
          },
          "total": 33323646976,
          "used": {
            "bytes": 25540964352,
            "pct": 0.7665
          },
          "zswap": {
            "compressed": 312053760,
            "debug": {
              "pool_limit_hit": 0,
              "pool_total_size": 312053760,
              "reject_alloc_fail": 0,
              "reject_compress_fail": 8665,
              "reject_compress_poor": 0,
              "reject_kmemcache_fail": 0,
              "reject_reclaim_fail": 0,
              "stored_pages": 304752,
              "written_back_pages": 0
            },
            "uncompressed": 1248264192
          }
        }
      }
    }
    ```
    </details>
    
    ### 3. Metricbeat -- process metricset
    Add:
    ```yaml
          - process
        processes: ['.*']
        process.cgroups.enabled: true
    ```
    to get zswap stats for the process metricset. Check the output with:
    ```shell
    $ ./metricbeat -c metricbeat.yaml | jq '(.system.process.cgroup.memory.stats // {}) | with_entries(select(.key | IN("zswap","zswapped","zswpin","zswpout","zswpwb"))) | select(length > 0)' 
    ...
    {
      "zswpout": 412,
      "zswpwb": 0,
      "zswapped": {
        "bytes": 1249280
      },
      "zswap": {
        "bytes": 283113
      },
      "zswpin": 107
    }
    ...
    ```
    
    ## Related issues
    
    - Relates elastic/beats#47605
    orestisfl authored Feb 16, 2026
    7613049

Commits on Feb 17, 2026

  1. PULL_REQUEST_TEMPLATE: Remove reference to non-existent CHANGELOG.md (#286)
    
    ## What does this PR do?
    
    We don't keep a changelog in this repo.
    [Releases](https://github.com/elastic/elastic-agent-system-metrics/releases)
    automatically generate a changelog from commits, and the beats repo,
    which imports this module, is responsible for the changelog.
    
    ## Why is it important?
    
    Contributor clarity
    
    ## Checklist
    
    - [ ] My code follows the style guidelines of this project
    - [ ] I have commented my code, particularly in hard-to-understand areas
    - [ ] I have added tests that prove my fix is effective or that my
    feature works
    orestisfl authored Feb 17, 2026
    c5bc299

Commits on Feb 19, 2026

  1. fix: remove shared context timeout from matrix tests (#285)

    ## What does this PR do?
    
    Remove per-test context timeouts and rely on the go test -timeout flag
    (20m in CI). Bump sleep durations to 1200s to outlast the test run.
    
    ## Why is it important?
    
    The container system tests used a shared 5-minute context.WithTimeout
    for the entire permission matrix. On slower CI workers the 8-case matrix
    in TestProcessAllSettings recently started exceeding this, causing
    "context deadline exceeded" failures:
    https://buildkite.com/elastic/elastic-agent-system-metrics/builds/929
    
    ## Checklist
    
    - [x] My code follows the style guidelines of this project
    - [x] I have commented my code, particularly in hard-to-understand areas
    - [x] I have added tests that prove my fix is effective or that my
    feature works
    - [ ] ~~I have added an entry in `CHANGELOG.md`~~
    
    ## Related issues
    
    CI failed in main and elastic/beats#48834
    orestisfl authored Feb 19, 2026
    01e4c15

Commits on Feb 20, 2026

  1. Enable modernize linter and apply suggestions (#287)

    ## What does this PR do?
    Add the `modernize` golangci-lint analyzer (with omitzero disabled for
    now) and fix all findings: strings.Split → SplitSeq iterators,
    interface{} → any, manual min/contains → builtins and slices package,
    reflect.TypeOf().Elem() → TypeFor[T](). Also fix a prealloc finding.
    
    Consolidate the two golangci-lint CI jobs into a single matrix job and
    bump golangci-lint to v2.10.1. This switches to running golangci-lint
    fully on _all_ files which should be okay given the small-ish size of
    this repo.
    
    ## Why is it important?
    
    Introduces optimizations, consolidates Go syntax into new form, catches
    omitzero bug.
    
    ## Checklist
    
    - [x] My code follows the style guidelines of this project
    - [x] I have commented my code, particularly in hard-to-understand areas
    - [ ] ~~I have added tests that prove my fix is effective or that my
    feature works~~
    orestisfl authored Feb 20, 2026
    d2c71a8

Commits on Feb 26, 2026

  1. Use json omitzero for struct-typed fields in cgroup structs (#288)

    ## What does this PR do?
    encoding/json's omitempty is silently ignored on struct fields —
    zero-valued structs are always serialized regardless of the tag. Go
    1.24's omitzero fixes this by checking IsZero() or the zero value.
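
The difference is easy to see on a toy type. The structs below are hypothetical, not the cgroup structs touched by this PR, and the sketch assumes Go 1.24 or newer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// limits is a hypothetical nested struct.
type limits struct {
	Max uint64 `json:"max"`
}

type withOmitEmpty struct {
	Name   string `json:"name"`
	Limits limits `json:"limits,omitempty"` // has no effect on struct values
}

type withOmitZero struct {
	Name   string `json:"name"`
	Limits limits `json:"limits,omitzero"` // dropped when Limits is the zero value
}

// marshal returns the JSON encoding of v as a string.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(marshal(withOmitEmpty{Name: "cg"})) // {"name":"cg","limits":{"max":0}}
	fmt.Println(marshal(withOmitZero{Name: "cg"}))  // {"name":"cg"}
}
```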
    
    ## Why is it important?
    
    No downstream production code is affected (consumers use struct tags via
    go-structform, not json tags), but this corrects a latent bug if these
    types are ever json.Marshal'd directly.
    
    ## Checklist
    
    - [x] My code follows the style guidelines of this project
    - [x] I have commented my code, particularly in hard-to-understand areas
    - [x] I have added tests that prove my fix is effective or that my
    feature works
    
    ## Related issues
    
    - Blocks #281
    orestisfl authored Feb 26, 2026
    4c778fb
  2. [cgv2] Add CPU CFS quota, period, and weight metrics (#281)

    ## What does this PR do?
    Add support for collecting CPU bandwidth control settings from cgroupv2:
    - cpu.max (quota and period in microseconds)
    - cpu.weight (relative weight, replaces shares from v1)
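
As a rough sketch of what reading these two files involves: `cpu.max` holds a quota and a period separated by whitespace, where the quota may be the literal `max` (no limit), and `cpu.weight` holds a single integer. The types and function names below are hypothetical and do not reflect the structs added in this PR.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// cpuMax is a hypothetical holder for the contents of cpu.max.
type cpuMax struct {
	QuotaUS   uint64 // meaningful only when Unlimited is false
	PeriodUS  uint64
	Unlimited bool // true when the quota is the literal "max"
}

// parseCPUMax parses content such as "50000 100000" or "max 100000".
func parseCPUMax(content string) (cpuMax, error) {
	fields := strings.Fields(content)
	if len(fields) != 2 {
		return cpuMax{}, fmt.Errorf("cpu.max: expected 2 fields, got %d", len(fields))
	}
	var out cpuMax
	if fields[0] == "max" {
		out.Unlimited = true
	} else {
		quota, err := strconv.ParseUint(fields[0], 10, 64)
		if err != nil {
			return cpuMax{}, fmt.Errorf("cpu.max quota: %w", err)
		}
		out.QuotaUS = quota
	}
	period, err := strconv.ParseUint(fields[1], 10, 64)
	if err != nil {
		return cpuMax{}, fmt.Errorf("cpu.max period: %w", err)
	}
	out.PeriodUS = period
	return out, nil
}

// parseCPUWeight parses the single integer stored in cpu.weight.
func parseCPUWeight(content string) (uint64, error) {
	return strconv.ParseUint(strings.TrimSpace(content), 10, 64)
}

func main() {
	fmt.Println(parseCPUMax("50000 100000\n"))
	fmt.Println(parseCPUMax("max 100000\n"))
	fmt.Println(parseCPUWeight("100\n"))
}
```

Representing "no limit" with a separate boolean avoids overloading a numeric sentinel for the quota.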
    
    ## Why is it important?
    This brings cgroupv2 to feature parity with cgroupv1 for CPU limit
    metrics.
    
    ## Checklist
    
    - [x] My code follows the style guidelines of this project
    - [x] I have commented my code, particularly in hard-to-understand areas
    - [x] I have added tests that prove my fix is effective or that my
    feature works
    - [ ] ~~I have added an entry in `CHANGELOG.md`~~
    
    ## Related issues
    - Relates elastic/beats#47708
    orestisfl authored Feb 26, 2026
    69b8af0