
[linux] migrate Linux metrics data streams to TSDB #17379

Merged
AndersonQ merged 17 commits into elastic:main from AndersonQ:16511-linux-metrics-TSDB on Mar 12, 2026

AndersonQ (Member) commented on Feb 11, 2026

Proposed commit message

Enable TSDB for Linux integration data streams

Enable time series data stream (TSDB) support for 8 of 11 Linux
integration data streams: conntrack, entropy, iostat, ksm, memory,
pageinfo, raid, and service. Set `elasticsearch.index_mode: "time_series"`
in each manifest and configure dimensions and metric types across all
field definitions.

Dimensions:

All 8 TSDB-enabled data streams share common infrastructure
dimensions defined in agent.yml with `dimension: true`:
  agent.id, agent.name,
  cloud.account.id, cloud.availability_zone, cloud.instance.id,
  cloud.instance.name, cloud.machine.type, cloud.provider,
  cloud.region, cloud.project.id, cloud.image.id,
  container.id, container.image.name, container.name,
  host.architecture, host.domain, host.hostname, host.id, host.name,
  host.os.family, host.os.name, host.os.platform, host.type

host.containerized is not a dimension due to a bug in elastic-package
that prevents boolean dimensions from validating correctly
(https://github.com/elastic/package-spec/issues/1106).

Domain-specific dimensions:
- iostat: linux.iostat.name (block device name)
- raid: system.raid.name (RAID device), system.raid.level (RAID level)
- service: system.service.name, systemd.unit, systemd.fragment_path

Metrics:

- conntrack (8 metrics): summary.drop, summary.early_drop (counter);
  summary.entries (gauge); summary.found, summary.ignore,
  summary.insert_failed, summary.invalid, summary.search_restart
  (counter)
- entropy (2 metrics): available_bits, pct (gauge)
- iostat (13 metrics): read/write request rates, throughput, await,
  queue size, service_time, busy (all gauge)
- ksm (7 metrics): pages_shared, pages_sharing, pages_unshared,
  pages_volatile, stable_node_chains, stable_node_dups (gauge);
  full_scans (counter)
- memory (24 metrics): pgscan_kswapd, pgscan_direct, pgfree,
  pgsteal_kswapd, pgsteal_direct, swap.out, swap.in,
  swap.readahead.pages, swap.readahead.cached,
  hugepages.swap.out.fallback, hugepages.swap.out.pages (counter);
  direct_efficiency.pct, kswapd_efficiency.pct, swap.total,
  swap.used.bytes, swap.free, swap.used.pct, hugepages.total,
  hugepages.used.bytes, hugepages.used.pct, hugepages.free,
  hugepages.reserved, hugepages.surplus, hugepages.default_size (gauge)
- pageinfo (33 metrics): buddy_info DMA.0-10, DMA32.0-10, Normal.0-10
  (all gauge)
- raid (6 metrics): disks.active, disks.total, disks.spare,
  disks.failed, blocks.total, blocks.synced (all gauge)
- service (7 metrics): cpu.usage.ns, network.in.bytes,
  network.in.packets, network.out.packets, network.out.bytes (counter);
  memory.usage.bytes, tasks.count (gauge)

Bug fix: linux.memory.swap.in.pages metric_type changed from gauge
to counter — it reads the cumulative pswpin value from /proc/vmstat,
matching its sibling swap.out.pages which was already counter.

Assisted by Cursor

Summary of the changes


Linux Integration - TSDB Field Analysis

Overview

| Data Stream | TSDB Enabled | Domain Dimensions | Metrics | Issues |
|---|---|---|---|---|
| conntrack | Yes | 0 | 8 | None |
| entropy | Yes | 0 | 2 | None |
| iostat | Yes | 1 (name) | 13 | None |
| ksm | Yes | 0 | 7 | None |
| memory | Yes | 0 | 24 | vmstat (flattened) not a metric |
| network_summary | No | 0 | 0 | Not TSDB; dynamic objects prevent metric marking |
| pageinfo | Yes | 0 | 33 | nodes.* (object) not a metric |
| raid | Yes | 2 (name, level) | 6 | disks.states.* (object) not a metric |
| service | Yes | 3 (name, unit, fragment_path) | 7 | None |
| socket | No | 0 | 0 | Not TSDB; event-like data, not time series |
| users | No | 0 | 0 | Not TSDB; event-like session records |

Common Infrastructure Dimensions (agent.yml)

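All 8 TSDB-enabled data streams share the same agent.yml definitions, which cover the 24 infrastructure fields assessed below. Of these, 23 are marked dimension: true; host.containerized is a valid candidate but is not currently a dimension (see the last row). The assignments are correct and make sense.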

| Field | Type | Assessment |
|---|---|---|
| agent.id | keyword | Correct - identifies the collecting agent |
| agent.name | keyword | Correct - identifies the collecting agent |
| cloud.account.id | keyword | Correct - cloud tenant identity |
| cloud.availability_zone | keyword | Correct - cloud placement |
| cloud.instance.id | keyword | Correct - cloud VM identity |
| cloud.instance.name | keyword | Correct - cloud VM identity |
| cloud.machine.type | keyword | Correct - cloud VM class |
| cloud.provider | keyword | Correct - cloud vendor |
| cloud.region | keyword | Correct - cloud placement |
| cloud.project.id | keyword | Correct - GCP project identity |
| cloud.image.id | keyword | Correct - cloud image identity |
| container.id | keyword | Correct - container identity |
| container.image.name | keyword | Correct - container image identity |
| container.name | keyword | Correct - container identity |
| host.architecture | keyword | Correct - host attribute |
| host.domain | keyword | Correct - host domain |
| host.hostname | keyword | Correct - host identity |
| host.id | keyword | Correct - host identity |
| host.name | keyword | Correct - host identity |
| host.os.family | keyword | Correct - OS classification |
| host.os.name | keyword | Correct - OS classification |
| host.os.platform | keyword | Correct - OS classification |
| host.type | keyword | Correct - host classification |
| host.containerized | boolean | Correct - host classification. Not currently marked as a dimension: a bug in elastic-package/package-spec prevents boolean dimensions from validating correctly (see package-spec#1106) |
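
For illustration, the fields listed above would be declared along the following lines. This is a minimal sketch that assumes the usual package-spec field syntax; the grouping, ordering, and any descriptions in the actual agent.yml files may differ.

```yaml
# Sketch of a data stream's fields/agent.yml (abridged; not the actual file).
- name: agent
  type: group
  fields:
    - name: id
      type: keyword
      dimension: true   # becomes part of the time series identity
- name: host
  type: group
  fields:
    - name: name
      type: keyword
      dimension: true
    - name: ip
      type: ip          # deliberately left without dimension: true
```

Only fields carrying dimension: true contribute to the identity of a time series; everything else is stored as a regular field.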

Note on service data stream

The service data stream splits these dimensions across two files: ecs.yml sets dimension: true on host.architecture, host.name, host.os.family, host.os.name, host.os.platform, and host.type, while agent.yml covers the rest. The resulting dimension set is equivalent.


Common Non-Dimension Fields (shared across TSDB-enabled streams)

These fields appear in every data stream and CANNOT / SHOULD NOT be dimensions:

| Field | Type | Source | Why not a dimension |
|---|---|---|---|
| @timestamp | date | base-fields.yml | Time axis for TSDB, not a dimension type |
| data_stream.type | constant_keyword | base-fields.yml | Routing constant; same for all docs in the stream |
| data_stream.dataset | constant_keyword | base-fields.yml | Routing constant; same for all docs in the stream |
| data_stream.namespace | constant_keyword | base-fields.yml | Routing constant; same for all docs in the stream |
| event.module | constant_keyword | base-fields.yml | Always linux; zero discriminating value |
| event.dataset | constant_keyword | base-fields.yml | Always the stream name; zero discriminating value |
| ecs.version | keyword (ext) | ecs.yml | Version metadata, not an entity identifier |
| event.duration | long (ext) | ecs.yml | Numeric measurement of collection time (could be gauge metric) |
| service.address | keyword (ext) | ecs.yml | Endpoint address; could technically be a dimension but typically constant per host |
| service.type | keyword (ext) | ecs.yml | Always linux; zero discriminating value |
| container.labels | object | agent.yml | object type not supported for dimensions |
| host.ip | ip | agent.yml | Multi-valued (array); changing IPs would create new time series |
| host.mac | keyword | agent.yml | Multi-valued (array); changing MACs would create new time series |
| host.os.kernel | keyword | agent.yml | Changes on kernel upgrades; label, not entity identifier |
| host.os.version | keyword | agent.yml | Changes on OS upgrades; label, not entity identifier |
| host.os.build | keyword | agent.yml | Changes on OS updates; label, not entity identifier |
| host.os.codename | keyword | agent.yml | Changes on OS upgrades; label, not entity identifier |

Per-Data-Stream Reports


1. conntrack

TSDB: Yes | Entity: Host-level (one series per host)

Dimensions (24)

Only the 24 common infrastructure dimensions (see table above). No domain-specific dimensions needed -- conntrack is a single host-level summary.

Metrics (8)

| Field | Type | metric_type | Assessment |
|---|---|---|---|
| linux.conntrack.summary.drop | long | counter | Correct - cumulative dropped packets |
| linux.conntrack.summary.early_drop | long | counter | Correct - cumulative early drops |
| linux.conntrack.summary.entries | long | gauge | Correct - current conntrack entry count |
| linux.conntrack.summary.found | long | counter | Correct - cumulative search hits |
| linux.conntrack.summary.ignore | long | counter | Correct - cumulative ignored packets |
| linux.conntrack.summary.insert_failed | long | counter | Correct - cumulative insert failures |
| linux.conntrack.summary.invalid | long | counter | Correct - cumulative invalid packets |
| linux.conntrack.summary.search_restart | long | counter | Correct - cumulative search restarts |

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| linux.conntrack | group | Group type, not a leaf field |
| linux.conntrack.summary | group | Group type, not a leaf field |
| (common fields) | (various) | (see common table above) |

Missing Dimensions

None. Host-level data is fully identified by the infrastructure dimensions.


2. entropy

TSDB: Yes | Entity: Host-level (one series per host)

Dimensions (24)

Only the 24 common infrastructure dimensions. No domain-specific dimensions needed -- entropy is a single host-level value.

Metrics (2)

| Field | Type | metric_type | Assessment |
|---|---|---|---|
| system.entropy.available_bits | long | gauge | Correct - current available entropy |
| system.entropy.pct | scaled_float | gauge | Correct - current entropy percentage |

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| system.entropy | group | Group type, not a leaf field |
| (common fields) | (various) | (see common table above) |

Missing Dimensions

None. Host-level data is fully identified by the infrastructure dimensions.


3. iostat

TSDB: Yes | Entity: Per-device per host (one series per block device per host)

Dimensions (25)

24 common + 1 domain-specific:

| Field | Type | Source | Assessment |
|---|---|---|---|
| linux.iostat.name | keyword | fields.yml | Correct - identifies the block device |
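
As a rough illustration of how a domain-specific dimension sits next to its metrics, the iostat field definitions might look like the sketch below. It assumes standard package-spec syntax and shows only three of the fields named in this report; the layout of the actual fields.yml may differ.

```yaml
# Sketch of data_stream/iostat/fields/fields.yml (abridged; not the actual file).
- name: linux.iostat
  type: group
  fields:
    - name: name
      type: keyword
      dimension: true      # block device name: one time series per device
    - name: await
      type: float
      metric_type: gauge   # average latency, a point-in-time value
    - name: busy
      type: float
      metric_type: gauge   # utilization percentage
```

With this shape, two devices on the same host produce two separate time series because linux.iostat.name differs between them.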

Metrics (13)

| Field | Type | metric_type | Assessment |
|---|---|---|---|
| linux.iostat.read.request.merges_per_sec | float | gauge | Correct - rate |
| linux.iostat.write.request.merges_per_sec | float | gauge | Correct - rate |
| linux.iostat.read.request.per_sec | float | gauge | Correct - rate |
| linux.iostat.write.request.per_sec | float | gauge | Correct - rate |
| linux.iostat.read.per_sec.bytes | float | gauge | Correct - rate |
| linux.iostat.read.await | float | gauge | Correct - avg latency |
| linux.iostat.write.per_sec.bytes | float | gauge | Correct - rate |
| linux.iostat.write.await | float | gauge | Correct - avg latency |
| linux.iostat.request.avg_size | float | gauge | Correct - average |
| linux.iostat.queue.avg_size | float | gauge | Correct - average |
| linux.iostat.await | float | gauge | Correct - avg latency |
| linux.iostat.service_time | float | gauge | Correct - avg latency |
| linux.iostat.busy | float | gauge | Correct - utilization % |

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| linux.iostat | group | Group type, not a leaf field |
| (common fields) | (various) | (see common table above) |

Missing Dimensions

None. Device + host fully identifies each time series.


4. ksm

TSDB: Yes | Entity: Host-level (one series per host)

Dimensions (24)

Only the 24 common infrastructure dimensions. KSM is a single host-level subsystem.

Metrics (7)

| Field | Type | metric_type | Assessment |
|---|---|---|---|
| linux.ksm.stats.pages_shared | long | gauge | Correct - current shared pages |
| linux.ksm.stats.pages_sharing | long | gauge | Correct - current sharing sites |
| linux.ksm.stats.pages_unshared | long | gauge | Correct - current unique pages |
| linux.ksm.stats.pages_volatile | long | gauge | Correct - current volatile pages |
| linux.ksm.stats.full_scans | long | counter | Correct - cumulative scan count |
| linux.ksm.stats.stable_node_chains | long | gauge | Correct - current chain count |
| linux.ksm.stats.stable_node_dups | long | gauge | Correct - current dup count |

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| linux.ksm | group | Group type, not a leaf field |
| linux.ksm.stats | group | Group type, not a leaf field |
| (common fields) | (various) | (see common table above) |

Missing Dimensions

None. Host-level data is fully identified by the infrastructure dimensions.


5. memory

TSDB: Yes | Entity: Host-level (one series per host)

Dimensions (24)

Only the 24 common infrastructure dimensions. Memory is host-level.

Metrics (24)

| Field | Type | metric_type | Assessment |
|---|---|---|---|
| linux.memory.page_stats.pgscan_kswapd.pages | long | counter | Correct - cumulative |
| linux.memory.page_stats.pgscan_direct.pages | long | counter | Correct - cumulative |
| linux.memory.page_stats.pgfree.pages | long | counter | Correct - cumulative |
| linux.memory.page_stats.pgsteal_kswapd.pages | long | counter | Correct - cumulative |
| linux.memory.page_stats.pgsteal_direct.pages | long | counter | Correct - cumulative |
| linux.memory.page_stats.direct_efficiency.pct | scaled_float | gauge | Correct - percentage |
| linux.memory.page_stats.kswapd_efficiency.pct | scaled_float | gauge | Correct - percentage |
| linux.memory.swap.total | long | gauge | Correct - current value |
| linux.memory.swap.used.bytes | long | gauge | Correct - current value |
| linux.memory.swap.free | long | gauge | Correct - current value |
| linux.memory.swap.out.pages | long | counter | Correct - cumulative |
| linux.memory.swap.in.pages | long | counter | Correct - cumulative |
| linux.memory.swap.readahead.pages | long | counter | Correct - cumulative |
| linux.memory.swap.readahead.cached | long | counter | Correct - cumulative |
| linux.memory.swap.used.pct | scaled_float | gauge | Correct - percentage |
| linux.memory.hugepages.total | long | gauge | Correct - current value |
| linux.memory.hugepages.used.bytes | long | gauge | Correct - current value |
| linux.memory.hugepages.used.pct | scaled_float | gauge | Correct - percentage |
| linux.memory.hugepages.free | long | gauge | Correct - current value |
| linux.memory.hugepages.reserved | long | gauge | Correct - current value |
| linux.memory.hugepages.surplus | long | gauge | Correct - current value |
| linux.memory.hugepages.default_size | long | gauge | Correct - current value |
| linux.memory.hugepages.swap.out.fallback | long | counter | Correct - cumulative |
| linux.memory.hugepages.swap.out.pages | long | counter | Correct - cumulative |
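
The swap.in.pages fix described in the commit message amounts to a one-line change in the field definition. A minimal sketch, assuming standard package-spec syntax (the surrounding structure of the real fields.yml is not reproduced here):

```yaml
# Sketch of the swap counters in data_stream/memory/fields/fields.yml (abridged).
- name: linux.memory.swap
  type: group
  fields:
    - name: in.pages
      type: long
      metric_type: counter   # previously gauge; cumulative pswpin from /proc/vmstat
    - name: out.pages
      type: long
      metric_type: counter   # was already counter
```

Marking a cumulative value as counter matters because consumers are expected to derive rates from it rather than read the raw value as a current level.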

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| linux.memory | group | Group type, not a leaf field |
| linux.memory.page_stats | group | Group type, not a leaf field |
| linux.memory.swap | group | Group type, not a leaf field |
| linux.memory.hugepages | group | Group type, not a leaf field |
| linux.memory.vmstat | flattened | Dynamic keys from /proc/vmstat; cannot mark individual sub-fields as metrics or dimensions |
| (common fields) | (various) | (see common table above) |

Missing Dimensions

None. Host-level data is fully identified by the infrastructure dimensions.


6. network_summary

TSDB: No (not enabled) | Entity: Host-level

Dimensions (0)

No dimensions defined. The agent.yml in this data stream does not have dimension: true on any field.

Metrics (0)

No explicit metrics. All data fields use dynamic objects:

| Field | Type | Description |
|---|---|---|
| system.network_summary.ip.* | object (long) | IP protocol counters |
| system.network_summary.tcp.* | object (long) | TCP protocol counters |
| system.network_summary.udp.* | object (long) | UDP protocol counters |
| system.network_summary.udp_lite.* | object (long) | UDP Lite protocol counters |
| system.network_summary.icmp.* | object (long) | ICMP protocol counters |

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| system.network_summary.ip.* | object | Dynamic object; object type not supported for dimensions |
| system.network_summary.tcp.* | object | Dynamic object; object type not supported for dimensions |
| system.network_summary.udp.* | object | Dynamic object; object type not supported for dimensions |
| system.network_summary.udp_lite.* | object | Dynamic object; object type not supported for dimensions |
| system.network_summary.icmp.* | object | Dynamic object; object type not supported for dimensions |
| (common fields) | (various) | (see common table above) |

TSDB Conversion Blockers

  1. Dynamic objects (ip.*, tcp.*, etc.) use wildcard field names -- TSDB requires explicit field definitions with metric_type
  2. All values are counters from /proc/net/snmp and /proc/net/netstat but cannot be marked as such without expanding to explicit fields
  3. To enable TSDB: expand each counter to an explicit long field with metric_type: counter, add dimension: true to agent/host/cloud/container fields in agent.yml, and add index_mode: "time_series" to manifest
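
The expansion described in item 3 could look roughly like the sketch below. This is not part of the PR: the two field names are hypothetical examples chosen from counters that exist in /proc/net/snmp, and the real key names emitted by the data stream would need to be checked before writing any such definition.

```yaml
# Hypothetical expansion for network_summary; field names are examples only.
- name: system.network_summary
  type: group
  fields:
    - name: tcp.ActiveOpens    # hypothetical explicit field replacing tcp.*
      type: long
      metric_type: counter
    - name: udp.InDatagrams    # hypothetical explicit field replacing udp.*
      type: long
      metric_type: counter
```

Every counter would need its own entry like these, which is why the report classifies the conversion as possible only with refactoring.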

7. pageinfo

TSDB: Yes | Entity: Host-level (one series per host)

Dimensions (24)

Only the 24 common infrastructure dimensions.

Metrics (33)

| Field | Type | metric_type | Assessment |
|---|---|---|---|
| linux.pageinfo.buddy_info.DMA.0 | long | gauge | Correct - current free chunks |
| linux.pageinfo.buddy_info.DMA.1 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA.2 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA.3 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA.4 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA.5 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA.6 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA.7 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA.8 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA.9 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA.10 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.0 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.1 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.2 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.3 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.4 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.5 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.6 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.7 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.8 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.9 | long | gauge | Correct |
| linux.pageinfo.buddy_info.DMA32.10 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.0 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.1 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.2 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.3 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.4 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.5 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.6 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.7 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.8 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.9 | long | gauge | Correct |
| linux.pageinfo.buddy_info.Normal.10 | long | gauge | Correct |

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| linux.pageinfo | group | Group type, not a leaf field |
| linux.pageinfo.buddy_info | group | Group type, not a leaf field |
| linux.pageinfo.buddy_info.DMA | group | Group type, not a leaf field |
| linux.pageinfo.buddy_info.DMA32 | group | Group type, not a leaf field |
| linux.pageinfo.buddy_info.Normal | group | Group type, not a leaf field |
| linux.pageinfo.nodes.* | object (keyword) | Dynamic object with wildcard keys; object type not supported |
| (common fields) | (various) | (see common table above) |

Missing Dimensions

None. The zone names (DMA, DMA32, Normal) are encoded in the field path rather than as a dimension value. This is a structural design choice -- if the zones were dynamic, a zone dimension would be needed, but since they're hardcoded field names, the current approach works.


8. raid

TSDB: Yes | Entity: Per-RAID-device per host

Dimensions (26)

24 common + 2 domain-specific:

| Field | Type | Source | Assessment |
|---|---|---|---|
| system.raid.name | keyword | fields.yml | Correct - RAID device name (e.g., md0) |
| system.raid.level | keyword | fields.yml | Correct - RAID level (e.g., raid1, raid5); stable per device |

Metrics (6)

| Field | Type | metric_type | Assessment |
|---|---|---|---|
| system.raid.disks.active | long | gauge | Correct - current active disk count |
| system.raid.disks.total | long | gauge | Correct - current total disk count |
| system.raid.disks.spare | long | gauge | Correct - current spare disk count |
| system.raid.disks.failed | long | gauge | Correct - current failed disk count |
| system.raid.blocks.total | long | gauge | Correct - current block count |
| system.raid.blocks.synced | long | gauge | Correct - current synced block count |

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| system.raid | group | Group type, not a leaf field |
| system.raid.status | keyword | Mutable activity state (e.g., active, inactive); changes create new series |
| system.raid.sync_action | keyword | Mutable sync state (e.g., idle, resync); changes create new series |
| system.raid.disks.states.* | object (long) | Dynamic object with wildcard keys; object type not supported |
| (common fields) | (various) | (see common table above) |

Missing Dimensions

None. Device name + level + host fully identifies each RAID time series.


9. service

TSDB: Yes | Entity: Per-systemd-service per host

Dimensions (27)

24 common (split between agent.yml + ecs.yml) + 3 domain-specific:

| Field | Type | Source | Assessment |
|---|---|---|---|
| systemd.fragment_path | keyword | fields.yml | Correct - service file location, stable per service |
| systemd.unit | keyword | fields.yml | Correct - unit name, uniquely identifies the service |
| system.service.name | keyword | fields.yml | Correct - service name |

Metrics (7)

| Field | Type | metric_type | Assessment |
|---|---|---|---|
| system.service.resources.cpu.usage.ns | long | counter | Correct - cumulative CPU time |
| system.service.resources.memory.usage.bytes | long | gauge | Correct - current memory |
| system.service.resources.tasks.count | long | gauge | Correct - current task count |
| system.service.resources.network.in.bytes | long | counter | Correct - cumulative bytes in |
| system.service.resources.network.in.packets | long | counter | Correct - cumulative packets in |
| system.service.resources.network.out.packets | long | counter | Correct - cumulative packets out |
| system.service.resources.network.out.bytes | long | counter | Correct - cumulative bytes out |

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| system.service | group | Group type, not a leaf field |
| system.service.resources | group | Group type, not a leaf field |
| system.service.resources.network | group | Group type, not a leaf field |
| system.service.load_state | keyword | Mutable state (loaded/not-found/masked); changes would split time series |
| system.service.state | keyword | Mutable state (active/inactive/failed); changes would split time series |
| system.service.sub_state | keyword | Mutable state (running/dead/exited); changes would split time series |
| system.service.state_since | date | date type not supported for dimensions |
| system.service.exec_code | keyword | Transient exit code; high cardinality and not an entity identifier |
| process.name | keyword (ext) | Process name of the service; could differ across restarts |
| process.pid | long (ext) | PID changes on every restart; very high cardinality |
| process.pgid | long (ext) | Process group ID; changes on restart |
| process.ppid | long (ext) | Parent PID; changes on restart |
| process.exit_code | long (ext) | Transient exit code; high cardinality |
| process.working_directory | keyword (ext) | File path; high cardinality |
| user.name | keyword (ext) | Service run-as user; could be dimension but redundant with service name |
| host.os.full | keyword (ext) | Full OS string; high cardinality version detail |
| (common fields) | (various) | (see common table above) |

Missing Dimensions

None. Service name + unit + fragment_path + host fully identifies each service time series.


10. socket

TSDB: No (not enabled) | Entity: Per-socket per host

Dimensions (0)

No dimensions defined. The agent.yml does not have dimension: true on any field.

Metrics (0)

No metrics defined. The socket data stream captures point-in-time socket snapshots, not numeric measurements over time.

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| system.socket | group | Group type, not a leaf field |
| system.socket.local.ip | ip | Supported type, but this is the measured entity, not a stable identifier across time |
| system.socket.local.port | long | Ephemeral ports are extremely high cardinality |
| system.socket.remote.ip | ip | Connection endpoint; very high cardinality |
| system.socket.remote.port | long | Connection endpoint; high cardinality |
| system.socket.remote.host | keyword | Reverse DNS; can change and is high cardinality |
| system.socket.remote.etld_plus_one | keyword | Derived domain; moderate-high cardinality |
| system.socket.remote.host_error | keyword | Transient error string |
| system.socket.process.cmdline | keyword | Full command line; extremely high cardinality |
| network.direction | keyword (ext) | Could be dimension (inbound/outbound) but low value without other dimensions |
| network.type | keyword (ext) | Could be dimension (ipv4/ipv6) but low value without other dimensions |
| process.name | keyword (ext) | Process name; moderate cardinality |
| process.executable | keyword (ext) | Full path; high cardinality |
| process.pid | long (ext) | PID; very high cardinality |
| user.full_name | keyword (ext) | User running the process; moderate cardinality |
| user.id | keyword (ext) | User ID; moderate cardinality |
| (common fields) | (various) | (see common table above) |

TSDB Conversion Assessment

Not recommended. The socket data stream captures a snapshot of all open sockets at each collection interval. This is event-like data (the set of sockets changes constantly), not a stable set of time series with numeric measurements. There are no numeric metrics to track over time -- the value is in the enumeration itself.


11. users

TSDB: No (not enabled) | Entity: Per-session per host

Dimensions (0)

No dimensions defined. The agent.yml does not have dimension: true on any field.

Metrics (0)

No metrics defined. The users data stream captures point-in-time session records.

Cannot Be Dimension

| Field | Type | Reason |
|---|---|---|
| system.users | group | Group type, not a leaf field |
| system.users.id | keyword | Session ID; unique per session, extremely high cardinality |
| system.users.seat | keyword | Logind seat; low cardinality but sessions are transient |
| system.users.path | keyword | DBus object path; unique per session |
| system.users.type | keyword | Session type (tty/x11/wayland); low cardinality |
| system.users.service | keyword | Associated service; moderate cardinality |
| system.users.remote | boolean | Remote flag; supported type but no metric to track |
| system.users.state | keyword | Mutable session state; changes over time |
| system.users.scope | keyword | Systemd scope; unique per session |
| system.users.leader | long | Root PID; changes per session, high cardinality |
| system.users.remote_host | keyword | Remote host address; high cardinality |
| source.ip | ip (ext) | Source IP; high cardinality |
| source.port | long (ext) | Source port; very high cardinality |
| (common fields) | (various) | (see common table above) |

TSDB Conversion Assessment

Not recommended. Like socket, the users data stream captures a point-in-time snapshot of logged-in sessions. Sessions are inherently transient -- they appear and disappear. There are no numeric metrics to track over time.


Summary of Findings

Correctness of Existing TSDB Configuration

All 8 TSDB-enabled data streams have correct dimension and metric assignments:

  • Infrastructure dimensions (24 fields from agent.yml) correctly identify host/cloud/container
  • Domain-specific dimensions (iostat.name, raid.name, raid.level, service.name, systemd.unit, systemd.fragment_path) correctly identify the measured entity
  • All metric types (gauge vs counter) are correctly assigned
  • No fields that should be dimensions are missing from the TSDB-enabled streams

Non-TSDB Data Streams

| Data Stream | Recommendation | Reason |
|---|---|---|
| network_summary | Possible with refactoring | Dynamic objects need expansion to explicit fields with metric_type |
| socket | Not recommended | Event-like snapshots, no numeric time series |
| users | Not recommended | Event-like snapshots, no numeric time series |

No Missing Dimensions Found

For all TSDB-enabled data streams, the current dimension set is complete. The entity being measured is fully identified:

  • Host-level (conntrack, entropy, ksm, memory, pageinfo): host/agent/cloud/container dimensions suffice
  • Per-device (iostat): device name + host dimensions
  • Per-RAID (raid): device name + RAID level + host dimensions
  • Per-service (service): service name + unit + fragment path + host dimensions

Tests with TSDB-migration-test-kit

Use the TSDB migration test kit to test these changes.

Run the test for the following data streams:

                     "metrics-linux.conntrack-default",
                     "metrics-linux.entropy-default",
                     "metrics-linux.iostat-default",
                     "metrics-linux.ksm-default",
                     "metrics-linux.memory-default",
                     "metrics-linux.pageinfo-default",
                     "metrics-linux.raid-default",
                     "metrics-linux.service-default"

Checklist

  • [ ] I have reviewed tips for building integrations and this pull request is aligned with them.
  • [x] I have verified that all data streams collect metrics or logs.
  • [x] I have added an entry to my package's changelog.yml file.
  • [ ] I have verified that Kibana version constraints are current according to guidelines.
  • [ ] I have verified that any added dashboard complies with Kibana's Dashboard good practices

How to test this PR locally

Related issues

AndersonQ self-assigned this on Feb 11, 2026
AndersonQ added the Integration:linux (Linux Metrics) and Team:Elastic-Agent-Data-Plane (Agent Data Plane team [elastic/elastic-agent-data-plane]) labels on Feb 11, 2026
AndersonQ requested a review from Copilot on February 11, 2026 17:41
AndersonQ force-pushed the 16511-linux-metrics-TSDB branch from 57914bb to 15d89c6 on February 11, 2026 17:45
Copilot AI left a comment

Pull request overview

Migrates several Linux integration metrics data streams to Elasticsearch TSDB / time_series data streams by enabling index_mode: "time_series" and annotating fields with metric_type/dimension so metrics can be stored and queried as time series efficiently.

Changes:

  • Enable TSDB (elasticsearch.index_mode: "time_series") for conntrack, entropy, iostat, ksm, memory, pageinfo, raid, and service data streams.
  • Mark common identifying fields (e.g., agent/cloud/container/host) as dimension: true and add stream-specific dimensions (e.g., device/service/raid name).
  • Annotate numeric metric fields with metric_type (gauge/counter).
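
As a point of reference for the first item, enabling TSDB in a data stream manifest comes down to a single setting. The fragment below is a sketch only: the title value is a placeholder, and the other keys present in the real manifests are omitted.

```yaml
# Sketch of a data stream manifest.yml with TSDB enabled (abridged).
title: Example metrics data stream   # placeholder, not the actual title
type: metrics
elasticsearch:
  index_mode: "time_series"
```

The setting only takes effect meaningfully when the stream's field definitions also declare dimensions and metric types, which is what the other two items cover.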

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| packages/linux/data_stream/service/manifest.yml | Enables TSDB index mode for the service metrics data stream. |
| packages/linux/data_stream/service/fields/fields.yml | Adds dimension for service name and metric_type for service resource metrics. |
| packages/linux/data_stream/service/fields/ecs.yml | Marks host.name as a TSDB dimension for service metrics. |
| packages/linux/data_stream/service/fields/agent.yml | Adds common TSDB dimensions (agent/cloud/container, etc.) for service metrics. |
| packages/linux/data_stream/raid/manifest.yml | Enables TSDB index mode for the raid metrics data stream. |
| packages/linux/data_stream/raid/fields/fields.yml | Marks raid name as a dimension and annotates numeric fields with metric_type. |
| packages/linux/data_stream/raid/fields/agent.yml | Adds common TSDB dimensions (agent/cloud/container/host.name, etc.) for raid metrics. |
| packages/linux/data_stream/pageinfo/manifest.yml | Enables TSDB index mode for the pageinfo metrics data stream. |
| packages/linux/data_stream/pageinfo/fields/fields.yml | Annotates buddyinfo numeric fields with metric_type: gauge for TSDB. |
| packages/linux/data_stream/pageinfo/fields/agent.yml | Adds common TSDB dimensions (agent/cloud/container/host.name, etc.) for pageinfo metrics. |
| packages/linux/data_stream/memory/manifest.yml | Enables TSDB index mode for the memory metrics data stream. |
| packages/linux/data_stream/memory/fields/fields.yml | Adds metric_type annotations across paging/swap/hugepages metrics for TSDB. |
| packages/linux/data_stream/memory/fields/agent.yml | Adds common TSDB dimensions (agent/cloud/container/host.name, etc.) for memory metrics. |
| packages/linux/data_stream/ksm/manifest.yml | Enables TSDB index mode for the ksm metrics data stream. |
| packages/linux/data_stream/ksm/fields/fields.yml | Annotates KSM numeric fields with metric_type for TSDB. |
| packages/linux/data_stream/ksm/fields/agent.yml | Adds common TSDB dimensions (agent/cloud/container/host.name, etc.) for ksm metrics. |
| packages/linux/data_stream/iostat/manifest.yml | Enables TSDB index mode for the iostat metrics data stream. |
| packages/linux/data_stream/iostat/fields/fields.yml | Marks disk device name as a dimension and annotates iostat numeric fields with metric_type. |
| packages/linux/data_stream/iostat/fields/agent.yml | Adds common TSDB dimensions (agent/cloud/container/host.name, etc.) for iostat metrics. |
| packages/linux/data_stream/entropy/manifest.yml | Enables TSDB index mode for the entropy metrics data stream. |
| packages/linux/data_stream/entropy/fields/fields.yml | Annotates entropy numeric fields with metric_type: gauge for TSDB. |
| packages/linux/data_stream/entropy/fields/agent.yml | Adds common TSDB dimensions (agent/cloud/container/host.name, etc.) for entropy metrics. |
| packages/linux/data_stream/conntrack/manifest.yml | Enables TSDB index mode for the conntrack metrics data stream. |
| packages/linux/data_stream/conntrack/fields/fields.yml | Annotates conntrack numeric fields with metric_type for TSDB. |
| packages/linux/data_stream/conntrack/fields/agent.yml | Adds common TSDB dimensions (agent/cloud/container/host.name, etc.) for conntrack metrics. |

description: bytes in
- name: in.packets
type: long
format: bytes
Copilot AI Feb 11, 2026

system.service.resources.network.in.packets is a packet count but is still declared with format: bytes, which will cause incorrect formatting/units in Kibana and exported field docs. Remove the bytes format (or switch to a numeric format appropriate for counts).

Suggested change
Remove the line:
format: bytes

@github-actions
Contributor

github-actions bot commented Feb 11, 2026

Vale Linting Results

Summary: 1 warning, 4 suggestions found

⚠️ Warnings (1)
| File | Line | Rule | Message |
|---|---|---|---|
| packages/linux/docs/README.md | 306 | Elastic.Latinisms | Latin terms and abbreviations are a common source of confusion. Use 'for example' instead of 'e.g'. |

💡 Suggestions (4)

| File | Line | Rule | Message |
|---|---|---|---|
| packages/linux/docs/README.md | 100 | Elastic.WordChoice | Consider using 'can, might' instead of 'may', unless the term is in the UI. |
| packages/linux/docs/README.md | 214 | Elastic.WordChoice | Consider using 'can, might' instead of 'may', unless the term is in the UI. |
| packages/linux/docs/README.md | 281 | Elastic.WordChoice | Consider using 'can, might' instead of 'may', unless the term is in the UI. |
| packages/linux/docs/README.md | 331 | Elastic.Wordiness | Consider using 'all' instead of 'all of '. |

The Vale linter checks documentation changes against the Elastic Docs style guide.

To use Vale locally or report issues, refer to Elastic style guide for Vale.

@andrewkroh added the documentation label Feb 11, 2026
Enable time series data streams (TSDB) for 8 of 11 data streams in the
Linux integration: conntrack, entropy, iostat, ksm, memory, pageinfo,
raid, and service.

For each data stream:
- Add `elasticsearch.index_mode: "time_series"` to manifest.yml
- Annotate numeric fields with appropriate metric_type (gauge/counter)
- Mark dimension fields to uniquely identify each time series

Common dimensions (all 8 data streams):
- agent.id
- agent.name
- cloud.account.id
- cloud.availability_zone
- cloud.instance.id
- cloud.provider
- cloud.region
- container.id
- host.name

Integration-specific dimensions:
- iostat: linux.iostat.name (disk device)
- raid: system.raid.name (RAID array)
- service: system.service.name (systemd service)

Excluded data streams:
- socket: transient entities with no persistent time series
- users: transient sessions with no numeric metrics
- network_summary: fields use object wildcard mappings that cannot carry
  metric_type annotations, limiting TSDB benefits

Assisted by Cursor
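
As a rough illustration of the three per-data-stream steps listed in the commit message above, the fragments below sketch what such changes might look like. The `elasticsearch.index_mode` setting, the `dimension: true` attribute, and the gauge metric type for entropy come from the PR description; the field types, the full path `system.entropy.available_bits`, the flat dotted field names, and the file layout named in the comments are assumptions for illustration, not copies of the actual diff.

```yaml
# manifest.yml (sketch): enable TSDB index mode for the data stream
elasticsearch:
  index_mode: "time_series"
---
# fields/agent.yml (sketch): one of the common dimensions
- name: agent.id
  type: keyword        # type assumed for illustration
  dimension: true
---
# iostat fields/fields.yml (sketch): the data-stream-specific dimension
- name: linux.iostat.name
  type: keyword        # type assumed for illustration
  dimension: true
---
# entropy fields/fields.yml (sketch): a numeric field annotated as a metric
- name: system.entropy.available_bits   # full field path assumed
  type: long                            # type assumed for illustration
  metric_type: gauge
```

Dimensions identify a time series, while `metric_type` tells Elasticsearch how to treat the numeric value (a point-in-time gauge versus a monotonically increasing counter), which is why each numeric field needs one or the other annotation.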

Copilot AI left a comment

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated 12 comments.


Comment on lines 115 to 116
type: long
format: percent
Copilot AI Feb 12, 2026

linux.memory.hugepages.used.pct is declared as type: long with format: percent, while other percent fields in this data stream (for example linux.memory.swap.used.pct) use scaled_float with unit: percent. If the hugepages percentage is non-integer, the current mapping will truncate/round; consider switching this field to scaled_float and adding unit: percent for consistency.

Suggested change
Before:
type: long
format: percent
After:
type: scaled_float
format: percent
unit: percent

@AndersonQ marked this pull request as ready for review February 13, 2026 07:25
@AndersonQ requested a review from a team as a code owner February 13, 2026 07:25
@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@AndersonQ changed the title from "WIP [linux] migrate Linux metrics data streams to TSDB" to "[linux] migrate Linux metrics data streams to TSDB" Feb 13, 2026
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@pierrehilbert requested review from rdner and removed request for faec and khushijain21 February 13, 2026 07:50
@AndersonQ
Member Author

> My feeling is that the dimensions in this package should be the same as the ones in system, since this is often installed as a companion to the system integration, unless there is a data source specific additional dimension that we need.
>
> This could mean we also need to update the system package dimensions if something is missing there.

@cmacknz, I checked with AI and they're the same. The only difference is that I'm adding agent.name here because it's the actual field we filter and aggregate on in the dashboards. I can add that to the system integration as well. I think it's a good idea.

Anyway, I'm confirming with the ES/es-storage-engine team that adding redundant dimensions has at most a negligible impact on storage.

@AndersonQ
Member Author

Hey @Oddly, I looked into this and I don't think we can confidently map all the fields under linux.memory.vmstat. They change depending on the kernel, and new fields can be added.

So I don't think it makes sense to try to change anything here.
You can open an issue with more details and we'll evaluate it.

@Oddly
Contributor

Oddly commented Feb 19, 2026

Good point, thanks for looking at this!

@AndersonQ
Member Author

AndersonQ commented Feb 19, 2026

> My feeling is that the dimensions in this package should be the same as the ones in system, since this is often installed as a companion to the system integration, unless there is a data source specific additional dimension that we need.
>
> This could mean we also need to update the system package dimensions if something is missing there.

@cmacknz, it's done.

I checked with the ES team, they said redundant dimensions have negligible impact and do not help queries, thus, no need to have agent.name as a dimension. So I removed it from here instead of adding it to the system integration.

@AndersonQ
Member Author

@rdner, @orestisfl I believe Craig's questions have been answered. When you have some time, could you review it?

@cmacknz
Member

cmacknz commented Feb 19, 2026

> I checked with the ES team, they said redundant dimensions have negligible impact and do not help queries, thus, no need to have agent.name as a dimension. So I removed it from here instead of adding it to the system integration.

👍 thanks

fixes made:
  +----------------+------------------------------+----------------------------------+
  | Data Stream    | Field                        | Added                            |
  +----------------+------------------------------+----------------------------------+
  | entropy        | system.entropy.pct           | unit: percent                    |
  | iostat         | read.per_sec.bytes           | unit: byte                       |
  | iostat         | write.per_sec.bytes          | unit: byte                       |
  | iostat         | busy                         | format: percent, unit: percent   |
  | memory         | hugepages.used.bytes         | unit: byte                       |
  | memory         | hugepages.default_size       | unit: byte                       |
  | memory         | direct_efficiency.pct        | unit: percent                    |
  | memory         | kswapd_efficiency.pct        | unit: percent                    |
  | service        | resources.cpu.usage.ns       | unit: nanos                      |
  | service        | resources.memory.usage.bytes | format: bytes, unit: byte        |
  | service        | network.in.bytes             | unit: byte                       |
  | service        | network.out.bytes            | unit: byte                       |
  +----------------+------------------------------+----------------------------------+
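
To make the table concrete, here is a hedged sketch of how one of the listed fields might look once the `unit` annotation is in place. The `unit: percent` value comes from the table and the gauge metric type from the PR description; the `type` shown is an assumption, and the real definition in the package may be nested under a group or carry other attributes.

```yaml
# entropy fields/fields.yml (sketch, not the actual diff)
- name: system.entropy.pct
  type: scaled_float   # assumed; the actual type may differ
  metric_type: gauge   # entropy metrics are gauges per the PR description
  unit: percent        # the annotation added by this commit
```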

Copilot AI left a comment

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated no new comments.


Member

@rdner left a comment

Some questions and clarifications.

@AndersonQ
Member Author

After some more discussion with the ES team, I got final guidance on what should be a dimension for TSDB. The tl;dr is:

  • adding redundant dimensions has negligible impact; however, there is no magic number for when they start to affect tsid generation
  • ideally the TSDB has only dimensions and metrics
  • fields that are dimensions should bring a performance improvement for queries/aggregations

So I'll look into adding more fields as dimensions. I'll keep the PR in draft until I update them and find out whether format is a valid field attribute.

@AndersonQ marked this pull request as draft March 4, 2026 14:26
@AndersonQ
Member Author

@rdner

Could not find format in the spec https://github.com/elastic/package-spec/blob/dde18ea1f8f6481bcc756063df442b0e56272e50/spec/integration/data_stream/fields/fields.spec.yml

It comes from here: https://github.com/elastic/package-spec/blob/5f23052266aab46c2b17423f07ac528c4026fded/spec/integration/data_stream/fields/fields.spec.yml#L39-L40

However, it might not be working as expected (elastic/kibana#207849). Even so, I think it's better to keep it, since it should work.

@AndersonQ marked this pull request as ready for review March 11, 2026 08:42
@AndersonQ requested a review from rdner March 11, 2026 08:42
@elasticmachine

💚 Build Succeeded

cc @AndersonQ

Member

@rdner left a comment

Changes look good to me.

I think we should have a proper issue for the TODO/follow-up you added about the boolean dimension, and remove the comment from this PR.

It's not blocking though.

@AndersonQ merged commit f68249b into elastic:main Mar 12, 2026
10 checks passed
@elastic-vault-github-plugin-prod

Package linux - 1.1.0 containing this change is available at https://epr.elastic.co/package/linux/1.1.0/

Labels

- documentation: Improvements or additions to documentation. Applied to PRs that modify *.md files.
- Integration:linux: Linux Metrics
- Team:Elastic-Agent-Data-Plane: Agent Data Plane team [elastic/elastic-agent-data-plane]

Development

Successfully merging this pull request may close these issues.

[Linux Metrics integration] Migrate data streams to TSDB

7 participants