Memory Leak in v2.27.1: tsdb.txRing consumes much heap memory #9253

@ilixiaocui

What did you do?
We ran into the same problem as #7120.

What did you expect to see?
Memory usage that stays as stable as on our other Prometheus instances.

What did you see instead? Under which circumstances?
The available memory on the node that hosts the Prometheus 2.27.1 instance keeps decreasing (four screenshots attached).
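The title attributes the growth to tsdb.txRing, and a heap profile is one way to check that on a running instance. The following is a minimal sketch, assuming the instance is reachable at a base URL such as http://localhost:9090 (an assumption, the issue only gives the listen address ":9090") and that the standard Go pprof handler is served under /debug/pprof/heap. The helper names heapProfileURL and fetchHeapProfile are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strings"
	"time"
)

// heapProfileURL appends the pprof heap path to a Prometheus base URL.
// It rejects inputs without a scheme or host, such as "localhost".
func heapProfileURL(base string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", fmt.Errorf("base URL %q needs a scheme and a host", base)
	}
	return strings.TrimRight(base, "/") + "/debug/pprof/heap", nil
}

// fetchHeapProfile downloads the heap profile and copies it to dst.
// It returns the number of bytes written.
func fetchHeapProfile(client *http.Client, base string, dst io.Writer) (int64, error) {
	target, err := heapProfileURL(base)
	if err != nil {
		return 0, err
	}
	resp, err := client.Get(target)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("GET %s: unexpected status %s", target, resp.Status)
	}
	return io.Copy(dst, resp.Body)
}

func main() {
	// The base URL is taken from the command line; nothing is fetched without it.
	if len(os.Args) < 2 {
		fmt.Println("usage: heapfetch <prometheus-base-url>")
		return
	}
	out, err := os.Create("heap.pprof")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer out.Close()

	client := &http.Client{Timeout: 60 * time.Second}
	n, err := fetchHeapProfile(client, os.Args[1], out)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("wrote %d bytes to heap.pprof\n", n)
}
```

The saved file can then be inspected with `go tool pprof -top heap.pprof`. If the report in the title holds, entries related to tsdb.txRing should appear near the top of the in-use space listing.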

Environment

  • System information:

    Linux 4.9.65-netease x86_64

  • Prometheus version:
    {
      "status": "success",
      "data": {
        "version": "2.27.1",
        "revision": "db7f0bcec27bd8aeebad6b08ac849516efa9ae02",
        "branch": "HEAD",
        "buildUser": "root@fd804fbd4f25",
        "buildDate": "20210518-14:17:54",
        "goVersion": "go1.16.4"
      }
    }

  • Prometheus configuration file:

{
  "status": "success",
  "data": {
    "alertmanager.notification-queue-capacity": "10000",
    "alertmanager.timeout": "",
    "config.file": "/etc/prometheus/prometheus.yml",
    "enable-feature": "",
    "log.format": "logfmt",
    "log.level": "info",
    "query.lookback-delta": "5m",
    "query.max-concurrency": "20",
    "query.max-samples": "50000000",
    "query.timeout": "2m",
    "rules.alert.for-grace-period": "10m",
    "rules.alert.for-outage-tolerance": "1h",
    "rules.alert.resend-delay": "1m",
    "scrape.adjust-timestamps": "true",
    "storage.exemplars.exemplars-limit": "0",
    "storage.remote.flush-deadline": "1m",
    "storage.remote.read-concurrent-limit": "10",
    "storage.remote.read-max-bytes-in-frame": "1048576",
    "storage.remote.read-sample-limit": "50000000",
    "storage.tsdb.allow-overlapping-blocks": "false",
    "storage.tsdb.max-block-chunk-segment-size": "0B",
    "storage.tsdb.max-block-duration": "2h",
    "storage.tsdb.min-block-duration": "2h",
    "storage.tsdb.no-lockfile": "false",
    "storage.tsdb.path": "/prometheus",
    "storage.tsdb.retention": "0s",
    "storage.tsdb.retention.size": "256GiB",
    "storage.tsdb.retention.time": "1w",
    "storage.tsdb.wal-compression": "true",
    "storage.tsdb.wal-segment-size": "0B",
    "web.config.file": "",
    "web.console.libraries": "/usr/share/prometheus/console_libraries",
    "web.console.templates": "/usr/share/prometheus/consoles",
    "web.cors.origin": ".*",
    "web.enable-admin-api": "false",
    "web.enable-lifecycle": "false",
    "web.external-url": "",
    "web.listen-address": ":9090",
    "web.max-connections": "512",
    "web.page-title": "Prometheus Time Series Collection and Processing Server",
    "web.read-timeout": "5m",
    "web.route-prefix": "/",
    "web.user-assets": ""
  }
}
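Since the report compares this instance with other, stable Prometheus instances, it may help to diff their flag dumps. The dump above has the envelope shape (a "status" string plus a flat "data" map) that Prometheus returns from its /api/v1/status/flags endpoint, so the sketch below assumes that shape. The function names parseFlags and diffFlags are hypothetical, and the "healthy" values in main are made up purely for illustration; they are not taken from the issue.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// flagsResponse mirrors the envelope seen in the dump:
// a status string and a flat map of flag name to value.
type flagsResponse struct {
	Status string            `json:"status"`
	Data   map[string]string `json:"data"`
}

// parseFlags decodes a flags dump and returns the flag map.
func parseFlags(body []byte) (map[string]string, error) {
	var r flagsResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, fmt.Errorf("decode flags: %w", err)
	}
	if r.Status != "success" {
		return nil, fmt.Errorf("unexpected status %q", r.Status)
	}
	return r.Data, nil
}

// diffFlags returns the sorted names of flags whose values differ
// between a and b, or that are present on one side only.
func diffFlags(a, b map[string]string) []string {
	var out []string
	for name, va := range a {
		if vb, ok := b[name]; !ok || va != vb {
			out = append(out, name)
		}
	}
	for name := range b {
		if _, ok := a[name]; !ok {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Two values copied from the dump in this issue.
	leaking := []byte(`{"status":"success","data":{"storage.tsdb.wal-compression":"true","storage.tsdb.retention.time":"1w"}}`)
	// Hypothetical values for a stable instance, for illustration only.
	healthy := []byte(`{"status":"success","data":{"storage.tsdb.wal-compression":"false","storage.tsdb.retention.time":"1w"}}`)

	a, err := parseFlags(leaking)
	if err != nil {
		panic(err)
	}
	b, err := parseFlags(healthy)
	if err != nil {
		panic(err)
	}
	fmt.Println(diffFlags(a, b)) // prints [storage.tsdb.wal-compression]
}
```

An empty result would suggest the flags are not what distinguishes the leaking instance, which points the investigation toward workload or version differences instead.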
