[tsdb] Ingest out of order samples and samples from a few hours ago #8535

@gouthamve

Proposal

Prometheus now accepts remote write data, but it can't reliably ingest data that is more than an hour old. This means that if there is an outage or network partition and the downstream Prometheus can't push for more than an hour, there will likely be data loss once the issue is resolved. Cortex and Thanos, which consume TSDB, have the same limitation.

We should solve this issue in TSDB, and a very interesting trade-off was presented at the storage working group. When ingesting data that falls outside the current head block, we don't need to make it immediately available for querying. We could write it to a log and compact it in the background before making it available for querying. I think it's a fair trade-off that wouldn't use too many extra resources. I'm curious what others think!
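
To make the trade-off concrete, here is a minimal sketch in Go of the idea as described above. It is not TSDB code: the `store` type, its fields, and the `Append`, `Compact`, and `Query` methods are all hypothetical names, the "log" is an in-memory slice rather than an on-disk file, and only a single series is modeled. Duplicate timestamps are not handled.

```go
package main

import (
	"fmt"
	"sort"
)

// sample is one (timestamp, value) pair for a single series.
type sample struct {
	T int64 // timestamp in milliseconds
	V float64
}

// store is a toy, single-series stand-in for a TSDB (hypothetical, for illustration only).
type store struct {
	headMinT int64    // oldest timestamp the head block accepts
	head     []sample // in-order samples, queryable immediately
	oooLog   []sample // old or out-of-order samples, NOT queryable yet
	blocks   []sample // compacted samples, sorted and queryable
}

// Append puts in-order, recent samples in the head and everything else in the log.
func (s *store) Append(sm sample) {
	inOrder := len(s.head) == 0 || sm.T > s.head[len(s.head)-1].T
	if sm.T >= s.headMinT && inOrder {
		s.head = append(s.head, sm)
		return
	}
	s.oooLog = append(s.oooLog, sm)
}

// Compact stands in for the background job: it sorts the logged samples
// into the queryable blocks and then truncates the log.
func (s *store) Compact() {
	if len(s.oooLog) == 0 {
		return
	}
	s.blocks = append(s.blocks, s.oooLog...)
	sort.Slice(s.blocks, func(i, j int) bool { return s.blocks[i].T < s.blocks[j].T })
	s.oooLog = s.oooLog[:0]
}

// Query returns samples in [mint, maxt] from blocks and head, sorted by time.
// Samples still sitting in the log are deliberately invisible.
func (s *store) Query(mint, maxt int64) []sample {
	var out []sample
	for _, src := range [][]sample{s.blocks, s.head} {
		for _, sm := range src {
			if sm.T >= mint && sm.T <= maxt {
				out = append(out, sm)
			}
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].T < out[j].T })
	return out
}

func main() {
	s := &store{headMinT: 1000}
	s.Append(sample{T: 1000, V: 1})
	s.Append(sample{T: 2000, V: 2})
	s.Append(sample{T: 500, V: 0.5})  // older than the head: goes to the log
	s.Append(sample{T: 1500, V: 1.5}) // out of order: goes to the log

	fmt.Println("before compaction:", len(s.Query(0, 3000)))
	s.Compact()
	fmt.Println("after compaction:", len(s.Query(0, 3000)))
}
```

In this sketch the two late samples are accepted at write time but only show up in query results after `Compact` runs, which is the delayed-visibility trade-off proposed above. A real implementation would also need to persist the log and decide how far back it accepts samples.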
