Prometheus Agent: consider never moving samples to checkpoint #9848

@rfratto

Prometheus Agent currently moves samples into checkpoints if, during GC, the timestamps of those samples are behind the lowest sent timestamp across all remote_write endpoints. The intent is to keep samples on a best-effort basis, i.e., to try not to delete them before they're sent.

However, remote_write will never read samples from checkpoints. We should consider always dropping all samples when creating a new checkpoint for the Prometheus Agent WAL. Series records for live time series should still be kept in the checkpoint.
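
To make the proposal concrete, here is a rough sketch of the filtering a checkpoint would apply: every sample record is dropped, and series records are kept only for series that are still live. The `Record` type, the `RecordType` constants, and `checkpointRecords` are hypothetical stand-ins made up for illustration; they are not the actual Prometheus WAL or checkpoint API.

```go
package main

import "fmt"

// RecordType is a hypothetical tag for the kind of WAL record.
type RecordType int

const (
	SeriesRecord RecordType = iota
	SampleRecord
)

// Record is a simplified, hypothetical stand-in for a decoded WAL record.
type Record struct {
	Type      RecordType
	SeriesRef uint64
	Timestamp int64 // only meaningful for sample records
}

// checkpointRecords returns the records that would be carried into a new
// checkpoint under the proposal: series records for live series are kept,
// and all sample records are dropped regardless of their timestamp.
func checkpointRecords(segment []Record, live map[uint64]bool) []Record {
	kept := make([]Record, 0, len(segment))
	for _, r := range segment {
		switch r.Type {
		case SeriesRecord:
			if live[r.SeriesRef] {
				kept = append(kept, r)
			}
		case SampleRecord:
			// Dropped unconditionally: remote_write never reads samples
			// back from a checkpoint, so keeping them has no benefit.
		}
	}
	return kept
}

func main() {
	segment := []Record{
		{Type: SeriesRecord, SeriesRef: 1},
		{Type: SampleRecord, SeriesRef: 1, Timestamp: 100},
		{Type: SeriesRecord, SeriesRef: 2}, // series 2 is no longer live
		{Type: SampleRecord, SeriesRef: 2, Timestamp: 50},
	}
	live := map[uint64]bool{1: true}

	for _, r := range checkpointRecords(segment, live) {
		fmt.Printf("kept record: type=%d ref=%d\n", r.Type, r.SeriesRef)
	}
}
```

With this shape, the lowest sent timestamp no longer matters for what goes into the checkpoint; it would only influence which segments are eligible for truncation.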

As an aside: @gouthamve pointed out that remote_write doesn't currently read any existing samples at all from an existing WAL, which makes having a WAL seem a little pointless. I know that I've heard talk from @csmarchbanks and @cstyan of using some kind of marker so remote_write can restart from that marker, but I don't know what the status of that is. Until we have that, I do agree that the Prometheus Agent's (and grafana/agent's) WAL isn't really providing any benefits to the user.
