Description
Prometheus Agent currently moves samples to checkpoints if, during GC, the timestamps of those samples are behind the lowest sent timestamp across all remote_write endpoints. The intent is that samples are kept best-effort: i.e., try not to delete them before they're sent.
However, remote_write will never read samples from checkpoints. We should consider always dropping all samples when creating a new checkpoint for the Prometheus Agent WAL. Series records for live time series should still be kept in the checkpoint.
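To make the proposal concrete, here is a rough sketch of the filtering I have in mind. The `record` type, the `recordType` constants, and `checkpointRecords` are all made up for illustration and are not the actual WAL or checkpoint API in Prometheus; the real change would live wherever the agent decides what to keep when building a checkpoint.

```go
package main

import "fmt"

// recordType and record are hypothetical stand-ins for WAL record kinds.
// They are NOT the types Prometheus uses.
type recordType int

const (
	seriesRecord recordType = iota
	samplesRecord
)

type record struct {
	typ       recordType
	seriesRef uint64
	timestamp int64 // only meaningful for samplesRecord
}

// checkpointRecords returns the records that would be carried over into a
// new checkpoint under the proposed behavior: series records for live
// series are kept, and every sample is dropped regardless of its timestamp.
func checkpointRecords(segment []record, live map[uint64]bool) []record {
	var kept []record
	for _, r := range segment {
		if r.typ == seriesRecord && live[r.seriesRef] {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	segment := []record{
		{typ: seriesRecord, seriesRef: 1},
		{typ: samplesRecord, seriesRef: 1, timestamp: 100},
		{typ: seriesRecord, seriesRef: 2}, // series 2 is no longer live
		{typ: samplesRecord, seriesRef: 2, timestamp: 90},
	}
	live := map[uint64]bool{1: true}

	kept := checkpointRecords(segment, live)
	fmt.Printf("kept %d of %d records\n", len(kept), len(segment))
}
```

Note that the sketch has no timestamp comparison at all: since nothing reads samples back out of a checkpoint, the lowest sent timestamp would no longer factor into the decision.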
As an aside: @gouthamve pointed out that remote_write doesn't currently read any existing samples at all from an existing WAL, which makes having a WAL seem a little pointless. I know that I've heard talk from @csmarchbanks and @cstyan of using some kind of marker so remote_write can re-start from a marker, but I don't know what the status of that is. Until we have that, I do agree that the Prometheus Agent's (and grafana/agent's) WAL isn't really providing any benefits to the user.