Closed
EDIT: Last updated 2024-04-03
I'd like to collect issues around the current status of remote write so that we can better track what's still a problem, what we might want to add or improve, and discuss options.
- 1. Revisit the remote write queue config defaults: raise samples per send, lower max shards, raise min shards. (reevaluate remote write queue config defaults #8808)
- 2. Keep a checkpoint per remote write queue so we can attempt restart from that point in the WAL on restarts. ([Remote Write] WAL watching best-effort start from a checkpoint instead of time.Now #8809)
- 3. Transactional remote write design doc.
- 4. Investigate WAL replay improvements, TSDB related as well. ([Remote Write] Improve multi-queue WAL replay #6733; Memory usage spikes during WAL replay to more than normal usage #6934)
- 5. Improvements to remote write alerts and dashboards in prometheus-mixin. (mixin: remote-write grafana dashboard should graph write latencies #7218; mixin: remote-write related alert severity should take HA setup into account #7176)
- 6. Investigate possible improvements for resharding behaviour. (Remote write should retry on 429 #8418)
- 7. Metrics audit. (remote storage metrics audit #8779)
- 8. Remote write 2.0. ([meta] Remote write 2.0 #13105)