Adds support to drop samples in remote-write on violating the retry-policy#8549
Adds support to drop samples in remote-write on violating the retry-policy#8549harkishen wants to merge 3 commits intoprometheus:mainfrom
Conversation
|
Note: there are disagreements over this on the mailing list and the doc. |
c108c15 to
2032b2d
Compare
|
In recent discussions with @csmarchbanks, we came to an approach to respect both samples age and max-retries count in the retry mechanism of the remote-write component. The recent commit implements the same along with a refactor to the retry configurations in queue_config. |
3a30638 to
4e26cab
Compare
cstyan
left a comment
There was a problem hiding this comment.
quick review, haven't looked at the tests and I also think we should update docs here as well
we potentially want to include a disclaimer somewhere that using retries could lead to increased remote write falls behind issues?
4e26cab to
9e2d57a
Compare
Yes, I think we should. |
9e2d57a to
4bcc198
Compare
4bcc198 to
932ec91
Compare
|
This is a breaking change in the configuration. I know that "formally" we do not provide stability guarantees for the config of remote write/read but on the other hand it is said left and right that "remote write should be stable" or even "is stable". I would avoid that breaking change in the configuration. |
|
Reverted the breaking change in config. The retry-policy should now be applicable without breaking existing systems. BTW, since we implemented exemplars that will be passed in the same |
15fb75e to
6dbc972
Compare
|
@cstyan can you take another look at this, please? |
storage/remote/queue_manager.go
Outdated
| ) | ||
| if s.qm.cfg.RetryPolicy != nil { | ||
| exemplars := removeTs(samples) | ||
| if len(exemplars) > 0 { |
There was a problem hiding this comment.
you could also just check the input parameter exemplarCount int
There was a problem hiding this comment.
Isn't it safer to consider the actual length received than function param? If you think we should use the param, let me know and I will update.
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
6dbc972 to
59ae979
Compare
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
b31260d to
f49040e
Compare
Signed-off-by: Harkishen Singh <harkishensingh@hotmail.com>
f49040e to
7fc39f5
Compare
|
Hello @Harkishen-Singh @cstyan we have checked this PR in the bug scrub and we have decided to close it, but feel free to rebase and reopen if you have time and agreement on the next steps. Thanks! |
Signed-off-by: Harkishen-Singh harkishensingh@hotmail.com
Fixes: #7912
Design doc: https://docs.google.com/document/d/1pC5GjMkSe4YHFqVMp0ZX9C1Iih8QLOnnBG-HguKzwrM/edit?usp=sharing
Mailing list discussion: https://groups.google.com/u/1/g/prometheus-developers/c/2PEY4o5tgoI