Skip to content

Adds support to drop samples in remote-write on violating the retry-policy#8549

Closed
harkishen wants to merge 3 commits intoprometheus:mainfrom
harkishen:remote-write-drop-samples
Closed

Adds support to drop samples in remote-write on violating the retry-policy#8549
harkishen wants to merge 3 commits intoprometheus:mainfrom
harkishen:remote-write-drop-samples

Conversation

@harkishen
Copy link
Copy Markdown
Contributor

@harkishen harkishen commented Mar 1, 2021

@roidelapluie
Copy link
Copy Markdown
Member

Note: there are disagreements over this on the mailing list and the doc.

@harkishen harkishen changed the title Adds support in remote-write to drop samples based on response status WIP: Adds support in remote-write to drop samples based on response status Mar 1, 2021
@harkishen harkishen force-pushed the remote-write-drop-samples branch from c108c15 to 2032b2d Compare April 4, 2021 07:05
@harkishen
Copy link
Copy Markdown
Contributor Author

In recent discussions with @csmarchbanks, we came to an approach to respect both samples age and max-retries count in the retry mechanism of the remote-write component. The recent commit implements the same along with a refactor to the retry configurations in queue_config.

@harkishen harkishen changed the title WIP: Adds support in remote-write to drop samples based on response status Adds support in remote-write to drop samples based on response status Apr 4, 2021
@harkishen harkishen force-pushed the remote-write-drop-samples branch 2 times, most recently from 3a30638 to 4e26cab Compare April 4, 2021 07:22
Copy link
Copy Markdown
Member

@cstyan cstyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick review, haven't looked at the tests and I also think we should update docs here as well

we potentially want to include a disclaimer somewhere that using retries could lead to increased remote write falls behind issues?

@harkishen harkishen force-pushed the remote-write-drop-samples branch from 4e26cab to 9e2d57a Compare April 26, 2021 07:18
@harkishen
Copy link
Copy Markdown
Contributor Author

we potentially want to include a disclaimer somewhere that using retries could lead to increased remote write falls behind issues?

Yes, I think we should.

@harkishen harkishen force-pushed the remote-write-drop-samples branch from 9e2d57a to 4bcc198 Compare April 26, 2021 07:22
@harkishen harkishen requested a review from cstyan April 26, 2021 07:22
@harkishen harkishen force-pushed the remote-write-drop-samples branch from 4bcc198 to 932ec91 Compare April 26, 2021 09:41
Copy link
Copy Markdown
Member

@cstyan cstyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good

@roidelapluie
Copy link
Copy Markdown
Member

This is a breaking change in the configuration.

I know that "formally" we do not provide stability guarantees for the config of remote write/read but on the other hand it is said left and right that "remote write should be stable" or even "is stable".

I would avoid that breaking change in the configuration.

@harkishen
Copy link
Copy Markdown
Contributor Author

harkishen commented May 25, 2021

Reverted the breaking change in config. The retry-policy should now be applicable without breaking existing systems.

BTW, since we implemented exemplars that will be passed in the same []prompb.Timeseries slice, the retry-policy will drop the exemplars too, if the min timestamp of the samples from time-series is below the permitted range. I think we should not drop the exemplars but haven't implemented it yet since I want to knowed the views from other people regarding this.

@harkishen harkishen changed the title Adds support in remote-write to drop samples based on response status Adds support to drop samples in remote-write on violating the retry-policy May 25, 2021
@harkishen harkishen requested a review from cstyan May 25, 2021 15:12
@harkishen harkishen force-pushed the remote-write-drop-samples branch from 15fb75e to 6dbc972 Compare May 26, 2021 07:14
@codesome
Copy link
Copy Markdown
Member

@cstyan can you take another look at this, please?

)
if s.qm.cfg.RetryPolicy != nil {
exemplars := removeTs(samples)
if len(exemplars) > 0 {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could also just check the input parameter exemplarCount int

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it safer to consider the actual length received than function param? If you think we should use the param, let me know and I will update.

Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
@harkishen harkishen force-pushed the remote-write-drop-samples branch from 6dbc972 to 59ae979 Compare July 30, 2021 07:30
Signed-off-by: Harkishen-Singh <harkishensingh@hotmail.com>
@harkishen harkishen force-pushed the remote-write-drop-samples branch 2 times, most recently from b31260d to f49040e Compare July 30, 2021 07:35
Signed-off-by: Harkishen Singh <harkishensingh@hotmail.com>
@roidelapluie
Copy link
Copy Markdown
Member

Hello @Harkishen-Singh @cstyan we have checked this PR in the bug scrub and we have decided to close it, but feel free to rebase and reopen if you have time and agreement on the next steps.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Proposal: Automatically Drop Old Remote Write Samples Before Sending

4 participants