Discussion: add an optional retry mechanism to SLM policies #70587

@joegallo

A snapshot policy that runs at a high frequency doesn't need a retry mechanism, but one that runs at a low frequency seems to.

For example, if a policy were configured to take a snapshot every 30 minutes, and a particular snapshot failed or only partially succeeded, then it's probably not all that important to retry, because the next snapshot attempt is going to occur in 30 minutes anyway.

On the other hand, if I have a snapshot scheduled once a month, then being unlucky and failing to take a snapshot is a bit of a pain -- having some kind of configurable retry mechanism here might be nice.

A made-up simple version of this would be adding "retry":true to the SLM policy definition, where SLM would control the backoff and number of retries internally 'as appropriate'. Contrariwise, a made-up complex version of this would require that we express the maximum number of retries to attempt, and the time to wait between retries (potentially with some kind of backoff, etc).
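
To make the two variants concrete, here is a rough Python sketch of how they might relate. Everything in it is hypothetical: SLM has no `retry` field today, and the names `RetryConfig`, `max_retries`, `initial_delay_s`, `backoff_factor`, `max_delay_s`, `retry_delays`, and `from_policy`, along with the default values, are made up purely for illustration. The sketch treats `"retry": true` as shorthand for internally chosen defaults, and the complex form as an explicit override of those defaults.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RetryConfig:
    """Hypothetical retry settings; none of these fields exist in SLM."""
    max_retries: int = 3            # maximum number of retry attempts
    initial_delay_s: float = 60.0   # wait before the first retry, in seconds
    backoff_factor: float = 2.0     # multiplier applied to the wait after each retry
    max_delay_s: float = 3600.0     # upper bound on any single wait, in seconds


def retry_delays(cfg: RetryConfig) -> List[float]:
    """Return the wait (in seconds) before each retry, with capped exponential backoff."""
    delays = []
    delay = cfg.initial_delay_s
    for _ in range(cfg.max_retries):
        delays.append(min(delay, cfg.max_delay_s))
        delay *= cfg.backoff_factor
    return delays


def from_policy(policy: dict) -> Optional[RetryConfig]:
    """Read the made-up "retry" field from a policy body.

    - absent or false  -> no retries (None)
    - true             -> the "simple" variant, using internal defaults
    - an object        -> the "complex" variant, overriding the defaults
    """
    retry = policy.get("retry")
    if retry is True:
        return RetryConfig()
    if isinstance(retry, dict):
        return RetryConfig(**retry)
    return None


# Illustrative policy bodies; only "retry" matters for this sketch.
simple_policy = {"schedule": "0 0 0 1 * ?", "retry": True}
complex_policy = {
    "schedule": "0 0 0 1 * ?",
    "retry": {"max_retries": 5, "initial_delay_s": 300, "backoff_factor": 2},
}

print(retry_delays(from_policy(simple_policy)))
print(retry_delays(from_policy(complex_policy)))
```

One appeal of this shape is that the simple and complex versions are not mutually exclusive: the boolean form could ship first, and the object form could be added later without breaking existing policies.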

The goal of this ticket is to have a discussion and decide whether we should implement this, and if so how, and then to close this ticket in favor of a ticket that describes the outcome of this discussion.

Related to #65826, in that the discussion of that ticket caused me to file this ticket.
