Skip to content

Prevent lock refresh from leaving behind lots of stale locks#3512

Merged
fd0 merged 3 commits intorestic:masterfrom
MichaelEischer:cleaner-lock-refresh
Mar 21, 2022
Merged

Prevent lock refresh from leaving behind lots of stale locks#3512
fd0 merged 3 commits intorestic:masterfrom
MichaelEischer:cleaner-lock-refresh

Conversation

@MichaelEischer
Copy link
Copy Markdown
Member

@MichaelEischer MichaelEischer commented Sep 15, 2021

What does this PR change? What problem does it solve?

When the lock refresh failed to delete the old lock file, it forgot about the new one. Instead it continued trying to delete the old (usually no longer existing) lock file and thus over time lots of lock files accumulate.

One way to trigger this is when a delete operation seems to fail for example due to a timeout although it actually succeeded.

The PR also contains a small initialization cleanup to ensure that the lock cleanup handler is called after the global context was canceled on interruptions.

Was the change previously discussed in an issue or on the forum?

Fixes #2473
Fixes #2452
Fixes #2562

Checklist

  • I have read the contribution guidelines.
  • I have enabled maintainer edits.
  • I have added tests for all code changes.
  • [ ] I have added documentation for relevant changes (in the manual).
  • There's a new file in changelog/unreleased/ that describes the changes for our users (see template).
  • I have run gofmt on the code in all commits.
  • All commit messages are formatted in the same style as the other commits in the repo.
  • I'm done! This pull request is ready for review.

Copy link
Copy Markdown

@tbclark3 tbclark3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I first noticed the locks problem several weeks ago after running restic for more than a year with no errors. These changes work for me.

This ensures that restic won't create lots of new lock files without
deleting them later on.

In some cases a Delete operation on a backend can return a "File does
not exist" error even though the Delete operation succeeded. This can
for example be caused by request retries. This caused restic to forget
about the new lock file and continue trying to remove the old (already
deleted) lock file.
cleanup handlers run in the order in which they are added. As Go calls
init() functions in lexical order, the cleanup handler from global.go
was registered before that from lock.go, which is the correct order.

Make this order explicit to ensure that this won't break accidentally.
Copy link
Copy Markdown
Member

@fd0 fd0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@fd0 fd0 merged commit 0b8b524 into restic:master Mar 21, 2022
@MichaelEischer MichaelEischer deleted the cleaner-lock-refresh branch March 22, 2022 20:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Intermittently fails to remove Backblaze locks Prune operation fails repeatedly (B2 Bucket) Remove(<lock/5a53798b51>) returned error

3 participants