vinyl: fix gc vs vylog race leading to duplicate record#10130
Merged
locker merged 1 commit intotarantool:masterfrom Jun 13, 2024
Merged
vinyl: fix gc vs vylog race leading to duplicate record#10130locker merged 1 commit intotarantool:masterfrom
locker merged 1 commit intotarantool:masterfrom
Conversation
lenkis
approved these changes
Jun 11, 2024
Vinyl run files aren't always deleted immediately after compaction,
because we need to keep run files corresponding to checkpoints for
backups. Such run files are deleted by the garbage collection procedure,
which performs the following steps:
1. Loads information about all run files from the last vylog file.
2. For each loaded run record that is marked as dropped:
a. Tries to remove the run files.
b. On success, writes a "forget" record for the dropped run,
which will make vylog purge the run record on the next
vylog rotation (checkpoint).
(see `vinyl_engine_collect_garbage()`)
The garbage collection procedure writes the "forget" records
asynchronously using `vy_log_tx_try_commit()`, see `vy_gc_run()`.
This procedure can be successfully executed during vylog rotation,
because it doesn't take the vylog latch. It simply appends records
to a memory buffer which is flushed either on the next synchronous
vylog write or vylog recovery.
The problem is that the garbage collection isn't necessarily loads
the latest vylog file because the vylog file may be rotated between
it calls `vy_log_signature()` and `vy_recovery_new()`. This may
result in a "forget" record written twice to the same vylog file
for the same run file, as follows:
1. GC loads last vylog N
2. GC starts removing dropped run files.
3. CHECKPOINT starts vylog rotation.
4. CHECKPOINT loads vylog N.
5. GC writes a "forget" record for run A to the buffer.
6. GC is completed.
7. GC is restarted.
8. GC finds that the last vylog is N and blocks on the vylog latch
trying to load it.
9. CHECKPOINT saves vylog M (M > N).
10. GC loads vylog N. This triggers flushing the forget record for
run A to vylog M (not to vylog N), because vylog M is the last
vylog at this point of time.
11. GC starts removing dropped run files.
12. GC writes a "forget" record for run A to the buffer again,
because in vylog N it's still marked as dropped and not forgotten.
(The previous "forget" record was written to vylog M).
13. Now we have two "forget" records for run A in vylog M.
Such duplicate run records aren't tolerated by the vylog recovery
procedure, resulting in a permanent error on the next checkpoint:
```
ER_INVALID_VYLOG_FILE: Invalid VYLOG file: Run XXXX forgotten but not registered
```
To fix this issue, we move `vy_log_signature()` under the vylog latch
to `vy_recovery_new()`. This makes sure that GC will see vylog records
that it's written during the previous execution.
Catching this race in a function test would require a bunch of ugly
error injections so let's assume that it'll be tested by fuzzing.
Closes tarantool#10128
NO_DOC=bug fix
NO_TEST=tested manually with fuzzer
0c8d013 to
c46f10c
Compare
Member
Author
|
Cherry-picked to 2.11 and 3.1. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Vinyl run files aren't always deleted immediately after compaction, because we need to keep run files corresponding to checkpoints for backups. Such run files are deleted by the garbage collection procedure, which performs the following steps:
(see
vinyl_engine_collect_garbage())The garbage collection procedure writes the "forget" records asynchronously using
vy_log_tx_try_commit(), seevy_gc_run(). This procedure can be successfully executed during vylog rotation, because it doesn't take the vylog latch. It simply appends records to a memory buffer which is flushed either on the next synchronous vylog write or vylog recovery.The problem is that the garbage collection isn't necessarily loads the latest vylog file because the vylog file may be rotated between it calls
vy_log_signature()andvy_recovery_new(). This may result in a "forget" record written twice to the same vylog file for the same run file, as follows:Such duplicate run records aren't tolerated by the vylog recovery procedure, resulting in a permanent error on the next checkpoint:
To fix this issue, we move
vy_log_signature()under the vylog latch tovy_recovery_new(). This makes sure that GC will see vylog records that it's written during the previous execution.Catching this race in a function test would require a bunch of ugly error injections so let's assume that it'll be tested by fuzzing.
Closes #10128