vinyl: wake up waiters after clearing checkpoint_in_progress flag #10271

Merged
locker merged 1 commit into tarantool:master from locker:vy-snapshot-hang-fix-2
Jul 18, 2024
locker commented Jul 18, 2024

The function vy_space_build_index, which builds a new index on DDL, calls vy_scheduler_dump on completion. If there's a checkpoint in progress, the latter will wait on vy_scheduler::dump_cond until vy_scheduler::checkpoint_in_progress is cleared. The problem is vy_scheduler_end_checkpoint doesn't broadcast dump_cond when it clears the flag. Usually, everything works fine because the condition variable is broadcast on any dump completion, and vinyl checkpoint implies a dump, but under certain conditions this may lead to a fiber hang. Let's broadcast dump_cond in vy_scheduler_end_checkpoint to be on the safe side.

Closes #10267
Follow-up #10234
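
To illustrate the failure mode, here is a toy single-threaded model in C. It is only a sketch, not the vinyl code: `toy_scheduler`, `toy_waiter`, `toy_dump_wait`, `toy_broadcast`, `toy_end_checkpoint`, and `waiter_hangs` are hypothetical names, and the condition variable is reduced to a list of parked waiters. The assumption is that a waiter rechecks `checkpoint_in_progress` only when the condition is broadcast, so clearing the flag without a broadcast leaves it parked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_WAITERS = 8 };

/* A parked "fiber": the model only tracks whether it is still blocked. */
struct toy_waiter {
    bool blocked;
};

/* Hypothetical stand-in for the two vy_scheduler members named above. */
struct toy_scheduler {
    bool checkpoint_in_progress;
    struct toy_waiter *dump_cond[MAX_WAITERS]; /* waiters parked on the condition */
    size_t dump_cond_len;
};

/* Models the wait: park on dump_cond while a checkpoint is in progress. */
static void
toy_dump_wait(struct toy_scheduler *s, struct toy_waiter *w)
{
    w->blocked = s->checkpoint_in_progress;
    if (w->blocked) {
        assert(s->dump_cond_len < MAX_WAITERS);
        s->dump_cond[s->dump_cond_len++] = w;
    }
}

/* Models a broadcast: every parked waiter rechecks the flag. */
static void
toy_broadcast(struct toy_scheduler *s)
{
    size_t kept = 0;
    for (size_t i = 0; i < s->dump_cond_len; i++) {
        struct toy_waiter *w = s->dump_cond[i];
        if (s->checkpoint_in_progress)
            s->dump_cond[kept++] = w; /* flag still set: stay parked */
        else
            w->blocked = false;       /* flag cleared: resume */
    }
    s->dump_cond_len = kept;
}

/* Clears the flag; `broadcast` selects the old (false) or fixed (true) behavior. */
static void
toy_end_checkpoint(struct toy_scheduler *s, bool broadcast)
{
    s->checkpoint_in_progress = false;
    if (broadcast)
        toy_broadcast(s);
}

/* Returns true if the waiter is still blocked after the checkpoint ends. */
bool
waiter_hangs(bool broadcast)
{
    struct toy_scheduler s = { .checkpoint_in_progress = true };
    struct toy_waiter w = { .blocked = false };
    toy_dump_wait(&s, &w);
    toy_end_checkpoint(&s, broadcast);
    return w.blocked;
}
```

In this model `waiter_hangs(false)` returns true, because nothing else wakes the waiter, while `waiter_hangs(true)` returns false. In the real scheduler a dump completion usually provides the wakeup, which is why the hang shows up only under certain conditions.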

locker requested a review from a team as a code owner July 18, 2024 10:03
locker force-pushed the vy-snapshot-hang-fix-2 branch from fc92eb1 to 0cdb171 July 18, 2024 10:14
coveralls commented Jul 18, 2024

Coverage Status

coverage: 87.076% (-0.003%) from 87.079% when pulling 759933a on locker:vy-snapshot-hang-fix-2 into 62c4936 on tarantool:master.

locker requested a review from nshy July 18, 2024 10:29
locker force-pushed the vy-snapshot-hang-fix-2 branch from 0cdb171 to 6b46642 July 18, 2024 12:20
The function `vy_space_build_index`, which builds a new index on DDL,
calls `vy_scheduler_dump` on completion. If there's a checkpoint in
progress, the latter will wait on `vy_scheduler::dump_cond` until
`vy_scheduler::checkpoint_in_progress` is cleared. The problem is
`vy_scheduler_end_checkpoint` doesn't broadcast `dump_cond` when it
clears the flag. Usually, everything works fine because the condition
variable is broadcast on any dump completion, and vinyl checkpoint
implies a dump, but under certain conditions this may lead to a fiber
hang. Let's broadcast `dump_cond` in `vy_scheduler_end_checkpoint`
to be on the safe side.

While we are at it, let's also inject a dump delay into the original
test to make it more robust.

Closes tarantool#10267
Follow-up tarantool#10234

NO_DOC=bug fix
locker force-pushed the vy-snapshot-hang-fix-2 branch from 6b46642 to 759933a July 18, 2024 13:04
locker assigned locker and unassigned nshy Jul 18, 2024
locker added the full-ci label (Enables all tests for a pull request) Jul 18, 2024
locker merged commit fc3196d into tarantool:master Jul 18, 2024
locker deleted the vy-snapshot-hang-fix-2 branch July 18, 2024 14:52
locker commented Jul 18, 2024

Cherry-picked to 2.11 and 3.1.

Development

Successfully merging this pull request may close these issues.

Lua execution is blocked after running random operations on vinyl space

4 participants