Skip to content

kvcoord: Release catchup reservation before re-acquire attempt#105083

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
miretskiy:catchup
Jun 20, 2023
Merged

kvcoord: Release catchup reservation before re-acquire attempt#105083
craig[bot] merged 1 commit intocockroachdb:masterfrom
miretskiy:catchup

Conversation

@miretskiy
Copy link
Copy Markdown
Contributor

@miretskiy miretskiy commented Jun 16, 2023

Release catchup scan reservation prior to attemt to re-acquire it. Failure to do so could result in a stuck mux rangefeed when enough ranges encounter an error, such as range split, prior to receiving the first checkpoint event, that would cause additional attempts to acquire catchup scan quota.

Fixes #105058

Release note (bug fix): Fix a bug in mux rangefeed implementation that may cause mux rangefeed to become stuck if enough ranges encounter certain error concurrently.

@miretskiy miretskiy requested review from a team and aliher1911 June 16, 2023 21:21
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@miretskiy miretskiy added the backport-23.1.x PAST MAINTENANCE SUPPORT: 23.1 patch releases via ER request only label Jun 16, 2023
Release catchup scan reservation prior to attemt to re-acquire
it.  Failure to do so could result in a stuck mux rangefeed when enough
ranges encounter an error at the same time (which can happen if
e.g. a node gets restarted).

Fixes cockroachdb#105058

Release note (bug fix): Fix a bug in mux rangefeed implementation that
may cause mux rangefeed to become stuck if enough ranges enounter
certain error concurrently.
Copy link
Copy Markdown
Contributor

@aliher1911 aliher1911 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand how current change makes reservations aligned with activeMuxRangeFeed lifecycle.

BuI can only see us having double reservations if we had to do a errInfo.resolveSpan branch and have new feeds being blocked by feed they are trying to replace. This doesn't look like the case of nodes being restarted, but having many concurrent splits when table grows.

If that's the case, can you update the description?

@miretskiy
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 20, 2023

Build succeeded:

@shermanCRL
Copy link
Copy Markdown
Contributor

How about v22.2 @miretskiy? Or should we simply say mux is not recommended on v22.2?

@miretskiy
Copy link
Copy Markdown
Contributor Author

How about v22.2 @miretskiy? Or should we simply say mux is not recommended on v22.2?

not applicable. this is rewritten code that didn't exist (in its current shape) in 22.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-23.1.x PAST MAINTENANCE SUPPORT: 23.1 patch releases via ER request only

Projects

None yet

Development

Successfully merging this pull request may close these issues.

kvcoord: catchup scan quota acquisition

4 participants