changefeedccl: prevent deadlock in TestChangefeedKafkaMessageTooLarge#107752
Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom Jul 28, 2023
Merged
Conversation
Member
ff789aa to
284f2b2
Compare
miretskiy
reviewed
Jul 27, 2023
284f2b2 to
fa37eaf
Compare
miretskiy
approved these changes
Jul 27, 2023
fa37eaf to
a8f825f
Compare
Previously, this test would deadlock due to kafka retrying messages too many times. These messages are stored in a buffer of size 1024 created by the CDC testing infra: https://github.com/cockroachdb/cockroach/blob/5c3f96d38cdc3a2d953ca3ffb1e39e97d7e5110e/pkg/ccl/changefeedccl/testfeed_test.go#L1819 The test asserts that 2000 messages pass through the buffer. When the test finishes, it stops reading from the buffer. The problem is that due to retries, there may be more messages sent to the buffer than that are read out of the buffer. Even after the 2000 messages are read and the test is shutting down, the sink may be blocked trying to put resolved messages (plus retries) in the buffer. If this happens, the changefeed resumer (same goroutine as the kafka sink) gets blocked and does not terminate when the job is cancelled at the end of the test. This change caps the number of retries at 200 for this test, so there should be no more than 200 extra messages plus a few resolved messages during this test. This is far less than the buffer size of 1024. See detailed explanation in cockroachdb#107591. Fixes: cockroachdb#107591 Epic: none Release note: None
a8f825f to
4f1e340
Compare
Contributor
Author
|
bors r=miretskiy |
Contributor
|
This PR was included in a batch that was canceled, it will be automatically retried |
Contributor
|
Build succeeded: |
This was referenced Jul 31, 2023
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Previously, this test would deadlock due to kafka retrying messages too many times. These messages are stored in a buffer of size 1024 created by the CDC testing infra:
cockroach/pkg/ccl/changefeedccl/testfeed_test.go
Line 1819 in 5c3f96d
The test asserts that 2000 messages pass through the buffer. When the test finishes, it stops reading from the buffer. The problem is that due to retries, there may be more messages sent to the buffer than that are read out of the buffer. Even after the 2000 messages are read and the test is shutting down, the sink may be blocked trying to put resolved messages (plus retries) in the buffer. If this happens, the changefeed resumer (same goroutine as the kafka sink) gets blocked and does not terminate when the job is cancelled at the end of the test.
This change caps the number of retries at 200 for this test, so there should be no more than 200 extra messages plus a few resolved messages during this test. This is far less than the buffer size of 1024.
See detailed explanation in #107591.
Fixes: #107591
Epic: none
Release note: None