
feat: Add time.Sleep to mitigate race condition. #1923

Merged
marianogappa merged 2 commits into main from mariano/mitigate-race-condition on Oct 4, 2024


@marianogappa
Contributor

The ShuffleQueue scheduler strategy has an infrequent race condition, as explained by the comment:

	// A race condition is possible when the last active table asynchronously
	// queues a relation. The table finishes (calling `.Done()`) a moment
	// before the queue receives the `.Push()`. At this point, the queue is
	// empty and there are no active workers.
	//
	// A moment later, the queue receives the `.Push()` and queues a new task.
	//
	// This is a very infrequent case according to tests, but it happens.

After many attempts at a more elegant solution, I finally gave in and settled on:

time.Sleep(10 * time.Millisecond)

Looks ugly, but after running the tests 300 times (so around 3000 syncs), it works 🤷

✓ cloudquery/plugin-sdk main* $ go test ./scheduler -count=100 -run TestScheduler             ⏱ 15:04:12
ok  	github.com/cloudquery/plugin-sdk/v4/scheduler	143.523s
✓ cloudquery/plugin-sdk main* $ go test ./scheduler -count=100 -run TestScheduler              ⏱ 15:06:56
ok  	github.com/cloudquery/plugin-sdk/v4/scheduler	142.796s
✓ cloudquery/plugin-sdk main* $ go test ./scheduler -count=100 -run TestScheduler              ⏱ 15:09:22
ok  	github.com/cloudquery/plugin-sdk/v4/scheduler	144.304s

@marianogappa marianogappa marked this pull request as ready for review October 4, 2024 14:15
@marianogappa marianogappa requested review from a team and erezrokah October 4, 2024 14:15
@github-actions github-actions bot added the feat label Oct 4, 2024
@marianogappa
Contributor Author

lol, there's a unit test failure, but it wasn't caused by this code:

panic: test timed out after 10m0s
	running tests:
		TestScheduler_Cancellation (9m51s)
		TestScheduler_Cancellation/should_not_consume_all_message_on_cancel_shuffle (9m51s)

Note the strategy is shuffle, not shuffle-queue:

t.Run(fmt.Sprintf("%s_%s", tc.name, strategy.String())
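As a rough illustration of why the suffix identifies the strategy, the sketch below rebuilds the subtest name the same way. The helper `subtestName` is hypothetical, and the case name and the two strategy strings are assumptions inferred from the failure output and the comment above. The space-to-underscore step approximates how `go test` prints subtest names.

```go
package main

import (
	"fmt"
	"strings"
)

// subtestName mirrors the fmt.Sprintf call from the test and then replaces
// spaces with underscores, approximating how `go test` displays subtest names.
// The helper itself is hypothetical and exists only for illustration.
func subtestName(caseName, strategy string) string {
	name := fmt.Sprintf("%s_%s", caseName, strategy)
	return strings.ReplaceAll(name, " ", "_")
}

func main() {
	// Assumed case name and strategy strings, inferred from the output above.
	fmt.Println(subtestName("should not consume all message on cancel", "shuffle"))
	fmt.Println(subtestName("should not consume all message on cancel", "shuffle-queue"))
}
```

Under these assumptions, a failure in the shuffle-queue strategy would end in `_shuffle-queue`, so the timed-out subtest ending in `_shuffle` points at the other strategy.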

@marianogappa marianogappa merged commit 83dfcad into main Oct 4, 2024
@marianogappa marianogappa deleted the mariano/mitigate-race-condition branch October 4, 2024 14:47
kodiakhq bot pushed a commit that referenced this pull request Oct 7, 2024
🤖 I have created a release *beep* *boop*
---


## [4.66.0](v4.65.0...v4.66.0) (2024-10-07)


### Features

* Add time.Sleep to mitigate race condition. ([#1923](#1923)) ([83dfcad](83dfcad))


### Bug Fixes

* **deps:** Update aws-sdk-go-v2 monorepo ([#1926](#1926)) ([4fc8896](4fc8896))
* **deps:** Update module google.golang.org/grpc to v1.67.1 ([#1925](#1925)) ([5e0305d](5e0305d))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).