Conversation
⏱️ Benchmark results
|
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #1218 +/- ##
==========================================
+ Coverage 48.54% 48.75% +0.20%
==========================================
Files 87 88 +1
Lines 8185 8240 +55
==========================================
+ Hits 3973 4017 +44
- Misses 3842 3850 +8
- Partials 370 373 +3
☔ View full report in Codecov by Sentry. |
scheduler/scheduler_shuffle.go
Outdated
|
|
||
| func shuffle(tableClients []tableClient) { | ||
| // use a fixed seed so that runs with the same tables and clients perform similarly across syncs | ||
| r := rand.New(rand.NewSource(99)) |
There was a problem hiding this comment.
Doesn't this mean we always randomize it the same way 🙃 ? Ideally this would be configurable but maybe we can use something from the spec so if the initial shuffle is bad users can modify it. Maybe hash on table/client IDs or something
There was a problem hiding this comment.
Doesn't this mean we always randomize it the same way
First the obligatory https://xkcd.com/221/
Maybe hash on table/client IDs or something
Yeah we can do that, so then changing the ordering of tables in your config will cause a different shuffle.
There was a problem hiding this comment.
I don't think it was really necessary, the odds of getting a bad shuffle (at either end of the DFS->Round Robin spectrum) if there are enough table-clients are astronomically small, but in 27c4e22 I added the ability to control it a little bit by changing the ordering of the tables in the config.
🤖 I have created a release *beep* *boop* --- ## [4.8.0](v4.7.1...v4.8.0) (2023-09-19) ### Features * Add Checksums to package.json format ([#1217](#1217)) ([720baae](720baae)) * Add message to package command ([#1216](#1216)) ([44956d9](44956d9)) * Add shuffle scheduler ([#1218](#1218)) ([2b1ba30](2b1ba30)) * Update package command ([#1211](#1211)) ([39fc65e](39fc65e)) ### Bug Fixes * Add schema version to package.json ([#1212](#1212)) ([393c94d](393c94d)) * **deps:** Update github.com/cloudquery/arrow/go/v14 digest to 483f6b2 ([#1209](#1209)) ([179769a](179769a)) * **deps:** Update github.com/cloudquery/arrow/go/v14 digest to ffb7089 ([#1215](#1215)) ([70f20bb](70f20bb)) * Use -dir suffix for plugin package arguments ([#1213](#1213)) ([93f9398](93f9398)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Falling somewhere between DFS and Round Robin, the
shufflescheduler uses randomization with a fixed seed to spread the load across tables and clients more evenly.