multitenant: re-enable admission control fairness tests#89721
multitenant: re-enable admission control fairness tests#89721craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
0102d86 to
5f14dd4
Compare
|
Spoke to Tommy, going to take over this PR. |
6e45bbe to
97aaa39
Compare
Previously these tests were disabled for being flakey. Re-enable them and increase tenant resource limits to prevent throughput collapse, not sure why this wasn't an issue originally. Also disable running the tests w/o admission control as that mode is flakey and no longer of interest. Also includes some commented out code to attempt to use prometheus graphana, I couldn't get it to work but its probably close. Fixes: cockroachdb#82033, cockroachdb#83994 Release note: None
6226993 to
530ef25
Compare
|
+cc @andrewbaptist, @sumeerbhola this is now ready for a look. https://www.loom.com/share/3fcbc44fd3f04650b5e5b702b6d9d21d is what the prometheus dashboard looks like. Probably a lot of the roachtest primitives for multi-tenant tests can be improved (it's a bit odd that the workload binary and the tenant SQL pods are physically colocated), but leaving things as is for now. It's easiest probably to read the file from scratch instead of the diff. |
04d4e6d to
184c154
Compare
Zipped artifacts for all variations: artifacts.tar.gz. As usual, navigate to each folder's |
sumeerbhola
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r1, 10 of 10 files at r2, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @renatolabs, and @smg260)
pkg/cmd/roachtest/tests/admission_control_multitenant_fairness.go line 40 at r2 (raw file):
// // [1]: Co-locating the SQL pod and the workload generator is a bit funky, but //
nit: remove empty line
pkg/cmd/roachtest/tests/multitenant_fairness.go line 55 at r1 (raw file):
{ name: "concurrency-skew", concurrency: func(i int) int { return i * 250 },
btw, do we see the same throughput across the different tenants for the concurrency-skew tests (both for the CPU and store overload setups)? IIRC, we had inconclusive results earlier and it was not clear to me why -- sometimes it seemed that the cause was that the minimum concurrency was insufficient to use 1/4th of the bottleneck resource.
184c154 to
af396e7
Compare
irfansharif
left a comment
There was a problem hiding this comment.
TFTR!
bors r+
Reviewed 1 of 1 files at r1.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @renatolabs, @smg260, and @sumeerbhola)
pkg/cmd/roachtest/tests/admission_control_multitenant_fairness.go line 40 at r2 (raw file):
Previously, sumeerbhola wrote…
nit: remove empty line
Done.
pkg/cmd/roachtest/tests/multitenant_fairness.go line 55 at r1 (raw file):
Previously, sumeerbhola wrote…
btw, do we see the same throughput across the different tenants for the concurrency-skew tests (both for the CPU and store overload setups)? IIRC, we had inconclusive results earlier and it was not clear to me why -- sometimes it seemed that the cause was that the minimum concurrency was insufficient to use 1/4th of the bottleneck resource.
We see the same throughput across the different tenants in read-heavy/skewed:
The second one, write-heavy/skewed is a bit more interesting. We see something like this. The periods where there is a skew in throughput are where we're not hitting the admission IO threshold (more apparent at the start), but as soon as we do, we do see that throughput is made identical. The oscillations between both regimes is interesting, and doesn't look desirable.
Also integrate prometheus, --skip-init. Release note: None
af396e7 to
51f27e6
Compare
|
Canceled. |
|
bors r+ |
|
Build succeeded: |





Previously these tests were disabled for being flakey. Re-enable them
and increase tenant resource limits to prevent throughput collapse,
not sure why this wasn't an issue originally. Also disable running the
tests w/o admission control as that mode is flakey and no longer of
interest.
Also includes some commented out code to attempt to use prometheus
graphana, I couldn't get it to work but its probably close.
Fixes: #82033, #83994
Release note: None