serverutils: don't use tenant mode under bench by yuzefovich · Pull Request #83254 · cockroachdb/cockroach

yuzefovich · 2022-06-23T02:42:56Z

This commit makes sure that we don't use the multi-tenant test harness
when running the benchmarks by default. This is intended as a
temporary fix until we ensure that using the multi-tenant setup
doesn't introduce any non-determinism.

Additionally, this commit disables logging the INFO message about
starting a default tenant when running the benchmarks since those
messages break the benchstat utility.

Informs: #83461.

Release note: None

cockroach-teamcity · 2022-06-23T02:43:04Z

This change is

yuzefovich · 2022-06-27T18:02:52Z

Friendly ping @cockroachdb/multi-tenant - my reasoning for this change is that it pollutes the output of the benchmarks breaking benchstat as well as possibly introduces non-determinism which we want to avoid.

rharding6373

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @yuzefovich)

pkg/testutils/serverutils/test_server_shim.go line 100 at r1 (raw file):

	}

	if rand.Float64() > probabilityOfStartingDefaultTestTenant || skip.UnderBench() {

Will we want to benchmark multi-tenant separately from non-multi-tenant in the near future? Ideally I think that benchmarks would set the env variable to force tenant or force no tenant.

yuzefovich

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @rharding6373)

pkg/testutils/serverutils/test_server_shim.go line 100 at r1 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

Will we want to benchmark multi-tenant separately from non-multi-tenant in the near future? Ideally I think that benchmarks would set the env variable to force tenant or force no tenant.

Quite possibly, I dunno. The current setup is definitely broken for benchmarks, and this is the quick fix. Apart from introducing a way to run benchmarks either via single-tenant or multi-tenant config, we'll also need to suppress the INFO messages about the multi-tenant setup being used (this is what's breaking the benchstat). We also will want to somehow preserve the ability of comparison against 22.1 branches so that 22.2 branch runs the single-tenant for fair comparison.

Thus, addressing this proper will require more thinking and more work, but in the meantime this seems like a reasonable quick fix.

knz · 2022-06-27T20:35:55Z

Thanks for this "quick fix".

That said we should not send this PR and then call it a day. What this PR is effectively saying is that Adam's recent testing PR has "broken" the benchmarks. Some of it is is a regression, but some of it is uncovering an important task we need to schedule for INI-214.

There are multiple followup actions here:

You have discovered two different failures here, both of which need to be addressed. They do not need to be addressed in this PR, but we need to file issues for them nonetheless:
- one is the cosmetic issue of the info messages, causing the benchstat utility to get confused. This can be reliably addressed in multiple ways, by tweaking the messages (e.g. adding a newline character at the beginning).
- the other is the question of non-determinism in benchmark outputs. Here the situation is that we have automation that tracks benchmark regressions over time. This automation is currently not aware of the multi-tenancy randomization. We need to enhance that.
Separately, we need to make peace with the idea that single tenancy is going away. At some point in the future, we will transition all our bench tests to run on a multi-tenant cluster, and we are expecting no performance regression. If we are seeing a performance regression today, this means that there is multi-tenancy optimization work to do that needs to be planned and done by the appropriate teams.

This last point is the most important here. I would personally be keen to fix the issues at the top, so that we closely track any performance regressions introduced by multi-tenancy and roadmap the corresponding work.

This PR is making us blind to it, and that's not good.

cc @ajstorm

yuzefovich · 2022-06-27T20:46:39Z

Thanks for the write up.

To be clear, I'm not suggesting that we merge this fix and call it a day - I'm suggesting restoring the status quo ASAP (for two reasons below) and then discuss how to fix it proper:

in order to unblock ongoing work that is using the comparison between different SHAs on microbenchmarks as a guide (e.g. I often use ./scripts/bench (which under the hood uses benchstat) to see what perf impact my change has, and this is currently broken)
in order to ease the bisect of perf regressions on microbenchmarks in the future, when 22.2 release is about to be shipped. Currently, if we were to run a bisect, again, benchstat would not work, making the bisect burdensome.

Thus, I strongly believe it is worth merging the quick fix ASAP, and then we can discuss how to address the issue proper. My guess is that we will want to run benchmarks in both modes in 22.2 timeframe so that we have comparison against 22.1 (while keeping the names of benchmarks under single-tenant config unchanged) and so that we transition to running only multi-tenant configs in 23.1 time frame.

knz · 2022-06-27T20:51:49Z

no objection to merging a quick fix ASAP. I'd just like to ensure we also file the followup work and keep it on the adequate radar in the same action.

yuzefovich · 2022-06-27T21:00:12Z

Filed #83461.

ajstorm

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @rharding6373)

pkg/testutils/serverutils/test_server_shim.go line 100 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Quite possibly, I dunno. The current setup is definitely broken for benchmarks, and this is the quick fix. Apart from introducing a way to run benchmarks either via single-tenant or multi-tenant config, we'll also need to suppress the INFO messages about the multi-tenant setup being used (this is what's breaking the benchstat). We also will want to somehow preserve the ability of comparison against 22.1 branches so that 22.2 branch runs the single-tenant for fair comparison.

Thus, address this proper will require more thinking and more work, but in the meantime this seems like a reasonable quick fix.

Could we change this so that skip.UnderBench() changes the defaultProbabilityOfStartingTestTenant to 0.0. That way, we could still override it using one of the flags. You might also want to unconditionally skip the INFO message in cases where we're running bench, so that we don't impact benchstat (along with an accompanying comment).

Note also that my review of this caught a problem with my original PR. The default probability above is incorrectly set to 1.0 (it should be 0.5). I did one last run before I merged where I wanted to ensure that I tested everything inside tenants. Unfortunately, I then merged the 1.0 probability instead of reverting it to 0.5. I've got a PR out now to correct that initial bad merge.

yuzefovich

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @rharding6373)

pkg/testutils/serverutils/test_server_shim.go line 100 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Could we change this so that skip.UnderBench() changes the defaultProbabilityOfStartingTestTenant to 0.0. That way, we could still override it using one of the flags. You might also want to unconditionally skip the INFO message in cases where we're running bench, so that we don't impact benchstat (along with an accompanying comment).

Note also that my review of this caught a problem with my original PR. The default probability above is incorrectly set to 1.0 (it should be 0.5). I did one last run before I merged where I wanted to ensure that I tested everything inside tenants. Unfortunately, I then merged the 1.0 probability instead of reverting it to 0.5. I've got a PR out now to correct that initial bad merge.

Good points, done.

rharding6373

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajstorm)

pkg/testutils/serverutils/test_server_shim.go line 100 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Good points, done.

Looks good.

ajstorm

too

Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @ajstorm and @rharding6373)

This commit makes sure that we don't use the multi-tenant test harness when running the benchmarks by default. This is intended as a temporary fix until we ensure that using the multi-tenant setup doesn't introduce any non-determinism. Additionally, this commit disables logging the INFO message about starting a default tenant when running the benchmarks since those messages break the `benchstat` utility. Release note: None

yuzefovich

TFTRs!

bors r+

Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajstorm and @rharding6373)

craig · 2022-06-28T05:13:00Z

Build succeeded:

GitHub CI (Cockroach)

yuzefovich requested review from a team and ajstorm June 23, 2022 02:42

rharding6373 requested changes Jun 27, 2022

View reviewed changes

yuzefovich commented Jun 27, 2022

View reviewed changes

yuzefovich mentioned this pull request Jun 27, 2022

bench: ensure that benchmarks "work" under multi-tenancy #83461

Open

ajstorm requested a review from rharding6373 June 27, 2022 21:05

ajstorm reviewed Jun 27, 2022

View reviewed changes

ajstorm self-requested a review June 27, 2022 21:05

yuzefovich force-pushed the tenant-bench branch 3 times, most recently from ca0963d to fc512cc Compare June 27, 2022 21:15

yuzefovich commented Jun 27, 2022

View reviewed changes

rharding6373 approved these changes Jun 27, 2022

View reviewed changes

yuzefovich mentioned this pull request Jun 27, 2022

multi-tenant: Fix MT testing probability #83462

Merged

ajstorm requested review from ajstorm and rharding6373 June 27, 2022 23:05

ajstorm approved these changes Jun 27, 2022

View reviewed changes

yuzefovich force-pushed the tenant-bench branch from fc512cc to bffe9a0 Compare June 28, 2022 02:42

yuzefovich commented Jun 28, 2022

View reviewed changes

craig bot merged commit 096ed43 into cockroachdb:master Jun 28, 2022

yuzefovich deleted the tenant-bench branch June 28, 2022 05:13

Conversation

yuzefovich commented Jun 23, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cockroach-teamcity commented Jun 23, 2022

Uh oh!

yuzefovich commented Jun 27, 2022

Uh oh!

rharding6373 left a comment

Choose a reason for hiding this comment

Uh oh!

yuzefovich left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

knz commented Jun 27, 2022

Uh oh!

yuzefovich commented Jun 27, 2022

Uh oh!

knz commented Jun 27, 2022

Uh oh!

yuzefovich commented Jun 27, 2022

Uh oh!

ajstorm left a comment

Choose a reason for hiding this comment

Uh oh!

yuzefovich left a comment

Choose a reason for hiding this comment

Uh oh!

rharding6373 left a comment

Choose a reason for hiding this comment

Uh oh!

ajstorm left a comment

Choose a reason for hiding this comment

Uh oh!

yuzefovich left a comment

Choose a reason for hiding this comment

Uh oh!

craig bot commented Jun 28, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

yuzefovich commented Jun 23, 2022 •

edited

Loading

yuzefovich left a comment •

edited

Loading