Skip to content

serverutils: don't use tenant mode under bench#83254

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:tenant-bench
Jun 28, 2022
Merged

serverutils: don't use tenant mode under bench#83254
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:tenant-bench

Conversation

@yuzefovich
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich commented Jun 23, 2022

This commit makes sure that we don't use the multi-tenant test harness
when running the benchmarks by default. This is intended as a
temporary fix until we ensure that using the multi-tenant setup
doesn't introduce any non-determinism.

Additionally, this commit disables logging the INFO message about
starting a default tenant when running the benchmarks since those
messages break the benchstat utility.

Informs: #83461.

Release note: None

@yuzefovich yuzefovich requested review from a team and ajstorm June 23, 2022 02:42
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@yuzefovich
Copy link
Copy Markdown
Member Author

Friendly ping @cockroachdb/multi-tenant - my reasoning for this change is that it pollutes the output of the benchmarks breaking benchstat as well as possibly introduces non-determinism which we want to avoid.

Copy link
Copy Markdown
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @yuzefovich)


pkg/testutils/serverutils/test_server_shim.go line 100 at r1 (raw file):

	}

	if rand.Float64() > probabilityOfStartingDefaultTestTenant || skip.UnderBench() {

Will we want to benchmark multi-tenant separately from non-multi-tenant in the near future? Ideally I think that benchmarks would set the env variable to force tenant or force no tenant.

Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @rharding6373)


pkg/testutils/serverutils/test_server_shim.go line 100 at r1 (raw file):

Previously, rharding6373 (Rachael Harding) wrote…

Will we want to benchmark multi-tenant separately from non-multi-tenant in the near future? Ideally I think that benchmarks would set the env variable to force tenant or force no tenant.

Quite possibly, I dunno. The current setup is definitely broken for benchmarks, and this is the quick fix. Apart from introducing a way to run benchmarks either via single-tenant or multi-tenant config, we'll also need to suppress the INFO messages about the multi-tenant setup being used (this is what's breaking the benchstat). We also will want to somehow preserve the ability of comparison against 22.1 branches so that 22.2 branch runs the single-tenant for fair comparison.

Thus, addressing this proper will require more thinking and more work, but in the meantime this seems like a reasonable quick fix.

@knz
Copy link
Copy Markdown
Contributor

knz commented Jun 27, 2022

Thanks for this "quick fix".

That said we should not send this PR and then call it a day. What this PR is effectively saying is that Adam's recent testing PR has "broken" the benchmarks. Some of it is is a regression, but some of it is uncovering an important task we need to schedule for INI-214.

There are multiple followup actions here:

  • You have discovered two different failures here, both of which need to be addressed. They do not need to be addressed in this PR, but we need to file issues for them nonetheless:

    • one is the cosmetic issue of the info messages, causing the benchstat utility to get confused. This can be reliably addressed in multiple ways, by tweaking the messages (e.g. adding a newline character at the beginning).
    • the other is the question of non-determinism in benchmark outputs. Here the situation is that we have automation that tracks benchmark regressions over time. This automation is currently not aware of the multi-tenancy randomization. We need to enhance that.
  • Separately, we need to make peace with the idea that single tenancy is going away. At some point in the future, we will transition all our bench tests to run on a multi-tenant cluster, and we are expecting no performance regression. If we are seeing a performance regression today, this means that there is multi-tenancy optimization work to do that needs to be planned and done by the appropriate teams.

This last point is the most important here. I would personally be keen to fix the issues at the top, so that we closely track any performance regressions introduced by multi-tenancy and roadmap the corresponding work.

This PR is making us blind to it, and that's not good.

cc @ajstorm

@yuzefovich
Copy link
Copy Markdown
Member Author

Thanks for the write up.

To be clear, I'm not suggesting that we merge this fix and call it a day - I'm suggesting restoring the status quo ASAP (for two reasons below) and then discuss how to fix it proper:

  1. in order to unblock ongoing work that is using the comparison between different SHAs on microbenchmarks as a guide (e.g. I often use ./scripts/bench (which under the hood uses benchstat) to see what perf impact my change has, and this is currently broken)
  2. in order to ease the bisect of perf regressions on microbenchmarks in the future, when 22.2 release is about to be shipped. Currently, if we were to run a bisect, again, benchstat would not work, making the bisect burdensome.

Thus, I strongly believe it is worth merging the quick fix ASAP, and then we can discuss how to address the issue proper. My guess is that we will want to run benchmarks in both modes in 22.2 timeframe so that we have comparison against 22.1 (while keeping the names of benchmarks under single-tenant config unchanged) and so that we transition to running only multi-tenant configs in 23.1 time frame.

@knz
Copy link
Copy Markdown
Contributor

knz commented Jun 27, 2022

no objection to merging a quick fix ASAP. I'd just like to ensure we also file the followup work and keep it on the adequate radar in the same action.

@yuzefovich
Copy link
Copy Markdown
Member Author

Filed #83461.

@ajstorm ajstorm requested a review from rharding6373 June 27, 2022 21:05
Copy link
Copy Markdown
Collaborator

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @rharding6373)


pkg/testutils/serverutils/test_server_shim.go line 100 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Quite possibly, I dunno. The current setup is definitely broken for benchmarks, and this is the quick fix. Apart from introducing a way to run benchmarks either via single-tenant or multi-tenant config, we'll also need to suppress the INFO messages about the multi-tenant setup being used (this is what's breaking the benchstat). We also will want to somehow preserve the ability of comparison against 22.1 branches so that 22.2 branch runs the single-tenant for fair comparison.

Thus, address this proper will require more thinking and more work, but in the meantime this seems like a reasonable quick fix.

Could we change this so that skip.UnderBench() changes the defaultProbabilityOfStartingTestTenant to 0.0. That way, we could still override it using one of the flags. You might also want to unconditionally skip the INFO message in cases where we're running bench, so that we don't impact benchstat (along with an accompanying comment).

Note also that my review of this caught a problem with my original PR. The default probability above is incorrectly set to 1.0 (it should be 0.5). I did one last run before I merged where I wanted to ensure that I tested everything inside tenants. Unfortunately, I then merged the 1.0 probability instead of reverting it to 0.5. I've got a PR out now to correct that initial bad merge.

@ajstorm ajstorm self-requested a review June 27, 2022 21:05
@yuzefovich yuzefovich force-pushed the tenant-bench branch 3 times, most recently from ca0963d to fc512cc Compare June 27, 2022 21:15
Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @rharding6373)


pkg/testutils/serverutils/test_server_shim.go line 100 at r1 (raw file):

Previously, ajstorm (Adam Storm) wrote…

Could we change this so that skip.UnderBench() changes the defaultProbabilityOfStartingTestTenant to 0.0. That way, we could still override it using one of the flags. You might also want to unconditionally skip the INFO message in cases where we're running bench, so that we don't impact benchstat (along with an accompanying comment).

Note also that my review of this caught a problem with my original PR. The default probability above is incorrectly set to 1.0 (it should be 0.5). I did one last run before I merged where I wanted to ensure that I tested everything inside tenants. Unfortunately, I then merged the 1.0 probability instead of reverting it to 0.5. I've got a PR out now to correct that initial bad merge.

Good points, done.

Copy link
Copy Markdown
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajstorm)


pkg/testutils/serverutils/test_server_shim.go line 100 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Good points, done.

Looks good.

Copy link
Copy Markdown
Collaborator

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: too

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @ajstorm and @rharding6373)

This commit makes sure that we don't use the multi-tenant test harness
when running the benchmarks by default. This is intended as a
temporary fix until we ensure that using the multi-tenant setup
doesn't introduce any non-determinism.

Additionally, this commit disables logging the INFO message about
starting a default tenant when running the benchmarks since those
messages break the `benchstat` utility.

Release note: None
Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @ajstorm and @rharding6373)

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 28, 2022

Build succeeded:

@craig craig bot merged commit 096ed43 into cockroachdb:master Jun 28, 2022
@yuzefovich yuzefovich deleted the tenant-bench branch June 28, 2022 05:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants