roachtest: enable schema changes in acceptance/version-upgrade#98855
roachtest: enable schema changes in acceptance/version-upgrade#98855craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
renatolabs
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @fqazi and @herkolategan)
pkg/cmd/roachtest/roachtestutil/mixedversion/runner.go line 480 at r1 (raw file):
// Run performs `cluster.RunE` on a randomly picked node. func (h *Helper) Run(rng *rand.Rand, cmd ...string) error {
I'm not convinced this function needs to be part of *Helper (I'd like to only add functions to this struct once there are many call sites that would benefit from it). I know we have Query related functions here already, but one distinction is that when we need to run SQL in a test, we will always connect to one gateway node; for running commands, it's very common that commands will need to run on multiple nodes at the same time, and it's not great to have behavior/logging be different if we are running a command in one vs many nodes.
In the mixed-version closure, you can achieve the same behavior with
node := h.RandomNode(rng, c.All())
c.RunE(...)pkg/cmd/roachtest/roachtestutil/mixedversion/runner.go line 487 at r1 (raw file):
// InitLegacyWorkload initializes a legacy workload on all nodes. func (h *Helper) InitLegacyWorkload(workloadPath string, workload string) {
I also believe this function is doing something too specific to warrant a function in *Helper. In addition, it sounds like a function that the caller would always need to make sure to call just once. This is not quite the type of function that should live here. I'll make a comment down below.
pkg/cmd/roachtest/roachtestutil/mixedversion/runner.go line 490 at r1 (raw file):
// Stage workload on all nodes as the load node to run workload is chosen // randomly. h.runner.cluster.Put(h.ctx, workloadPath, "./workload", h.runner.crdbNodes)
What's the reason we need to use ./workload instead of ./cockroach workload?
pkg/cmd/roachtest/tests/versionupgrade.go line 134 at r1 (raw file):
func(ctx context.Context, l *logger.Logger, rng *rand.Rand, h *mixedversion.Helper) error { initWorkload.Do(func() { h.InitLegacyWorkload(t.DeprecatedWorkload(), "schemachange")
Instead of using the sync.Once, I think it makes more sense to call mvt.OnStartup and init the workload there.
pkg/cmd/roachtest/tests/versionupgrade.go line 142 at r1 (raw file):
fmt.Sprintf("--concurrency %d", 2), fmt.Sprintf("{pgurl:1-%d}", len(c.All())), }
Nit: consider using commandbuilder, we've been trying to use more structured data types for commands in roachtest.
pkg/cmd/roachtest/tests/versionupgrade.go line 143 at r1 (raw file):
fmt.Sprintf("{pgurl:1-%d}", len(c.All())), } return h.Run(rng, runCmd...)
Please use cluster.RunE. We want to return errors instead of calling t.Fatal in these callbacks.
fqazi
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @herkolategan and @renatolabs)
pkg/cmd/roachtest/roachtestutil/mixedversion/runner.go line 480 at r1 (raw file):
Previously, renatolabs (Renato Costa) wrote…
I'm not convinced this function needs to be part of
*Helper(I'd like to only add functions to this struct once there are many call sites that would benefit from it). I know we haveQueryrelated functions here already, but one distinction is that when we need to run SQL in a test, we will always connect to one gateway node; for running commands, it's very common that commands will need to run on multiple nodes at the same time, and it's not great to have behavior/logging be different if we are running a command in one vs many nodes.In the mixed-version closure, you can achieve the same behavior with
node := h.RandomNode(rng, c.All()) c.RunE(...)
I was debating if it should be in the helper at all. I'll move this code out.
pkg/cmd/roachtest/roachtestutil/mixedversion/runner.go line 490 at r1 (raw file):
Previously, renatolabs (Renato Costa) wrote…
What's the reason we need to use
./workloadinstead of./cockroach workload?
The schemachange workload is not supported under the cockroach binary, that's something that we should get done. I'll get an issue open for it.
pkg/cmd/roachtest/tests/versionupgrade.go line 134 at r1 (raw file):
Previously, renatolabs (Renato Costa) wrote…
Instead of using the
sync.Once, I think it makes more sense to callmvt.OnStartupand init the workload there.
Good point I'll switch this.
pkg/cmd/roachtest/tests/versionupgrade.go line 143 at r1 (raw file):
Previously, renatolabs (Renato Costa) wrote…
Please use
cluster.RunE. We want to return errors instead of callingt.Fatalin these callbacks.
The run wrapper was doing this, but I"ll remove it and put the calls here directly on cluster.
4f8987d to
2f65e5d
Compare
fqazi
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @herkolategan and @renatolabs)
pkg/cmd/roachtest/roachtestutil/mixedversion/runner.go line 487 at r1 (raw file):
Previously, renatolabs (Renato Costa) wrote…
I also believe this function is doing something too specific to warrant a function in
*Helper. In addition, it sounds like a function that the caller would always need to make sure to call just once. This is not quite the type of function that should live here. I'll make a comment down below.
Done.
renatolabs
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @fqazi and @herkolategan)
pkg/cmd/roachtest/tests/versionupgrade.go line 112 at r2 (raw file):
if err := c.PutE(ctx, t.L(), t.DeprecatedWorkload(), "./workload", c.All()); err != nil { return err }
Probably cleaner to stash ./workload "out-of-band" -- i.e., you can just call c.Put() (no need for PutE) before the mvt := mixedversion.NewTest line. If that fails, we don't waste time setting up the mixed-version test, and it also simplifies this closure too.
pkg/cmd/roachtest/tests/versionupgrade.go line 113 at r2 (raw file):
return err } return c.RunE(ctx, c.All(), "./workload init", "schemachange")
Nit: since these are all consts, probably reads better as a single string: c.RunE(ctx, c.All(), "./workload init schemachange")
pkg/cmd/roachtest/tests/versionupgrade.go line 145 at r2 (raw file):
runCmd.Flag("max-ops", 10) runCmd.Flag("concurrency", 2) runCmd.Arg("{pgurl:1-%d}", len(c.All()))
Nit: these can be chained
runCmd := roachtestutil.NewCommand("...")
.Flag("verbose", 1)
.Flag("max-ops", 10)pkg/cmd/roachtest/tests/versionupgrade.go line 146 at r2 (raw file):
runCmd.Flag("concurrency", 2) runCmd.Arg("{pgurl:1-%d}", len(c.All())) randomNode := c.All().RandNode()
We should use h.RandomNode() here (i.e., using the rng provided as parameter instead of the global rand).
2f65e5d to
ece58b7
Compare
fqazi
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @herkolategan and @renatolabs)
pkg/cmd/roachtest/tests/versionupgrade.go line 112 at r2 (raw file):
Previously, renatolabs (Renato Costa) wrote…
Probably cleaner to stash
./workload"out-of-band" -- i.e., you can just callc.Put()(no need forPutE) before themvt := mixedversion.NewTestline. If that fails, we don't waste time setting up the mixed-version test, and it also simplifies this closure too.
Done.
pkg/cmd/roachtest/tests/versionupgrade.go line 146 at r2 (raw file):
Previously, renatolabs (Renato Costa) wrote…
We should use
h.RandomNode()here (i.e., using therngprovided as parameter instead of the globalrand).
Done.
Previously, due to flakes we disabled schema changes inside the version update test. This patch re-enables them, since we are confident that the workload itslef is now stable in a mixed version state. Fixes: cockroachdb#58489 Release note: None
ece58b7 to
a4aa8ef
Compare
renatolabs
left a comment
There was a problem hiding this comment.
Thank you for adding the schemachange workload back!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @herkolategan)
|
@renatolabs @chengxiong-ruan TFTR! |
|
This will need some tweaks on the workload side, we don't handle a specific version case for CREATE's properly. It should be minor will try and validate it a few times on Monday (otherwise there is a risk of intermittent failures). |
Previously, when the randomized declarative schema changer was run during version upgrade we the descriptor ID generator can be temporarily unavailable between 22.2 and 23.1 This patch, allows this error temporarily for this workload. Epic: none Release note: None
|
bors r+ |
|
Build failed (retrying...): |
|
Build succeeded: |
Previously, due to flakes we disabled schema changes inside the version update test. This patch re-enables them, since we are confident that the workload itslef is now stable in a mixed version state.
Fixes: #58489
Release note: None