Skip to content

roachtest: prevent aws roachtest panic#100286

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
msbutler:butler-aws-panic
Mar 31, 2023
Merged

roachtest: prevent aws roachtest panic#100286
craig[bot] merged 1 commit intocockroachdb:masterfrom
msbutler:butler-aws-panic

Conversation

@msbutler
Copy link
Copy Markdown
Collaborator

After #99723 merged as a bandaid for #98783, the aws roachtest nightly began to panic because of a different roachtest papercut #96655. Specifically, because roachtest filters which tests run on which cloud within the evaluation of the test closure, tests meant to run on gce will still get registered in an AWS run. During the registration of the gce test
restore/tpce/400GB/gce/nodes=4/cpus=8/lowmem on aws, the aws test harness panics because the aws roachprod implementation does not have a low memory cpu configuration. This patch prevents this panic and should be reverted once the pr #99402 merges.

Epic: None

Release note: None

After cockroachdb#99723 merged as a bandaid for cockroachdb#98783, the aws roachtest nightly began to
panic because of a different roachtest papercut cockroachdb#96655. Specifically, because
roachtest filters which tests run on which cloud within the evaluation of the
test closure, tests meant to run on gce will still get registered in an AWS
run. During the registration of the gce test
`restore/tpce/400GB/gce/nodes=4/cpus=8/lowmem` _on aws_, the aws test harness
panics because the aws roachprod implementation does not have a low memory cpu
configuration. This patch prevents this panic and should be reverted once
the pr cockroachdb#99402 merges.

Epic: None

Release note: None
@msbutler msbutler added the T-testeng TestEng Team label Mar 31, 2023
@msbutler msbutler requested review from rail and srosenberg March 31, 2023 11:40
@msbutler msbutler self-assigned this Mar 31, 2023
@msbutler msbutler requested a review from a team as a code owner March 31, 2023 11:40
@msbutler msbutler requested review from smg260 and removed request for a team March 31, 2023 11:40
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@msbutler msbutler added the backport-23.1.x PAST MAINTENANCE SUPPORT: 23.1 patch releases via ER request only label Mar 31, 2023
@rail
Copy link
Copy Markdown
Member

rail commented Mar 31, 2023

Let’s rerun the nightly after this one is merged.

@msbutler
Copy link
Copy Markdown
Collaborator Author

TFTR!

bors r=rail

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 31, 2023

Build succeeded:

@craig craig bot merged commit dcac33e into cockroachdb:master Mar 31, 2023
@msbutler msbutler deleted the butler-aws-panic branch March 31, 2023 16:20
@msbutler
Copy link
Copy Markdown
Collaborator Author

s := r.MakeClusterSpec(hw.nodes, clusterOpts...)

if s.Cloud == "aws" && s.VolumeSize != 0 {
if backupCloud == spec.AWS && s.Cloud == spec.AWS && s.VolumeSize != 0 {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second conjunct is a tautology because s.Cloud is empty in this case. Incidentally, none of the existing roachtests set s.Cloud to anything other than gce. 🤷

Copy link
Copy Markdown
Collaborator Author

@msbutler msbutler Apr 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srosenberg I don't believe that's the case, at least on master and 23.1 anymore. In an ad hoc test, I added a print statement in this branch to confirm we take it. We've also confirmed in later 23.1 restore roachtest failures, that we properly removed the d in the aws machine type specification. (see here).

Also, if s.Cloud were not set to aws, I don't think any aws roachtests would work? We seem to rely on this variable to set the right machine type.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nm, you're right. I missed the fact that Cloud is passed (r.Cloud below) from the testRegistryImply, which in turns gets initialized before tests are registered,

spec.MakeClusterSpec(r.cloud, r.instanceType, nodeCount, finalOpts...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-23.1.x PAST MAINTENANCE SUPPORT: 23.1 patch releases via ER request only T-testeng TestEng Team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants