roachtest: upgrade GCE to n2 and AWS to m6id and c6id #104419
craig[bot] merged 1 commit into cockroachdb:master
erikgrinaker
left a comment
Nice. Probably goes without saying that we'll want an annotation for this, as it's going to throw our benchmarks out of whack.
tbg
left a comment
Thanks! Also worth doing a canary run of 1-2 roachtests for both GCE and AWS, disregard if you've done it already.
Absolutely! Canary runs are queued up; won't merge until we have the signal. AWS: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_Nightlies_RoachtestNightlyAwsBazel/10440661?buildTab=tests
AWS run succeeded, but GCE run had a couple of issues,
0ee6b20 to ac232a7
After further tweaks, it looks like everything passed,
In the GCE run, there were two cluster creation errors, one of which is a known issue (resolved by [1]); the other is a small quota in
Previously, roachtest used n1 in GCE, m5d and c6d in AWS. CockroachDB Cloud hardware now uses n2 in GCE, m6i in AWS [1]. This change brings roachtest hardware into parity with CockroachDB Cloud; it is based on the draft PR in [2].
[1] https://cockroachlabs.atlassian.net/wiki/spaces/MC/pages/2799501550/CockroachDB+Cloud+Hardware
[2] cockroachdb#99991
Epic: none
Release note: None
Co-authored-by: Nick Travers <travers@cockroachlabs.com>
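As a rough illustration of the family change this commit describes (n1 to n2 in GCE; m5d to m6id and c6d to c6id in AWS, per the PR title), here is a minimal Go sketch. The map `familyUpgrades` and the helper `upgradeMachineType` are hypothetical names, not roachtest code, and the sizes in `main` are only examples; the sketch also assumes every size carries over unchanged between families, which the PR does not state.

```go
package main

import (
	"fmt"
	"strings"
)

// familyUpgrades maps an old instance family to its replacement.
// The pairs come from the PR title and description; the map itself
// is a hypothetical illustration, not roachtest's actual data structure.
var familyUpgrades = map[string]string{
	"n1":  "n2",   // GCE
	"m5d": "m6id", // AWS
	"c6d": "c6id", // AWS
}

// upgradeMachineType swaps the family prefix of a machine type and keeps
// the rest (for example the size) unchanged. GCE names separate the family
// with "-", AWS names with ".". Unknown families are returned as-is.
func upgradeMachineType(machineType string) string {
	idx := strings.IndexAny(machineType, "-.")
	if idx < 0 {
		return machineType
	}
	if next, ok := familyUpgrades[machineType[:idx]]; ok {
		return next + machineType[idx:]
	}
	return machineType
}

func main() {
	for _, mt := range []string{"n1-standard-4", "m5d.xlarge", "c6d.24xlarge", "e2-medium"} {
		fmt.Printf("%s -> %s\n", mt, upgradeMachineType(mt))
	}
}
```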
ac232a7 to 5ad20cc
Quota in
Both runs succeeded, modulo the known issue with
N.B. variance between runs is much higher on master. Further study is queued up once we have more data points for this change. TL;DR: shaving ~3 hours from a GCE run is a quick win, considering we were approaching 24h for some GCE runs.
TFTR!
bors r=erikgrinaker,tbg,herkolategan
Build succeeded:
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 5ad20cc to blathers/backport-release-22.2-104419: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 22.2.x failed. See errors above.
error creating merge commit from 5ad20cc to blathers/backport-release-23.1-104419: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 23.1.x failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
Since the bump to new instance types in GCE and AWS [1], we are still experiencing occasional cluster creation issues owing to "insufficient capacity". GCE quota has already been bumped, with `asia-northeast1` being the latest, and hopefully last. The most recent cluster creation failure in AWS is owing to "insufficient capacity" of `c6id.24xlarge` in us-east-2a. As a workaround, we extend the existing zone override to place `c6id.24xlarge` into us-east-2b, which allegedly has sufficient capacity. Note, the long-term fix is to rework how cluster creation retry currently operates, by effectively trying other AZs.
[1] cockroachdb#104419
Epic: none
Fixes: cockroachdb#78601 (comment)
Release note: None
105234: roachprod: add aws AZ override for c6id.24xlarge r=renatolabs a=srosenberg
Co-authored-by: Stan Rosenberg <stan.rosenberg@gmail.com>