roachtest: revert harmonize GCE and AWS machine types#111633

Merged
craig[bot] merged 1 commit into cockroachdb:master from
srosenberg:sr/revert_111140
Oct 3, 2023

@srosenberg
Member

Revert the change to machine types in [1] until
after 23.2 branch is cut.

[1] #111140

Epic: none

Release note: None

@srosenberg srosenberg requested a review from RaduBerinde October 3, 2023 03:29
@srosenberg srosenberg marked this pull request as ready for review October 3, 2023 03:30
@srosenberg srosenberg requested a review from a team as a code owner October 3, 2023 03:30
@srosenberg srosenberg requested review from renatolabs and smg260 and removed request for a team October 3, 2023 03:30
@srosenberg
Member Author

SELECT_PROBABILITY=.25

@RaduBerinde
Member

Is it possible to keep most of the code but tweak it to resolve to the same machine types as before? Otherwise, we'll have this ongoing, non-trivial code difference between master and release-* branches. The differences even conflict with an outstanding PR (#111324).

@RaduBerinde
Member

RaduBerinde commented Oct 3, 2023

Ah, sorry, saw the title and assumed it was a revert of the commit. I looked at the diff and it LGTM!

@srosenberg
Member Author

TFTR!

The only interesting roachtest failure is already tracked in [1]. So, we're good to merge; will follow up after 23.2 is cut, and all perf. regressions are resolved.

[1] #111539

@srosenberg
Member Author

bors r=RaduBerinde,erikgrinaker

@craig
Contributor

craig bot commented Oct 3, 2023

Build succeeded:

@craig craig bot merged commit a227d78 into cockroachdb:master Oct 3, 2023
RaduBerinde pushed a commit to RaduBerinde/cockroach that referenced this pull request Nov 2, 2023
…l revert

This is a backport of the "merged" diff of the following PRs:
  roachtest: harmonize GCE and AWS machine types cockroachdb#111140
  roachtest: revert harmonize GCE and AWS machine types cockroachdb#111633

Release justification: test-only code, keeping roachtest in sync.
Epic: none
Release note: None
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jan 18, 2024
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jan 27, 2024
Previously, the same (performance) roachtest executed in GCE and AWS
may have used a different memory (per CPU) multiplier and/or
CPU family, e.g., Cascade Lake vs. Ice Lake. In the best case,
this resulted in different performance baselines on an otherwise
equivalent machine type. In the worst case, this resulted in OOMs
due to VMs in AWS having half as much memory per CPU.

This change harmonizes GCE and AWS machine types by making them
as isomorphic as possible with respect to memory, CPU family, and price.
The following heuristics are used depending on specified MemPerCPU:
Standard yields 4GB/cpu, High yields 8GB/cpu,
Auto yields 4GB/cpu up to and including 16 vCPUs, then 2GB/cpu.
Low is supported only in GCE.
Consequently, n2-standard maps to m6i, n2-highmem maps to r6i,
n2-custom maps to c6i, modulo local SSDs in which case m6id is
used, etc. Note, we also force --gce-min-cpu-platform to Ice Lake;
isomorphic AWS machine types are exclusively on Ice Lake.
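
The memory heuristics above can be sketched in Go as follows. This is an illustrative sketch under stated assumptions, not the roachtest implementation: the type `MemPerCPU`, its constants, and the helpers `memGBPerCPU` and `families` are hypothetical names chosen for this example. Low gets no ratio because the text states none, local-SSD variants such as m6id are left out, and the pairing of each ratio with a family relies on the published 4, 8, and 2 GB-per-vCPU shapes of m6i, r6i, and c6i.

```go
package main

import "fmt"

// MemPerCPU mirrors the memory settings named in the commit message.
// The type and constant names are hypothetical, chosen for this sketch.
type MemPerCPU int

const (
	Auto MemPerCPU = iota
	Standard
	High
	Low
)

// memGBPerCPU returns the memory-per-vCPU ratio for a setting:
// Standard is 4GB, High is 8GB, and Auto is 4GB up to and including
// 16 vCPUs, then 2GB. Low returns an error because the text gives no
// ratio for it (it is supported only in GCE).
func memGBPerCPU(mem MemPerCPU, cpus int) (int, error) {
	switch mem {
	case Standard:
		return 4, nil
	case High:
		return 8, nil
	case Auto:
		if cpus <= 16 {
			return 4, nil
		}
		return 2, nil
	default:
		return 0, fmt.Errorf("no ratio stated for MemPerCPU %d", mem)
	}
}

// families pairs the GCE and AWS machine families for a given ratio.
// Local-SSD variants (e.g., m6id) are intentionally not modeled.
func families(gbPerCPU int) (gce, aws string, ok bool) {
	switch gbPerCPU {
	case 4:
		return "n2-standard", "m6i", true
	case 8:
		return "n2-highmem", "r6i", true
	case 2:
		return "n2-custom", "c6i", true
	}
	return "", "", false
}
```

For example, `memGBPerCPU(Auto, 32)` yields 2, and `families(2)` yields the n2-custom and c6i pair.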

Roachprod is extended to show cpu family and architecture on List.
Cost estimation now correctly deals with custom machine types.

Note, this PR essentially resurrects [1], after it was reverted
in [2]. Since [1], `SelectAzureMachineType` has been added.
MemPerCPU is preserved across all three cloud providers.
However, when mem is Auto (default) and cpus > 80, we switch
to AMD Milan, both in GCE and AWS, but not Azure. (The latter
doesn't support 2GB per AMD CPU.)
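
The AMD Milan exception can be expressed as a small predicate. The sketch below is only an illustration of the rule as stated in this message: the `Cloud` type, its constants, and `usesAMDMilan` are hypothetical names, and the code says nothing about which CPU family Azure uses instead.

```go
package main

// Cloud identifies a provider. The type and constants are hypothetical.
type Cloud string

const (
	GCE   Cloud = "gce"
	AWS   Cloud = "aws"
	Azure Cloud = "azure"
)

// usesAMDMilan reports whether the stated rule switches to AMD Milan:
// the memory setting is Auto, the machine has more than 80 vCPUs, and
// the provider is GCE or AWS. Azure is excluded.
func usesAMDMilan(cloud Cloud, memIsAuto bool, cpus int) bool {
	if !memIsAuto || cpus <= 80 {
		return false
	}
	return cloud == GCE || cloud == AWS
}
```

Under this sketch, a 96-vCPU Auto request switches on GCE and AWS but not on Azure, and an 80-vCPU request does not switch anywhere because the threshold is strict.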

For complete lists of machine types see `ExampleXXXMachineType`.

[1] cockroachdb#111140
[2] cockroachdb#111633

Epic: none
Fixes: cockroachdb#106570

Release note: None
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Feb 8, 2024
craig bot pushed a commit that referenced this pull request Feb 9, 2024
117852: roachtest: harmonize GCE, AWS, Azure machine types r=renatolabs a=srosenberg


Co-authored-by: Stan Rosenberg <stan.rosenberg@gmail.com>
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Feb 13, 2024
cockroach-dev-inf pushed a commit that referenced this pull request Feb 14, 2024
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Feb 16, 2024