Skip to content

pkg/roachprod: allow multiple-stores to be created on GCP#72553

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
nicktrav:nickt.nickt-roachprod-gce-raid
Nov 9, 2021
Merged

pkg/roachprod: allow multiple-stores to be created on GCP#72553
craig[bot] merged 1 commit intocockroachdb:masterfrom
nicktrav:nickt.nickt-roachprod-gce-raid

Conversation

@nicktrav
Copy link
Copy Markdown
Collaborator

@nicktrav nicktrav commented Nov 9, 2021

Port an existing flag from the AWS roachprod flags that allows multiple
stores to be created. When this flag is enabled, multiple data
directories are created and mounted as /mnt/data{1..N}.

Standardize the existing ext4 disk creation logic in the GCE setup
script to match the AWS functionality. Interleave the existing ZFS setup
commands based on the --filesystem flag.

Fix a bug introduced in #54986 that will always create multiple data
disks, ignoring the value of the flag. This has the effect of never
creating a RAID 0 array, which is the intended default behavior.

The ability to create a RAID 0 array on GCE VMs is required for the
Pebble write-throughput benchmarks.

Release note: None

@nicktrav nicktrav requested review from a team and bananabrick November 9, 2021 02:21
@nicktrav nicktrav requested a review from a team as a code owner November 9, 2021 02:21
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

mkdir -p ${mountpoint}
chmod 777 ${mountpoint}
elif [ "${#disks[@]}" -eq "1" ] || [ -n "use_multiple_disks" ]; then
elif [ "${#disks[@]}" -eq "1" ] || [ -n "$use_multiple_disks" ]; then
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc: @nvanbenschoten @miretskiy - looks like this was picked up in the original PR.

I believe this has probably been broken since that the original commit was merged (March), and we would have been creating non-RAID 0 clusters in cases where we declared that we wanted multiple SSDs attached with a single store (i.e. just /mnt/data1). Instead we've been creating multiple data dirs in a non-RAID configuration.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha.. Nice one; goes to show just how under tested this stuff was.

@nicktrav
Copy link
Copy Markdown
Collaborator Author

nicktrav commented Nov 9, 2021

Ran through the various combinations and put the output in this gist to confirm that the correct block devices were being created.

Copy link
Copy Markdown
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I trust your gist you've passed in wrt to the correctness of the new shell script.

mkdir -p ${mountpoint}
chmod 777 ${mountpoint}
elif [ "${#disks[@]}" -eq "1" ] || [ -n "use_multiple_disks" ]; then
elif [ "${#disks[@]}" -eq "1" ] || [ -n "$use_multiple_disks" ]; then
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha.. Nice one; goes to show just how under tested this stuff was.

@nicktrav nicktrav force-pushed the nickt.nickt-roachprod-gce-raid branch from d0d5d28 to 6f6684c Compare November 9, 2021 16:17
Copy link
Copy Markdown
Contributor

@bananabrick bananabrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick, @miretskiy, @nicktrav, and @nvanbenschoten)


pkg/roachprod/vm/gce/gcloud.go, line 306 at r2 (raw file):

		"Size in GB of persistent disk volume, only used if local-ssd=false")
	flags.BoolVar(&o.UseMultipleDisks, ProviderName+"-enable-multiple-stores",
		o.UseMultipleDisks, "Enable the use of multiple stores by creating one store directory per disk. "+

Should this have some default value? Looks like it will be false?


pkg/roachprod/vm/gce/gcloud.go, line 308 at r2 (raw file):

		o.UseMultipleDisks, "Enable the use of multiple stores by creating one store directory per disk. "+
			"Default is to raid0 stripe all disks. "+
			"See repeating --"+ProviderName+"-ebs-volume for adding extra volumes.")

nit: Are gce persistent disks also called ebs volumes? The "See repeating --"+ProviderName+"-ebs-volume for adding extra volumes." sentence is also a bit unclear to me.

Port an existing flag from the AWS roachprod flags that allows multiple
stores to be created. When this flag is enabled, multiple data
directories are created and mounted as `/mnt/data{1..N}`.

Standardize the existing ext4 disk creation logic in the GCE setup
script to match the AWS functionality. Interleave the existing ZFS setup
commands based on the `--filesystem` flag.

Fix a bug introduced in cockroachdb#54986 that will always create multiple data
disks, ignoring the value of the flag. This has the effect of never
creating a RAID 0 array, which is the intended default behavior.

The ability to create a RAID 0 array on GCE VMs is required for the
Pebble write-throughput benchmarks.

Release note: None
@nicktrav nicktrav force-pushed the nickt.nickt-roachprod-gce-raid branch from 6f6684c to 039277a Compare November 9, 2021 18:43
Copy link
Copy Markdown
Collaborator Author

@nicktrav nicktrav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @miretskiy)


pkg/roachprod/vm/gce/gcloud.go, line 306 at r2 (raw file):

Previously, bananabrick (Arjun Nair) wrote…

Should this have some default value? Looks like it will be false?

I believe the original default was intended to be false. Same as for AWS.


pkg/roachprod/vm/gce/gcloud.go, line 308 at r2 (raw file):

Previously, bananabrick (Arjun Nair) wrote…

nit: Are gce persistent disks also called ebs volumes? The "See repeating --"+ProviderName+"-ebs-volume for adding extra volumes." sentence is also a bit unclear to me.

Copypasta. Removed.

@nicktrav
Copy link
Copy Markdown
Collaborator Author

nicktrav commented Nov 9, 2021

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 9, 2021

Build succeeded:

@craig craig bot merged commit 8b3d6fc into cockroachdb:master Nov 9, 2021
@nicktrav nicktrav deleted the nickt.nickt-roachprod-gce-raid branch November 9, 2021 20:46
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Nov 16, 2021
Fixes bugs introduced by cockroachdb#72553, which ported support for multiple
stores from AWS to GCP.  This change fixes the bugs that caused
disks mounted to not be written to `/etc/fstab` on GCP instances,
causing mounts to go missing on instance restart.  Additionally, this
change modifies the defaults on GCP and AWS to prefer creating multiple
stores rather than creating a RAID 0 array, thus preserving the existing
behavior used in roachtests such as `kv/multi-store-with-overload`.

Fixes cockroachdb#72635

Release note: None
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Nov 16, 2021
Fixes bugs introduced by cockroachdb#72553, which ported support for multiple
stores from AWS to GCP.  This change fixes the bugs that caused
disks mounted to not be written to `/etc/fstab` on GCP instances,
causing mounts to go missing on instance restart.  Additionally, this
change modifies roachtest to default to creating multiple stores on GCP
and AWS in roachprod invocations rather than mounting a RAID 0 array,
thus preserving the existing behavior used in roachtests such as
`kv/multi-store-with-overload`.

Fixes cockroachdb#72635

Release note: None
AlexTalks added a commit to AlexTalks/cockroach that referenced this pull request Nov 16, 2021
Fixes bugs introduced by cockroachdb#72553, which ported support for multiple
stores from AWS to GCP.  This change fixes the bugs that caused
disks mounted to not be written to `/etc/fstab` on GCP instances,
causing mounts to go missing on instance restart.  Additionally, this
change modifies roachtest to default to creating multiple stores on GCP
in roachprod invocations rather than mounting a RAID 0 array, thus
preserving the existing behavior used in roachtests such as
`kv/multi-store-with-overload`.

Fixes cockroachdb#72635

Release note: None
craig bot pushed a commit that referenced this pull request Nov 17, 2021
71813: sql: add delete preserving index encoding, use for primary and second… r=rhu713 a=rhu713

…ary index

As part of the transition for Bulk Ops to use only MVCC operations, the index
backfiller must be rewritten. Part of the rewrite is to introduce a temporary
index at the beginning of the backfill encoded with a new delete-preserving
index encoding. This patch adds this encoding and uses the encoding for writes
to indices configured with this encoding.

Release note: None

72803: roachprod,roachtest: fix mounting on GCP and disable RAID0 in roachtests r=AlexTalks a=AlexTalks

Fixes bugs introduced by #72553, which ported support for multiple
stores from AWS to GCP.  This change fixes the bugs that caused
disks mounted to not be written to `/etc/fstab` on GCP instances,
causing mounts to go missing on instance restart.  Additionally, this
change modifies roachtest to default to creating multiple stores on GCP
in roachprod invocations rather than mounting a RAID 0 array, thus
preserving the existing behavior used in roachtests such as
`kv/multi-store-with-overload`.

Fixes #72635

Release note: None

72817: sql/catalog/resolver: remove an allocation, add a benchmark r=ajwerner a=ajwerner

We didn't need to allocate this name except on error paths. Make that lazy.

The added benchmark also shows that we do horrifically badly when using
user-defined schemas and `search_path`. Yay for benchmarks.

```
name                                                                    old time/op    new time/op    delta
ResolveExistingObject/CREATE_TABLE_foo_()foo-16                           2.18µs ± 5%    2.03µs ± 2%    -6.96%  (p=0.008 n=5+5)
ResolveExistingObject/CREATE_SCHEMA_sc;CREATE_TABLE_sc.foo_()sc.foo-16    1.59µs ± 4%    1.53µs ± 4%      ~     (p=0.063 n=5+5)
ResolveExistingObject/CREATE_SCHEMA_sc;CREATE_TABLE_sc.foo_()foo-16       60.5µs ± 4%    60.1µs ± 6%      ~     (p=0.690 n=5+5)

name                                                                    old alloc/op   new alloc/op   delta
ResolveExistingObject/CREATE_TABLE_foo_()foo-16                            72.4B ± 1%      7.6B ±18%   -89.50%  (p=0.008 n=5+5)
ResolveExistingObject/CREATE_SCHEMA_sc;CREATE_TABLE_sc.foo_()sc.foo-16     69.4B ± 1%      5.0B ± 0%   -92.80%  (p=0.008 n=5+5)
ResolveExistingObject/CREATE_SCHEMA_sc;CREATE_TABLE_sc.foo_()foo-16       11.2kB ± 0%    11.1kB ± 0%    -0.56%  (p=0.008 n=5+5)

name                                                                    old allocs/op  new allocs/op  delta
ResolveExistingObject/CREATE_TABLE_foo_()foo-16                             1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
ResolveExistingObject/CREATE_SCHEMA_sc;CREATE_TABLE_sc.foo_()sc.foo-16      1.00 ± 0%      0.00       -100.00%  (p=0.008 n=5+5)
ResolveExistingObject/CREATE_SCHEMA_sc;CREATE_TABLE_sc.foo_()foo-16         75.0 ± 0%      74.0 ± 0%    -1.33%  (p=0.008 n=5+5)
```

Release note: None

Co-authored-by: Rui Hu <rui@cockroachlabs.com>
Co-authored-by: Alex Sarkesian <sarkesian@cockroachlabs.com>
Co-authored-by: Andrew Werner <awerner32@gmail.com>
blathers-crl bot pushed a commit that referenced this pull request Nov 17, 2021
Fixes bugs introduced by #72553, which ported support for multiple
stores from AWS to GCP.  This change fixes the bugs that caused
disks mounted to not be written to `/etc/fstab` on GCP instances,
causing mounts to go missing on instance restart.  Additionally, this
change modifies roachtest to default to creating multiple stores on GCP
in roachprod invocations rather than mounting a RAID 0 array, thus
preserving the existing behavior used in roachtests such as
`kv/multi-store-with-overload`.

Fixes #72635

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants