
*: configure the use of separated intents in a cluster #59829

Merged
craig[bot] merged 1 commit into cockroachdb:master from sumeerbhola:config_sep
Feb 12, 2021

Conversation

@sumeerbhola
Collaborator

@sumeerbhola sumeerbhola commented Feb 4, 2021

This is done using a combination of the cluster version,
which prevents separated intents while any node in the
cluster is running below the SeparatedIntents version,
and the storage.transaction.separated_intents.enabled
cluster setting, which is on by default.
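The two-level gate described above can be sketched as follows. This is an editorial simplification with illustrative names, not the actual CockroachDB declarations: the real code checks `clusterversion.SeparatedIntents` via `Settings.Version.IsActive` and reads the cluster setting.

```go
package main

import "fmt"

// Stand-in for clusterversion.SeparatedIntents (illustrative value).
const separatedIntentsVersion = 211

// settings is a simplified stand-in for cluster.Settings.
type settings struct {
	activeVersion           int  // minimum version every node is known to run
	separatedIntentsEnabled bool // storage.transaction.separated_intents.enabled
}

// writeSeparatedIntents shows the ordering of the two gates: the version
// check comes first, so while any node may still be on 20.2 the setting
// alone cannot enable the feature.
func writeSeparatedIntents(s settings) bool {
	return s.activeVersion >= separatedIntentsVersion && s.separatedIntentsEnabled
}

func main() {
	fmt.Println(writeSeparatedIntents(settings{211, true}))  // true: all 21.1, setting on
	fmt.Println(writeSeparatedIntents(settings{202, true}))  // false: mixed-version cluster
	fmt.Println(writeSeparatedIntents(settings{211, false})) // false: setting turned off
}
```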

This replaces the temporary consts previously used
for configuration.

The safety of the transition from 20.2 to 21.1 is
discussed in:

  • comments in intent_reader_writer.go (where the cluster
    setting is declared).
  • the new Writer.SafeToWriteSeparatedIntents which
    gates the use of MVCCMetadata.TxnDidNotUpdateMeta
    so that replicas do not diverge across new and
    old nodes that do not understand this field.
  • comments in store_snapshot.go regarding why snapshots
    exchanged between 20.2 and 21.1 nodes are fine
    even though the latter will iterate through the
    lock table key space for outgoing snapshots and
    construct ssts for deleting the lock table key
    space for incoming snapshots.
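The divergence hazard behind the second bullet can be illustrated with a hypothetical sketch (simplified stand-in types, not the real CockroachDB code): replicas apply the same command on every node, so a 21.1 node must not set MVCCMetadata.TxnDidNotUpdateMeta while a 20.2 replica, which does not understand the field, would persist different bytes for the same command.

```go
package main

import "fmt"

// mvccMeta is a simplified stand-in for MVCCMetadata.
type mvccMeta struct {
	txnDidNotUpdateMeta bool
}

// writeMeta sketches the gate: the new field is populated only when a
// check standing in for Writer.SafeToWriteSeparatedIntents reports that
// every node in the cluster understands it.
func writeMeta(safeToWriteSeparated bool, txnFirstWrite bool) mvccMeta {
	var m mvccMeta
	if safeToWriteSeparated && txnFirstWrite {
		m.txnDidNotUpdateMeta = true
	}
	return m
}

func main() {
	fmt.Println(writeMeta(true, true).txnDidNotUpdateMeta)  // true: all nodes are 21.1
	fmt.Println(writeMeta(false, true).txnDidNotUpdateMeta) // false: a 20.2 node may apply this
}
```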

The only non-test code that cares about this configuration
lives in the storage package and accesses the config
via the optional cluster.Settings provided when
constructing a Pebble Engine.

Affected tests fall into three categories:

  • tests that had different datadriven test files for
    different configuration settings. These now iterate
    through all the configurations. This is a small
    subset of storage tests.
  • tests that were adjusting their expected values
    based on the configuration settings. These now
    randomize the configuration settings.
  • tests that were ignorant of the configuration
    settings. These also randomize the settings, which
    is done at different levels of abstraction depending
    on the kind of test.
    • When constructing the Engine in the test. For tests
      that directly construct an Engine or indirectly
      via LocalTestCluster or kvserver.testContext.
      This allows for manipulating both the version
      and the cluster setting.
    • When constructing cluster.Settings for
      server.TestServerFactory. This only manipulates
      the cluster setting.

Informs #41720

Release note (ops change): adds the
storage.transaction.separated_intents.enabled cluster
setting, which enables separated intents by default.

@sumeerbhola sumeerbhola requested a review from a team February 4, 2021 23:35
@sumeerbhola sumeerbhola requested a review from a team as a code owner February 4, 2021 23:35
@sumeerbhola sumeerbhola requested review from pbardea and removed request for a team February 4, 2021 23:35
@cockroach-teamcity
Member

This change is Reviewable

@sumeerbhola sumeerbhola removed request for a team and pbardea February 4, 2021 23:35
@sumeerbhola sumeerbhola force-pushed the config_sep branch 2 times, most recently from c72ed24 to e3e08d7 Compare February 5, 2021 15:21
Contributor

@nvb nvb left a comment


:lgtm_strong:

Reviewed 57 of 57 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal, @petermattis, and @sumeerbhola)


pkg/kv/kvserver/store_snapshot.go, line 98 at r1 (raw file):

// Separated locks and snapshots send/receive:
// When running in a mixed version cluster with 21.1 and 20.2, snapshots sent
// by 20.1 nodes will attempt to read the lock table key space and send any

s/20.1/21.1/


pkg/kv/kvserver/rditer/replica_data_iter.go, line 83 at r1 (raw file):

func makeRangeKeyRanges(d *roachpb.RangeDescriptor, replicatedOnly bool) []KeyRange {
	rangeIDLocal := MakeRangeIDLocalKeyRange(d.RangeID, replicatedOnly)

Can the functions in this file be simplified at all now that storage.DisallowSeparatedIntents is gone? It seems like they can, especially if we're ok with assuming the len of the array returned by makeRangeLockTableKeyRanges at its callers.


pkg/server/sticky_engine.go, line 108 at r1 (raw file):

		id:     spec.StickyInMemoryEngineID,
		closed: false,
		// This engine will stay alive after the node dies, so don't want the caller to pass

"so we don't"


pkg/storage/intent_reader_writer.go, line 62 at r1 (raw file):

	// initialize it up-front, but don't know if they are being used by code
	// that cares about intents (e.g. a temporary Engine used for disk-spilling
	// during query execution will never read-write intents).

There are trade-offs with this lazy initialization. On the one hand, it avoids the cost of computing writeSeparatedIntents for users who do not care about writing intents. On the other hand, it means that each intent write that a request issues will be forced to perform the computation, which looks moderately expensive. For instance, I think I see heap allocations and proto unmarshalling in clusterVersionSetting.activeVersionOrEmpty.

If we aren't relying on long-lived intentDemuxWriter instances observing changes to the cluster version or SeparatedIntentsEnabled setting, then maybe we can take a hybrid approach and memoize the lazy computation of writeSeparatedIntents. That would eliminate the trade-off with the lazy approach, at the expense of a bit more code.

EDIT: actually, is this structure long-lived?
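The hybrid the comment above proposes might be sketched like this (illustrative names, not the actual CockroachDB types): keep the lazy initialization, but memoize its result so only the first intent write pays the allocating computation, assuming the writer need not observe later changes to the version or setting.

```go
package main

import (
	"fmt"
	"sync"
)

// lazySeparatedConfig memoizes an expensive boolean computation: the
// compute function runs at most once, on first use.
type lazySeparatedConfig struct {
	once           sync.Once
	writeSeparated bool
	computeCalls   int // instrumentation for this sketch only
}

func (c *lazySeparatedConfig) get(compute func() bool) bool {
	c.once.Do(func() {
		c.computeCalls++
		c.writeSeparated = compute()
	})
	return c.writeSeparated
}

func main() {
	var cfg lazySeparatedConfig
	// Stands in for reading the cluster version and setting, which the
	// comment above notes can allocate and unmarshal protos.
	expensive := func() bool { return true }
	for i := 0; i < 3; i++ {
		fmt.Println(cfg.get(expensive)) // true each time
	}
	fmt.Println("compute calls:", cfg.computeCalls) // 1: later calls reuse the memoized value
}
```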


pkg/storage/intent_reader_writer_test.go, line 218 at r1 (raw file):

				// doesn't matter how the original call to createTestPebbleEngine
				// behaved in terms of separated intents config.
				w = wrapIntentWriter(&pw, makeSettingsForSeparatedIntents(false, separated))

nit: include a /* argName */ for false.
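The convention the nit refers to annotates opaque boolean literals with the parameter name at the call site. A minimal illustration (the function and parameter names here are made up, not the actual signature of makeSettingsForSeparatedIntents):

```go
package main

import "fmt"

// makeConfig is a hypothetical function with boolean parameters whose
// meaning is invisible at the call site without annotation.
func makeConfig(enabled bool, separated bool) string {
	return fmt.Sprintf("enabled=%t separated=%t", enabled, separated)
}

func main() {
	// The /* name */ comments tell the reader what each bare literal means.
	fmt.Println(makeConfig(false /* enabled */, true /* separated */))
}
```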


pkg/storage/pebble.go, line 881 at r1 (raw file):

			errors.Errorf("Pebble without cluster.Settings does not support SafeToWriteSeparatedIntents")
	}
	return p.settings.Version.IsActive(ctx, clusterversion.SeparatedIntents), nil

nit: Should we reach inside the wrappedIntentWriter for the settings so that this impl directly mirrors the one in pebbleBatch?

@sumeerbhola sumeerbhola force-pushed the config_sep branch 2 times, most recently from fd654be to c21fbee Compare February 10, 2021 22:14
Collaborator Author

@sumeerbhola sumeerbhola left a comment


TFTR!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal, @nvanbenschoten, and @petermattis)


pkg/kv/kvserver/store_snapshot.go, line 98 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/20.1/21.1/

Done


pkg/kv/kvserver/rditer/replica_data_iter.go, line 83 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Can the functions in this file be simplified at all now that storage.DisallowSeparatedIntents is gone? It seems like they can, especially if we're ok with assuming the len of the array returned by makeRangeLockTableKeyRanges at its callers.

Done


pkg/server/sticky_engine.go, line 108 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

"so we don't"

Done


pkg/storage/intent_reader_writer.go, line 62 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

There are trade-offs with this lazy initialization. On the one hand, it avoids the cost of computing writeSeparatedIntents for users who do not care about writing intents. On the other hand, it means that each intent write that a request issues will be forced to perform the computation, which looks moderately expensive. For instance, I think I see heap allocations and proto unmarshalling in clusterVersionSetting.activeVersionOrEmpty.

If we aren't relying on long-lived intentDemuxWriter instances observing changes to the cluster version or SeparatedIntentsEnabled setting, then maybe we can take a hybrid approach and memoize the lazy computation of writeSeparatedIntents. That would eliminate the trade-off with the lazy approach, at the expense of a bit more code.

EDIT: actually, is this structure long-lived?

I overlooked the expense -- good that you noticed.
We do have a long-lived intentDemuxWriter in Pebble, which should not be used by performance-sensitive code. I can cache the settings for use in pebbleBatch, which has the same lifetime as the batch.
Done.


pkg/storage/intent_reader_writer_test.go, line 218 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: include a /* argName */ for false.

Done


pkg/storage/pebble.go, line 881 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: Should we reach inside the wrappedIntentWriter for the settings so that this impl directly mirrors the one in pebbleBatch?

Done

Contributor

@nvb nvb left a comment


:lgtm:

Reviewed 17 of 17 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @petermattis)

@sumeerbhola sumeerbhola force-pushed the config_sep branch 7 times, most recently from c00ad5f to 93f71af Compare February 12, 2021 16:49
@sumeerbhola
Collaborator Author

bors r+

@craig
Contributor

craig bot commented Feb 12, 2021

Build succeeded:

@craig craig bot merged commit 98beba6 into cockroachdb:master Feb 12, 2021
@sumeerbhola
Collaborator Author

btw, I spent most of today trying to investigate the occasional TestLogic timeouts after 30m. I can't reproduce it locally -- even with parallel tests disabled, TestLogic consistently completes within 40min on my machine. And it is highly variable on teamcity, e.g. 12m, 18m, and of course the timeouts. There was nothing I could discern like bad error messages or stuck tests. For example, tests like testdata/logic_test/fk and testdata/logic_test/aggregate were showing reasonable progress indicators when it timed out, though eyeballing testdata/logic_test/fk (which has a higher rate of progress) suggested that failed runs were running slower. I assumed this was lower cpu allocation on the test machine(s) -- is that a reasonable assumption?

@sumeerbhola
Collaborator Author

And increasing TESTTIMEOUT to 1h, which I tried here https://teamcity.cockroachdb.com/viewLog.html?buildId=2672528&buildTypeId=Cockroach_UnitTests&tab=testsInfo, caused TestLogic to finish in 37m.

@RaduBerinde
Member

@sumeerbhola @nvanbenschoten I have been investigating a significant microbenchmark regression in BenchmarkSQL/Cockroach/InsertFK/count=1000/nFks=1. Git bisect pointed me to this commit (cee3bcf). I noticed that the separated intents were disabled later, but that doesn't fix the regression.

I am running:

GOMAXPROCS=1 make bench PKG=./pkg/bench BENCHES='BenchmarkSQL/^^Cockroach/InsertFK/count=1000/nFks=1$$' TESTFLAGS='-count 8'

At 48bcbf1 (commit right before this PR):

SQL/Cockroach/InsertFK/count=1000/nFks=1  16.4ms ±21%

At cee3bcf:

SQL/Cockroach/InsertFK/count=1000/nFks=1  30.0ms ±33%

At cee3bcf with storage.transaction.separated_intents.enabled set to false:

SQL/Cockroach/InsertFK/count=1000/nFks=1  31.6ms ±44%

The benchmark sets up a table with a FK to another table with a few rows and inserts 1000 rows in the child table. Any ideas what could have caused this?
