Skip to content

kvtenant,settingswatcher: ensure overrides are applied during start#105566

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
knz:20230626-wait-for-setting-overrides
Jun 27, 2023
Merged

kvtenant,settingswatcher: ensure overrides are applied during start#105566
craig[bot] merged 2 commits intocockroachdb:masterfrom
knz:20230626-wait-for-setting-overrides

Conversation

@knz
Copy link
Copy Markdown
Contributor

@knz knz commented Jun 26, 2023

See individual commits for details.

Fixes #96512.
Needed to unblock #104859.
Epic: CRDB-26691

@knz knz requested review from healthy-pod and yuzefovich June 26, 2023 18:30
@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jun 26, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@knz knz force-pushed the 20230626-wait-for-setting-overrides branch from bd00d42 to 048098e Compare June 26, 2023 18:56
@knz knz marked this pull request as ready for review June 26, 2023 18:56
@knz knz requested review from a team as code owners June 26, 2023 18:56
@knz knz added the backport-23.1.x PAST MAINTENANCE SUPPORT: 23.1 patch releases via ER request only label Jun 26, 2023
@knz knz force-pushed the 20230626-wait-for-setting-overrides branch 2 times, most recently from dea3984 to aae65b3 Compare June 26, 2023 20:13
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 4 files at r1, 6 of 6 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @healthy-pod and @knz)


pkg/kv/kvclient/kvtenant/setting_overrides.go line 126 at r1 (raw file):

	}

	settingsReady = c.settingsMu.receivedFirstAllTenantOverrides && c.settingsMu.receivedFirstSpecificOverrides

nit: why receiving at least one event on both channel is sufficient here? I'm assuming that it's possible to have a configuration which is "scattered" over multiple events on each channel, no? Perhaps a quick comment here would be worthwhile.


pkg/server/settingswatcher/overrides.go line 27 at r2 (raw file):

	// Overrides retrieves the current set of setting overrides, as a
	// map from setting key to EncodedValue. Any settings that are
	// present must be set to the overridden value. It also retruns a

nit: s/retruns/returns/g.

@knz knz force-pushed the 20230626-wait-for-setting-overrides branch from aae65b3 to fd59c8f Compare June 27, 2023 09:11
Copy link
Copy Markdown
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @healthy-pod and @yuzefovich)


pkg/kv/kvclient/kvtenant/setting_overrides.go line 126 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: why receiving at least one event on both channel is sufficient here? I'm assuming that it's possible to have a configuration which is "scattered" over multiple events on each channel, no? Perhaps a quick comment here would be worthwhile.

Added a clarification in the protobuf file - this is part of the protocol (and what node.go has done since forever).


pkg/server/settingswatcher/overrides.go line 27 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/retruns/returns/g.

Fixed.

knz added 2 commits June 27, 2023 11:37
Prior to this patch, the tenant connector would report readiness
(via the `Start` method returning) as soon as just 1
`TenantSettingsEvent` was received from KV.

This was incorrect: to gather a full view of cluster settings,
the connector must gather *both* the all-tenant overrides *and* the
per-tenant overrides.

This commit makes a step in this direction, by holding the return of
the `Start` method until values for both types of overrides have been
received.

It is not a full fix for the issue however, because the consumer of
this data (via the `settingswatcher.OverridesMonitor` interface) does
not synchronize with the above. So it's possible that the in-RAM
values of cluster settings are not yet updated by the time the
connector's `Start` method returns. This will be fixed by the next
commit.

Release note: None
Prior to this patch, the `settingswatcher` code was polling overrides
asynchronously from the "overrides monitor" (interface
`settingswatcher.OverridesMonitor`).

Theoretically, this made it possible for the settings watcher to
signal readiness before the tenant connector had finished receiving
the first batch of overrides.

In practice however, we were lucky that the connector's `Start()`
function was called before the `settingwatcher`'s `Start` method, so
as per the previous commit the synchronization was right already.
However, this was somewhat a lucky accident.

To strengthen the situation further, this commit applies the same
synchronization pattern we have already implemented in the
`tenantsettingswatcher` package (file `watcher.go`):

- the accessor method returns an update channel that gets closed
  every time there's new data.

- a new `startCh` channel on the connector is only closed when the
  connector has finished starting up. The consumer (via
  `settingswatcher.OverridesMonitor`) waits on that before starting to
  consume overrides.

This way the overrides are guaranteed to be applied to the in-RAM
setting values prior to the rest of server start-up.

Release note: None
@knz knz force-pushed the 20230626-wait-for-setting-overrides branch from fd59c8f to 27249ff Compare June 27, 2023 09:37
@knz
Copy link
Copy Markdown
Contributor Author

knz commented Jun 27, 2023

bors r=yuzefovich

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 27, 2023

Build succeeded:

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jun 27, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error setting reviewers, but backport branch blathers/backport-release-23.1-105566 is ready: POST https://api.github.com/repos/cockroachdb/cockroach/pulls/105602/requested_reviewers: 422 Reviews may only be requested from collaborators. One or more of the teams you specified is not a collaborator of the cockroachdb/cockroach repository. []

Backport to branch 23.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-23.1.x PAST MAINTENANCE SUPPORT: 23.1 patch releases via ER request only

Projects

None yet

Development

Successfully merging this pull request may close these issues.

server: avoid starting tenant servers until all overrides have been received

3 participants