Skip to content

*: simplify tests#111383

Merged
craig[bot] merged 7 commits intocockroachdb:masterfrom
knz:20230927-simplify-tests
Sep 29, 2023
Merged

*: simplify tests#111383
craig[bot] merged 7 commits intocockroachdb:masterfrom
knz:20230927-simplify-tests

Conversation

@knz
Copy link
Copy Markdown
Contributor

@knz knz commented Sep 27, 2023

All commits except the last are from #110758.
Epic: CRDB-6671

Now that "tenant-ro" settings take their default from the system tenant's value, we do not need ALTER TENANT ALL for them any more.

This patch simplifies test code accordingly.

Rcommended by @yuzefovich in the review for #110758.

@knz knz requested a review from yuzefovich September 27, 2023 21:42
@knz knz requested review from a team as code owners September 27, 2023 21:42
@knz knz requested review from a team, DarrylWong, herkolategan, jayshrivastava, renatolabs and rhu713 and removed request for a team, DarrylWong, jayshrivastava and rhu713 September 27, 2023 21:42
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last commit :lgtm:

Reviewed 15 of 15 files at r5, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan and @renatolabs)

(This function/method will also be deleted in a later commit.)

Release note: None
@knz knz force-pushed the 20230927-simplify-tests branch from 4e73d82 to ea871c7 Compare September 29, 2023 14:59
@knz knz requested a review from a team September 29, 2023 14:59
knz and others added 2 commits September 29, 2023 19:06
…ed cache

(For context, on each node there is a local persisted cache of cluster
setting customizations. This exists to ensure that configured values
can be used even before a node has fully started up and can start
reading customizations from `system.settings`.)

Prior to this patch, entries were never evicted from the local
persisted cache: when a cluster setting was reset, any previously
saved entry in the cache would remain there.

This is a very old bug, which was long hidden and was recently
revealed when commit 2f5d717 was
merged. In a nutshell, before this recent commit the code responsible
to load the entries from the cache didn't fully work and so the stale
entries were never restored from the cache. That commit fixed the
loader code, and so the stale entries became active, which made the
old bug visible.

To fix the old bug, this present commit modifies the settings watcher
to preserve KV deletion events, and propagates them to the persisted
cache.

(There is no release note because there is no user-facing release where
the bug was visible.)

Release note: None

Co-authored-by: Steven Danna <danna@cockroachlabs.com>
Prior to this patch, the rangefeed watcher over `system.settings` was
updating the in-RAM value store before it propagated the updates to
the persisted local cache.

In fact, the update to the persisted local cache was lagging quite a
bit behind, because the rangefeed watcher would buffer updates and
only flush them after a while.

As a result, the following sequence was possible:

1. client updates a cluster setting.
2. server is immediately shut down. The persisted cache
   has not been updated yet.
3. server is restarted. For a short while (until the settings watcher
   has caught up), the old version of the setting remains active.

This recall of ghost values of a setting was simply a bug. This patch
fixes that, by ensuring that the persisted cache is written through
before the in-RAM value store.

By doing this, we give up on batching updates to the persisted local
store. This is deemed acceptable because cluster settings are not
updated frequently.

Release note: None
@knz knz force-pushed the 20230927-simplify-tests branch from ea871c7 to 567c57d Compare September 29, 2023 17:27
knz added 4 commits September 29, 2023 19:38
In a later commit, we will want to feed changes to TenantReadOnly
settings in the system tenant's `system.settings` table into the
connector, for use by the virtual cluster servers.

This commit sets up the first building block: an extra callback
function in the settings watcher that gets called when the watcher
detects a change to TenantReadOnly settings.

Release note: None
As an end-goal, we want that when there is no override in
`system.tenant_settings`, i.e. the rangefeed over this table has
nothing to stay, we pick up the current value of `TenantReadOnly`
settings from `system.settings` instead to feed to virtual clusters.

To achieve this, we need to merge the current `TenantReadOnly` values
"underneath" the tenant overrides, i.e. use the current
`TenantReadOnly` as default override value when the rangefeed over
`tenant_settings` has nothing to say.

This commit updates the overrides store accordingly. It does so by
maintaining an `alternateDefaults` override slice, updated from
changes to `TenantReadOnly` settings from `system.settings`.

Then, updates to the overrides map (in `tenantOverrides`) for the
special tenant ID `allTenantOverridesID` are "merged" from
`alternateDefaults` with the following rule:

- when a change to `alternateDefaults` is made, the current
  `allTenantOverridesID` override slice is updated for each
  setting *not* currently set from the rangefeed (those missing from the
  `explicitKeys` map).

- when an *addition* or *modification* is observed from the
  rangefeed (for tenant ID `allTenantOverridesID`), the overrides slice
  is updated with the rangefeed value, and the rangefeed setting key is
  added to `explicitKeys`.

- when a *deletion* is observed from the rangefeed, the key is removed
  from `explicitKeys` and any override from `alternateDefaults` is
  inserted instead.

Note that entries are never deleted from `alternateDefaults`, even
when a TenantReadOnly customization is removed from
`system.settings` (i.e. the callback `setAlternateDefaults` is given a
smaller slice than the current `alternateDefaults`). This is on
purpose: to properly support mixed-version deployments, we want
that SQL servers observe the storage layer's idea of what the default
value of a TenantReadOnly setting should be. To achieve this, it's
important that the storage server always reports an override for every
TenantReadOnly setting.

Release note: None
TLDR: this patch ensures that virtual cluster servers observe changes
made to TenantReadOnly settings via SET CLUSTER SETTING in the system
interface, even when there is no override set via ALTER VIRTUAL
CLUSTER SET CLUSTER SETTING.

For example, after `SET CLUSTER SETTING
kv.closed_timestamp.target_duration = '10s'` in the system interface,
this value will show up via `SHOW CLUSTER SETTING` in a virtual
cluster SQL session.

This changes the way that settings are picked up in virtual cluster,
as follows:

1. if there is an override specifically for this tenant's ID
   (in `tenant_settings`), use that.
2. otherwise, if there is an override for the pseudo-ID 0
   (in `tenant_settings` still, set via `ALTER TENANT ALL SET CLUSTER
   SETTING`), then use that.
3. **NEW**  otherwise, if the class is TenantReadOnly and there
   is a custom value in `system.settings`, set via a regular
   `SET CLUSTER SETTING` in the system tenant, then use that.
4. otherwise, use the global default set via the setting's
   `Register()` call.

----

Prior to this patch, TenantReadOnly settings as observed from virtual
clusters were defined as the following priority order:

1. if there is an override specifically for this tenant's ID
   (in `tenant_settings`), use that.
2. otherwise, if there is an override for the pseudo-ID 0
   (in `tenant_settings` still, set via `ALTER TENANT ALL SET CLUSTER
   SETTING`), then use that.
3. otherwise, use the global default set via the setting's
   `Register()` call.

Remarkably, this did not pick up any changes made via a plain `SET
CLUSTER SETTING` statement via the system interface, which only
modifies this setting's value in `system.settings` (thus not
`tenant_settings`).

This situation was problematic in two ways.

To start, settings like `kv.closed_timestamp.target_duration` cannot
be set solely in `system.tenant_settings`; they are also used in the
storage layer and so must also be picked up from changes in
`system.settings`.

For these settings, it is common for operators to just issue the plain
`SET CLUSTER SETTING` statement (to update `system.settings`) and
simply forget to _also_ run `ALTER TENANT ALL SET CLUSTER SETTING`.

This mistake is nearly unavoidable and would result in incoherent
behavior, where the storage layer would use the customized value
and virtual clusters would use the registered global default.

The second problem is in mixed-version configurations, where
the storage layer runs version N+1 and the SQL service runs version N
of the executable. If the registered global default changes from
version N to N+1, the SQL service would not properly pick up
the new default defined in version N+1 of the storage layer.

This patch fixes both problems as follows:

- it integrates changes to TenantReadOnly settings observed
  in `system.settings`, to the watcher that tracks changes
  to `system.tenant_settings`. When a TenantReadOnly setting
  is present in the former but not the latter, a synthetic
  override is added.

- it also initializes synthetic overrides for all the
  TenantReadOnly settings upon server initialization,
  from the registered global default, so that virtual
  cluster servers always pick up the storage layer's
  default as override.

Release note: None
Now that "tenant-ro" settings take their default from the system
tenant's value, we do not need `ALTER TENANT ALL` for them any more.

This patch simplifies test code accordingly.

Release note: None
@knz knz force-pushed the 20230927-simplify-tests branch from 567c57d to efdf0db Compare September 29, 2023 17:41
@knz
Copy link
Copy Markdown
Contributor Author

knz commented Sep 29, 2023

TFYR

bors r=yuzefovich

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Sep 29, 2023

Build succeeded:

@craig craig bot merged commit 1787e21 into cockroachdb:master Sep 29, 2023
@knz knz deleted the 20230927-simplify-tests branch September 29, 2023 21:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants