Skip to content

roachpb: introduce the concept of SystemSpanConfig and related protos#74765

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
arulajmani:system-span-config
Jan 25, 2022
Merged

roachpb: introduce the concept of SystemSpanConfig and related protos#74765
craig[bot] merged 1 commit intocockroachdb:masterfrom
arulajmani:system-span-config

Conversation

@arulajmani
Copy link
Copy Markdown
Collaborator

This patch is motivated by the desire to let the host tenant lay
protected timestamps on one or all secondary tenants' keyspace. It
also provides a mechanism to allow secondary tenants to lay protected
timestamps on their entire keyspace without updating every span
configuration.

We introduce the concept of SystemSpanConfig and
SystemSpanConfigTarget to enable this. We tie these together using a
SystemSpanConfigEntry.

A SystemSpanConfig is a system installed configuration that can apply
to multiple spans. It only contains protected timestamp information.

A SystemSpanConfigTarget is used to specify the spans a
SystemSpanConfig applies over. It can be used to target the entire
(logical) cluster or a particular secondary tenant. We will ensure that
only the host tenant can target secondary tenants in a future PR that
actually persists SystemSpanConfigs.

We will persist SystemSpanConfigs in system.span_configurations in
a future patch. The SystemSpanConfigTarget will be encoded into
special reserved keys when we do so.

This change introduces the notion of a hierarchy to span configurations.
The configuration that applies to a span will now bee the SpanConfig
stored in system.span_configurations combined with all the
SystemSpanConfigs that apply to the span. This can be at most 4
levels deep -- for a secondary tenant's range, the secondary tenant can
install a SystemSpanConfig that applies to all its ranges, the host
tenant can install a SystemSpanConfig that applies to all ranges of
the secondary tenant, and the host tenant can install a
SystemSpanConfig that applies to all ranges.

These protos form the data model which will later be used to enable
protected timestamp support for secondary tenants using the span config
infrastructure. It will be used by the various components such as the
SQLTranslator, KVAccessor, Reconciler etc.

Release note: None

@arulajmani arulajmani requested a review from a team as a code owner January 12, 2022 21:28
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@arulajmani
Copy link
Copy Markdown
Collaborator Author

This is slightly different from the data model described in the RFC, prompted by a discussion I had with @irfansharif and @adityamaru offline after the RFC was published. Specifically the introduction of the SystemSpanConfigTarget message -- by doing so, we can keep the meaning of the special keys we are hoping to reserve contained within KVAccessor and KVSubscriber. The SQLTranslator and Reconciler won't need to know how SystemSpanConfigs are stored internally.

I'll go back and update the RFC with where this PR ends up.

Copy link
Copy Markdown
Contributor

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This matches my mental model, but again I might be too close to the fire so we should probably wait for one more LGTM.


// ProtectedTimestamp is a timestamp above which GC should not run, regardless
// of the GC TTL.
util.hlc.Timestamp protected_timestamp = 10 [(gogoproto.nullable) = false];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this field should be ID 1?

@arulajmani
Copy link
Copy Markdown
Collaborator Author

@nvanbenschoten could I get a look on this from you as well before we merge this? Specifically the discussion around the introduction of the SystemSpanConfig message that you raised on the RFC. Is my rationale (copy pasted from the RFC discussion below) convincing?

We considered using the same SpanConfig proto but decided to introduce this special type once the design moved away from generalized inheritance. As the hierarchy we're proposing is strictly above RANGE DEFAULT, SpanConfigs are guaranteed to be fully hydrated so we can never have inheritance of fields. We made this fully hydrated distinction explicit by choosing to make all fields on SpanConfigs non-nullable (unlike zone configs) and introducing this new message lets us keep things that way.

Now that I type this, maybe this fully hydrated argument may not apply in the future if we introduce new fields on SpanConfigs to be only set by the host tenant and apply to all tenant ranges. But a different type here shouldn't preclude us from doing so in the future.

Copy link
Copy Markdown
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer to y'all about whether we want a dedicated SystemSpanConfig type, or just re-use the SpanConfig one.

}

// SystemSpanConfigTarget is used to specify the target of a SystemSpanConfig.
message SystemSpanConfigTarget {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed a bit offline, but could this perhaps be pared down to something like the following?

message SystemSpanConfigTarget {
   // TenantID indicates the tenant ID of the logical cluster being targeted.
   // For secondary tenants this field is left unset. For the host we can use
   // this field to protect a specific guest tenant. Down in KV we already have
   // access to roachpb.TenantFromContext to disambiguate where the request is
   // originating from -- precisely to avoid needing to add data to our message
   // types to encode as much. We can auth that this field is left empty if
   // originating from a secondary tenant.
   roachpb.TenantID tenant_id = 1 [(gogoproto.customname) = "TenantID", (gogoproto.nullable) = true];
 }

A relevant example (not too different from what spanconfig.KVAccessor will need to do):

tenantID, ok := roachpb.TenantFromContext(ctx)
if !ok || tenantID == roachpb.SystemTenantID {

// range (in conjunction with the GC Policy). A ProtectionPolicy can be used
// to indicate a timestamp above which GC should not run, regardless of the
// GC TTL.
repeated ProtectionPolicy protection_policies = 10 [(gogoproto.nullable) = false];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of making the ProtectionPolicy itself a repeated type, I think we could nest the repetition within the message itself and be just as future proof (we only care about repeated timestamps for now, right?). I'd actually go as far as to say perhaps this should be nested with our existing GCPolicy type; these timestamps are only ever used in the context of GC after all.

Copy link
Copy Markdown
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spoke to @nvanbenschoten about dedicated SystemSpanConfig type vs. not and we agreed that we're on the same page with introducing the new one.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @irfansharif, and @nvanbenschoten)


pkg/roachpb/span_config.proto, line 41 at r2 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

I think this field should be ID 1?

Done.


pkg/roachpb/span_config.proto, line 159 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Instead of making the ProtectionPolicy itself a repeated type, I think we could nest the repetition within the message itself and be just as future proof (we only care about repeated timestamps for now, right?). I'd actually go as far as to say perhaps this should be nested with our existing GCPolicy type; these timestamps are only ever used in the context of GC after all.

We talked about why we want to make ProtectionPolicy a repeated type offline -- imagine wanting to introduce ProtectionMode in the future for realsies, in which case, we would want to tie the mode to a particular timestamp (as opposed to all timestamps) . The desire here is to future proof ourselves for changes like these.

That being said, it makes sense to nest this under the GCPolicy message because it's only ever used in conjunction with it; done.


pkg/roachpb/span_config.proto, line 182 at r2 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Discussed a bit offline, but could this perhaps be pared down to something like the following?

message SystemSpanConfigTarget {
   // TenantID indicates the tenant ID of the logical cluster being targeted.
   // For secondary tenants this field is left unset. For the host we can use
   // this field to protect a specific guest tenant. Down in KV we already have
   // access to roachpb.TenantFromContext to disambiguate where the request is
   // originating from -- precisely to avoid needing to add data to our message
   // types to encode as much. We can auth that this field is left empty if
   // originating from a secondary tenant.
   roachpb.TenantID tenant_id = 1 [(gogoproto.customname) = "TenantID", (gogoproto.nullable) = true];
 }

A relevant example (not too different from what spanconfig.KVAccessor will need to do):

tenantID, ok := roachpb.TenantFromContext(ctx)
if !ok || tenantID == roachpb.SystemTenantID {

Thanks for pointing me to this TenantFromContext pattern -- done!

Copy link
Copy Markdown
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @nvanbenschoten and @ajwerner, this PR introduces the base protos to power the rest of #74685, and it'd be worth taking a look to see what you think.

@arulajmani
Copy link
Copy Markdown
Collaborator Author

Going to merge this for now as we have a few dependent pieces on this, but if there's any comments on this later I'm happy to retroactively address them.

bors r+

This patch is motivated by the desire to let the host tenant lay
protected timestamps on one or all secondary tenants' keyspace. It
also provides a mechanism to allow secondary tenants to lay protected
timestamps on their entire keyspace without updating every span
configuration.

We introduce the concept of `SystemSpanConfig` and
`SystemSpanConfigTarget` to enable this. We tie these together using a
`SystemSpanConfigEntry`.

A `SystemSpanConfig` is a system installed configuration that can apply
to multiple spans. It only contains protected timestamp information.

A `SystemSpanConfigTarget` is used to specify the spans a
`SystemSpanConfig` applies over. It can be used to target the entire
(logical) cluster or a particular secondary tenant. We will ensure only
the host tenant can target particular secondary tenants in a future PR
that actually persists `SystemSpanConfigs`.

We will persist `SystemSpanConfigs` in `system.span_configurations` in
a future patch. The `SystemSpanConfigTarget` will be encoded into
special reserved keys when we do so.

This change introduces the notion of a hierarchy to span configurations.
The configuration that applies to a span will now bee the `SpanConfig`
stored in `system.span_configurations` combined with all the
`SystemSpanConfigs` that apply to the span. This can be at most 4
levels deep -- for a secondary tenant's range, the secondary tenant can
install a `SystemSpanConfig` that applies to all its ranges, the host
tenant can install a `SystemSpanConfig` that applies to all ranges of
the secondary tenant, and the host tenant can install a
`SystemSpanConfig` that applies to all ranges.

These protos form the data model which will later be used to enable
protected timestamp support for secondary tenants using the span config
infrastructure. It will be used by the various components such as the
`SQLTranslator`, `KVAccessor`, `Reconciler` etc.

Release note: None
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 24, 2022

Canceled.

@arulajmani
Copy link
Copy Markdown
Collaborator Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 24, 2022

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 25, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants