Skip to content

tenantcapabilities: introduce a Watcher over system.tenants#95040

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
arulajmani:ten-capabilities-watcher
Jan 19, 2023
Merged

tenantcapabilities: introduce a Watcher over system.tenants#95040
craig[bot] merged 1 commit intocockroachdb:masterfrom
arulajmani:ten-capabilities-watcher

Conversation

@arulajmani
Copy link
Copy Markdown
Collaborator

@arulajmani arulajmani commented Jan 11, 2023

This patch introduces three new interfaces -- a Watcher, a Reader,
and Authorizer. They're not hooked up yet, but once they are, they'll
work together to provide (in-memory) capability checks for incoming
tenant requests.

The Watcher establishes a rangefeed over system.tenants to
incrementally (and transparently) maintain an in-memory view of the
global tenant capability state. Publicly, it exposes a Reader
interface.

The Reader provides access to the global tenant capability state.
The Watcher and Authorizer communicate with each other using the
Reader interface.

The Authorizer consulsts the global tenant capability state to perform
authorization checks for incoming requests issued by tenants. Part of
the motivation to structure the code as such is to expand the set of
inputs the Authorizer uses to authorize requests. One could imagine
other dependencies being injected into the Authorizer in the future.

Epic: CRDB-18503
References: #94643

Release note: None

@arulajmani arulajmani requested review from ecwall and knz January 11, 2023 03:24
@arulajmani arulajmani requested a review from a team as a code owner January 11, 2023 03:24
@arulajmani arulajmani requested a review from a team January 11, 2023 03:24
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@arulajmani arulajmani force-pushed the ten-capabilities-watcher branch 2 times, most recently from 5315c46 to 9772b3f Compare January 11, 2023 16:22
Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with the overall architecture of the watcher, and with the approach to testing.
Here are my major comments so far.

  1. I'm not sure we can reasonably expect the Authorizer to be implemented in a self-contained package over time. IMHO it will need to be part of either pkg/server or pkg/kv/kvserver, because it will grow over time to include parameters from the environment (things like resource usage, pattern of previous requests etc). I have no objections to placing the interface in a separate package but putting the implementation there is dubious (to me)

  2. it's going to be very important to have more members of the KV team look at this code and participate in the review iteration, because the KV team as a whole will bear the responsibility to maintain this forward.

I'd like to iterate on point (1) before this PR can be merged. Point (2) may not be a blocker to the merge, but be wary of adding much KV code without creating a sense of ownership to the rest of the KV team; otherwise this could come back to bite us later.

Reviewed 18 of 19 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @ecwall)


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/decoder.go line 102 at r1 (raw file):

	deleted := !ev.Value.IsPresent()
	var value roachpb.Value
	if deleted {

I don't understand what this condition is doing. This function could use a few more explanatory comments.


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/decoder.go line 116 at r1 (raw file):

		Value: value,
	})

nit: remove empty line.


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 53 at r1 (raw file):

	bufferMemLimit int64,
	knobs *tenantcapabilities.TestingKnobs,
) *Watcher {

you can return the interface type here to hide implementation details.

Copy link
Copy Markdown
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can talk about point (1) a bit more in-person, during tomorrow's pod meeting. I do want to explore how we could keep the Authorizer implementation self-contained, if at all possible. Could we instead inject interfaces which give the Authorizer access to whatever parameters it needs from the environment when we get there? Maybe I'm oversimplifying this because I don't appreciate how some of the capabilities we might add in the future might look in code.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ecwall)

@arulajmani arulajmani requested a review from knz January 11, 2023 21:25
@knz
Copy link
Copy Markdown
Contributor

knz commented Jan 11, 2023

Could we instead inject interfaces which give the Authorizer access to whatever parameters it needs from the environment when we get there?

yes that sounds reasonable. Do you have an idea (if only a sketch) of how it would look like?

Copy link
Copy Markdown
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an idea (if only a sketch) of how it would look like?

I think instead of having the Watcher call into tenantcapabilitiesauthorizer.New, it instead needs to be passed in the Authorizer [1] or an AuthorizerFactory. Then, whatever dependencies we need for the Authorizer, would be passed to it during creation (or in the creation of the AuthorizerFactory) .

[1] If we go this route, we would need to add a Reset method on the Authorizer, to have it throw away its state. We need this if, for some reason, the rangefeed runs into an error and as a result performs a full scan of system.tenants.

I have a mild preference for the AuthorizerFactory approach. But I could also go the other way, given it feels a bit premature for now, without any dependencies to speak of. What do you think?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ecwall and @knz)

Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My expectation based on other subsystems is that the Reset route is better -- the authorizer will have other state besides whatever cache it will maintain, and re-instantiating it on every invalidation will be expensive.

But I agree it could go either way.

At this point I think the most important to decouple the two and make the authorizer/watcher connection work via an interface, instead of a direct package dependency.

Btw my other comments from before still apply.

Reviewed 1 of 19 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @ecwall)

ecwall
ecwall previously requested changes Jan 13, 2023
Copy link
Copy Markdown
Contributor

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani)


pkg/multitenant/tenantcapabilities/capabilities.go line 24 at r1 (raw file):

// by watching for changes.
type CapabilitiesWatcher interface {
	Authorizer

Why is the watcher also an authorizer?

It looks like CapabilitiesWatcher is the write interface and Authorizer is the read interface?


pkg/multitenant/tenantcapabilities/tenantcapabilitiesauthorizer/authorizer.go line 55 at r1 (raw file):

		switch ru.GetInner().(type) {
		case *roachpb.AdminSplitRequest:
			cp, ok := a.mu.capabilities[tenID]

The RLock() only needs to guard this line and it can happen once before the loop.

cp will be a copy of what is in the map so it does not need to be guarded.


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 37 at r1 (raw file):

	bufferMemLimit   int64

	tenantsTableID uint32 // overriden for tests

typo

@arulajmani arulajmani force-pushed the ten-capabilities-watcher branch from 9772b3f to 1f08974 Compare January 19, 2023 04:35
@arulajmani arulajmani requested a review from a team as a code owner January 19, 2023 04:35
Copy link
Copy Markdown
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After mulling over this a bit more, and keeping in mind we want the Authorizer to be general enough that it could be extended to use other signals to perform authorization in the future, I decided to add a new Reader interface. We then use this Reader interface to communicate between the Authorizer and Watcher -- the Watcher is responsible for incrementally maintaining the underlying tenant capabilities state, and exposes it via the Reader. Conversely, the Authorizer consults the global tenant capability state to make authorization decisions. Let me know what you think!

Dismissed @ecwall from a discussion.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ecwall and @knz)


pkg/multitenant/tenantcapabilities/capabilities.go line 24 at r1 (raw file):

Previously, ecwall (Evan Wall) wrote…

Why is the watcher also an authorizer?

It looks like CapabilitiesWatcher is the write interface and Authorizer is the read interface?

Changed some of this stuff around, so this comment no longer applies.


pkg/multitenant/tenantcapabilities/tenantcapabilitiesauthorizer/authorizer.go line 55 at r1 (raw file):

Previously, ecwall (Evan Wall) wrote…

The RLock() only needs to guard this line and it can happen once before the loop.

cp will be a copy of what is in the map so it does not need to be guarded.

Changed some of this stuff around, but the new changes follow the ethos of your comment.


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/decoder.go line 102 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

I don't understand what this condition is doing. This function could use a few more explanatory comments.

Added.


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 37 at r1 (raw file):

Previously, ecwall (Evan Wall) wrote…

typo

Which part?


pkg/multitenant/tenantcapabilities/tenantcapabilitieswatcher/watcher.go line 53 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

you can return the interface type here to hide implementation details.

I was intending to accept the value returned of this function as the interface, instead of returning an interface here. I don't think the whole "return interface vs. structs" matters much for what we're doing here, but the one nicety of returning the concrete struct here is it allows us to access the testing function directly, without having to perform a cast or pollute the interface.

@arulajmani arulajmani changed the title tenantcapabilities: introduce a CapabilitiesWatcher over system.tenants tenantcapabilities: introduce a Watcher over system.tenants Jan 19, 2023
Copy link
Copy Markdown
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Before you merge this, can you do a back-of-the-envelope estimation of how much memory this uses for, say, 10000 tenant records and 100 bytes of capability payload per tenant - and mention it in the PR description.

Reviewed 15 of 15 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani, @ecwall, and @irfansharif)


-- commits line 18 at r2:
consults*

This patch introduces three new interfaces -- a Watcher, a Reader,
and Authorizer. They're not hooked up yet, but once they are, they'll
work together to provide (in-memory) capability checks for incoming
tenant requests.

The Watcher establishes a rangefeed over `system.tenants` to
incrementally (and transparently) maintain an in-memory view of the
global tenant capability state. Publicly, it exposes a `Reader`
interface.

The `Reader` provides access to the global tenant capability state.
The `Watcher` and `Authorizer` communicate with each other using the
`Reader` interface.

The `Authorizer` consults the global tenant capability state to perform
authorization checks for incoming requests issued by tenants. Part of
the motivation to structure the code as such is to expand the set of
inputs the `Authorizer` uses to authorize requests. One could imagine
other dependencies being injected into the `Authorizer` in the future.

In this PR, we conservatively estimate a tenant capabilities proto to be
50 bytes in size. Given the `Watcher` keeps the entire capabilities
state in memory, this means that for 10,0000 tenants, we would have a
very sane memory footprint of 500KB.

Epic: CRDB-18503
References: cockroachdb#94643

Release note: None
@arulajmani arulajmani force-pushed the ten-capabilities-watcher branch from 1f08974 to 1286dcc Compare January 19, 2023 18:50
Copy link
Copy Markdown
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some words. I chose to use 50 bytes per capability payload, given that's the constant we use in the PR to allocate the buffer.

TFTR!

bors r=knz

Dismissed @knz from a discussion.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ecwall and @irfansharif)


-- commits line 18 at r2:

Previously, knz (Raphael 'kena' Poss) wrote…

consults*

Done.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 19, 2023

👎 Rejected by code reviews

@arulajmani
Copy link
Copy Markdown
Collaborator Author

bors r=knz

Copy link
Copy Markdown
Contributor

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 19 files at r1, 15 of 15 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif)

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 19, 2023

Build failed (retrying...):

Copy link
Copy Markdown
Contributor

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code smells vaguely familiar 😛

if s.overridesMonitor != nil {
s.mu.overrides = make(map[string]settings.EncodedValue)
// Initialize the overrides. We want to do this before processing the
s.mu.overrides = make(map[string]settings.EncodedValue) // Initialize the overrides. We want to do this before processing the
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete sentence. Also did you mean to inline it? If so, use the whole lower case/no period thing.

}
cp, found := a.capabilitiesReader.GetCapabilities(tenID)
if !found {
log.Infof(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a Warningf instead. Also, the way we've written this code, we're expecting the zero values tenantcapabilitiespb.TenantCapabilities be what we use when no capability is found for any tenant. Maybe you could make this more explicit, by setting a fallback explicitly if !found.

----
false

# Test that the order of the split request doesn't have any effect.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this test case you're not messing with the ordering, you don't have a split at all. It's the same as the test below.

ok

# tenant capabilities should not apply to the system tenant.
# TODO(arul): this is likely to change, but is contingent on a migration to seed
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this future, would that mean that the tenantcapabilities rangefeed watcher running + state being present in the system.tenant would be data dependencies for the system tenant to be able to issue splits/exercise its capabilities?

true, /* withPrevValue */
w.decoder.translateEvent,
w.handleUpdate,
w.knobs.WatcherRangeFeedKnobs.(*rangefeedcache.TestingKnobs),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this type conversion panic if nil?

updates
----

update-state
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there no "Complete Update" here? It's completely updated, right? To an empty table.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 19, 2023

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants