sql: make statusServer optional#47693
Conversation
|
❌ The GitHub CI (Cockroach) build has failed on 992ad73d. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
|
❌ The GitHub CI (Cockroach) build has failed on 2c54e3ba. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
5d7927b to
a804e74
Compare
|
❌ The GitHub CI (Cockroach) build has failed on a804e74b. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
nvb
left a comment
There was a problem hiding this comment.
Reviewed 12 of 12 files at r1, 3 of 3 files at r2, 3 of 3 files at r3.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @asubiotto, and @tbg)
pkg/server/server.go, line 501 at r1 (raw file):
) // TODO(tbg): don't pass all of Server into this to avoid this hack. sAuthentication := newAuthenticationServer(lateBoundServer)
nit: Consider sAuth?
pkg/sql/planner.go, line 54 at r1 (raw file):
Tracing *SessionTracing // StatusServer gives access to the Status service. Used to cancel queries.
Give this a bit more of a comment. Describe what the boolean means and why it might be false.
pkg/sql/planner.go, line 55 at r1 (raw file):
// StatusServer gives access to the Status service. Used to cancel queries. StatusServer func() (serverpb.StatusServer, bool)
If we're planning on keeping this for a while, consider adding an serverpb.OptionalStatusServer object that has a sane zero value impl and a Get method. It would allow us to avoid boilerplate like noStatusServer.
andreimatei
left a comment
There was a problem hiding this comment.
LGTM
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @asubiotto, @nvanbenschoten, and @tbg)
pkg/server/server.go, line 501 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: Consider
sAuth?
consider authServer and friends :P
pkg/server/server.go, line 510 at r2 (raw file):
sqlServer, err := newSQLServer(ctx, sqlServerArgs{ sqlServerOptionalArgs: sqlServerOptionalArgs{
maybe name the struct kvAccess?
pkg/server/server_sql.go, line 106 at r1 (raw file):
// The status server is handed the stmtDiagnosticsRegistry. // // The status server is not available in pure SQL servers, in which case
I don't know about the name "pure SQL servers". I'm thinking if it'd be possible to sidestep the issue of rationalizing this comment by abstracting the statusServer in a clusterAccess interface, and have a dummy impl that returns errors for SessionRegistry().
pkg/server/server_sql.go, line 526 at r1 (raw file):
cfg.circularInternalExecutor, cfg.db, cfg.gossip, cfg.Settings) if statusServer, ok := cfg.status(); ok { statusServer.setStmtDiagnosticsRequester(stmtDiagnosticsRegistry)
I'd try to rework this interaction between this constructor and statusServer. Lift the call to setStmtDiagnosticsRequester() up to a higher level.
pkg/server/testserver.go, line 496 at r3 (raw file):
} // StartTenant starts a SQL tenant communicating with the receiver.
what's "the receiver"?
asubiotto
left a comment
There was a problem hiding this comment.
I think "Pure SQL server" is a confusing term to describe a SQL server not running in multi-tenant mode. I don't have any better suggestions than maybe "single-tenant SQL server" right now but we should probably agree on better terminology and use that from here on out.
Reviewed 1 of 12 files at r1, 1 of 3 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @tbg)
pkg/server/server_sql.go, line 106 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I don't know about the name "pure SQL servers". I'm thinking if it'd be possible to sidestep the issue of rationalizing this comment by abstracting the
statusServerin aclusterAccessinterface, and have a dummy impl that returns errors forSessionRegistry().
Not sure if this is what you're suggesting, but I'm +1 to making this an interface field (we probably need the whole serverpb.StatusServer interface though, right?) and returning errors for method calls. We'll probably want to move towards having a different status server implementation depending on whether we're running in multi-tenant mode or not.
pkg/sql/temporary_schema.go, line 505 at r1 (raw file):
// TODO(tbg): a pure SQL server still needs to clean up temporary schemas. // Listing sessions should not require a full status server. log.Infof(ctx, "early exiting temporary schema cleaner as no temporary schemas were found")
Is this log message right?
SQL depends on the status server to list sessions and nodes. A tenant SQL server (used in multi-tenancy) will not have access to these. Wrap the status server in a `(serverpb.StatusServer, bool)` wrapped that indicates when the status server is not present, and return the newly introduced `pgerr.UnsupportedWithMultiTenancy()` in such cases. Over time, we'll have to take stock of the functionality lost that way and we may have to restore some of it, however at the moment the focus is on discovering which functionality is affected, and turning their resolution into work items which can be handed to the SQL team. The intention is that looking at the callers of the above error constructor provides an overview to that end. Release note: None
The fields in this struct won't be available (at least initially) in a SQL tenant server. Release note: None
Allow starting a tenant process pointing at a node from a TestCluster. This paves the way to writing simple unit tests for multi-tenancy from anywhere in the codebase. The methods that construct and start the server are almost pure code movement, with minimal adjustments. Release note: None
Release note: None
tbg
left a comment
There was a problem hiding this comment.
Hmm, yeah, right now the nomenclature is in between chairs. This is because so far all I've cared about is to untangle the SQL server from the main server; there isn't anything about tenancy yet. And yet I've named the new method StartTenant in anticipation of it gaining a tenantID parameter soon (you can think of it as having an implicit tenantID of zero at the moment).
I'll stick with "SQL tenant server" which is how we've called this in the design docs (there we mostly talk about the SQL tenant process, which is a process running a SQL tenant server). Does that work?
Thanks for the reviews! I can sense that there's a desire to polish upfront, which I can understand. I hope it's okay that I'm pushing back against that a little bit (though I'm happy to address smaller items). My "mission" here is to go through sqlOptionalArgs quickly and a) sanely but with little effort make sure we error out when they're not there, therefore b) discovering all the places we may need to fix up later, and c) putting all of them into issues at the end, and d) making sure we're not regressing during current development.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @asubiotto, and @nvanbenschoten)
pkg/server/server.go, line 501 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
consider
authServerand friends :P
authServer is a bad choice because that shadows the type name (reason I originally made the renames). I went with sAuth as Nathan suggested.
pkg/server/server.go, line 510 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
maybe name the struct
kvAccess?
That's misleading in that it's too specific. I'd like KV to refer to the KV API. This is much more than that. Will leave as is for now.
pkg/server/server_sql.go, line 106 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Not sure if this is what you're suggesting, but I'm +1 to making this an interface field (we probably need the whole
serverpb.StatusServerinterface though, right?) and returning errors for method calls. We'll probably want to move towards having a different status server implementation depending on whether we're running in multi-tenant mode or not.
I did consider that and explicitly decided against it. The fact that multi-tenancy is becoming a thing does not have mindshare yet, which means we need to actively "help" engineers on the team reason about what is or is not available. An interface that is sometimes just not going to work does not meet that requirement. It should be obvious when you're using a status server that it won't always be there, with a comment on when it won't (which I wrote now).
I agree that in the long run some parts of this functionality need to be recovered even when running a SQL tenant process. However, this is not the goal right now. We're quite willing to sacrifice any nonessential functionality at the moment. As far as I can tell, nothing that was lost here is required for Phase 2. And even if it was, this is exactly the work I want to take out of my/Nathan's critical path. We want to enumerate what is lost, put it all into issues, and then we can decide what needs to come back. Maybe the right way is to pass a "local-only" impl of the status server around, maybe not. That will be for the person solving this issue to figure out. (PS: I don't think this one is a blocker neither).
pkg/server/server_sql.go, line 526 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I'd try to rework this interaction between this constructor and
statusServer. Lift the call tosetStmtDiagnosticsRequester()up to a higher level.
I agree that something remains to be done here, but I looked a bit and it wasn't clear exactly what. Lifting it to a higher level means lifting it above newSQLServer, and I would like to avoid making the full server initialization require one-off random calls like these. That said, this isn't pretty, but it's good enough for me to move past.
pkg/server/testserver.go, line 496 at r3 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
what's "the receiver"?
The receiver is ts *TestServer. I clarified.
pkg/sql/planner.go, line 55 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
If we're planning on keeping this for a while, consider adding an
serverpb.OptionalStatusServerobject that has a sane zero value impl and aGetmethod. It would allow us to avoid boilerplate likenoStatusServer.
That's a giant interface and we're not going to need this boilerplate in many places. I'm not convinced that's better. I'm also not convinced it's worse. Basically what I'm trying to do right now is to encode all of the optionality in the simplest possible way (that doesn't feel like leaving a mess behind) without going down too many rabbit holes. If I look at how we're passing this giant interface to everyone in SQL, and how circular the status server is, I am tempted to spend more time on it, but I think we oughtn't at this very moment.
pkg/sql/temporary_schema.go, line 505 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Is this log message right?
No, total BS! Thanks for catching that, fixed.
|
Going to merge this, but I'll keep touching code and am happy to respond to more comments. bors r=andreimatei,nvanbenschoten |
Build succeeded |
|
I find this StatusServer API confusing - we return |
|
Jordan, good point, I meant to update this to the pattern we've been using
more recently. PR here: #48143
|
|
Thanks! |
SQL depends on the status server to list sessions and nodes. A
pure SQL server (used for multi-tenancy) will not have access to
these.
Wrap the status server in a
(serverpb.StatusServer, bool)wrapperthat indicates when the status server is not present, and return the
newly introduced
pgerr.UnsupportedWithMultiTenancy()in such cases.Over time, we'll have to take stock of the functionality lost that
way and we may have to restore some of it, however at the moment
the focus is on discovering which functionality is affected, and
turning their resolution into work items which can be handed to
the SQL team.
The intention is that looking at the callers of the above error
constructor provides an overview to that end.
Release note: None