Skip to content

sql: make statusServer optional#47693

Merged
craig[bot] merged 4 commits intocockroachdb:masterfrom
tbg:sql-no-status
Apr 21, 2020
Merged

sql: make statusServer optional#47693
craig[bot] merged 4 commits intocockroachdb:masterfrom
tbg:sql-no-status

Conversation

@tbg
Copy link
Copy Markdown
Member

@tbg tbg commented Apr 20, 2020

SQL depends on the status server to list sessions and nodes. A
pure SQL server (used for multi-tenancy) will not have access to
these.

Wrap the status server in a (serverpb.StatusServer, bool) wrapper
that indicates when the status server is not present, and return the
newly introduced pgerr.UnsupportedWithMultiTenancy() in such cases.

Over time, we'll have to take stock of the functionality lost that
way and we may have to restore some of it, however at the moment
the focus is on discovering which functionality is affected, and
turning their resolution into work items which can be handed to
the SQL team.

The intention is that looking at the callers of the above error
constructor provides an overview to that end.

Release note: None

@tbg tbg requested a review from andreimatei April 20, 2020 12:44
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Apr 20, 2020

❌ The GitHub CI (Cockroach) build has failed on 992ad73d.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Apr 20, 2020

❌ The GitHub CI (Cockroach) build has failed on 2c54e3ba.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@tbg tbg requested a review from asubiotto April 20, 2020 14:35
@tbg tbg force-pushed the sql-no-status branch 2 times, most recently from 5d7927b to a804e74 Compare April 20, 2020 19:28
@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Apr 20, 2020

❌ The GitHub CI (Cockroach) build has failed on a804e74b.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 12 of 12 files at r1, 3 of 3 files at r2, 3 of 3 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @asubiotto, and @tbg)


pkg/server/server.go, line 501 at r1 (raw file):

	)
	// TODO(tbg): don't pass all of Server into this to avoid this hack.
	sAuthentication := newAuthenticationServer(lateBoundServer)

nit: Consider sAuth?


pkg/sql/planner.go, line 54 at r1 (raw file):

	Tracing *SessionTracing

	// StatusServer gives access to the Status service. Used to cancel queries.

Give this a bit more of a comment. Describe what the boolean means and why it might be false.


pkg/sql/planner.go, line 55 at r1 (raw file):

	// StatusServer gives access to the Status service. Used to cancel queries.
	StatusServer func() (serverpb.StatusServer, bool)

If we're planning on keeping this for a while, consider adding an serverpb.OptionalStatusServer object that has a sane zero value impl and a Get method. It would allow us to avoid boilerplate like noStatusServer.

Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto, @nvanbenschoten, and @tbg)


pkg/server/server.go, line 501 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: Consider sAuth?

consider authServer and friends :P


pkg/server/server.go, line 510 at r2 (raw file):

	sqlServer, err := newSQLServer(ctx, sqlServerArgs{
		sqlServerOptionalArgs: sqlServerOptionalArgs{

maybe name the struct kvAccess?


pkg/server/server_sql.go, line 106 at r1 (raw file):

	// The status server is handed the stmtDiagnosticsRegistry.
	//
	// The status server is not available in pure SQL servers, in which case

I don't know about the name "pure SQL servers". I'm thinking if it'd be possible to sidestep the issue of rationalizing this comment by abstracting the statusServer in a clusterAccess interface, and have a dummy impl that returns errors for SessionRegistry().


pkg/server/server_sql.go, line 526 at r1 (raw file):

		cfg.circularInternalExecutor, cfg.db, cfg.gossip, cfg.Settings)
	if statusServer, ok := cfg.status(); ok {
		statusServer.setStmtDiagnosticsRequester(stmtDiagnosticsRegistry)

I'd try to rework this interaction between this constructor and statusServer. Lift the call to setStmtDiagnosticsRequester() up to a higher level.


pkg/server/testserver.go, line 496 at r3 (raw file):

}

// StartTenant starts a SQL tenant communicating with the receiver.

what's "the receiver"?

Copy link
Copy Markdown
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "Pure SQL server" is a confusing term to describe a SQL server not running in multi-tenant mode. I don't have any better suggestions than maybe "single-tenant SQL server" right now but we should probably agree on better terminology and use that from here on out.

Reviewed 1 of 12 files at r1, 1 of 3 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @tbg)


pkg/server/server_sql.go, line 106 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I don't know about the name "pure SQL servers". I'm thinking if it'd be possible to sidestep the issue of rationalizing this comment by abstracting the statusServer in a clusterAccess interface, and have a dummy impl that returns errors for SessionRegistry().

Not sure if this is what you're suggesting, but I'm +1 to making this an interface field (we probably need the whole serverpb.StatusServer interface though, right?) and returning errors for method calls. We'll probably want to move towards having a different status server implementation depending on whether we're running in multi-tenant mode or not.


pkg/sql/temporary_schema.go, line 505 at r1 (raw file):

		// TODO(tbg): a pure SQL server still needs to clean up temporary schemas.
		// Listing sessions should not require a full status server.
		log.Infof(ctx, "early exiting temporary schema cleaner as no temporary schemas were found")

Is this log message right?

tbg added 4 commits April 21, 2020 11:58
SQL depends on the status server to list sessions and nodes. A
tenant SQL server (used in multi-tenancy) will not have access to
these.

Wrap the status server in a `(serverpb.StatusServer, bool)` wrapped
that indicates when the status server is not present, and return the
newly introduced `pgerr.UnsupportedWithMultiTenancy()` in such cases.

Over time, we'll have to take stock of the functionality lost that
way and we may have to restore some of it, however at the moment
the focus is on discovering which functionality is affected, and
turning their resolution into work items which can be handed to
the SQL team.

The intention is that looking at the callers of the above error
constructor provides an overview to that end.

Release note: None
The fields in this struct won't be available (at least initially) in a
SQL tenant server.

Release note: None
Allow starting a tenant process pointing at a node from a TestCluster.
This paves the way to writing simple unit tests for multi-tenancy
from anywhere in the codebase.

The methods that construct and start the server are almost pure code
movement, with minimal adjustments.

Release note: None
@tbg tbg requested review from andreimatei and nvb April 21, 2020 10:10
Copy link
Copy Markdown
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, right now the nomenclature is in between chairs. This is because so far all I've cared about is to untangle the SQL server from the main server; there isn't anything about tenancy yet. And yet I've named the new method StartTenant in anticipation of it gaining a tenantID parameter soon (you can think of it as having an implicit tenantID of zero at the moment).

I'll stick with "SQL tenant server" which is how we've called this in the design docs (there we mostly talk about the SQL tenant process, which is a process running a SQL tenant server). Does that work?

Thanks for the reviews! I can sense that there's a desire to polish upfront, which I can understand. I hope it's okay that I'm pushing back against that a little bit (though I'm happy to address smaller items). My "mission" here is to go through sqlOptionalArgs quickly and a) sanely but with little effort make sure we error out when they're not there, therefore b) discovering all the places we may need to fix up later, and c) putting all of them into issues at the end, and d) making sure we're not regressing during current development.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei, @asubiotto, and @nvanbenschoten)


pkg/server/server.go, line 501 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

consider authServer and friends :P

authServer is a bad choice because that shadows the type name (reason I originally made the renames). I went with sAuth as Nathan suggested.


pkg/server/server.go, line 510 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

maybe name the struct kvAccess?

That's misleading in that it's too specific. I'd like KV to refer to the KV API. This is much more than that. Will leave as is for now.


pkg/server/server_sql.go, line 106 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Not sure if this is what you're suggesting, but I'm +1 to making this an interface field (we probably need the whole serverpb.StatusServer interface though, right?) and returning errors for method calls. We'll probably want to move towards having a different status server implementation depending on whether we're running in multi-tenant mode or not.

I did consider that and explicitly decided against it. The fact that multi-tenancy is becoming a thing does not have mindshare yet, which means we need to actively "help" engineers on the team reason about what is or is not available. An interface that is sometimes just not going to work does not meet that requirement. It should be obvious when you're using a status server that it won't always be there, with a comment on when it won't (which I wrote now).

I agree that in the long run some parts of this functionality need to be recovered even when running a SQL tenant process. However, this is not the goal right now. We're quite willing to sacrifice any nonessential functionality at the moment. As far as I can tell, nothing that was lost here is required for Phase 2. And even if it was, this is exactly the work I want to take out of my/Nathan's critical path. We want to enumerate what is lost, put it all into issues, and then we can decide what needs to come back. Maybe the right way is to pass a "local-only" impl of the status server around, maybe not. That will be for the person solving this issue to figure out. (PS: I don't think this one is a blocker neither).


pkg/server/server_sql.go, line 526 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I'd try to rework this interaction between this constructor and statusServer. Lift the call to setStmtDiagnosticsRequester() up to a higher level.

I agree that something remains to be done here, but I looked a bit and it wasn't clear exactly what. Lifting it to a higher level means lifting it above newSQLServer, and I would like to avoid making the full server initialization require one-off random calls like these. That said, this isn't pretty, but it's good enough for me to move past.


pkg/server/testserver.go, line 496 at r3 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

what's "the receiver"?

The receiver is ts *TestServer. I clarified.


pkg/sql/planner.go, line 55 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

If we're planning on keeping this for a while, consider adding an serverpb.OptionalStatusServer object that has a sane zero value impl and a Get method. It would allow us to avoid boilerplate like noStatusServer.

That's a giant interface and we're not going to need this boilerplate in many places. I'm not convinced that's better. I'm also not convinced it's worse. Basically what I'm trying to do right now is to encode all of the optionality in the simplest possible way (that doesn't feel like leaving a mess behind) without going down too many rabbit holes. If I look at how we're passing this giant interface to everyone in SQL, and how circular the status server is, I am tempted to spend more time on it, but I think we oughtn't at this very moment.


pkg/sql/temporary_schema.go, line 505 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Is this log message right?

No, total BS! Thanks for catching that, fixed.

@tbg
Copy link
Copy Markdown
Member Author

tbg commented Apr 21, 2020

Going to merge this, but I'll keep touching code and am happy to respond to more comments.

bors r=andreimatei,nvanbenschoten

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 21, 2020

Build succeeded

@craig craig bot merged commit 3e6bf86 into cockroachdb:master Apr 21, 2020
@tbg tbg added the A-multitenancy Related to multi-tenancy label Apr 23, 2020
@jordanlewis
Copy link
Copy Markdown
Member

I find this StatusServer API confusing - we return false if it's not available, but every single caller is supposed to just always return exactly this special error. Why doesn't it just return the error as the second argument instead of this ok boolean?

@tbg
Copy link
Copy Markdown
Member Author

tbg commented Apr 29, 2020 via email

@jordanlewis
Copy link
Copy Markdown
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-multitenancy Related to multi-tenancy

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants