ccl/sqlproxyccl: add support for moving connections away from DRAINING pods#79725
Conversation
1038ef1 to
1298f62
Compare
…G pods In cockroachdb#79346, we added a rebalancer queue for connection rebalancing. This commit adds support for transferring connections away from DRAINING pods. The rebalance loop runs once every 30 seconds for now, and connections will only be moved away from DRAINING pods if the pod has been draining for at least 1 minute. At the same time, we also fix an enqueue bug on the rebalancer queue where we're releasing the semaphore in the case of an update, which is incorrect. Release note: None
1298f62 to
2efd260
Compare
|
TFTR! (and rebased) |
|
bors r=JeffSwenson |
andy-kimball
left a comment
There was a problem hiding this comment.
, very nice work. I've just got some nits and suggestions.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball and @jaylim-crl)
pkg/ccl/sqlproxyccl/balancer/balancer.go, line 281 at r2 (raw file):
// coming from the pod watcher. func (b *Balancer) rebalance(ctx context.Context) { // getAllConns ensures that tenants will have at least one connection.
NIT: getAllConns => GetAllCons
pkg/ccl/sqlproxyccl/balancer/balancer.go, line 282 at r2 (raw file):
func (b *Balancer) rebalance(ctx context.Context) { // getAllConns ensures that tenants will have at least one connection. tenantConns := b.connTracker.GetAllConns()
There are a lot of objects being created here, both in GetAllConns and in the below loop. Are you sure there is no clean way to reduce the number of transformations from one kind of map to another simply in order to iterate?
It probably doesn't matter, since we only run this every 30 seconds, but if there's something relatively easy/cheap, it might be worth looking for. For example, maybe there'd be a ConnTracker.GetTenants that returns a slice of tenants with connections. Then you'd iterate over that and call a GetConnsByPod method to get that tenant's pods, indexed by address. This would avoid allocating large objects, would minimize the scope of locks, and would avoid extra allocations to copy objects from map to map.
One other nit I noticed when looking at ConnTracker: the GetConns method is forcing creation of a tenant entry, which doesn't seem right, if the tenantID is not already present in the tracker.
pkg/ccl/sqlproxyccl/balancer/balancer.go, line 335 at r2 (raw file):
} // Transfer all connections in DRAINING pods.
NIT: Should the DRAINING case be extracted as a separate function that will eventually live alongside a function that looks for connections to overloaded pods?
|
Build succeeded: |
This is a follow up to cockroachdb#79725. Previously, there were two issues: 1. Some of the ConnTracker APIs (e.g. GetConns and OnDisconnect) were forcing creation of tenant entries even though that was unnecessary. 2. We were holding onto a large portion of memory (with connections from all tenants). This commit addresses the two issues above: (1) tenant entries are no longer being created when that was unnecessary, and (2) instead of holding onto all the connections at once, we will retrieve the tenant's connections one by one. At the same time, to avoid extra allocations that are solely used during data transformations, we update the ConnTracker API to return a list of connection handles indexed by pod's address directly. Note that we may reintroduce the data transformation in future PRs, depending on how we implement the connection rebalancing logic. That is currently unclear. Release note: None
jaylim-crl
left a comment
There was a problem hiding this comment.
TFTR! Addressed all the nits in #79857.
Reviewable status:
complete! 1 of 0 LGTMs obtained
pkg/ccl/sqlproxyccl/balancer/balancer.go, line 281 at r2 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
NIT: getAllConns => GetAllCons
Done.
pkg/ccl/sqlproxyccl/balancer/balancer.go, line 282 at r2 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
There are a lot of objects being created here, both in GetAllConns and in the below loop. Are you sure there is no clean way to reduce the number of transformations from one kind of map to another simply in order to iterate?
It probably doesn't matter, since we only run this every 30 seconds, but if there's something relatively easy/cheap, it might be worth looking for. For example, maybe there'd be a
ConnTracker.GetTenantsthat returns a slice of tenants with connections. Then you'd iterate over that and call aGetConnsByPodmethod to get that tenant's pods, indexed by address. This would avoid allocating large objects, would minimize the scope of locks, and would avoid extra allocations to copy objects from map to map.One other nit I noticed when looking at ConnTracker: the
GetConnsmethod is forcing creation of a tenant entry, which doesn't seem right, if thetenantIDis not already present in the tracker.
Done. Good point about the extra allocations and transformations. re: scope of locks, that may not be relevant here since the previous design obtained a snapshot of tenants before copying the individual tenant conns, so we're not holding onto the lock for a long time.
pkg/ccl/sqlproxyccl/balancer/balancer.go, line 335 at r2 (raw file):
Previously, andy-kimball (Andy Kimball) wrote…
NIT: Should the DRAINING case be extracted as a separate function that will eventually live alongside a function that looks for connections to overloaded pods?
Hm, it's currently unclear how the other function will look like yet, so I left this as a TODO in the other PR, but consider this done. We will extract this out once we have a clearer idea.
This is a follow up to cockroachdb#79725. Previously, there were two issues: 1. Some of the ConnTracker APIs (e.g. GetConns and OnDisconnect) were forcing creation of tenant entries even though that was unnecessary. 2. We were holding onto a large portion of memory (with connections from all tenants). This commit addresses the two issues above: (1) tenant entries are no longer being created when that was unnecessary, and (2) instead of holding onto all the connections at once, we will retrieve the tenant's connections one by one. At the same time, to avoid extra allocations that are solely used during data transformations, we update the ConnTracker API to return a list of connection handles indexed by pod's address directly. Note that we may reintroduce the data transformation in future PRs, depending on how we implement the connection rebalancing logic. That is currently unclear. Release note: None
79477: ccl/sqlproxyccl: add server name indication (SNI) support r=darinpp a=darinpp Previously the proxy supported two ways of providing tenant id and cluster name information. One was through the database name. The second was through the options parameter. As part of the serverless routing changes, we want to support routing with a tenant id passed through SNI. This PR adds the ability to route using SNI info. Release justification: Low risk, high reward changes to existing functionality Release note: None 79629: clisqlshell: implement `COPY ... FROM STDIN` for CLI r=otan a=otan Resolves #16392 Steps: * Add a lower level API to lib/pq for use. * Add some abstraction boundary breakers in `clisqlclient` that allow a lower level handling of the COPY protocol. * Altered the state machine in `clisqlshell` to account for copy. Release note (cli change): COPY ... FROM STDIN now works from the cockroach CLI. It is not supported inside transactions. 79677: sql: session variable to allow multiple modification subqueries of table r=rytaft a=michae2 Add a new session variable, enable_multiple_modifications_of_table, which can be used instead of sql.multiple_modifications_of_table.enabled to allow execution of statements with multiple modification subqueries of the same table. Instead of making the original cluster setting the GlobalDefault of this new session setting, the original cluster setting is kept in the optbuilder logic. This is to avoid breaking applications that are already toggling the cluster setting mid-session to allow statements. Fixes: #76261 Release note (sql change): Add a new session variable, enable_multiple_modifications_of_table, which can be used instead of cluster variable sql.multiple_modifications_of_table.enabled to allow statements containing multiple INSERT ON CONFLICT, UPSERT, UPDATE, or DELETE subqueries modifying the same table. Note that underlying issue 70731 is not fixed. As with sql.multiple_modifications_of_table.enabled, be warned that with this session variable enabled there is nothing to prevent the table corruption seen in issue 70731 from occuring if the same row is modified multiple times by different subqueries of a single statment. It's best to rewrite these statements, but the session variable is provided as an aid if this is not possible. 79697: sql: ensure descriptors versions change only once, fix related bugs r=ajwerner a=ajwerner The first commit adds back the change from #79580. The second commit fixes fallout. As soon as #79580 merged, it tickled flakes. These flakes were caused by operations to alter the database which would build new mutable tables from immutable tables and thus overwrite existing entries. The fix here is to retrieve the proper descriptors using the collection, and to make sure that those descriptors are properly hydrated. This, in turn, revealed that our policy checking in validation to avoid using descriptors which had been already checked out was too severe and would cause excessive numbers of extra round-trip. I'll be honest, I haven't fully internalized why that policy was there. I supposed it was there to ensure that we didn't have a case where we check out a descriptor and then fail to write it to the store. I don't exactly know what to do to re-establish the desired behavior of the code. At the very least, it feels like if we did re-read the descriptor once, then we should do something about resetting the status. I guess I could try to unravel the mystery leading to the checkout in the first place. The test is very flakey without this patch. The third commit sets `AvoidLeased` flags which ought to be set. It's sad that they ought to be set and that it's possible that there were bugs due to the flag not being set. Backport will address #79704. Release note: None 79857: ccl/sqlproxyccl: cleanup balancer and connection tracker r=JeffSwenson a=jaylim-crl This is a follow up to #79725. Previously, there were two issues: 1. Some of the ConnTracker APIs (e.g. GetConns and OnDisconnect) were forcing creation of tenant entries even though that was unnecessary. 2. We were holding onto a large portion of memory (with connections from all tenants). This commit addresses the two issues above: (1) tenant entries are no longer being created when that was unnecessary, and (2) instead of holding onto all the connections at once, we will retrieve the tenant's connections one by one. At the same time, to avoid extra allocations that are solely used during data transformations, we update the ConnTracker API to return a list of connection handles indexed by pod's address directly. Note that we may reintroduce the data transformation in future PRs, depending on how we implement the connection rebalancing logic. That is currently unclear. Release note: None 79874: ccl/sqlproxyccl: add metrics to track the number of rebalance requests r=JeffSwenson a=jaylim-crl This commit addresses a TODO by adding metrics to track the number of rebalance requests (total, queued, and running). The following metrics are added: - proxy.balancer.rebalance.running - proxy.balancer.rebalance.queued - proxy.balancer.rebalance.total Release note: None Co-authored-by: Darin Peshev <darinp@gmail.com> Co-authored-by: Oliver Tan <otan@cockroachlabs.com> Co-authored-by: Michael Erickson <michae2@cockroachlabs.com> Co-authored-by: Andrew Werner <awerner32@gmail.com> Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com> Co-authored-by: Jay <jay@cockroachlabs.com>
|
blathers backport 22.1 |
This is a follow up to cockroachdb#79725. Previously, there were two issues: 1. Some of the ConnTracker APIs (e.g. GetConns and OnDisconnect) were forcing creation of tenant entries even though that was unnecessary. 2. We were holding onto a large portion of memory (with connections from all tenants). This commit addresses the two issues above: (1) tenant entries are no longer being created when that was unnecessary, and (2) instead of holding onto all the connections at once, we will retrieve the tenant's connections one by one. At the same time, to avoid extra allocations that are solely used during data transformations, we update the ConnTracker API to return a list of connection handles indexed by pod's address directly. Note that we may reintroduce the data transformation in future PRs, depending on how we implement the connection rebalancing logic. That is currently unclear. Release note: None
In #79346, we added a rebalancer queue for connection rebalancing. This commit
adds support for transferring connections away from DRAINING pods. The
rebalance loop runs once every 30 seconds for now, and connections will only be
moved away from DRAINING pods if the pod has been draining for at least 1
minute.
At the same time, we also fix an enqueue bug on the rebalancer queue where
we're releasing the semaphore in the case of an update, which is incorrect.
Release note: None