Skip to content

ccl/sqlproxyccl: cleanup balancer and connection tracker#79857

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
jaylim-crl:jay/220412-cleanup-balancer
Apr 13, 2022
Merged

ccl/sqlproxyccl: cleanup balancer and connection tracker#79857
craig[bot] merged 1 commit intocockroachdb:masterfrom
jaylim-crl:jay/220412-cleanup-balancer

Conversation

@jaylim-crl
Copy link
Copy Markdown
Contributor

This is a follow up to #79725. Previously, there were two issues:

  1. Some of the ConnTracker APIs (e.g. GetConns and OnDisconnect) were forcing
    creation of tenant entries even though that was unnecessary.
  2. We were holding onto a large portion of memory (with connections from all
    tenants).

This commit addresses the two issues above: (1) tenant entries are no longer
being created when that was unnecessary, and (2) instead of holding onto all
the connections at once, we will retrieve the tenant's connections one by one.
At the same time, to avoid extra allocations that are solely used during data
transformations, we update the ConnTracker API to return a list of connection
handles indexed by pod's address directly.

Note that we may reintroduce the data transformation in future PRs, depending
on how we implement the connection rebalancing logic. That is currently
unclear.

Release note: None

@jaylim-crl jaylim-crl requested review from a team as code owners April 12, 2022 19:30
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@jaylim-crl jaylim-crl requested review from andy-kimball and jeffswenson and removed request for a team April 12, 2022 19:30
@jaylim-crl
Copy link
Copy Markdown
Contributor Author

re: #79725 (review)

TFTR!


Previously, andy-kimball (Andy Kimball) wrote…

NIT: getAllConns => GetAllCons

Done.


Previously, andy-kimball (Andy Kimball) wrote…

There are a lot of objects being created here, both in GetAllConns and in the below loop. Are you sure there is no clean way to reduce the number of transformations from one kind of map to another simply in order to iterate?

It probably doesn't matter, since we only run this every 30 seconds, but if there's something relatively easy/cheap, it might be worth looking for. For example, maybe there'd be a ConnTracker.GetTenants that returns a slice of tenants with connections. Then you'd iterate over that and call a GetConnsByPod method to get that tenant's pods, indexed by address. This would avoid allocating large objects, would minimize the scope of locks, and would avoid extra allocations to copy objects from map to map.

One other nit I noticed when looking at ConnTracker: the GetConns method is forcing creation of a tenant entry, which doesn't seem right, if the tenantID is not already present in the tracker.

Done. Good point about the extra allocations and transformations. re: scope of locks, that may not be relevant here since the previous design obtained a snapshot of tenants before copying the individual tenant conns, so we're not holding onto the lock for a long time.


Previously, andy-kimball (Andy Kimball) wrote…

NIT: Should the DRAINING case be extracted as a separate function that will eventually live alongside a function that looks for connections to overloaded pods?

Hm, it's currently unclear how the other function will look like yet, so I left this as a TODO in the other PR, but consider this done. We will extract this out once we have a clearer idea.

Copy link
Copy Markdown
Contributor Author

@jaylim-crl jaylim-crl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball and @jeffswenson)


pkg/ccl/sqlproxyccl/balancer/balancer.go, line 314 at r1 (raw file):

		}

		connMap := b.connTracker.GetConnsMap(tenantID)

One aspect to point out about this design is that we're now obtaining a lock each time we do this (to retrieve *tenantEntry), unlike the previous approach, where we're only obtaining the lock once. That said, these operations should be in the order of nanoseconds, so it should not matter much.

@jaylim-crl jaylim-crl changed the title ccl/sqlproxycc: cleanup balancer and connection tracker ccl/sqlproxyccl: cleanup balancer and connection tracker Apr 12, 2022
@jaylim-crl jaylim-crl force-pushed the jay/220412-cleanup-balancer branch 2 times, most recently from c3a5324 to 246fe83 Compare April 12, 2022 20:05
This is a follow up to cockroachdb#79725. Previously, there were two issues:
1. Some of the ConnTracker APIs (e.g. GetConns and OnDisconnect) were forcing
   creation of tenant entries even though that was unnecessary.
2. We were holding onto a large portion of memory (with connections from all
   tenants).

This commit addresses the two issues above: (1) tenant entries are no longer
being created when that was unnecessary, and (2) instead of holding onto all
the connections at once, we will retrieve the tenant's connections one by one.
At the same time, to avoid extra allocations that are solely used during data
transformations, we update the ConnTracker API to return a list of connection
handles indexed by pod's address directly.

Note that we may reintroduce the data transformation in future PRs, depending
on how we implement the connection rebalancing logic. That is currently
unclear.

Release note: None
@jaylim-crl jaylim-crl force-pushed the jay/220412-cleanup-balancer branch from 246fe83 to 4f6b777 Compare April 13, 2022 17:36
Copy link
Copy Markdown
Collaborator

@jeffswenson jeffswenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jaylim-crl
Copy link
Copy Markdown
Contributor Author

TFTR!

bors r=JeffSwenson

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Apr 13, 2022

Build succeeded:

@craig craig bot merged commit ba9baca into cockroachdb:master Apr 13, 2022
@jaylim-crl jaylim-crl deleted the jay/220412-cleanup-balancer branch April 13, 2022 22:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants