sqlinstance: optimize for MR cold-start #85737
Description
Is your feature request related to a problem? Please describe.
The sqlinstance subsystem allocates SQL instance IDs to SQL servers and disseminates the locality and network addresses of live instances among those servers. The existing startup process requires a number of synchronous WAN round-trips. The protocol proceeds as follows:
1. Create a sqlliveness session
   - Reducing WAN RPCs here is tracked in "sqlliveness: partition table so that sessions can be created without WAN RPCs" #85736.
2. Find the next available instance ID
   - This works by first scanning the complete listing of instance IDs
   - For each existing entry, the logic then checks whether the instance is alive
   - The lowest entry which is not alive becomes the current process's entry to claim
3. Claim the instance
4. Start up the data structure to track live instances
   - We used to do this asynchronously, prior to claiming an instance. This caused problems because we had no guarantee that we'd see our own instance. The DistSQL layer assumes that it can find at least its own instance.
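The ID search in the "find the next available instance ID" step can be sketched roughly as follows. This is a simplified illustration, not the actual sqlinstance code: `instanceRow`, `nextAvailableID`, and the `isAlive` callback are hypothetical names, and returning one past the highest ID when every entry is alive is an assumption the issue does not spell out.

```go
package main

import "sort"

// instanceRow is a hypothetical, simplified stand-in for a row of
// system.sql_instances; the real table has more columns.
type instanceRow struct {
	ID        int
	SessionID string
}

// nextAvailableID scans every row, checks each entry's liveness, and
// returns the lowest ID whose session is not alive. If all entries are
// alive it returns one past the highest ID (an assumption of this sketch).
// In a multi-region cluster both the full scan and the liveness checks
// may have to cross regions, which is the cost the issue wants to remove.
func nextAvailableID(rows []instanceRow, isAlive func(sessionID string) bool) int {
	sorted := append([]instanceRow(nil), rows...) // do not reorder the caller's slice
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].ID < sorted[j].ID })
	next := 1
	for _, r := range sorted {
		if !isAlive(r.SessionID) {
			return r.ID
		}
		next = r.ID + 1
	}
	return next
}
```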
Describe the solution you'd like
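We'll need to solve each of these steps somewhat distinctly. This issue tracks the work required to make step 2 (finding the next available instance ID) efficient and step 4 (starting the data structure that tracks live instances) asynchronous.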
We need the data scanned and the writes performed while claiming an instance to touch only data local to the current region. To do this, we propose partitioning the system.sql_instances table as a REGIONAL BY ROW table.
We need to make sure that we can claim an ID just by scanning the local region's partition of the table. To do this, we'll have some background process pre-populate each region with some unclaimed IDs.
We need to eliminate the code that today checks whether any claimed entries are expired. If we just eliminate this, then IDs will never be reclaimed. Given that we proposed a background process above, we can delegate the work of cleaning up expired claims to that same background task.
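A minimal in-memory sketch of this proposal, under stated assumptions, might look like the following. The `row` type, `claimLocal`, `reclaimAndRefill`, and the `target` parameter are hypothetical names invented for illustration. A real implementation would run transactions against system.sql_instances rather than mutate a slice.

```go
package main

import "errors"

// row is a hypothetical, simplified row of a region-partitioned
// instances table. An empty SessionID marks a pre-allocated, unclaimed ID.
type row struct {
	ID        int
	Region    string
	SessionID string
}

var errNoFreeID = errors.New("no unclaimed instance ID in local region")

// claimLocal claims the lowest unclaimed ID in the given region. It only
// considers rows in that region and performs no liveness checks, so a
// starting server never needs to consult remote data.
func claimLocal(table []row, region, sessionID string) (int, error) {
	best := -1
	for i := range table {
		if table[i].Region != region || table[i].SessionID != "" {
			continue
		}
		if best == -1 || table[i].ID < table[best].ID {
			best = i
		}
	}
	if best == -1 {
		return 0, errNoFreeID
	}
	table[best].SessionID = sessionID
	return table[best].ID, nil
}

// reclaimAndRefill models the background task: it releases claims whose
// session is no longer alive, then tops up every region to at least
// `target` unclaimed IDs using fresh, globally unique IDs.
func reclaimAndRefill(table []row, regions []string, target int, isAlive func(string) bool) []row {
	maxID := 0
	free := map[string]int{}
	for i := range table {
		r := &table[i]
		if r.SessionID != "" && !isAlive(r.SessionID) {
			r.SessionID = "" // expired claim becomes available again
		}
		if r.SessionID == "" {
			free[r.Region]++
		}
		if r.ID > maxID {
			maxID = r.ID
		}
	}
	for _, region := range regions {
		for free[region] < target {
			maxID++
			table = append(table, row{ID: maxID, Region: region})
			free[region]++
		}
	}
	return table
}
```

In this sketch the expensive work (liveness checks and allocating new IDs) moves entirely into `reclaimAndRefill`, which is off the startup path. `claimLocal` can still fail if a region runs out of pre-populated IDs, and the issue does not say how that case should be handled.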
In order to make step 4 asynchronous, we need to augment the implementation of sqlinstance.AddressResolver so that it knows about the current process's instance as soon as it has been claimed, and makes the rest of the addresses available asynchronously.
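One way to picture this is a resolver that is seeded with its own instance and fills in the others in the background. The sketch below is illustrative only: `resolver`, `instanceInfo`, `getInstance`, and `waitLoaded` are hypothetical names and do not reflect the actual sqlinstance.AddressResolver interface.

```go
package main

import (
	"errors"
	"sync"
)

// instanceInfo is a hypothetical, simplified description of a SQL instance.
type instanceInfo struct {
	ID   int
	Addr string
}

var errUnknownInstance = errors.New("instance not found")

// resolver always knows its own instance; other instances become
// visible once the background load completes.
type resolver struct {
	self   instanceInfo
	mu     sync.RWMutex
	others map[int]instanceInfo
	loaded chan struct{} // closed when the background load has finished
}

// newResolver returns immediately after the claim. The potentially slow
// load function (for example, a scan of the instances table) runs in a
// goroutine so that startup does not block on it.
func newResolver(self instanceInfo, load func() []instanceInfo) *resolver {
	r := &resolver{
		self:   self,
		others: map[int]instanceInfo{},
		loaded: make(chan struct{}),
	}
	go func() {
		defer close(r.loaded) // runs last, after the lock is released
		infos := load()
		r.mu.Lock()
		defer r.mu.Unlock()
		for _, in := range infos {
			r.others[in.ID] = in
		}
	}()
	return r
}

// getInstance can always resolve the local instance, which is what the
// DistSQL layer is described as relying on. Other IDs resolve only
// after they have been loaded.
func (r *resolver) getInstance(id int) (instanceInfo, error) {
	if id == r.self.ID {
		return r.self, nil
	}
	r.mu.RLock()
	defer r.mu.RUnlock()
	if in, ok := r.others[id]; ok {
		return in, nil
	}
	return instanceInfo{}, errUnknownInstance
}

// waitLoaded blocks until the initial background load has completed.
func (r *resolver) waitLoaded() { <-r.loaded }
```

Callers that need a complete view can call `waitLoaded` first, while callers that only need the local instance never block.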
Additional context
This will rely on being able to partition system database tables using MR syntax. This is a part of #85612.
Epic: CRDB-18596
Jira issue: CRDB-18411