-
Notifications
You must be signed in to change notification settings - Fork 4.1k
sql/flowinfra: don't send ConsumerSignal during FlowRegistry registration while holding lock #72964
Description
The largest source of mutex contention in TPC-E is in a location I haven't seen before. During a test, we see over 25% of time spent blocking on mutexes process-wide in FlowRegistry.ConnectInboundStream, which is responsible for registering an incoming stream with the flow registry:
Looking at the code, it's not hard to see why. FlowRegistry.ConnectInboundStream grabs a registry-wide (i.e. process-wide) exclusive lock and proceeds to register the flow with the registry. So far, this seems reasonable.
However, it then constructs and sends at least one (and sometimes two) *ConsumerSignal messages on a gRPC stream. So under a process-wide exclusive lock, we're interacting with gRPC. Granted this is a gRPC stream so it is usually non-blocking, but it's still expensive.
We should try to restructure this code to pull any interactions with gRPC outside of the locking. Ideally, the locking would surround just the portion of the code that requires mutual exclusion and nothing else. I believe this is just the interactions with the FlowRegistry.flows map and the corresponding flowEntry.
cc. @yuzefovich do you have ideas about how we can improve the structure of this function?
Metadata
Metadata
Assignees
Labels
Type
Projects
Status
