Skip to content

sql/flowinfra: don't send ConsumerSignal during FlowRegistry registration while holding lock #72964

@nvb

Description

@nvb

The largest source of mutex contention in TPC-E is in a location I haven't seen before. During a test, we see over 25% of time spent blocking on mutexes process-wide in FlowRegistry.ConnectInboundStream, which is responsible for registering an incoming stream with the flow registry:

Screen Shot 2021-11-18 at 10 18 17 PM

Looking at the code, it's not hard to see why. FlowRegistry.ConnectInboundStream grabs a registry-wide (i.e. process-wide) exclusive lock and proceeds to register the flow with the registry. So far, this seems reasonable.

However, it then constructs and sends at least one (and sometimes two) *ConsumerSignal messages on a gRPC stream. So under a process-wide exclusive lock, we're interacting with gRPC. Granted this is a gRPC stream so it is usually non-blocking, but it's still expensive.

We should try to restructure this code to pull any interactions with gRPC outside of the locking. Ideally, the locking would surround just the portion of the code that requires mutual exclusion and nothing else. I believe this is just the interactions with the FlowRegistry.flows map and the corresponding flowEntry.

cc. @yuzefovich do you have ideas about how we can improve the structure of this function?

Metadata

Metadata

Assignees

Labels

A-sql-executionRelating to SQL execution.C-performancePerf of queries or internals. Solution not expected to change functional behavior.

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions