storage: explore using a goroutine pool for async intent resolution #30780
Description
Async intent resolution can be moderately costly for write-heavy workloads with secondary indices. For instance, in sysbench's insert workload, we've seen that cleanupTxnIntentsAsync consumes about 5% of CPU.
We should explore replacing the individual async goroutine that each EndTxn batch kicks off with a fixed-size goroutine pool, in the same vein as the raftScheduler. EndTxn batches would simply hand their off-range intents over to the pool before returning. A major benefit is that a single async intent resolution attempt could clean up the intents of multiple txns at once. It would also avoid creating a short-lived goroutine per txn.
One thing we would need to consider is how our current backpressure mechanism would translate to a world with a fixed-size goroutine pool instead of semaphore-limited async tasks. We would need some kind of queue to communicate txn intents to the pool, so perhaps we could backpressure when this queue grows too large.
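To make the queue idea concrete, here is a rough sketch of what a fixed-size pool fed by a bounded queue might look like. It is not CockroachDB code: `txnIntents`, `intentPool`, `trySubmit`, and the `resolve` callback are hypothetical names invented for illustration, and the queue size and batch size are arbitrary. A full queue is reported to the caller, which is where backpressure could be applied, and each worker drains several queued txns into one batch before resolving.

```go
package main

import (
	"fmt"
	"sync"
)

// txnIntents is a hypothetical stand-in for the off-range intents
// left behind by a single transaction.
type txnIntents struct {
	txnID   string
	intents []string // keys of the intents to resolve
}

// intentPool is a hypothetical fixed-size worker pool fed by a bounded queue.
type intentPool struct {
	queue   chan txnIntents
	resolve func(batch []txnIntents)
	wg      sync.WaitGroup
}

// newIntentPool starts `workers` goroutines that consume from a queue
// holding at most `queueSize` pending txns.
func newIntentPool(workers, queueSize, maxBatch int, resolve func([]txnIntents)) *intentPool {
	p := &intentPool{queue: make(chan txnIntents, queueSize), resolve: resolve}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go p.worker(maxBatch)
	}
	return p
}

// trySubmit enqueues without blocking. A false return means the queue is
// full, so the caller should apply backpressure (for example, resolve
// synchronously or wait).
func (p *intentPool) trySubmit(t txnIntents) bool {
	select {
	case p.queue <- t:
		return true
	default:
		return false
	}
}

// worker takes one txn from the queue, then opportunistically drains up to
// maxBatch-1 more so that intents of multiple txns are resolved together.
func (p *intentPool) worker(maxBatch int) {
	defer p.wg.Done()
	for first := range p.queue {
		batch := []txnIntents{first}
	drain:
		for len(batch) < maxBatch {
			select {
			case t, ok := <-p.queue:
				if !ok {
					break drain // queue closed and empty
				}
				batch = append(batch, t)
			default:
				break drain // nothing else queued right now
			}
		}
		p.resolve(batch)
	}
}

// close stops accepting work and waits for the workers to finish the queue.
func (p *intentPool) close() {
	close(p.queue)
	p.wg.Wait()
}

func main() {
	var mu sync.Mutex
	resolved := 0
	pool := newIntentPool(2, 8, 4, func(batch []txnIntents) {
		mu.Lock()
		resolved += len(batch)
		mu.Unlock()
	})
	for i := 0; i < 5; i++ {
		t := txnIntents{txnID: fmt.Sprintf("txn-%d", i), intents: []string{"k"}}
		if !pool.trySubmit(t) {
			fmt.Println("queue full, backpressure needed for", t.txnID)
		}
	}
	pool.close()
	fmt.Println("txns resolved:", resolved)
}
```

Whether a full queue should block the EndTxn, fall back to synchronous resolution, or drop the work is left open here; the sketch only shows where that decision would be made.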
A downside to this approach is that all workers in the pool could get stuck trying to resolve intents on an unavailable range, which would prevent intents on any other range from being resolved.
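The failure mode can be reproduced with a small, self-contained simulation. Everything here is hypothetical (`resolveTask`, `stuckPoolDemo`, the range IDs); a request to the unavailable range is modeled as a receive on a channel that stays blocked until the range "recovers". Once every worker holds such a request, a task for a healthy range sits in the queue untouched.

```go
package main

import (
	"fmt"
	"sync"
)

// resolveTask is a hypothetical unit of work: resolve intents on one range.
type resolveTask struct {
	rangeID int
}

const (
	unavailableRange = 1 // requests to this range hang until released
	healthyRange     = 2
)

// stuckPoolDemo fills every worker with a task for the unavailable range,
// then enqueues one task for a healthy range. It returns how many tasks were
// still queued while all workers were stuck, and how many tasks were
// resolved in total once the unavailable range recovered.
func stuckPoolDemo(workers int) (queuedWhileStuck, resolvedTotal int) {
	queue := make(chan resolveTask, 16)
	started := make(chan struct{}, workers)
	release := make(chan struct{}) // closed when the range becomes available
	var mu sync.Mutex
	var wg sync.WaitGroup
	resolved := 0

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range queue {
				if t.rangeID == unavailableRange {
					started <- struct{}{}
					<-release // simulates a request that hangs
				}
				mu.Lock()
				resolved++
				mu.Unlock()
			}
		}()
	}

	// One hanging task per worker: each worker blocks on its first task.
	for i := 0; i < workers; i++ {
		queue <- resolveTask{rangeID: unavailableRange}
	}
	for i := 0; i < workers; i++ {
		<-started
	}

	// Every worker is now stuck, so this healthy-range task cannot be picked up.
	queue <- resolveTask{rangeID: healthyRange}
	queuedWhileStuck = len(queue)

	close(release) // the unavailable range recovers
	close(queue)
	wg.Wait()
	return queuedWhileStuck, resolved
}

func main() {
	queued, resolved := stuckPoolDemo(2)
	fmt.Println("queued while all workers were stuck:", queued)
	fmt.Println("resolved after recovery:", resolved)
}
```

In the simulation the healthy-range task is resolved only after the unavailable range recovers, which is the starvation described above.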