Skip to content

sql: likely incorrect locking pattern in ConcurrentBufferGuard (contention mgt), possibly leading to rare deadlocks #83080

@knz

Description

@knz

Describe the problem

In sql/contention/contentionutils/concurrent_buffer_guard.go we see the following pattern:

func (c *ConcurrentBufferGuard) syncRLocked() {
  // We upgrade the read-lock to a write-lock, then when we are done flushing,
  // the lock is downgraded to a read-lock.
  c.flushSyncLock.RUnlock()
  defer c.flushSyncLock.RLock()
  c.flushSyncLock.Lock()
  defer c.flushSyncLock.Unlock()
  c.syncLocked()
}

The call to RUnlock() is followed by Lock(). In between the two the mutex is not held.
This syncRLocked() function is called from AtomicWrite, which (as the name implies) is meant to do its operation atomically.
Because of the pattern in syncRLocked(), the atomicity is lost.

The result is that contention information may be lost, or the execution could deadlock.

Expected behavior

The logic in ConcurrentBufferGuard should be rewritten to avoid this loss of atomicity, possibly using a channel instead.

Jira issue: CRDB-16850

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-sql-observabilityRelated to observability of the SQL layerC-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.T-observability

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions