stmtdiagnostics: seemingly a deadlock in spanlatch.(*Manager).wait #119593
Closed
Labels
C-test-failure: Broken test (automatically or manually discovered).
T-sql-queries: SQL Queries Team
Description
Extracted from this failure of TestDiagnosticsRequest:
* INFO: slow quiesce. stack traces:
...
test_server_shim.go:157: automatically injected a shared process virtual cluster under test; see comment at top of test_server_shim.go for details.
...
goroutine 3547 [chan receive (nil chan), 4 minutes]:
github.com/cockroachdb/cockroach/pkg/util/timeutil.(*Timer).Reset(0xc0062988d0, 0xc002ad16f0?)
github.com/cockroachdb/cockroach/pkg/util/timeutil/timer.go:83 +0x5a
github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanlatch.(*Manager).wait(0xc004499b00?, {0x65c13b8, 0xc00163f290}, 0xc003f40400, {{{{0x0, 0x0}, {0x0, 0x0}}, {{0x0, 0x0}, ...}}})
github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanlatch/manager.go:499 +0x8f
github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanlatch.(*Manager).Acquire(0x4322203a2265756c?, {0x65c13b8, 0xc00163f290}, 0x51c9a46?, 0x22ee95b?, {0x6584680?, 0xc003c19320?})
github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanlatch/manager.go:251 +0x1a5
github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency.(*latchManagerImpl).Acquire(0xc00449f2c0?, {0x65c13b8?, 0xc00163f290?}, {0x0, {0x17b68ddcf017e2ae, 0x0}, 0x0, 0x0, 0x0, 0x0, ...})
github.com/cockroachdb/cockroach/pkg/kv/kvserver/concurrency/latch_manager.go:29 +0x45
The test failed during server shutdown. To me it looks like a deadlock: the timer in question is set to 15 seconds, yet the goroutine above has been blocked for 4 minutes.
I think stmtdiagnostics.Registry.poll, at least, uses timeutil.Timer in a problematic way. We allocate the timer locally, but calling Stop on it puts it into the sync.Pool. We then keep using the same timer, so once other code takes that object from the pool, both users can access the same timer object concurrently, which is invalid.
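To illustrate the suspected pattern, here is a minimal, single-goroutine sketch. It does not use the real timeutil.Timer: `pooledTimer`, `newTimer`, `freeList`, and `aliasedAfterStop` are hypothetical names, and a plain slice stands in for sync.Pool so that the reuse is deterministic. The only behavior taken from the description above is that Stop hands the timer back to a pool.

```go
package main

import "fmt"

// pooledTimer is a hypothetical stand-in for a pool-backed timer.
// It is NOT the real timeutil.Timer; owner just records who armed it last.
type pooledTimer struct {
	owner string
}

// freeList stands in for sync.Pool (an assumption made for determinism:
// sync.Pool does not guarantee which object Get returns).
var freeList []*pooledTimer

// newTimer returns a recycled timer if one is available, else a fresh one.
func newTimer(owner string) *pooledTimer {
	if n := len(freeList); n > 0 {
		t := freeList[n-1]
		freeList = freeList[:n-1]
		t.owner = owner
		return t
	}
	return &pooledTimer{owner: owner}
}

// Reset arms the timer on behalf of owner.
func (t *pooledTimer) Reset(owner string) { t.owner = owner }

// Stop disarms the timer and releases it to the free list.
// After Stop, the caller is no longer supposed to touch t.
func (t *pooledTimer) Stop() {
	t.owner = ""
	freeList = append(freeList, t)
}

// aliasedAfterStop reproduces the suspected misuse: a locally allocated
// timer is stopped (and thereby pooled), then used again while some other
// caller has obtained the very same object from the pool.
func aliasedAfterStop() bool {
	var poller pooledTimer // allocated locally, as in the polling loop
	poller.Reset("poller")
	poller.Stop() // &poller now sits in the free list

	other := newTimer("other") // unrelated code receives &poller
	poller.Reset("poller")     // the loop keeps using its local timer

	// Both users now share one object, and the second user's state
	// has been overwritten.
	return other == &poller && other.owner == "poller"
}

func main() {
	fmt.Println("aliased:", aliasedAfterStop())
}
```

In this sketch the two users run sequentially, so the aliasing is merely visible. With real goroutines the same sharing would be unsynchronized concurrent access to one timer. One possible remedy, assuming Stop keeps its pooling behavior, is to stop using a timer after calling Stop and obtain a fresh one for the next iteration.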
Jira issue: CRDB-36234