Skip to content

kv, kvserver: add query locks command#76998

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
AlexTalks:cmd_query_locks
Mar 15, 2022
Merged

kv, kvserver: add query locks command#76998
craig[bot] merged 1 commit intocockroachdb:masterfrom
AlexTalks:cmd_query_locks

Conversation

@AlexTalks
Copy link
Copy Markdown
Contributor

@AlexTalks AlexTalks commented Feb 24, 2022

This change adds a new KV API, QueryLocks, allowing a client to query
the state of a replica's lock table over a provided key span. This
exposes the in-memory locks currently managed by Concurrency Control,
and is part of the work needed to implement #75541.

Resolves #74833.

Release note: None

Release justification: Low-risk new functionality.

@AlexTalks AlexTalks requested a review from a team as a code owner February 24, 2022 19:06
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's exciting to see this all falling into place.

Reviewed 5 of 5 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @nvanbenschoten)


pkg/kv/kvserver/replica_read.go, line 373 at r3 (raw file):

			// and becomes the inclusive start key of what was read.
			header.Key = t.ResumeSpan.EndKey
		case *roachpb.QueryLocksResponse:

This function is used for optimistic evaluation of MVCC operations. It shouldn't apply to a metadata operation like QueryLocksRequest.


pkg/kv/kvserver/batcheval/cmd_query_locks.go, line 36 at r3 (raw file):

	maxOffset time.Duration,
) {
	DefaultDeclareKeys(rs, header, req, latchSpans, lockSpans, maxOffset)

Do we want to acquire read latches across the key span being observed? I didn't think we needed to.


pkg/kv/kvserver/batcheval/cmd_query_locks.go, line 41 at r3 (raw file):

}

func QueryLocks(

We should add some kv-level testing for this request type.


pkg/roachpb/api.go, line 667 at r3 (raw file):

// Method implements the Request interface.
func (*QueryLocksRequest) Method() Method { return QueryLocks }

We'll want to implement the combinable interface for this type earlier in this file so that requests that are split on range boundaries can have their responses merged back together.

@AlexTalks AlexTalks requested a review from a team March 9, 2022 20:03
Copy link
Copy Markdown
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @nvanbenschoten)


pkg/kv/kvserver/batcheval/cmd_query_locks.go, line 36 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we want to acquire read latches across the key span being observed? I didn't think we needed to.

Perhaps not necessary, but is there any reason we might want to? Perhaps to not race with operations on these keys that may acquire locks? But let me know if we think this feature shouldn't attempt to block that during evaluation...


pkg/kv/kvserver/batcheval/cmd_query_locks.go, line 41 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We should add some kv-level testing for this request type.

Working - should finish this shortly

@AlexTalks AlexTalks requested a review from a team as a code owner March 10, 2022 04:14
Copy link
Copy Markdown
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 23 files at r4, 10 of 17 files at r5, 6 of 10 files at r6.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @nvanbenschoten)

Copy link
Copy Markdown
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @nvanbenschoten)


pkg/kv/kvserver/client_split_test.go, line 403 at r8 (raw file):

// create transactions that hold locks on keys across the ranges, and then
// utilize the QueryLocks API to validate the exposed lock table information.
func TestStoreRangeSplitAndQueryLocks(t *testing.T) {

FYI, Still not entirely sure this is the correct place for this test, especially because it feels like it's using different approaches to the KV APIs (I.e. most things in this module use a Store / store.TestSender(), whereas what we want to test is sending a batch with QueryLocksRequest using the DistSender). I think this test does work properly, but please let me know if it should be moved and/or cleaned up at all!

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 17 files at r5, 3 of 10 files at r6, 1 of 3 files at r7, 5 of 5 files at r8, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks)


pkg/kv/kvserver/client_split_test.go, line 403 at r8 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

FYI, Still not entirely sure this is the correct place for this test, especially because it feels like it's using different approaches to the KV APIs (I.e. most things in this module use a Store / store.TestSender(), whereas what we want to test is sending a batch with QueryLocksRequest using the DistSender). I think this test does work properly, but please let me know if it should be moved and/or cleaned up at all!

I think this is a fine place for the test.

But CI is indicating that the test is flaky. We may just need to bump the 10ms up to testutils.DefaultSucceedsSoonDuration. Or remove the manual timeout entirely and defer to the test's timeout. That's what you did in the other test.


pkg/kv/kvserver/replica_test.go, line 11250 at r8 (raw file):

			require.Equal(t, roachpb.Key("a"), lockInfo.Key)
			require.NotNil(t, lockInfo.LockHolder, "expected lock to be held")
			if !lockInfo.LockHolder.ID.Equal(txn.ID) || !lockInfo.LockHolder.WriteTimestamp.Equal(txn.WriteTimestamp) {

minor nit: any reason to not use a pair of require.Equal calls here (and below)?

Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:LGTM: mod the deadlines in one of the tests

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @AlexTalks)


pkg/kv/kvserver/concurrency/concurrency_control.go, line 467 at r8 (raw file):

	ResumeReason    roachpb.ResumeReason
	ResumeNextBytes int64
	TotalBytes      int64

please comment this new field, and also on ResumeNextBytes


pkg/roachpb/api.proto, line 1118 at r8 (raw file):

// A QueryLocksRequest is arguments to the QueryLocks() method.  It requests
// the state of the locks held in the lock table over the given key span,

consider putting some words here about what locks are in the lock table and what locks aren't


pkg/roachpb/api.proto, line 1125 at r8 (raw file):

// for example in the case of an unreplicated lock.

I think this "for example" is confusing. If you want to say anything, I think you need to say more to make the connection with replicated vs unreplicated locks; you could say that replicated locks are not part of the lock table if they were never contended. I think you should also say that only locks in the lock table are returned, regardless of this flag (right?).


pkg/kv/kvserver/client_split_test.go, line 403 at r8 (raw file):

// create transactions that hold locks on keys across the ranges, and then
// utilize the QueryLocks API to validate the exposed lock table information.
func TestStoreRangeSplitAndQueryLocks(t *testing.T) {

nit: make the name of this test start with TestQueryLocks. Maybe TestQueryLocksAcrossRanges. The fact that this test "does splits" is not very relevant.


pkg/kv/kvserver/client_split_test.go, line 403 at r8 (raw file):

I think this is a fine place for the test.

+1


pkg/kv/kvserver/client_split_test.go, line 412 at r8 (raw file):

			Store: &kvserver.StoreTestingKnobs{
				DisableMergeQueue: true,
				DisableSplitQueue: true,

how come DisableSplitQueue is needed? Consider commenting.


pkg/kv/kvserver/client_split_test.go, line 480 at r8 (raw file):

	case err := <-scanRes:
		t.Fatalf("scan did not wait on locks, returned err %v", err)
	case <-time.After(10 * time.Millisecond):

Is this assuming that the scan will block in 10ms? This seems like a recipe for flakiness. Let's use some sort of a knob to get notified when the scan blocks.


pkg/kv/kvserver/client_split_test.go, line 529 at r8 (raw file):

	case err := <-scanRes:
		require.NoError(t, err, "txn3.Scan() should have finished without error after txns unlock")
	case <-time.After(10 * time.Millisecond):

I would make this deadline a lot larger, or remove it completely.

Copy link
Copy Markdown
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)


pkg/kv/kvserver/concurrency/concurrency_control.go, line 467 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

please comment this new field, and also on ResumeNextBytes

Will do - sg


pkg/roachpb/api.proto, line 1125 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

// for example in the case of an unreplicated lock.

I think this "for example" is confusing. If you want to say anything, I think you need to say more to make the connection with replicated vs unreplicated locks; you could say that replicated locks are not part of the lock table if they were never contended. I think you should also say that only locks in the lock table are returned, regardless of this flag (right?).

Will do - sg


pkg/kv/kvserver/client_split_test.go, line 403 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: make the name of this test start with TestQueryLocks. Maybe TestQueryLocksAcrossRanges. The fact that this test "does splits" is not very relevant.

Will do - sg


pkg/kv/kvserver/client_split_test.go, line 412 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

how come DisableSplitQueue is needed? Consider commenting.

This was definitely cargo-culting, heh. It is likely unnecessary, can remove.


pkg/kv/kvserver/client_split_test.go, line 480 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Is this assuming that the scan will block in 10ms? This seems like a recipe for flakiness. Let's use some sort of a knob to get notified when the scan blocks.

Do we have a knob that I can use to get a signal?


pkg/kv/kvserver/client_split_test.go, line 529 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I would make this deadline a lot larger, or remove it completely.

Will do - sg


pkg/kv/kvserver/replica_test.go, line 11250 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

minor nit: any reason to not use a pair of require.Equal calls here (and below)?

Not really, can change

Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)


pkg/kv/kvserver/client_split_test.go, line 480 at r8 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

Do we have a knob that I can use to get a signal?

I don't see anything in the lock table; I think we should add something.
Alternatively, we could install a RequestFilter and watch for a PushTxn I guess, although the push is not immediate (but I think there's a knob you can use to make it immediate).
Or, you could record the trace of the respective request and poll it periodically to see if there's some event (or, with #74878, some tag) about the lock. We could also introduce a trace-based notification system for such tests.

Btw, if this patch needs to go in today, feel free to merge as is if you'll clean up later.

Copy link
Copy Markdown
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @AlexTalks, @andreimatei, and @nvanbenschoten)


pkg/kv/kvserver/client_split_test.go, line 480 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I don't see anything in the lock table; I think we should add something.
Alternatively, we could install a RequestFilter and watch for a PushTxn I guess, although the push is not immediate (but I think there's a knob you can use to make it immediate).
Or, you could record the trace of the respective request and poll it periodically to see if there's some event (or, with #74878, some tag) about the lock. We could also introduce a trace-based notification system for such tests.

Btw, if this patch needs to go in today, feel free to merge as is if you'll clean up later.

actually, there is a lockTableWaiterImpl.onContentionEvent that might be useful

Copy link
Copy Markdown
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andreimatei and @nvanbenschoten)


pkg/kv/kvserver/concurrency/concurrency_control.go, line 467 at r8 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

Will do - sg

Done.


pkg/roachpb/api.proto, line 1118 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

consider putting some words here about what locks are in the lock table and what locks aren't

Done.


pkg/roachpb/api.proto, line 1125 at r8 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

Will do - sg

Done.


pkg/kv/kvserver/client_split_test.go, line 412 at r8 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

This was definitely cargo-culting, heh. It is likely unnecessary, can remove.

Done


pkg/kv/kvserver/client_split_test.go, line 480 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

actually, there is a lockTableWaiterImpl.onContentionEvent that might be useful

ok cool - I'll stress it to make sure it's not flaking now but will plan to merge shortly. Especially since the existing tests in replica_test.go seem to wait 5ms for operations to block on locks (though I realize that's a bit different since it's using the replica Send directly).


pkg/kv/kvserver/client_split_test.go, line 529 at r8 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

Will do - sg

Done.

@AlexTalks AlexTalks removed request for a team March 14, 2022 21:12
Copy link
Copy Markdown
Contributor Author

@AlexTalks AlexTalks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andreimatei and @nvanbenschoten)


pkg/kv/kvserver/client_split_test.go, line 480 at r8 (raw file):

Previously, AlexTalks (Alex Sarkesian) wrote…

ok cool - I'll stress it to make sure it's not flaking now but will plan to merge shortly. Especially since the existing tests in replica_test.go seem to wait 5ms for operations to block on locks (though I realize that's a bit different since it's using the replica Send directly).

Unfortunately (of course) onContentionEvent only gets called when the contention is resolved, not started. I think the options are either to increase the wait time here or install a new event...

@AlexTalks AlexTalks force-pushed the cmd_query_locks branch 2 times, most recently from f2c877e to 7ff0dac Compare March 15, 2022 01:37
This change adds a new KV API, `QueryLocks`, allowing a client to query
the state of a replica's lock table over a provided key span.  This
exposes the in-memory locks currently managed by Concurrency Control,
and is part of the work needed to implement cockroachdb#75541.

Resolves cockroachdb#74833.

Release note: None

Release justification: Low-risk new functionality.
@AlexTalks
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Mar 15, 2022

Build succeeded:

@craig craig bot merged commit f771999 into cockroachdb:master Mar 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

kv: expose interface to query locks

4 participants