kv: use evictionToken to coalesce RangeLookups on SendError#38527
kv: use evictionToken to coalesce RangeLookups on SendError#38527craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
ajwerner
left a comment
There was a problem hiding this comment.
Reviewed 3 of 4 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jeffrey-xiao, and @nvanbenschoten)
pkg/kv/dist_sender_test.go, line 3089 at r1 (raw file):
rs, err := keys.Range(ba) if err != nil { t.Fatal(err)
If you call t.Fatal from a goroutine other than the main one, you're going to have a bad time. Instead call t.Error here and then return the error. The test will fail but it won't hang.
FailNow must be called from the goroutine running the test or benchmark function, not from other goroutines created during the test
https://golang.org/pkg/testing/#T.FailNow
pkg/kv/dist_sender_test.go, line 3108 at r1 (raw file):
var kv roachpb.KeyValue if err := kv.Value.SetProto(&testMetaRangeDescriptor); err != nil { t.Fatal(err)
here too
pkg/kv/dist_sender_test.go, line 3119 at r1 (raw file):
var kv roachpb.KeyValue if err := kv.Value.SetProto(&testUserRangeDescriptor); err != nil { t.Fatal(err)
and here
pkg/kv/dist_sender_test.go, line 3155 at r1 (raw file):
batchWaitGroup.Add(2) go func() {
nit: make a helper that takes a string rather than duplicating this code
d97bee5 to
e271833
Compare
jeffrey-xiao
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jeffrey-xiao, and @nvanbenschoten)
pkg/kv/dist_sender_test.go, line 3089 at r1 (raw file):
Previously, ajwerner wrote…
If you call t.Fatal from a goroutine other than the main one, you're going to have a bad time. Instead call t.Error here and then return the error. The test will fail but it won't hang.
FailNow must be called from the goroutine running the test or benchmark function, not from other goroutines created during the test
There is some retry logic in dist_sender that makes the test hang and not fail hard when I provide an error. I've opted to call waitGroupBatch.Donewhen a fatal error is encountered in the go routines so the test fails fast.
e271833 to
e769681
Compare
jeffrey-xiao
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
pkg/kv/dist_sender_test.go, line 3089 at r1 (raw file):
Previously, jeffrey-xiao (Jeffrey Xiao) wrote…
There is some retry logic in
dist_senderthat makes the test hang and not fail hard when I provide an error. I've opted to callwaitGroupBatch.Donewhen a fatal error is encountered in the go routines so the test fails fast.
Ignore my previous comment -- returning nil and calling t.Error causes the test to fail fast.
pkg/kv/dist_sender_test.go, line 3108 at r1 (raw file):
Previously, ajwerner wrote…
here too
Done.
pkg/kv/dist_sender_test.go, line 3119 at r1 (raw file):
Previously, ajwerner wrote…
and here
Done.
nvb
left a comment
There was a problem hiding this comment.
Reviewed 2 of 4 files at r1, 1 of 2 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jeffrey-xiao, and @nvanbenschoten)
pkg/kv/dist_sender_test.go, line 3108 at r1 (raw file):
Previously, jeffrey-xiao (Jeffrey Xiao) wrote…
Done.
I'm not seeing the update.
pkg/kv/dist_sender_test.go, line 3053 at r2 (raw file):
} func TestEvictionTokenCoalesce(t *testing.T) {
Just to confirm, this test fails without your change in DistSender, right?
pkg/util/syncutil/singleflight/singleflight.go, line 154 at r2 (raw file):
g.mu.Lock() defer g.mu.Unlock() if g.m == nil {
I don't think you need this. Accessing a nil map should behave correctly. https://play.golang.org/p/Kv4LuuCQe-y
e769681 to
7577c93
Compare
jeffrey-xiao
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/kv/dist_sender_test.go, line 3108 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm not seeing the update.
I had to override MaxRetries in RPCRetryOptions to get the test to terminate if I'm returning nil, err or br, nil where br.Error = err. Unfortunately, it calls t.Error multiple times per batch so the error stack trace is fairly confusing. I think just returning br, nil is fine since it fails fast and provides a clean error stack.
pkg/kv/dist_sender_test.go, line 3053 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Just to confirm, this test fails without your change in DistSender, right?
Yep, this test fails without supplying the eviction token.
1ca847b to
f693165
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jeffrey-xiao, and @nvanbenschoten)
pkg/cmd/roachtest/bank.go, line 457 at r4 (raw file):
nodes[i]++ } sort.Ints(nodes)
Could you add a comment explaining why we do this? What exactly was deadlocking?
pkg/kv/dist_sender_test.go, line 3108 at r1 (raw file):
Previously, jeffrey-xiao (Jeffrey Xiao) wrote…
I had to override
MaxRetriesinRPCRetryOptionsto get the test to terminate if I'm returningnil, errorbr, nilwherebr.Error = err. Unfortunately, it callst.Errormultiple times per batch so the error stack trace is fairly confusing. I think just returningbr, nilis fine since it fails fast and provides a clean error stack.
Did we explore assigning br.Error = roachpb.NewError(err)? It's still generally bad form to fail the test from within these mocking functions,
738fee1 to
7fa15a7
Compare
jeffrey-xiao
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
pkg/cmd/roachtest/bank.go, line 457 at r4 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Could you add a comment explaining why we do this? What exactly was deadlocking?
Oops, misread the code.
pkg/kv/dist_sender_test.go, line 3108 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Did we explore assigning
br.Error = roachpb.NewError(err)? It's still generally bad form to fail the test from within these mocking functions,
dist_sender didn't properly propagate errors from range_cache, but now br.Error = roachpb.NewError(err) works.
ajwerner
left a comment
There was a problem hiding this comment.
Reviewed 1 of 2 files at r2, 1 of 2 files at r3, 4 of 4 files at r5.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @jeffrey-xiao)
pkg/kv/dist_sender_test.go, line 3105 at r5 (raw file):
if bytes.HasPrefix(rs.Key, keys.Meta1Prefix) { // Querying meta 1 range. br := &roachpb.BatchResponse{}
The linter is complaining about this shadowing of br.
pkg/kv/range_cache.go, line 146 at r5 (raw file):
} } ret.WriteString(string(key))
nit: is there compiler intrinsics to avoid a copy here when casting from bytes to a string? If not can you use Write() and cast to a []byte instead of the string cast?
7fa15a7 to
1121af8
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewed 1 of 4 files at r5.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @jeffrey-xiao, and @nvanbenschoten)
pkg/kv/dist_sender.go, line 1415 at r6 (raw file):
if err != nil { log.VErrEventf(ctx, 1, "range descriptor re-lookup failed: %s", err) pErr = roachpb.NewError(err)
What's up with this? It looks like it will be dropped in most cases because we'll retry. Is the idea to return an error if we don't retry because we've timed out? If so, please leave a comment.
pkg/kv/range_cache.go, line 135 at r5 (raw file):
// to indicate a meta2 lookup. if key.AsRawKey().Compare(keys.Meta1KeyMax) < 0 { ret.WriteString("1:")
I'm thinking that we should write Meta1Prefix and Meta2Prefix directly. This string doesn't need to be pretty, but we do want to make sure that we don't get any unexpected aliasing between lookups. For instance, imaging what would happen if two lookups ran concurrently with the keys /meta1/test and 1:test right now.
pkg/kv/range_cache.go, line 146 at r5 (raw file):
Previously, ajwerner wrote…
nit: is there compiler intrinsics to avoid a copy here when casting from bytes to a string? If not can you use
Write()and cast to a[]byteinstead of the string cast?
+1
1121af8 to
2270934
Compare
jeffrey-xiao
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner and @nvanbenschoten)
pkg/kv/dist_sender_test.go, line 3105 at r5 (raw file):
Previously, ajwerner wrote…
The linter is complaining about this shadowing of
br.
Done.
aa0b328 to
818539f
Compare
Also fixed an issue where a meta2 range lookup could be coalesced with a meta1 range lookup and cause a deadlock. Release note: None
818539f to
37a38ed
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewed 3 of 3 files at r7.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner)
|
TFTRs! bors r+ |
38527: kv: use evictionToken to coalesce RangeLookups on SendError r=jeffrey-xiao a=jeffrey-xiao Also fixes an issue where a meta2 range lookup could be coalesced with a meta1 range lookup and cause a deadlock. Fixes #28967 Release note: None I've collected metrics to compare this PR against master (`15d4329`) under two workloads while the cluster is running `kv50`. 1. Split workload: Two workers splitting random ranges: `ALTER TABLE kv.kv SPLIT AT VALUES (CAST(ROUND(RANDOM() * 9223372036854775808) AS INT)) WITH EXPIRATION '1s'` 2. Relocate workload: Two workers relocating random ranges: `ALTER TABLE kv.kv EXPERIMENTAL_RELOCATE SELECT ARRAY[%s, %s, %s], CAST(ROUND(RANDOM() * 9223372036854775808) AS INT)`. The new set of replicas is a random subset of the available replicas of size 3. More details can be found in this [script](https://gist.github.com/jeffrey-xiao/2af9de98a56d8921dc7efa0cf7fb5b5e).  Co-authored-by: Jeffrey Xiao <jeffrey.xiao1998@gmail.com>
Build succeeded |
Also fixes an issue where a meta2 range lookup could be coalesced with a
meta1 range lookup and cause a deadlock.
Fixes #28967
Release note: None
I've collected metrics to compare this PR against master (
15d4329) under two workloads while the cluster is runningkv50.ALTER TABLE kv.kv SPLIT AT VALUES (CAST(ROUND(RANDOM() * 9223372036854775808) AS INT)) WITH EXPIRATION '1s'ALTER TABLE kv.kv EXPERIMENTAL_RELOCATE SELECT ARRAY[%s, %s, %s], CAST(ROUND(RANDOM() * 9223372036854775808) AS INT). The new set of replicas is a random subset of the available replicas of size 3.More details can be found in this script.