Skip to content

add region cache state test & fix some issues of replica selector#910

Merged
disksing merged 3 commits intotikv:tidb-6.5from
you06:region-cache-test
Aug 7, 2023
Merged

add region cache state test & fix some issues of replica selector#910
disksing merged 3 commits intotikv:tidb-6.5from
you06:region-cache-test

Conversation

@you06
Copy link
Contributor

@you06 you06 commented Jul 26, 2023

This PR add the selector replica test for stale read, cover the most possible errors, and fix the bug the test finds:

Fixes:

  • when fallback to followers from leader, stale-read flag should be removed.
  • in some cases, the fallback to follower not work.
  • when using stale read, allow async reload region when leader is exhausted, because leader may just be unavailable for a while, and the late update of region info won't affect stale read processing.

@you06 you06 force-pushed the region-cache-test branch 2 times, most recently from 9bbea85 to 533d7eb Compare August 2, 2023 05:34
Signed-off-by: you06 <you1474600@gmail.com>
@you06 you06 force-pushed the region-cache-test branch from e607df2 to 5efef85 Compare August 2, 2023 07:06
@you06 you06 marked this pull request as ready for review August 2, 2023 07:06
@you06 you06 changed the title Add region cache state test add region cache state test & fix some issues of replica selector Aug 2, 2023
you06 added 2 commits August 2, 2023 15:15
Signed-off-by: you06 <you1474600@gmail.com>
Signed-off-by: you06 <you1474600@gmail.com>
@cfzjywxk
Copy link
Contributor

cfzjywxk commented Aug 4, 2023

/cc @crazycs520

// In stale-read, the request will fallback to leader after the local follower failure.
// If the leader is also unavailable, we can fallback to the follower and use replica-read flag again,
// The remote follower not tried yet, and the local follower can retry without stale-read flag.
if state.isStaleRead {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be printed in the above log so we know if tryFollower happens in current turn, so is the fallbackFromLeader state.

leader := selector.replicas[state.leaderIdx]
leaderInvalid := leader.isEpochStale() || state.IsLeaderExhausted(leader)
leaderEpochStale := leader.isEpochStale()
leaderInvalid := leaderEpochStale || state.IsLeaderExhausted(leader)
Copy link
Contributor

@cfzjywxk cfzjywxk Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could refactor the implememtation of state.IsLeaderExhausted, cheking if the leader is exausted by the current states instead of leader.isExhausted(2).


resp, _, err := s.regionRequestSender.SendReqCtx(bo, req, regionLoc.Region, time.Second, tikvrpc.TiKV, ops...)
if !available {
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this err not checked if it's expected or not?

@disksing disksing merged commit 5968ce9 into tikv:tidb-6.5 Aug 7, 2023
you06 added a commit to you06/client-go that referenced this pull request Aug 14, 2023
you06 added a commit to you06/client-go that referenced this pull request Aug 14, 2023
…kv#910)

Signed-off-by: you06 <you1474600@gmail.com>

remove duplicate code

Signed-off-by: you06 <you1474600@gmail.com>
you06 added a commit to you06/client-go that referenced this pull request Aug 14, 2023
…kv#910)

Signed-off-by: you06 <you1474600@gmail.com>

remove duplicate code

Signed-off-by: you06 <you1474600@gmail.com>
you06 added a commit to you06/client-go that referenced this pull request Aug 15, 2023
…kv#910)

Signed-off-by: you06 <you1474600@gmail.com>

remove duplicate code

Signed-off-by: you06 <you1474600@gmail.com>
you06 added a commit to you06/client-go that referenced this pull request Aug 15, 2023
…kv#910)

Signed-off-by: you06 <you1474600@gmail.com>

remove duplicate code

Signed-off-by: you06 <you1474600@gmail.com>
cfzjywxk pushed a commit that referenced this pull request Sep 12, 2023
…) (#942)

* add region cache state test & fix some issues of replica selector (#910)

Signed-off-by: you06 <you1474600@gmail.com>

remove duplicate code

Signed-off-by: you06 <you1474600@gmail.com>

* remove comment

Signed-off-by: you06 <you1474600@gmail.com>

* lint

Signed-off-by: you06 <you1474600@gmail.com>

* fix flaky test

Signed-off-by: you06 <you1474600@gmail.com>

---------

Signed-off-by: you06 <you1474600@gmail.com>
cfzjywxk added a commit that referenced this pull request Sep 26, 2023
* reload region cache when store is resolved from invalid status (#843)

Signed-off-by: you06 <you1474600@gmail.com>
Co-authored-by: disksing <i@disksing.com>

* fallback to follower when leader is busy (#916) (#923)

* fallback to follower when leader is busy

Signed-off-by: you06 <you1474600@gmail.com>
Co-authored-by: cfzjywxk <cfzjywxk@gmail.com>
Co-authored-by: cfzjywxk <lsswxrxr@163.com>

* Resume max retry time check for stale read retry with leader option(#903) (#911)

* Resume max retry time check for stale read retry with leader option

Signed-off-by: cfzjywxk <lsswxrxr@163.com>

* add cancel

Signed-off-by: cfzjywxk <lsswxrxr@163.com>

---------

Signed-off-by: cfzjywxk <lsswxrxr@163.com>

* add region cache state test & fix some issues of replica selector (#910)

Signed-off-by: you06 <you1474600@gmail.com>

remove duplicate code

Signed-off-by: you06 <you1474600@gmail.com>

* enable workflow for tidb-7.1

Signed-off-by: you06 <you1474600@gmail.com>

* update

Signed-off-by: you06 <you1474600@gmail.com>

update

Signed-off-by: you06 <you1474600@gmail.com>

fix test

Signed-off-by: you06 <you1474600@gmail.com>

fix test

Signed-off-by: you06 <you1474600@gmail.com>

* lint

Signed-off-by: you06 <you1474600@gmail.com>

* lint

Signed-off-by: you06 <you1474600@gmail.com>

* fix flaky test

Signed-off-by: you06 <you1474600@gmail.com>

---------

Signed-off-by: you06 <you1474600@gmail.com>
Signed-off-by: cfzjywxk <lsswxrxr@163.com>
Co-authored-by: disksing <i@disksing.com>
Co-authored-by: cfzjywxk <cfzjywxk@gmail.com>
Co-authored-by: cfzjywxk <lsswxrxr@163.com>
iosmanthus added a commit that referenced this pull request Dec 20, 2023
Co-authored-by: cfzjywxk <cfzjywxk@gmail.com>
Co-authored-by: cfzjywxk <lsswxrxr@163.com>
Co-authored-by: disksing <i@disksing.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: zzm <zhouzemin@pingcap.com>
Co-authored-by: husharp <jinhao.hu@pingcap.com>
Co-authored-by: you06 <you1474600@gmail.com>
Co-authored-by: buffer <doufuxiaowangzi@gmail.com>
Co-authored-by: 3pointer <qdlc2010@gmail.com>
Co-authored-by: buffer <1045931706@qq.com>
Co-authored-by: husharp <ihusharp@gmail.com>
Co-authored-by: crazycs520 <crazycs520@gmail.com>
Co-authored-by: Smilencer <smityz@qq.com>
Co-authored-by: ShuNing <nolouch@gmail.com>
Co-authored-by: zyguan <zhongyangguan@gmail.com>
Co-authored-by: Jack Yu <jackysp@gmail.com>
Co-authored-by: Weizhen Wang <wangweizhen@pingcap.com>
Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>
Co-authored-by: healthwaite <148101100+healthwaite@users.noreply.github.com>
Co-authored-by: xufei <xufeixw@mail.ustc.edu.cn>
Co-authored-by: JmPotato <ghzpotato@gmail.com>
Co-authored-by: ekexium <eke@fastmail.com>
Co-authored-by: 山岚 <36239017+YuJuncen@users.noreply.github.com>
Co-authored-by: glorv <glorvs@163.com>
Co-authored-by: Yongbo Jiang <cabinfeveroier@gmail.com>
resolve locks interface for tidb gc_worker (#945)
fix some issues of replica selector (#910)  (#942)
fix some issues of replica selector (#910)
fix issue of configure kv timeout not work when disable batch client (#980)
fix batch-client wait too long and add some metrics (#973)
fix batch-client wait too long and add some metrics (#973)" (#984)
fix data race at the aggressiveLockingDirty (#913)
fix MinSafeTS might be set to MaxUint64 permanently (#994)
fix: fix invalid nil pointer when trying to record Store.SlownessStat. (#1017)
Fix batch client batchSendLoop panic (#1021)
fix request source tag unset (#1025)
Fix comment of `SuspendTime` (#1057)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants