server/lock_manager: Support replacing lock by key to reduce RPC overhead of update_wait_for when the waiting queue is very long#17451

Closed
MyonKeminta wants to merge 16 commits into tikv:master from MyonKeminta:m/deadlock-oom-fix

Conversation

@MyonKeminta
Contributor

@MyonKeminta MyonKeminta commented Aug 28, 2024

What is changed and how it works?

Issue Number: Close #17394

What's Changed:

server/lock_manager: Support replacing lock by key to reduce RPC overhead of `update_wait_for` when the waiting queue is very long

* Make deadlock detector distinguish keys instead of key hashes
* Add `replace_lock_by_key` to support re-detecting all waiters on a specific key and a specific lock
* Add `ReplaceLocksByKeys` type to the `DeadlockRequest` in the protobuf to support doing the new kind of check remotely

This is expected to solve the issue that frequently updating a key whose lock waiting queue is very long might cause TiKV OOM due to messages accumulating in a channel.
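The replace-by-key idea can be sketched roughly as follows. This is a minimal, hypothetical Rust sketch, not TiKV's actual `DetectTable`; the field layout and the exact signature of `replace_lock_by_key` are illustrative (only the names `replace_lock_by_key` and `key_reverse_map` come from this PR):

```rust
use std::collections::HashMap;

// Hypothetical wait-for table keyed by full keys rather than key hashes,
// so all waiters on one key can be re-pointed at a new lock holder in a
// single call instead of one `update_wait_for` RPC per waiter.
#[derive(Default)]
struct DetectTable {
    // waiting txn -> (blocking txn, key it waits on)
    wait_for: HashMap<u64, (u64, Vec<u8>)>,
    // key -> txns currently waiting on it (reverse index)
    key_reverse_map: HashMap<Vec<u8>, Vec<u64>>,
}

impl DetectTable {
    fn register(&mut self, waiter: u64, holder: u64, key: Vec<u8>) {
        self.wait_for.insert(waiter, (holder, key.clone()));
        self.key_reverse_map.entry(key).or_default().push(waiter);
    }

    // Replace the lock on `key`: every waiter now waits for `new_holder`.
    // Returns the affected waiters, which would then be re-checked for
    // deadlock in one pass rather than via per-waiter messages.
    fn replace_lock_by_key(&mut self, key: &[u8], new_holder: u64) -> Vec<u64> {
        let waiters = self.key_reverse_map.get(key).cloned().unwrap_or_default();
        for w in &waiters {
            if let Some(entry) = self.wait_for.get_mut(w) {
                entry.0 = new_holder;
            }
        }
        waiters
    }
}
```

The point is that the cost of a lock replacement becomes one table operation per key instead of one message per queued waiter.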

Related changes

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below) (TODO)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Release note

Fix an issue that frequently updating a row while many transactions are waiting on its lock might sometimes cause TiKV OOM

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@ti-chi-bot ti-chi-bot bot added the release-note, dco-signoff: yes, and size/XXL labels Aug 28, 2024
…correct

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@MyonKeminta MyonKeminta marked this pull request as draft August 28, 2024 05:57
@ti-chi-bot ti-chi-bot bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 28, 2024
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@MyonKeminta
Contributor Author

/release

@sre-bot
Contributor

sre-bot commented Aug 28, 2024

500,
);
}

Collaborator

@cfzjywxk cfzjywxk Sep 3, 2024

Do we have an upgrade test case to verify that the behaviours are as expected when upgrading tikv-servers?

Contributor Author

It has now been done manually.

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@MyonKeminta
Contributor Author

/release

Contributor

@ekexium ekexium left a comment

The PR itself LGTM.
Have we conducted any throughput benchmarks comparing this implementation to the original one? Additionally, micro-benchmarks could help us determine the component's maximum performance capacity.
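As a starting point for such micro-benchmarks, the cost-per-call idea can be sketched with the standard library alone. This helper is illustrative only; the benchmarks later posted in this PR come from criterion via `cargo bench -p tests --bench deadlock_detector`, not from code like this:

```rust
use std::time::Instant;

// Illustrative std-only micro-benchmark helper: runs `f` a fixed number
// of times and reports the mean cost per call in nanoseconds.
fn bench_ns_per_iter<F: FnMut()>(name: &str, iters: u32, mut f: F) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    let ns = start.elapsed().as_nanos() as f64 / iters as f64;
    println!("{name}: {ns:.1} ns/iter");
    ns
}
```

A criterion bench adds warm-up, statistical change detection, and outlier reporting on top of this basic loop.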

@sre-bot
Contributor

sre-bot commented Sep 3, 2024

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@MyonKeminta MyonKeminta marked this pull request as ready for review September 3, 2024 12:57
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 3, 2024
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@cfzjywxk cfzjywxk requested review from you06 and zyguan September 4, 2024 15:13
detect_table.clean_up(txn.into());
None
}
(_, true) => {
Contributor

This match is a bit weird; how about:

match req.get_tp() {
    DeadlockRequestType::CleanUpWaitFor => {
        if req.has_replace_locks_by_keys() {
            // replace_locks_by_keys logic
        } else {
            // old logic
        }
    }
}

@ti-chi-bot ti-chi-bot bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 27, 2024
@ti-chi-bot
Contributor

ti-chi-bot bot commented Sep 27, 2024

PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Comment on lines 585 to +586
self.wait_for_map.clear();
self.key_reverse_map.clear();
Member

Forgot to shrink? An empty hashmap may still occupy memory.
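The reviewer's point: `clear()` empties a `HashMap` but keeps its allocated buckets, so a map that once held many entries still occupies that memory until it is shrunk. A small illustrative helper (`clear_and_shrink` is not TiKV code):

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Clear the map and then release its retained bucket allocation back to
// the allocator; without shrink_to_fit, capacity stays at its high-water mark.
fn clear_and_shrink<K: Eq + Hash, V>(map: &mut HashMap<K, V>) {
    map.clear();
    map.shrink_to_fit();
}
```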

@ti-chi-bot ti-chi-bot bot added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label Nov 14, 2024
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@MyonKeminta
Contributor Author

Benchmarks

cargo bench -p tests --bench deadlock_detector
Gnuplot not found, using plotters backend
bench_dense_detect_without_cleanup/Config { n: 10, range: 10, ttl: 100000000s }
                        time:   [10.724 us 11.120 us 11.552 us]
                        change: [-17.155% +22.163% +64.753%] (p = 0.36 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe
bench_dense_detect_without_cleanup/Config { n: 10, range: 100, ttl: 100000000s }
                        time:   [30.282 us 30.784 us 31.059 us]
                        change: [+13.147% +16.133% +18.699%] (p = 0.00 < 0.05)
                        Performance has regressed.
bench_dense_detect_without_cleanup/Config { n: 10, range: 1000, ttl: 100000000s }
                        time:   [140.32 us 153.44 us 159.31 us]
                        change: [-18.998% +3.2548% +30.293%] (p = 0.80 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild
bench_dense_detect_without_cleanup/Config { n: 10, range: 10000, ttl: 100000000s }
                        time:   [8.1210 us 20.725 us 29.401 us]
                        change: [-49.527% +7.0155% +171.54%] (p = 0.88 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
bench_dense_detect_without_cleanup/Config { n: 10, range: 100000, ttl: 100000000s }
                        time:   [8.4774 us 9.0053 us 9.3103 us]
                        change: [+92.856% +108.55% +124.40%] (p = 0.00 < 0.05)
                        Performance has regressed.
bench_dense_detect_without_cleanup/Config { n: 10, range: 1000000, ttl: 100000000s }
                        time:   [6.7621 us 7.0729 us 7.2675 us]
                        change: [+72.810% +79.399% +85.969%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) high mild
bench_dense_detect_without_cleanup/Config { n: 10, range: 10000000, ttl: 100000000s }
                        time:   [10.089 us 10.501 us 11.550 us]
                        change: [+112.08% +342.11% +802.07%] (p = 0.09 > 0.05)
                        No change in performance detected.
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low mild
  1 (10.00%) high severe
bench_dense_detect_without_cleanup/Config { n: 10, range: 100000000, ttl: 100000000s }
                        time:   [9.8447 us 10.390 us 10.950 us]
                        change: [+97.254% +235.22% +499.62%] (p = 0.04 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe

bench_dense_detect_with_cleanup/Config { n: 10, range: 1000, ttl: 1ms }
                        time:   [9.0705 us 9.1736 us 9.2422 us]
                        change: [+84.786% +87.079% +89.794%] (p = 0.00 < 0.05)
                        Performance has regressed.
bench_dense_detect_with_cleanup/Config { n: 10, range: 1000, ttl: 3ms }
                        time:   [12.324 us 12.379 us 12.443 us]
                        change: [+17.887% +18.956% +19.876%] (p = 0.00 < 0.05)
                        Performance has regressed.
bench_dense_detect_with_cleanup/Config { n: 10, range: 1000, ttl: 5ms }
                        time:   [17.529 us 17.573 us 17.644 us]
                        change: [+10.552% +10.941% +11.331%] (p = 0.00 < 0.05)
                        Performance has regressed.
bench_dense_detect_with_cleanup/Config { n: 10, range: 1000, ttl: 10ms }
                        time:   [28.987 us 29.191 us 29.308 us]
                        change: [+5.3008% +6.2587% +7.1333%] (p = 0.00 < 0.05)
                        Performance has regressed.
bench_dense_detect_with_cleanup/Config { n: 10, range: 1000, ttl: 100ms }
                        time:   [120.22 us 124.16 us 125.74 us]
                        change: [-8.0532% +5.5688% +21.993%] (p = 0.50 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild
bench_dense_detect_with_cleanup/Config { n: 10, range: 1000, ttl: 500ms }
                        time:   [140.86 us 155.11 us 161.97 us]
                        change: [-31.140% -13.518% +10.684%] (p = 0.26 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild
bench_dense_detect_with_cleanup/Config { n: 10, range: 1000, ttl: 1s }
                        time:   [142.32 us 154.59 us 160.57 us]
                        change: [-32.598% -12.598% +15.456%] (p = 0.35 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild
bench_dense_detect_with_cleanup/Config { n: 10, range: 1000, ttl: 3s }
                        time:   [139.10 us 150.11 us 155.37 us]
                        change: [-31.773% -12.046% +16.189%] (p = 0.38 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) low mild

There is an obvious regression in the deadlock detector.

@MyonKeminta MyonKeminta marked this pull request as draft November 19, 2024 09:55
@ti-chi-bot ti-chi-bot bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 19, 2024
@MyonKeminta MyonKeminta added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 19, 2024
@MyonKeminta MyonKeminta marked this pull request as ready for review November 19, 2024 09:56
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 19, 2024
MyonKeminta added a commit to MyonKeminta/tikv that referenced this pull request Nov 20, 2024
…sts (tikv#17500)

close tikv#17394

lock_manager: Skip updating lock wait info for non-fair-locking requests

This is a simpler and lower-risk fix for the OOM issue tikv#17394 on released branches, as an alternative solution to tikv#17451.
In this way, for acquire_pessimistic_lock requests without fair locking enabled, update_wait_for becomes a no-op, so that if fair locking is globally disabled, the behavior is equivalent to versions before 7.0.

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
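The gist of that alternative fix can be sketched as below. The types and field names here are hypothetical (TiKV's real lock-manager plumbing differs); only the skip-when-fair-locking-is-disabled check is the point:

```rust
// Hypothetical sketch: requests that did not enable fair locking skip the
// wait-for update entirely, so nothing accumulates in the detector's channel.
struct LockWaitContext {
    // Whether fair locking is enabled for this request (illustrative field).
    fair_locking_enabled: bool,
}

fn update_wait_for(ctx: &LockWaitContext, pending: &mut Vec<String>, info: String) {
    if !ctx.fair_locking_enabled {
        // No-op, matching pre-7.0 behavior when fair locking is disabled.
        return;
    }
    pending.push(info);
}
```

This trades the per-key optimization of #17451 for a much smaller behavioral change on released branches.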
ti-chi-bot bot pushed a commit that referenced this pull request Nov 20, 2024
…sts (#17500) (#17870)

close #17394

lock_manager: Skip updating lock wait info for non-fair-locking requests

This is a simpler and lower-risk fix for the OOM issue #17394 on released branches, as an alternative solution to #17451.
In this way, for acquire_pessimistic_lock requests without fair locking enabled, update_wait_for becomes a no-op, so that if fair locking is globally disabled, the behavior is equivalent to versions before 7.0.

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
ti-chi-bot bot pushed a commit that referenced this pull request Nov 20, 2024
…sts (#17500) (#17869)

close #17394

lock_manager: Skip updating lock wait info for non-fair-locking requests

This is a simpler and lower-risk fix for the OOM issue #17394 on released branches, as an alternative solution to #17451.
In this way, for acquire_pessimistic_lock requests without fair locking enabled, update_wait_for becomes a no-op, so that if fair locking is globally disabled, the behavior is equivalent to versions before 7.0.

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

Co-authored-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@ekexium
Contributor

ekexium commented Mar 20, 2025

Is this abandoned?

@MyonKeminta
Contributor Author

Is this abandoned?

Currently there is no plan to continue this, as its overhead is large and we have found it hard to determine whether it is acceptable.

@ti-chi-bot
Contributor

ti-chi-bot bot commented Sep 16, 2025

@MyonKeminta: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
pull-unit-test | 9504bd8 | link | true | /test pull-unit-test
pull-check-deps | 9504bd8 | link | true | /test pull-check-deps

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

  • approved
  • dco-signoff: yes. Indicates the PR's author has signed the dco.
  • do-not-merge/hold. Indicates that a PR should not merge because someone has issued a /hold command.
  • needs-1-more-lgtm. Indicates a PR needs 1 more LGTM.
  • needs-cherry-pick-release-7.1. Should cherry pick this PR to release-7.1 branch.
  • needs-cherry-pick-release-7.5. Should cherry pick this PR to release-7.5 branch.
  • needs-cherry-pick-release-8.1. Should cherry pick this PR to release-8.1 branch.
  • needs-cherry-pick-release-8.5. Should cherry pick this PR to release-8.5 branch.
  • needs-rebase. Indicates a PR cannot be merged because it has merge conflicts with HEAD.
  • release-note. Denotes a PR that will be considered when it comes time to generate release notes.
  • size/XXL. Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Memory usage may grow unexpectedly and causes OOM in some high-concurrency lock contention scenarios

7 participants