Skip to content

server/lock_manager: Support reporting the full wait chain when deadlock is detected#10085

Merged
ti-chi-bot merged 15 commits intotikv:masterfrom
MyonKeminta:m/record-information-for-lock-view
May 7, 2021
Merged

server/lock_manager: Support reporting the full wait chain when deadlock is detected#10085
ti-chi-bot merged 15 commits intotikv:masterfrom
MyonKeminta:m/record-information-for-lock-view

Conversation

@MyonKeminta
Copy link
Contributor

@MyonKeminta MyonKeminta commented Apr 26, 2021

This PR is now reviewable. But before merging, I want to test if there's any performance regression and try to optimize it if necessary.

What problem does this PR solve?

Problem Summary:

We are going to implement the feature Lock View . A part of it is to support collecting information about recent deadlocks.

What is changed and how it works?

What's Changed: Two more fields key and resource_group_tag is added to deadlock detection request, deadlock detector's inner information, and deadlock error message. SQL Digest will be encoded into resource_group_tag. The resource_group_tag may also be used by another feature Top SQL, and they want to pass both SQL Digest and the plan to TiKV by the resource_group_tag.

Related changes

Check List

Tests

With this PR Workload Threads Table size QPS .95 lat
Y oltp_read_write 32 100,000 14455.2 54.83
Y oltp_read_write 128 100,000 24456.78 127.81
oltp_read_write 32 100,000 14522.35 54.83
oltp_read_write 128 100,000 24734.54 127.81
Y oltp_write_only 32 100,000 10184.40 31.37
Y oltp_write_only 128 100,000 21315.33 80.03
oltp_write_only 32 100,000 10227.46 31.37
oltp_write_only 128 100,000 20600.07 82.96

Side effects

  • Performance regression
    • Consumes more CPU
    • Consumes more MEM

Release note

  • No release note

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@ti-chi-bot ti-chi-bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 26, 2021
@MyonKeminta
Copy link
Contributor Author

/bench

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@MyonKeminta
Copy link
Contributor Author

/release

@MyonKeminta MyonKeminta marked this pull request as ready for review April 27, 2021 14:12
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 27, 2021
@sre-bot
Copy link
Contributor

sre-bot commented Apr 27, 2021

@MyonKeminta
Copy link
Contributor Author

/bench

@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 29, 2021
@MyonKeminta
Copy link
Contributor Author

/release

@sre-bot
Copy link
Contributor

sre-bot commented Apr 29, 2021

Copy link
Member

@longfangsong longfangsong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM, but we should merge kvproto#752 and use kvproto on master first.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label May 6, 2021
@youjiali1995 youjiali1995 added the sig/transaction SIG: Transaction label May 6, 2021
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@MyonKeminta
Copy link
Contributor Author

/test

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 7, 2021
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 7, 2021
@MyonKeminta
Copy link
Contributor Author

/test

Copy link
Contributor

@youjiali1995 youjiali1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM.

ts: TimeStamp,
hashes: Vec<u64>,
// (hash, key)
// The `key` is recorded as an diagnostic information. There may be multiple keys with the same
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// The `key` is recorded as an diagnostic information. There may be multiple keys with the same
// The `key` is recorded as diagnostic information. There may be multiple keys with the same

pub pr: ProcessResult,
// (lock, is_first_lock, wait_timeout)
pub lock_info: Option<(lock_manager::Lock, bool, Option<WaitTimeout>)>,
// (lock, key, is_first_lock, wait_timeout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// (lock, key, is_first_lock, wait_timeout)

waiter.deadlock_with(deadlock_key_hash);
// The `wait_chain` doesn't include the final edge that caused the deadlock, which
// may reduce some deadlock RPC size. To pass it to the client, append this edge
// to the wait chain.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do you append it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have appended it in the deadlock detector. It should be complete. I prefer to remove these comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry. This comment is about the first implementation and later I changed it.

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • longfangsong
  • youjiali1995

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Details

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels May 7, 2021
@youjiali1995
Copy link
Contributor

/merge

@ti-chi-bot
Copy link
Member

@youjiali1995: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

DetailsCommit hash: 65d8394

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label May 7, 2021
@ti-chi-bot
Copy link
Member

@MyonKeminta: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot merged commit daa9354 into tikv:master May 7, 2021
@MyonKeminta MyonKeminta deleted the m/record-information-for-lock-view branch May 8, 2021 03:38
jyz0309 pushed a commit to jyz0309/tikv that referenced this pull request May 20, 2021
…ock is detected (tikv#10085)

* code change draft

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Wrap the key and tag into a struct

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* temporarily silent the warning

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* fix build

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Fix incorrect assertion; add test for wait chain generating

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Move wait chain generation to  a seperated function

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Add test case and some refactor

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Check the reported wait chain in integration test

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Update kvproto

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Address comments

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

* Address comments

Signed-off-by: MyonKeminta <MyonKeminta@users.noreply.github.com>

Co-authored-by: MyonKeminta <MyonKeminta@users.noreply.github.com>
Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: jyz0309 <45495947@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sig/transaction SIG: Transaction size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants