PageStorage add PageFileID existence check for multi disk deployment #2412

Merged
ti-srebot merged 2 commits into pingcap:master from JaySon-Huang:fix_ps_gc_multi_disk
Jul 19, 2021

Conversation

@JaySon-Huang JaySon-Huang commented Jul 15, 2021

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

What problem does this PR solve?

Issue Number: close #2413

Problem Summary: When TiFlash is deployed on multiple disks, we need to check for file existence across all disks. Otherwise, we may generate PageFiles that share the same PageFileID. When restoring from disk, some of those PageFiles may then be ignored, because they are inserted into a PageFile "set" and a file whose ID is already present is not kept.
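
To illustrate the failure mode described above, here is a minimal sketch, not the actual TiFlash code. `PageFileSketch`, `CompareById`, and `countDroppedOnRestore` are hypothetical names, and ordering the set by ID alone is an assumption made for the example. It shows how a `std::set` silently keeps only one of two files that share an ID on different disks.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a PageFile: only an ID and the
// directory (disk) it lives on. This is not the real TiFlash class.
struct PageFileSketch
{
    uint64_t file_id;
    std::string parent_path;
};

// Assumption for illustration: the set orders files by ID only, so two files
// with the same ID on different disks compare as equivalent.
struct CompareById
{
    bool operator()(const PageFileSketch & lhs, const PageFileSketch & rhs) const
    {
        return lhs.file_id < rhs.file_id;
    }
};

using PageFileSetSketch = std::set<PageFileSketch, CompareById>;

// Simulates a restore: collect the files found on all disks into one set and
// count how many were dropped because an equivalent entry already existed.
inline size_t countDroppedOnRestore(const std::vector<PageFileSketch> & files_on_disks)
{
    PageFileSetSketch restored;
    size_t dropped = 0;
    for (const auto & file : files_on_disks)
    {
        // std::set::insert returns {iterator, false} if an equivalent key exists.
        if (!restored.insert(file).second)
            ++dropped;
    }
    return dropped;
}
```

With files `{1, "/disk1"}`, `{2, "/disk1"}`, and `{2, "/disk2"}`, the second file with ID 2 is not inserted, so its pages would be missing after such a restore.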

What is changed and how it works?

  • Check for file existence across all disks in DataCompactor::migratePages
  • Add mock classes for testing
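
The two changes above can be sketched roughly as follows. This is a sketch under stated assumptions, not the code in this PR: `ExistsFn`, `existsOnAnyDisk`, `pickUnusedFileId`, and `MockDisks` are hypothetical names, and skipping to the next free ID on a collision is an assumed policy, since the PR description does not say how a collision is handled.

```cpp
#include <cstdint>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical callback: true if a PageFile with `file_id` exists under `disk_path`.
using ExistsFn = std::function<bool(const std::string & disk_path, uint64_t file_id)>;

// True if any of the configured disks already holds a PageFile with this ID.
inline bool existsOnAnyDisk(const std::vector<std::string> & disk_paths, uint64_t file_id, const ExistsFn & exists)
{
    for (const auto & path : disk_paths)
    {
        if (exists(path, file_id))
            return true;
    }
    return false;
}

// Assumed policy: starting from `candidate_id`, skip every ID that is
// already taken on any disk and return the first free one.
inline uint64_t pickUnusedFileId(const std::vector<std::string> & disk_paths, uint64_t candidate_id, const ExistsFn & exists)
{
    while (existsOnAnyDisk(disk_paths, candidate_id, exists))
        ++candidate_id;
    return candidate_id;
}

// Hypothetical in-memory mock of the disks, so the check can be unit tested
// without touching a real filesystem.
struct MockDisks
{
    std::set<std::pair<std::string, uint64_t>> files;

    bool exists(const std::string & disk_path, uint64_t file_id) const
    {
        return files.count({disk_path, file_id}) > 0;
    }
};
```

If the mock holds ID 5 on `/disk1` and ID 6 on `/disk2`, checking only `/disk1` for candidate 6 reports it as free, which is the duplicate-ID situation; checking both disks moves on to 7.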

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch:

Check List

Tests

  • Unit test

Side effects

Release note

  • Fix the bug that TiFlash cannot restore data in some situations when deployed on multiple disks

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>
@JaySon-Huang JaySon-Huang self-assigned this Jul 15, 2021
@JaySon-Huang JaySon-Huang added the type/bugfix This PR fixes a bug. label Jul 15, 2021
@JaySon-Huang
Contributor Author

/run-all-tests

@JaySon-Huang
Contributor Author

@flowbehappy @lidezhu PTAL

Contributor

@flowbehappy flowbehappy left a comment

LGTM

@ti-srebot ti-srebot added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 19, 2021
@ti-srebot ti-srebot removed the status/LGT1 Indicates that a PR has LGTM 1. label Jul 19, 2021
@ti-srebot ti-srebot added the status/LGT2 Indicates that a PR has LGTM 2. label Jul 19, 2021
@JaySon-Huang
Contributor Author

/merge

@ti-srebot ti-srebot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 19, 2021
@ti-srebot
Collaborator

/run-all-tests

@ti-srebot ti-srebot merged commit 114c55c into pingcap:master Jul 19, 2021
@JaySon-Huang JaySon-Huang deleted the fix_ps_gc_multi_disk branch July 19, 2021 11:48
@JaySon-Huang JaySon-Huang added the needs-cherry-pick-release-4.0, needs-cherry-pick-release-5.0, and needs-cherry-pick-release-5.1 labels Jul 19, 2021
@ti-srebot
Collaborator

cherry pick to release-5.1 in PR #2432

@ti-srebot
Collaborator

cherry pick to release-4.0 in PR #2433

@ti-srebot
Collaborator

cherry pick to release-5.0 in PR #2434

JaySon-Huang added a commit that referenced this pull request Jul 21, 2021
…2412) (#2432)

* cherry pick #2412 to release-5.1
Co-authored-by: JaySon <tshent@qq.com>
flowbehappy pushed a commit that referenced this pull request Jul 21, 2021
* Revert "Revert skipping not consecutive sequence"

This reverts commit 1cea4ef.

* Add length check while running PageStorage GC (#2394)

* PageStorage add PageFileID existence check for multi disk deployment (#2412)

* Add query id for DT read result

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Add failpoint for holding storage streams for long time

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Add async metrics for oldest snapshot lifetime

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Show longest snapshot lifetime on Grafana

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* calculating num of legacy independently

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* reading writable files

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Add failpoint for testing skip non-continuous sequence

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Stop Legacy compactor by the min persisted sequence

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Fix other bugs

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Make DataCompactor move GC forward when there are PageFiles with high valid rate/bytes

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Split PageFile with valid pages and no valid pages

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* fix ut

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* enable fiu under release

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Fix the min persisted sequence

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>
JaySon-Huang added a commit that referenced this pull request Jul 28, 2021
…2412) (#2433)

* cherry pick #2412 to release-4.0
* Fix some param of GC

Co-authored-by: JaySon <tshent@qq.com>
Co-authored-by: JaySon-Huang <jayson.hjs@gmail.com>

Labels

  • needs-cherry-pick-release-4.0: PR which needs to be cherry-picked to release-4.0
  • needs-cherry-pick-release-5.0: PR which needs to be cherry-picked to release-5.0
  • needs-cherry-pick-release-5.1: PR which needs to be cherry-picked to release-5.1
  • status/can-merge: Indicates a PR has been approved by a committer.
  • status/LGT2: Indicates that a PR has LGTM 2.
  • type/bugfix: This PR fixes a bug.

Development

Successfully merging this pull request may close these issues.

Duplicated PageFileID generated by DataCompactor when deployed on multi disks

4 participants