Fix the bug that duplicated page file block GC (#2170) #2183

Merged
JaySon-Huang merged 3 commits into pingcap:release-4.0 from ti-srebot:release-4.0-ec5f976a8fb8 on Jul 6, 2021

@ti-srebot ti-srebot commented Jun 17, 2021

cherry-pick #2170 to release-4.0
You can switch your code base to this Pull Request by using git-extras:

# In tics repo:
git pr https://github.com/pingcap/tics/pull/2183

After applying your modifications, you can push your changes to this PR via:

git push git@github.com:ti-srebot/tics.git pr/2183:release-4.0-ec5f976a8fb8

What problem does this PR solve?

Issue Number: close #2169

Problem Summary:
In DataCompactor::migratePages, we avoid generating a PageFile that already exists, but we did not check whether a "Legacy" PageFile with the same ID and level already exists.
https://github.com/pingcap/tics/blob/74c69fb1d35da3582cb9279ecb4d8597e4a78d00/dbms/src/Storages/Page/gc/DataCompactor.cpp#L150-L158
https://github.com/pingcap/tics/blob/74c69fb1d35da3582cb9279ecb4d8597e4a78d00/dbms/src/Storages/Page/PageStorage.cpp#L1137-L1145

For example,

  1. We generate a PageFile "page_1000_1" to store GC data.
  2. The data in "page_1000_1" is later migrated to another file, and "page_1000_1" becomes "legacy.page_1000_1".
  3. If some old files are held by a snapshot for a long time, we may happen to generate a PageFile "page_1000_1" again, so that "page_1000_1" and "legacy.page_1000_1" exist at the same time.
  4. Once the "page_1000_1" generated in step 3 becomes useless, we try to set it to "legacy" and remove its data. Because "legacy.page_1000_1" already exists, an exception is thrown, which stops us from GCing useless data.
  5. Eventually the TiFlash node fills up with data in "t_{table_id}/log" (almost 1 TiB in our case), which unbalances the load across TiFlash nodes.
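
The sequence above can be reproduced with a small, self-contained model. This is only an illustrative sketch: `FakePageDir`, `createFormalBuggy`, and `setLegacy` are hypothetical names that stand in for the on-disk directory and the PageFile operations, and the file naming simply follows the "page_<id>_<level>" / "legacy.page_<id>_<level>" pattern used in the example. It is not the actual tics code.

```cpp
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a PageFile directory; NOT the real tics API.
struct FakePageDir
{
    std::set<std::string> files;

    static std::string formalName(unsigned long id, unsigned level)
    {
        return "page_" + std::to_string(id) + "_" + std::to_string(level);
    }

    static std::string legacyName(unsigned long id, unsigned level)
    {
        return "legacy." + formalName(id, level);
    }

    // Models the behaviour described in the problem summary: only the
    // Formal name is checked before a GC file is created.
    // Returns false if the Formal file already exists.
    bool createFormalBuggy(unsigned long id, unsigned level)
    {
        return files.insert(formalName(id, level)).second;
    }

    // Models turning a Formal file into a Legacy one. It fails when the
    // Legacy name is already taken, which is the collision from step 4.
    void setLegacy(unsigned long id, unsigned level)
    {
        if (files.count(legacyName(id, level)) > 0)
            throw std::runtime_error(legacyName(id, level) + " already exists");
        if (files.erase(formalName(id, level)) == 0)
            throw std::runtime_error(formalName(id, level) + " does not exist");
        files.insert(legacyName(id, level));
    }
};
```

In this model, calling `createFormalBuggy(1000, 1)`, `setLegacy(1000, 1)`, and `createFormalBuggy(1000, 1)` again leaves both names in the directory, and the next `setLegacy(1000, 1)` throws.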

What is changed and how it works?

Before generating a PageFile for GC data, check whether a page file with the same <id, level> already exists with status Formal or Legacy.
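
A minimal sketch of this kind of check, assuming the existing files can be listed as <id, level, status> triples. `FileType`, `ExistingFiles`, and `canCreateGcFile` are hypothetical names chosen for illustration; the real change lives in DataCompactor::migratePages and may be structured differently.

```cpp
#include <cstdint>
#include <set>
#include <tuple>

// Hypothetical status enum mirroring the two statuses named above.
enum class FileType
{
    Formal,
    Legacy
};

// Hypothetical listing of files already on disk: <id, level, status>.
using ExistingFiles = std::set<std::tuple<uint64_t, uint32_t, FileType>>;

// Returns true only if no file with the same <id, level> exists in
// either Formal or Legacy status, so a new GC file cannot later collide
// with an existing Legacy file when it is itself turned into Legacy.
bool canCreateGcFile(const ExistingFiles & files, uint64_t id, uint32_t level)
{
    const bool formal_exists = files.count(std::make_tuple(id, level, FileType::Formal)) > 0;
    const bool legacy_exists = files.count(std::make_tuple(id, level, FileType::Legacy)) > 0;
    return !formal_exists && !legacy_exists;
}
```

With only a Legacy entry for <1000, 1> present, `canCreateGcFile(files, 1000, 1)` returns false, whereas a check limited to Formal files would have allowed the duplicate.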

Related changes

  • Need to cherry-pick to the release branch: 5.1, 5.0, 4.0

Check List

Tests

  • Unit test

Side effects

Release note

  • Fix the bug that TiFlash cannot GC delta data in rare cases

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot ti-srebot added the CHERRY-PICK, status/LGT1, and type/bugfix labels Jun 17, 2021
@ti-srebot ti-srebot requested a review from flowbehappy June 17, 2021 07:32
@ti-srebot ti-srebot added this to the v4.0.14 milestone Jun 17, 2021

Fix this.

Resolved

@flowbehappy flowbehappy left a comment

LGTM with a comment.

@JaySon-Huang JaySon-Huang changed the title Fix the bug that duplicated page file block GC (#2170) [DNM] Fix the bug that duplicated page file block GC (#2170) Jun 18, 2021
@JaySon-Huang JaySon-Huang changed the title [DNM] Fix the bug that duplicated page file block GC (#2170) Fix the bug that duplicated page file block GC (#2170) Jul 6, 2021
@JaySon-Huang

/run-all-tests

@JaySon-Huang JaySon-Huang merged commit b543ee6 into pingcap:release-4.0 Jul 6, 2021
@JaySon-Huang JaySon-Huang deleted the release-4.0-ec5f976a8fb8 branch July 6, 2021 07:18