Skip to content

Add length check while running PageStorage GC#2394

Merged
ti-srebot merged 1 commit intopingcap:masterfrom
JaySon-Huang:fix_ps_gc_crash
Jul 13, 2021
Merged

Add length check while running PageStorage GC#2394
ti-srebot merged 1 commit intopingcap:masterfrom
JaySon-Huang:fix_ps_gc_crash

Conversation

@JaySon-Huang
Copy link
Contributor

@JaySon-Huang JaySon-Huang commented Jul 13, 2021

Signed-off-by: JaySon-Huang jayson.hjs@gmail.com

What problem does this PR solve?

Issue Number: related to #2393

Problem Summary:
In #2356, we assume that page_files are not empty. However, unit tests randomly crash after that.

  1. A global RegionPersister created with path "./tmp/kvstore" and register a GC task on BackgroundService
  2. Before some unit test of the storage layer run, they would clean up old files on disk. However, we directly remove the whole "./tmp" for convenience in some tests.
  3. The GC task of RegionPersister run, find out that there are no page files on "./tmp/kvstore", and the unit test crash.

What is changed and how it works?

Add check for the size of page_files while running PageStorage::gc

And also we should refine those tests that brutally remove "./tmp", the following fix would be track in #2393

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch:

Check List

Tests

  • Manual test (add detailed scripts or steps below)

Side effects

Release note

  • No release note

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>
@JaySon-Huang
Copy link
Contributor Author

/run-all-tests

Copy link
Contributor

@SchrodingerZhu SchrodingerZhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ti-srebot
Copy link
Collaborator

@SchrodingerZhu, Thanks for your review. The bot only counts LGTMs from Reviewers and higher roles, but you're still welcome to leave your comments. See the corresponding SIG page for more information. Related SIG: tiflash(slack).

@SchrodingerZhu
Copy link
Contributor

/merge

@ti-srebot ti-srebot added the status/can-merge Indicates a PR has been approved by a committer. label Jul 13, 2021
@ti-srebot
Copy link
Collaborator

/run-all-tests

@ti-srebot ti-srebot merged commit b41a041 into pingcap:master Jul 13, 2021
JaySon-Huang added a commit to JaySon-Huang/tiflash that referenced this pull request Jul 14, 2021
JaySon-Huang added a commit to JaySon-Huang/tiflash that referenced this pull request Jul 14, 2021
JaySon-Huang added a commit to JaySon-Huang/tiflash that referenced this pull request Jul 14, 2021
JaySon-Huang added a commit to JaySon-Huang/tiflash that referenced this pull request Jul 21, 2021
flowbehappy pushed a commit that referenced this pull request Jul 21, 2021
* Revert "Revert skipping not consecutive sequence"

This reverts commit 1cea4ef.

* Add length check while running PageStorage GC (#2394)

* PageStorage add PageFileID existence check for multi disk deployment (#2412)

* Add query id for DT read result

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Add failpoint for holding storage streams for long time

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Add async metrics for oldest snapshot lifetime

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Show longest snapshot lifetime on Grafana

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* calculating num of legacy independently

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* reading writable files

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Add failpoint for testing skip non-continuous sequence

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Stop Legacy compactor by the min persisted sequence

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Fix other bugs

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Make DataCompactor move GC forward when there are PageFiles with high valid rate/bytes

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Split PageFile with valid pages and no valid pages

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* fix ut

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* enable fiu under release

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>

* Fix the min persisted sequence

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>
JaySon-Huang added a commit to JaySon-Huang/tiflash that referenced this pull request Aug 4, 2021
JaySon-Huang added a commit to JaySon-Huang/tiflash that referenced this pull request Aug 4, 2021
flowbehappy pushed a commit that referenced this pull request Aug 4, 2021
* Ignore sequence hole among PageFile meta (#2312)

* Fix bug for GC may skip unexpected WriteBatches (#2356)

* Add length check while running PageStorage GC (#2394)

* PageStorage skip non continuous sequence safely (#2435)

* Fix PageStorage GC with high valid rate PageFile (#2436)

* More debug info for DeltaTree (query_id, snapshot lifetime) (#2431)

* Fix deadlock on `removeExpiredSnapshots` (#2461)

* Add grafana panels for write throughput per instance (#2524)
JaySon-Huang pushed a commit that referenced this pull request Aug 5, 2021
* Ignore sequence hole among PageFile meta (#2312)
* Fix bug for GC may skip unexpected WriteBatches (#2356)
* Add length check while running PageStorage GC (#2394)
* PageStorage skip non continuous sequence safely (#2435)
* Fix PageStorage GC with high valid rate PageFile (#2436)

Signed-off-by: JaySon-Huang <jayson.hjs@gmail.com>
@JaySon-Huang JaySon-Huang deleted the fix_ps_gc_crash branch September 29, 2021 09:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status/can-merge Indicates a PR has been approved by a committer.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants