Storage: Fix max-id being mis-reused cause data corruption after changing tiflash replica number (#8698) #8733

Merged
ti-chi-bot[bot] merged 3 commits into pingcap:release-7.5 from ti-chi-bot:cherry-pick-8698-to-release-7.5 on Jan 26, 2024

@ti-chi-bot ti-chi-bot commented Jan 25, 2024

This is a manual cherry-pick of #8701 and #8698.

What problem does this PR solve?

Issue Number: close #8695

Problem Summary: As described in the issue: #8695 (comment)

What is changed and how it works?

Introduce a GlobalPageIdAllocator so that a page_id is never reused within the scope of one physical_table_id. Specifically:

  • A GlobalPageIdAllocator now allocates the page_id used for segment_id, the delta layer, and the stable layer across all IStorage instances. Previously, each StoragePool allocated page_ids independently for its own IStorage instance.
  • When an IStorage instance is restored from disk, it first tries to raise the lower bound of the GlobalPageIdAllocator. After that, a table cannot reuse an already allocated id.
  • If an IStorage instance is physically removed (caused by setting the tiflash replica to 0) and then re-created, it still allocates its page_ids from the GlobalPageIdAllocator, so no page_id is reused in that case either.
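
The idea in the list above can be illustrated with a minimal sketch. This is not the code from this PR: the class name `GlobalPageIdAllocatorSketch`, its method names, and the single shared counter are assumptions made for illustration, and the real allocator may well keep separate counters for different kinds of ids. It only shows the two operations described: raising the lower bound on restore, and handing out ids that are strictly increasing for the lifetime of the process.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using PageId = std::uint64_t;

// Hypothetical illustration only; names and layout are not taken from TiFlash.
class GlobalPageIdAllocatorSketch
{
public:
    // Called when a storage instance is restored from disk: make sure that
    // no id <= restored_max_id can be handed out afterwards.
    void raiseLowerBound(PageId restored_max_id)
    {
        PageId cur = max_id.load();
        // On failure compare_exchange_weak reloads `cur`, so the loop ends
        // once the bound is installed or another thread raised it further.
        while (cur < restored_max_id
               && !max_id.compare_exchange_weak(cur, restored_max_id))
        {
        }
        assert(max_id.load() >= restored_max_id);
    }

    // Returns an id that is strictly greater than every id returned before
    // and greater than every lower bound raised so far.
    PageId newPageId() { return max_id.fetch_add(1) + 1; }

    PageId currentMax() const { return max_id.load(); }

private:
    std::atomic<PageId> max_id{0};
};
```

Because the counter lives outside any single storage instance, dropping a table's storage and creating it again does not reset it: the re-created instance keeps drawing ids above everything allocated before the drop, which is the property the fix relies on.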

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
      • Load tpc-ds 50
      • Set tiflash replica 1 for database tpcds50 and wait for all tables progress == 1
      • Set tiflash replica 0 and wait for gc
      • Set tiflash replica 1 and wait for all tables progress == 1
      • Run queries on tpcds 50 through tiflash
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Fix the issue that TiFlash replica data might be corrupted after setting the TiFlash replica count to 0 and adding the replica back later

@ti-chi-bot ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. type/cherry-pick-for-release-7.5 This PR is cherry-picked to release-7.5 from a source PR. labels Jan 25, 2024
@ti-chi-bot ti-chi-bot added the cherry-pick-approved Cherry pick PR approved by release team. label Jan 25, 2024
@JaySon-Huang

/run-all-tests

@Lloyd-Pottiger Lloyd-Pottiger left a comment

remove all #include <Storages/DeltaMerge/Segment_fwd.h>

@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Jan 25, 2024
@ti-chi-bot ti-chi-bot bot added the lgtm label Jan 25, 2024

ti-chi-bot bot commented Jan 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JinheLin, Lloyd-Pottiger

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [JinheLin,Lloyd-Pottiger]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot removed the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jan 25, 2024

ti-chi-bot bot commented Jan 25, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-01-25 06:22:21.783009886 +0000 UTC m=+1028183.347307603: ☑️ agreed by Lloyd-Pottiger.
  • 2024-01-25 06:45:31.8879204 +0000 UTC m=+1029573.452218105: ☑️ agreed by JinheLin.

@JaySon-Huang

/run-all-tests

2 similar comments

@JaySon-Huang JaySon-Huang force-pushed the cherry-pick-8698-to-release-7.5 branch from ff8e829 to e6fccfe Compare January 26, 2024 01:49
@JaySon-Huang

/run-all-tests

@JaySon-Huang

/run-integration-test

@ti-chi-bot ti-chi-bot bot merged commit 4335916 into pingcap:release-7.5 Jan 26, 2024