Lightning Dedup Introduces Fail-Fast Mechanism #40743

@dsdashun

Description

Enhancement

Currently, when Lightning performs de-dup, it scans the table records and resolves ALL of the duplicates. This is time-consuming when a large number of duplicated rows remain after importing KVs into TiDB from Lightning (for example, when the wrong columns are chosen as unique keys, or when hitting this bug). In these situations, resolving ALL the duplicated records is not sensible.
A large number of duplicated records to resolve indicates a problem with the data itself. Lightning should fail fast and let users check the data, which saves the de-dup time.
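
To make the proposal concrete, here is a minimal sketch of the fail-fast idea, written in Python for brevity. It is not Lightning's actual code: `collect_duplicates`, `TooManyDuplicatesError`, and the `max_duplicates` threshold are hypothetical names, and the in-memory set stands in for whatever index Lightning really uses to detect conflicts. The point is only that the scan aborts once the duplicate count passes a limit, instead of handling every duplicated row.

```python
from typing import Hashable, Iterable, List, Tuple


class TooManyDuplicatesError(Exception):
    """Hypothetical error: the duplicate count passed the configured limit."""


def collect_duplicates(
    records: Iterable[Tuple[Hashable, int]],
    max_duplicates: int,
) -> List[Tuple[Hashable, int]]:
    """Return (unique_key, row_id) pairs whose key was already seen.

    Stops scanning and raises as soon as more than `max_duplicates`
    duplicates have been found, instead of handling all of them.
    """
    seen = set()      # unique keys encountered so far
    duplicates = []   # rows that collide with an earlier key
    for key, row_id in records:
        if key not in seen:
            seen.add(key)
            continue
        duplicates.append((key, row_id))
        if len(duplicates) > max_duplicates:
            # Fail fast: the data itself is probably wrong.
            raise TooManyDuplicatesError(
                f"more than {max_duplicates} duplicated rows found; "
                "check the source data and the unique key definition"
            )
    return duplicates
```

Because `records` is consumed lazily, raising early also means the rest of the table is never scanned, which is where the time saving would come from.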

Metadata

Labels

component/lightning: This issue is related to Lightning of TiDB.
type/enhancement: The issue or PR belongs to an enhancement.
