GC use BatchResolveLocks may miss some pessimistic locks in rare cases #45134

@MyonKeminta

Bug Report

This problem was introduced by the fix for #43243.

When BatchResolveLocks meets a pessimistic lock, it calls resolvePessimisticLock immediately, before calling getTxnStatus, on the assumption that pessimistic locks will be deleted regardless of the state of the transaction they belong to:

https://github.com/tikv/client-go/blob/fbec0230608310ecd18caaea76c0cdbb23cc933a/txnkv/txnlock/lock_resolver.go#L248-L266

However, it was overlooked that resolvePessimisticLock ignores primary locks (more precisely, locks whose primary field points to themselves), on the assumption that CheckTxnStatus must already have been called on them:

https://github.com/tikv/client-go/blob/fbec0230608310ecd18caaea76c0cdbb23cc933a/txnkv/txnlock/lock_resolver.go#L1178-L1181

As a result, in rare cases some pessimistic locks may be left behind after GC.
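The interaction described above can be sketched with a small toy model. This is not the client-go code: the `Lock` and `store` types, the function bodies, and the keys `pk` and `k1` are all hypothetical simplifications, and only the function names are borrowed from the issue. It only illustrates how skipping self-primary locks, combined with resolving pessimistic locks before any status check, can leave a primary pessimistic lock behind.

```go
package main

import (
	"fmt"
	"sort"
)

// Lock is a hypothetical, simplified stand-in for a lock scanned by GC.
type Lock struct {
	Key         string
	Primary     string
	Pessimistic bool
}

// store is a hypothetical in-memory lock table keyed by lock key.
type store map[string]Lock

// resolvePessimisticLock models the early return described in the issue:
// a lock whose primary field points to itself is skipped, on the assumption
// that a status check has already handled it.
func resolvePessimisticLock(s store, l Lock) {
	if l.Key == l.Primary {
		return
	}
	delete(s, l.Key)
}

// batchResolveLocks models the GC path described in the issue: pessimistic
// locks are resolved directly, without a prior transaction status check.
func batchResolveLocks(s store, locks []Lock) {
	for _, l := range locks {
		if l.Pessimistic {
			resolvePessimisticLock(s, l)
			continue
		}
		// Other locks go through the status check and resolve path,
		// modeled here as a plain removal.
		delete(s, l.Key)
	}
}

// runDemo builds a transaction with one primary and one secondary
// pessimistic lock, runs the modeled GC pass, and returns leftover keys.
func runDemo() []string {
	locks := []Lock{
		{Key: "pk", Primary: "pk", Pessimistic: true},
		{Key: "k1", Primary: "pk", Pessimistic: true},
	}
	s := store{}
	for _, l := range locks {
		s[l.Key] = l
	}
	batchResolveLocks(s, locks)

	left := make([]string, 0, len(s))
	for k := range s {
		left = append(left, k)
	}
	sort.Strings(left)
	return left
}

func main() {
	fmt.Println(runDemo()) // prints "[pk]": the primary lock survives
}
```

In this model the secondary lock `k1` is removed, while the primary lock `pk` is skipped by the early return and never revisited.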

The problem usually happens when a node crashes, an RPC fails, etc., so that transactions leave uncleared locks behind; these are not common cases. Once it happens, it may affect CDC lag and stale reads.

    Labels

    affects-6.1: This bug affects the 6.1.x (LTS) versions.
    affects-6.5: This bug affects the 6.5.x (LTS) versions.
    affects-7.1: This bug affects the 7.1.x (LTS) versions.
    affects-7.2
    severity/major
    sig/transaction: SIG:Transaction
    type/bug: The issue is confirmed as a bug.
