TiKV-CDC Memory Quota Reached 512 MB After TiCDC Restart, Changefeed Stuck and Unrecoverable #18169

@wlwilliamx

Bug Report

  1. Our production CDC instance running in Kubernetes was restarted.
  2. After the restart, the memory usage of TiKV-CDC began to increase continuously.
  3. Eventually, the memory quota (512 MB) was fully utilized, causing the Changefeed to hang.
  4. Restarting the CDC instance multiple times did not resolve the issue. The memory quota of TiKV-CDC remained at 512 MB, and the Changefeed remained stuck.

What version of TiKV are you using?

  • 14 TiKV nodes with v7.5.4
  • 2 TiCDC nodes with v7.5.4

What operating system and CPU are you using?

Steps to reproduce

What did you expect?

  • TiKV-CDC should release memory after the restart and operate within the configured memory quota.
  • The Changefeed should resume normal processing after restarting the CDC instance.

What did happen?

  • TiKV-CDC's memory usage continued to increase after the restart and eventually hit the 512 MB quota.
  • The Changefeed was stuck and unable to recover.
  • Restarting the CDC instance multiple times did not free up memory or resolve the issue. The memory quota of TiKV-CDC remained at 512 MB.
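The symptom above (a quota pinned at its ceiling that never drains) can be pictured with a small byte-counting model. This is only an illustrative sketch: `Quota` and `simulate_connection` are hypothetical names, not TiKV's actual types, and the report does not establish that a missed release is the root cause. It simply shows why a quota whose reservations are not returned when a downstream disconnects would reject every later reservation, while one that releases on disconnect recovers.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical byte-counting quota (an assumption for illustration,
/// not TiKV's implementation).
pub struct Quota {
    capacity: usize,
    in_use: AtomicUsize,
}

impl Quota {
    pub fn new(capacity: usize) -> Self {
        Quota { capacity, in_use: AtomicUsize::new(0) }
    }

    /// Reserve `bytes`; returns false if that would exceed the capacity.
    pub fn alloc(&self, bytes: usize) -> bool {
        let mut cur = self.in_use.load(Ordering::Relaxed);
        loop {
            let next = match cur.checked_add(bytes) {
                Some(n) if n <= self.capacity => n,
                _ => return false,
            };
            match self.in_use.compare_exchange_weak(
                cur, next, Ordering::AcqRel, Ordering::Relaxed,
            ) {
                Ok(_) => return true,
                Err(actual) => cur = actual, // lost a race; retry with the fresh value
            }
        }
    }

    /// Return `bytes` to the quota, saturating at zero.
    pub fn free(&self, bytes: usize) {
        let mut cur = self.in_use.load(Ordering::Relaxed);
        loop {
            let next = cur.saturating_sub(bytes);
            match self.in_use.compare_exchange_weak(
                cur, next, Ordering::AcqRel, Ordering::Relaxed,
            ) {
                Ok(_) => return,
                Err(actual) => cur = actual,
            }
        }
    }

    pub fn in_use(&self) -> usize {
        self.in_use.load(Ordering::Relaxed)
    }
}

/// Simulate one downstream connection: reserve up to `events` events of
/// `size` bytes each, then disconnect. If `release_on_disconnect` is false,
/// the reservations are leaked. Returns the number of bytes reserved.
pub fn simulate_connection(
    quota: &Quota,
    events: usize,
    size: usize,
    release_on_disconnect: bool,
) -> usize {
    let mut reserved = 0;
    for _ in 0..events {
        if !quota.alloc(size) {
            break; // quota exhausted; the sender would stall here
        }
        reserved += size;
    }
    if release_on_disconnect {
        quota.free(reserved);
    }
    reserved
}
```

In this model, a 512 MiB quota that goes through one leaking connection stays full, so every reconnect (the analogue of restarting the CDC instance) is refused immediately. With release on disconnect, usage returns to zero and the next connection proceeds.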

Metadata

Labels

  • affects-7.1: This bug affects the 7.1.x (LTS) versions.
  • affects-7.5: This bug affects the 7.5.x (LTS) versions.
  • affects-8.1: This bug affects the 8.1.x (LTS) versions.
  • affects-8.5: This bug affects the 8.5.x (LTS) versions.
  • affects-9.0: This bug affects the 9.0.x versions.
  • component/CDC: Component: Change Data Capture
  • impact/crash: crash/fatal
  • report/customer: Customers have encountered this bug.
  • severity/critical
  • type/bug: The issue is confirmed as a bug.
