Skip to content

DS, eventCollector: drop event if needed#1430

Merged
ti-chi-bot[bot] merged 27 commits intopingcap:masterfrom
asddongmen:drop-event-if-needed
Jun 20, 2025
Merged

DS, eventCollector: drop event if needed#1430
ti-chi-bot[bot] merged 27 commits intopingcap:masterfrom
asddongmen:drop-event-if-needed

Conversation

@asddongmen
Copy link
Collaborator

@asddongmen asddongmen commented Jun 16, 2025

What problem does this PR solve?

Issue Number: ref #736

What is changed and how it works?

This pull request enhances the dynamic stream (dynstream) and event collector components to handle memory pressure more gracefully. It introduces a mechanism to signal dropped events using a new DropEvent type and refactors the memory control logic for better modularity. When the event collector drops certain event types due to memory limits, it now replaces them with a DropEvent and stops processing further events for that specific path.

Highlights

  • Introduce DropEvent: A new event type, DropEvent, is introduced (pkg/common/event/drop_event.go) to explicitly signal when an event (DML, DDL, Handshake) is dropped due to memory pressure in the event collector. This allows downstream components to be aware of the dropped event.
  • Modify OnDrop Interface: The OnDrop method in the dynstream.Handler interface is updated to return interface{} (utils/dynstream/interfaces.go). This enables handlers to return a replacement event (like the new DropEvent) when an event is dropped, rather than just performing an action.
  • Implement Event Dropping in EventCollector: The OnDrop handler for events in the eventcollector (downstreamadapter/eventcollector/helper.go) is updated. If a DML, Handshake, or DDL event is dropped due to memory pressure, it now logs the drop and returns a new DropEvent containing information about the dropped event (DispatcherID, Seq, CommitTs). This DropEvent replaces the original event in the stream.
  • Refactor Memory Control Algorithms: The memory control logic within dynstream is refactored into an interface (MemoryControlAlgorithm) and specific implementations (PullerMemoryControl, EventCollectorMemoryControl) (utils/dynstream/memory_control_algorithm.go). The AreaSettings now uses an integer type to specify the algorithm, and the areaMemStat holds an instance of the interface (utils/dynstream/interfaces.go, utils/dynstream/memory_control.go).
  • Mark Path as Dead on Drop: When an event is dropped and replaced by a DropEvent in the event collector (using the EventCollectorMemoryControl algorithm), the corresponding path in the dynamic stream is marked as 'dead' (utils/dynstream/stream.go, utils/dynstream/memory_control.go). Subsequent events for a dead path are ignored (utils/dynstream/parallel_dynamic_stream.go).

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  1. Create a changefeed sink to another TiDB and pause it.
  2. Insert 20GB of data in the upstream.
  3. Resume the changefeed.
  4. Since the sink speed is 28k rows per second while the puller speed is 200k rows per second, thousands of resets are triggered during the synchronization process. And memory usage of eventCollector's can be strictly controled around memoryQuota.
[root@upstream-ticdc-0 log]# grep "reset dispatcher" ticdc.log | wc -l
4404

image

  1. Compare the data consistency between the upstream and downstream after the synchronization is completed.
  2. The data is consistent.
  3. Repeat the above experiment 10 times, and all pass.

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

None

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Jun 16, 2025
@ti-chi-bot
Copy link

ti-chi-bot bot commented Jun 16, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jun 16, 2025
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @asddongmen, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the dynamic stream (dynstream) and event collector components to handle memory pressure more gracefully. It introduces a mechanism to signal dropped events using a new DropEvent type and refactors the memory control logic for better modularity. When the event collector drops certain event types due to memory limits, it now replaces them with a DropEvent and stops processing further events for that specific path.

Highlights

  • Introduce DropEvent: A new event type, DropEvent, is introduced (pkg/common/event/drop_event.go) to explicitly signal when an event (like DML, DDL, or Handshake) is dropped due to memory pressure in the event collector. This allows downstream components to be aware of the dropped event.
  • Modify OnDrop Interface: The OnDrop method in the dynstream.Handler interface is updated to return interface{} (utils/dynstream/interfaces.go). This enables handlers to return a replacement event (like the new DropEvent) when an event is dropped, rather than just performing an action.
  • Implement Event Dropping in EventCollector: The OnDrop handler for events in the eventcollector (downstreamadapter/eventcollector/helper.go) is updated. If a DML, Handshake, or DDL event is dropped due to memory pressure, it now logs the drop and returns a new DropEvent containing information about the dropped event (DispatcherID, Seq, CommitTs). This DropEvent replaces the original event in the stream.
  • Refactor Memory Control Algorithms: The memory control logic within dynstream is refactored into an interface (MemoryControlAlgorithm) and specific implementations (PullerMemoryControl, EventCollectorMemoryControl) (utils/dynstream/memory_control_algorithm.go). The AreaSettings now uses an integer type to specify the algorithm, and the areaMemStat holds an instance of the interface (utils/dynstream/interfaces.go, utils/dynstream/memory_control.go).
  • Mark Path as Dead on Drop: When an event is dropped and replaced by a DropEvent in the event collector (using the EventCollectorMemoryControl algorithm), the corresponding path in the dynamic stream is marked as 'dead' (utils/dynstream/stream.go, utils/dynstream/memory_control.go). Subsequent events for a dead path are ignored (utils/dynstream/parallel_dynamic_stream.go).
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configureGemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a mechanism to handle event dropping due to memory pressure, primarily within the dynstream and eventcollector components. Key changes include:

  • Refactoring the memory control logic in dynstream by introducing a MemoryControlAlgorithm interface and specific implementations (PullerMemoryControl, EventCollectorMemoryControl). This improves modularity.
  • Modifying the Handler.OnDrop method signature to return an interface{}, allowing handlers to provide a replacement event (e.g., a DropEvent) when an event is dropped.
  • Introducing a new pkg/common/event.DropEvent type to represent events that have been dropped.
  • Updating the eventcollector to utilize this new OnDrop mechanism. When memory limits are exceeded for the EventCollectorMemoryControl algorithm, DML, DDL, or Handshake events are replaced with DropEvents, and the corresponding path is marked as dead.
  • Adding String() methods to DDL and DML event types for better debuggability.

Overall, the changes seem well-structured to address the problem of event dropping. However, there are a couple of significant concerns regarding a new log.Panic and the deserialization logic for DropEvent that need to be addressed. Several minor points are also noted.

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/needs-linked-issue labels Jun 17, 2025
@asddongmen
Copy link
Collaborator Author

/test all

@asddongmen
Copy link
Collaborator Author

/test all

@asddongmen
Copy link
Collaborator Author

/test all

@asddongmen
Copy link
Collaborator Author

/test all

@asddongmen
Copy link
Collaborator Author

/test all

@asddongmen
Copy link
Collaborator Author

/test all

@asddongmen
Copy link
Collaborator Author

/test all

@asddongmen
Copy link
Collaborator Author

/test all

@asddongmen asddongmen marked this pull request as ready for review June 20, 2025 03:58
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 20, 2025
@asddongmen asddongmen self-assigned this Jun 20, 2025
@ti-chi-bot
Copy link

ti-chi-bot bot commented Jun 20, 2025

@asddongmen: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cdc-pulsar-integration-light 5be94f2 link false /test pull-cdc-pulsar-integration-light

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ti-chi-bot
Copy link

ti-chi-bot bot commented Jun 20, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hongyunyan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the lgtm label Jun 20, 2025
@ti-chi-bot
Copy link

ti-chi-bot bot commented Jun 20, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-06-20 04:19:37.730902133 +0000 UTC m=+418230.454081112: ☑️ agreed by hongyunyan.

@ti-chi-bot ti-chi-bot bot added the approved label Jun 20, 2025
@asddongmen
Copy link
Collaborator Author

/test pull-cdc-mysql-integration-heavy

@ti-chi-bot ti-chi-bot bot merged commit d6dfccb into pingcap:master Jun 20, 2025
16 of 17 checks passed
@asddongmen asddongmen mentioned this pull request Jun 17, 2025
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants