changefeedccl: bug in poller causes it to emit a resolved timestamp before emitting row updates at that timestamp. #41415

@aayushshah15

Changefeeds support watching tables that undergo a schema change with a backfill.

Currently, the poller (see poller.go) keeps track of the most recent table descriptor version for every table it watches. If it detects a schema change, it triggers a full table scan at the ModificationTime of the given table descriptor. This means that the row updates produced by the backfill are emitted at this ModificationTime timestamp.

The issue with this approach is that this ModificationTime timestamp is often (always, in my testing) equal to the last resolved timestamp that the poller emitted. So what ends up happening is that the changeAggregator forwards its local frontier to this resolved timestamp, and only then does the changeFrontier see all the backfill-related row updates at that same timestamp. This is a significant consumption bug, because a client is free to ignore any row update emitted at or below the latest globally resolved timestamp. This bug affects all sinks.

Labels: A-cdc (Change Data Capture)