changefeedccl: periodic pts record updates#76605
changefeedccl: periodic pts record updates#76605craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
0f87fba to
519906f
Compare
miretskiy
left a comment
There was a problem hiding this comment.
This is very nice.
Reviewed 1 of 5 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @miretskiy and @samiskin)
-- commits, line 14 at r1:
This definitely needs a release note. It's an internal change, and probably nobody cares about this
internal implementation detail; but it's a release note worty.
At the very least mentioning that changefeeds can run reliably on tables w/ low ttl.
Definitely need to mention the setting to control the frequency of such updates because this frequency effectively
overrides gcttl set on a table.
pkg/ccl/changefeedccl/changefeed.go, line 46 at r1 (raw file):
} func shouldProtectTimestamps(codec keys.SQLCodec) bool {
pls add a todo to remove this restriction once tenant based pts is in...
pkg/ccl/changefeedccl/changefeed_processors.go, line 1172 at r1 (raw file):
} if err := cf.flowCtx.Stopper().RunAsyncTask(ctx, "changefeed-pts-manager", func(ctx context.Context) {
not sure if we need this. Why not do this while trying to checkpoint job progress?
b60c090 to
b12ddd9
Compare
samiskin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @miretskiy)
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
This definitely needs a release note. It's an internal change, and probably nobody cares about this
internal implementation detail; but it's a release note worty.
At the very least mentioning that changefeeds can run reliably on tables w/ low ttl.
Definitely need to mention the setting to control the frequency of such updates because this frequency effectively
overrides gcttl set on a table.
Done.
pkg/ccl/changefeedccl/changefeed.go, line 46 at r1 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
pls add a todo to remove this restriction once tenant based pts is in...
Done.
pkg/ccl/changefeedccl/changefeed_processors.go, line 1172 at r1 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
not sure if we need this. Why not do this while trying to checkpoint job progress?
I just had it this way since you mentioned it being a separate loop on a timer initially. I wasn't sure whether it'd be considered more or less complex to either have a piece of functionality be more distinct from other functionality versus the timing of events being more deterministic where recurring events happen in lockstep with resolved events.
I do like that keeping it with checkpoints keeps table changes in one place and having it there lets it share the "min checkpoint frequency" option. I made the changes to move it to resolved 👍.
b12ddd9 to
acd1cc2
Compare
miretskiy
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @miretskiy and @samiskin)
-- commits, line 16 at r3:
i don't think it's a bug fix -- it's an enterprise change.
acd1cc2 to
0b5e2b0
Compare
Previously changefeeds only laid down protected timestamp records to protect against either an ongoing backfill or the changefeed lagging behind. This is insufficient in cases such as if the gcttl is very short, recurring errors retry the changefeed for too long, or in upcoming work to enable serverless to shut down idle changefeeds. This PR removes the manual PTS protection on backfills and begins an async routine on the changeFrontier that updates the protected timestamp record to the current highwater mark. Fixes cockroachdb#76247 Release note (enterprise change): changefeeds running on tables with a low gcttl will function more reliably due to protected timestamps being maintained for the changefeed targets at the resolved timestamp of the changefeed. The frequency at which the protected timestamp is updated to the resolved timestamp can be configured through the `changefeed.protect_timestamp_interval` cluster setting. If the changefeed lags too far behind such that storage of old data becomes an issue, cancelling the changefeed will release the protected timestamps and allow garbage collection to resume. If `protect_data_from_gc_on_pause` is unset, pausing the changefeed will release the existing protected timestamp record.
0b5e2b0 to
54203d6
Compare
|
bors r+ |
|
Build failed (retrying...): |
|
Build succeeded: |
| } | ||
|
|
||
| func shouldProtectTimestamps(codec keys.SQLCodec) bool { | ||
| // TODO(smiskin): Remove this restriction once tenant based pts are enabled |
There was a problem hiding this comment.
Can we TODO this @samiskin @adityamaru, given #73727 (comment) ?
Previously changefeeds only laid down protected timestamp records to
protect against either an ongoing backfill or the changefeed lagging
behind. This is insufficient in cases such as if the gcttl is very
short, recurring errors retry the changefeed for too long, or in
upcoming work to enable serverless to shut down idle changefeeds.
This PR removes the manual PTS protection on backfills and begins an
async routine on the changeFrontier that updates the protected timestamp
record to the current highwater mark.
Fixes #76247
Release note (enterprise change): changefeeds running on tables with a
low gcttl will function more reliably due to protected timestamps being
maintained for the changefeed targets at the resolved timestamp of the
changefeed. The frequency at which the protected timestamp is updated
to the resolved timestamp can be configured through the
changefeed.protect_timestamp_intervalcluster setting. If thechangefeed lags too far behind such that storage of old data becomes an
issue, cancelling the changefeed will release the protected timestamps
and allow garbage collection to resume. If
protect_data_from_gc_on_pauseis unset, pausing the changefeed willrelease the existing protected timestamp record.