storage: hook closed timestamps into rangefeed#28974
storage: hook closed timestamps into rangefeed#28974craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
|
Heads up: I saw this and will ignore it until you ask for a review. |
de71e81 to
04a64fa
Compare
|
@tschottdorf still not ready to review, but I'd like to understand the epoch-based lease restriction with closed timestamps a little further. The more I work with this, the more I'd like to remove 137a1b0 as a hard requirement for rangefeeds because it makes things significantly more difficult to test and to use. Ranges will often swap from expiration-based leases to epoch-based leases and it's surprisingly tricky to ensure that we're never establishing a rangefeed to a range with the first type of lease. |
|
@nvanbenschoten I think it's fair to say that if the CT updates are used correctly, then there is no danger in trying to use them on an expiration-based range. What I mean by that is that "correct" usage of CT information requires taking into account the Epoch of the lease. An expiration-based lease has Epoch zero (or rather, an expiration-base lease should be treated as one with Epoch zero). Closed timestamp updates are never emitted for Epoch zero, and so the result of this is that no read request will ever come to the conclusion that it's OK to be served from a follower, at least not until there's a new, Epoch-based, lease. |
So then it's also the case that the In that case, what do you think about simply rejecting rangefeeds on ranges where |
Yep, that's what should happen :-)
That should work fine, if you want to make sure that "hopeless" RangeFeeds never even get created. I do like the explicit validation that RangeFeeds on expiration-based ranges don't do anything wrong, but even if you let the system create them you'd probably add a trigger that terminates them so quickly that things have very little chance to go wrong in the first place - so, what you're proposing seems better. That error would bubble up all the way to the ChangeFeed, right? |
nvb
left a comment
There was a problem hiding this comment.
That error would bubble up all the way to the ChangeFeed, right?
Yes, absolutely. It wouldn't be a RangeFeedRetryError.
Thanks for the confirmation, I'll make that change then clean up this PR so that it's ready for a review.
Reviewable status:
complete! 0 of 0 LGTMs obtained
04a64fa to
5b948ea
Compare
|
@tschottdorf the last two commits should be ready for review now. The first three are still #28912 because merging anything is just about impossible today. |
5b948ea to
c0a3d95
Compare
Ranges with expiration-based leases will not function properly with closed timestamps, which means that we can't establish a resolved timestamp on ranges that use them. Because of this, we disallow rangefeeds on these ranges entirely. This restriction may be lifted in the future. Release note: None
This change hooks up closed timestamps into rangefeed. It does so by creating a new closed timestamp subscription and notifying all ranges with active rangefeed when new closed timestamp entries arrive. This triggers each rangefeed to check for the maximum closed timestamp on its range and use this to attempt to move its resolved timestamp forward. Eventually this will need some testing tied in with the closedts package. That will have to wait until the closedts package is set up to be more easily used in tests from the storage package (right now `NoopContainer`s are used in tests everywhere). For now, we can wait for integration testing with cdc. Release note: None
c0a3d95 to
7601470
Compare
|
Rebased of master now that #28912 merged. |
|
With this change and a small DistSender fix I'll send out soon I was able to get
|
|
Awesome! 🙌 I looked into the tpcc-1000 roachtest failure yesterday and it looks like we're being really slow about dumping into kafka for some reason, which is backing up writing to the buffer pretty considerably. I tried some retunings of sarama and pprof-ing the roachtest and a local benchmark. So for it's been inconclusive. Anyway, I bet this is contributing to the release valve issues you're seeing. We definitely need to address the buffer TODO but even if we did that, I think the sink throughput is currently bad enough that'd we'd still have some problems. |
|
Also @tschottdorf, I was unsure of whether I need to call |
|
With this change and #29219, all tests in |
|
TFTRs! bors r+ |
28974: storage: hook closed timestamps into rangefeed r=nvanbenschoten a=nvanbenschoten Closes #28912. This change hooks up closed timestamps into rangefeed. It does so by creating a new closed timestamp subscription and notifying all ranges with active rangefeed when new closed timestamp entries arrive. This triggers each rangefeed to check for the maximum closed timestamp on its range and use this to attempt to move its resolved timestamp forward. Eventually this will need some testing tied in with the closedts package. That will have to wait until the closedts package is set up to be more easily used in tests from the storage package (right now `NoopContainer`s are used in tests everywhere). For now, we can wait for integration testing with cdc. With this change and the changes it depends on, rangefeeds are sufficiently feature-complete to be used for changefeeds. For instance, with a few small tweaks, we pass 6 changefeed unit tests in `./pkg/ccl/changefeedccl` with `kv.rangefeed.enabled` set to true. Release note: None Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Build succeeded |
29500: release-2.1: backport 7 rangefeed PRs r=nvanbenschoten a=nvanbenschoten Backport: * 5/5 commits from "kv: give Transport a haircut" (#28855) * 1/1 commits from "engine: use Txn.Timestamp instead of OrigTimestamp for LogLogicalOp" (#28970) * 1/1 commits from "rangefeed: add release valve to logical op consumption" (#29076) * 3/3 commits from "kv: teach DistSender about RangeFeeds, use for changefeeds" (#28912) * 1/1 commits from "rangefeed: small perf-related changes" (#29134) * 2/2 commits from "kv: truncate RangeFeed span to range descriptor " (#29219) * 2/2 commits from "storage: hook closed timestamps into rangefeed" (#28974) Please see individual PRs for details. All cherry-picks were clean. /cc @cockroachdb/release Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Closes #28912.
This change hooks up closed timestamps into rangefeed. It does
so by creating a new closed timestamp subscription and notifying
all ranges with active rangefeed when new closed timestamp entries
arrive. This triggers each rangefeed to check for the maximum closed
timestamp on its range and use this to attempt to move its resolved
timestamp forward.
Eventually this will need some testing tied in with the closedts
package. That will have to wait until the closedts package is set
up to be more easily used in tests from the storage package (right
now
NoopContainers are used in tests everywhere). For now, we canwait for integration testing with cdc.
With this change and the changes it depends on, rangefeeds are sufficiently
feature-complete to be used for changefeeds. For instance, with a few small
tweaks, we pass 6 changefeed unit tests in
./pkg/ccl/changefeedcclwithkv.rangefeed.enabledset to true.Release note: None