kvserver: default kv.rangefeed.catchup_scan_iterator_optimization.enabled to true#73473
Conversation
erikgrinaker
left a comment
There was a problem hiding this comment.
This switches the default value to true. Further, I've lowered the
probability that this boolean is flipped in the metamorphic builds
such that the default case gets test more often.
Since the default case is also implicitly tested in lots of other integration/end-to-end tests, does it really make sense to reduce the probability of non-standard settings in the few tests that actually exercise them?
…bled to true We've been testing this at random for some time and, as far as I know, have not seen any CDC failures caused by this optimization. This switches the default value to true. Further, I've lowered the probability that this boolean is flipped in the metamorphic builds such that the default case gets test more often. Release note (ops change): The default value of the kv.rangefeed.catchup_scan_iterator_optimization.enabled cluster setting is now true.
c325f1d to
e8080a9
Compare
|
bors r=erikgrinaker |
|
Build succeeded: |
|
I like the idea of defaulting this value to true in a v21.2 patch. Reason being upgrades from v21.1 → v21.2 in the presence of changefeeds. The IOPS spike that comes from rolling upgrade + node cycling + catchup scans can be very disruptive across the cluster, and discourage upgrades. If the value is defaulted to true, I expect these disruptions to be reduced. Mechanically, it would just be a backport of this PR, I suppose. Thoughts? @stevendanna @erikgrinaker |
|
And while we’re at it, ditto for |
|
The backport policy doesn't explicitly mention changes to default settings, so it's a bit of a grey area. I think we've gained enough confidence in separated intents that it's safe to enable I'm less confident in |
|
PS: we'd have to make sure these new default settings respect the relevant version gates, if they don't already, to avoid breakage in mixed-version 21.1/21.2 clusters. |
|
I think we should definitely change the default of
We've had it default true on master for some time and rangefeeds have seen increased usage on master over this release cycle. Looking at the git history, I haven't seen any correctness changes to mvcc_incremental_iterator in the meantime either. Both will be shipping by default in 22.1 and at this point we've have these settings turned on in our largest installations. So, while
We respect a cluster setting for the intent scanner. I don't think there are any cluster settings we need to worry about for catchup-scans, but if I'm wrong there let me know. cockroach/pkg/kv/kvserver/replica_rangefeed.go Lines 396 to 397 in 385f742 |
We've been testing this at random for some time and, as far as I know,
have not seen any CDC failures caused by this optimization.
This switches the default value to true. Further, I've lowered the
probability that this boolean is flipped in the metamorphic builds
such that the default case gets test more often.
Release note (ops change): The default value of the
kv.rangefeed.catchup_scan_iterator_optimization.enabled cluster
setting is now true.