kvserver: sync cluster version setting to store #55240
craig[bot] merged 1 commit into cockroachdb:master
Conversation
petermattis
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @tbg)
pkg/kv/kvserver/store.go, line 2784 at r1 (raw file):
```go
	return err
}
return eng.Flush() // make it durable
```
This is a big hammer as it causes the memtable to be flushed. I think it is better to change c-deps/libroach/engine.cc:DBImpl::{Put,Merge,Delete,SingleDelete,DeleteRange} to specify options.sync = true.
Or create a batch here and commit it with Commit(true). But my preference is doing the fix in libroach.
I'm separately trying to determine how many places we're using a raw DB for writes.
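The two alternatives above (sync at the engine write level vs. a batch committed with `Commit(true)`) can be sketched with a toy engine. Note that `toyEngine`, `toyBatch`, and the `synced` flag are hypothetical stand-ins for illustration, not the real `storage.Engine` or libroach APIs:

```go
package main

import "fmt"

// toyEngine is a hypothetical stand-in for a storage engine with a WAL.
type toyEngine struct {
	data   map[string]string
	synced bool // whether the last write was made durable (WAL fsync'ed)
}

func newToyEngine() *toyEngine { return &toyEngine{data: map[string]string{}} }

// Put is a direct mutation; sync mirrors RocksDB's WriteOptions.sync.
func (e *toyEngine) Put(key, value string, sync bool) {
	e.data[key] = value
	e.synced = sync
}

// toyBatch buffers writes and applies them atomically on Commit.
type toyBatch struct {
	eng *toyEngine
	ops map[string]string
}

func (e *toyEngine) NewBatch() *toyBatch {
	return &toyBatch{eng: e, ops: map[string]string{}}
}

func (b *toyBatch) Put(key, value string) { b.ops[key] = value }

// Commit applies the buffered writes; sync=true corresponds to the
// Commit(true) suggestion: the WAL is fsync'ed before Commit returns.
func (b *toyBatch) Commit(sync bool) error {
	for k, v := range b.ops {
		b.eng.data[k] = v
	}
	b.eng.synced = sync
	return nil
}

func main() {
	eng := newToyEngine()
	// Option 1: make the direct write itself sync (the libroach-level fix).
	eng.Put("cluster-version", "v20.2", true)
	// Option 2: wrap the write in a batch and commit with sync=true.
	b := eng.NewBatch()
	b.Put("bootstrap-info", "...")
	if err := b.Commit(true); err != nil {
		panic(err)
	}
	fmt.Println(eng.synced) // prints: true
}
```

Either path leaves the write durable; the difference is whether durability is decided at the engine's write options or at the batch commit site.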
tbg
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @petermattis)
pkg/kv/kvserver/store.go, line 2784 at r1 (raw file):
Previously, petermattis (Peter Mattis) wrote…
This is a big hammer as it causes the memtable to be flushed. I think it is better to change `c-deps/libroach/engine.cc:DBImpl::{Put,Merge,Delete,SingleDelete,DeleteRange}` to specify `options.sync = true`. Or create a batch here and commit it with `Commit(true)`. But my preference is doing the fix in libroach. I'm separately trying to determine how many places we're using a raw DB for writes.
That seems like a bigger semantic change, one which I in principle support, but do we want that to be the thing we backport to all branches? I would rather go with the minimal PR here (i.e. NewBatch) and flip the default separately.
petermattis
left a comment
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @irfansharif, @petermattis, and @tbg)
pkg/kv/kvserver/store.go, line 2784 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
That seems like a bigger semantic change, one which I in principle support, but do we want that to be the thing we backport to all branches? I would rather go with the minimal PR here (i.e. `NewBatch`) and flip the default separately.
It is a bigger change, though note that storage.Pebble is doing sync mutations for these operations. So the semantic change has already been made for 20.2 (and 20.1 clusters running Pebble). I'm good with this change, just want to gently push you to make the more central fix.
I just saw the result of your investigation on Slack, so SGTM about the central fix. Will ping when updated.
tbg
left a comment
Updated.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif and @tbg)
petermattis
left a comment
I hope my confidence here doesn't come back to bite me.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @irfansharif and @tbg)
Writes to a RocksDB `storage.Engine` were not sync'ed by default, meaning that they could get lost due to an ill-timed crash. They are now, matching Pebble's behavior. This affects only WriteClusterVersion, updateBootstrapInfoLocked, WriteLastUpTimestamp, and Compactor.Suggest, none of which are performance sensitive.

Fixes cockroachdb#54906. (The backport will take care of cockroachdb#54908).

Release note (bug fix): a rare scenario in which a node would refuse to start after updating the binary was fixed. The log message would indicate: "store [...], last used with cockroach version [...], is too old for running version [...] (which requires data from [...] or later)".
bors r=petermattis

Build succeeded
Currently, doing direct mutations on a RocksDB instance bypasses custom batching / syncing logic that we've built on top of it. This, or something internal to RocksDB, started leading to some bugs when all direct mutations started passing in WriteOptions.sync = true (see cockroachdb#55240 for when this change went in). In this change, direct mutations still commit the batch with sync=true to guarantee WAL syncing, but they go through the batch commit pipeline too, just like the vast majority of operations already do. Fixes cockroachdb#55362. Release note: None.
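The fix described above — keeping sync=true for durability while routing each direct mutation through the shared batch commit pipeline — can be sketched as follows. The `engine` and `batch` types here are hypothetical simplifications for illustration, not the actual `storage` package:

```go
package main

import "fmt"

// engine is a hypothetical simplified engine whose "direct" mutations are
// rewritten to go through the batch commit pipeline.
type engine struct {
	data     map[string]string
	commits  int  // number of commits that went through the pipeline
	lastSync bool // whether the last commit synced the WAL
}

func newEngine() *engine { return &engine{data: map[string]string{}} }

type batch struct {
	eng *engine
	ops map[string]string
}

func (e *engine) newBatch() *batch { return &batch{eng: e, ops: map[string]string{}} }

func (b *batch) put(k, v string) { b.ops[k] = v }

// commit is the single code path all writes funnel through.
func (b *batch) commit(sync bool) error {
	for k, v := range b.ops {
		b.eng.data[k] = v
	}
	b.eng.commits++
	b.eng.lastSync = sync
	return nil
}

// Put is a "direct" mutation: instead of writing to the engine immediately
// with WriteOptions.sync = true, it wraps the write in a one-off batch and
// commits it with sync=true, so durability is preserved but the write still
// exercises the shared commit path like every other operation.
func (e *engine) Put(k, v string) error {
	b := e.newBatch()
	b.put(k, v)
	return b.commit(true)
}

func main() {
	e := newEngine()
	if err := e.Put("last-up-timestamp", "now"); err != nil {
		panic(err)
	}
	fmt.Println(e.data["last-up-timestamp"], e.commits, e.lastSync) // prints: now 1 true
}
```

The point of the design is uniformity: once every write goes through one commit path, batching and syncing logic only has to be correct in one place.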
55708: storage: Use batches for direct RocksDB mutations r=itsbilal a=itsbilal

Currently, doing direct mutations on a RocksDB instance bypasses custom batching / syncing logic that we've built on top of it. This, or something internal to RocksDB, started leading to some bugs when all direct mutations started passing in WriteOptions.sync = true (see #55240 for when this change went in). In this change, direct mutations still commit the batch with sync=true to guarantee WAL syncing, but they go through the batch commit pipeline too, just like the vast majority of operations already do. Fixes #55362. Release note: None.

55734: server: skip uninit'ed engines in tombstone storage r=irfansharif a=tbg

Empty (i.e. uninitialized) engines could receive tombstones before being initialized. Initialization checks that the engine is empty (save for the cluster version) and thus failed. Simply don't write tombstones to uninitialized engines, which is fine since by the time the callbacks fire, at least one is initialized anyway, and besides, this mechanism is best effort. The alternatives would have been to either allow tombstones to be present on an engine that is being bootstrapped, or to give the storage the option to defer writing to the engine until it's bootstrapped. Neither seemed worth the extra work. Fixes #55576. Release note: None

55739: opt: fix normalization of st_distance when use_spheroid parameter used r=rytaft a=rytaft

This commit fixes the normalization rule that converts `st_distance` to `st_dwithin` or `st_dwithinexclusive`, which was broken in the case when the `use_spheroid` parameter was used. Prior to this commit, the rule was assigning the `use_spheroid` parameter as the 3rd parameter to `st_dwithin` or `st_dwithinexclusive` and the `distance` parameter as the 4th, but that order does not match the function signatures. This commit fixes the issue by assigning `distance` as the 3rd parameter and `use_spheroid` as the 4th if it exists.

Fixes #55675

Release note (bug fix): Fixed an internal error that could occur during query planning when the use_spheroid parameter was used in the ST_Distance function as part of a filter predicate. For example, `SELECT ... WHERE ST_Distance(geog1, geog2, false) < 10` previously caused an error. This has now been fixed.

Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Co-authored-by: Rebecca Taft <becca@cockroachlabs.com>