engine,storage: replace read-only batches with type safety#26916
engine,storage: replace read-only batches with type safety#26916benesch wants to merge 1 commit intocockroachdb:masterfrom
Conversation
|
Review status: pkg/storage/replica_command.go, line 102 at r1 (raw file):
While I generally like the approach you've taken here, just pointing out this drawback. Creating a Comments from Reviewable |
|
Reviewed 41 of 41 files at r1. pkg/storage/replica.go, line 2602 at r1 (raw file):
A ReadWriter in executeReadOnlyBatch? Comments from Reviewable |
I've become convinced that read-only batches, as created by
Engine.NewReadOnly, are the wrong abstraction. They conflate two
unrelated goals:
1. Ensure read-only commands do not accidentally write data.
2. Improve performance of code paths that can reuse iterators by
caching the iterator object returned by NewIterator.
The fact that Engine.NewReadOnly must return a ReadWriter, instead of a
Reader, to satisfy both of these goals is evidence that these two goals
should be addressed separately.
To better accomplish the first goal, we can leverage the Go type system.
This commit adjusts command registration so that commands declare up
front whether they are read-only or mutating. Read-only commands take an
engine.Reader, while mutating commands take an engine.ReadWriter. This
prevents accidental writes at compile time, whereas the read-only batch
could only check for misuse at runtime.
To accomplish the second goal, this commit replaces read-only batches
with an engine wrapper, rocksDBIterCacher, that does nothing but cache
iterators. This wrapper might also prove useful to cache iterators in
code paths that wish to perform mutations but do not need a full-fledged
batch.
Release note: None
14cca72 to
ed138af
Compare
|
Afraid I'm not going to have time to resolve the remaining problems with this PR. |
This PR revives cockroachdb#26916 (copying over the commit message from there). It's not strictly necessary, but is a quality of life improvement. There are also miscellaneous cleanups around TeeEngine included therein. Read-only batches, as created by engine.NewReadOnly, are the wrong abstraction. They conflate two unrelated goals: - Ensure read-only commands do not accidentally write data. - Improve performance of code paths that can reuse iterators by caching the iterator object returned by NewIterator. The fact that engine.NewReadOnly must return a ReadWriter, instead of a Reader, to satisfy both of these goals is evidence that these two goals should be addressed separately. To better accomplish the first goal, we can leverage the Go type system. This commit adjusts command registration so that commands declare up front whether they are read-only or mutating. Read-only commands take an engine.Reader, while mutating commands take an engine.ReadWriter. This prevents accidental writes at compile time, whereas the read-only batch could only check for misuse at runtime. To accomplish the second goal, this commit replaces read-only batches with an engine wrapper, rocksDBIterCacher, that does nothing but cache iterators. This wrapper might also prove useful to cache iterators in code paths that wish to perform mutations but do not need a full-fledged batch. NB: There's still the unaddressed wart that is replica read code paths operating off an engine.ReadWriter. Changing this out would require a refactor of replica evaluation code paths, which I'm not doing now. Release note: None
This PR revives cockroachdb#26916 (copying over the commit message from there). It's not strictly necessary, but is a quality of life improvement. There are also miscellaneous cleanups around TeeEngine included therein. Read-only batches, as created by engine.NewReadOnly, are the wrong abstraction. They conflate two unrelated goals: - Ensure read-only commands do not accidentally write data. - Improve performance of code paths that can reuse iterators by caching the iterator object returned by NewIterator. The fact that engine.NewReadOnly must return a ReadWriter, instead of a Reader, to satisfy both of these goals is evidence that these two goals should be addressed separately. To better accomplish the first goal, we can leverage the Go type system. This commit adjusts command registration so that commands declare up front whether they are read-only or mutating. Read-only commands take an engine.Reader, while mutating commands take an engine.ReadWriter. This prevents accidental writes at compile time, whereas the read-only batch could only check for misuse at runtime. To accomplish the second goal, this commit replaces read-only batches with an engine wrapper, rocksDBIterCacher, that does nothing but cache iterators. This wrapper might also prove useful to cache iterators in code paths that wish to perform mutations but do not need a full-fledged batch. NB: There's still the unaddressed wart that is replica read code paths operating off an engine.ReadWriter. Changing this out would require a refactor of replica evaluation code paths, which I'm not doing now. Release note: None
This PR reworks parts of cockroachdb#26916. It's not strictly necessary, but is a quality of life improvement. There are also miscellaneous cleanups around TeeEngine, RocksDB benchmarks, etc. Previous to this we relied on engine.NewReadOnly's runtime panics when receiving writes to ensure that read-only commands do not accidentally write data. By changing the command registration process to have commands declare upfront whether they are read-only or read-write, we can plumb in engine.Reader or engine.ReadWriter respectively. It's nice rely on the type system to assert these guarantees. NB: There's still the unaddressed wart that is replica read code paths operating off an engine.ReadWriter. Changing this out would require a refactor of replica evaluation code paths, which I'm not doing now. This PR however paves the way for that. Release note: None
This PR reworks parts of cockroachdb#26916. It's not strictly necessary, but is a quality of life improvement. There are also miscellaneous cleanups around TeeEngine, RocksDB benchmarks, etc. Previous to this we relied on engine.NewReadOnly's runtime panics when receiving writes to ensure that read-only commands do not accidentally write data. By changing the command registration process to have commands declare upfront whether they are read-only or read-write, we can plumb in engine.Reader or engine.ReadWriter respectively. It's nice rely on the type system to assert these guarantees. NB: There's still the unaddressed wart that is replica read code paths operating off an engine.ReadWriter. Changing this out would require a refactor of replica evaluation code paths, which I'm not doing now. This PR however paves the way for that. Release note: None
This PR reworks parts of cockroachdb#26916. It's not strictly necessary, but is a quality of life improvement. There are also miscellaneous cleanups around TeeEngine, RocksDB benchmarks, etc. Previous to this we relied on engine.NewReadOnly's runtime panics when receiving writes to ensure that read-only commands do not accidentally write data. By changing the command registration process to have commands declare upfront whether they are read-only or read-write, we can plumb in engine.Reader or engine.ReadWriter respectively. It's nice rely on the type system to assert these guarantees. NB: There's still the unaddressed wart that is replica read code paths operating off an engine.ReadWriter. Changing this out would require a refactor of replica evaluation code paths, which I'm not doing now. This PR however paves the way for that. Release note: None
43764: storage: change command registration to declare if ro/rw r=irfansharif a=irfansharif This PR reworks parts of #26916. It's not strictly necessary, but is a quality of life improvement. There are also miscellaneous cleanups around TeeEngine, RocksDB benchmarks, etc. Previous to this we relied on engine.NewReadOnly's runtime panics when receiving writes to ensure that read-only commands do not accidentally write data. By changing the command registration process to have commands declare upfront whether they are read-only or read-write, we can plumb in engine.Reader or engine.ReadWriter respectively. It's nice rely on the type system to assert these guarantees. NB: There's still the unaddressed wart that is replica read code paths operating off an engine.ReadWriter. Changing this out would require a refactor of replica evaluation code paths, which I'm not doing now. This PR however paves the way for that. Release note: None 43779: pgwire/hba: fix the HBA parser r=knz a=knz Previously, the HBA rule parser would not properly support the pg split format for separate address and netmask. When the 4th column of a rule in the setting `server.host_based_authentication.configuration` is an IP address, the 5th column is expected to be an IP netmask. Prior to this patch, the HBA parser would not recognize the netmask syntax and instead expect a simple number of bits set (which is not valid pg syntax). This patch fixes this and adds the corresponding tests. Additionally, this patch replaces the HBA rule parser by one inspired from pg's own implementation, so as to ease future maintainance. Release note (bug fix): When the 4th column of a rule in the setting `server.host_based_authentication.configuration` is an IP address without a mask length (e.g. `1.2.3.4` instead of `1.2.0.0/16`), CockroachDB now properly interprets the 5th column as an IP netmask, as per https://www.postgresql.org/docs/current/auth-pg-hba-conf.html. Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com> Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Please let me know what you think of this approach! I'd like to get buy-in before I spend any more time on this. I'm expecting a few lint failures but tests in storage and storage/engine passed.
I've become convinced that read-only batches, as created by
Engine.NewReadOnly, are the wrong abstraction. They conflate two
unrelated goals:
caching the iterator object returned by NewIterator.
The fact that Engine.NewReadOnly must return a ReadWriter, instead of a
Reader, to satisfy both of these goals is evidence that these two goals
should be addressed separately.
To better accomplish the first goal, we can leverage the Go type system.
This commit adjusts command registration so that commands declare up
front whether they are read-only or mutating. Read-only commands take an
engine.Reader, while mutating commands take an engine.ReadWriter. This
prevents accidental writes at compile time, whereas the read-only batch
could only check for misuse at runtime.
To accomplish the second goal, this commit replaces read-only batches
with an engine wrapper, rocksDBIterCacher, that does nothing but cache
iterators. This wrapper might also prove useful to cache iterators in
code paths that wish to perform mutations but do not need a full-fledged
batch.
Release note: None