Skip to content

engine,storage: replace read-only batches with type safety#26916

Closed
benesch wants to merge 1 commit intocockroachdb:masterfrom
benesch:no-read-only
Closed

engine,storage: replace read-only batches with type safety#26916
benesch wants to merge 1 commit intocockroachdb:masterfrom
benesch:no-read-only

Conversation

@benesch
Copy link
Copy Markdown
Contributor

@benesch benesch commented Jun 22, 2018

Please let me know what you think of this approach! I'd like to get buy-in before I spend any more time on this. I'm expecting a few lint failures but tests in storage and storage/engine passed.


I've become convinced that read-only batches, as created by
Engine.NewReadOnly, are the wrong abstraction. They conflate two
unrelated goals:

  1. Ensure read-only commands do not accidentally write data.
  2. Improve performance of code paths that can reuse iterators by
    caching the iterator object returned by NewIterator.

The fact that Engine.NewReadOnly must return a ReadWriter, instead of a
Reader, to satisfy both of these goals is evidence that these two goals
should be addressed separately.

To better accomplish the first goal, we can leverage the Go type system.
This commit adjusts command registration so that commands declare up
front whether they are read-only or mutating. Read-only commands take an
engine.Reader, while mutating commands take an engine.ReadWriter. This
prevents accidental writes at compile time, whereas the read-only batch
could only check for misuse at runtime.

To accomplish the second goal, this commit replaces read-only batches
with an engine wrapper, rocksDBIterCacher, that does nothing but cache
iterators. This wrapper might also prove useful to cache iterators in
code paths that wish to perform mutations but do not need a full-fledged
batch.

Release note: None

@benesch benesch requested review from a team, nvb and petermattis June 22, 2018 01:30
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@petermattis
Copy link
Copy Markdown
Collaborator

Review status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/storage/replica_command.go, line 102 at r1 (raw file):

		}
		if cmd.EvalReadOnly != nil {
			pd, err = cmd.EvalReadOnly(ctx, batch, cArgs, reply)

While EvalReadOnly takes an engine.Reader, you're passing in batch which is an engine.ReadWriter. The "type safety" here can be violated via a type assertion of the engine.Reader back to an engine.ReadWriter.

I generally like the approach you've taken here, just pointing out this drawback. Creating a readOnly wrapper for the engine.ReadWriter could avoid this drawback, though I'm not sure if doing so is worth it.


Comments from Reviewable

@bdarnell
Copy link
Copy Markdown
Contributor

:lgtm:


Reviewed 41 of 41 files at r1.
Review status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/replica.go, line 2602 at r1 (raw file):

	var result result.Result
	rec := NewReplicaEvalContext(r, spans)
	var eng engine.ReadWriter = r.store.Engine().NewIterCacher()

A ReadWriter in executeReadOnlyBatch?


Comments from Reviewable

I've become convinced that read-only batches, as created by
Engine.NewReadOnly, are the wrong abstraction. They conflate two
unrelated goals:

  1. Ensure read-only commands do not accidentally write data.
  2. Improve performance of code paths that can reuse iterators by
     caching the iterator object returned by NewIterator.

The fact that Engine.NewReadOnly must return a ReadWriter, instead of a
Reader, to satisfy both of these goals is evidence that these two goals
should be addressed separately.

To better accomplish the first goal, we can leverage the Go type system.
This commit adjusts command registration so that commands declare up
front whether they are read-only or mutating. Read-only commands take an
engine.Reader, while mutating commands take an engine.ReadWriter. This
prevents accidental writes at compile time, whereas the read-only batch
could only check for misuse at runtime.

To accomplish the second goal, this commit replaces read-only batches
with an engine wrapper, rocksDBIterCacher, that does nothing but cache
iterators. This wrapper might also prove useful to cache iterators in
code paths that wish to perform mutations but do not need a full-fledged
batch.

Release note: None
@benesch
Copy link
Copy Markdown
Contributor Author

benesch commented Jan 3, 2019

Afraid I'm not going to have time to resolve the remaining problems with this PR.

@benesch benesch closed this Jan 3, 2019
@benesch benesch deleted the no-read-only branch January 3, 2019 04:53
@benesch benesch restored the no-read-only branch January 3, 2019 04:57
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Jan 7, 2020
This PR revives cockroachdb#26916 (copying over the commit message from there).
It's not strictly necessary, but is a quality of life improvement.
There are also miscellaneous cleanups around TeeEngine included
therein.

Read-only batches, as created by engine.NewReadOnly, are the wrong
abstraction. They conflate two unrelated goals:

  - Ensure read-only commands do not accidentally write data.
  - Improve performance of code paths that can reuse iterators by
  caching the iterator object returned by NewIterator.

The fact that engine.NewReadOnly must return a ReadWriter, instead of a
Reader, to satisfy both of these goals is evidence that these two goals
should be addressed separately.

To better accomplish the first goal, we can leverage the Go type system.
This commit adjusts command registration so that commands declare up
front whether they are read-only or mutating. Read-only commands take an
engine.Reader, while mutating commands take an engine.ReadWriter. This
prevents accidental writes at compile time, whereas the read-only batch
could only check for misuse at runtime.

To accomplish the second goal, this commit replaces read-only batches
with an engine wrapper, rocksDBIterCacher, that does nothing but cache
iterators. This wrapper might also prove useful to cache iterators in
code paths that wish to perform mutations but do not need a full-fledged
batch.

NB: There's still the unaddressed wart that is replica read code paths
operating off an engine.ReadWriter. Changing this out would require a
refactor of replica evaluation code paths, which I'm not doing now.

Release note: None
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Jan 7, 2020
This PR revives cockroachdb#26916 (copying over the commit message from there).
It's not strictly necessary, but is a quality of life improvement.
There are also miscellaneous cleanups around TeeEngine included
therein.

Read-only batches, as created by engine.NewReadOnly, are the wrong
abstraction. They conflate two unrelated goals:

  - Ensure read-only commands do not accidentally write data.
  - Improve performance of code paths that can reuse iterators by
  caching the iterator object returned by NewIterator.

The fact that engine.NewReadOnly must return a ReadWriter, instead of a
Reader, to satisfy both of these goals is evidence that these two goals
should be addressed separately.

To better accomplish the first goal, we can leverage the Go type system.
This commit adjusts command registration so that commands declare up
front whether they are read-only or mutating. Read-only commands take an
engine.Reader, while mutating commands take an engine.ReadWriter. This
prevents accidental writes at compile time, whereas the read-only batch
could only check for misuse at runtime.

To accomplish the second goal, this commit replaces read-only batches
with an engine wrapper, rocksDBIterCacher, that does nothing but cache
iterators. This wrapper might also prove useful to cache iterators in
code paths that wish to perform mutations but do not need a full-fledged
batch.

NB: There's still the unaddressed wart that is replica read code paths
operating off an engine.ReadWriter. Changing this out would require a
refactor of replica evaluation code paths, which I'm not doing now.

Release note: None
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Jan 7, 2020
This PR reworks parts of cockroachdb#26916. It's not strictly necessary, but is a
quality of life improvement. There are also miscellaneous cleanups
around TeeEngine, RocksDB benchmarks, etc.

Previous to this we relied on engine.NewReadOnly's runtime panics when
receiving writes to ensure that read-only commands do not accidentally
write data. By changing the command registration process to have
commands declare upfront whether they are read-only or read-write, we
can plumb in engine.Reader or engine.ReadWriter respectively. It's nice
rely on the type system to assert these guarantees.

NB: There's still the unaddressed wart that is replica read code paths
operating off an engine.ReadWriter. Changing this out would require a
refactor of replica evaluation code paths, which I'm not doing now. This
PR however paves the way for that.

Release note: None
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Jan 7, 2020
This PR reworks parts of cockroachdb#26916. It's not strictly necessary, but is a
quality of life improvement. There are also miscellaneous cleanups
around TeeEngine, RocksDB benchmarks, etc.

Previous to this we relied on engine.NewReadOnly's runtime panics when
receiving writes to ensure that read-only commands do not accidentally
write data. By changing the command registration process to have
commands declare upfront whether they are read-only or read-write, we
can plumb in engine.Reader or engine.ReadWriter respectively. It's nice
rely on the type system to assert these guarantees.

NB: There's still the unaddressed wart that is replica read code paths
operating off an engine.ReadWriter. Changing this out would require a
refactor of replica evaluation code paths, which I'm not doing now. This
PR however paves the way for that.

Release note: None
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Jan 7, 2020
This PR reworks parts of cockroachdb#26916. It's not strictly necessary, but is a
quality of life improvement. There are also miscellaneous cleanups
around TeeEngine, RocksDB benchmarks, etc.

Previous to this we relied on engine.NewReadOnly's runtime panics when
receiving writes to ensure that read-only commands do not accidentally
write data. By changing the command registration process to have
commands declare upfront whether they are read-only or read-write, we
can plumb in engine.Reader or engine.ReadWriter respectively. It's nice
rely on the type system to assert these guarantees.

NB: There's still the unaddressed wart that is replica read code paths
operating off an engine.ReadWriter. Changing this out would require a
refactor of replica evaluation code paths, which I'm not doing now. This
PR however paves the way for that.

Release note: None
craig bot pushed a commit that referenced this pull request Jan 7, 2020
43764: storage: change command registration to declare if ro/rw r=irfansharif a=irfansharif

This PR reworks parts of #26916. It's not strictly necessary, but is a
quality of life improvement. There are also miscellaneous cleanups
around TeeEngine, RocksDB benchmarks, etc.

Previous to this we relied on engine.NewReadOnly's runtime panics when
receiving writes to ensure that read-only commands do not accidentally
write data. By changing the command registration process to have
commands declare upfront whether they are read-only or read-write, we
can plumb in engine.Reader or engine.ReadWriter respectively. It's nice
rely on the type system to assert these guarantees.

NB: There's still the unaddressed wart that is replica read code paths
operating off an engine.ReadWriter. Changing this out would require a
refactor of replica evaluation code paths, which I'm not doing now. This
PR however paves the way for that.

Release note: None


43779: pgwire/hba: fix the HBA parser r=knz a=knz

Previously, the HBA rule parser would not properly support the pg
split format for separate address and netmask.

When the 4th column of a rule in
the setting `server.host_based_authentication.configuration` is
an IP address, the 5th column is expected to be an IP netmask.

Prior to this patch, the HBA parser would not recognize the netmask
syntax and instead expect a simple number of bits set (which is not
valid pg syntax).

This patch fixes this and adds the corresponding tests.

Additionally, this patch replaces the HBA rule parser by one inspired
from pg's own implementation, so as to ease future maintainance.

Release note (bug fix): When the 4th column of a rule in the setting
`server.host_based_authentication.configuration` is an IP address
without a mask length (e.g. `1.2.3.4` instead of `1.2.0.0/16`),
CockroachDB now properly interprets the 5th column as an IP netmask,
as per https://www.postgresql.org/docs/current/auth-pg-hba-conf.html.

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants