Skip to content

storage: change command registration to declare if ro/rw#43764

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
irfansharif:200106.ro-batches
Jan 7, 2020
Merged

storage: change command registration to declare if ro/rw#43764
craig[bot] merged 1 commit intocockroachdb:masterfrom
irfansharif:200106.ro-batches

Conversation

@irfansharif
Copy link
Copy Markdown
Contributor

@irfansharif irfansharif commented Jan 6, 2020

This PR reworks parts of #26916. It's not strictly necessary, but is a
quality of life improvement. There are also miscellaneous cleanups
around TeeEngine, RocksDB benchmarks, etc.

Previous to this we relied on engine.NewReadOnly's runtime panics when
receiving writes to ensure that read-only commands do not accidentally
write data. By changing the command registration process to have
commands declare upfront whether they are read-only or read-write, we
can plumb in engine.Reader or engine.ReadWriter respectively. It's nice
rely on the type system to assert these guarantees.

NB: There's still the unaddressed wart that is replica read code paths
operating off an engine.ReadWriter. Changing this out would require a
refactor of replica evaluation code paths, which I'm not doing now. This
PR however paves the way for that.

Release note: None

@irfansharif irfansharif requested review from nvb and petermattis January 6, 2020 22:57
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@irfansharif irfansharif force-pushed the 200106.ro-batches branch 2 times, most recently from b965df0 to d432da4 Compare January 7, 2020 03:36
Copy link
Copy Markdown
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the replica evaluation changes to use the type system.

I'm not terribly fond of the term IterCacher. I suppose it is more specific than ReadOnly, but it is a bit of a mouthful. Was there a benefit to removing the functionality to disable read-only methods in IterCacher. I see your commit message comment that it might prove useful in the future to use IterCacher for certain mutations. Is that a concrete plan, or speculation?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @nvanbenschoten, and @petermattis)


pkg/storage/engine/tee.go, line 294 at r1 (raw file):

// NewIterCacher implements the Engine interface.
func (t *TeeEngine) NewIterCacher() Engine {
	return t

This doesn't seem right. The returned engine won't cache the iters.

Copy link
Copy Markdown
Contributor Author

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR.

Is that a concrete plan, or speculation?

It's speculation for now, but might prove useful when we're doing local raft log truncations for example. I'm also happy to revert this decision.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @petermattis)


pkg/storage/engine/tee.go, line 294 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

This doesn't seem right. The returned engine won't cache the iters.

This was intentional. Since TeeEngine is only meant to be used in testing, I didn't bother with caching iterators. Added a comment to this effect.

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pulling on this! Any cleanup here is movement in the right direction.

I'm a little confused what we gain from deleting the Writer methods on the iterCacher types though. They each embed Engines, so now that we're returning an Engine from their constructors, users have direct access to the embedded field's Writer methods. Was this intentional?

Reviewed 46 of 50 files at r1, 1 of 2 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @nvanbenschoten, and @petermattis)


pkg/storage/batcheval/command.go, line 36 at r1 (raw file):

	// it should also update *CommandArgs.Stats. It should treat the provided
	// request as immutable.
	EvalRW func(context.Context, engine.ReadWriter, CommandArgs, roachpb.Response) (result.Result, error)

Mention that only one of these is ever set at a time.

Copy link
Copy Markdown
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's speculation for now, but might prove useful when we're doing local raft log truncations for example. I'm also happy to revert this decision.

My preference would be to revert this part of the change for now. The replication evaluation changes are unambiguously good, while the s/ReadOnly/IterCacher bits leave me feeling "meh".

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif)


pkg/storage/engine/tee.go, line 294 at r1 (raw file):

Previously, irfansharif (irfan sharif) wrote…

This was intentional. Since TeeEngine is only meant to be used in testing, I didn't bother with caching iterators. Added a comment to this effect.

While a TeeEngine is only used for testing, it is nice to test the same semantics as the underlying engines. There is a minor semantic difference from using a cached iterator vs creating a new one: an iterator takes an implicit snapshot of the database at the time of its creation.

@irfansharif irfansharif changed the title storage,engine: replace engine.NewReadOnly with NewIterCacher storage: change command registration to declare if ro/rw Jan 7, 2020
Copy link
Copy Markdown
Contributor Author

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference would be to revert this part of the change for now

Done.

The replication evaluation changes are unambiguously good, while the s/ReadOnly/IterCacher bits leave me feeling "meh".

I reverted the below pkg/storage/engine changes, keeping only the replica evaluation type guarantee niceties. This also addresses @nvanbenschoten's comment.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @petermattis)


pkg/storage/batcheval/command.go, line 36 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Mention that only one of these is ever set at a time.

Done.


pkg/storage/engine/tee.go, line 294 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

While a TeeEngine is only used for testing, it is nice to test the same semantics as the underlying engines. There is a minor semantic difference from using a cached iterator vs creating a new one: an iterator takes an implicit snapshot of the database at the time of its creation.

Done (no longer applicable)

Copy link
Copy Markdown
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @irfansharif and @nvanbenschoten)


pkg/storage/engine/tee.go, line 264 at r3 (raw file):

// GetEncryptionRegistries implements the Engine interface.
func (t *TeeEngine) GetEncryptionRegistries() (*EncryptionRegistries, error) {

What effect was it having to have these methods on the value instead of the pointer? Was that a bug?

Copy link
Copy Markdown
Contributor Author

@irfansharif irfansharif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @nvanbenschoten)


pkg/storage/engine/tee.go, line 264 at r3 (raw file):

Previously, petermattis (Peter Mattis) wrote…

What effect was it having to have these methods on the value instead of the pointer? Was that a bug?

It was buggy when I was making the changes to TeeEngine, but nothing other than that. @itsbilal mentioned that the value receivers were just an artifact of his editor.

Copy link
Copy Markdown
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @nvanbenschoten)


pkg/storage/engine/tee.go, line 264 at r3 (raw file):

Previously, irfansharif (irfan sharif) wrote…

It was buggy when I was making the changes to TeeEngine, but nothing other than that. @itsbilal mentioned that the value receivers were just an artifact of his editor.

Ack. Ok. Using a pointer receiver is fine by me and less error prone.

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 11 files at r3, 2 of 2 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif and @itsbilal)


pkg/storage/engine/bench_rocksdb_test.go, line 170 at r4 (raw file):

// performance of creating an iterator and seeking to a key if a read-only
// ReadWriter that caches the RocksDB iterator is used
func BenchmarkIterOnIterCacher_RocksDB(b *testing.B) {

This didn't get switched back.

This PR reworks parts of cockroachdb#26916. It's not strictly necessary, but is a
quality of life improvement. There are also miscellaneous cleanups
around TeeEngine, RocksDB benchmarks, etc.

Previous to this we relied on engine.NewReadOnly's runtime panics when
receiving writes to ensure that read-only commands do not accidentally
write data. By changing the command registration process to have
commands declare upfront whether they are read-only or read-write, we
can plumb in engine.Reader or engine.ReadWriter respectively. It's nice
rely on the type system to assert these guarantees.

NB: There's still the unaddressed wart that is replica read code paths
operating off an engine.ReadWriter. Changing this out would require a
refactor of replica evaluation code paths, which I'm not doing now. This
PR however paves the way for that.

Release note: None
@irfansharif
Copy link
Copy Markdown
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Jan 7, 2020
43764: storage: change command registration to declare if ro/rw r=irfansharif a=irfansharif

This PR reworks parts of #26916. It's not strictly necessary, but is a
quality of life improvement. There are also miscellaneous cleanups
around TeeEngine, RocksDB benchmarks, etc.

Previous to this we relied on engine.NewReadOnly's runtime panics when
receiving writes to ensure that read-only commands do not accidentally
write data. By changing the command registration process to have
commands declare upfront whether they are read-only or read-write, we
can plumb in engine.Reader or engine.ReadWriter respectively. It's nice
rely on the type system to assert these guarantees.

NB: There's still the unaddressed wart that is replica read code paths
operating off an engine.ReadWriter. Changing this out would require a
refactor of replica evaluation code paths, which I'm not doing now. This
PR however paves the way for that.

Release note: None


43779: pgwire/hba: fix the HBA parser r=knz a=knz

Previously, the HBA rule parser would not properly support the pg
split format for separate address and netmask.

When the 4th column of a rule in
the setting `server.host_based_authentication.configuration` is
an IP address, the 5th column is expected to be an IP netmask.

Prior to this patch, the HBA parser would not recognize the netmask
syntax and instead expect a simple number of bits set (which is not
valid pg syntax).

This patch fixes this and adds the corresponding tests.

Additionally, this patch replaces the HBA rule parser by one inspired
from pg's own implementation, so as to ease future maintainance.

Release note (bug fix): When the 4th column of a rule in the setting
`server.host_based_authentication.configuration` is an IP address
without a mask length (e.g. `1.2.3.4` instead of `1.2.0.0/16`),
CockroachDB now properly interprets the 5th column as an IP netmask,
as per https://www.postgresql.org/docs/current/auth-pg-hba-conf.html.

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 7, 2020

Build succeeded

@craig craig bot merged commit 5e6e11c into cockroachdb:master Jan 7, 2020
@irfansharif irfansharif deleted the 200106.ro-batches branch January 7, 2020 22:46
craig bot pushed a commit that referenced this pull request Jan 8, 2020
43792: batcheval: clean up remaning variable names of `{Reader,ReadWriter}` types r=irfansharif a=irfansharif

Ignore first commit, reviewed in #43764. 

We should be using {reader,readWriter} for
{Reader,ReadWriter} respectively instead. I'd skipped everything under
batcheval when working on #43265, back when I thought it'd be a good
idea to plumb in an engine.Batch throughout replica read codepaths. Due
to perf penalty as seen on #43388, I'm no longer going to be doing that.
This is the remaining work.

Release note: None


43811: hba: stricter parsing and better pretty-printing r=knz a=knz

The following defects were found by @mjibson using
`go-fuzz` (#43805) and are fixed by this patch:

- the code would parse then fail to pretty-print an empty connection
  type and auth method.
- the code would parse then pretty-print incorrectly a connection
  type or auth method containing commas or other special characters.
- the code would accept an empty string as address.
- the code would accept then fail to pretty-print columns containing
  special (control) characters.

Note that the code previously supported tabs (\t) and carriage
returns (\r) inside quoted strings, and this is now rejected. For most
fields, this does not matter, as these special characters were not
valid semantically anyway.

However, for the method options in the last column it is theoretically
possible for a method to wish to use such special characters. This is
not currently the case for the CCL `gss` method already implemented,
so this change is not introducing a regression. A comment is left in
the code to indicate what needs to be changed if a future method needs
this support.

Release note (security): CockroachDB now properly rejects control
characters in the value of the cluster setting
`server.host_based_authentication.configuration`. These characters
were previously accepted and would silently yield rules matching
differently from intended. Deployments careful to strip control
characters from their HBA configuration are not affected.

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants