Skip to content

db: standardize behavior of indexed batch mutations during iteration#1640

Merged
jbowens merged 1 commit intocockroachdb:masterfrom
jbowens:batch-iter-fix
May 18, 2022
Merged

db: standardize behavior of indexed batch mutations during iteration#1640
jbowens merged 1 commit intocockroachdb:masterfrom
jbowens:batch-iter-fix

Conversation

@jbowens
Copy link
Copy Markdown
Contributor

@jbowens jbowens commented Apr 11, 2022

Previously the visibility of indexed batch mutations to open Iterators was
undocumented within the API. In practice range deletion and range key mutations
added after Iterator construction were never visible. Point operations were
immediately visible but could induce odd behavior like repeat keys when
switching directions (#943).

This commit defines and adopts new behavior for iteration over mutated indexed
batches. An iterator over an indexed batch sees only the mutations that exist
at Iterator creation.

Calls to SetOptions refresh the Iterator's view of the indexed batch,
regardless of whether the iterator's options were changed. Cloned iterators
adopt the original Iterator's views of the batch and may be independently
refreshed using SetOptions.

This interface allows iterator reuse in CockroachDB, which always calls
SetOptions after mutation before reuse. It removes the unusual repeat-key
behavior.

Fix #943.
Fix #1638.

@jbowens jbowens requested review from a team, erikgrinaker and sumeerbhola April 11, 2022 23:01
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@erikgrinaker
Copy link
Copy Markdown
Contributor

Attempting to relative position an iterator with Next or Prev when the underlying batch has been modified is prohibited and errors.

I don't think this will work, if I'm understanding it correctly. For example, in CRDB we have ClearIterRange which will iterate across a span and clear every key that's encountered during iteration.

I don't think we need to see the mutations we're making while iterating, but we do need to be able to mutate while iterating.

Copy link
Copy Markdown
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right, makes sense. I'll go back to an earlier iteration that hid mutations between absolute positioning calls. I was hoping we could get away with the stricter behavior to avoid accidental stale reads, but it'll be straightforward to hide the mutations rather than error.

Reviewable status: 0 of 5 files reviewed, all discussions resolved (waiting on @erikgrinaker and @sumeerbhola)

@erikgrinaker
Copy link
Copy Markdown
Contributor

Out of curiosity: what's the behavior on a bare engine? Will an iterator see writes that happen during iteration (e.g. by other writers), or does the iterator operate on a consistent snapshot? It would be nice to have parity between all ReadWriter implementations.

Copy link
Copy Markdown
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iterator operates on a consistent snapshot. Iterators created on top of batches are the only way the data underlying an iterator can change. We could add some explicit facility that 'refreshes' iterators created on top of batches, instead of the implicit refreshes on seeks.

Reviewable status: 0 of 5 files reviewed, all discussions resolved (waiting on @erikgrinaker and @sumeerbhola)

@jbowens jbowens force-pushed the batch-iter-fix branch 3 times, most recently from fb7ffb9 to f39de5b Compare April 12, 2022 14:41
@erikgrinaker
Copy link
Copy Markdown
Contributor

The iterator operates on a consistent snapshot. Iterators created on top of batches are the only way the data underlying an iterator can change. We could add some explicit facility that 'refreshes' iterators created on top of batches, instead of the implicit refreshes on seeks.

This is a bit interesting when I think about it. The primary motivation for cloning in CRDB is precisely for the iterators to have a consistent view of the reader (e.g. in the intent interleaving iterator, where it is crucial for both the intent and point key iterators to view the same state). And yet we're now saying that we need to update those iterators' views of the state to something that's possibly inconsistent.

Let me look this over again tomorrow to figure out exactly what semantics we really need here in various cases. It may be more important to ensure we have consistent semantics across point/range keys and across read/writers. A method to refresh the iterator state seems like it would allow us to ensure that it all cases.

Copy link
Copy Markdown
Contributor

@nicktrav nicktrav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 5 files at r1, 5 of 5 files at r2, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @erikgrinaker, @jbowens, and @sumeerbhola)


iterator.go, line 1873 at r2 (raw file):

// operation. If the iterator includes iteration over an indexed batch, the
// batch's counts of range deletions and range keys are consulted to determine
// if new range deletions or range keys were written to the batch since the last
// seek.

Is this comment outdated? My understanding is that the condition is more general now in that any mutation on the batch updates the "seqnum" which is being used to determine whether to re-init the iter - i.e. not limited to just range del / range key counts.


testdata/indexed_batch_mutation, line 52 at r2 (raw file):


mutate
del-range f g

Can we drop in two prev's on the iter here to prove that the range del op isn't yet visible?


testdata/indexed_batch_mutation, line 72 at r2 (raw file):

should error

We don't expect an error do we? Just that the two previous ops aren't yet visible.

@erikgrinaker
Copy link
Copy Markdown
Contributor

Sorry for the delay. I wrote up a test to explore and nail down iterator consistency semantics here:

https://github.com/cockroachdb/cockroach/blob/29fee1b6bcd5db967f8f70864724f52a1af9cca4/pkg/storage/engine_test.go#L1685

The CRDB-level invariants I came up with that I think make the most sense (and are mostly consistent with current semantics) are:

  1. All iterators have a consistent view of the reader as of the time of its creation. Subsequent writes are never visible to it.

  2. Iterators on readers with ConsistentIterators=true all have a consistent view of the Engine as of the time of the first iterator creation or PinEngineStateForIterators call: newer Engine writes are never visible to new iterators. The opposite is true for ConsistentIterators=false: new iterators always see the most recent Engine state at the time of their creation.

  3. New iterators on unindexed Batches never see batch writes. They do satisfy ConsistentIterators: they never see new Engine writes after the first iterator was created or a PinEngineStateForIterators call.

  4. New iterators on indexed Batches see all batch writes that happened before the iterator was created. However, they satisfy ConsistentIterators and do not see new Engine writes after the first iterator was created or a PinEngineStateForIterators call. Via 1, they never see later batch writes.

There are a couple of Batch inconsistencies, as we've discussed above:

  • Existing batch iterators see new batch point key writes. They shouldn't (see 1).

  • New batch iterators do not see batch range key writes. They should (see 4).

  • Because we reuse Batch iterators via cloning, we need a way to refresh the cloned iterator's view of the batch (but not of the Engine) after cloning it. Otherwise, we cannot satisfy 4. Refreshing on seek violates 1, so I think we need a new method.

I haven't vetted that all existing CRDB code is compatible with these semantics, and I haven't found a good way to fake them. If you can throw together a prototype then I'll see if the test suite passes (or I might just prototype it myself tomorrow).

CC @sumeerbhola on semantics ^

@jbowens
Copy link
Copy Markdown
Contributor Author

jbowens commented Apr 15, 2022

New batch iterators do not see batch range key writes. They should (see 4).

I'm confused by this. A new indexed batch iterator will see all range key writes up to the moment the iterator is created. Or by "new batch iterators" do you mean reused existing batch iterators?

@erikgrinaker
Copy link
Copy Markdown
Contributor

New batch iterators do not see batch range key writes. They should (see 4).

I'm confused by this. A new indexed batch iterator will see all range key writes up to the moment the iterator is created. Or by "new batch iterators" do you mean reused existing batch iterators?

Ah, yeah -- I originally wrote this from the point of view of CRDB MVCC iterators and forgot to update it properly. It's caused by the cloned iterators, as you say.

@erikgrinaker
Copy link
Copy Markdown
Contributor

Hacked together a prototype in #1655, which mostly worked. Some observations:

  • SetOptions() already refreshes the batch view. This seems beneficial, since we only call SetOptions() when constructing new iterators from cloned ones.

  • We still need RefreshBatchSnapshot(), to refresh the batch view when we omit the SetOptions() call (if the cloned iterator already has appropriate options). However, this must be significantly cheaper than SetOptions(), otherwise there's no point in omitting the call.

  • It seems like calling SetOptions() causes the range key iterator (but not the point key iterator) to refresh its view of the engine state too -- it shouldn't.

I ran a bunch of KV integration tests with these changes (i.e. batch point writes are not visible to existing iterators), and they all passed. I did get some failures in the storage package, but those are likely artifacts of the tests themselves. So this seems viable.

I don't think it should be a huge lift to adapt the current master to the new point key semantics, I can give it a shot once you've wrapped up the PR. Alternatively, we can split the change into two parts: first refresh the range key view on SetOptions() and RefreshBatchSnapshot(), then change the point key semantics.

@jbowens jbowens force-pushed the batch-iter-fix branch 3 times, most recently from 3d5ec99 to 2a3802c Compare April 18, 2022 15:37
@jbowens
Copy link
Copy Markdown
Contributor Author

jbowens commented Apr 18, 2022

However, this must be significantly cheaper than SetOptions(), otherwise there's no point in omitting the call.

This will be tricky. If any of the batch's internal iterators (1 for rangedels, 1 for rangekeys, 1 for points) go from empty to non-empty, we'll need to reconstruct the whole iterator stack.

I think we should unconditionally call SetOptions: It'll avoid reconstructing the iterator stack if there are no new batch mutations. If we pursue an optimization here for refreshing the batch snapshot, I think we can do it within SetOptions.

It seems like calling SetOptions() causes the range key iterator (but not the point key iterator) to refresh its view of the engine state too -- it shouldn't.

Do you have a reproduction handy? This doesn't readily reproduce, but I wouldn't be surprised if there's a bug due to the hack of the in-memory arena.

Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimizing away SetOptions is also used in the underlying pebble.Iterator to optimize consecutive Seeks.
This is used in trySeekUsingNext optimization which is selectively ignored by parts of the iterator stack that know they are mutable, and not ignored by sstables. For example, batchIter.SeekGE has the following comment

	// Ignore trySeekUsingNext since the batch may have changed, so using Next
	// would be incorrect.

This is helpful in that most of the data when working with a pebble.Iterator over a batch is not in the mutable part.
We should preserve such optimizations. If we don't want the refresh to be implicit via an absolute positioning operation, we can add a RefreshIfMutable that can be passed down the InternalIterator stack.

Reviewable status: 1 of 8 files reviewed, 3 unresolved discussions (waiting on @erikgrinaker, @jbowens, and @nicktrav)

Copy link
Copy Markdown
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimizing away SetOptions is also used in the underlying pebble.Iterator to optimize consecutive Seeks.

Can you remind me why we can't compare old and new bounds in SetBounds/SetOptions to determine whether they've changed? If we could do that and move the bounds-change noop optimization into Pebble and not require a separate method in the public interface.

we can add a RefreshIfMutable that can be passed down the InternalIterator stack.

I think we would need to remove the optimization that empty point/rangedel/rangekey batch iterators are omitted from the iterator stack altogether, but that seems alright.

Reviewable status: 1 of 8 files reviewed, 3 unresolved discussions (waiting on @erikgrinaker, @jbowens, and @nicktrav)

Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We moved the noop optimization from Pebble up to CockroachDB since we can't correctly make the noop choice in Pebble since the slices may start getting mutated from under the internal iterators. The following pair of PRs made that change #1073 cockroachdb/cockroach#61194

Reviewable status: 1 of 8 files reviewed, 3 unresolved discussions (waiting on @erikgrinaker, @jbowens, and @nicktrav)

@jbowens jbowens force-pushed the batch-iter-fix branch 2 times, most recently from 9a7eb88 to 6cf1506 Compare April 20, 2022 18:01
Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 7 files at r3, 14 of 15 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @erikgrinaker and @jbowens)


batch.go, line 842 at r5 (raw file):

		return newErrorKeyspanIter(ErrNotIndexed), func() {}
	}
	iter := new(keyspan.Iter)

Can you add a code comment stating that we create an iterator even if rangeDelIndex is nil, since it is allowed to refresh later, so we need the "container" to exist.


batch.go, line 843 at r5 (raw file):

	}
	iter := new(keyspan.Iter)
	refresh = func() { b.initRangeDelIter(iter) }

Can there be a fast-path if b.tombstones has not changed? By tracking the count of elements in the b.rangeDelIndex?
A code comment about a future optimization is fine.


db.go, line 962 at r5 (raw file):

	if dbi.batch != nil && dbi.batch.nextSeqNum() != dbi.batchSeqNum {
		dbi.batchSeqNum = dbi.batch.nextSeqNum()
		if dbi.batchRefreshPointKeys != nil {

when will these funcs be nil despite dbi.batch != nil?


iterator.go, line 1881 at r5 (raw file):

// The iterator will always be invalidated and must be repositioned with a call
// to SeekGE, SeekPrefixGE, SeekLT, First, or Last.
func (i *Iterator) RefreshBatchSnapshot() {

So CockroachDB reusing an Iterator will need to call this since it can not rely on absolute positioning calls alone and because SetOptions can be elided by it, yes?


testdata/indexed_batch_mutation, line 76 at r5 (raw file):

bar: (bar, .)

# Write a range key del and a point key.

range key set?


testdata/indexed_batch_mutation, line 123 at r5 (raw file):

a: (., [a-c) @1=boop)
.

nice!


testdata/rangekeys, line 177 at r5 (raw file):

set-options
seek-ge b
next

what is this trying to test?

Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, this must be significantly cheaper than SetOptions(), otherwise there's no point in omitting the call.

This will be tricky. If any of the batch's internal iterators (1 for rangedels, 1 for rangekeys, 1 for points) go from empty to non-empty, we'll need to reconstruct the whole iterator stack.

I see. We'll need to run some benchmarks to determine the impact on various workloads. It doesn't affect the CRDB read-path in any case -- not only because there are no mutations, but also because it doesn't use a batch at all.

We moved the noop optimization from Pebble up to CockroachDB since we can't correctly make the noop choice in Pebble since the slices may start getting mutated from under the internal iterators.

I don't really understand why we can't do this in Pebble either. The current CRDB optimization copies the given bounds, so we're already paying that cost. If we copy the bounds inside Pebble instead then we don't have to worry about what the caller is doing to them afterwards.

Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @jbowens)

Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We moved the noop optimization from Pebble up to CockroachDB since we can't correctly make the noop choice in Pebble since the slices may start getting mutated from under the internal iterators.

I don't really understand why we can't do this in Pebble either. The current CRDB optimization copies the given bounds, so we're already paying that cost. If we copy the bounds inside Pebble instead then we don't have to worry about what the caller is doing to them afterwards.

Hm, I guess this might be to avoid allocations: pebble.Iterator.Clone() doesn't deep-copy the bounds, so we'd need to put the dual bound buffers into the pebble.iterAlloc pool or some such to reuse the allocations, similarly to what pebbleIterator does. I do think it might be worthwhile to move this down though, to avoid the coupling and complexity.

Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @jbowens)

Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the addition of the zero byte would require us to perform two copies, one to synthesize the new bound with a nul terminator, and one within Pebble to copy the bound. Or the interface would need to be complicated to avoid the second copy.

Hm, that's slightly annoying. It isn't clear to me why we can't let Pebble do the options comparisons inside SetOptions() with the current semantics though. I.e. CRDB keeps doing the dual buffer shuffle, but unconditionally calls SetOptions(). As long as Pebble unconditionally switches the internal byte buffers to the ones given by the caller, but otherwise keeps as much internal state as possible when the contents of the old/new buffers don't change, that should give us these benefits without the additional copy, no?

Reviewable status: 12 of 16 files reviewed, 7 unresolved discussions (waiting on @jbowens and @sumeerbhola)

Copy link
Copy Markdown
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as Pebble unconditionally switches the internal byte buffers to the ones given by the caller, but otherwise keeps as much internal state as possible when the contents of the old/new buffers don't change, that should give us these benefits without the additional copy, no?

Yeah, that's a good point. Pebble's internal iterators have their own SetBounds methods that need to record the new buffer pointers and sometimes clear state to invalidate themselves. We could add a flag to this internal SetBounds method indicating whether the logical bounds changed or just the buffer holding the bounds. If the bounds are logically equivalent, we omit the invalidation. I don't see any reason why that wouldn't work.

Reviewable status: 12 of 16 files reviewed, 7 unresolved discussions (waiting on @jbowens and @sumeerbhola)

@jbowens jbowens force-pushed the batch-iter-fix branch 2 times, most recently from a661e51 to b35808f Compare May 6, 2022 17:24
Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a look at this again on a branch using this PR and unconditionally calling SetOptions in CRDB (which should refresh the batch view). Branch here:

cockroachdb/cockroach@mvcc-range-tombstones...erikgrinaker:mvcc-range-tombstones-primitives-setoptions

TestEngineConsistentIterators tests what I believe are the correct semantics, but it fails in a couple of ways. This is all from the MVCCIterator point of view.

Hope that gives you something to go by.

Reviewable status: 6 of 16 files reviewed, 7 unresolved discussions (waiting on @jbowens and @sumeerbhola)

Copy link
Copy Markdown
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sumeerbhola — do you mind giving this another review? I made some material changes, including preserving the same view of the batch across a Clone which I think is more inline with the spirit of Clone. Also, I replaced the refresh closures with direct access to the batchIter and keyspan.Iters.

An existing batch iterator sees new batch writes, i.e. iterOld sees a3 here:

Thanks, this is fixed.

A new batch iterator sees new engine range key writes from after the batch (and thus cloned iterator) was created, i.e. iterOwn sees bc2 here:

This one I think is expected as long as SetOptions refreshes the view of the batch. The new iterator is created by calling Clone and thenSetOptions, which refreshes the batch view.

Reviewable status: 0 of 16 files reviewed, 7 unresolved discussions (waiting on @jbowens and @sumeerbhola)

Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I think is expected as long as SetOptions refreshes the view of the batch. The new iterator is created by calling Clone and thenSetOptions, which refreshes the batch view.

It's supposed to refresh the batch view, but not the engine view. This breaks the ConsistentIterators() contract in CRDB.

Reviewable status: 0 of 16 files reviewed, 7 unresolved discussions (waiting on @jbowens and @sumeerbhola)

@jbowens jbowens force-pushed the batch-iter-fix branch 2 times, most recently from 436381f to ebbf018 Compare May 17, 2022 15:44
Copy link
Copy Markdown
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I misunderstood. The second commit contains a fix for the bug that allowed an iterator to see committed range keys at sequence numbers higher than the iterator's sequence number if there was an overlapping range key in the indexed batch.

Reviewable status: 0 of 19 files reviewed, 7 unresolved discussions (waiting on @jbowens, @nicktrav, and @sumeerbhola)

Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I misunderstood. The second commit contains a fix for the bug that allowed an iterator to see committed range keys at sequence numbers higher than the iterator's sequence number if there was an overlapping range key in the indexed batch.

Thanks! Can confirm that this PR has the correct semantics both for point keys and range keys.

Reviewable status: 0 of 19 files reviewed, 7 unresolved discussions (waiting on @jbowens, @nicktrav, and @sumeerbhola)

@jbowens jbowens requested a review from a team May 18, 2022 16:05
Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong:

Reviewed 1 of 4 files at r6, 1 of 8 files at r7, 1 of 4 files at r8, 11 of 13 files at r9, 1 of 2 files at r10, 4 of 4 files at r11.
Reviewable status: all files reviewed (commit messages unreviewed), 11 unresolved discussions (waiting on @jbowens and @sumeerbhola)


iterator.go line 210 at r11 (raw file):

	batchPointIter    batchIter
	batchRangeDelIter keyspan.Iter
	batchRangeKeyIter keyspan.Iter

I like this explicitness instead of the closure.


internal/keyspan/span.go line 146 at r11 (raw file):

	// ordered by sequence number, so ordinarily we can return the trailing
	// subslice containing keys with sequence numbers less than `seqNum`.
	// However, if batch keys are present, they must always be visible. This can

I am assuming batch keys are "always visible", even if the iterator has not refreshed its view of the batch using SetOptions, because our implementation of range keys/deletes for a batch iterator is structurally constructing the fragmented keys and using those, so future mutations are automatically excluded. Is my understanding correct? Which is why we don't use to pass batchIter.snapshot as a parameter too.
Maybe a code comment would be useful.


internal/keyspan/span.go line 170 at r11 (raw file):

	switch {
	case lastNonVisibleIdx == -1:
		// All keys are visible.

Can you add to this comment that v represents visible and h is hidden.


testdata/indexed_batch_mutation line 168 at r11 (raw file):

.

# Clone i1 to create i2.

clone semantics make sense.
and these newly added test cases are nice.

Previously the visibility of indexed batch mutations to open Iterators was
undocumented within the API. In practice range deletion and range key mutations
added after Iterator construction were never visible. Point operations were
immediately visible but could induce odd behavior like repeat keys when
switching directions (cockroachdb#943).

This commit defines and adopts new behavior for iteration over mutated indexed
batches. An iterator over an indexed batch sees only the mutations that exist
at Iterator creation.

Calls to SetOptions refresh the Iterator's view of the indexed batch,
regardless of whether the iterator's options were changed. Cloned iterators
adopt the original Iterator's views of the batch and may be independently
refreshed using SetOptions.

This interface allows iterator reuse in CockroachDB, which always calls
SetOptions after mutation before reuse. It removes the unusual repeat-key
behavior.

Fix cockroachdb#943.
Fix cockroachdb#1638.
Copy link
Copy Markdown
Contributor Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: 18 of 19 files reviewed, 11 unresolved discussions (waiting on @jbowens and @sumeerbhola)


internal/keyspan/span.go line 146 at r11 (raw file):

Previously, sumeerbhola wrote…

I am assuming batch keys are "always visible", even if the iterator has not refreshed its view of the batch using SetOptions, because our implementation of range keys/deletes for a batch iterator is structurally constructing the fragmented keys and using those, so future mutations are automatically excluded. Is my understanding correct? Which is why we don't use to pass batchIter.snapshot as a parameter too.
Maybe a code comment would be useful.

Yeah, that's right. Added a comment.


internal/keyspan/span.go line 170 at r11 (raw file):

Previously, sumeerbhola wrote…

Can you add to this comment that v represents visible and h is hidden.

Done.

@jbowens
Copy link
Copy Markdown
Contributor Author

jbowens commented May 18, 2022

I'll follow up with changes to chip away at the Span.Visible calls.

@jbowens jbowens merged commit e32e94d into cockroachdb:master May 18, 2022
@jbowens jbowens deleted the batch-iter-fix branch May 18, 2022 17:58
@erikgrinaker
Copy link
Copy Markdown
Contributor

Awesome, thanks! The new point key semantics will likely need updates to CRDB, I'll submit a PR once cockroachdb/cockroach#81389 lands (so I don't have to deal with the vfs.WithDiskHealthChecks stuff).

craig bot pushed a commit to cockroachdb/cockroach that referenced this pull request May 24, 2022
81662: storage: respect new Pebble iterator visibility r=jbowens a=erikgrinaker

Pebble recently tightened its iterator semantics such that it has a
static view of the underlying reader as of the time of its creation.
Previously, a batch iterator could see batch writes that occurred after
the iterator was created.

This patch prepares CRDB for this change.

Touches cockroachdb/pebble#1640.

Release note: None

81702: roachtest: update rust-postgres blocklist r=e-mbrown a=RichardJCai

Release note: None

Fixes #81633

Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
Co-authored-by: richardjcai <caioftherichard@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

5 participants