Skip to content

storage: remove key comparison in MVCC stats computations#85580

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
erikgrinaker:mvcc-compute-stats-optimize
Aug 11, 2022
Merged

storage: remove key comparison in MVCC stats computations#85580
craig[bot] merged 1 commit intocockroachdb:masterfrom
erikgrinaker:mvcc-compute-stats-optimize

Conversation

@erikgrinaker
Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker commented Aug 3, 2022

This patch restructures MVCC stats computations to use an MVCC iterator
with appropriate bounds. This allows omitting a key comparison in the
hot path, yielding a ~10% performance improvement. The new stats
functions are:

  • ComputeStats: calculates stats for a Reader key span.

  • ComputeStatsWithVisitors: calculates stats for a Reader key span,
    calling the given visitor callbacks for every point/range key.

  • ComputeStatsForIter: calculates stats for a given MVCCIterator, by
    scanning it until exhausted.

It also removes the MVCCIterator.ComputeStats method, which had no
business being part of the interface.

name                                     old time/op   new time/op   delta
MVCCComputeStats_Pebble/valueSize=64-24    130ms ± 1%    119ms ± 1%  -8.77%  (p=0.000 n=10+10)

name                                     old speed     new speed     delta
MVCCComputeStats_Pebble/valueSize=64-24  516MB/s ± 1%  565MB/s ± 1%  +9.61%  (p=0.000 n=10+10)

Resolves #41899.
Touches #84544.

Release note: None

@erikgrinaker erikgrinaker requested a review from jbowens August 3, 2022 21:39
@erikgrinaker erikgrinaker requested review from a team as code owners August 3, 2022 21:39
@erikgrinaker erikgrinaker self-assigned this Aug 3, 2022
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@erikgrinaker erikgrinaker force-pushed the mvcc-compute-stats-optimize branch from f05082c to 573cf07 Compare August 3, 2022 21:40
@erikgrinaker erikgrinaker requested a review from a team August 4, 2022 14:48
Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 22 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @jbowens)


pkg/kv/kvserver/replica_consistency.go line 708 at r1 (raw file):

		// TODO(sumeer): When we have replicated locks other than exclusive locks,
		// we will probably not have any interleaved intents so we could stop
		// using MVCCKeyAndIntentsIterKind and consider all locks here.

I can't remember what this TODO is trying to say. Is it merely saying that we can avoid using MVCCKeyAndIntentsIterKind and iterate separately over the lock table and the MVCC keys? If yes, I don't see the connection with "locks other than exclusive locks".


pkg/kv/kvserver/rditer/stats.go line 24 at r1 (raw file):

// TODO(sumeer): When we have replicated locks other than exclusive locks,
// we will probably not have any interleaved intents so we could stop
// using MVCCKeyAndIntentsIterKind and consider all locks here.

this new location for this TODO doesn't seem right -- we are not creating an MVCCIterator here. We do that in ComputeStatsWithVisitors.

@erikgrinaker erikgrinaker force-pushed the mvcc-compute-stats-optimize branch from 573cf07 to bb17474 Compare August 8, 2022 08:54
Copy link
Copy Markdown
Contributor Author

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jbowens and @sumeerbhola)


pkg/kv/kvserver/replica_consistency.go line 708 at r1 (raw file):

Previously, sumeerbhola wrote…

I can't remember what this TODO is trying to say. Is it merely saying that we can avoid using MVCCKeyAndIntentsIterKind and iterate separately over the lock table and the MVCC keys? If yes, I don't see the connection with "locks other than exclusive locks".

Glad I'm not the only one confused by this. :) That's my interpretation too, but I removed the comment -- the vast majority of data we scan across is going to be in the user span, so I don't think we're saving much by dropping the intentInterleavingIter for the few spans where we can avoid it.


pkg/kv/kvserver/rditer/stats.go line 24 at r1 (raw file):

Previously, sumeerbhola wrote…

this new location for this TODO doesn't seem right -- we are not creating an MVCCIterator here. We do that in ComputeStatsWithVisitors.

Yeah, but ComputeStatsWithVisitors can't make that determination, it needs to be plumbed down from the call site so I put it at the relevant call site instead.

Copy link
Copy Markdown
Contributor

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 12 of 22 files at r1, 12 of 12 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @erikgrinaker and @sumeerbhola)


pkg/storage/engine.go line 271 at r2 (raw file):

	// could be better solved by checking the iterator bounds in NewMVCCIterator
	// and requiring callers to set them appropriately.
	ComputeStats(start, end roachpb.Key, nowNanos int64) (enginepb.MVCCStats, error)

Glad to see this gone

@erikgrinaker erikgrinaker force-pushed the mvcc-compute-stats-optimize branch from bb17474 to d562b22 Compare August 11, 2022 14:54
This patch restructures MVCC stats computations to use an MVCC iterator
with appropriate bounds. This allows omitting a key comparison in the
hot path, yielding a ~10% performance improvement. The new stats
functions are:

* `ComputeStats`: calculates stats for a `Reader` key span.

* `ComputeStatsWithVisitors`: calculates stats for a `Reader` key span,
  calling the given visitor callbacks for every point/range key.

* `ComputeStatsForIter`: calculates stats for a given `MVCCIterator`, by
  scanning it until exhausted.

It also removes the `MVCCIterator.ComputeStats` method, which had no
business being part of the interface.

```
name                                     old time/op   new time/op   delta
MVCCComputeStats_Pebble/valueSize=64-24    130ms ± 1%    119ms ± 1%  -8.77%  (p=0.000 n=10+10)

name                                     old speed     new speed     delta
MVCCComputeStats_Pebble/valueSize=64-24  516MB/s ± 1%  565MB/s ± 1%  +9.61%  (p=0.000 n=10+10)
```

Release note: None
@erikgrinaker erikgrinaker force-pushed the mvcc-compute-stats-optimize branch from d562b22 to 7f61dfd Compare August 11, 2022 15:42
@erikgrinaker
Copy link
Copy Markdown
Contributor Author

Looks like I won the flake jackpot, with all 3 builds flaking with different failures. Assuming this is coincidental and YOLOing it.

bors r=jbowens

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 11, 2022

👎 Rejected by code reviews

@erikgrinaker erikgrinaker dismissed sumeerbhola’s stale review August 11, 2022 18:19

Comments addressed.

@erikgrinaker
Copy link
Copy Markdown
Contributor Author

bors r=jbowens

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 11, 2022

Build succeeded:

@craig craig bot merged commit 9fb8d36 into cockroachdb:master Aug 11, 2022
@erikgrinaker erikgrinaker deleted the mvcc-compute-stats-optimize branch August 12, 2022 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

storage: remove end-key comparison in ComputeStatsForRange

4 participants