Skip to content

clusterversion: benchmark and rm Unmarshal allocs#113043

Merged
craig[bot] merged 4 commits intocockroachdb:masterfrom
pav-kv:bench-cluster-version-check
Oct 31, 2023
Merged

clusterversion: benchmark and rm Unmarshal allocs#113043
craig[bot] merged 4 commits intocockroachdb:masterfrom
pav-kv:bench-cluster-version-check

Conversation

@pav-kv
Copy link
Copy Markdown
Collaborator

@pav-kv pav-kv commented Oct 25, 2023

Currently, each IsActive does a memory allocation:

==================== Test output for //pkg/clusterversion:clusterversion_test:
goos: darwin
goarch: arm64
BenchmarkClusterVersionSettingIsActive
BenchmarkClusterVersionSettingIsActive-10       28778041                42.03 ns/op           16 B/op          1 allocs/op
PASS

Since the cluster version check is in many hot paths, we should eliminate this allocation.

After:

==================== Test output for //pkg/clusterversion:clusterversion_test:
goos: darwin
goarch: arm64
BenchmarkClusterVersionSettingIsActive
BenchmarkClusterVersionSettingIsActive-10       45417914                26.43 ns/op            0 B/op          0 allocs/op
PASS

Touches #111561
Epic: none
Release note: none

@pav-kv pav-kv requested a review from dhartunian October 25, 2023 13:19
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@pav-kv
Copy link
Copy Markdown
Collaborator Author

pav-kv commented Oct 25, 2023

After some experiments, it seems like the allocation is hidden inside protoutil.Unmarshal.

@pav-kv pav-kv added backport-23.2.x PAST MAINTENANCE SUPPORT: 23.2 patch releases via ER request only backport-23.1.x PAST MAINTENANCE SUPPORT: 23.1 patch releases via ER request only labels Oct 25, 2023
@pav-kv
Copy link
Copy Markdown
Collaborator Author

pav-kv commented Oct 25, 2023

There are more Unmarshal calls in this package which probably could be optimized too.
Upd: done.

@pav-kv pav-kv changed the title clusterversion: benchmark IsActive version check clusterversion: benchmark and remove allocs in Unmarshal Oct 25, 2023
@pav-kv pav-kv changed the title clusterversion: benchmark and remove allocs in Unmarshal clusterversion: benchmark and rm allocs in Unmarshal Oct 25, 2023
@pav-kv pav-kv changed the title clusterversion: benchmark and rm allocs in Unmarshal clusterversion: benchmark and rm Unmarshal allocs Oct 25, 2023
@erikgrinaker
Copy link
Copy Markdown
Contributor

After some experiments, it seems like the allocation is hidden inside protoutil.Unmarshal.

Yeah, gogoproto/Protobuf is very inefficient in terms of allocations. We've considered using some other encoding, e.g. Tobi was playing around with FlatBuffers last year, but it hasn't been prioritized. In the short term, it doesn't seem viable to replace it.

@erikgrinaker
Copy link
Copy Markdown
Contributor

I think there was a reason why we use protoutil.Unmarshal() over Message.Unmarshal(), I think there's even a linter for it. I'll have to refresh my memory on this stuff.

@pav-kv
Copy link
Copy Markdown
Collaborator Author

pav-kv commented Oct 25, 2023

I think there was a reason why we use protoutil.Unmarshal() over Message.Unmarshal(), I think there's even a linter for it. I'll have to refresh my memory on this stuff.

Interesting.

When looking at code, I concluded that protoutil.Unmarshal pretty much calls Reset and Unmarshal on the message. We don't need the Reset in all these cases because the message is created fresh, so we can just call Unmarshal on the concrete message.

@erikgrinaker
Copy link
Copy Markdown
Contributor

When looking at code, I concluded that protoutil.Unmarshal pretty much calls Reset and Unmarshal on the message. We don't need the Reset in all these cases because the message is created fresh, so we can just call Unmarshal on the concrete message.

Yeah, looks about right, this is probably fine then.

@pav-kv
Copy link
Copy Markdown
Collaborator Author

pav-kv commented Oct 25, 2023

We do have this lint indeed.

@erikgrinaker
Copy link
Copy Markdown
Contributor

That's what I thought:

    lint_test.go:1295:
        clusterversion/setting.go:140:  if err := curVer.Unmarshal(encoded.([]byte)); err != nil { <- forbidden; use 'protoutil.Unmarshal' instead
    lint_test.go:1295:
        clusterversion/setting.go:157:  if err := clusterVersion.Unmarshal(val); err != nil { <- forbidden; use 'protoutil.Unmarshal' instead
    lint_test.go:1295:
        clusterversion/setting.go:168:  if err := newCV.Unmarshal(newRawProto); err != nil { <- forbidden; use 'protoutil.Unmarshal' instead
    lint_test.go:1295:
        clusterversion/setting.go:177:  if err := oldCV.Unmarshal(curRawProto); err != nil { <- forbidden; use 'protoutil.Unmarshal' instead
    lint_test.go:1295:
        clusterversion/setting.go:213:  if err := ver.Unmarshal(rawProto); err != nil { <- forbidden; use 'protoutil.Unmarshal' instead

It's probably okay anyway, but I'll have a closer look at this when I have time.

@pav-kv
Copy link
Copy Markdown
Collaborator Author

pav-kv commented Oct 25, 2023

The linter was introduced here: #18455. I think it's overly restrictive. It costs unnecessary allocations.

I tried generics:

func unmarshal[Message protoutil.Message](m Message, b []byte) error {
	m.Reset()
	return m.Unmarshal(b)
}

...

if err := unmarshal[*ClusterVersion](&curVer, encoded.([]byte)); err != nil {
	log.Fatalf(ctx, "%v", err)
}

but it doesn't remove the allocation. I don't think the calls are inlined in this case.

So the only thing we can do is either lift or workaround this restriction. Maybe we can silence the linter knowing that this case is safe.

Copy link
Copy Markdown
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the only thing we can do is either lift or workaround this restriction. Maybe we can silence the linter knowing that this case is safe.

You can add an exception for clusterversion/setting.go here, and add a comment at the call sites for why we don't use protoutil.Unmarshal():

https://github.com/cockroachdb/cockroach/blob/fb37a8215c68482aa0e2934d195d0d5c66f62fbd/pkg/testutils/lint/lint_test.go#L1274-L1281

As @dt pointed out, it seems unfortunate that we're even unmarshaling here, since isActive() is commonly used in hot paths. Let's write up an issue for this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't strictly need to change all of these, they're not hot paths. I think activeVersionOrEmpty, and possibly Decode, are the ones we care about.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Honestly we could also just move clusterversion.proto into another package that pkg/settings is allowed to depend on and then store a *ClusterVersion in the atomic instead of a []byte so there'd be no unmarshal -- allocating or otherwise -- on the hot path. But that's a bigger change, so if you're eyeing a back port to resolve an observed customer issue, this more targeted removal of just the alloc, and just on the hot path, probably makes sense for now.

Copy link
Copy Markdown
Collaborator Author

@pav-kv pav-kv Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving as is so that this file consistently uses one way. LMK if feel strongly.
I will file an issue to get rid of protos in this package (and added a TODO in the code).
#113385

@erikgrinaker
Copy link
Copy Markdown
Contributor

Btw, I think we could add a generic protoutil.UnmarshalNew() which always creates a new message, guaranteeing that it's safe to omit Reset(). Something like:

func UnmarshalNew[M Message](data []byte) (M, error) {
  var m M
  return m.Unmarshal(data)
}

@pav-kv
Copy link
Copy Markdown
Collaborator Author

pav-kv commented Oct 26, 2023

Btw, I think we could add a generic protoutil.UnmarshalNew() which always creates a new message, guaranteeing that it's safe to omit Reset(). Something like:

func UnmarshalNew[M Message](data []byte) (M, error) {
  var m M
  return m.Unmarshal(data)
}

Tried that:

func UnmarshalNew[M Message](data []byte) (M, error) {
	var m M
	err := m.Unmarshal(data)
	return m, err
}

// and then
curVer, err := protoutil.UnmarshalNew[ClusterVersion](encoded.([]byte))
if err != nil {
	log.Fatalf(ctx, "%v", err)
}

This does not build:

pkg/clusterversion/setting.go:139:40: ClusterVersion does not satisfy protoutil.Message (method MarshalTo has pointer receiver)

Setting the M type to pointer doesn't help either: var m M becomes a nil, and then there is a nil deref panic.

@erikgrinaker
Copy link
Copy Markdown
Contributor

Setting the M type to pointer doesn't help either: var m M becomes a nil, and then there is a nil deref panic.

Can do m := new(M) or something, perhaps.

Either way, as you pointed out on Slack, this may not help with performance anyway due to how Go monomorphizes generic functions.

@pav-kv pav-kv force-pushed the bench-cluster-version-check branch from 21a83e1 to 3969f04 Compare October 31, 2023 09:28
@pav-kv pav-kv removed the request for review from dhartunian October 31, 2023 09:31
@pav-kv
Copy link
Copy Markdown
Collaborator Author

pav-kv commented Oct 31, 2023

bors r=erikgrinaker

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Oct 31, 2023

🕐 Waiting for PR status (GitHub check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set.

Currently, each IsActive does a memory allocation:

```
==================== Test output for //pkg/clusterversion:clusterversion_test:
goos: darwin
goarch: arm64
BenchmarkClusterVersionSettingIsActive
BenchmarkClusterVersionSettingIsActive-10       28778041                42.03 ns/op           16 B/op          1 allocs/op
PASS
```

Since the cluster version check is in many hot paths, we should
eliminate this allocation.

Epic: none
Release note: none
The allocation no longer presents itself in the benchmark.

```
==================== Test output for //pkg/clusterversion:clusterversion_test:
goos: darwin
goarch: arm64
BenchmarkClusterVersionSettingIsActive
BenchmarkClusterVersionSettingIsActive-10       45417914                26.43 ns/op            0 B/op          0 allocs/op
PASS
```

Epic: none
Release note: none
The mandated protoutil.Unmarshal caused unnecessary allocations in
clusterversion/setting.go, is used extensively on critical paths. These
allocations were visible on benchmarks (e.g. 1% of all allocations in
one KV benchmark).

Epic: none
Release note: none
@pav-kv pav-kv force-pushed the bench-cluster-version-check branch from 3969f04 to 6a935ae Compare October 31, 2023 10:16
@pav-kv
Copy link
Copy Markdown
Collaborator Author

pav-kv commented Oct 31, 2023

bors retry

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Oct 31, 2023

Build succeeded:

@craig craig bot merged commit 1aabd30 into cockroachdb:master Oct 31, 2023
@pav-kv pav-kv deleted the bench-cluster-version-check branch October 31, 2023 11:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-23.1.x PAST MAINTENANCE SUPPORT: 23.1 patch releases via ER request only backport-23.2.x PAST MAINTENANCE SUPPORT: 23.2 patch releases via ER request only

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants