
kvserver: rework memory allocation in replicastats#86592

Merged
craig[bot] merged 2 commits into cockroachdb:master from kvoli:220822.replstats-mem on Aug 31, 2022

Conversation

@kvoli
Contributor

@kvoli kvoli commented Aug 22, 2022

This patch removes some unused fields within the replica stats object.
It also opts to allocate all the memory needed upfront for a replica
stats object for better cache locality and less GC overhead.

This patch also removes locality tracking for the other throughput trackers
to reduce per-replica memory footprint.

resolves #85112

Release justification: low risk, lowers memory footprint to avoid oom.
Release note: None

@cockroach-teamcity
Member

This change is Reviewable

@kvoli kvoli force-pushed the 220822.replstats-mem branch from 914e0bc to 9522702 on August 22, 2022 22:37
@kvoli kvoli force-pushed the 220822.replstats-mem branch from 9522702 to afe67ec on August 23, 2022 13:47
@kvoli kvoli marked this pull request as ready for review August 23, 2022 13:47
@kvoli kvoli requested a review from a team as a code owner August 23, 2022 13:47
@kvoli kvoli self-assigned this Aug 23, 2022
@kvoli kvoli added the release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. label Aug 24, 2022
Contributor

@irfansharif irfansharif left a comment


Reducing the size of the struct like we've done is good, but the total count of allocated objects also matters, since it can put undue pressure on the Go GC. It seems that for every *ReplicaStats object we have 6 underlying *replicaStatsRecord objects.

for i := range rs.Mu.records {
	rs.Mu.records[i] = newReplicaStatsRecord()
}

Also for every *Replica on a store, we have 6 *ReplicaStats objects.

return &ReplicaLoad{
	batchRequests: replicastats.NewReplicaStats(clock, getNodeLocality),
	requests:      replicastats.NewReplicaStats(clock, nil),
	writeKeys:     replicastats.NewReplicaStats(clock, nil),
	readKeys:      replicastats.NewReplicaStats(clock, nil),
	writeBytes:    replicastats.NewReplicaStats(clock, nil),
	readBytes:     replicastats.NewReplicaStats(clock, nil),
}

So that's 6*6=36 objects per replica. Could we get this down to just one object allocated per replica? I suspect the allocation count is more of a bottleneck than the size of each object itself.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @lidorcarmel)
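One possible shape of the single-allocation direction suggested above (a sketch only; names are illustrative and this is not the actual change made in the PR) is to embed the six trackers by value rather than holding six pointers:

```go
package main

import "fmt"

// replicaStats stands in for the per-dimension throughput tracker;
// real fields are elided.
type replicaStats struct {
	sum float64
}

// replicaLoad embeds the trackers by value instead of holding six
// *replicaStats pointers, so building one costs a single heap object
// instead of six (plus their nested records).
type replicaLoad struct {
	batchRequests replicaStats
	requests      replicaStats
	writeKeys     replicaStats
	readKeys      replicaStats
	writeBytes    replicaStats
	readBytes     replicaStats
}

func newReplicaLoad() *replicaLoad {
	return &replicaLoad{} // one allocation covers all six trackers
}

func main() {
	rl := newReplicaLoad()
	rl.writeKeys.sum++
	fmt.Println(rl.writeKeys.sum) // → 1
}
```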

Contributor

@lidorcarmel lidorcarmel left a comment


Reviewed 2 of 2 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)


pkg/kv/kvserver/replicastats/replica_stats.go line 104 at r1 (raw file):

// mergeReplicaStatsRecords combines two records and returns a new record with
// the merged data. When this is called with nil records, a nil record is

update the comment? at least replace nil with "inactive" or something like that.


pkg/kv/kvserver/replicastats/replica_stats.go line 107 at r1 (raw file):

// returned; otherwise a new record instead.
func (rsr *replicaStatsRecord) mergeReplicaStatsRecords(other *replicaStatsRecord) {
	if !rsr.active && !other.active {

can this ever happen? I think no but maybe I'm missing something.


pkg/kv/kvserver/replicastats/replica_stats.go line 115 at r1 (raw file):

	}

	if !other.active {

I think these 2 if statements are also not needed? because we will always reset when deactivating?

Code quote:

	if !rsr.active {
		rsr.reset()
	}

	if !other.active {
		other.reset()
	}

pkg/kv/kvserver/replicastats/replica_stats.go line 120 at r1 (raw file):

	rsr.sum += other.sum
	rsr.active = true

btw in the old behavior we "activated" other.. right? so this is different. this new behavior makes more sense to me though.


pkg/kv/kvserver/replicastats/replica_stats.go line 210 at r1 (raw file):

		if !rs.Mu.records[i].active {
			other.Mu.records[i].reset()
			other.Mu.records[i].active = false

reset() will already set active to false, right? so maybe drop this line?

can we simplify things by having activate() and deactivate() or something like that? I feel like I need to track reset() together with active which is error prone.
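The suggested helpers could look roughly like this (a sketch with hypothetical field names, not the PR's code), keeping the counters and the active flag in lockstep so they cannot drift apart:

```go
package main

import "fmt"

// statsRecord stands in for the per-window record; fields are illustrative.
type statsRecord struct {
	sum    float64
	count  int
	active bool
}

// activate zeroes the counters and marks the record live, so callers
// never observe stale data in a freshly reused window.
func (r *statsRecord) activate() {
	r.sum, r.count = 0, 0
	r.active = true
}

// deactivate zeroes the counters and clears the flag in one place,
// so resetting and deactivating can never go out of sync.
func (r *statsRecord) deactivate() {
	r.sum, r.count = 0, 0
	r.active = false
}

func main() {
	var r statsRecord
	r.activate()
	r.sum += 10
	r.deactivate()
	fmt.Println(r.sum, r.active) // → 0 false
}
```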

@kvoli kvoli force-pushed the 220822.replstats-mem branch from afe67ec to 41e4313 on August 25, 2022 15:45
@blathers-crl blathers-crl bot requested a review from irfansharif August 25, 2022 15:45
Contributor Author

@kvoli kvoli left a comment


I agree we should reconsider the structure, however I'm opting not to move to the single-object model yet. I don't think it would address the OOM issue that motivated this change, and I would rather put in a smaller fix here.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @lidorcarmel)


pkg/kv/kvserver/replicastats/replica_stats.go line 104 at r1 (raw file):

Previously, lidorcarmel (Lidor Carmel) wrote…

update the comment? at least replace nil with "inactive" or something like that.

Updated.


pkg/kv/kvserver/replicastats/replica_stats.go line 107 at r1 (raw file):

Previously, lidorcarmel (Lidor Carmel) wrote…

can this ever happen? I think no but maybe I'm missing something.

This can happen - it's possible that either has only gone through < 6 windows, in which case they would be inactive.

The logic is bloated however, I've updated to make this clearer.


pkg/kv/kvserver/replicastats/replica_stats.go line 115 at r1 (raw file):

Previously, lidorcarmel (Lidor Carmel) wrote…

I think these 2 if statements are also not needed? because we will always reset when deactivating?

Updated - see above.


pkg/kv/kvserver/replicastats/replica_stats.go line 210 at r1 (raw file):

Previously, lidorcarmel (Lidor Carmel) wrote…

reset() will already set active to false, right? so maybe drop this line?

can we simplify things by having activate() and deactivate() or something like that? I feel like I need to track reset() together with active which is error prone.

Good idea. Updated.

Contributor

@lidorcarmel lidorcarmel left a comment


consider filing a short issue about this so we don't forget (unless you can send the pr soon).

Reviewed 3 of 3 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @kvoli)


pkg/kv/kvserver/replicastats/replica_stats.go line 107 at r1 (raw file):

Previously, kvoli (Austen) wrote…

This can happen - it's possible that either has only gone through < 6 windows, in which case they would be inactive.

The logic is bloated however, I've updated to make this clearer.

thanks!


pkg/kv/kvserver/replicastats/replica_stats.go line 78 at r3 (raw file):

		syncutil.Mutex
		idx int
		// the window records are initialized once, then each window is reused

s/the/The/ :)
and a double space on the next line (I missed those!)

thanks for the comment btw.

Code quote:

the

pkg/kv/kvserver/replicastats/replica_stats.go line 103 at r3 (raw file):

	// values to avoid churning memory. We could also delete elements that
	// haven't been seen in a while, however it makes no difference as the
	// memory wouldn't be released (https://github.com/golang/go/issues/20135).

nice!

Code quote:

	// haven't been seen in a while, however it makes no difference as the
	// memory wouldn't be released (https://github.com/golang/go/issues/20135).
	// To protect against a memory leak, where many unused localities
	// accumulate, create a new map if the number of unused localities was
	// greater than 1.

pkg/kv/kvserver/replicastats/replica_stats.go line 240 at r3 (raw file):

			continue
		}
		// Otherwise, active the rhs record and split the request count between

s/active/activate/ probably?

Code quote:

active 
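The map-recycling policy described in the quoted comment above (reuse the map to avoid churn, but replace it when more than one locality goes unused, since Go maps never release bucket memory per golang/go#20135) can be sketched as follows; function and variable names are illustrative, not from the PR:

```go
package main

import "fmt"

// resetLocalityCounts zeroes the per-locality counters in place to
// avoid re-allocating map buckets on every rotation. Because Go maps
// never shrink (golang/go#20135), if more than one locality went unused
// we drop the map and start fresh to bound the leak.
func resetLocalityCounts(counts map[string]float64, seen map[string]bool) map[string]float64 {
	unused := 0
	for loc := range counts {
		if !seen[loc] {
			unused++
		}
	}
	if unused > 1 {
		// Too many dead entries: abandon the map so its buckets can
		// be collected, sized for the localities actually in use.
		return make(map[string]float64, len(seen))
	}
	// Otherwise zero in place, reusing the existing buckets.
	for loc := range counts {
		counts[loc] = 0
	}
	return counts
}

func main() {
	counts := map[string]float64{"us-east": 5, "us-west": 2, "eu": 1}
	seen := map[string]bool{"us-east": true}
	counts = resetLocalityCounts(counts, seen)
	fmt.Println(len(counts)) // → 0 (two unused localities, so a fresh map)
}
```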

@kvoli kvoli force-pushed the 220822.replstats-mem branch 3 times, most recently from 4cebeec to 0bb69c9 on August 31, 2022 14:15
kvoli added 2 commits August 31, 2022 15:48
This patch removes some unused fields within the replica stats object.
It also opts to allocate all the memory needed upfront for a replica
stats object for better cache locality and less GC overhead.

resolves cockroachdb#85112

Release justification: low risk, lowers memory footprint to avoid oom.
Release note: None
Previously, every replica throughput tracker in `replica_load` would
track the request locality. Only batch requests used this locality for
follow-the-workload leaseholder transfers. This patch removes locality
tracking for the other throughput trackers to reduce per-replica memory
footprint.

Release justification: Low risk, high benefit change to reduce memory
overhead in multi-region clusters.
Release note: None
@kvoli kvoli force-pushed the 220822.replstats-mem branch from 0bb69c9 to 5e91ca5 on August 31, 2022 15:52
Contributor Author

@kvoli kvoli left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @kvoli, and @lidorcarmel)


pkg/kv/kvserver/replicastats/replica_stats.go line 107 at r1 (raw file):

Previously, lidorcarmel (Lidor Carmel) wrote…

thanks!

updated


pkg/kv/kvserver/replicastats/replica_stats.go line 120 at r1 (raw file):

Previously, lidorcarmel (Lidor Carmel) wrote…

btw in the old behavior we "activated" other.. right? so this is different. this new behavior makes more sense to me though.

yeah - changed to be more logical.


pkg/kv/kvserver/replicastats/replica_stats.go line 78 at r3 (raw file):

Previously, lidorcarmel (Lidor Carmel) wrote…

s/the/The/ :)
and a double space on the next line (I missed those!)

thanks for the comment btw.

fixed.


pkg/kv/kvserver/replicastats/replica_stats.go line 240 at r3 (raw file):

Previously, lidorcarmel (Lidor Carmel) wrote…

s/active/activate/ probably?

updated.

@kvoli
Contributor Author

kvoli commented Aug 31, 2022

filed #87187 for ref.

Tested w/ 300k ranges.

Minimal (-500 MB) difference. Multi-region would show a greater distinction, likely O(localities).

Also pushed ~3.48% more throughput on kv95, however that's within error imo.

# preceding commit
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
 1800.0s        0       34185125        18991.5    215.7     92.3    906.0   1744.8  10737.4

# change
_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
 1800.0s        0       35372909        19651.4    208.5     92.3    872.4   1677.7   9663.7

@kvoli
Contributor Author

kvoli commented Aug 31, 2022

bors r+

@craig craig bot merged commit 0bcbece into cockroachdb:master Aug 31, 2022
@craig
Contributor

craig bot commented Aug 31, 2022

Build succeeded:



Development

Successfully merging this pull request may close these issues.

roachtest: kv/splits/nodes=3/quiesce=true failed

4 participants