Skip to content

colserde: perform memory accounting for scratch slices in the converter#94660

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:converter-accounting
Jan 4, 2023
Merged

colserde: perform memory accounting for scratch slices in the converter#94660
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:converter-accounting

Conversation

@yuzefovich
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich commented Jan 3, 2023

This commit introduces the memory accounting for values and offsets
scratch slices that are being reused across BatchToArrow calls since
recently. This required a lot of plumbing to propagate the memory
account from all places. The converter is careful to only grow and
shrink the account by its own usage, so the same memory account can be
reused across multiple converters (as long as there is no concurrency
going on, and we only can have concurrency for hash router outputs, so
there we give each output a separate account).

An additional minor change is that now in diskQueue.Enqueue we
properly Close the FileSerializer before nil-ing it out. This
isn't a problem per se since it is the caller's responsibility to close
the account used by the serializer, but it's nice to properly release
the accounted for bytes.

Epic: None

Release note: None

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@yuzefovich yuzefovich force-pushed the converter-accounting branch 7 times, most recently from fba13da to 2815410 Compare January 4, 2023 00:06
@yuzefovich yuzefovich changed the title colserde: perform memory accounting for scratch slices colserde: perform memory accounting for scratch slices in the converter Jan 4, 2023
@yuzefovich yuzefovich marked this pull request as ready for review January 4, 2023 00:06
@yuzefovich yuzefovich requested a review from a team as a code owner January 4, 2023 00:06
@yuzefovich yuzefovich requested review from DrewKimball, michae2 and rytaft and removed request for rytaft January 4, 2023 00:06
Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 57 of 57 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @michae2 and @yuzefovich)


pkg/col/colserde/arrowbatchconverter.go line 82 at r1 (raw file):

// acc must be a non-nil unlimited memory account unless ArrowToBatchOnly mode
// is used. The account will be shrunk in Release, however, it is the caller's
// responsibility to close the account.

[nit] Might be worth it to call out that Release can't be called after the caller has closed the account.


pkg/col/colserde/arrowbatchconverter.go line 361 at r1 (raw file):

	for i := range c.scratch.values {
		// Some of these might be nil in which case footprint won't change.
		footprint += int64(len(c.scratch.values[i]))

Should we use cap here instead of len?

This commit introduces the memory accounting for values and offsets
scratch slices that are being reused across `BatchToArrow` calls since
recently. This required a lot of plumbing to propagate the memory
account from all places. The converter is careful to only grow and
shrink the account by its own usage, so the same memory account can be
reused across multiple converters (as long as there is no concurrency
going on, and we only can have concurrency for hash router outputs, so
there we give each output a separate account).

An additional minor change is that now in `diskQueue.Enqueue` we
properly `Close` the `FileSerializer` before `nil`-ing it out. This
isn't a problem per se since it is the caller's responsibility to close
the account used by the serializer, but it's nice to properly release
the accounted for bytes.

Release note: None
@yuzefovich yuzefovich force-pushed the converter-accounting branch from 2815410 to 2922777 Compare January 4, 2023 16:34
Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @michae2)


pkg/col/colserde/arrowbatchconverter.go line 82 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] Might be worth it to call out that Release can't be called after the caller has closed the account.

Good point, done.


pkg/col/colserde/arrowbatchconverter.go line 361 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Should we use cap here instead of len?

Indeed, nice catch, thanks.

@yuzefovich
Copy link
Copy Markdown
Member Author

TFTR!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 4, 2023

Build succeeded:

@craig craig bot merged commit cb3d6db into cockroachdb:master Jan 4, 2023
@yuzefovich yuzefovich deleted the converter-accounting branch January 4, 2023 23:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants