Skip to content

copy: optimize date parsing memory usage#92252

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
cucaroach:date-parse-opt
Nov 29, 2022
Merged

copy: optimize date parsing memory usage#92252
craig[bot] merged 2 commits intocockroachdb:masterfrom
cucaroach:date-parse-opt

Conversation

@cucaroach
Copy link
Copy Markdown
Contributor

@cucaroach cucaroach commented Nov 21, 2022

  • pgdate: benchmark ParseDate
  • pgdate: optimize bulk date parsing
  • sql: wire up pgdate optimization to ParseAndRequireString and colexec.

Fixes: #91834

I'll squash the last two commits but thought for review purposes they'd be better separate.

Epic: CRDB-18892

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@cucaroach cucaroach changed the title date parse opt copy: optimize date parsing memory usage Nov 21, 2022
Simple benchmark for ParseDate to highlight forthcoming optimizations.

Release note: None
@cucaroach cucaroach force-pushed the date-parse-opt branch 2 times, most recently from 5e996cd to f12108b Compare November 22, 2022 00:10
@cucaroach cucaroach requested a review from otan November 22, 2022 01:02
@cucaroach cucaroach marked this pull request as ready for review November 22, 2022 01:03
@cucaroach cucaroach requested a review from a team November 22, 2022 01:03
@cucaroach cucaroach requested a review from a team as a code owner November 22, 2022 01:03
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r1, 5 of 5 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach and @otan)


pkg/sql/colexec/execgen/cmd/execgen/cast_gen_util.go line 398 at r3 (raw file):

		_now := %[3]s.GetRelativeParseTime()
		_dateStyle := %[3]s.GetDateStyle()
		_d, _, err := pgdate.ParseDate(_now, _dateStyle, string(%[2]s), &evalCtx.ParseHelper)

nit: replace evalCtx with %[3]s as we do for _now and _dateStyle. The rationale is trying to keep the templating more flexible (although in this case the flexibility is probably not worth it since I don't foresee us generating multiple files based on cast_tmpl, still it's better for consistency sake).


pkg/sql/sem/tree/datum.go line 2008 at r3 (raw file):

	// GetDateStyle returns the date style in the session.
	GetDateStyle() pgdate.DateStyle
	// GetParseHelper returns a helper to optmize date parsing

nit: missing period.


pkg/util/timeutil/pgdate/field_extract.go line 78 at r2 (raw file):

	// This indicates that the value in the year field was only
	// two digits and should be adjusted to make it recent.
	tweakYear bool

nit: I'm assuming that the fields are re-arranged to get the object size under some threshold. Consider adding an assertion in init() method that verifies that the size doesn't exceed that threshold.

Copy link
Copy Markdown
Contributor Author

@cucaroach cucaroach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @otan and @yuzefovich)


pkg/util/timeutil/pgdate/field_extract.go line 78 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I'm assuming that the fields are re-arranged to get the object size under some threshold. Consider adding an assertion in init() method that verifies that the size doesn't exceed that threshold.

Seems like overkill, I just wanted to save 16 bytes by collapsing bools, it probably didn't move the needle, just seemed like a good idea.

@cucaroach
Copy link
Copy Markdown
Contributor Author

TFTRs!
bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 28, 2022

Build failed:

Add a "helper" to cache memory allocations to make repeated/bulk date
parsing more efficient.

```
name          old time/op    new time/op    delta
ParseDate-10    1.76µs ±12%    1.49µs ± 6%   -15.08%  (p=0.000 n=9+10)

name          old alloc/op   new alloc/op   delta
ParseDate-10    1.66kB ± 0%    0.00kB       -100.00%  (p=0.000 n=10+10)

name          old allocs/op  new allocs/op  delta
ParseDate-10      8.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
```

Fixes: cockroachdb#91834

Release note: None
@cucaroach
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 28, 2022

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 28, 2022

Build failed:

@cucaroach
Copy link
Copy Markdown
Contributor Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 29, 2022

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 29, 2022

Build succeeded:

@craig craig bot merged commit 6588db8 into cockroachdb:master Nov 29, 2022
@cucaroach cucaroach deleted the date-parse-opt branch November 29, 2022 12:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pgdate: pgdate.ParseDate allocates a lot of temporary heap memory

4 participants