Skip to content

colexec: batch allocate datums for projectTupleOp#74724

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
ajwerner:ajwerner/optimize-tuple-proj-op
Jan 12, 2022
Merged

colexec: batch allocate datums for projectTupleOp#74724
craig[bot] merged 1 commit intocockroachdb:masterfrom
ajwerner:ajwerner/optimize-tuple-proj-op

Conversation

@ajwerner
Copy link
Copy Markdown
Contributor

@ajwerner ajwerner commented Jan 12, 2022

This has a profound impact on the amound of garbage generated by the delivery
query in TPC-C.

name                    old time/op    new time/op    delta
TPCC/mix=delivery=1-16    38.0ms ± 2%    35.8ms ± 1%   -5.76%  (p=0.000 n=9+8)

name                    old alloc/op   new alloc/op   delta
TPCC/mix=delivery=1-16    8.17MB ± 1%    7.97MB ± 1%   -2.36%  (p=0.000 n=9+10)

name                    old allocs/op  new allocs/op  delta
TPCC/mix=delivery=1-16     80.2k ± 0%     20.3k ± 1%  -74.65%  (p=0.000 n=10+9)

Leveraged #74443 to find this.

Release note: None

@ajwerner ajwerner requested a review from yuzefovich January 12, 2022 02:26
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! I think this change is very good, :lgtm:

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/colexec/tuple_proj_op.go, line 73 at r1 (raw file):

	}

	t.allocator.PerformOperation([]coldata.Vec{projVec}, func() {

So actually we will perform the memory accounting for all these tuples - t.allocator.PerformOperation measures how much space projVec is taking up before the anonymous function, how much after the function, and adjusts the memory account accordingly. In this case we assume that old tuples in projVec have been discarded (we're overwriting the references in projVec itself right now, and if the caller wanted to hold on to the old tuples, it must have copied them somewhere and adjusted its memory account).

My comment about DatumAlloc was applicable to the row-by-row engine where we don't have an easy way to track the lifetime of the datums and don't have helpers like PerformOperation to perform the accounting.


pkg/sql/colexec/tuple_proj_op.go, line 76 at r1 (raw file):

		// Preallocate the tuples and their underlying datums in a contiguous
		// slice to reduce allocations.
		tuples := make([]tree.DTuple, n)

I started thinking whether we can reuse any of these allocations between iterations of tupleProjOp.Next, and I think we cannot. We definitely cannot reuse tuples[i] because all datums are assumed to be immutable. datums[::] backs each tuple, so if we were to change it, we'd mutate the already returned tuple - also no bueno. Same goes for tuples.

@ajwerner ajwerner force-pushed the ajwerner/optimize-tuple-proj-op branch from 7ad3e53 to 7a3ca8d Compare January 12, 2022 04:43
Copy link
Copy Markdown
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized we don't need two loops. It helps another on the runtime.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @yuzefovich)


pkg/sql/colexec/tuple_proj_op.go, line 73 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

So actually we will perform the memory accounting for all these tuples - t.allocator.PerformOperation measures how much space projVec is taking up before the anonymous function, how much after the function, and adjusts the memory account accordingly. In this case we assume that old tuples in projVec have been discarded (we're overwriting the references in projVec itself right now, and if the caller wanted to hold on to the old tuples, it must have copied them somewhere and adjusted its memory account).

My comment about DatumAlloc was applicable to the row-by-row engine where we don't have an easy way to track the lifetime of the datums and don't have helpers like PerformOperation to perform the accounting.

Ack.


pkg/sql/colexec/tuple_proj_op.go, line 76 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I started thinking whether we can reuse any of these allocations between iterations of tupleProjOp.Next, and I think we cannot. We definitely cannot reuse tuples[i] because all datums are assumed to be immutable. datums[::] backs each tuple, so if we were to change it, we'd mutate the already returned tuple - also no bueno. Same goes for tuples.

Ack.

Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner)


-- commits, line 4 at r2:
nit: s/amound/amount/.

This has a profound impact on the amount of garbage generated by the delivery
query in TPC-C.

```
name                    old time/op    new time/op    delta
TPCC/mix=delivery=1-16    38.0ms ± 2%    35.8ms ± 1%   -5.76%  (p=0.000 n=9+8)

name                    old alloc/op   new alloc/op   delta
TPCC/mix=delivery=1-16    8.17MB ± 1%    7.97MB ± 1%   -2.36%  (p=0.000 n=9+10)

name                    old allocs/op  new allocs/op  delta
TPCC/mix=delivery=1-16     80.2k ± 0%     20.3k ± 1%  -74.65%  (p=0.000 n=10+9)
```

Leveraged cockroachdb#74443 to find this.

Release note: None
@ajwerner ajwerner force-pushed the ajwerner/optimize-tuple-proj-op branch from 7a3ca8d to 57732b1 Compare January 12, 2022 04:53
Copy link
Copy Markdown
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r=yuzefovich

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner)


-- commits, line 4 at r2:

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/amound/amount/.

Done.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 12, 2022

Build succeeded:

@craig craig bot merged commit 3c16801 into cockroachdb:master Jan 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants