sql: Make cancel checker more efficient#103866
sql: Make cancel checker more efficient#103866craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
22fec9e to
8e3f34b
Compare
yuzefovich
left a comment
There was a problem hiding this comment.
Nice! I like it. Curious though what prompted you to look into this?
(Also, not sure if you were planning to address those, but there are a few other places where we have a similar logic - e.g. colexecutils/cancel_checker.go.)
Reviewed 7 of 7 files at r1, 11 of 11 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @miretskiy)
pkg/util/cancelchecker/cancel_checker.go line 50 at r2 (raw file):
atomic.StoreUint32(&done, 1) }); err != nil { panic(err)
nit: add a comment here that err could only be non-nil if ctx.Done() is nil which we know is false.
I started working on conditional mutex -- where I wanted to lock the mutex when some condition
I can add it to this PR if you prefer.
Ack. Done. |
Add `ctxutil.WhenDone` function which invokes a callback when
parent context becomes done.
```
parent, cancelParent := context.WithTimeout(....)
ctxutil.WhenDone(parent, func() {....})
```
Normally, in order to invoke a function when context becomes
done, a go routine should be spun up that listens on
`ctx.Done()` channel. Sometimes, this is too heavy weight.
`ctxutil.WhenDone` adds such a mechanism, which under most
circumstances, simply invokes the function when context completes.
When parent context becomes done, the funciton is invoked
with the result (Err()) of the parent context.
If the parent context never completes, `WhenDone` returns an error.
Go versions 1.20 or higher also add ability to augment
context error message with "cause" error.
Epic: None
Release note: None
a125731 to
0c5d534
Compare
added. |
2f51263 to
26d0df0
Compare
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewed 14 of 14 files at r3, 18 of 19 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @miretskiy)
pkg/util/ctxutil/context.go line 35 at r4 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
@yuzefovich -- do you think I should allow this case -- like: you call whenDone when context is never done -- that's on you :)
Alternatively, maybe add a method such asMustInvokeWhenDone-- which has this check and have regular "WhenDone" work with any context?
Discussed offline, the current approach seems good to me.
pkg/util/cancelchecker/cancel_checker_test.go line 44 at r5 (raw file):
} // Old implementation:
nit: I'd extract this comment from the code into the commit message (I don't think we usually put things like this into the code).
done |
Arrange for cancel checker to be notified once context is done. ``` Old implementation: BenchmarkCancelChecker-10 133185742 8.790 ns/op New Implementation: BenchmarkCancelChecker-10 1000000000 0.5804 ns/op ``` Epic: None Release note: None
|
bors r+ |
|
I’d be interested @yuzefovich to know if this would be a visible improvement in SQL performance? Any sort of macro benchmark that may be interesting? |
|
My guess is that it won't be really visible in an end-to-end testing. For an OLTP query we're likely to do only a single cancellation check throughout the execution, so a few nanoseconds won't be visible; for an OLAP query we're likely to do many cancellation checks, so we could shave off microseconds or perhaps even milliseconds, yet the query latency would be in seconds, so it still wouldn't be visible. For the former, one could run |
|
Yeah; the overall change is not great; colexec calls to CheckEvery might have been more expensive... but still, it's hard to see those if you're talking seconds. A solid maybe :) |
|
If you don’t foresee visible macro improvements, doesn’t sound like a good use of your time @yuzefovich! I was just curious. If it were in a hot loop the answer might be different. Thanks. |
|
Build succeeded: |
|
Interestingly, this seems to lead to an increase in allocations which actually increases the latency on the hot path: (This is on 145bb3b, and I manually modified |
|
Probably due to uint32* allocation. And that's because planner copied
canceler. If it's guaranteed that reset is called, this alloc can be removed
…On Wed, May 31, 2023, 4:59 PM Yahor Yuzefovich ***@***.***> wrote:
Interestingly, this seems to lead to an increase in allocations which
actually increases the latency on the hot path:
***@***.***:~/go/src/github.com/cockroachdb/cockroach$ BENCHTIMEOUT=72h PKG=./pkg/sql/flowinfra BENCHES=BenchmarkFlowSetup ./scripts/bench HEAD~1 HEAD~0
Comparing HEAD~0 (new) with HEAD~1 (old)
Writing to /tmp/tmp.Y7b21cdmu3
Switching to HEAD~1
+ ./dev bench ./pkg/sql/flowinfra --timeout=72h --filter=BenchmarkFlowSetup --count=30 --bench-mem -v --stream-output --ignore-cache
Switching to HEAD~0
+ ./dev bench ./pkg/sql/flowinfra --timeout=72h --filter=BenchmarkFlowSetup --count=30 --bench-mem -v --stream-output --ignore-cache
name old time/op new time/op delta
FlowSetup/vectorize=true/distribute=true-24 204µs ± 3% 210µs ± 4% +2.89% (p=0.000 n=28+27)
FlowSetup/vectorize=true/distribute=false-24 205µs ± 8% 209µs ± 8% +2.22% (p=0.008 n=26+26)
FlowSetup/vectorize=false/distribute=true-24 205µs ± 5% 210µs ± 6% +2.57% (p=0.000 n=29+30)
FlowSetup/vectorize=false/distribute=false-24 201µs ± 6% 206µs ± 3% +2.70% (p=0.000 n=28+27)
name old alloc/op new alloc/op delta
FlowSetup/vectorize=true/distribute=true-24 24.8kB ±13% 25.2kB ±13% +1.49% (p=0.005 n=29+29)
FlowSetup/vectorize=true/distribute=false-24 23.6kB ±13% 23.4kB ±11% -0.56% (p=0.001 n=27+25)
FlowSetup/vectorize=false/distribute=true-24 28.7kB ±10% 28.9kB ±11% ~ (p=0.842 n=28+29)
FlowSetup/vectorize=false/distribute=false-24 27.0kB ± 1% 27.0kB ± 1% ~ (p=0.370 n=24+24)
name old allocs/op new allocs/op delta
FlowSetup/vectorize=true/distribute=true-24 265 ±15% 273 ±14% +2.81% (p=0.001 n=30+30)
FlowSetup/vectorize=true/distribute=false-24 253 ±17% 253 ± 1% +0.10% (p=0.001 n=27+23)
FlowSetup/vectorize=false/distribute=true-24 262 ±10% 267 ±16% +1.66% (p=0.001 n=28+30)
FlowSetup/vectorize=false/distribute=false-24 246 ± 1% 247 ± 0% +0.51% (p=0.000 n=24+19)
(This is on 145bb3b
<145bb3b>,
and I manually modified scripts/bench to use 30 iterations instead of 10.)
—
Reply to this email directly, view it on GitHub
<#103866 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ANA4FVAVQG2T35Q6X3VXLCDXI6WMDANCNFSM6AAAAAAYN3WY3A>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Have we resolved this latency increase? I don't think this PR is bringing us forward if we keep this latency increase on the hot path. Should I send a revert for this PR? |
|
I reverted the change. Cancel checker is about 8x faster. But not 128x (or
1024x) faster
…On Mon, Jun 19, 2023, 6:24 AM knz ***@***.***> wrote:
Interestingly, this seems to lead to an increase in allocations which
actually increases the latency on the hot path:
Have we resolved this latency increase? I don't think this PR is bringing
us forward if we keep this latency increase on the hot path.
Should I send a revert for this PR?
—
Reply to this email directly, view it on GitHub
<#103866 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ANA4FVGHC7RO4MU7JYLSFQDXMASE7ANCNFSM6AAAAAAYN3WY3A>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Previous commit cockroachdb#76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional goroutine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by cockroachdb#103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs cockroachdb#107053 Release note: None
Previous commit cockroachdb#76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional goroutine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by cockroachdb#103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs cockroachdb#107053 Release note: None
Previous commit cockroachdb#76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional goroutine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by cockroachdb#103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs cockroachdb#107053 Release note: None
Previous commit cockroachdb#76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional goroutine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by cockroachdb#103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs cockroachdb#107053 Release note: None
107079: grpcinterceptor: Use ctxutil to detect context cancellation r=miretskiy a=miretskiy Previous commit #76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional go routine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by #103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs #107053 Release note: None 107267: kvcoord: make rangefeed connection class `TenantReadOnly` r=erikgrinaker a=erikgrinaker Otherwise, the setting won't have any effect for tenants, always falling through to the default. Resolves #106895. Epic: none Release note: None 107273: build: use absolute path to pnpm-lock.yaml r=maryliag a=sjbarag Previously, actions/setup-node@v3 was used with a relative path for `cache-dependency-path`. This had an unexpected interaction with the `.defaults.run.working-directory: "pkg/ui/workspaces/cluster-ui"` declaration, which caused setup-node's cleanup action to look for pkg/ui/workspaces/cluster-ui/pkg/ui/pnpm-lock.yaml. Naturally, this file didn't exist. Use an absolute path (thanks, `github.workspace`!) to pnpm-lock.yaml instead. Release note: None Epic: none Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com> Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com> Co-authored-by: Sean Barag <barag@cockroachlabs.com>
Previous commit cockroachdb#76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional goroutine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by cockroachdb#103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs cockroachdb#107053 Release note: None
…tion Backport 1/2 commits from cockroachdb#103866 Backport 1/1 commits from cockroachdb#107079 Previous commit cockroachdb#76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional goroutine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by cockroachdb#103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs cockroachdb#107053 Release note: None
…tion Backport 1/2 commits from cockroachdb#103866 Backport 1/1 commits from cockroachdb#107079 Previous commit cockroachdb#76306 arranged for interceptors to run for all rpcs (local or otherwise). In doing so, it added an additional goroutine for each RPC whose purpose was to detect context completion, and to arrange for the cleanup function to run when that happened. `ctxutil.WhenDone` (added by cockroachdb#103866/commits/dfe4ef82810aaf81ae21afecce83cbc1fb6783dd) allows for the same functionality without the additional goroutine cost. Informs cockroachdb#107053 Release note: None Release justification: Important performance improvement for large scale rangefeed workloads to remove inadvertently introduced goroutine.
Arrange for cancel checker to be notified once context
is done.
Epic: None
Release note: None