server, sql: Query cancellation for non-distributed queries#17003
server, sql: Query cancellation for non-distributed queries#17003itsbilal merged 4 commits intocockroachdb:masterfrom
Conversation
|
cc @asubiotto who might have some thoughts Review status: 0 of 41 files reviewed at latest revision, all discussions resolved, all commit checks successful. Comments from Reviewable |
Reviewed 16 of 41 files at r1. pkg/sql/cancel_checker.go, line 63 at r1 (raw file):
If you are using an atomic and there will be no context check, then there is also no need for a check interval. pkg/sql/sqlbase/sort.go, line 24 at r1 (raw file):
I am not OK with placing this code in the Also, you used way too many calls to the cancel check method. pkg/util/uint128/uint128_test.go, line 35 at r1 (raw file):
Test the error cases too. Comments from Reviewable |
3337fcd to
9d54697
Compare
|
@knz Did a split into 3 commits (slightly different from what you suggested, but it made the changes a lot easier to separate). Review status: 14 of 39 files reviewed at latest revision, 3 unresolved discussions. pkg/sql/cancel_checker.go, line 63 at r1 (raw file): Previously, knz (kena) wrote…
Adding a check for context.Done too now. pkg/util/uint128/uint128_test.go, line 35 at r1 (raw file): Previously, knz (kena) wrote…
On it - will also add more tests in the next commit. pkg/sql/sqlbase/sort.go, line 24 at r1 (raw file):
Not sure where specifically I'm checking too often. But I've removed one fairly-redundant check in Comments from Reviewable |
|
Thanks for all the good work.
Reviewed 2 of 41 files at r1, 28 of 30 files at r2, 14 of 14 files at r3, 14 of 14 files at r4. pkg/server/serverpb/status.proto, line 268 at r2 (raw file):
Maybe this should be encoded as the 16 bytes of the uuid instead of the 32 bytes of the hex representation! pkg/sql/cancel_checker.go, line 55 at r4 (raw file):
Don't make the conditional like this. Instead use a do-nothing function as cancel checker for internal planners. (Basic principle: don't pay N times for something you can check ahead of time just once) pkg/sql/crdb_internal.go, line 559 at r2 (raw file):
s/id/query_id/ pkg/sql/delete.go, line 142 at r4 (raw file):
Factor this code with the one in pkg/sql/executor.go, line 1063 at r2 (raw file):
make the timestamp here and in phaseTimes be the same. pkg/sql/executor.go, line 379 at r3 (raw file):
"query ID %s not found" pkg/sql/filter_opt.go, line 323 at r4 (raw file):
wrong commit, it seems pkg/sql/insert.go, line 263 at r4 (raw file):
ditto update/delete pkg/sql/session.go, line 645 at r2 (raw file):
Out of curiosity why add a pkg/sql/truncate.go, line 96 at r4 (raw file):
Explain why. pkg/sql/update.go, line 413 at r4 (raw file):
I do not understand why you are marking the query non-cancellable only here. What is the motivation? The comment must be extended. Comments from Reviewable |
|
I forgot to ask, you also will need to compute before/after perf measurements before this PR can merge. |
533e9a3 to
bd4f9fd
Compare
|
Made that refactor of I've yet to address some of your minor comments about the insert/update/delete refactor and the uncancellable comment (and I'll get to both of those very shortly), but otherwise everything else should be good. Review status: 9 of 65 files reviewed at latest revision, 13 unresolved discussions, some commit checks failed. pkg/server/serverpb/status.proto, line 268 at r2 (raw file): Previously, knz (kena) wrote…
Do you mean just replacing it with pkg/sql/cancel_checker.go, line 55 at r4 (raw file): Previously, knz (kena) wrote…
Done. Brought back the interface that has two implementations - one that does nothing (for internal planners). pkg/sql/crdb_internal.go, line 559 at r2 (raw file): Previously, knz (kena) wrote…
Done. pkg/sql/executor.go, line 1063 at r2 (raw file): Previously, knz (kena) wrote…
The closest past pkg/sql/executor.go, line 379 at r3 (raw file): Previously, knz (kena) wrote…
Done. pkg/sql/session.go, line 645 at r2 (raw file): Previously, knz (kena) wrote…
I could just pkg/sql/update.go, line 413 at r4 (raw file): Previously, knz (kena) wrote…
The goal is to mark all entries to the autoCommit path as uncancellable, as we talked about in the RFC. In this case, pkg/util/uint128/uint128_test.go, line 35 at r1 (raw file): Previously, itsbilal (Bilal Akhtar) wrote…
Done. Comments from Reviewable |
|
Thank you for the effort. I ❤️ this PR. I also just saw the light:
If you're lucky your PR would perhaps even get a negative line count :) Reviewed 1 of 30 files at r2, 1 of 29 files at r5, 10 of 14 files at r6, 42 of 42 files at r7, 3 of 3 files at r8. pkg/server/serverpb/status.proto, line 268 at r2 (raw file): Previously, itsbilal (Bilal Akhtar) wrote…
hmm maybe not. Let this sleep. pkg/sql/create.go, line 701 at r8 (raw file):
Make pkg/sql/information_schema.go, line 810 at r8 (raw file):
I would make the planner{} object contain the cancelchecker, so as to not have to construct it anew everywhere it's needed. pkg/sql/join.go, line 319 at r8 (raw file):
see my other comment about setting up the cancelchecker in the planner directly. pkg/sql/session.go, line 645 at r2 (raw file): Previously, itsbilal (Bilal Akhtar) wrote…
yeah but a defer has a non-trivial cost. That's really why we don't use defer everywhere already. pkg/sql/update.go, line 413 at r4 (raw file): Previously, itsbilal (Bilal Akhtar) wrote…
This still needs to be documented in the code. Also here and in other places this logic must be captured in a single method reused everywhere where this is relevant. Comments from Reviewable |
|
Addressed almost all of your comments, and started passing nextParams to Start() as well (though it should probably get a better name at this point - planNodeParams? planParams?). Passing the planner instead of cancelChecker is the only remaining TODO (there would be some cases, eg. in distsql_physical_planner where it will be null). Technically, if we wanted to put the cancelChecker inside the planner in the first place, we didn't need any of the planNode refactors (since the relevant planNodes already have a planner reference) - but nextParams seems like a cleaner approach forward anyway. Review status: all files reviewed at latest revision, 10 unresolved discussions, some commit checks failed. pkg/sql/create.go, line 701 at r8 (raw file): Previously, knz (kena) wrote…
Done. pkg/sql/delete.go, line 142 at r4 (raw file): Previously, knz (kena) wrote…
Done. pkg/sql/information_schema.go, line 810 at r8 (raw file): Previously, knz (kena) wrote…
Done. pkg/sql/insert.go, line 263 at r4 (raw file): Previously, knz (kena) wrote…
Done. pkg/sql/join.go, line 319 at r8 (raw file): Previously, knz (kena) wrote…
Done. pkg/sql/session.go, line 645 at r2 (raw file): Previously, knz (kena) wrote…
Done. pkg/sql/truncate.go, line 96 at r4 (raw file): Previously, knz (kena) wrote…
Done. pkg/sql/update.go, line 413 at r4 (raw file): Previously, knz (kena) wrote…
Done. Comments from Reviewable |
bd4f9fd to
11388f2
Compare
|
nextParams -> runParams or execParams? Remember once the planner is in the nextParams/runParams/execParams you can drop the planner reference from all the planNodes! I am looking forward to this code reduction. Reviewed 2 of 57 files at r9, 8 of 14 files at r10, 44 of 44 files at r11, 3 of 3 files at r12. Comments from Reviewable |
2b9c555 to
d332315
Compare
|
Renamed We also talked yesterday about completely removing the uncancellable phase (instead opting to leave the outcome of the Making those changes next. Review status: 6 of 65 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful. Comments from Reviewable |
d332315 to
07f920f
Compare
|
Non-cancellable phase is gone, and the RFC has been updated to reflect that. Review status: 5 of 65 files reviewed at latest revision, 1 unresolved discussion. Comments from Reviewable |
|
Please explain what you did to Andrew, he's cooking a change to the table writer's finalize method and he needs to learn about the cancelChecker. Reviewed 1 of 59 files at r13, 12 of 18 files at r14, 44 of 44 files at r15, 3 of 3 files at r16. Comments from Reviewable |
07f920f to
d41728e
Compare
d41728e to
ad67677
Compare
ad67677 to
192658a
Compare
|
I'm super excited to see this happening. Can't wait to see the code. Here's a drive-by feature request that I wanna see if I can get you interested in: when a query processed through the Review status: 4 of 65 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful. Comments from Reviewable |
This reverts some of the changes made in PRs cockroachdb#17003 and cockroachdb#17200. In particular: 17003 made new contexts for every statement, but didn't cancel them after statement execution, resulting in a memory leak. 17200 did a quick-fix for that memory leak by not forking that context, at the expense of not being able to leverage the context in cancellation. This PR restores the context cancellation behaviour introduced in cockroachdb#17003 , except at the transaction level - cancelling a query closes its transaction's context.
This reverts some of the changes made in PRs cockroachdb#17003 and cockroachdb#17200. In particular: 17003 made new contexts for every statement, but didn't cancel them after statement execution, resulting in a memory leak. 17200 did a quick-fix for that memory leak by not forking that context, at the expense of not being able to leverage the context in cancellation. This PR restores the context cancellation behaviour introduced in cockroachdb#17003 , except at the transaction level - cancelling a query closes its transaction's context.
This reverts some of the changes made in PRs cockroachdb#17003 and cockroachdb#17200. In particular: 17003 made new contexts for every statement, but didn't cancel them after statement execution, resulting in a memory leak. 17200 did a quick-fix for that memory leak by not forking that context, at the expense of not being able to leverage the context in cancellation. This PR restores the context cancellation behaviour introduced in cockroachdb#17003 , except at the transaction level - cancelling a query closes its transaction's context.
This reverts some of the changes made in PRs cockroachdb#17003 and cockroachdb#17200. In particular: 17003 made new contexts for every statement, but didn't cancel them after statement execution, resulting in a memory leak. 17200 did a quick-fix for that memory leak by not forking that context, at the expense of not being able to leverage the context in cancellation. This PR restores the context cancellation behaviour introduced in cockroachdb#17003 , except at the transaction level - cancelling a query closes its transaction's context.
Implements most of this RFC. Specifically:
uint128generated from the HLC timestamp and encoded into a hex string. Shown inSHOW SESSIONS.queryMetagets cancellation flags which are checked repeatedly by planNodes (for most planNodes, that's every 100k rows processed).COMMIT,ROLLBACK,BEGIN TRANSACTION) are not cancellable at any point. Every other statement is cancellable until its autocommit phase (if it has one).