sql: add prepared_statements_cache_size setting#98917
sql: add prepared_statements_cache_size setting#98917craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
1760a6b to
c61f548
Compare
Thanks for the reminder. Yeah it can be more complicated with pausable portals involved, as we haven't decided on whether we want to close them when dropping the prepared stmt. I don't think we will include this in #96358, but will have a follow-up. I think after the portal PR is merged, we can add some restrictions before we start on the follow-up. Such as:
If you don't mind we can come back to this discussion after #96358 is merged. |
yuzefovich
left a comment
There was a problem hiding this comment.
Nice work!
This probably will need to be rebased on top of #96358
Shouldn't it be the opposite? Since we want to backport this PR but not #96358, then this PR should go in first (and I think it's more likely this PR will be ready sooner too).
I haven't looked at session migration stuff yet, this might be a little broken if the session is migrated.
Yeah, I think currently we will lose LRU information when migrating sessions - in DeserializeSessionState we add prepared statements in the order in which they were serialized. This suggests a couple of serialization approaches:
- we could serialize prepared statements in reverse LRU fashion (tail first, tail.prev then, etc) - this way when we deserialize the state and create prepared statements, we'll get the correct LRU ordering by construction
- or we could explicitly serialize the doubly-linked list separately.
Reviewed 12 of 12 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @michae2, @rafiss, and @ZhouXing19)
pkg/sql/conn_executor.go line 1050 at r2 (raw file):
ex.extraTxnState.prepStmtsNamespace = prepStmtNamespace{ prepStmts: make(map[string]*PreparedStatement), prepStmtsLru: make(map[string]struct{ prev, next string }),
nit: maybe s/Lru/LRU/g?
pkg/sql/conn_executor.go line 1654 at r2 (raw file):
// statement names ordered by most recent access (needed to determine // evictions when prepared_statements_cache_size is set). There is a special // entry for the empty string which is both the head and tail of the
nit: perhaps extract the empty string into named constants for head and tail?
pkg/sql/conn_executor.go line 1687 at r2 (raw file):
next, ok := ns.prepStmtsLru[this.next] if !ok || next.prev != "" { // If the chain has broken for some reason, just leak the broken part and
When can this occur?
pkg/sql/conn_executor.go line 1706 at r2 (raw file):
} // Note: must do this serially in case prev and next are the same entry. if prev, ok := ns.prepStmtsLru[this.prev]; ok && prev.next == name {
Is the second part of the conditional needed? IIUC if ok == true, but prev.next != name, then this should be an assertion failure.
pkg/sql/conn_executor.go line 1760 at r2 (raw file):
sb.WriteString("Prep stmts ordered by most recent access: ") for entry, ok := ns.prepStmtsLru[""]; ok && entry.next != ""; entry, ok = ns.prepStmtsLru[entry.next] { sb.WriteString(entry.next + " ")
nit: this will not print out "empty portal name" - should we do the if as we did above?
pkg/sql/conn_executor_prepare.go line 138 at r2 (raw file):
lru := ex.extraTxnState.prepStmtsNamespace.prepStmtsLru // While we're over the cache size, deallocate the LRU prepared statement. for head := lru[""]; head.prev != "" && head.prev != name; head = lru[""] {
nit: in this context s/head/tail/g might make more sense.
pkg/sql/conn_executor_prepare.go line 145 at r2 (raw file):
// deleting it won't immediately free memory, so we subtract the memory // estimate instead of checking ex.sessionPreparedMon.AllocBytes() again. alloc -= ex.deletePreparedStmt(ctx, head.prev)
Should we add some logging about the number of deallocated prepared statements? Perhaps log.VEventf(ctx, 1, ...).
pkg/sql/conn_executor_prepare.go line 544 at r2 (raw file):
} func (ex *connExecutor) deletePreparedStmt(ctx context.Context, name string) int64 {
nit: mention in the comment what return value is about.
pkg/sql/conn_executor_prepare.go line 602 at r2 (raw file):
switch descCmd.Type { case pgwirebase.PrepareStatement: ps, ok := ex.extraTxnState.prepStmtsNamespace.prepStmts[string(descCmd.Name)]
Should this also result in a touch? If not, a quick comment might be worth it.
pkg/sql/vars.go line 2584 at r2 (raw file):
}, GlobalDefault: func(_ *settings.Values) string { return string(humanizeutil.IBytes(0))
I wonder whether we want to enable this feature by default on 23.1. Perhaps it can be done in a separate commit.
pkg/sql/logictest/testdata/logic_test/prepare line 1486 at r2 (raw file):
1 10 100 # Test that prepared_statements_cache_size functions correctly.
nit: perhaps define a separate subtest for this.
pkg/sql/logictest/testdata/logic_test/prepare line 1616 at r2 (raw file):
pscs17 # Retrying a transaction should rewind the LRU list each retry even if some
nit: probably s/list each retry/list after each retry/g.
pkg/sql/logictest/testdata/logic_test/prepare line 1650 at r2 (raw file):
1 pscs16 1 pscs17 2 pscs14
Why do we have five and not six prepared statements with which==2?
michae2
left a comment
There was a problem hiding this comment.
Thanks for the reviews! I'll keep polishing this and when we get to commit / backport time we can see where #96358 is at.
we could serialize prepared statements in reverse LRU fashion
I like this, too. Done.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @rafiss, @yuzefovich, and @ZhouXing19)
pkg/sql/conn_executor.go line 1050 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: maybe
s/Lru/LRU/g?
Done.
pkg/sql/conn_executor.go line 1654 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: perhaps extract the empty string into named constants for
headandtail?
Done.
pkg/sql/conn_executor.go line 1687 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
When can this occur?
It can't as far as I know (I was being paranoid). I've changed it to an assertion.
pkg/sql/conn_executor.go line 1706 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Is the second part of the conditional needed? IIUC if
ok == true, butprev.next != name, then this should be an assertion failure.
Changed to an assertion.
pkg/sql/conn_executor.go line 1760 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: this will not print out "empty portal name" - should we do the
ifas we did above?
Good point, done.
pkg/sql/conn_executor_prepare.go line 138 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: in this context
s/head/tail/gmight make more sense.
Oh, good point. Done.
pkg/sql/conn_executor_prepare.go line 145 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Should we add some logging about the number of deallocated prepared statements? Perhaps
log.VEventf(ctx, 1, ...).
I've added a log message every time we deallocate. (Hmm, is that going to be too frequent? Quite possibly. I need to think about this.)
pkg/sql/conn_executor_prepare.go line 544 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: mention in the comment what return value is about.
Done.
pkg/sql/conn_executor_prepare.go line 602 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Should this also result in a touch? If not, a quick comment might be worth it.
Added a comment.
pkg/sql/vars.go line 2584 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I wonder whether we want to enable this feature by default on 23.1. Perhaps it can be done in a separate commit.
Good point, I wasn't thinking about that. Maybe with a high default value? (1 GiB?) I'll do that in a separate commit.
pkg/sql/logictest/testdata/logic_test/prepare line 1486 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: perhaps define a separate
subtestfor this.
Done.
pkg/sql/logictest/testdata/logic_test/prepare line 1616 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: probably
s/list each retry/list after each retry/g.
Done.
pkg/sql/logictest/testdata/logic_test/prepare line 1650 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Why do we have five and not six prepared statements with
which==2?
Sharp eye! I believe this is because in
cockroach/pkg/sql/conn_executor.go
Lines 1703 to 1733 in 0a72a49
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewed 5 of 12 files at r1, 2 of 2 files at r2, 8 of 8 files at r3, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @michae2, @rafiss, @yuzefovich, and @ZhouXing19)
pkg/sql/conn_executor_prepare.go line 139 at r3 (raw file):
lru := ex.extraTxnState.prepStmtsNamespace.prepStmtsLRU // While we're over the cache size, deallocate the LRU prepared statement. for tail := lru[prepStmtsLRUTail]; tail.prev != prepStmtsLRUHead && tail.prev != name; tail = lru[prepStmtsLRUTail] {
Should we keep at least one prepared statement in the cache, even if it exceeds the limit?
pkg/sql/logictest/testdata/logic_test/prepare line 1613 at r3 (raw file):
pscs13 pscs14 pscs15
The first row is the front of the list, right? Why aren't 15, 16 and 17 in front? Or if it's the other way around, why is 11 at the tail? Either way, it seems like 13 and 14 should be at the back.
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewed 8 of 8 files at r3, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @michae2, @rafiss, and @ZhouXing19)
pkg/sql/conn_executor.go line 1693 at r3 (raw file):
// Note: must do this serially in case head and next are the same entry. head := ns.prepStmtsLRU[prepStmtsLRUHead] this.next = head.next
Shouldn't we set this.prev = prepStmtsLRUHead somewhere? Oh, it's because the default value of prev string happens to be the empty string which equals prepStmtsLRUHead. Still, it might be more clear to perform an explicit assignment.
pkg/sql/conn_executor.go line 1747 at r3 (raw file):
return } if head, ok := ns.prepStmtsLRU[prepStmtsLRUHead]; ok && head.next == name {
nit: can this nested if be simplified a bit? What if we always add the empty string into prepStmtsLRU when creating the map? If we never remove it from the LRU cache, then it should be sufficient to just check ns.prepStmtsLRU[prepStmtsLRUHead].next == name to see whether name is as the front.
pkg/sql/conn_executor_exec.go line 456 at r3 (raw file):
ps, ok := ex.extraTxnState.prepStmtsNamespace.prepStmts[name] if !ok { err := pgerror.Newf(
nit: should we improve the error message when the cache size limit is set? Perhaps with a notice or a hint? It might be good to audit all callsites where we use pgcode.InvalidSQLStatementName and improve the error message if it's about non-existent prepared statement.
pkg/sql/conn_executor_prepare.go line 145 at r2 (raw file):
Previously, michae2 (Michael Erickson) wrote…
I've added a log message every time we deallocate. (Hmm, is that going to be too frequent? Quite possibly. I need to think about this.)
I guess on average we should see exactly one log line when a new prepared statement is added when we're at the limit, seems ok since we do need to set vmodule=conn_executor_prepare=1 for this.
pkg/sql/conn_executor_prepare.go line 139 at r3 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Should we keep at least one prepared statement in the cache, even if it exceeds the limit?
+1. I think it might be worthwhile to always keep one unnamed prepared statement and at least one named one.
pkg/sql/conn_executor_prepare.go line 184 at r3 (raw file):
prepared := &PreparedStatement{ memAcc: ex.sessionPreparedMon.MakeBoundAccount(),
Something that we might want to consider changing is this bit. Currently, for each prepared statement we derive a separate memory account. This means that the first Grow call will use at least mon.DefaultPoolAllocationSize, so even if a prepared statement uses 1B, we'll still account for 10KiB. We could change it in two ways:
- create a single memory account for all prepared statements
- adjust
sessionPreparedMonso it's created withincrementother than-1. I prefer this option, and perhaps we should just useincrement=1. This will make it so that we grow the memory accounts precisely, without reserving anything extra. The caveat is that it might increase the mutex contention onsessionPreparedMon, but I don't think it'll happen - we shouldn't have any concurrency with creating/deleting prepared statements.
Addressing this gap might also resolve the unexplained overhead of memory accounting of the prepared statements (I think we have an issue for this).
(However, if each prepared statement is using 10KiB or more, then what I said above shouldn't really matter.)
pkg/sql/vars.go line 2584 at r2 (raw file):
Previously, michae2 (Michael Erickson) wrote…
Good point, I wasn't thinking about that. Maybe with a high default value? (1 GiB?) I'll do that in a separate commit.
Hm, since this limit is per session, I'd probably go for smaller default value, perhaps 4-16MiB. Do you know what is the estimated footprint of a single prepared statement? It seems reasonable to start evicting prepared statements once a session goes into thousands of them, so if a single one takes 10KiB, then 16MiB might be good. I agree that it can be done in a separate commit.
pkg/sql/logictest/testdata/logic_test/prepare line 1650 at r2 (raw file):
Previously, michae2 (Michael Erickson) wrote…
Sharp eye! I believe this is because in
we don't shrink memory usage for any prepared statements being rewound. I left a comment about this on one of the issues: #98071 (comment)cockroach/pkg/sql/conn_executor.go
Lines 1703 to 1733 in 0a72a49
Oh, nice find. Consider adding a quick comment to the logic test and linking the github issue too.
michae2
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @rafiss, @yuzefovich, and @ZhouXing19)
pkg/sql/conn_executor.go line 1693 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Shouldn't we set
this.prev = prepStmtsLRUHeadsomewhere? Oh, it's because the default value ofprev stringhappens to be the empty string which equalsprepStmtsLRUHead. Still, it might be more clear to perform an explicit assignment.
Done.
pkg/sql/conn_executor.go line 1747 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: can this nested
ifbe simplified a bit? What if we always add the empty string intoprepStmtsLRUwhen creating the map? If we never remove it from the LRU cache, then it should be sufficient to just checkns.prepStmtsLRU[prepStmtsLRUHead].next == nameto see whethernameis as the front.
I was being paranoid again.
(In fact, I don't think we even need to always add the empty string into prepStmtsLRU, since the index expression will return a zero-value struct{ prev, next string } if the empty string is not a member.)
Done.
pkg/sql/conn_executor_exec.go line 456 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: should we improve the error message when the cache size limit is set? Perhaps with a notice or a hint? It might be good to audit all callsites where we use
pgcode.InvalidSQLStatementNameand improve the error message if it's about non-existent prepared statement.
It's hard to tell if the non-existent prepared statement was evicted or deallocated (or never existed) but I think you've got a point, so I went ahead and added a newPreparedStmtDNEError helper function which adds a hint if prepared_statements_cache_size is set.
pkg/sql/conn_executor_prepare.go line 139 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
+1. I think it might be worthwhile to always keep one unnamed prepared statement and at least one named one.
The check that tail.prev != name in the for does mean we keep at least one prepared statement (the one that was just added).
We always keep the "anonymous" prepared statement (with name "") because it is not in the LRU list.
pkg/sql/conn_executor_prepare.go line 184 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Something that we might want to consider changing is this bit. Currently, for each prepared statement we derive a separate memory account. This means that the first
Growcall will use at leastmon.DefaultPoolAllocationSize, so even if a prepared statement uses 1B, we'll still account for 10KiB. We could change it in two ways:
- create a single memory account for all prepared statements
- adjust
sessionPreparedMonso it's created withincrementother than-1. I prefer this option, and perhaps we should just useincrement=1. This will make it so that we grow the memory accounts precisely, without reserving anything extra. The caveat is that it might increase the mutex contention onsessionPreparedMon, but I don't think it'll happen - we shouldn't have any concurrency with creating/deleting prepared statements.Addressing this gap might also resolve the unexplained overhead of memory accounting of the prepared statements (I think we have an issue for this).
(However, if each prepared statement is using 10KiB or more, then what I said above shouldn't really matter.)
Nice idea. It looks like the smallest prepared statements are around 3 KiB each, so I changed sessionPreparedMon to use 1024 as the increment. I think this will help in a lot of cases even without using prepared_statements_cache_size!
pkg/sql/logictest/testdata/logic_test/prepare line 1650 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Oh, nice find. Consider adding a quick comment to the logic test and linking the github issue too.
So as it turns out, I was wrong: that function is calling decRef and it should work just fine. The real cause is that the first PREPARE in the txn evicts a prepared statement but doesn't actually reduce memory usage, because the txn is still open and so no prepared statement refcounts go to zero. Then the second PREPARE evicts two prepared statements to make up for this.
I'm still thinking about this one.
pkg/sql/logictest/testdata/logic_test/prepare line 1613 at r3 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
The first row is the front of the list, right? Why aren't 15, 16 and 17 in front? Or if it's the other way around, why is 11 at the tail? Either way, it seems like 13 and 14 should be at the back.
These are (confusingly) sorted by name, not position in the LRU list. (I've been using fmt.Println(ns) in various parts of connExecutor to see the order of the LRU list.)
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewed 12 of 12 files at r4, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball, @michae2, @rafiss, and @ZhouXing19)
pkg/sql/conn_executor.go line 1791 at r4 (raw file):
func (ns *prepStmtNamespace) String() string { var sb strings.Builder sb.WriteString("Prep stmts: ")
nit: since we're always keeping track of LRU prepared statements, should we only keep the LRU ordering here? I.e. currently we print out all prepared statements twice (first, with random map-iteration order, second, with LRU order), should we just keep the latter?
pkg/sql/conn_executor.go line 1792 at r4 (raw file):
var sb strings.Builder sb.WriteString("Prep stmts: ") for name, ps := range ns.prepStmts {
nit: ps doesn't seem to be used.
pkg/sql/logictest/testdata/logic_test/prepare line 1650 at r2 (raw file):
Previously, michae2 (Michael Erickson) wrote…
So as it turns out, I was wrong: that function is calling
decRefand it should work just fine. The real cause is that the firstPREPAREin the txn evicts a prepared statement but doesn't actually reduce memory usage, because the txn is still open and so no prepared statement refcounts go to zero. Then the secondPREPAREevicts two prepared statements to make up for this.I'm still thinking about this one.
Interesting. It doesn't seem like a blocking issue to me, perhaps mention this in a github issue (or create a new one)?
Add a new circular doubly-linked list of prepared statements to `prepStmtNamespace` which tracks the least-recently-used prepared statement. When new setting `prepared_statements_cache_size` is set, use this LRU list to automatically deallocate prepared statements. Fixes: cockroachdb#97866 Epic: None Release note (sql change): Add a new `prepared_statements_cache_size` setting which, when set to a non-zero number of bytes, causes the least- recently-used prepared statements to be automatically deallocated when prepared statement memory usage goes above the cache size. This setting can be used to avoid prepared statement leaks from long-lived connections which never `DEALLOCATE` prepared statements.
michae2
left a comment
There was a problem hiding this comment.
TFYR!
bors r=yuzefovich
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @rafiss, @yuzefovich, and @ZhouXing19)
pkg/sql/conn_executor.go line 1791 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: since we're always keeping track of LRU prepared statements, should we only keep the LRU ordering here? I.e. currently we print out all prepared statements twice (first, with random
map-iteration order, second, with LRU order), should we just keep the latter?
I was being paranoid yet again. Done.
pkg/sql/conn_executor.go line 1792 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit:
psdoesn't seem to be used.
Removed.
pkg/sql/logictest/testdata/logic_test/prepare line 1650 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Interesting. It doesn't seem like a blocking issue to me, perhaps mention this in a github issue (or create a new one)?
The more I thought about it, the more uncomfortable I got (what if someone had thousands of prepared statements in their transaction?) so I went ahead and fixed it by tracking allocations in prepStmtNamespace.prepStmtsLRUAlloc. Hopefully you don't hate it. 😉
|
Build succeeded: |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 6018086 to blathers/backport-release-22.1-98917: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.1.x failed. See errors above. error creating merge commit from 6018086 to blathers/backport-release-22.2-98917: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.2.x failed. See errors above. error creating merge commit from 6018086 to blathers/backport-release-23.1-98917: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
|
I'm assuming you'll get an error when trying to use a prepared statement that's been automatically deallocated, correct? We should be sure to note that when we document this feature. What error is returned - is it |
Yes, it is. Here's how it looks: |
|
Also, with |
Add a new circular doubly-linked list of prepared statements to
prepStmtNamespacewhich tracks the least-recently-used prepared statement. When new settingprepared_statements_cache_sizeis set, use this LRU list to automatically deallocate prepared statements.Fixes: #97866
Epic: None
Release note (sql change): Add a new
prepared_statements_cache_sizesetting which, when set to a non-zero number of bytes, causes the least-recently-used prepared statements to be automatically deallocated when prepared statement memory usage goes above the cache size. This setting can be used to avoid prepared statement leaks from long-lived connections which neverDEALLOCATEprepared statements.