sql: statement statistics, step three#14845
Conversation
|
Nice! I recall talking about rewriting database, table and column names to IDs. Is that still on tap? Kind of nice to see the queries with the names, though. Review status: 0 of 3 files reviewed at latest revision, all discussions resolved, all commit checks successful. Comments from Reviewable |
|
We discussed this with David; the replacement of names (which is more expensive) will be done asynchronously by the network reporter. The DBA will probably appreciate to see the full names of their things in the virtual table. Review status: 0 of 3 files reviewed at latest revision, all discussions resolved, all commit checks successful. Comments from Reviewable |
| // testing confirms it works and is stable. | ||
| var StmtStatsEnable = envutil.EnvOrDefaultBool( | ||
| "COCKROACH_SQL_STMT_STATS_ENABLE", false, | ||
| "COCKROACH_SQL_STMT_STATS_ENABLE", true, |
There was a problem hiding this comment.
might be worth including before/after benchmarks with the commit that enables this by default.
pkg/sql/parser/format.go
Outdated
| func formatOrScrubNode(buf *bytes.Buffer, f FmtFlags, n NodeFormatter) { | ||
| if f.scrubDatums { | ||
| switch n.(type) { | ||
| case Datum, *NumVal, *StrVal: |
There was a problem hiding this comment.
Can we match Constant here? if not, we should add a comment there to update this if adding another one.
|
Review status: 0 of 3 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful. pkg/sql/parser/format.go, line 75 at r2 (raw file):
We'll have two different levels of scrubbing - this one which removes datums, and a second level that also removes SQL identifiers. I think I'd prefer to use the term "scrubbed" for the latter, but I don't have a suggestion for what to call this one ( pkg/sql/parser/format.go, line 128 at r2 (raw file):
We might also want to handle tuples specially (or are they already covered by
Comments from Reviewable |
|
Review status: 0 of 3 files reviewed at latest revision, 4 unresolved discussions. pkg/sql/app_stats.go, line 50 at r2 (raw file): Previously, dt (David Taylor) wrote…
Done. pkg/sql/parser/format.go, line 75 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
FWIW oracle also calls it a fingerprint. I settled on the boring "FmtHideConstants" instead. pkg/sql/parser/format.go, line 128 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
I am not comfortable with this change because it would add special cases more overhead to the formatting algorithm. Besides, I can see that every "last insert" of a sequence will use a different number of values, but OTOH different arities may yield different query plans / performance profiles and the mean/stddev stats on the run time will be different because of this. I fear we'd pollute the results by grouping them together. I propose to keep tuples handled this way for now and revisit later if/when we observe it's an issue in practice. pkg/sql/parser/format.go, line 129 at r2 (raw file): Previously, dt (David Taylor) wrote…
Done. Comments from Reviewable |
|
Review status: 0 of 3 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending. pkg/sql/parser/format.go, line 128 at r2 (raw file):
But you can't do that if you're using Comments from Reviewable |
|
Review status: 0 of 3 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed. pkg/sql/parser/format.go, line 128 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Ack, yet I do not see an algorithm that will do this without scanning the tuple two times during formatting. Is that what we want? Comments from Reviewable |
|
Review status: 0 of 3 files reviewed at latest revision, 2 unresolved discussions, some commit checks failed. pkg/sql/parser/format.go, line 128 at r2 (raw file): Previously, knz (kena) wrote…
I'm not really thinking about the implementation, just the desirability of the feature. If it's hard to implement then I'm fine to leave it until later. Scanning the tuple twice doesn't seem too bad to me but I'm not sure what the implications of that would be. Comments from Reviewable |
|
Ok I have implemented the tuple scrubbing in a similar way as pg-query-digest. PTAL |
|
Reviewed 4 of 4 files at r4, 1 of 1 files at r5. pkg/sql/parser/format.go, line 75 at r2 (raw file): Previously, knz (kena) wrote…
LGTM Comments from Reviewable |
|
Review status: all files reviewed at latest revision, 1 unresolved discussion, all commit checks successful. pkg/sql/parser/format.go, line 128 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
LGTM Comments from Reviewable |
|
Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful. pkg/sql/parser/format.go, line 78 at r5 (raw file):
The pkg/sql/testdata/logic_test/statement_statistics, line 75 at r5 (raw file):
The presence of types in some of these fingerprints is distracting. You added them so clearly you think they are important, but I'm not so sure. Comments from Reviewable |
|
Review status: all files reviewed at latest revision, 3 unresolved discussions, all commit checks successful. Comments from Reviewable |
This patch introduces a new AST formatting option `FmtHideConstants` which causes all datums and literals to be replaced by underscores. This achieves two goals: - it removes potentially sensitive information from collected statistics; - it ensures that statements which are equivalent modulo the constants have the same "statistics key" and thus that their statistics are grouped together. One of the side goals when I initially implemented this was to report the type of the datums and literals in the scrubbed form, so as to provide more information for offline analysis. This goal cannot be achieved at this point without a major refactor however: the typing information is currently not stored back in the AST and thus not available to the pretty-printer. To achieve this would mandate rewriting all the AST->planNode constructors.
Perf impact: negligible. ``` name old time/op new time/op delta Select1/Cockroach-4 138µs ± 3% 140µs ± 3% ~ (p=0.180 n=6+6) Select1/MultinodeCockroach-4 136µs ± 2% 142µs ± 4% +4.18% (p=0.009 n=6+6) Select2/Cockroach-4 850µs ± 1% 885µs ± 5% +4.03% (p=0.015 n=6+6) Select2/MultinodeCockroach-4 834µs ± 1% 860µs ± 1% +3.14% (p=0.004 n=6+5) Select3/Cockroach-4 1.17ms ± 2% 1.20ms ± 3% ~ (p=0.132 n=6+6) Select3/MultinodeCockroach-4 1.17ms ± 2% 1.18ms ± 5% ~ (p=0.699 n=6+6) SQL/Delete1/Cockroach-4 777µs ± 6% 767µs ± 3% ~ (p=0.394 n=6+6) SQL/Delete10/Cockroach-4 947µs ± 7% 913µs ± 2% ~ (p=0.093 n=6+6) SQL/Delete100/Cockroach-4 2.21ms ± 4% 2.17ms ± 2% ~ (p=0.247 n=6+5) SQL/Delete1000/Cockroach-4 17.7ms ± 2% 17.9ms ± 1% ~ (p=0.394 n=6+6) SQL/Insert1/Cockroach-4 545µs ± 2% 560µs ± 1% +2.82% (p=0.002 n=6+6) SQL/Insert10/Cockroach-4 788µs ± 2% 795µs ± 3% ~ (p=0.394 n=6+6) SQL/Insert100/Cockroach-4 2.93ms ± 2% 2.92ms ± 4% ~ (p=0.931 n=5+6) SQL/Insert1000/Cockroach-4 24.2ms ± 3% 23.6ms ± 3% ~ (p=0.180 n=6+6) SQL/InsertDistinct1/Cockroach-4 2.24ms ± 3% 2.23ms ± 3% ~ (p=0.818 n=6+6) SQL/InsertDistinct10/Cockroach-4 1.99ms ± 2% 1.88ms ±13% ~ (p=0.240 n=6+6) SQL/InsertDistinct100/Cockroach-4 6.06ms ± 4% 6.04ms ± 5% ~ (p=1.000 n=6+6) SQL/InsertDistinct1000/Cockroach-4 6.18ms ± 1% 6.26ms ± 6% ~ (p=1.000 n=4+6) SQL/InterleavedSelect1/Cockroach-4 632µs ± 3% 630µs ± 1% ~ (p=0.931 n=6+5) SQL/InterleavedSelect10/Cockroach-4 683µs ± 5% 719µs ± 5% +5.30% (p=0.026 n=6+6) SQL/InterleavedSelect100/Cockroach-4 1.19ms ± 2% 1.18ms ± 4% ~ (p=1.000 n=6+6) SQL/InterleavedSelect1000/Cockroach-4 6.33ms ± 4% 6.02ms ± 2% -4.95% (p=0.004 n=5+6) SQL/TrackChoices1/Cockroach-4 680µs ± 3% 650µs ± 6% -4.43% (p=0.004 n=6+6) SQL/TrackChoices10/Cockroach-4 150µs ± 3% 144µs ± 2% -3.90% (p=0.004 n=6+5) SQL/TrackChoices100/Cockroach-4 94.3µs ± 5% 88.5µs ± 4% -6.22% (p=0.009 n=6+6) SQL/TrackChoices1000/Cockroach-4 87.1µs ± 3% 82.0µs ± 2% -5.85% (p=0.002 n=6+6) SQL/TrackChoices1#01/Cockroach-4 669µs ± 3% 658µs ± 0% ~ (p=0.476 n=6+4) SQL/TrackChoices10#01/Cockroach-4 147µs ± 1% 146µs ± 5% ~ (p=0.429 n=5+6) SQL/TrackChoices100#01/Cockroach-4 93.4µs ± 1% 92.0µs ± 3% ~ (p=0.537 n=5+6) SQL/TrackChoices1000#01/Cockroach-4 86.5µs ± 2% 83.0µs ± 3% -4.09% (p=0.002 n=6+6) SQL/Update1/Cockroach-4 830µs ± 9% 827µs ± 1% ~ (p=0.589 n=6+6) SQL/Update10/Cockroach-4 1.40ms ± 4% 1.38ms ± 3% ~ (p=0.485 n=6+6) SQL/Update100/Cockroach-4 6.23ms ± 5% 6.10ms ± 2% ~ (p=0.310 n=6+6) SQL/Update1000/Cockroach-4 48.4ms ± 1% 47.4ms ± 2% -1.97% (p=0.009 n=6+6) SQL/Upsert1/Cockroach-4 1.06ms ± 2% 1.07ms ± 4% ~ (p=0.394 n=6+6) SQL/Upsert10/Cockroach-4 1.32ms ± 2% 1.31ms ± 4% ~ (p=0.394 n=6+6) SQL/Upsert100/Cockroach-4 4.38ms ± 1% 4.00ms ± 2% -8.59% (p=0.004 n=5+6) SQL/Upsert1000/Cockroach-4 31.6ms ± 4% 29.6ms ± 8% -6.53% (p=0.041 n=6+6) Scan/1Limit0/Cockroach-4 374µs ± 1% 363µs ± 7% ~ (p=0.065 n=6+6) Scan/1Limit1/Cockroach-4 385µs ± 1% 370µs ± 8% ~ (p=0.247 n=5+6) Scan/1Limit10/Cockroach-4 388µs ± 2% 391µs ± 5% ~ (p=0.589 n=6+6) Scan/1Limit100/Cockroach-4 390µs ± 3% 387µs ± 2% ~ (p=0.818 n=6+6) Scan/10Limit0/Cockroach-4 394µs ± 2% 407µs ± 3% +3.25% (p=0.015 n=6+6) Scan/10Limit1/Cockroach-4 371µs ± 1% 403µs ± 2% +8.73% (p=0.004 n=5+6) Scan/10Limit10/Cockroach-4 405µs ± 1% 425µs ± 2% +5.16% (p=0.004 n=5+6) Scan/10Limit100/Cockroach-4 394µs ± 1% 419µs ± 3% +6.22% (p=0.002 n=6+6) Scan/100Limit0/Cockroach-4 674µs ± 2% 646µs ± 1% -4.21% (p=0.002 n=6+6) Scan/100Limit1/Cockroach-4 400µs ± 4% 394µs ± 3% ~ (p=0.132 n=6+6) Scan/100Limit10/Cockroach-4 432µs ± 1% 445µs ± 3% +2.99% (p=0.015 n=6+6) Scan/100Limit100/Cockroach-4 699µs ± 0% 682µs ± 4% ~ (p=0.082 n=5+6) Scan/1000Limit0/Cockroach-4 3.12ms ± 2% 3.00ms ± 6% -3.94% (p=0.030 n=5+6) Scan/1000Limit1/Cockroach-4 385µs ± 4% 391µs ± 3% ~ (p=0.132 n=6+6) Scan/1000Limit10/Cockroach-4 394µs ± 1% 420µs ± 1% +6.44% (p=0.002 n=6+6) Scan/1000Limit100/Cockroach-4 657µs ± 1% 659µs ± 1% ~ (p=0.699 n=6+6) Scan/10000Limit0/Cockroach-4 27.4ms ± 9% 25.3ms ± 1% ~ (p=0.065 n=6+6) Scan/10000Limit1/Cockroach-4 391µs ± 2% 381µs ± 5% ~ (p=0.180 n=6+6) Scan/10000Limit10/Cockroach-4 408µs ± 2% 421µs ± 2% +3.18% (p=0.009 n=6+6) Scan/10000Limit100/Cockroach-4 679µs ± 2% 661µs ± 2% -2.68% (p=0.004 n=6+6) ScanFilter/10000Limit1/Cockroach-4 657µs ± 2% 663µs ± 2% ~ (p=0.310 n=6+6) ScanFilter/10000Limit10/Cockroach-4 704µs ± 2% 705µs ± 4% ~ (p=0.937 n=6+6) ScanFilter/10000Limit50/Cockroach-4 3.43ms ± 2% 3.26ms ± 5% ~ (p=0.052 n=5+6) OrderBy/100000Limit10/Cockroach-4 255ms ± 4% 262ms ± 7% ~ (p=0.180 n=6+6) OrderBy/100000Limit10Distinct/Cockroach-4 254ms ± 1% 272ms ± 6% +7.12% (p=0.004 n=5+6) WideTable/100/Cockroach-4 5.76ms ± 7% 5.46ms ± 2% -5.27% (p=0.009 n=6+6) WideTable/1000/Cockroach-4 6.75ms ± 2% 6.61ms ± 3% ~ (p=0.065 n=6+6) WideTable/10000/Cockroach-4 19.1ms ± 4% 17.5ms ± 2% -8.51% (p=0.002 n=6+6) WideTable/100000/Cockroach-4 113ms ± 7% 111ms ± 8% ~ (p=0.699 n=6+6) WideTable/1000000/Cockroach-4 1.08s ± 5% 1.06s ± 2% ~ (p=0.240 n=6+6) KV/Insert1_Native-4 663µs ± 5% 658µs ± 5% ~ (p=0.394 n=6+6) KV/Insert1_SQL-4 594µs ± 4% 570µs ± 2% -4.01% (p=0.002 n=6+6) KV/Insert10_Native-4 875µs ± 6% 915µs ± 8% ~ (p=0.132 n=6+6) KV/Insert10_SQL-4 868µs ± 2% 863µs ± 3% ~ (p=0.394 n=6+6) KV/Insert100_Native-4 2.75ms ± 3% 2.73ms ± 3% ~ (p=0.818 n=6+6) KV/Insert100_SQL-4 3.35ms ± 2% 3.38ms ± 2% ~ (p=0.485 n=6+6) KV/Insert1000_Native-4 20.7ms ± 2% 20.6ms ± 5% ~ (p=0.818 n=6+6) KV/Insert1000_SQL-4 28.5ms ± 2% 27.7ms ± 2% -2.73% (p=0.002 n=6+6) KV/Insert10000_Native-4 235ms ± 4% 253ms ± 5% +7.56% (p=0.009 n=6+6) KV/Insert10000_SQL-4 333ms ± 7% 329ms ± 4% ~ (p=0.699 n=6+6) KV/Update1_Native-4 1.04ms ± 2% 1.09ms ± 6% ~ (p=0.065 n=6+6) KV/Update1_SQL-4 838µs ± 7% 854µs ± 4% ~ (p=0.132 n=6+6) KV/Update10_Native-4 1.58ms ± 3% 1.57ms ± 5% ~ (p=0.937 n=6+6) KV/Update10_SQL-4 1.34ms ± 2% 1.33ms ± 1% ~ (p=0.937 n=6+6) KV/Update100_Native-4 5.75ms ± 4% 5.64ms ± 3% ~ (p=0.132 n=6+6) KV/Update100_SQL-4 5.90ms ± 3% 5.93ms ± 6% ~ (p=0.310 n=6+6) KV/Update1000_Native-4 48.6ms ± 1% 48.4ms ± 6% ~ (p=0.394 n=6+6) ```
|
TFYRs Review status: 2 of 5 files reviewed at latest revision, 2 unresolved discussions. pkg/sql/parser/format.go, line 78 at r5 (raw file): Previously, petermattis (Peter Mattis) wrote…
We are testing all our formatter flags via logic tests now. pkg/sql/testdata/logic_test/statement_statistics, line 75 at r5 (raw file): Previously, petermattis (Peter Mattis) wrote…
So we have had several discussions recently about problems around mixed-type tuples. I posit they are not common and I intended this to gather data from production systems. However it is not so important and we can collect this data later. I removed them for now. Comments from Reviewable |

Two patches:
sql: scrub queries collected fo per-statement statistics
This patch introduces a new AST formatting option
FmtScrubbedwhichcauses all datums and literals to be replaced by underscores. This
achieves two goals:
have the same "statistics key" and thus that their statistics are grouped
together.
One of the side goals when I initially implemented this was to report
the type of the datums and literals in the scrubbed form, so as to
provide more information for offline analysis. This goal cannot be
achieved at this point without a major refactor however: the typing
information is currently not stored back in the AST and thus not
available to the pretty-printer. To achieve this would mandate
rewriting all the AST->planNode constructors.
sql: enable SQL statement statistics collection by default.
cc @petermattis
Advances #13968.