sql: add summarized query formats for select, insert, and update statements#71012
Conversation
9322896 to
e5eddd6
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/ui/workspaces/cluster-ui/src/statementsTable/statementsTable.tsx, line 244 at r1 (raw file):
original: string, ) { switch (summary.statement) {
I wanted to get something up for early review, just to make sure I'm approaching this correctly! I can add more cases for the other formats we want to implement later (insert, update), but wondering if I should do that in the same PR, or a separate one, or in separate commits?
pkg/ui/workspaces/cluster-ui/src/util/sql/summarize.ts, line 66 at r1 (raw file):
}; const sqlKeywords = new Set(["ON", "WHERE", "ORDER BY"]);
Is there an easier, more-effective way to separate SQL clauses? From a quick glance, I think my current solution is able to catch most cases I see on the Statements page, although definitely don't think I'm covering every edge case right now.
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
pkg/ui/workspaces/cluster-ui/src/statementsTable/statementsTable.tsx, line 243 at r1 (raw file):
summary: StatementSummary, original: string, ) {
nit: let's annotated this with return types :D
pkg/ui/workspaces/cluster-ui/src/util/sql/summarize.ts, line 66 at r1 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
Is there an easier, more-effective way to separate SQL clauses? From a quick glance, I think my current solution is able to catch most cases I see on the Statements page, although definitely don't think I'm covering every edge case right now.
Hmm the only way to 100% correctly parse our SQL statements is to reuse the SQL parser used in the DB itself. This makes me think that, unless we want to reimplement the parsing logic in TS, maybe the best way for us to surface this information in the backend (where we actually have the parsed AST of a statement), and bundle that in the statement metadata.
Our statement syntax also is constantly evolving, e.g. AS OF SYSTEM TIME clause is not really standard in Postgres and it is specific to cockroach db.
cc @maryliag what do you think?
e5eddd6 to
b83d118
Compare
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
pkg/ui/workspaces/cluster-ui/src/statementsTable/statementsTable.tsx, line 244 at r1 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
I wanted to get something up for early review, just to make sure I'm approaching this correctly! I can add more cases for the other formats we want to implement later (insert, update), but wondering if I should do that in the same PR, or a separate one, or in separate commits?
you can do on the same PR and on the same commit
pkg/ui/workspaces/cluster-ui/src/util/sql/summarize.ts, line 60 at r1 (raw file):
// detailedSummarize takes a string SQL statement and produces a // structured summary of the query, with columns and join/nested table info
nit: all comment should end with a period
pkg/ui/workspaces/cluster-ui/src/util/sql/summarize.ts, line 66 at r1 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Hmm the only way to 100% correctly parse our SQL statements is to reuse the SQL parser used in the DB itself. This makes me think that, unless we want to reimplement the parsing logic in TS, maybe the best way for us to surface this information in the backend (where we actually have the parsed AST of a statement), and bundle that in the statement metadata.
Our statement syntax also is constantly evolving, e.g.
AS OF SYSTEM TIMEclause is not really standard in Postgres and it is specific to cockroach db.cc @maryliag what do you think?
We don't want to recreate all the logic on the frontend, so I agree we can try use something from the backend and send a few new parameters, meaning keep the statement value as is, so it can be used on places we already use, and add a new metadata field that have the statement separated and that you can here later on
pkg/ui/workspaces/cluster-ui/src/util/sql/summarize.ts, line 87 at r1 (raw file):
statementMatch.groups.table ) { // get columns from regex
nit: period
pkg/ui/workspaces/cluster-ui/src/util/sql/summarize.ts, line 98 at r1 (raw file):
} // get tables from regex
nit: period
b83d118 to
3a7ba21
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
pkg/ui/workspaces/cluster-ui/src/util/sql/summarize.ts, line 66 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
We don't want to recreate all the logic on the frontend, so I agree we can try use something from the backend and send a few new parameters, meaning keep the statement value as is, so it can be used on places we already use, and add a new metadata field that have the statement separated and that you can here later on
got it, I can look into that! btw, do you have any code pointers on hand for any files that might be relevant? f not, I can try to do some digging myself first
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @maryliag)
pkg/ui/workspaces/cluster-ui/src/statementsTable/statementsTable.tsx, line 244 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
you can do on the same PR and on the same commit
gotcha 👍
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
pkg/ui/workspaces/cluster-ui/src/util/sql/summarize.ts, line 66 at r1 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
got it, I can look into that! btw, do you have any code pointers on hand for any files that might be relevant? f not, I can try to do some digging myself first
I think one place we can start doing some digging is the sql.Statement struct, which is what we use to represent statements within SQL package. sql.Statement embeds parser.Statement which in turn embeds tree.Statement interface.
tree.Statement interface is what we use to represent the statement AST. It contains NodeFormatter interface which walks through the AST tree and format statement into string based on formatting flags.
Thomas did something like this for statement reduction back then in this PR, i think if it's me I'd start digging there :P
3a7ba21 to
c646a03
Compare
c646a03 to
92f153c
Compare
b79ff09 to
f0b2be1
Compare
f0b2be1 to
7fab019
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, and @maryliag)
a discussion (no related file):
Trying to break my code down into smaller commits, so I'll be including metadata changes and frontend changes in a later commit. Also haven't implemented the JOIN simplifying logic yet, since it seemed a little complicated. Currently clarifying some details with Anne and Kevin, but I can add that in a future commit as well.
pkg/sql/sem/tree/format.go, line 544 at r5 (raw file):
// FormatNodeSummary recurses into a node for pretty-printing a summarized version. func (ctx *FmtCtx) FormatNodeSummary(n NodeFormatter) {
Hmm, I'm not sure if it would've been cleaner to pass down my FmtSummary flag and implement the summarized versions in each select/insert/update Format function instead of in my own functions? Although I didn't want to have too many code snippets everywhere, and there's also the issue with outer and inner SELECT clauses being formatted differently.
pkg/sql/sem/tree/format_test.go, line 447 at r5 (raw file):
{`INSERT INTO vehicles (vehicles, rides) VALUES (1), (2), (3), (4), (5)`, `INSERT INTO vehicles (vehicles, rides)`}, {`INSERT INTO vehicles (vehicles, rides) SELECT vehicle_ids, rides FROM riders`,
I wasn't able to test with periods in my strings, since I was getting an error message about the format not being implemented. Is there a way around this?
pkg/sql/sem/tree/select.go, line 93 at r5 (raw file):
// Format implements the NodeFormatter interface. func (node *SelectClause) Format(ctx *FmtCtx) { f := ctx.flags
Note: This is a special case for nested Select statements, where we don't wanna show columns at all.
874b57d to
1601a96
Compare
matthewtodd
left a comment
There was a problem hiding this comment.
Reviewed 3 of 5 files at r5, 2 of 2 files at r7, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, @maryliag, and @matthewtodd)
a discussion (no related file):
Previously, lindseyjin (Lindsey Jin) wrote…
Trying to break my code down into smaller commits, so I'll be including metadata changes and frontend changes in a later commit. Also haven't implemented the JOIN simplifying logic yet, since it seemed a little complicated. Currently clarifying some details with Anne and Kevin, but I can add that in a future commit as well.
Yay for small, focused changes! 🎉
pkg/sql/sem/tree/format.go, line 579 at r7 (raw file):
// AsShortStringWithFlags pretty prints a node to a string given specific flags; only // flags that don't require Annotations can be used. func AsShortStringWithFlags(n NodeFormatter, fl FmtFlags, opts ...FmtCtxOption) string {
Since FmtSummary is a new flag we're introducing here, I'm wondering if it would be clearer to fold this behavior into the existing AsStringWithFlags, so that callers don't have to both set the flag and call a different method. WDYT?
1601a96 to
7cb22b4
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, @maryliag, and @matthewtodd)
pkg/sql/sem/tree/format.go, line 579 at r7 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
Since
FmtSummaryis a new flag we're introducing here, I'm wondering if it would be clearer to fold this behavior into the existingAsStringWithFlags, so that callers don't have to both set the flag and call a different method. WDYT?
Done! Thanks for the suggestion
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @Azhng, @lindseyjin, @maryliag, and @matthewtodd)
pkg/sql/sem/tree/format_test.go, line 443 at r9 (raw file):
// calls encodeSQLString with the right formatter. // See TestFormatExprs below for the test on DStrings. {`SELECT (SELECT count(*) FROM system.jobs) AS j, num_running, s.* FROM system.scheduled_jobs AS s WHERE next_run < current_timestamp() ORDER BY random() LIMIT 10 FOR UPDATE`,
It will really improve the readability a lot if you explicitly list out the field names, e.g.
{
stmt: `SELECT ...`,
expected: `SELECT ....`,
}pkg/sql/sem/tree/format_test.go, line 444 at r9 (raw file):
// See TestFormatExprs below for the test on DStrings. {`SELECT (SELECT count(*) FROM system.jobs) AS j, num_running, s.* FROM system.scheduled_jobs AS s WHERE next_run < current_timestamp() ORDER BY random() LIMIT 10 FOR UPDATE`, `SELECT (SELECT FROM sy... FROM system.scheduled_jobs AS s`},
Hmm, I feel we might want to close the parenthesis for the subquery here, thoughts?
3f4ef20 to
a38c90f
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @lindseyjin, @maryliag, and @matthewtodd)
pkg/sql/sem/tree/format_test.go, line 443 at r9 (raw file):
Previously, Azhng (Archer Zhang) wrote…
It will really improve the readability a lot if you explicitly list out the field names, e.g.
{ stmt: `SELECT ...`, expected: `SELECT ....`, }
Done.
pkg/sql/sem/tree/format_test.go, line 444 at r9 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Hmm, I feel we might want to close the parenthesis for the subquery here, thoughts?
Would require some refactoring, but I can add that in if you think it's necessary!
I'm also concerned about cases with a subquery, but also additional truncated text outside that query? eg) For the above example, it might be misleading to add a closing parenthesis
SELECT (SELECT FROM sy...) FROM ...
Because there's additional columns being selected after the subquery. Unless we also decide to add extra ellipses after the closing parenthesis, but that might be unnecessarily complicated.
SELECT (SELECT FROM sy...)... FROM ...
Thoughts?
a38c90f to
c6f1b91
Compare
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @kevin-v-ngo, @knz, @lindseyjin, and @matthewtodd)
pkg/sql/sem/tree/format_test.go, line 444 at r9 (raw file):
Previously, lindseyjin (Lindsey Jin) wrote…
Would require some refactoring, but I can add that in if you think it's necessary!
I'm also concerned about cases with a subquery, but also additional truncated text outside that query? eg) For the above example, it might be misleading to add a closing parenthesis
SELECT (SELECT FROM sy...) FROM ...Because there's additional columns being selected after the subquery. Unless we also decide to add extra ellipses after the closing parenthesis, but that might be unnecessarily complicated.
SELECT (SELECT FROM sy...)... FROM ...Thoughts?
Hmm, personally, I found the summary string of the shape SELECT (SELECT FROM sy...)... FROM ... significantly more intuitive to read compared to the case of unbalanced parenthesis (e.g. SELECT (SELECT FROM sy... FROM system.scheduled_jobs AS s). Defer to @kevin-v-ngo on whether if this really affect UX
Hmm also, how come this is not being treated as the special case where we simply reduce it to (SELECT) ?
|
(sorry I'm on PTO and unable to review this) |
maryliag
left a comment
There was a problem hiding this comment.
Reviewed 3 of 5 files at r5, 1 of 3 files at r8, 1 of 1 files at r9, 1 of 1 files at r11, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @kevin-v-ngo, @lindseyjin, @maryliag, and @matthewtodd)
pkg/sql/sem/tree/format_test.go, line 433 at r11 (raw file):
} func TestFormatNodeSummary(t *testing.T) {
@kevin-v-ngo @Annebirzin can you take a look into this test file just to validate the new formats or if there is any other case you have in mind that you would like to see the result.
@lindseyjin just confirming, you haven't dealt with the JOIN case yet, correct?
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Annebirzin, @Azhng, @kevin-v-ngo, @maryliag, and @matthewtodd)
pkg/sql/sem/tree/format_test.go, line 433 at r11 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
@kevin-v-ngo @Annebirzin can you take a look into this test file just to validate the new formats or if there is any other case you have in mind that you would like to see the result.
@lindseyjin just confirming, you haven't dealt with the JOIN case yet, correct?
Oh, quick update for that: I talked to Kevin and Anne and they decided it was fine if we just go up to 30 characters starting from the FROM clause for tables, instead of implementing specific JOIN logic. That would be easier, and we’d likely capture the first table, the JOIN clause, ON clause, and maybe another table, before the text is truncated.
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @Annebirzin, @Azhng, @kevin-v-ngo, @maryliag, and @matthewtodd)
|
Nice! @maryliag @lindseyjin one thing I noticed in the test file, a few of the truncated statements look very long and I'm wondering how that will display in the statements page considering the limited space. A couple of thoughts:
@kevin-v-ngo any thoughts? |
c6f1b91 to
31f8ebf
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Hmm, I'm not too sure how it would render in the frontend, although I'd imagine that overly long statements would be cropped anyways instead of extending the table cell size. Can include screenshots in my next commit after I finish implementing that!
For now, I've updated the table character limit to be 15 instead of 30 characters. Is that better?
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @Annebirzin, @Azhng, @kevin-v-ngo, @maryliag, and @matthewtodd)
|
Sound good! We could also consider extending the width of the statements column a bit, but will wait to see how it renders in the UI. |
Annebirzin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @Annebirzin, @Azhng, @kevin-v-ngo, @maryliag, and @matthewtodd)
3d24541 to
74088e3
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @Annebirzin, @Azhng, @kevin-v-ngo, @maryliag, and @matthewtodd)
pkg/sql/sem/tree/format_test.go, line 444 at r9 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Hmm, personally, I found the summary string of the shape
SELECT (SELECT FROM sy...)... FROM ...significantly more intuitive to read compared to the case of unbalanced parenthesis (e.g.SELECT (SELECT FROM sy... FROM system.scheduled_jobs AS s). Defer to @kevin-v-ngo on whether if this really affect UXHmm also, how come this is not being treated as the special case where we simply reduce it to
(SELECT)?
Update to provide context: We're going with the first approach of closing the parentheses, and adding extra ellipses if necessary to signify additional text.
SELECT (SELECT FROM sy...)... FROM ...
pkg/sql/sem/tree/format_test.go, line 494 at r14 (raw file):
{ stmt: `SELECT (SELECT job_id FROM (SELECT * FROM system.jobs) AS c) AS j, name, app_name FROM system.apps`, expected: `SELECT (SELECT FROM (S...)...)... FROM system.apps`,
Any thoughts on this format? @Annebirzin @kevin-v-ngo
Do we want to close all open parentheses and add ellipses if there's text afterwards?
…ements Resolves cockroachdb#27021 Previously, statements on the statements page hid too much information. There were complaints that it was difficult to disambiguate between statements without having to view the full query on the tooltips. To address this, this commit adds in new formatting functions for creating summarized queries. Currently, this only applies to SELECT, INSERT, and UPDATE statement types. We have also added a new FmtFlag, FmtSummary, a new field, StmtSummary, in the Statement struct, and a new field, QuerySummary, in app stats. A later commit will implement these changes in the frontend. Release note (sql change): Add new formatting functions to create summarized queries for SELECT, INSERT, and UPDATE statements. Also add new metadata fields, which will later be used to pass this information to the front-end Statements page.
74088e3 to
ba94984
Compare
lindseyjin
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @Annebirzin, @Azhng, @kevin-v-ngo, @maryliag, and @matthewtodd)
pkg/sql/sem/tree/format_test.go, line 462 at r15 (raw file):
{ stmt: `INSERT INTO vehicles VALUES ($1, $2, __more6__)`, expected: `INSERT INTO vehicles`,
I realized that insert statements without provided columns simplify down to just INSERT INTO table. Do we want to consider showing the values in this case, or is this fine as it is? @Annebirzin
|
@lindseyjin I think it's probably fine as is |
lindseyjin
left a comment
There was a problem hiding this comment.
Sounds good, thanks!
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @Annebirzin, @Azhng, @kevin-v-ngo, @maryliag, and @matthewtodd)
|
bors r+ |
|
Build succeeded: |
Resolves #27021
Previously, statements on the statements page hid too much information.
There were complaints that it was difficult to disambiguate between
statements without having to view the full query on the tooltips.
To address this, this commit adds in new formatting functions for
creating summarized queries. Currently, this only applies to SELECT,
INSERT, and UPDATE statement types. We have also added a new FmtFlag,
FmtSummary, a new field, StmtSummary, in the Statement struct, and a
new field, QuerySummary, in app stats. A later commit will implement
these changes in the frontend.
Release note (sql change): Add new formatting functions to create
summarized queries for SELECT, INSERT, and UPDATE statements. Also
add new metadata fields, which will later be used to pass this
information to the front-end Statements page.