sql: surface sampled query plan, database name, and execution nodes to internal node statement statistics table#65782
Conversation
maryliag
left a comment
There was a problem hiding this comment.
Great work!
Just a few comments:
- the PR should have just one commit, to you can squash the 3 into 1
- the PR name should have the the package name and all lower case, so
sql: surface sampled query.... - On the description add info about what the PR is about, a link to the issue with
Resolves #numberand a Release note
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @THardy98)
pkg/sql/crdb_internal.go, line 948 at r1 (raw file):
} // createSamplePlan is a go closure function that builds a JSON object
you don't need to get in such details about what type of go function is this, you can go with
// createSamplePlan builds a formatted JSON object from the explain tree nodes.
bb54349 to
5494ba1
Compare
THardy98
left a comment
There was a problem hiding this comment.
done
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @THardy98)
5494ba1 to
c68d446
Compare
THardy98
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @maryliag)
pkg/sql/crdb_internal.go, line 948 at r1 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
you don't need to get in such details about what type of go function is this, you can go with
// createSamplePlan builds a formatted JSON object from the explain tree nodes.
done
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @maryliag and @THardy98)
pkg/sql/crdb_internal.go, line 970 at r2 (raw file):
// createSamplePlan builds a formatted JSON object from the explain tree nodes. var createSamplePlan func(node *roachpb.ExplainTreePlanNode) *json.ObjectBuilder createSamplePlan = func(node *roachpb.ExplainTreePlanNode) *json.ObjectBuilder {
I think is better to remove the creation of this function from here, we can even add to a utils so other places can use it too
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @maryliag and @THardy98)
pkg/sql/crdb_internal.go, line 970 at r2 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
I think is better to remove the creation of this function from here, we can even add to a utils so other places can use it too
you can also just moved to the top along with the other functions
Azhng
left a comment
There was a problem hiding this comment.
Great work! 👏
Also remember to update the release note section in your commit message.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @maryliag and @THardy98)
pkg/sql/crdb_internal.go, line 968 at r2 (raw file):
} // createSamplePlan builds a formatted JSON object from the explain tree nodes.
nit: extra space here between "formatted" and "JSON"
pkg/sql/crdb_internal.go, line 969 at r2 (raw file):
// createSamplePlan builds a formatted JSON object from the explain tree nodes. var createSamplePlan func(node *roachpb.ExplainTreePlanNode) *json.ObjectBuilder
Also, I think you can just return the json.JSON object instead of the builder.
pkg/sql/crdb_internal.go, line 970 at r2 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
you can also just moved to the top along with the other functions
Also it would be nice if you can add some tests for this function once you extracted it.
pkg/sql/crdb_internal.go, line 972 at r2 (raw file):
createSamplePlan = func(node *roachpb.ExplainTreePlanNode) *json.ObjectBuilder { nodePlan := json.NewObjectBuilder(3)
Two things here:
- usually if we are passing a constant literal to a function, we usually add an inline comment with the parameter name behind that constant. So in this case, it would be:
json.NewObjectBuilder(3 /* numAddsHint */) - Why 3 in this case? Would be nice to add a comment explaining it.
pkg/sql/crdb_internal.go, line 979 at r2 (raw file):
for _, attr := range node.Attrs { attrBuilder := json.NewObjectBuilder(1)
ditto
pkg/sql/crdb_internal.go, line 996 at r2 (raw file):
samplePlan := createSamplePlan(&s.mu.data.SensitiveInfo.MostRecentPlanDescription) execNodeIds := tree.NewDArray(types.Int)
nit: s/execNodeIds/execNodeIDs/
pkg/sql/crdb_internal.go, line 997 at r2 (raw file):
execNodeIds := tree.NewDArray(types.Int) for _, nodeId := range s.mu.data.Nodes {
nit: s/nodeId/nodeID/
c68d446 to
74cf507
Compare
THardy98
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @maryliag, and @THardy98)
pkg/sql/crdb_internal.go, line 968 at r2 (raw file):
Previously, Azhng (Archer Zhang) wrote…
nit: extra space here between "formatted" and "JSON"
done
pkg/sql/crdb_internal.go, line 972 at r2 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Two things here:
- usually if we are passing a constant literal to a function, we usually add an inline comment with the parameter name behind that constant. So in this case, it would be:
json.NewObjectBuilder(3 /* numAddsHint */)- Why 3 in this case? Would be nice to add a comment explaining it.
done
pkg/sql/crdb_internal.go, line 979 at r2 (raw file):
Previously, Azhng (Archer Zhang) wrote…
ditto
done
pkg/sql/crdb_internal.go, line 996 at r2 (raw file):
execNodeIDs
done
pkg/sql/crdb_internal.go, line 997 at r2 (raw file):
Previously, Azhng (Archer Zhang) wrote…
nit: s/nodeId/nodeID/
done
74cf507 to
6acfed5
Compare
THardy98
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @maryliag)
pkg/sql/crdb_internal.go, line 969 at r2 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Also, I think you can just return the
json.JSONobject instead of the builder.
done
pkg/sql/crdb_internal.go, line 970 at r2 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Also it would be nice if you can add some tests for this function once you extracted it.
Extracted function, put it just above the table with other util functions used for the node_statement_statistics (e.g. getSQLStats).
Exported the function so it could be used in the unit testing file.
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @maryliag, and @THardy98)
pkg/sql/crdb_internal_test.go, line 514 at r3 (raw file):
// correctly builds a JSON object from an ExplainTreePlanNode. func TestExplainTreePlanNodeToJSON(t *testing.T) { // Test using a node with multiple inner children.
You can organize this by creating one list of objects with your input and expected result, so e.g.
testData := []struct {
explainTree roachpb.ExplainTreePlanNode
expected string
}{
{ roachpb.ExplainTreePlanNode{. <-- here is multiInner
Name: "root",.....
,
`{"attrs": [{"rootKey": "rootValue"}].....`},
{ roachpb.ExplainTreePlanNode{. <-- here is multiAttr
Name: "root",.....
,
`{"attrs": [{"rootKey": "rootValue"}].....`},
{ roachpb.ExplainTreePlanNode{. <--- here is multiChild
Name: "root",.....
,
`{"attrs": [{"rootKey": "rootValue"}].....`},
}
for _, test := range testData {
resultJSON := sql.ExplainTreePlanNodeToJSON(&test.explainTree)
require.Equal(t, test.expected, resultJSON.String())
}
maryliag
left a comment
There was a problem hiding this comment.
Reviewed 1 of 2 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @maryliag, and @THardy98)
pkg/sql/crdb_internal.go, line 849 at r3 (raw file):
} // explainTreePlanNodeToJSON builds a formatted JSON object from the explain tree nodes.
nit: capitalize first letter ExplainTreePlanNodeToJSON
6acfed5 to
4d8d7e7
Compare
THardy98
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @maryliag)
pkg/sql/crdb_internal.go, line 849 at r3 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
nit: capitalize first letter
ExplainTreePlanNodeToJSON
done
pkg/sql/crdb_internal_test.go, line 514 at r3 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
You can organize this by creating one list of objects with your input and expected result, so e.g.
testData := []struct { explainTree roachpb.ExplainTreePlanNode expected string }{ { roachpb.ExplainTreePlanNode{. <-- here is multiInner Name: "root",..... , `{"attrs": [{"rootKey": "rootValue"}].....`}, { roachpb.ExplainTreePlanNode{. <-- here is multiAttr Name: "root",..... , `{"attrs": [{"rootKey": "rootValue"}].....`}, { roachpb.ExplainTreePlanNode{. <--- here is multiChild Name: "root",..... , `{"attrs": [{"rootKey": "rootValue"}].....`}, } for _, test := range testData { resultJSON := sql.ExplainTreePlanNodeToJSON(&test.explainTree) require.Equal(t, test.expected, resultJSON.String()) }
done
maryliag
left a comment
There was a problem hiding this comment.
Changes look good, a few things missing:
- Changing your commit message as Archer mentioned (your commit must have all the info, not just the PR)
- There is some tests failing that you need to update, some about the table
node_statement_statisticsthat now has more columns
The tests are on TeamCity: https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_UnitTests/3023812?
e.g.pkg/sql/logictest/testdata/logic_test/create_statements
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @maryliag, and @THardy98)
pkg/sql/crdb_internal_test.go, line 631 at r4 (raw file):
explainTreeJSON := sql.ExplainTreePlanNodeToJSON(&testData.explainTree) expected := testData.expected require.Equal(t, expected, explainTreeJSON.String())
there is not need to create the expected variable, you can use testData.expected directly
4d8d7e7 to
d0ba20d
Compare
THardy98
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @maryliag)
pkg/sql/crdb_internal_test.go, line 631 at r4 (raw file):
Previously, maryliag (Marylia Gutierrez) wrote…
there is not need to create the
expectedvariable, you can usetestData.expecteddirectly
Done.
d0ba20d to
16172d9
Compare
Azhng
left a comment
There was a problem hiding this comment.
Great work! Almost there. Just one more nit from me and seems like you might need to fix more tests in the CI 😛
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @maryliag and @THardy98)
pkg/sql/crdb_internal_test.go, line 501 at r5 (raw file):
var dbName string _, err := sqlDB.Exec(`SET database = defaultdb`)
nit: might worth to add additional require.Equal before calling SET database = defaultdb to ensure the database name is what you expected prior to this statement. Also, since it seems like this test is checking for SQL outputs of the crdb_internal.node_statement_statistics view and it's not really touching the internals of the code, you can simply just use logic test for it. It would be a lot cleaner since you can avoid a lot of boilerplate code.
16172d9 to
ab13e6e
Compare
THardy98
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @maryliag)
pkg/sql/crdb_internal_test.go, line 501 at r5 (raw file):
Previously, Azhng (Archer Zhang) wrote…
nit: might worth to add additional
require.Equalbefore callingSET database = defaultdbto ensure the database name is what you expected prior to this statement. Also, since it seems like this test is checking for SQL outputs of thecrdb_internal.node_statement_statisticsview and it's not really touching the internals of the code, you can simply just use logic test for it. It would be a lot cleaner since you can avoid a lot of boilerplate code.
Removed the unit test, created a logic test as per your suggestion.
ab13e6e to
3c08045
Compare
maryliag
left a comment
There was a problem hiding this comment.
Nice!
from the changes I requested
Make sure you got the approval from Archer from his requests too
@kevin-v-ngo if you don't mind taking a look if you're happy with the format of the plan. Thanks!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @Azhng and @maryliag)
Azhng
left a comment
There was a problem hiding this comment.
Just make sure to wait for kevin to take a final look at plan format.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @Azhng and @maryliag)
3c08045 to
3a0f76e
Compare
3a0f76e to
51e6de3
Compare
Azhng
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @Azhng, @maryliag, and @THardy98)
pkg/sql/crdb_internal.go, line 854 at r6 (raw file):
// Create a new json.ObjectBuilder with 3 key-value pairs for the node's name, // a single node attribute, and the node's children. nodePlan := json.NewObjectBuilder(3 /* numAddsHint */)
Should it still be 3 here in this case now that we are just flattening the attribute trees?
…o internal node statement statistics table This change adds the database name, sampled query plan, and execution nodes of a query statement to the crdb_internal.node_statement_statistics table. Through this change, 3rd party tools and partners are now able to consume this information for their use. Resolves cockroachdb#65013 Resolves cockroachdb#65285 Release note (sql change): Added sample_plan, database_name, and exec_node_ids columns to crdb_internal.node_statement_statistics table. Allows for 3rd party and partner consumption of this data.
No, should be: number of attributes + name + children -> num attributes + 2. |
51e6de3 to
cb5a4b5
Compare
Azhng
left a comment
There was a problem hiding this comment.
Awesome work!
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @maryliag and @THardy98)
pkg/sql/crdb_internal.go, line 854 at r6 (raw file):
Previously, THardy98 (Thomas Hardy) wrote…
No, should be: number of attributes + name + children -> num attributes + 2.
Fixed.
NICE 🎉
maryliag
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @maryliag and @THardy98)
|
Looks good Thomas! Thanks for updating the format! |
|
bors r+ |
|
Build failed (retrying...): |
|
Build succeeded: |
Previously, we store statement plan as a byte blob column in system.statement_statistics. As of cockroachdb#65782 we now support JSON encoding for the plan. This commit change the column type for plan from BYTES to JSONB in system.statement_statistics. Release note (sql change): using JSONB to store statement plan instead of BYTES in system.statement_statistics.
67331: sql: store plan as jsonb in system.statement_statistics r=maryliag a=Azhng Previously, we store statement plan as a byte blob column in system.statement_statistics. As of #65782 we now support JSON encoding for the plan. This commit change the column type for plan from BYTES to JSONB in system.statement_statistics. Release note (sql change): using JSONB to store statement plan instead of BYTES in system.statement_statistics. 67340: opt: fix FoldEqZeroSTDistance with use_spheroid argument r=otan a=mgartner The `FoldEqZeroSTDistance` normalization rule normalizes expressions in the form `st_distance(a, b) = 0` to `st_intersects(a, b)`. This normalization rule is important because it allows an inverted index to be used for more queries. Previously, `FoldEqZeroSTDistance` did not handle the third argument of `st_distance` for the geography overload, `use_spheroid`. The optimizer panicked when a user supplied this third argument because it attempted to find an `st_intersects` overload with the three arguments and no such overload exists. In addition to the panic, this rule was incorrectly firing in cases where it shouldn't have. From the [`ST_Distance` PostGIS docs](https://postgis.net/docs/ST_Distance.html): > For geography types defaults to return the minimum geodesic distance > between two geographies in meters, compute on the spheroid determined > by the SRID. If use_spheroid is false, a faster spherical calculation > is used. From the [`ST_Intersects` PostGIS docs](https://postgis.net/docs/ST_Intersects.html): > For geography, this function has a distance tolerance of about 0.00001 > meters and uses the sphere rather than spheroid calculation. In summary, `st_distance` calculates on a spheroid by default, and only on a sphere if `use_spheroid` is explicitly false. `st_intersects` calculates on the sphere always. Therefore, this rule can only apply for geography types if `use_spheroid=false` is explicitly passed. This commit ensures that `FoldEqZeroSTDistance` only matches `st_distance` functions with geography arguments if `use_spheroid=false` is included in the function call. The panic is fixed by stripping the `use_spheroid` argument from the list of arguments used to lookup the `st_intersects` overload. Fixes #67235 Release note (bug fix): Two bugs have been fixed which affected geospatial queries with the `st_distance` function. The first caused errors for filters of the form `st_distance(g1, g2, use_spheroid) = 0`. The second could cause incorrect results in some cases. It incorrectly transformed filters in the form `st_distance(g1, g2) = 0` when `g1` and `g2` are geographies to `st_instersects(g1, g2)`. This is not a valid transformation because `st_distance` makes spheroid-based calculations by default while `st_intersects` only makes sphere-based calculations. 67371: dev: move `dev` back to `cockroach` r=rail a=rickystewart We don't seem to be getting any benefit out of having it in its own repo, and having it in the same repo gives us more flexibility. Closes #67339. Release note: None 67390: roachtest: fix ruby-pg roachtest r=rafiss a=RichardJCai Helper file was moved but path was not updated. Release note: None 67393: roachtest: fix gorm by installing test deps r=RichardJCai a=rafiss fixes #66825 I think this started failing with the move to go1.16. I added these steps to match what gorm does in its own test suite. Release note: None Co-authored-by: Azhng <archer.xn@gmail.com> Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com> Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com> Co-authored-by: richardjcai <caioftherichard@gmail.com> Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
This change adds the database name, sampled query plan, and execution nodes of a query statement to the crdb_internal.node_statement_statistics table.
Example of result:
Example sample plan JSON formatted:
Through this change, 3rd party tools and partners are now able to consume this information for their use.
Resolves #65013
Resolves #65285
Release note (sql change): Added sample_plan, database_name, and exec_node_ids columns to crdb_internal.node_statement_statistics table. Allows for 3rd party and partner consumption of this data.