sql: enhance the handling of internal errors, add sentry reporting#35820
sql: enhance the handling of internal errors, add sentry reporting#35820craig[bot] merged 7 commits intocockroachdb:masterfrom
Conversation
Before: ``` testdata/logic_test/assertion_error:4: expected success, but found crdb_internal.force_assertion_error(): internal error: woo ``` After (for example, after cockroachdb#35820 is separately applied): ``` testdata/logic_test/assertion_error:4: expected success, but found (XX000) crdb_internal.force_assertion_error(): internal error: woo sql/sem/builtins/builtins.go:2815: in func211() DETAIL: stack trace: github.com/cockroachdb/cockroach/pkg/sql/sem/builtins/builtins.go:2815: in func211() github.com/cockroachdb/cockroach/pkg/sql/sem/tree/eval.go:3651: in Eval() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/expr.go:196: in eval() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/processors.go:375: in ProcessRow() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/processors.go:778: in ProcessRowHelper() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/values.go:126: in Next() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/base.go:174: in Run() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/processors.go:801: in Run() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/flow.go:627: in Run() github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:252: in Run() github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:839: in PlanAndRun() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1112: in execWithDistSQLEngine() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:948: in dispatchToExecutionEngine() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:456: in execStmtInOpenState() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:102: in execStmt() github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1178: in run() github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:429: in ServeConn() github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:337: in func4() NOTE: internal errors may have more details in log. Use -show-logs. ``` Release note: None
35827: sql/logictest: reveal error details if present r=knz a=knz Before: ``` testdata/logic_test/assertion_error:4: expected success, but found crdb_internal.force_assertion_error(): internal error: woo ``` After (for example, after #35820 is separately applied): ``` testdata/logic_test/assertion_error:4: expected success, but found (XX000) crdb_internal.force_assertion_error(): internal error: woo sql/sem/builtins/builtins.go:2815: in func211() DETAIL: stack trace: github.com/cockroachdb/cockroach/pkg/sql/sem/builtins/builtins.go:2815: in func211() github.com/cockroachdb/cockroach/pkg/sql/sem/tree/eval.go:3651: in Eval() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/expr.go:196: in eval() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/processors.go:375: in ProcessRow() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/processors.go:778: in ProcessRowHelper() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/values.go:126: in Next() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/base.go:174: in Run() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/processors.go:801: in Run() github.com/cockroachdb/cockroach/pkg/sql/distsqlrun/flow.go:627: in Run() github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:252: in Run() github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:839: in PlanAndRun() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1112: in execWithDistSQLEngine() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:948: in dispatchToExecutionEngine() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:456: in execStmtInOpenState() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:102: in execStmt() github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1178: in run() github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:429: in ServeConn() github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:337: in func4() NOTE: internal errors may have more details in log. Use -show-logs. ``` Release note: None Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
... like it had always been intended. Release note: None
RaduBerinde
left a comment
There was a problem hiding this comment.
Very nice work, thank you! This gets us all the benefits we had with panics but without actually crashing!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @knz, and @RaduBerinde)
pkg/sql/pgwire/pgerror/errors.go, line 295 at r5 (raw file):
} // Format implements the fmt.Formatter interface.
[nit] Add to this comment and explain the three variations v, +v, s. It's important because it's not common for %sand%v` to be different.
pkg/sql/pgwire/pgerror/errors.go, line 299 at r5 (raw file):
switch verb { case 'v': switch {
[nit] switch with a single case should be if. Or just put them all in a big switch with
case verb == 'v' && s.Flag('+'):
case verb == 'v':
case verb == 's':pkg/sql/pgwire/pgerror/errors.go, line 318 at r5 (raw file):
case 's': fmt.Fprintf(s, "%s", pg.Message) }
What is this supposed to return if the verb is bad? Unfortunately this interface is poorly documented in fmt.
pkg/sql/pgwire/pgerror/errors.proto, line 50 at r5 (raw file):
} // complement to the detail field that can be reported
[nit] explicitly mention that all info is scrubbed
Release note: None
The `depth` argument wasn't incremented properly, causing one level to be lost. Release note: None
We wish to embed stack traces in Sentry reports for internal errors. However, since internal errors flow through RPCs as protobufs and raven.Stacktrace objects (used for Sentry reports) do not have a protobuf encoding, we want another serialization format. This patch provides this by means of an encapsulation of `raven.Stacktrace` as `log.StackTrace`, with working `Encode`/`Decode` methods. Release note: None
This patch splits `log.SendCrashReport` into the following pieces: - `log.ShouldSendReport` tests whether a report should be assembled+sent; - `log.SendReport` sends a report from a message and zero or more reportalbe objects. - `log.ReportableObject` aliases `raven.Interface` and is instantiated by: - `log.NewReportableMessage()` for simple strings. - `log.NewException()` for string + stack trace. Release note: None
e5c55f1 to
7810506
Compare
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis and @RaduBerinde)
pkg/sql/pgwire/pgerror/errors.go, line 295 at r5 (raw file):
Previously, RaduBerinde wrote…
[nit] Add to this comment and explain the three variations
v,+v, s. It's important because it's not common for%sand%v` to be different.
I changed it back - it was breaking otherwise.
pkg/sql/pgwire/pgerror/errors.go, line 299 at r5 (raw file):
Previously, RaduBerinde wrote…
[nit] switch with a single case should be
if. Or just put them all in a big switch withcase verb == 'v' && s.Flag('+'): case verb == 'v': case verb == 's':
Done.
pkg/sql/pgwire/pgerror/errors.go, line 318 at r5 (raw file):
Previously, RaduBerinde wrote…
What is this supposed to return if the verb is bad? Unfortunately this interface is poorly documented in
fmt.
Then nothing is printed. That's the way the errors package does it, so I'm not asking further questions.
pkg/sql/pgwire/pgerror/errors.proto, line 50 at r5 (raw file):
Previously, RaduBerinde wrote…
[nit] explicitly mention that all info is scrubbed
Done.
|
pkg/sql/pgwire/pgerror/errors.go, line 295 at r5 (raw file): Previously, knz (kena) wrote…
Ok, great. I actually hit this in a test where I had to change Still worth a comment mentioning that |
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis and @RaduBerinde)
pkg/sql/pgwire/pgerror/errors.go, line 295 at r5 (raw file):
Previously, RaduBerinde wrote…
Ok, great. I actually hit this in a test where I had to change
%sto%vwhich felt strange.Still worth a comment mentioning that
%s / %vshow the error message and%+vshows details as well (+vis not particularly well documented infmt, it just says "when printing structs, the plus flag (%+v) adds field names").
Added the comment.
|
This PR is more or less complete, but I want to get a green build on #35821 to verify that the API changes here are good. |
e87789c to
2779d6f
Compare
|
It was a good idea to iterate on the other PR before merging this one. I discovered that the
Adjusted the PR accordingly, with an explanatory comment. |
|
Another "oops" caught by the other PR: the result message for should be Fixed (also added a test). |
|
Yet another "oops" caught in tests:
However if I modified the logic accordingly to preserve retry errors + ambiguity errors. |
This patch achieves the following: - it extends the pgerror.Error object with a new field `SafeDetail`, aimed to contain non-PII details and thuis suitable for inclusion in reporting. - it populates the scrubbed format + safe arguments of error messages in `SafeDetail` using the same logic as `log.ReportablesToSafeError`. - it captures the stack trace in `SafeDetail` but also `Detail` so that a human user seeing the error can report the stack trace with a manually created issue. - it extends `pgerror.Wrapf()` so as to combine details in an existing error with the details of the point that calls `pgerror.Wrapf()`. - it provides a custom `fmt.Formatter` for `pgerror.Error`, so that: - `%#v` will now print the pg error code in addition to the message; - `%+v` will now print the source, code, message and stored `SafeDetail`. Release note: None
This patch ensures that a sentry report is created when a pg error with CodeInternalError flows out of the execution layers. Release note: None
35821: sql,*: error handling improvements r=knz a=knz Subsumes #35825. Subsumes #35820. This PR replaces uses of `errors.New`, `errors.Errorf` and `errors.Wrap` by their `pgerror` equivalents. It also embeds the subsumed PRs; see those PR's discussion for details. Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
|
pkg/sql/pgwire/pgerror/wrap.go, line 82 at r12 (raw file):
If the error already was CodeInternalError and had a stack trace, do we need to embed another stack trace every time we wrap it? |
|
As I was debugging a couple of things I discovered that error objects do not simply flow on return paths, they also move horizontally in code (i.e. passed "down" to function calls) and also across the network (distsql). capturing the wrap stack traces gives a better idea of the paths taken through the code by an error. |
Subsumes #35826.
Subsumes #35823.
This PR achieves the following:
printf("%+v", err)