sql: revert the flattening into errors via pgerror.Wrapf#36023
sql: revert the flattening into errors via pgerror.Wrapf#36023knz wants to merge 2 commits intocockroachdb:masterfrom
Conversation
This an oversight in cockroachdb#35821 - all the errors in `inbound.go` use `errors.Wrap`, this one must too. (This and the rest of the file should really use `pgerror.Wrap`, which in combination with cockroachdb#36023 would be innocuous. But that's an aside.) In either case, all the functionality in this file requires wrapping, not flattening the error into a string. Release note: None
9e9fdfb to
d92da7d
Compare
43caef1 to
17cfbe7
Compare
andreimatei
left a comment
There was a problem hiding this comment.
I really like the ultimate flexibility that I think this patch will give us, but I need more time to look over it.
Here's a question though - why is it that we have both pgerror.Error and sqlbase.Error?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @knz)
pkg/sql/distsqlpb/data.proto, line 39 at r2 (raw file):
// detail is deprecated as of 19.1. // Will be populated for 2.1 peers until the next version.
nit: s/peers/nodes
pkg/sql/distsqlpb/data.proto, line 48 at r2 (raw file):
} // full_error contains a structured errors with possibly multiple wrapping layers
say that this is the same as detail.
pkg/sql/sqlerror/data.go, line 29 at r2 (raw file):
) // withMessage adds a message prefix.
why have all these with* types instead of having a single sqlerror.Error with cause, msg, code, hint, details, etc fields?
Similarly, ErrorWrapperPayload could have a bunch of fields instead of being a oneof.
Wouldn't that be a lot simpler - and also more efficient?
And then we could still have function like sqlerror.WithCode(code, err) which look at the input and if it's already a sqlerror.Error, just populate the respective field and if it isn't create a sqlerror.Error with the code and cause populated.
pkg/sql/sqlerror/data.go, line 80 at r2 (raw file):
} // withInternalError is a general purpose internal error marker and
how's this different from withCode? Why does it need to exist?
pkg/sql/sqlerror/data.go, line 90 at r2 (raw file):
} // stackTrace is a helper interface to extra stacktraces from
I think there's a missing word
pkg/sql/sqlerror/data.go, line 117 at r2 (raw file):
default: return &withInternalError{ cause: errors.Errorf("unknown error type received: %T", e.Detail),
don't we need to populate the code? If not, pls document that in withInternalError.
pkg/sql/sqlerror/data.proto, line 27 at r2 (raw file):
message EncodedErrorWrapper { AnyErrorContainer cause = 1 [(gogoproto.nullable) = false]; ErrorWrapperPayload payload = 2 [(gogoproto.nullable) = false];
shouldn't ErrorWrapperPayload be nullable?
Like, if I'm encoding a roachpb.UnhandledRetryableError, I don't have any payload, do I?
I'm imagining this structure:
ErrWithCause{
ErrWithCause cause
oneof {
pgerror.Error // this is the thing with Code, Hint, etc.
roachpb.UnhandledRetryableError
}
}
To encode a chain of errors, you take the head, put it in the oneof, and then fill in the cause recursively.
pkg/sql/sqlerror/data.proto, line 30 at r2 (raw file):
} // AnyErrorContainer is a container for a protobuf-encodable error.
how about just AnyError?
pkg/sql/sqlerror/data.proto, line 44 at r2 (raw file):
// UnencodableError is used to encode error objects that do not have a // variant in AnyErrorContainer. message UnencodableError {
how about UntypedErr?
pkg/sql/sqlerror/data.proto, line 53 at r2 (raw file):
// The following payload encode additional/fallback values for the // fields of a pgerror.Error directly. TextErrorPayload with_code = 1;
how about simply string code, string message, ...?
I don't see the need for TextErrorPayload, and also the with_ prefixes seem weird to me.
pkg/sql/sqlerror/data.proto, line 60 at r2 (raw file):
pgerror.Error.Source with_source = 6; SafeDetailPayload with_safe_detail = 7; TextErrorPayload with_unknown_error_payload = 8;
tabs
pkg/sql/sqlerror/data.proto, line 61 at r2 (raw file):
SafeDetailPayload with_safe_detail = 7; TextErrorPayload with_unknown_error_payload = 8; TextErrorPayload with_internal_error = 9;
like the withInternalError type itself, I don't see the need for this
pkg/sql/sqlerror/errors.go, line 24 at r2 (raw file):
) type causerI interface {
nit: why not just causer ?
pkg/sql/sqlerror/internal_errors.go, line 35 at r2 (raw file):
// NewAssertionErrorWithWrappedErrf is the "advanced" version of the function in package pgerror. func NewAssertionErrorWithWrappedErrf(err error, format string, args ...interface{}) error {
please remind me why this function exists?
pkg/sql/sqlerror/wrap.go, line 26 at r2 (raw file):
// WrapWithDepthf is an advanced replacement for pgerror.WrapWithDepthf. // It preserves the errors.Causer() interface while being able to
I'd strike the reference to pgerror.WrapWithDepthf.
pkg/sql/sqlerror/wrap.go, line 26 at r2 (raw file):
// WrapWithDepthf is an advanced replacement for pgerror.WrapWithDepthf. // It preserves the errors.Causer() interface while being able to
errors.Causer is a actually an unexported interface... Not sure how I'd phrase this though.
pkg/sql/sqlerror/wrap.go, line 28 at r2 (raw file):
// It preserves the errors.Causer() interface while being able to // produce fully decorated pgerror.Error objects via the Flatten() function. func WrapWithDepthf(depth int, err error, code, format string, args ...interface{}) error {
depth needs a comment
pkg/sql/sqlerror/wrap.go, line 31 at r2 (raw file):
if err == nil { // Shortcut: although the functions below do the same, this takes less time. return err
return nil
pkg/sql/sqlerror/wrap.go, line 58 at r2 (raw file):
func init() { // Override the base function with the advanced code. pgerror.WrapWithDepthf = WrapWithDepthf
waaaa what's going on here? :)
Can't pgerror.Wrap.. simply call this one?
pkg/sql/sqlerror/wrap.go, line 72 at r2 (raw file):
// WithMessagef adds a message. func WithMessagef(err error, format string, args ...interface{}) error { if err == nil || (format == "" && len(args) == 0) {
what does format == "" && len(args) != 0 do?
RaduBerinde
left a comment
There was a problem hiding this comment.
I'm also finding it hard to wrap my head around the details. pgerror.Wrap returning a sqlerror is definitely surprising. Isn't it possible to fold sqlerror everything into pgerror? I didn't see sqlerror importing anything that pgerror couldn't.
In any case, I'm getting pretty uncomfortable with these changes so close to releasing 19.1.. Do you think there might be a smaller fix we could consider for 19.1? (even if it means reverting some of the recent changes and giving up some of the benefits around internal errors)
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @knz)
|
Hello friends, thanks for the early review. Before we talk about the important things in this PR, let me clear the air on three things:
I'll respond to the other points inline below. |
|
Regarding 3, I'm not worried about the dependency injection (I used that before too), I'm worried about how understandable things are, especially for someone who isn't deep-diving into these details but is just working with higher level code. If I have to create an error, when should I be using I think an improvement we could make is hide |
RaduBerinde
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @knz)
pkg/sql/pgwire/pgerror/wrap.go, line 45 at r2 (raw file):
func init() { WrapWithDepthf = defaultWrapWithDepthf
In what cases would the default implementation be used? I assume for tests that link pgwire but not sqlerror, are there a lot of those? Because it would be more clear if this default implementation was in the tests. Then the API comment above doesn't need to talk about the default implementation, it can just describe the real semantics.
Yes I see this wisdom. I think I know how to achieve that, I'm going to try something for this. re the default implementation in pgerror:
|
17cfbe7 to
bfcdb91
Compare
knz
left a comment
There was a problem hiding this comment.
I have made a first round of adjustments based on your first round of comments. See my answers below.
The next thing is to do what Radu suggested, move most of the functionality back to the pgerror package and hide sqlerror. I'll do that next.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @RaduBerinde)
pkg/sql/distsqlpb/data.proto, line 39 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit: s/peers/nodes
Done.
pkg/sql/distsqlpb/data.proto, line 48 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
say that this is the same as
detail.
Done.
pkg/sql/pgwire/pgerror/wrap.go, line 45 at r2 (raw file):
Previously, RaduBerinde wrote…
In what cases would the default implementation be used? I assume for tests that link pgwire but not sqlerror, are there a lot of those? Because it would be more clear if this default implementation was in the tests. Then the API comment above doesn't need to talk about the default implementation, it can just describe the real semantics.
I think that by moving the functionality to the pgerror package like you suggested, we can erase the distinction and clarify. I'll try that first.
Otherwise I will make a run of tests without any implementation (with nil function references) and see if anything goes wrong. If it works, I can remove the defaults and clarify too.
pkg/sql/sqlerror/data.go, line 29 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
why have all these
with*types instead of having a singlesqlerror.Errorwithcause, msg, code, hint, details, etcfields?
Similarly,ErrorWrapperPayloadcould have a bunch of fields instead of being aoneof.
Wouldn't that be a lot simpler - and also more efficient?And then we could still have function like
sqlerror.WithCode(code, err)which look at the input and if it's already asqlerror.Error, just populate the respective field and if it isn't create asqlerror.Errorwith thecodeandcausepopulated.
For now, I'll just say that my current design is 10000x better than what you propose for very objective reasons, and I'll just ignore your comment (and I'll take time later to tell you why privately).
pkg/sql/sqlerror/data.go, line 80 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
how's this different from
withCode? Why does it need to exist?
Good question - clarified in the comments.
Also renamed the other to withDefaultCode.
pkg/sql/sqlerror/data.go, line 90 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I think there's a missing word
Indeed. Thanks for pointing it out. Fixed.
pkg/sql/sqlerror/data.go, line 117 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
don't we need to populate the code? If not, pls document that in
withInternalError.
Documented. Thanks for noticing.
pkg/sql/sqlerror/data.proto, line 27 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
shouldn't
ErrorWrapperPayloadbe nullable?
Like, if I'm encoding aroachpb.UnhandledRetryableError, I don't have any payload, do I?I'm imagining this structure:
ErrWithCause{ ErrWithCause cause oneof { pgerror.Error // this is the thing with Code, Hint, etc. roachpb.UnhandledRetryableError } }To encode a chain of errors, you take the head, put it in the
oneof, and then fill in the cause recursively.
It's like your comment above about the type -- I think you're confused with how this overall works. Which is strange, because the way it actually works is so much simpler than how you seem to think about it.
That you misunderstood this, is on me; I may not have sufficient explained what's going on. I will add comments + PR description to achieve that, then we can discuss further.
In the meantime, you can look at the source code of the github.com/pkg/errors package. I am doing the same as they do, and FYI there's enormous precedence in other fields of software engineering for that approach.
pkg/sql/sqlerror/data.proto, line 30 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
how about just
AnyError?
Because it's not implementing the error interface (and neither can it).
pkg/sql/sqlerror/data.proto, line 44 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
how about
UntypedErr?
Because that's not semantically correct. The error is typed.
pkg/sql/sqlerror/data.proto, line 53 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
how about simply
string code,string message, ...?
I don't see the need forTextErrorPayload, and also thewith_prefixes seem weird to me.
Good idea. Thanks for suggesting. Done.
pkg/sql/sqlerror/data.proto, line 60 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
tabs
Done.
pkg/sql/sqlerror/data.proto, line 61 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
like the
withInternalErrortype itself, I don't see the need for this
Ack. I hope the comments clarify.
pkg/sql/sqlerror/errors.go, line 24 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit: why not just
causer?
TestVet says I can't use if causer, ok := err.(causer) because the first shadows the second.
pkg/sql/sqlerror/internal_errors.go, line 35 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
please remind me why this function exists?
To decorate unexpected errors with something that will turn them into CodeInternalError if they are flattened at the gateway, but otherwise leave access to the original cause for other parts of the code.
pkg/sql/sqlerror/wrap.go, line 26 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
errors.Causeris a actually an unexported interface... Not sure how I'd phrase this though.
I'll keep it.
pkg/sql/sqlerror/wrap.go, line 26 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I'd strike the reference to
pgerror.WrapWithDepthf.
Agreed. Will rework this when I rewrite like Radu suggested.
pkg/sql/sqlerror/wrap.go, line 28 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
depthneeds a comment
Done.
pkg/sql/sqlerror/wrap.go, line 31 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
return nil
Done.
pkg/sql/sqlerror/wrap.go, line 58 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
waaaa what's going on here? :)
Can'tpgerror.Wrap..simply call this one?
I explained before - that's the dependency injection.
However I will do what Radu suggests and simplify this further.
pkg/sql/sqlerror/wrap.go, line 72 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
what does
format == "" && len(args) != 0do?
There's a comment just one line below! wasn't that sufficiently explanatory? I'll add an example to make it even clearer.
The conversion of various errors types into a `pgerror.Error` was found (by tests) to be too disruptive; we actually really want the errors.Cause() interface to continue to work even for wrapped errors. This patch reverts the previous choice and instead ensures that all existing calls to `pgerror.Wrap` preserve the original error object, reachable via the errors.Cause() interface. Additionally, this patch ensures that error decorations (safe details, message prefix, pg error codes) etc are properly preserved across the distsql boundary, in a backward-compatible manner. Release note: None
bfcdb91 to
8b7c008
Compare
|
I have successfully hidden the The code is actually simpler as a result. RFAL. |
knz
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/sql/sqlerror/data.proto, line 44 at r2 (raw file):
Previously, knz (kena) wrote…
Because that's not semantically correct. The error is typed.
Changed to string in the end.
andreimatei
left a comment
There was a problem hiding this comment.
I have successfully hidden the sqlerror package. The only remaining reference to it is sqlerror.EncodeError() in distsqlpb/data.go.
That's definitely a big improvement! Do we need a particular package for it then, or could we just keep all the code that's now in sqlerror in distsqlrun and the protos in distsqlpb?
I think what this PR needs is a big explanation about all the different types of errors that different parts of the system use and the various use cases (what needs to go on the wire in KV, what needs to go on the wire in DistSQL, what telemetry cares about, the pg codes, how retriable errors come from deep down but need to be recognized by layers that are very high up) and how they all work after this patch (including who's supposed to create what error).
There's a lot here and I find myself going in circles.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @knz)
pkg/sql/distsqlpb/data.go, line 140 at r3 (raw file):
// NewError creates an Error from an error, to be sent on the wire. It will // recognize certain errors and marshall them accordingly, and everything // unrecognized is turned into a PGError with code "internal".
I think this comment is stale now
pkg/sql/pgwire/pgerror/data.go, line 68 at r3 (raw file):
type withUnknownErrorPayload struct { cause error payloadType string
can you comment this payloadType pls
pkg/sql/pgwire/pgerror/internal_errors.go, line 61 at r3 (raw file):
const InternalErrorPrefix = "internal error: " // NewAssertionErrorWithWrappedErrf wraps an error (which may be a pg
So the difference between NewAssertion..(err) and errors.Wrap() is that the former gets a "safe detail" which is the stack whereas the latter still has a stack but no safe details?
I don't really get it... I really wouldn't know when to use this function... I'm really hoping that we can do without it, cause it makes my head hurt :)
pkg/sql/sqlerror/data.proto, line 27 at r2 (raw file):
Previously, knz (kena) wrote…
It's like your comment above about the type -- I think you're confused with how this overall works. Which is strange, because the way it actually works is so much simpler than how you seem to think about it.
That you misunderstood this, is on me; I may not have sufficient explained what's going on. I will add comments + PR description to achieve that, then we can discuss further.
In the meantime, you can look at the source code of the
github.com/pkg/errorspackage. I am doing the same as they do, and FYI there's enormous precedence in other fields of software engineering for that approach.
It seems to me that when we put an error on the wire, there's a good opportunity to flatten it out somewhat. So if the error has both a pg code and a hint and a telemetry key, we don't need 3 levels of nesting to represent that. We could put all of these things into a single ErrorWrapperPayload, and then just have one cause. So I think the structure here doesn't need to be recursive at all.
The recursivity allows you to retain the structure of the input error, but is that really necessary? Cause it sure would be simpler if we didn't. At least I find it very difficult to read these proto.
Separately, even outside of protos, I wonder if we need such a deeply recursive structure for the errors. I understand you're copying the style of pkg/errors, which is very functional, but it sure leads to tons and tons of code. We seem to have gotten structures with like an optional code that applied on top of a thing which may or may not have another code, and stuff like that. It's just very complicated in my opinion. It seems to me that we should be able to flatten things much more aggressively - like never have two different codes in a error chain. At least with all the error fields pertaining to pgwire - code, detail, hint,source - I'd just always flatten those out. I think we'd end up with simpler code. No?
RaduBerinde
left a comment
There was a problem hiding this comment.
It will take me a minute to go through the change, but at first glance the latest iteration seems much cleaner!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @knz)
pkg/sql/pgwire/pgerror/sqlerror_api.go, line 30 at r3 (raw file):
// is initialized. var ( ConvertOtherError func(err error) EncodableError
[nit] These should live in an interface, and sqlerror would have a singleton object that implements them. The object doesn't even need injection, it could be passed as an arg to EncodeErrorInternal.
pkg/sql/pgwire/pgerror/sqlerror_api.go, line 44 at r3 (raw file):
// EncodeErrorInternal walks an error chain and uses the functions // above to encode the error. Do not use this directly; use // sqlerror.EncodeError instead.
We're going through a lot of hoops here just because we aren't exporting the withXX types. Arguably we aren't hiding them anyway because sqlerror still needs to know about each one to implement specific functions (with args that map 1-1 to the members of the structs). Have you considered exporting them and moving this code in sqlerror entirely? They would need to be renamed with more obfuscated names so they won't confuse users of the package (maybe ErrObjWithMessage etc).
|
. At least with all the error fields pertaining to pgwire - code, detail, hint,source - I'd just always flatten those out. I think we'd end up with simpler code. No?
No. I tried. The code is more complex and more error prone.
…--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
|
Replace the dependency injections with an interface. Release note: None
|
I pushed a commit that replaces the dependency injection with an interface. |
|
Superseded by #37765. |
Alleviates/informs #35854.
The conversion of various errors types into a
pgerror.Errorwasfound (by tests) to be too disruptive; we actually really want the
errors.Cause() interface to continue to work even for wrapped errors.
This patch reverts the previous choice and instead ensures that all
existing calls to
pgerror.Wrappreserve the original error object,reachable via the errors.Cause() interface.
Additionally, this patch ensures that error decorations (safe details,
message prefix, pg error codes) etc are properly preserved across the
distsql boundary, in a backward-compatible manner.
Release note: None