Currently there are a few suspicious places where the code is running switch err { ... or switch err.(type) { to determine whether an error encountered is one of a predefined set of "interesting" errors for which special behavior must be triggered.
This general patterns is highly problematic because:
- the error may have been wrapped (
errors.Wrapf) and decorated from the point it was generated to the point it is seen. At the very least a Cause() should be put in there
- the error may have been flattened / transformed into a
pgerror.Error with a different error code.
The proper pattern is thus not if err == errMyInterestingError but instead if IsErrorDerived(err, errMyInterestingError) with some suitably implemented IsErrorDerived predicate (which is not yet implemented).
here is the list of problematic codes:
conn_executor.go
switch t := err.(type) {
case *roachpb.TransactionRetryWithProtoRefreshError:
errTxn := t.Transaction
if errTxn.ID == curTxnID && errTxn.Epoch == curTxnEpoch {
// A retryable error for the current transaction
// incarnation is given the highest priority.
return 1
}
return 2
case *roachpb.TxnAlreadyEncounteredErrorError:
// Another parallel stmt got an error that caused this one.
return 5
default:
// Any other error. We sort these behind retryable errors
// and errors we know to be their symptoms because it is
// impossible to conclusively determine in all cases whether
// one of these errors is a symptom of a concurrent retry or
// not. If the error is a symptom then we want to ignore it.
// If it is not, we expect to see the same error during a
// transaction retry.
return 4
}
- maybe the error was wrapped? Note: wrapped pg errors do not preserve the kv error details.
- is it possible the error got flattened into pg error with code 40001 already by that point?
lease.go
switch err {
case nil, errDidntUpdateDescriptor:
return sqlbase.NewImmutableTableDescriptor(tableDesc.TableDescriptor), nil
case errLeaseVersionChanged:
// will loop around to retry
default:
return nil, err
}
- this may have been wrapped
- this may have been flattened into a pg error
lease.go again
switch err {
case errRenewLease:
// Renew lease and retry. This will block until the lease is acquired.
if _, errLease := acquireNodeLease(ctx, m, tableID); errLease != nil {
return nil, hlc.Timestamp{}, errLease
}
if m.testingKnobs.LeaseStoreTestingKnobs.LeaseAcquireResultBlockEvent != nil {
m.testingKnobs.LeaseStoreTestingKnobs.LeaseAcquireResultBlockEvent(LeaseAcquireBlock)
}
case errReadOlderTableVersion:
// Read old table versions from the store. This can block while reading
// old table versions from the store.
versions, errRead := m.readOlderVersionForTimestamp(ctx, tableID, timestamp)
if errRead != nil {
return nil, hlc.Timestamp{}, errRead
}
m.insertTableVersions(tableID, versions)
default:
return nil, hlc.Timestamp{}, err
}
- this may have been wrapped
- this may have been flattened into a pg error
schema_changer_go
switch err {
case
context.Canceled,
context.DeadlineExceeded,
errExistingSchemaChangeLease,
errExpiredSchemaChangeLease,
errNotHitGCTTLDeadline,
errSchemaChangeDuringDrain,
errSchemaChangeNotFirstInLine:
return false
}
switch err := err.(type) {
case errTableVersionMismatch:
return false
case *pgerror.Error:
switch err.Code {
case pgerror.CodeSerializationFailureError, pgerror.CodeConnectionFailureError:
return false
case pgerror.CodeInternalError:
if err.Message == context.DeadlineExceeded.Error() {
return false
}
}
}
sequence.go
switch err.(type) {
case *roachpb.IntegerOverflowError:
return 0, boundsExceededError(descriptor)
default:
return 0, err
}
- the error may have been wrapped
Epic CRDB-7779
Jira issue: CRDB-4549
Currently there are a few suspicious places where the code is running
switch err { ...orswitch err.(type) {to determine whether an error encountered is one of a predefined set of "interesting" errors for which special behavior must be triggered.This general patterns is highly problematic because:
errors.Wrapf) and decorated from the point it was generated to the point it is seen. At the very least aCause()should be put in therepgerror.Errorwith a different error code.The proper pattern is thus not
if err == errMyInterestingErrorbut insteadif IsErrorDerived(err, errMyInterestingError)with some suitably implementedIsErrorDerivedpredicate (which is not yet implemented).here is the list of problematic codes:
conn_executor.golease.golease.goagainschema_changer_gosequence.goEpic CRDB-7779
Jira issue: CRDB-4549