Skip to content

sql: erroneous switches to test error identity #35854

@knz

Description

@knz

Currently there are a few suspicious places where the code is running switch err { ... or switch err.(type) { to determine whether an error encountered is one of a predefined set of "interesting" errors for which special behavior must be triggered.

This general patterns is highly problematic because:

  • the error may have been wrapped (errors.Wrapf) and decorated from the point it was generated to the point it is seen. At the very least a Cause() should be put in there
  • the error may have been flattened / transformed into a pgerror.Error with a different error code.

The proper pattern is thus not if err == errMyInterestingError but instead if IsErrorDerived(err, errMyInterestingError) with some suitably implemented IsErrorDerived predicate (which is not yet implemented).

here is the list of problematic codes:

conn_executor.go

        switch t := err.(type) {
        case *roachpb.TransactionRetryWithProtoRefreshError:
          errTxn := t.Transaction
          if errTxn.ID == curTxnID && errTxn.Epoch == curTxnEpoch {
            // A retryable error for the current transaction
            // incarnation is given the highest priority.
            return 1
          }
          return 2
        case *roachpb.TxnAlreadyEncounteredErrorError:
          // Another parallel stmt got an error that caused this one.
          return 5
        default:
          // Any other error. We sort these behind retryable errors
          // and errors we know to be their symptoms because it is
          // impossible to conclusively determine in all cases whether
          // one of these errors is a symptom of a concurrent retry or
          // not. If the error is a symptom then we want to ignore it.
          // If it is not, we expect to see the same error during a
          // transaction retry.
          return 4
        }
  • maybe the error was wrapped? Note: wrapped pg errors do not preserve the kv error details.
  • is it possible the error got flattened into pg error with code 40001 already by that point?

lease.go

    switch err {
    case nil, errDidntUpdateDescriptor:
      return sqlbase.NewImmutableTableDescriptor(tableDesc.TableDescriptor), nil
    case errLeaseVersionChanged:
      // will loop around to retry
    default:
      return nil, err
    }
  • this may have been wrapped
  • this may have been flattened into a pg error

lease.go again

    switch err {
    case errRenewLease:
      // Renew lease and retry. This will block until the lease is acquired.
      if _, errLease := acquireNodeLease(ctx, m, tableID); errLease != nil {
        return nil, hlc.Timestamp{}, errLease
      }
      if m.testingKnobs.LeaseStoreTestingKnobs.LeaseAcquireResultBlockEvent != nil {
        m.testingKnobs.LeaseStoreTestingKnobs.LeaseAcquireResultBlockEvent(LeaseAcquireBlock)
      }

    case errReadOlderTableVersion:
      // Read old table versions from the store. This can block while reading
      // old table versions from the store.
      versions, errRead := m.readOlderVersionForTimestamp(ctx, tableID, timestamp)
      if errRead != nil {
        return nil, hlc.Timestamp{}, errRead
      }
      m.insertTableVersions(tableID, versions)

    default:
      return nil, hlc.Timestamp{}, err
    }
  • this may have been wrapped
  • this may have been flattened into a pg error

schema_changer_go

  switch err {
  case
    context.Canceled,
    context.DeadlineExceeded,
    errExistingSchemaChangeLease,
    errExpiredSchemaChangeLease,
    errNotHitGCTTLDeadline,
    errSchemaChangeDuringDrain,
    errSchemaChangeNotFirstInLine:
    return false
  }
  switch err := err.(type) {
  case errTableVersionMismatch:
    return false
  case *pgerror.Error:
    switch err.Code {
    case pgerror.CodeSerializationFailureError, pgerror.CodeConnectionFailureError:
      return false

    case pgerror.CodeInternalError:
      if err.Message == context.DeadlineExceeded.Error() {
        return false
      }

    }
  }

sequence.go

      switch err.(type) {
      case *roachpb.IntegerOverflowError:
        return 0, boundsExceededError(descriptor)
      default:
        return 0, err
      }
  • the error may have been wrapped

Epic CRDB-7779

Jira issue: CRDB-4549

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions