Skip to content

Fix 9435 - Allow speeding up of underpriced transactions#9687

Merged
darkwing merged 9 commits intoMetaMask:developfrom
darkwing:9435-error-state
Oct 28, 2020
Merged

Fix 9435 - Allow speeding up of underpriced transactions#9687
darkwing merged 9 commits intoMetaMask:developfrom
darkwing:9435-error-state

Conversation

@darkwing
Copy link
Copy Markdown
Contributor

Fixes #9435

This patch shows the "Pending" status transaction instead of a FAILED transaction that prevents any further transactions.

One remaining problem is possibly our speed up mechanism:

First, if the price is set very low, we only allow bumping up by an increment of 1, and that's not going to help very much:

SpeedUp

Second, pressing "Yes, let's try" doesn't do anything

@darkwing darkwing requested a review from a team as a code owner October 22, 2020 21:53
@darkwing darkwing requested a review from rekmarks October 22, 2020 21:53
@darkwing darkwing marked this pull request as draft October 22, 2020 21:53
@github-actions
Copy link
Copy Markdown
Contributor

CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

David pointed out something on a call we had where he went over some of his findings. the status 'failed' is not in the PRIORITY_STATUS_HASH.

The only way for a failed transaction to be the primary transaction is simply it being the latest transaction for a nonce. The transactionSelector that is a predicate of this selector, returns a list of transactions sorted by time. Therefore cancellations, retries, and transactions submitted with custom nonce equal to an existing transaction nonce will always be the primaryTransaction for a group.

initialTransaction gets set appropriately due to the comparison of time

I don't know if only network failures have this error object, but if that's true then that seems like a pretty good way to draw the distinction.

@Gudahtt
Copy link
Copy Markdown
Member

Gudahtt commented Oct 22, 2020

Side-note about this comment:

One remaining problem is possibly our speed up mechanism:

First, if the price is set very low, we only allow bumping up by an increment of 1, and that's not going to help very much:

For cancellations, we always increase the gas price by 10%. This is not nearly enough in many cases, such as for very low initial gas prices, or during times of high network volatility.

The fix we had in mind was to allow the user to customize the price. This is tracked in #8011

Though now that you mention it, our default here is also not great 🤔 Independent of allowing custom gas prices on cancellations, we should also consider current gas estimates. e.g. something like Math.max(updatedGasPriceEstimate, oldPrice * 1.1). Maybe that could be a separate ticket.

Edit: I have created #9689 to track improving our default gas price for cancellations.

@darkwing darkwing marked this pull request as ready for review October 23, 2020 16:13
brad-decker
brad-decker previously approved these changes Oct 23, 2020
Copy link
Copy Markdown
Member

@Gudahtt Gudahtt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this fail for a "group of 1", if the transaction had failed? We do still want to assign a failed transaction as primary (either a network or on-chain failure, any failure really) if it's the sole transaction in the group.

@brad-decker
Copy link
Copy Markdown
Contributor

brad-decker commented Oct 23, 2020

@darkwing I hate it when @Gudahtt is right.

Instead, while we are iterating the transactions, what we could do is check if the assigned primaryTransaction is in a failed state. If that's true, OR the transaction has a higher (later) time, we reassign the primaryTransaction. All of the statuses in the PRIORITY_STATUS_HASH will be types we'd want to override failed txs with.

I think that would resolve the problem still without introducing the issue with groups of 1 failed transactions.

@darkwing darkwing force-pushed the 9435-error-state branch 2 times, most recently from 33ee7af to 3243c62 Compare October 27, 2020 16:45

// If the primary transaction is failed or this transaction newer, assign as primary
if (
(nonceProps.primaryTransaction.status === FAILED_STATUS && nonceProps.primaryTransaction?.txReceipt?.status === '0x0' && transaction.status !== FAILED_STATUS) ||
Copy link
Copy Markdown
Contributor

@brad-decker brad-decker Oct 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two failure states we'd want to catch here:

primaryTransaction.status === FAILED and primaryTransaction.txReceipt.status === '0x0' these are two separate failure states. The latter could be primaryTransaction.status === CONFIRMED

additionally you don't want to drop the CONFIRMED check, here..so a stab at the logic might be:

const previousPrimaryIsFailed = nonceProps.primaryTransaction.status === FAILED_STATUS || nonceProps.primaryTransaction?.txReceipt?.status === '0x0';

if (status === CONFIRMED_STATUS || previousPrimaryIsFailed || txTime > primaryTxTime) {
  // do stuff
}

Copy link
Copy Markdown
Member

@Gudahtt Gudahtt Oct 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. That doesn't seem quite right either. We want to treat the two types of failure (on-chain and network) in opposite ways, not in the same way.

e.g. something like this:

const previousPrimaryIsNetworkFailure = nonceProps.primaryTransaction.status === FAILED_STATUS && nonceProps.primaryTransaction?.txReceipt?.status !== '0x0';
const currentTransactionIsOnChainFailure = transaction.txReceipt?.status === '0x0'

if (status === CONFIRMED_STATUS || currentTransactionIsOnChainFailure || previousPrimaryIsNetworkFailure || txTime > primaryTxTime) {
  // do stuff
}

So there are three priorities here. Anything should be prioritized over a network failure, but a status of confirmed or an on-chain failure should be prioritized above pending, because they both represent a "final" state for the transaction that can't change from there on.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, we may be able to assume that on-chain failures have a status of confirmed. I was unable to confirm that it was possible for an on-chain failure to get assigned a failed status.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're off on something because, per my PR's console statements, the primary transaction is being replaced every time.

Copy link
Copy Markdown
Member

@Gudahtt Gudahtt Oct 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! Sorry I was imagining my snippet as being inside the if (status in PRIORITY_STATUS_HASH) block. Though that wouldn't have made sense either because FAILED isn't in that object.

So maybe something like this:

if (status === CONFIRMED_STATUS || currentTransactionIsOnChainFailure || previousPrimaryIsNetworkFailure || (txTime > primaryTxTime && status in PRIORITY_STATUS_HASH)) {

The intent here is to use the most recent transaction among the "priority" statuses - e.g. the ones we typically prefer to display. The on-chain failure is one strange exception to this, as is the "previous transaction network failure" case.

@darkwing darkwing marked this pull request as draft October 27, 2020 21:46
if (status === CONFIRMED_STATUS || txTime > primaryTxTime) {
nonceProps.primaryTransaction = transaction
}
const previousPrimaryIsNetworkFailure = nonceProps.primaryTransaction.status === FAILED_STATUS && nonceProps.primaryTransaction?.txReceipt?.status !== '0x0';
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I have tested our usual workflow for triggering the bug reported in #9435, and this fix resolves that issue, for sure;

I have also tested what happens if a user submits a custom nonce transaction with the same nonce as a group of transactions in the state described in #9435 👍

  1. submit tx on Rinkeby with 0.6 gasPrice.
  2. Speed up tx with gasPrice 1 and extremely high gasLimit (above network tolerance)
  3. --on develop this tx would now show as failed, on this PR it shows as pending (correct)--
  4. Submit a new transaction with the same nonce as the prior two, with a different value to be able to distinguish in UI
  5. TX in UI updates to reflect new value in the pending state
  6. TX is confirmed and continues to reflect the appropriate custom nonce tx.

The only thing I haven't yet been able to test is on-chain failures because I don't know how to replicate them -- however, the logic looks good, and it would follow the same trajectory as network failures, so I think I'm 👍 on this. I have a couple of nits/thoughts.

const previousPrimaryIsNetworkFailure = nonceProps.primaryTransaction.status === FAILED_STATUS && nonceProps.primaryTransaction?.txReceipt?.status !== '0x0';

I'm not sure that the latter half of the check is necessary; we'd want to handle on-chain failures as the primary transaction in the same way as network failures. Any TX that is failed should be superseded with a non-failed tx of the same nonce, regardless of failure reason.

const currentTransactionIsOnChainFailure = transaction?.txReceipt?.status === '0x0'

So, previously if an on-chain failure were in the 'confirmed' status, it would have been appropriately marked as the primaryTransaction. However, transactions that were in a 'failed' status would have been skipped and potentially not set as the primaryTransaction. So I think this logic in the if block makes sense to make sure that on-chain failures are properly identified in the UI. However, if the previous primaryTransaction is still pending, we wouldn't want the on-chain failure to supersede it, which would result in a similar scenario like the one mentioned in #9435. It should be noted that I'm running through a mental model here and haven't tested with real or synthetic on-chain failures.

Copy link
Copy Markdown
Contributor

@brad-decker brad-decker Oct 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example Scenario:

3 transactions, oldest to newest
1 - Pending transaction
2 - On-chain failure
3 - Network failure

  1. on evaluating the first transaction 1 is set as both primary and initial.
  2. on evaluating the second transaction, transaction 2 would be set at the primary transaction due to the || currentTransactionIsOnChainFailure check
  3. on evaluating the third transaction it is skipped due to it not being confirmed, the current transaction is not an on-chain failure, the previous primaryTransaction is not a network failure and its status is not in the PRIORITY_STATUS_HASH.

The logic does seem to work the same regardless of the order of transactions:

  1. On evaluating the third transaction (now first) it is set as both primary and initial
  2. On evaluating the second transaction it is set as the primary transaction due to being an on-chain failure.
  3. On evaluating the first transaction (now third) it is skipped because it's an older transaction.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, if the previous primaryTransaction is still pending, we wouldn't want the on-chain failure to supersede it, which would result in a similar scenario like the one mentioned in #9435.

If a transaction has been confirmed or failed on-chain, it should not be possible for it to still be "pending" as well. That's... kinda what it means for it to be confirmed. Or that was my understanding anyway. That's why the confirmed status was treated as a higher priority than the others.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least to my understanding, you're right that an on-chain failure would prevent any transaction with the same nonce from being accepted -- an on-chain failure most likely means we should display a failed transaction to the user. I think that might make my other feedback counterproductive as well, we'd want to override a failed transaction as primary only if it was not an on-chain failure. Protecting us from accidentally overriding a transaction that was, in reality, confirmed but then failed on-chain.

I think I might be taking back my feedback -- meaning this would be good as is?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have seen #9687 (comment) and the subsequent conversation. My apologies for leading this astray and countering your advice, @Gudahtt

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem! I agree that this seems correct as-is.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion @darkwing!

@darkwing darkwing marked this pull request as ready for review October 28, 2020 20:24
Copy link
Copy Markdown
Member

@Gudahtt Gudahtt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@darkwing darkwing merged commit 220a82b into MetaMask:develop Oct 28, 2020
@github-actions github-actions bot locked and limited conversation to collaborators Oct 28, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Underpriced speed-up can cause confusing error state

3 participants