Skip to content

Conversation

@ryanofsky
Copy link
Contributor

@ryanofsky ryanofsky commented May 15, 2017

An off-by-one-block bug in importmulti rescan logic could cause it to return
success in an edge case even when a rescan was not successful. The case where
this would happen is if there were multiple blocks in a row with the same
GetBlockTimeMax() value, and the last block was scanned successfully, but one
or more of the earlier blocks was not readable.

This is fixed by changing the CWallet::ScanForWalletTransactions return value to point to the last failed block that could not be rescanned, and then updating the ScanForWalletTransactions call in importmulti to use the new return value.

An off-by-one-block bug in importmulti rescan logic could cause it to return
success in an edge case even when a rescan was not successful. The case where
this would happen is if there were multiple blocks in a row with the same
GetBlockTimeMax() value, and the last block was scanned successfully, but one
or more of the earlier blocks was not readable.
UniValue result = UniValue(UniValue::VOBJ);
result.pushKV("success", UniValue(false));
result.pushKV("error", JSONRPCError(RPC_MISC_ERROR, strprintf("Failed to rescan before time %d, transactions may be missing.", scannedRange->GetBlockTimeMax())));
result.pushKV("error", JSONRPCError(RPC_MISC_ERROR, strprintf("Failed to rescan before time %d, transactions may be missing.", scanFailed->GetBlockTimeMax() + 1)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it scans forward (increasing date / block numbers), and scanFailed returns the first block that could not be scanned, shouldn't this be "after time"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it scans forward (increasing date / block numbers), and scanFailed returns the first block that could not be scanned, shouldn't this be "after time"?

scanFailed returns the last block that could not be scanned, not the first block that couldn't be scanned. To take a concrete example, say there are two blocks in a row with GetBlockTimeMax() == 1000, and the first block fails to scan, and the second block succeeds, and later blocks also succeed. If there are any keys created at time 1000 or earlier, there could be transactions missing that pertain to those keys, so those keys should return errors. If there are any keys created at time 1001 or higher, they should technically be ok (even though they will still return errors for birthdays up to 1000 + TIMESTAMP_WINDOW).

The error returned for the problem keys will be "Failed to rescan before time 1001, transactions may be missing." which I think makes sense if "before" implies "less than" rather than "less than or equal".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent some time thinking about edge cases here, and I've basically convinced myself that this is ok, because:

  • the TIMESTAMP_WINDOW means that we have a good safety window between the address's birthday and the block timestamp (ie we're safe even if an address appears in a block with a timestamp before its birthday).
  • the monotonically increasing GetBlockTimeMax() means that this error message gives the most conservative value (ie if a block B's time has moved back from its parent block A and we couldn't scan block B then the error will say that we couldn't scan before Block A's timestamp. That's fine)

I think it might be useful for the user if there was a bit more information in the error message. Up to you how verbose you want to go, but a couple of things that might be good to include:

  • how does the user fix this problem (reindex, which will cause a full chain download if this is a pruning node)
  • what if the Failed to rescan before time <time> is for a time before the user specified as the address birthday? eg what if the user set the birthday to 10am, and we failed to scan a block at 9:55am. We'd return an error saying Failed to rescan before time 9:55am and the user might thing "that's ok, my key's birthday is 10am". In fact it's not ok, because the 9:55am block may include a transaction for that address, since block time is not canonical or monotonic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, expanded error message in 2e041dd

// the result stand unmodified. Otherwise replace the result
// with an error message.
if (GetImportTimestamp(request, now) - TIMESTAMP_WINDOW >= scannedRange->GetBlockTimeMax() || results.at(i).exists("error")) {
if (GetImportTimestamp(request, now) - TIMESTAMP_WINDOW > scanFailed->GetBlockTimeMax() || results.at(i).exists("error")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this checks that the birthday of the imported key is after most recent block that could not be scanned, in which case the current result will be returned?
This seems to disagree with the comment "key creation date is within the successfully scanned range".
What am I misunderstanding here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this checks that the birthday of the imported key is after most recent block that could not be scanned, in which case the current result will be returned?

Yes, if the key birthday is after the GetBlockTimeMax of the most recent block that could not be scanned + TIMESTAMP_WINDOW, the rescan for that key's transactions is considered successful, and the result is not replaced.

This seems to disagree with the comment "key creation date is within the successfully scanned range".
What am I misunderstanding here?

I may be misunderstanding something, but the comment seems right to me. If the most recent block that could not be scanned has GetBlockTimeMax == 1000, then any key created at timestamp 1001 or higher is in the "successfully scanned range", because all the blocks that were bad or missing are older than the key, and therefore not relevant.

Copy link
Member

@laanwj laanwj May 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because all the blocks that were bad or missing are older than the key, and therefore not relevant.

That's the part I'm not sure about: aren't the blocks at and after that point missing, instead of before that point? After all it's the point where it failed in a linear forward scan?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the part I'm not sure about: aren't the blocks at and after that point missing, instead of before that point?

Because it returns a pointer to the last block that couldn't be scanned, it means that any blocks after the pointer were scanned successfully.

After all it's the point where it failed in a linear forward scan?

The scan does keep going even if individual blocks couldn't be read. There might be a smarter way to handle this, but the behavior has been around a while. Only the error reporting is fairly new.

Copy link
Contributor Author

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UniValue result = UniValue(UniValue::VOBJ);
result.pushKV("success", UniValue(false));
result.pushKV("error", JSONRPCError(RPC_MISC_ERROR, strprintf("Failed to rescan before time %d, transactions may be missing.", scannedRange->GetBlockTimeMax())));
result.pushKV("error", JSONRPCError(RPC_MISC_ERROR, strprintf("Failed to rescan before time %d, transactions may be missing.", scanFailed->GetBlockTimeMax() + 1)));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, expanded error message in 2e041dd

@jnewbery
Copy link
Contributor

Looks good. Tested ACK 2e041dd. It'd be nice to have tests covering the edge cases (block time the same for two consecutive blocks and block time decreasing), but that can be left for a future PR.

@jonasschnelli
Copy link
Contributor

utACK 4d2d604
Very verbose error description texts (but great explanation).

@laanwj
Copy link
Member

laanwj commented Jun 5, 2017

utACK 4d2d604

@laanwj laanwj merged commit 4d2d604 into bitcoin:master Jun 5, 2017
laanwj added a commit that referenced this pull request Jun 5, 2017
4d2d604 Fix importmulti failure to return rescan errors (Russell Yanofsky)

Tree-SHA512: e5e6d6c5a2bb7230e1bcac1903a4b766cd57bf781fade50c6c9cd5713cd3e768db0987cfda9699b57a53d3a0a60951b96dce5283b3d3ec1f954162c439bc932b
@laanwj laanwj added this to the 0.14.3 milestone Sep 5, 2017
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jun 26, 2019
4d2d604 Fix importmulti failure to return rescan errors (Russell Yanofsky)

Tree-SHA512: e5e6d6c5a2bb7230e1bcac1903a4b766cd57bf781fade50c6c9cd5713cd3e768db0987cfda9699b57a53d3a0a60951b96dce5283b3d3ec1f954162c439bc932b
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants