Skip to content

Use block hash comparison for consistency check when loading block index#2993

Merged
zkbot merged 1 commit intozcash:masterfrom
str4d:2977-performance-regression
Feb 27, 2018
Merged

Use block hash comparison for consistency check when loading block index#2993
zkbot merged 1 commit intozcash:masterfrom
str4d:2977-performance-regression

Conversation

@str4d
Copy link
Copy Markdown
Contributor

@str4d str4d commented Feb 26, 2018

The Equihash check caused block index loading to take around 38x longer.
However, we don't need to check it directly, as the only paths to writing a
block header to disk already go through a proof-of-work check (e.g. receiving a
block over the network). By forcing the block header inside the CBlockIndex to
be re-serialized, we retain the benefits of the consistency check without the
overhead at startup.

@str4d str4d added I-performance Problems and improvements with respect to performance I-regression This is a regression, i.e. something that previously worked but now does not due to a change we made I-error-handling Problems and improvements related to error handling labels Feb 26, 2018
@str4d str4d added this to the v1.0.15 milestone Feb 26, 2018
@str4d str4d requested review from bitcartel and daira February 26, 2018 20:59
@bitcartel
Copy link
Copy Markdown
Contributor

bitcartel commented Feb 26, 2018

Hmm, seems like the following code will always pass no matter what.

                auto header = pindexNew->GetBlockHeader();
                if (header.GetHash() != pindexNew->GetBlockHash())

because earlier in the code:

                CBlockIndex* pindexNew = InsertBlockIndex(diskindex.GetBlockHash());
                ...
                // copy diskindex fields into pindexNew
                ..

where

CBlockIndex * InsertBlockIndex(uint256 hash) {
   ...
    mi = mapBlockIndex.insert(make_pair(hash, pindexNew)).first;
    pindexNew->phashBlock = &((*mi).first);
   ...
}

and

    uint256 GetBlockHash() const
    {
        return *phashBlock;
    }

That is, diskindex.GetBlockHash() is being checked against diskindex.GetBlockHash().

@str4d
Copy link
Copy Markdown
Contributor Author

str4d commented Feb 26, 2018

Not quite 😃 pindexNew->GetBlockHeader() copies the header properties from CBlockIndex to CBlockHeader, and then header.GetHash() reserializes. I'm intentionally comparing it to pindexNew->GetBlockHash() precisely because it is calculated from the original CDiskBlockIndex, and there's no point in recalculating that side.

Another way to achieve this would be:

CDiskBlockIndex reserialized(*pindexNew);
reserialized.GetBlockHash()

But I think this PR is more intuitive.

@bitcartel
Copy link
Copy Markdown
Contributor

You mean pindexNew.GetBlockHeader() not pindexNew.GetHeader()?

Does this sound right?

  1. There is a CDiskBlockIndex object called diskIndex.
  2. diskindex.GetBlockHash() is invoked which calls GetHash() and involves serialization
  3. CBlockIndex* pindexNew copies the member variables from diskIndex
  4. pindexNew->GetBlockHeader() is invoked which returns a CBlockHeader object called header, and then header.GetHash() involves serialization

So when InsertBlockIndex is invoked, pindexNew->phashBlock is set to the value of diskindex.GetBlockHash() (call it V1) which is what pindexNew->GetBlockHash() will return.

Once the values from diskIndex are copied into pindexNew, reserializing seems redundant since it will be over the same values which were already hashed to generate V1, meaning the consistency check always passes

Am I missing something?

@str4d
Copy link
Copy Markdown
Contributor Author

str4d commented Feb 26, 2018

Once the values from diskIndex are copied into pindexNew, reserializing seems redundant since it will be over the same values which were already hashed to generate V1, meaning the consistency check always passes

"Once the values are copied" is the sticking point: this consistency check is intended to defend against bugs like #2931.

@bitcartel
Copy link
Copy Markdown
Contributor

bitcartel commented Feb 26, 2018

Imagine there is disk corruption. When zcashd wrote a block to disk, its hashReserved was all 0x00, as expected. However there has been some bit rot, and now whenver reading back from disk, it is non-zero junk data. (header.GetHash() != pindexNew->GetBlockHash()) won't catch this error, but CheckProofOfWork should. (That's pretty much how Zencash realized they had an issue on #2931).

Checking the block hash can help catch developer error of forgetting to copy a field (which is used as part of computing the block hash), but is there a way we can catch this type of error at build time?

This PR as it stands is fine for purpose of fixing the performance regression.

Non-blocking: Perhaps the log message "CheckEquihashSolution failed" should be changed to something like "GetBlockHash inconsistency detected, expected {} but got {}"

@str4d
Copy link
Copy Markdown
Contributor Author

str4d commented Feb 27, 2018

What I'm asking is what is the consistency check actually against?

It is protecting against us making similar mistakes as #2931 when making future changes to the block header format.

Checking the block hash can help catch developer error of forgetting to copy a field (which is used as part of computing the block hash), but is there a way we can catch this type of error at build time?

Not with the current architecture inherited from Bitcoin Core (this function is unchanged upstream). We'd need to largely rewrite this function at a minimum, which is well outside the scope of this PR.

This PR as it stands is fine for purpose of fixing the performance regression. Perhaps the log message "CheckEquihashSolution failed" should be changed to something like "GetBlockHash inconsistency detected, expected {} but got {}"

Oops, I change this message on one computer but forgot to do so on the one I actually made the commit on. Will fix.

Comment thread src/txdb.cpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make the error message reflect what is checked.

The Equihash check caused block index loading to take around 38x longer.
However, we don't need to check it directly, as the only paths to writing a
block header to disk already go through a proof-of-work check (e.g. receiving a
block over the network). By forcing the block header inside the CBlockIndex to
be re-serialized, we retain the benefits of the consistency check without the
overhead at startup.

Co-authored-by: Brad Miller <brad@z.cash>
@str4d str4d force-pushed the 2977-performance-regression branch from b067503 to 704b763 Compare February 27, 2018 00:48
@daira
Copy link
Copy Markdown
Contributor

daira commented Feb 27, 2018

The modified check is useful, but not equivalent to checking the Equihash solution. That said, it seems a pragmatic choice given the performance regression. The error message definitely needs to be changed as well though.

@str4d
Copy link
Copy Markdown
Contributor Author

str4d commented Feb 27, 2018

Force-pushed to update error message.

Copy link
Copy Markdown
Contributor

@daira daira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@str4d
Copy link
Copy Markdown
Contributor Author

str4d commented Feb 27, 2018

@zkbot r+

@zkbot
Copy link
Copy Markdown
Contributor

zkbot commented Feb 27, 2018

📌 Commit 704b763 has been approved by str4d

@zkbot
Copy link
Copy Markdown
Contributor

zkbot commented Feb 27, 2018

⌛ Testing commit 704b763 with merge f9dbd1e...

zkbot added a commit that referenced this pull request Feb 27, 2018
Use block hash comparison for consistency check when loading block index

The Equihash check caused block index loading to take around 38x longer.
However, we don't need to check it directly, as the only paths to writing a
block header to disk already go through a proof-of-work check (e.g. receiving a
block over the network). By forcing the block header inside the CBlockIndex to
be re-serialized, we retain the benefits of the consistency check without the
overhead at startup.
@zkbot
Copy link
Copy Markdown
Contributor

zkbot commented Feb 27, 2018

☀️ Test successful - pr-merge
Approved by: str4d
Pushing f9dbd1e to master...

@zkbot zkbot merged commit 704b763 into zcash:master Feb 27, 2018
@str4d str4d deleted the 2977-performance-regression branch February 27, 2018 03:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

I-error-handling Problems and improvements related to error handling I-performance Problems and improvements with respect to performance I-regression This is a regression, i.e. something that previously worked but now does not due to a change we made

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants