| // If the tree is of the latest version and fast node is not in the tree | ||
| // then the regular node is not in the tree either because fast node | ||
| // represents live state. | ||
| if t.version == t.ndb.latestVersion { |
This fixes a separate race condition that was exercised during the test.
| pendingFastNodeAdditions []*fastnode.Node // Fast nodes to add to cache after batch commit. | ||
| pendingFastNodeRemovals [][]byte // Fast node keys to remove from cache after batch commit. |
New: this is the actual fix for the main race causing the app hash mismatch.
| // return -1, nil | ||
| } | ||
|
|
||
| func (ndb *nodeDB) getCachedLatestVersion() int64 { |
There is already a `getLatestVersion` accessor, but it can trigger a heavy DB scan when the latest version is 0. The previous logic did not do this, so this lighter-weight accessor was added. Again, this is for the separate race condition being fixed so that the test does not race, not for the app hash mismatch.
@Mergifyio backport release/v1.2.x
✅ Backports have been created
Cherry-pick of 146f723 has failed. To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
@Mergifyio backport release/v1.3.x
✅ Backports have been created
Cherry-pick of 146f723 has failed. To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
Fixes a race condition where concurrent `GetFastNode` calls (from RPC queries) can repopulate the fast node cache with stale data that is about to be overwritten from the tree during `SaveVersion`. After `Commit` within `SaveVersion`, the cache is not repopulated with the correct data from the tree, so it serves that stale data to future readers. At the application level, this leads to app hash mismatches.

To fix this, we introduce `pendingFastNodeAdditions` and `pendingFastNodeRemovals`, which store changes to the fast node cache when nodes are added via `saveFastNodeUnlocked` or removed via `DeleteFastNode`. We then defer the addition or removal of nodes from the fast node cache until after the tree changes commit, so there is no window in which a node has been removed from the cache and an incorrect value can then be read back in from the tree.

This also fixes a separate race between `SaveVersion` and accessing `getLatestVersion` that was exercised by the regression tests.
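
For readers unfamiliar with the pattern, the deferred cache update can be sketched in isolation. This is a minimal illustration under simplifying assumptions, not the code in this PR: the `store` type, its `db` and `cache` maps, and every method name below are hypothetical, and a single mutex stands in for the real locking. It shows only the ordering: writes are staged, and the cache is touched only after the staged writes have been applied to the backing store.

```go
package main

import "sync"

// op is one staged write. All names in this sketch are hypothetical.
type op struct {
	key    string
	value  []byte
	remove bool
}

// store pairs a backing map (standing in for the database) with a
// read-through cache and a list of writes staged for the next commit.
type store struct {
	mu      sync.Mutex
	db      map[string][]byte
	cache   map[string][]byte
	pending []op
}

func newStore() *store {
	return &store{db: map[string][]byte{}, cache: map[string][]byte{}}
}

// Set stages a write. It deliberately leaves the cache untouched.
func (s *store) Set(key string, value []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.pending = append(s.pending, op{key: key, value: value})
}

// Delete stages a removal. It deliberately leaves the cache untouched.
func (s *store) Delete(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.pending = append(s.pending, op{key: key, remove: true})
}

// Get reads through the cache and repopulates it from db on a miss.
func (s *store) Get(key string) ([]byte, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if v, ok := s.cache[key]; ok {
		return v, true
	}
	v, ok := s.db[key]
	if ok {
		s.cache[key] = v
	}
	return v, ok
}

// Commit applies the staged writes to db first and only then mirrors
// them into the cache, so a reader never evicts or refills an entry
// between the two steps.
func (s *store) Commit() {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, o := range s.pending {
		if o.remove {
			delete(s.db, o.key)
		} else {
			s.db[o.key] = o.value
		}
	}
	for _, o := range s.pending {
		if o.remove {
			delete(s.cache, o.key)
		} else {
			s.cache[o.key] = o.value
		}
	}
	s.pending = nil
}
```

In this sketch, a `Get` issued between `Set` and `Commit` keeps returning the last committed value, and the cache switches to the new value only once `Commit` has applied it to `db`.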