Skip to content

[R4R]add sharedStorage for prefetching to L1#792

Merged
unclezoro merged 15 commits intobnb-chain:developfrom
flywukong:sharedStorage
Mar 28, 2022
Merged

[R4R]add sharedStorage for prefetching to L1#792
unclezoro merged 15 commits intobnb-chain:developfrom
flywukong:sharedStorage

Conversation

@flywukong
Copy link
Copy Markdown
Contributor

@flywukong flywukong commented Mar 11, 2022

Description

In the previous prefetch behavior of importing blocks, the prefetch and the main process use completely different state dbs, the prefetch process can only put the retrieved data into the L3 layer, if the main process wants to access, it must go through the L1-L2-L3 layers, the new design let the main process and the prefetch process share a shared_pool with the stateDB. This pool stores the originStorage data (each stateobjects has the corresponding
entry ). By this way , the prefetch data that can be directly fished to the L1 layer
This shared_pool can be represented as the following secondary map structure, which is a private member variable in stateDB , each stateObjects has pointer to access the Corresponding entry of shared_pool

image

This PR also remove the overhead of TryPreload which cost 2%-3% in main process.

Rationale

we can make a lot of data in the main process that originally need to be accessed in L3 directly accessed in L1 layer, which reduces a lot of L2 and L3 layer overhead in the number of layers

Performance

The two test machine is on AWS , the two nodes has worked in fullsync mode and has caught up to the latest block

This is the delay of process block , we can see the machine which running master branch data has nearly same performance data
image

Then update a machine with this PR code, it can be clearly seen that the delay data is reduced (green performance data), and the average process delay is reduced by about 7%

image

The proportion of I/O overhead in the total block cost of the main process dropped from 33% to 30.5%
image

The overhead of total block cost has reduce from to 3.67% to 1.1% with PR code
image

Example

add an example CLI or API response...

Changes

Notable changes:

  • add each change in a bullet point here
  • ...

@flywukong flywukong changed the title add sharedStorage for prefetching to L1 [WIP]add sharedStorage for prefetching to L1 Mar 11, 2022
Copy link
Copy Markdown

@RenRick RenRick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// sharedStorage is used to store maps of originStorage of stateObjects
type SharedStorage struct {
poolLock *sync.RWMutex
shared_map map[common.Address]*sync.Map
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"shared_map" needs to keep the same naming style with poolLock

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

go func(start time.Time, followup *types.Block, throwaway *state.StateDB, interrupt *uint32) {
bc.prefetcher.Prefetch(followup, throwaway, bc.vmConfig, &followupInterrupt)
}(time.Now(), block, throwaway, &followupInterrupt)
}(time.Now(), block, statedb, &followupInterrupt)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would not suggest to share StateDB between Prefetcher and main StateDB.
Use prefetch DB instead, main StateDB can access prefetch DB to get the originStorage

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefetch DB could be used as: L1.5 Cache

Copy link
Copy Markdown
Contributor Author

@flywukong flywukong Mar 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stateDB of main process an prefetcher are different, they only share the sharedpool which store the originStroage maps , I had adjust the sharedPool to be L1.5 Cache by the new commit

go func(start time.Time, followup *types.Block, throwaway *state.StateDB, interrupt *uint32) {
bc.prefetcher.Prefetch(followup, throwaway, bc.vmConfig, &followupInterrupt)
}(time.Now(), block, throwaway, &followupInterrupt)
}(time.Now(), block, statedb, &followupInterrupt)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would not suggest to share StateDB between Prefetcher and main StateDB.
Use prefetch DB instead, main StateDB can access prefetch DB to get the originStorage

storage.RUnlock()
if !ok {
m := new(sync.Map)
storage.Lock()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concurrent race condition?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, newObjects called by main preoces or prefetcher will both call this function to check or update the sharedpool

Copy link
Copy Markdown
Contributor

@unclezoro unclezoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix lint.

storageMap, ok := s.sharedMap[address]
s.RUnlock()
if !ok {
m := new(sync.Map)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has race condition.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

}
}

// Check whether the storage exist in pool,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be better to follow the commenting style used in go-ethereum, like, // Copy creates a deep, independent copy of the state. // Snapshots of the copied state cannot be applied to the copy. func (s *StateDB) Copy() *StateDB {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

@unclezoro
Copy link
Copy Markdown
Contributor

LGTM

Copy link
Copy Markdown

@richardrich975 richardrich975 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@unclezoro unclezoro changed the title [WIP]add sharedStorage for prefetching to L1 [R4R]add sharedStorage for prefetching to L1 Mar 28, 2022
@unclezoro unclezoro merged commit 77cb056 into bnb-chain:develop Mar 28, 2022
}
if value, cached := s.originStorage[key]; cached {

if value, cached := s.getOriginStorage(key); cached {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can leave it unchanged; the API: getOriginStorage can be removed;
// Try to get it from origin storage prefetch pool
if s.sharedOriginStorage != nil {
val, ok := s.sharedOriginStorage.Load(key)
if ok {
s.originStorage[key] = val.(common.Hash)
return val.(common.Hash), true
}
}

return common.Hash{}, false
}

func (s *StateObject) setOriginStorage(key common.Hash, value common.Hash) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to add a new APIsetOriginStorage either

@@ -0,0 +1,37 @@
package state
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about rename the file with: shared_storage_pool.go, shared_pool.go is a bit confusing, shared what?

return accounts
}

func (s *StateDB) GetStorage(address common.Address) *sync.Map {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to addGetStorage;
GetStorage is paired with SetStorage, it is for fakeStorage.
Do not occupy the name.

code Code // contract bytecode, which gets set when code is loaded

originStorage Storage // Storage cache of original entries to dedup rewrites, reset for every transaction
sharedOriginStorage *sync.Map // Storage cache of original entries to dedup rewrites, reset for every transaction
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Storage cache of original entries to dedup rewrites, reset for every transaction
the comment is for originStorage, pls add a new comment for sharedOriginStorage

if data.Root == (common.Hash{}) {
data.Root = emptyRoot
}
var storageMap *sync.Map
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename storageMap to sharedStorage, storage is a Map by itself.

stateObjectsPending map[common.Address]struct{} // State objects finalized but not yet written to the trie
stateObjectsDirty map[common.Address]struct{} // State objects modified in the current execution

storagePool *StoragePool // sharedPool to store L1 originStorage of stateObjects
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name: storagePool -> originStoragePool to make it clear.

stateObjectsDirty map[common.Address]struct{} // State objects modified in the current execution

storagePool *StoragePool // sharedPool to store L1 originStorage of stateObjects
writeOnSharedStorage bool // Write to the shared origin storage of a stateObject while reading from the underlying storage layer.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename writeOnSharedStorage to updateSharedOriginStorage

unclezoro pushed a commit that referenced this pull request Mar 29, 2022
* add sharedStorage for prefetching to L1

* remote originStorage in stateObjects

* fix core

* fix bug of sync map

* remove read lock when get & set keys

* statedb copy use CopyWithSharedStorage

* reduce lock access

* fix comment

* avoid sharedPool effects on other modules

* remove  tryPreload

* fix comment

* fix var name

* fix lint

* fix L1 miss data && data condition

* fix comment
setunapo pushed a commit to setunapo/bsc that referenced this pull request Apr 1, 2022
* add sharedStorage for prefetching to L1

* remote originStorage in stateObjects

* fix core

* fix bug of sync map

* remove read lock when get & set keys

* statedb copy use CopyWithSharedStorage

* reduce lock access

* fix comment

* avoid sharedPool effects on other modules

* remove  tryPreload

* fix comment

* fix var name

* fix lint

* fix L1 miss data && data condition

* fix comment
owen-reorg pushed a commit to bnb-chain/op-geth that referenced this pull request May 31, 2023
Description: This PR refers to the code optimization done by `BSC`, which mainly includes the following parts:
1. Do BlockBody verification concurrently.
2. Do the calculation of intermediate root concurrently.
3. Commit the MPTs concurrently.
4. Preload accounts before processing blocks.
5. Make the snapshot layers configurable.
6. Reuse some objects to reduce GC.
7. Add shared_pool for `stateDB` to improve cache usage.

References:
- bnb-chain/bsc#257
- bnb-chain/bsc#792

---------

Co-authored-by: j75689 <j75689@gmail.com>
Co-authored-by: realowen <131384228+realowen@users.noreply.github.com>
owen-reorg pushed a commit to bnb-chain/op-geth that referenced this pull request May 31, 2023
Description

This PR refers to the code optimization done by `BSC`, which mainly includes the following parts:
1. Do BlockBody verification concurrently.
2. Do the calculation of intermediate root concurrently.
3. Commit the MPTs concurrently.
4. Preload accounts before processing blocks.
5. Make the snapshot layers configurable.
6. Reuse some objects to reduce GC.
7. Add shared_pool for `stateDB` to improve cache usage.

References
- bnb-chain/bsc#257
- bnb-chain/bsc#792

---------

Co-authored-by: j75689 <j75689@gmail.com>
Co-authored-by: realowen <131384228+realowen@users.noreply.github.com>
owen-reorg pushed a commit to bnb-chain/op-geth that referenced this pull request May 31, 2023
Description

This PR refers to the code optimization done by `BSC`, which mainly includes the following parts:
1. Do BlockBody verification concurrently.
2. Do the calculation of intermediate root concurrently.
3. Commit the MPTs concurrently.
4. Preload accounts before processing blocks.
5. Make the snapshot layers configurable.
6. Reuse some objects to reduce GC.
7. Add shared_pool for `stateDB` to improve cache usage.

References
- bnb-chain/bsc#257
- bnb-chain/bsc#792

---------

Co-authored-by: j75689 <j75689@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants