[R4R]implement diff sync by unclezoro · Pull Request #376 · bnb-chain/bsc

unclezoro · 2021-08-22T10:31:50Z

Description

Diff Sync of BSC

Abstract

The increasing adoption of BSC has led to a more active network. Blocks on BSC now hit the gas ceiling daily, and we are planning to increase the capacity of BSC further. Meanwhile, node maintainers have had a hard time keeping their nodes caught up with the chain. We need a lighter sync mode to lower the hardware requirements for running a BSC fullnode.

Currently BSC has three sync modes: 1. Snap sync. 2. Fast sync. 3. Full sync.

Modes 1 and 2 are used for the initial synchronization. Once the client has the entire state and all historical block data, it switches to full sync automatically.

It takes several steps to process a block when doing full sync:

1. The block fetcher or downloader gets blocks from other peers.
2. Verify the header and block body.
3. Execute transactions in the EVM, including reading data from cache/disk.
4. Calculate the root hash of the MPT.
5. Commit the MPT to the memory db and persist the snapshot to disk.

According to our profiling, step 3 (executing transactions) takes more than 70% of the block processing time.

This design proposes a diff sync protocol that processes blocks without executing transactions. In exchange, the security of a fullnode degrades to that of a light client.
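
As a rough illustration of the idea, the sketch below applies a block's state changes directly instead of re-executing its transactions. The types `accountDiff` and `diffLayer` and the function `applyDiff` are hypothetical simplifications, not the structures used in this PR, and the sketch omits storage, code, and any verification of the resulting state.

```go
package main

import "fmt"

// accountDiff is a hypothetical, simplified post-block record for one account.
type accountDiff struct {
	Balance uint64
	Nonce   uint64
}

// diffLayer is a hypothetical stand-in for the state changes produced by one block.
type diffLayer struct {
	BlockNumber uint64
	Accounts    map[string]accountDiff // updated accounts, keyed by address
	Destructs   []string               // accounts removed by the block
}

// applyDiff writes the changes straight into state; no transaction is executed.
func applyDiff(state map[string]accountDiff, d diffLayer) {
	for _, addr := range d.Destructs {
		delete(state, addr)
	}
	for addr, acc := range d.Accounts {
		state[addr] = acc
	}
}

func main() {
	state := map[string]accountDiff{
		"0xaa": {Balance: 10, Nonce: 1},
		"0xbb": {Balance: 5},
	}
	applyDiff(state, diffLayer{
		BlockNumber: 1,
		Accounts:    map[string]accountDiff{"0xaa": {Balance: 7, Nonce: 2}},
		Destructs:   []string{"0xbb"},
	})
	fmt.Println(len(state), state["0xaa"].Balance, state["0xaa"].Nonce)
}
```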

Spec

WorkFlow

Cache Difflayer and Persist Difflayer

The size of one difflayer is around 100K bytes, so it is fine to cache 10K difflayers. This means a client can still diff sync after being down for about 8 hours.
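
The figures above can be checked with a little arithmetic. The sketch below assumes a block interval of 3 seconds, which is not stated in this description; the layer size and layer count come from the text.

```go
package main

import "fmt"

const (
	layerBytes   = 100 * 1000 // about 100K bytes per difflayer (from the text)
	cachedLayers = 10 * 1000  // 10K cached difflayers (from the text)
	blockSeconds = 3          // ASSUMPTION: one block every 3 seconds
)

// cacheBytes is the approximate memory needed to hold all cached difflayers.
func cacheBytes() int { return layerBytes * cachedLayers }

// coverageHours is how much chain time the cached difflayers span.
func coverageHours() float64 {
	return float64(cachedLayers*blockSeconds) / 3600
}

func main() {
	fmt.Printf("cache size: about %d MB\n", cacheBytes()/1_000_000)
	fmt.Printf("downtime covered: about %.1f hours\n", coverageHours())
}
```

Under these assumptions the cache costs roughly 1 GB and covers a little over 8 hours of blocks.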

We have to persist difflayers to a new database so that a node can still get them even after a long downtime.

Fallback to Full Sync && Switch to Diff Sync

When to fall back to full sync:

The difflayer cannot be received in time.
No peer has that difflayer.
Randomly switch to full sync (every 21 blocks) to ensure security.

When to switch to diff sync:

Periodically probe the network to check whether any peer has the latest difflayer.
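
The fallback conditions above can be sketched as a single decision function. The names `diffStatus` and `useFullSync` are hypothetical, and "randomly, every 21 blocks" is interpreted here as a 1-in-21 random draw per block, which is an assumption about the intended behavior.

```go
package main

import (
	"fmt"
	"math/rand"
)

// diffStatus is a hypothetical summary of what the node knows about the next difflayer.
type diffStatus struct {
	ReceivedInTime bool // the difflayer arrived before the deadline
	PeerHasLayer   bool // at least one peer can serve the difflayer
}

// useFullSync reports whether the next block should be processed by full sync.
// draw is a random number in [0, 21); a value of 0 triggers a full-sync spot check.
func useFullSync(s diffStatus, draw int) bool {
	if !s.ReceivedInTime || !s.PeerHasLayer {
		return true // no usable difflayer, so execute the block
	}
	return draw == 0 // ASSUMPTION: spot check on average once per 21 blocks
}

func main() {
	healthy := diffStatus{ReceivedInTime: true, PeerHasLayer: true}
	fmt.Println(useFullSync(healthy, rand.Intn(21)))
	fmt.Println(useFullSync(diffStatus{ReceivedInTime: false, PeerHasLayer: true}, 5))
}
```

Passing the random draw in as a parameter keeps the decision itself deterministic and easy to test.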

Delay of Diff Sync

Verifying a difflayer takes time, so we adopt an optimistic strategy for broadcasting difflayers. A node caches two kinds of difflayers: 1. Trusted difflayers, which have been verified; 2. Untrusted difflayers, which were received from other peers. A node always tries to respond with a trusted difflayer first, and only responds with an untrusted difflayer if no trusted one is found in the cache or the database. The client can disconnect from a peer once it receives an invalid difflayer from it. In this way, we eliminate the delay of diff sync.
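
The lookup order described above might look like the following sketch. The `diffStore` type, its fields, and `lookup` are hypothetical; plain maps stand in for the real cache and database.

```go
package main

import "fmt"

// diffStore is a hypothetical holder for the three places a difflayer may live.
type diffStore struct {
	trustedCache map[string][]byte // verified difflayers kept in memory
	database     map[string][]byte // verified difflayers persisted to disk
	untrusted    map[string][]byte // unverified difflayers received from peers
}

// lookup returns the difflayer for blockHash and whether it is trusted.
// Trusted sources are consulted first; untrusted data is the last resort.
func (s *diffStore) lookup(blockHash string) (layer []byte, trusted, found bool) {
	if l, ok := s.trustedCache[blockHash]; ok {
		return l, true, true
	}
	if l, ok := s.database[blockHash]; ok {
		return l, true, true
	}
	if l, ok := s.untrusted[blockHash]; ok {
		return l, false, true
	}
	return nil, false, false
}

func main() {
	s := &diffStore{
		trustedCache: map[string][]byte{"0x01": []byte("verified")},
		database:     map[string][]byte{},
		untrusted:    map[string][]byte{"0x01": []byte("unverified"), "0x02": []byte("unverified")},
	}
	for _, h := range []string{"0x01", "0x02", "0x03"} {
		layer, trusted, found := s.lookup(h)
		fmt.Println(h, string(layer), trusted, found)
	}
}
```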

Security

Diff sync guarantees:

Light client security.
State consistency.

It remains vulnerable to:
Validator collusion.
Forking the chain with an invalid state.

So do not enable diff sync where high security is required.

A validator can only use diff sync when it is the inturn validator for the next block. This guarantees that at most one validator is doing diff sync at any time, so the whole network remains secure.

Preflight checks

build passed (make build)
tests passed (make test)
manual transaction test passed

Already reviewed by

...

Related issues

... reference related issue #'s here ...

* ligth sync: download difflayer
  Signed-off-by: kyrie-yl <lei.y@binance.com>
* download diff layer: fix according to the comments
  Signed-off-by: kyrie-yl <lei.y@binance.com>
* download diff layer: update
  Signed-off-by: kyrie-yl <lei.y@binance.com>
* download diff layer: fix accroding comments
  Signed-off-by: kyrie-yl <lei.y@binance.com>

Co-authored-by: kyrie-yl <lei.y@binance.com>

j75689 · 2021-09-24T07:49:31Z

cmd/faucet/faucet.go

 	}

-	bep2eInfos := make(map[string]bep2eInfo, 0)
+	bep2eInfos := make(map[string]bep2eInfo)


Why not create the map with a capacity hint?
make(map[string]bep2eInfo,len(symbols))
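
A minimal sketch of the suggestion, for illustration only: the second argument to `make` for a map is a size hint that lets the runtime allocate space up front. The `bep2eInfo` field and the `buildInfos` helper below are hypothetical and do not reflect the faucet code.

```go
package main

import "fmt"

// bep2eInfo here is a hypothetical stand-in; the real struct lives in cmd/faucet.
type bep2eInfo struct {
	Contract string
}

// buildInfos pre-sizes the map because the number of entries is known in advance.
func buildInfos(symbols []string) map[string]bep2eInfo {
	infos := make(map[string]bep2eInfo, len(symbols))
	for _, s := range symbols {
		infos[s] = bep2eInfo{}
	}
	return infos
}

func main() {
	fmt.Println(len(buildInfos([]string{"BUSD", "USDT"})))
}
```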

j75689 · 2021-09-24T08:01:51Z

consensus/parlia/parlia.go

 	}
 	delay := p.delayForRamanujanFork(snap, header)
+	// The blocking time should be no more than half of period
+	if delay > time.Duration(p.config.Period)*time.Second/2 {


if halfPeriod := time.Duration(p.config.Period) * time.Second / 2; delay > halfPeriod { delay = halfPeriod }

This way the half period does not need to be calculated twice.
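
The suggested form can be tried in isolation with the sketch below. `capDelay` is a hypothetical helper; in the PR the logic sits inline in parlia.go and reads the period from `p.config.Period`.

```go
package main

import (
	"fmt"
	"time"
)

// capDelay limits delay to at most half of the block period (given in seconds).
// The half period is computed once in the if statement's init clause.
func capDelay(delay time.Duration, periodSeconds uint64) time.Duration {
	if halfPeriod := time.Duration(periodSeconds) * time.Second / 2; delay > halfPeriod {
		return halfPeriod
	}
	return delay
}

func main() {
	fmt.Println(capDelay(2*time.Second, 3)) // longer than half the period, so capped
	fmt.Println(capDelay(1*time.Second, 3)) // already short enough, so unchanged
}
```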

j75689 · 2021-09-24T08:08:42Z

consensus/parlia/parlia.go

+	idx := snap.indexOfVal(p.val)
+	// validator is not allowed to diff sync
+	return idx < 0
+


j75689 · 2021-09-24T08:15:00Z

core/blockchain.go

 	SnapshotWait:   true,
 }

+type BlockChainOption func(*BlockChain) *BlockChain


Actually, there is no need to return *BlockChain

It is true that there is no need to return it in this case. But in other cases we may wrap the object, which is an interface, and return the wrapper. It has little impact on performance; it is just personal code style. If you insist, I can remove the return values.
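
For readers comparing the two styles discussed in this thread, here is a small sketch of both. All names (`chain`, `returningOption`, `mutatingOption`, and the constructors) are hypothetical and simplified; they are not the types in core/blockchain.go.

```go
package main

import "fmt"

// chain is a hypothetical, simplified stand-in for the object being configured.
type chain struct {
	diffSync bool
}

// returningOption mirrors the style in the diff: the option returns the object,
// which leaves room for an option to hand back a wrapper instead.
type returningOption func(*chain) *chain

// mutatingOption is the style the reviewer suggests: the option only mutates.
type mutatingOption func(*chain)

func withDiffSync() returningOption {
	return func(c *chain) *chain {
		c.diffSync = true
		return c
	}
}

func enableDiffSync() mutatingOption {
	return func(c *chain) { c.diffSync = true }
}

func newChain(opts ...returningOption) *chain {
	c := &chain{}
	for _, opt := range opts {
		c = opt(c) // the result has to be reassigned in this style
	}
	return c
}

func newChainMutating(opts ...mutatingOption) *chain {
	c := &chain{}
	for _, opt := range opts {
		opt(c) // nothing to reassign
	}
	return c
}

func main() {
	fmt.Println(newChain(withDiffSync()).diffSync, newChainMutating(enableDiffSync()).diffSync)
}
```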

j75689 · 2021-09-24T08:18:06Z

core/blockchain.go

 	}
+	// do options before start any routine
+	for _, option := range options {
+		bc = option(bc)


Only option(bc) is needed here.

See the reply above.

j75689 · 2021-09-24T11:42:09Z

eth/downloader/downloader.go

 	Snapshots() *snapshot.Tree
 }

+type DownloadOption func(downloader *Downloader) *Downloader


Returning *Downloader is unnecessary.

See the reply above.

j75689 · 2021-09-24T11:46:12Z

eth/handler_diff.go

+
+// PeerInfo retrieves all known `diff` information about a peer.
+func (h *diffHandler) PeerInfo(id enode.ID) interface{} {
+	if p := h.peers.peer(id.String()); p != nil {


can be combined into one line

if p := h.peers.peer(id.String()); p != nil && p.diffExt != nil { return p.diffExt.info() }

j75689 · 2021-09-24T11:49:42Z

eth/peerset.go

+	// Otherwise wait for `diff` to connect concurrently
+	wait := make(chan *diff.Peer)
+	ps.diffWait[id] = wait
+	ps.lock.Unlock()


Move this line to :184 as defer ps.lock.Unlock().

I just followed what waitSnapExtension does, which has proven quite safe, so I don't think we have a reason to change this.
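
One general consideration with this pattern, sketched below under the assumption that the function goes on to block on the channel after unlocking (the diff does not show the following lines): a deferred unlock would keep the mutex held for the whole wait, so nobody could take the lock to deliver the value. The `registry` type and its methods are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical structure pairing waiters with late-arriving values.
type registry struct {
	lock    sync.Mutex
	ready   map[string]string
	waiters map[string]chan string
}

func newRegistry() *registry {
	return &registry{ready: map[string]string{}, waiters: map[string]chan string{}}
}

// wait returns the value for id, blocking until deliver provides it.
func (r *registry) wait(id string) string {
	r.lock.Lock()
	if v, ok := r.ready[id]; ok {
		delete(r.ready, id)
		r.lock.Unlock()
		return v
	}
	ch := make(chan string)
	r.waiters[id] = ch
	r.lock.Unlock() // release before blocking; a defer would hold the lock during the receive
	return <-ch
}

// deliver hands v to a registered waiter, or stores it for a later wait call.
func (r *registry) deliver(id, v string) {
	r.lock.Lock()
	if ch, ok := r.waiters[id]; ok {
		delete(r.waiters, id)
		r.lock.Unlock()
		ch <- v
		return
	}
	r.ready[id] = v
	r.lock.Unlock()
}

func main() {
	r := newRegistry()
	done := make(chan string)
	go func() { done <- r.wait("peer-1") }()
	r.deliver("peer-1", "diff-extension")
	fmt.Println(<-done)
}
```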

j75689 · 2021-09-24T11:52:25Z

eth/protocols/diff/handler.go

+		}
+		if fulfilled := requestTracker.Fulfil(peer.id, peer.version, FullDiffLayerMsg, res.RequestId); fulfilled {
+			return backend.Handle(peer, res)
+		} else {


this else is unnecessary

node/node.go

make diff block configable wait code write fix testcase resolve comments resolve comment

j75689 · 2021-09-27T07:47:39Z

core/state_processor.go

+		threads = runtime.NumCPU()
+	} else if threads == 0 {
+		threads = 1
+	}


This logic is written in two places, and the only difference is the parameter.
If it is just a strategy for determining the number of threads, we could factor it into a single function.
It would be easier to maintain.
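
A shared helper along these lines might look like the sketch below. The function name `workerThreads`, its parameters, and the first condition (capping at the CPU count) are assumptions, since the diff shows only the tail of the original branch.

```go
package main

import (
	"fmt"
	"runtime"
)

// workerThreads picks a thread count for a batch of tasks.
// ASSUMPTION: the elided condition in the diff caps the count at the CPU count;
// maxThreads is passed in so the helper can be tested deterministically.
func workerThreads(tasks, tasksPerThread, maxThreads int) int {
	if tasksPerThread <= 0 {
		tasksPerThread = 1 // guard against division by zero
	}
	threads := tasks / tasksPerThread
	if threads > maxThreads {
		threads = maxThreads
	} else if threads == 0 {
		threads = 1 // always use at least one thread
	}
	return threads
}

func main() {
	// Both call sites would share the helper and differ only in the task count.
	fmt.Println(workerThreads(1000, 128, runtime.NumCPU()))
	fmt.Println(workerThreads(10, 128, runtime.NumCPU()))
}
```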

unclezoro force-pushed the light_sync branch from 9dd3b43 to 41b8931 Compare August 23, 2021 03:41

kyrie-yl approved these changes Aug 24, 2021

unclezoro force-pushed the light_sync branch 9 times, most recently from 14e48ff to 0f56e8c Compare August 26, 2021 12:22

unclezoro force-pushed the light_sync branch 2 times, most recently from b05e738 to b8d6373 Compare September 1, 2021 08:56

unclezoro changed the title from "[WIP]implement block process part of light sync" to "[R4R]implement block process part of light sync" Sep 2, 2021

unclezoro changed the title from "[R4R]implement block process part of light sync" to "[R4R]implement diff sync" Sep 2, 2021

unclezoro force-pushed the light_sync branch 5 times, most recently from f8ed94b to a8891d9 Compare September 6, 2021 11:55

unclezoro and others added 10 commits September 7, 2021 08:35

implement block process part of light sync

bfce9eb

add difflayer protocol

3d8a997

handle difflayer and refine light processor

b782a7b

add testcase for diff protocol

172b26e

make it faster

c598aae

allow validator to light sync

d160091

change into diff sync

6f39765

update light sync to diff sync

66ee50d

raise the max diff limit

ca156fc

unclezoro force-pushed the light_sync branch 8 times, most recently from 80be4ed to f89fe52 Compare September 10, 2021 08:15

add test code

443f9a4

unclezoro force-pushed the light_sync branch 2 times, most recently from 09515cf to f1f4615 Compare September 10, 2021 15:43

remove extra message

1fbecfa

unclezoro force-pushed the light_sync branch from f1f4615 to 1fbecfa Compare September 10, 2021 15:54

yutianwu approved these changes Sep 13, 2021

unclezoro force-pushed the light_sync branch from 8cc1c3d to 35415e0 Compare September 24, 2021 08:05

j75689 reviewed Sep 24, 2021

unclezoro force-pushed the light_sync branch from 1fac05b to e5953e7 Compare September 26, 2021 04:12

fix testcase and lint

92e21cc

unclezoro force-pushed the light_sync branch from e5953e7 to 92e21cc Compare September 26, 2021 04:13

unclezoro changed the base branch from master to develop September 27, 2021 02:06

unclezoro added 2 commits September 27, 2021 11:21

resolve comments

85e0fd4

resolve comments

0bbbcc5

j75689 reviewed Sep 27, 2021

unclezoro added 2 commits September 27, 2021 16:55

resolve comment

0011461

fix mistake

844f2ab

j75689 approved these changes Sep 28, 2021

unclezoro merged commit 1ded097 into bnb-chain:develop Sep 28, 2021

This was referenced Oct 19, 2021

[R4R]prepare for release v1.1.3 #465

Merged

[R4R] Release v1.1.3 #460

Merged

ghost mentioned this pull request Feb 7, 2022

Add an option to skip transaction execution while on sync klaytn/klaytn#1147

Closed

5 participants
