add state sync#4645

Closed
erikgrinaker wants to merge 49 commits into master from erik/state-sync

@erikgrinaker erikgrinaker commented Apr 5, 2020

Fixes #828. Adds state sync, as outlined in ADR-053. See related PRs in Cosmos SDK (cosmos/cosmos-sdk#5803) and Gaia (cosmos/gaia#327).

  • Adds a new ABCI interface and connection for fetching and applying state snapshots. Bumps ABCIVersion to 0.17.0.

  • Adds a new P2P reactor which exchanges snapshots with peers, and bootstraps an empty local node from remote snapshots when requested. Does not bump P2PVersion since ignoring these messages is fine.

  • Adds a new configuration section [statesync] that enables state sync and configures the light client. Also enables statesync:info logging by default.

  • Integrates state sync into node startup. Does not support the v2 blockchain reactor, since it needs some reorganization to defer startup.
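
To illustrate how state sync slots into node startup, here is a minimal Go sketch of the phase ordering. It is not code from this PR: the `Phase` type, the `StartupPhases` function, and its parameters are hypothetical. It assumes that state sync only runs when it is enabled and the local node is empty (height 0), and that it hands over to fast sync and then consensus. Running without fast sync is still listed as remaining work, so the `fastSyncEnabled` flag is an assumption about how that might eventually look.

```go
package main

import "fmt"

// Phase is a hypothetical label for one stage of node startup.
type Phase string

const (
	PhaseStateSync Phase = "statesync"
	PhaseFastSync  Phase = "fastsync"
	PhaseConsensus Phase = "consensus"
)

// StartupPhases returns the order of phases a node would go through.
// Assumption: state sync is attempted only when it is enabled AND the
// node has no local state yet (localHeight == 0).
func StartupPhases(stateSyncEnabled bool, localHeight int64, fastSyncEnabled bool) []Phase {
	var phases []Phase
	if stateSyncEnabled && localHeight == 0 {
		phases = append(phases, PhaseStateSync)
	}
	if fastSyncEnabled {
		phases = append(phases, PhaseFastSync)
	}
	// Every node ends up in normal consensus operation.
	return append(phases, PhaseConsensus)
}

func main() {
	fmt.Println(StartupPhases(true, 0, true))  // empty node: state sync first
	fmt.Println(StartupPhases(true, 42, true)) // node with state: skip state sync
}
```

The point of the sketch is that an existing node with local state is never overwritten by a snapshot, even if state sync is enabled in its configuration.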

What's left?

  • Write reactor unit tests, and possibly integration tests.
  • Investigate occasional IAVL panics when restoring snapshots, generally at height 1.
  • Integrate with v1 blockchain reactor and support running without fast sync.
  • Integrate with v2 blockchain reactor. (later PR)
  • Fetch verified consensus parameters via RPC (needs lite2: verify ConsensusHash in rpc client #4693).
  • Check whether current light client verification is sufficient.
  • Bump the ABCI version to 0.17.0.
  • Add metrics.
  • Write documentation. (later PR)
  • Run test nets, especially with large states (>100 MB).

How to try it out

There's a draft Gaia PR (cosmos/gaia#327) that includes modifications to run a local state sync test net.

First, fetch the necessary branches:

$ mkdir statesync && cd statesync
$ git clone --branch erik/snapshot git@github.com:cosmos/cosmos-sdk
$ git clone --branch erik/snapshot git@github.com:cosmos/gaia
$ git clone --branch erik/state-sync git@github.com:tendermint/tendermint

Next, build and run a Gaia testnet. This starts 3 nodes in a 4-node cluster, taking state snapshots at every single height.

$ cd gaia
$ make build-linux && make build-docker-gaiadnode
$ rm -rf build/node*; make localnet-start

Let the network create a few blocks and snapshots, then (in a separate terminal) run the following script which configures node3 with state sync and starts the node:

$ brew install gnu-sed
$ ./tools/sync.sh

The node will first spend 10 seconds discovering snapshots, then restore the most recent snapshot it found before joining normal consensus operation.
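
The selection step described above (collect snapshots during discovery, then restore the most recent one) can be sketched roughly as follows. This is an illustration only: the `Snapshot` type, its fields, and `MostRecent` are hypothetical names, and "most recent" is assumed to mean the greatest height.

```go
package main

import "fmt"

// Snapshot is a hypothetical, simplified snapshot advertisement from a peer.
type Snapshot struct {
	Height uint64
	Peer   string
}

// MostRecent returns the snapshot with the greatest height among those
// discovered. The second return value is false when nothing was discovered.
func MostRecent(snapshots []Snapshot) (Snapshot, bool) {
	if len(snapshots) == 0 {
		return Snapshot{}, false
	}
	best := snapshots[0]
	for _, s := range snapshots[1:] {
		if s.Height > best.Height {
			best = s
		}
	}
	return best, true
}

func main() {
	// Pretend these arrived from peers during the discovery window.
	discovered := []Snapshot{
		{Height: 5, Peer: "node0"},
		{Height: 9, Peer: "node1"},
		{Height: 7, Peer: "node2"},
	}
	if best, ok := MostRecent(discovered); ok {
		fmt.Printf("restoring snapshot at height %d from %s\n", best.Height, best.Peer)
	}
}
```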


For contributor use:

  • Wrote tests
  • Updated CHANGELOG_PENDING.md
  • Linked to GitHub issue with discussion and accepted design OR link to spec that describes this work.
  • Updated relevant documentation (docs/) and code comments
  • Re-reviewed Files changed in the GitHub PR explorer

@melekes melekes left a comment

@erikgrinaker erikgrinaker marked this pull request as draft April 14, 2020 11:50
codecov-io commented Apr 16, 2020

Codecov Report

Merging #4645 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #4645   +/-   ##
=======================================
  Coverage   65.77%   65.77%           
=======================================
  Files         255      255           
  Lines       23669    23669           
=======================================
  Hits        15569    15569           
  Misses       6841     6841           
  Partials     1259     1259           

@erikgrinaker

Getting some hairy merge conflicts with master. I'll split this up a bit and submit fresh PRs.

@erikgrinaker erikgrinaker deleted the erik/state-sync branch April 20, 2020 10:57
erikgrinaker added a commit that referenced this pull request Apr 29, 2020
Adds the ABCI interface for [state sync](#828) as outlined in [ADR-053](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-053-state-sync-prototype.md), and bumps ABCIVersion to `0.17.0`.

The interface adds a new ABCI connection which Tendermint can use to query and load snapshots from the app (for serving snapshots to other nodes), and to offer and apply snapshots to the app (for state syncing a local node from peers).

Split out from the original PR #4645; the state sync reactor will be submitted as a separate PR. The interface is implemented by the Cosmos SDK in cosmos/cosmos-sdk#5803.
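
As a rough illustration of the four operations described above (query and load on the serving side, offer and apply on the restoring side), here is a toy in-memory sketch in Go. It is not the ABCI API: the type names, method names, and signatures are hypothetical simplifications, and real snapshots would carry more metadata and verification than shown here.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// Snapshot is a hypothetical, simplified snapshot descriptor.
type Snapshot struct {
	Height uint64
	Chunks uint32
}

// App is a toy application that can both serve and restore snapshots.
type App struct {
	State     []byte
	chunks    map[uint64][][]byte // serving side: chunks keyed by height
	restoring *Snapshot           // restoring side: snapshot in progress
	received  [][]byte            // restoring side: chunks applied so far
}

// ListSnapshots describes the snapshots this app can serve to other nodes.
func (a *App) ListSnapshots() []Snapshot {
	var out []Snapshot
	for height, chunks := range a.chunks {
		out = append(out, Snapshot{Height: height, Chunks: uint32(len(chunks))})
	}
	return out
}

// LoadSnapshotChunk returns one chunk of a stored snapshot.
func (a *App) LoadSnapshotChunk(height uint64, index uint32) ([]byte, error) {
	chunks, ok := a.chunks[height]
	if !ok || int(index) >= len(chunks) {
		return nil, errors.New("unknown snapshot or chunk")
	}
	return chunks[index], nil
}

// OfferSnapshot begins a restore. It rejects the offer if the app has state.
func (a *App) OfferSnapshot(s Snapshot) error {
	if len(a.State) > 0 {
		return errors.New("refusing snapshot: node is not empty")
	}
	a.restoring, a.received = &s, nil
	return nil
}

// ApplySnapshotChunk appends the next chunk, in order. It returns true
// once all chunks have arrived and the state has been restored.
func (a *App) ApplySnapshotChunk(chunk []byte) (bool, error) {
	if a.restoring == nil {
		return false, errors.New("no snapshot offered")
	}
	a.received = append(a.received, chunk)
	if uint32(len(a.received)) < a.restoring.Chunks {
		return false, nil
	}
	a.State, a.restoring, a.received = bytes.Join(a.received, nil), nil, nil
	return true, nil
}

// Sync plays the role of Tendermint: it moves a snapshot from source to target.
func Sync(source, target *App) error {
	snapshots := source.ListSnapshots()
	if len(snapshots) == 0 {
		return errors.New("source has no snapshots")
	}
	s := snapshots[0]
	if err := target.OfferSnapshot(s); err != nil {
		return err
	}
	for i := uint32(0); i < s.Chunks; i++ {
		chunk, err := source.LoadSnapshotChunk(s.Height, i)
		if err != nil {
			return err
		}
		if _, err := target.ApplySnapshotChunk(chunk); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	source := &App{chunks: map[uint64][][]byte{
		3: {[]byte("hello "), []byte("state "), []byte("sync")},
	}}
	target := &App{}
	if err := Sync(source, target); err != nil {
		panic(err)
	}
	fmt.Printf("restored %q\n", target.State)
}
```

In this sketch the same `App` type handles both directions, mirroring the idea that one connection is used both for serving snapshots to peers and for restoring a local node.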
erikgrinaker added a commit that referenced this pull request Apr 29, 2020
Fixes #828. Adds state sync, as outlined in [ADR-053](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-053-state-sync-prototype.md). See related PRs in Cosmos SDK (cosmos/cosmos-sdk#5803) and Gaia (cosmos/gaia#327).

This is split out of the previous PR #4645, and branched off of the ABCI interface in #4704. 

* Adds a new P2P reactor which exchanges snapshots with peers, and bootstraps an empty local node from remote snapshots when requested.

* Adds a new configuration section `[statesync]` that enables state sync and configures the light client. Also enables `statesync:info` logging by default.

* Integrates state sync into node startup. Does not support the v2 blockchain reactor, since it needs some reorganization to defer startup.
tac0turtle pushed a commit that referenced this pull request Apr 29, 2020
tac0turtle pushed a commit that referenced this pull request Apr 29, 2020

Development

Successfully merging this pull request may close these issues.

sync: Sync current state without full replay for Applications