-
Notifications
You must be signed in to change notification settings - Fork 780
Description
Feature Request
Summary
Currently the BlockSync logic is triple verifying all validator signatures, which has real costs to block syncing throughput. Looking at these IAVL v2 benchmarks, which synced through 19891 mainnet Osmosis blocks (approximately 33 hours of mainnet blocks), 12 minutes were spent on verifying block signatures. (In IAVL v0, this was out of 5.8 CPU hours, in IAVL v2 its out of 2 CPU hours, 1.2 user hours). Note that in those benchmarks, the other 2 hours are getting optimized down significantly, so getting the Tendermint side improved becomes more and more pressing for sync rate.
The pool routine is verifying this three times:
- First here, using batch verification: https://github.com/cometbft/cometbft/blob/c88d5512350f6671c92051cf5ebbbe9841f644cb/internal/blocksync/reactor.go#L463C49-L464 . This takes 1.8 minutes, due to using the batch verification optimization which has been deemed safe for blocksync usage.
- Secondly here, within ValidateBlock: This takes 5.1 minutes due to verifying every signature independently. Theres no reason this should be verifying signatures given the context above.
cometbft/internal/blocksync/reactor.go
Line 468 in c88d551
err = bcR.blockExec.ValidateBlock(state, first) - Thirdly here, within ApplyBlock, calling ValidateBlock. ApplyBlock Line, ValidateBlock Line. This again takes 5.1 minutes, as its the same exact ValidateBlock function
Problem Definition
After new IAVL improvements, Tendermint's triple verification of signatures is costing 10% of the time it takes to sync Osmosis blocks, and this percentage is only going up by the day as other parts of the system are being optimized. We should take the very simple win at hand here to speed this up.
(16% of the time in the blocking threads)
Proposal
The first ValidateBlock inside PoolRoutine can very quickly be sped up, by making a new function that does block validation sans signature checks. That easily shaves off 5.1 minutes here.
The second ValidateBlock inside ApplyBlock probably requires some API rethinking to avoid (as its correctness is already implied by the batch verification), but is worth doing as a second thing.