Proposer-based timestamp experimental values #7202

@williambanfield

Proposer-based timestamps, as described in ADR-71, call for the addition of three new consensus parameters: ACCURACY, PRECISION, and MSGDELAY. We currently need a way to measure these experimentally so that we can provide conservative estimates to operators of Tendermint projects.
Because Tendermint is run in such heterogeneous network environments and the protocol allows a multi-hop message delivery scheme, these values cannot be determined exactly. If, for example, all validators were directly connected to each other, MSGDELAY could more easily be calculated using the maximum round-trip time between some pair of validators.

Additionally, we would like to make it easy for networks to determine meaningful and conservative values for these parameters. It should not take advanced tooling for a network to derive the values.

Measuring PRECISION

How to measure

PRECISION can be experimentally determined by calculating the maximum difference of Precommit message timestamps in a 2/3+ quorum of Precommits that were issued in the same height/round.

Why?

Precommits were chosen for this calculation over Prevotes because Prevotes are issued after a validator receives a block and validates it. This means that Prevote timestamps will be strongly influenced by network speeds, since blocks are larger and are spread across multiple p2p messages.

Validators do not issue Precommits until they receive a 2/3+ quorum of Prevotes from other validators. We can therefore assume that the time at which validators receive the quorum of Prevotes varies less. This assumption can be made because all validators must wait for the slowest validator needed to complete the quorum of Prevotes to download the block and issue its Prevote. This is, of course, somewhat imperfect because validators do not receive the quorum of Prevotes at the same time due to network timing differences; however, it will provide a conservative approximation of the value.

Collecting the data

This can easily be added as a Prometheus metric to the consensus reactor. We can calculate the value when a block is produced. Collecting the max-difference across a few hundred blocks should be sufficient to estimate a steady-state value for PRECISION.

Measuring MSGDELAY

How to measure

Message delay can be measured as the maximum difference between the Proposal message timestamp and the timestamp of a Prevote message in a round where a quorum of Prevotes is seen and the network had not already gossiped the block.

Why?

Validators issue a Prevote message as soon as they receive and validate a complete block. Therefore, the difference between the Proposal message timestamp and a Prevote timestamp corresponds to the delay in delivering the block to that validator. The greatest difference between the Proposal timestamp and any Prevote's timestamp is then the message delay of the entire network. This includes some amount of PRECISION difference as well but will give a conservative value for MSGDELAY.

This should be measured in a round where a quorum is seen so that the calculated MSGDELAY value is large enough to allow for +2/3 of the network to receive the block and mark it as timely.

Collecting the data

This can be added as a Prometheus metric to the consensus reactor. We can calculate the value when the consensus reactor sees +2/3 Prevotes for a round. Since many large networks produce blocks within a single round, as a simplification we can perform this calculation only on blocks that were produced in round 0, which avoids having to keep track of which blocks have already been seen on the network.

Measuring ACCURACY

How to measure

ACCURACY is a somewhat theoretical parameter and cannot quite be measured concretely. Our implementation assumes validators are coordinated by NTP, so to check accuracy we will measure the clock drift reported by an NTP client (for example, sntp) on several popular cloud providers.

For my local machine, this can be performed as follows:

$ sntp  pool.ntp.org
+0.279157 +/- 0.001801 pool.ntp.org 213.192.54.227

Here the value following the + indicates the delay behind 'real' time from the perspective of NTP.

Projects

Status

Done
