Introduction
One of the core Neo consensus characteristics is its target inter-block time distance (SecondsPerBlock/MillisecondsPerBlock). In general we want every block to be offset from the previous one by exactly the time span configured. But even if we're not considering view changes, usually it's not the case for a real network. There are various sources for additional delays, but we want to concentrate on one particular aspect here --- block processing.
How it works now
DBFT's InitializeConsensus() sets the timer using configured value adjusted by block_received_time. block_received_time value is set in OnPersistCompleted, so it happens after the block processing and not a lot of things happen between block_receive_time setting and ChangeTimer() invocation. I would even suggest that removing this adjustment won't affect the node in any noticeable way (backups are not adjusted, just to mention).
Related work
#626 tried to use in-block timestamps as a reference, but as there are no guarantees over nodes clocks synchronization they're not really reliable, we can only use local clocks for any decisions.
#819 approached this problem very aggressively by moving block_receive_time right after InitializeConsensus(), but this created even more problems (any view change makes this timestamp waaaay off immediately, for example).
Block processing and consensus
One thing we've noticed during our testing is that under load block processing time increases up to the level where it can exceed the target time between blocks. It's not very surprising, but what's more interesting is that current consensus logic doesn't respond to that in any way. If we're using 1 second as an interval and the block takes 3 seconds to persist, the timer is set for 1 second anyway, even though it would make more sense to start creating the next block immediately. Even under moderate load block processing time can be big enough to affect the time interval between blocks.
The proposal
We can move block_received_time setting to the earliest point in time where we definitely know that the new block exists, that is in the CheckCommits(). Subsequent consensus initialization via OnPersistCompleted() will use that time as a reference the same way it currently does making an adjustment to configured timeout value.
One additional check needs to be added though and that is the height check, because OnPersistCompleted() can also be called without the node ever going through CheckCommits(). So to avoid using wrong timestamp we also save the height of the block created with CheckCommits().
I think this change is safe, backwards compatible and applicable to both Neo versions. It doesn't give us perfect block time, but it makes us a little closer to that by accounting for one particular source of delays. The more loaded the network, the more visible will be the effect of this patch.
It has some relation to TPS performance characteristics. For the case when block processing time doesn't exceed the target interval it's not noticeable, but it makes the node produce a little more of smaller blocks. For the case when block processing time exceeds the target interval it improves TPS a little by making the node work more instead of waiting for the timer.
Potential future improvements
Theoretically we can move block_receive_time assignment to PrepareRequest handler to try eliminating most of network delays associated with PrepareResponse and Commit collection, but that's more invasive and dangerous, thus requerying more thorough testing under different conditions (neo-resilience might probably help here).
Neo Version
Where in the software does this update applies to?
Introduction
One of the core Neo consensus characteristics is its target inter-block time distance (SecondsPerBlock/MillisecondsPerBlock). In general we want every block to be offset from the previous one by exactly the time span configured. But even if we're not considering view changes, usually it's not the case for a real network. There are various sources for additional delays, but we want to concentrate on one particular aspect here --- block processing.
How it works now
DBFT's
InitializeConsensus()sets the timer using configured value adjusted byblock_received_time.block_received_timevalue is set inOnPersistCompleted, so it happens after the block processing and not a lot of things happen betweenblock_receive_timesetting andChangeTimer()invocation. I would even suggest that removing this adjustment won't affect the node in any noticeable way (backups are not adjusted, just to mention).Related work
#626 tried to use in-block timestamps as a reference, but as there are no guarantees over nodes clocks synchronization they're not really reliable, we can only use local clocks for any decisions.
#819 approached this problem very aggressively by moving
block_receive_timeright afterInitializeConsensus(), but this created even more problems (any view change makes this timestamp waaaay off immediately, for example).Block processing and consensus
One thing we've noticed during our testing is that under load block processing time increases up to the level where it can exceed the target time between blocks. It's not very surprising, but what's more interesting is that current consensus logic doesn't respond to that in any way. If we're using 1 second as an interval and the block takes 3 seconds to persist, the timer is set for 1 second anyway, even though it would make more sense to start creating the next block immediately. Even under moderate load block processing time can be big enough to affect the time interval between blocks.
The proposal
We can move
block_received_timesetting to the earliest point in time where we definitely know that the new block exists, that is in theCheckCommits(). Subsequent consensus initialization viaOnPersistCompleted()will use that time as a reference the same way it currently does making an adjustment to configured timeout value.One additional check needs to be added though and that is the height check, because
OnPersistCompleted()can also be called without the node ever going throughCheckCommits(). So to avoid using wrong timestamp we also save the height of the block created withCheckCommits().I think this change is safe, backwards compatible and applicable to both Neo versions. It doesn't give us perfect block time, but it makes us a little closer to that by accounting for one particular source of delays. The more loaded the network, the more visible will be the effect of this patch.
It has some relation to TPS performance characteristics. For the case when block processing time doesn't exceed the target interval it's not noticeable, but it makes the node produce a little more of smaller blocks. For the case when block processing time exceeds the target interval it improves TPS a little by making the node work more instead of waiting for the timer.
Potential future improvements
Theoretically we can move
block_receive_timeassignment to PrepareRequest handler to try eliminating most of network delays associated with PrepareResponse and Commit collection, but that's more invasive and dangerous, thus requerying more thorough testing under different conditions (neo-resilience might probably help here).Neo Version
Where in the software does this update applies to?