Description
In a post-merge beacon chain, a CL (consensus layer/eth2) node will need to call two functions in order to prepare a block:
- engine_preparePayload: returns a payloadId.
- engine_getPayload: accepts a payloadId.
The ultimate goal of these two calls is to return an ExecutionPayload, which is effectively an execution (eth1) block to be included in a consensus (eth2) block.
The reason there are separate preparePayload and getPayload calls is to allow the CL nodes to be able to give the EL (execution layer/eth1) nodes some time to prepare the payload (i.e., find the best set of transactions it can). To this end, in the ideal case we should call preparePayload some time before we call getPayload.
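To make the two-step flow concrete, here is a minimal sketch using a toy in-memory stand-in for the EL. The class ToyExecutionLayer and its methods are hypothetical illustrations, not the real Engine API; they only show why calling preparePayload early gives the EL time to improve the payload.

```python
import itertools


class ToyExecutionLayer:
    """Hypothetical in-memory stand-in for an EL node (not the real Engine API)."""

    def __init__(self):
        self._next_id = itertools.count()
        self._jobs = {}

    def prepare_payload(self, parent_hash):
        # Start "building" a payload on top of parent_hash and return its id at once.
        payload_id = next(self._next_id)
        self._jobs[payload_id] = {"parent_hash": parent_hash, "transactions": []}
        return payload_id

    def add_transaction(self, tx):
        # Simulates the EL improving every in-progress payload as time passes.
        for job in self._jobs.values():
            job["transactions"].append(tx)

    def get_payload(self, payload_id):
        # Hand back whatever has been built so far and forget the job.
        return self._jobs.pop(payload_id)


el = ToyExecutionLayer()
payload_id = el.prepare_payload(parent_hash="0xabc")  # called early
el.add_transaction("tx1")  # time passes, the EL gathers transactions
el.add_transaction("tx2")
payload = el.get_payload(payload_id)  # called when the block is due
```

The longer the gap between the two calls, the more transactions the toy EL has had a chance to include.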
The purpose of this issue is to establish when the CL nodes should call preparePayload and to consider the engineering requirements for CL implementations (e.g., Lighthouse).
When to call preparePayload
Let's start with three basic constraints about when and how to call preparePayload:
- preparePayload only needs to be called if we expect to call getPayload during some slot s.
  - I.e., only call preparePayload if a beacon node (BN) expects to propose a block in slot s.
- Since preparePayload accepts a parentHash, we can only call it after we know the parent of the block at slot s.
  - I.e., preparePayload needs to be called sometime during slot s - 1.
- preparePayload parameters are determined by what we expect to be the canonical head block at the start of slot s.
Given these constraints, we could say that preparePayload should be called whenever the canonical head changes during slot s - 1.
But alas, there is an edge case. What if the node never receives a block at slot s - 1 (i.e., s - 1 is a "skip slot")? The head could remain unchanged (e.g., the block at slot s - 2) and therefore we'd never call preparePayload.
In light of skip slots, it seems we may need to decide at some point during slot s - 1 that we're probably not going to get a block and that we should call preparePayload with the current head (e.g., the block at s - 2). This point would be the threshold at which we assume there is a skip slot, so let's call it assumed_skip_slot_threshold.
We can now form a general definition of when to call preparePayload:
General definition
If a CL node expects to propose a block at slot s, then it should call preparePayload with values computed from the canonical head whenever the following events occur during slot s - 1:
1. The canonical head changes.
2. The assumed_skip_slot_threshold is reached, and the first condition (1) has not already been triggered.
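The two trigger conditions can be sketched as a small state tracker for a single slot s - 1. This is only an illustration under assumptions: the class PrepareTrigger, its method names, and the idea of recording calls in a list are all hypothetical, and a real client would issue the EL request instead of appending to a list.

```python
from dataclasses import dataclass, field


@dataclass
class PrepareTrigger:
    """Tracks the two trigger conditions for one slot s - 1 (hypothetical helper)."""

    threshold_secs: float  # assumed_skip_slot_threshold, measured from slot start
    head_changed: bool = False
    threshold_fired: bool = False
    calls: list = field(default_factory=list)  # head roots we "called" preparePayload with

    def on_head_change(self, head_root):
        # Condition 1: every head change during slot s - 1 triggers a call.
        self.head_changed = True
        self.calls.append(head_root)

    def on_tick(self, secs_into_slot, current_head_root):
        # Condition 2: fire once at the threshold, but only if condition 1 has not fired.
        if (
            secs_into_slot >= self.threshold_secs
            and not self.head_changed
            and not self.threshold_fired
        ):
            self.threshold_fired = True
            self.calls.append(current_head_root)
```

In the skip-slot case no head change arrives, so the only call is the one made at the threshold with the block at s - 2 as head. If a block does arrive before the threshold, the threshold tick does nothing.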
The nitty gritty of implementation
Proposer shuffling
Our previous definition makes the assumption that we always know the proposers for slot s at slot s - 1. This is not strictly true. The proposer shuffling for epoch e can only be known after the final block in epoch e - 1 is processed.
This means that if slot s - 1 is the last slot of its epoch (i.e., s % SLOTS_PER_EPOCH == 0), we won't know what the proposer shuffling for slot s is until we either (a) receive a block at slot s - 1 or (b) hit assumed_skip_slot_threshold and assume that there is no block at s - 1.
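The epoch-boundary constraint can be expressed as a small check, written here in terms of the slot we are currently in (so the proposal slot is current_slot + 1). This is a sketch under assumptions: the function names and the two boolean inputs are hypothetical, and SLOTS_PER_EPOCH uses the mainnet value of 32.

```python
SLOTS_PER_EPOCH = 32  # mainnet preset value


def is_last_slot_of_epoch(slot):
    return (slot + 1) % SLOTS_PER_EPOCH == 0


def next_slot_proposer_known(current_slot, seen_block_in_current_slot, past_threshold):
    """Can the proposer for current_slot + 1 be determined yet?"""
    if not is_last_slot_of_epoch(current_slot):
        # The next slot is in the current epoch, whose shuffling is already known.
        return True
    # The next slot opens a new epoch: wait for (a) the block or (b) the skip assumption.
    return seen_block_in_current_slot or past_threshold
```

For example, in slot 31 with no block seen and the threshold not yet reached, the proposer for slot 32 cannot be determined yet.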
With this in mind, we can create a more implementation-specific definition that is aware of proposer-shuffling constraints:
Proposer-shuffling aware definition
If the CL node is performing duties for any active validators, then it should run the maybe_prepare_payload routine whenever:
1. The canonical head changes.
2. The assumed_skip_slot_threshold is reached, and the first condition (1) has not already been triggered.
Where maybe_prepare_payload involves:
- Taking the canonical head block and running process_slots to advance it to slot s.
- Determining if the CL node is performing duties for the block proposer at slot s. If so, continue, else exit.
- Computing the values for preparePayload and issuing the request to the EL node.
Note: maybe_prepare_payload can be optimized in the non-epoch-boundary scenario to avoid calling process_slots, but this definition aims to be simple and general.
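The three steps of maybe_prepare_payload can be sketched as follows. Everything beyond the step order is an assumption made for illustration: the state is a toy dict, and the three callables (process_slots, get_proposer_index, prepare_payload) are hypothetical stand-ins for the client's state transition, proposer lookup, and EL request.

```python
def maybe_prepare_payload(head_state, slot, managed_validators,
                          process_slots, get_proposer_index, prepare_payload):
    """Sketch of the routine; the three callables are hypothetical stand-ins."""
    # Step 1: advance the head state to slot s so the proposer can be computed.
    state = process_slots(head_state, slot)
    # Step 2: exit unless one of our validators is the proposer at slot s.
    if get_proposer_index(state) not in managed_validators:
        return None
    # Step 3: compute the request values and send them to the EL node.
    return prepare_payload(parent_hash=state["execution_block_hash"])


# Toy usage: validator 7 proposes at slot 10 and is managed by this node.
toy_state = {"slot": 9, "proposers": {10: 7}, "execution_block_hash": "0xabc"}
sent = []


def toy_process_slots(state, slot):
    return {**state, "slot": slot}


def toy_proposer_index(state):
    return state["proposers"][state["slot"]]


def toy_prepare_payload(parent_hash):
    sent.append(parent_hash)
    return "payload-id-0"


result = maybe_prepare_payload(toy_state, 10, {7}, toy_process_slots,
                               toy_proposer_index, toy_prepare_payload)
```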
Is the VC or BN driving?
You may notice that I've used "CL node" instead of referring to the duties of a beacon node (BN) or validator client (VC). That's because it's not immediately clear whether the BN or VC should be the one driving this series of events.
VC driving
In the "VC driving" scenario, the BN has no idea about which validators may produce blocks at slot s. It is up to the VC to ensure that the BN issues a relevant preparePayload request at the correct time(s). The "VC driving" process looks like this:
If the VC is performing duties for any active validators, then it should run the maybe_prepare_payload routine whenever:
1. The canonical head changes (i.e., it receives a head SSE event).
2. The assumed_skip_slot_threshold is reached, and the first condition (1) has not already been triggered.
Where maybe_prepare_payload involves:
- Determining the proposer duties for slot s.
  - It may have these cached, or it may need to use the BN's duties/proposer endpoint.
- Determining if the VC is performing duties for the block proposer at slot s. If so, continue, else exit.
- Issuing a request to the BN API which, in turn, makes it issue a preparePayload request to the EL node.
  - Such a BN API does not yet exist, but let's call it validator/prepare_payload for the time being.
The definition of validator/prepare_payload requires some thought too. I propose it should take (slot, head_block_root) as parameters and return nothing. It will be the duty of the BN to hold the payloadId and provide it during a getPayload request. For the input parameters, slot is the slot in which the VC expects to propose a block (i.e., s) and head_block_root will be the head block root at the time of the call (i.e., the root of the expected parent of the beacon block it expects to propose at s).
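A minimal sketch of the BN-side bookkeeping this proposal implies is shown below. The class BeaconNodePayloadCache, its method names, and the choice to key stored ids by (slot, head_block_root) are assumptions for illustration; el_prepare_payload is a hypothetical stand-in for the request to the EL node.

```python
class BeaconNodePayloadCache:
    """Hypothetical BN-side bookkeeping for the proposed validator/prepare_payload call."""

    def __init__(self, el_prepare_payload):
        self._el_prepare_payload = el_prepare_payload  # stand-in for the EL request
        self._payload_ids = {}

    def prepare_payload(self, slot, head_block_root):
        # Takes (slot, head_block_root) and returns nothing, as proposed above.
        payload_id = self._el_prepare_payload(slot, head_block_root)
        self._payload_ids[(slot, head_block_root)] = payload_id

    def payload_id_for(self, slot, parent_root):
        # Looked up at block-production time to drive the getPayload request.
        return self._payload_ids.get((slot, parent_root))
```

If the head changes after the VC's request, the lookup for the new parent returns None, which signals that no payload was prepared on top of that parent.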
BN driving
In the "BN driving" scenario, the VC knows nothing of the preparePayload request. Instead, it just tells the BN which validators it is managing and the BN transparently calls preparePayload when it sees fit.
The "BN driving" process looks like this:
- The VC sends a message to the BN with the list of validator indices it controls.
  - The validator/beacon_committee_subscriptions endpoint could theoretically be repurposed to also do this.
  - Alternatively we could create a new validator/potential_beacon_proposers endpoint (naming can be improved).
  - It would probably make sense for this "subscription" to potential beacon proposers to expire after some time, since it does incur effort for the EL node and a once-and-forever subscription could end up wasteful.
- The BN follows exactly the steps described in the Proposer-shuffling aware definition.
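The expiring "subscription" idea could look roughly like the sketch below. The class ProposerSubscriptions, the epoch-based time-to-live, and the method names are all assumptions for illustration; the issue does not fix how expiry would be measured.

```python
class ProposerSubscriptions:
    """Hypothetical BN-side registry of validators that may propose, with expiry."""

    def __init__(self, ttl_epochs):
        self._ttl_epochs = ttl_epochs
        self._expires_at = {}  # validator index -> first epoch at which it is stale

    def subscribe(self, validator_indices, current_epoch):
        # Re-subscribing simply pushes the expiry further out.
        for index in validator_indices:
            self._expires_at[index] = current_epoch + self._ttl_epochs

    def active(self, current_epoch):
        # Drop stale entries, then report who the BN should still prepare for.
        self._expires_at = {
            i: e for i, e in self._expires_at.items() if e > current_epoch
        }
        return set(self._expires_at)
```

A VC that keeps running would renew its subscription periodically, while a VC that has gone away stops costing the EL node any effort once its entries expire.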
What does @paulhauner think about VC or BN driving?
At this stage, I think I prefer BN driving because it strives for simplicity in the VC (the scary secret-key-holding thing) and it also allows for more optimization inside the BN. Some clients (Lighthouse and Teku, at least) are already doing optimizations to compute the proposer duties for epoch e at the end of e - 1; these could be leveraged to make preparePayload more efficient.
Open Questions
I'm not sure what to define assumed_skip_slot_threshold as. One way to do it would be to set it at roughly the latest time at which we usually expect a beacon block. In my experience this would be somewhere between 4s and 8s after slot start. However, it would be good to know if there's a point of diminishing returns regarding the delay between preparePayload and getPayload. For example, if it never takes the EL more than 3s to build the ideal ExecutionPayload, then let's just set it to 9s (12s - 3s) after slot start.
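The arithmetic in that last example generalizes to a one-line helper. This is a sketch under the assumption that the EL's worst-case build time is known; the function name skip_slot_threshold and its parameter are hypothetical, and SECONDS_PER_SLOT uses the 12s slot duration mentioned above.

```python
SECONDS_PER_SLOT = 12  # slot duration used in the example above


def skip_slot_threshold(max_el_build_secs):
    """Latest offset into slot s - 1 that still leaves the EL its full build time."""
    return SECONDS_PER_SLOT - max_el_build_secs
```

With a 3s worst-case build time, skip_slot_threshold(3) gives the 9s figure from the example.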