op-node/rollup/sequencing: Fix temporary engine error handling#12258
op-node/rollup/sequencing: Fix temporary engine error handling#12258sebastianst merged 1 commit intodevelopfrom
Conversation
|
@sebastianst just my two cents for a potential minimally invasive fix. Add a This assumes setting |
Yep that would also have been my 2nd try. But removing the early return might just be the better fix. |
|
@sebastianst to add some data to the testing story: we've been running this change on some internal L3 testnets and it's working well |
There was a problem hiding this comment.
I think the change itself is valid, to get the sequencer unstuck, if it ever does encounter a combination of nextActionOK == false and temporary engine errors.
But I am a little puzzled why nextActionOK is false to begin with. It should only be false if the sequencer isn't able to sequence (during sync from L1, during reset of L1 sync, during extraordinary safe-head lag, or when meant to be inactive). So that may be worth investigating more, and I think we can drop nextActionOK state altogether, if we capture the above no-sequence cases individually.
Without the nextActionOK reason, I don't see how we can put together a coherent test. So I am in favor of merging this, based on what we have seen, and then refactoring it to not have a nextActionOK at all anymore.
Do schedule a next sequencer action even if the building state is empty in `onEngineTemporaryError`. This can happen if the previous successful payload cleared the building state and then when `startBuildingBlock` is entered, it hits a temp error but never modified the cleared building state, so the temp error handler would never schedule the next action. Towards #12240 #12041
c647801 to
82c60df
Compare
…eum-optimism#12258) Do schedule a next sequencer action even if the building state is empty in `onEngineTemporaryError`. This can happen if the previous successful payload cleared the building state and then when `startBuildingBlock` is entered, it hits a temp error but never modified the cleared building state, so the temp error handler would never schedule the next action. Towards ethereum-optimism#12240 ethereum-optimism#12041
Description
Do schedule a next sequencer action even if the building state is empty in
onEngineTemporaryError.This can happen if the previous successful payload cleared the building state and then when
startBuildingBlockis entered, it hits a temp error but never modified the cleared building state, so the temp error handler would never schedule the next action.Tests
Additional context
Metadata
Towards #12240 #12041 (but not necessarily fixing yet, if the current solution is wrong)