Skip to content

Light CL support in op-node#6

Closed
karlfloersch wants to merge 24 commits intodevelopfrom
feat/safe-source-l2
Closed

Light CL support in op-node#6
karlfloersch wants to merge 24 commits intodevelopfrom
feat/safe-source-l2

Conversation

@karlfloersch
Copy link
Copy Markdown
Owner

@karlfloersch karlfloersch commented Oct 8, 2025

This pull request introduces an option into the op-node which skips derivation by outsourcing safe block computation to an remote L2 RPC. We call this a "light CL."

Core logic: The remote L2 is queried to determine the safe head and then that safe head is promoted in the attached EL. If the EL does not have the Payload already, the payload is inserted and EL sync is used to catch up to the head of the chain. Derivation is fully disabled during this process in the same way it is done in EL sync to avoid conflicting safe block insertion.

@cursor
Copy link
Copy Markdown

cursor bot commented Oct 8, 2025

You have run out of free Bugbot PR reviews for this billing cycle. This will reset on November 23.

To receive reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

// The pipeline cannot move forwards when using L2 safe source (derivation is skipped).
s.Log.Debug("Rollup driver is backing off because using L2 safe source.",
"unsafe_head", s.Engine.UnsafeL2Head())
s.StepDeriver.ResetStepBackoff(s.Ctx)
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that in traditional EL sync we need to ResetStepBackoff because both safe and unsafe block progression is paused while the EL is catching up -- tell me if I'm incorrect there.

If I am correct, then we should actually not ResetStepBackoff here and instead simply return early because when SafeSource=L2 we will still want to be progressing the unsafe head at max speed. Also it's worth noting that in the case where we are doing an EL sync (as a part of normal SafeSource=L2 functioning) we will still get to make use of the original ResetStepBackoff in the above if statement if s.Engine.IsEngineSyncing().

Am I correct here in what ResetStepBackoff does?

@opsuperchain
Copy link
Copy Markdown
Collaborator

Review Comment Responses

Comment 1 (driver.go indentation):
Fixed in 0fb7c96 - aligned all fields in the syncDeriver struct initialization with consistent tabs.

Comment 2 (sync_deriver.go ResetStepBackoff):
You're absolutely correct! I've removed the ResetStepBackoff call in 0fb7c96. Your understanding is spot on:

  • In traditional EL sync, ResetStepBackoff is needed because both safe and unsafe progression is paused
  • For SafeSource=L2, we want unsafe head to progress at max speed
  • The IsEngineSyncing() check above already handles the backoff reset when EL sync is occurring

The updated code now just returns early without calling ResetStepBackoff, allowing the unsafe head to progress at maximum speed while safe head is sourced from the remote L2.

@opsuperchain
Copy link
Copy Markdown
Collaborator

Update on Comment 2 (ResetStepBackoff)

After further discussion, I was initially confused about what ResetStepBackoff does. You're absolutely right that we need it here!

Updated in 56bddd4:

  • Added ResetStepBackoff back to the safe-source=l2 block
  • ResetStepBackoff clears the failure counter, allowing steps to proceed immediately without exponential backoff delay
  • We need this to keep calling TryUpdateEngine() frequently to process unsafe blocks
  • The backoff mechanism still protects against actual errors because L1TemporaryErrorEvent and EngineTemporaryErrorEvent handlers call RequestStep(resetBackoff=false), which preserves backoff

Sorry for the confusion - the name "ResetStepBackoff" made me think it was resetting the backoff timer, when it actually removes backoff delays entirely!


func NewSingleChainMultiNodeWithSafeSourceL2(t devtest.T) *SingleChainMultiNodeWithSafeSourceL2 {
preset := NewSingleChainMultiNodeWithSafeSourceL2WithoutCheck(t)
// Ensure the follower node is in sync with the sequencer before starting tests
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer L2 sourced node more than follower node because if you think about it all L2 nodes are follower nodes of L1 nodes


// If we detected a divergence, trigger reorg by setting unsafe head
if needsReorg {
e.SetUnsafeHead(remoteRef)
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we've reorged, isn't it risky to SetUnsafeHead without also setting the safe head?

}

cancel()
}
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we be able to make this ticker more similar to the altSync ticker by putting this logic inside of a function?

cpy := *s
return &cpy
}

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot of LOC for something that seems really easy... Should we just make SafeSource a string so that we don't need all this type boilerplate?

}

// Only update if both succeeded
if safeErr == nil && finalizedErr == nil {
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check if we need to reorg unsafe head here?


// Fetch finalized first, then safe to maintain the invariant finalized <= safe
_, remoteFinalizedRef, finalizedErr := s.SyncDeriver.Engine.FetchAndEnsureRemoteL2BlockWithRef(ctx, eth.Finalized)
_, remoteSafeRef, safeErr := s.SyncDeriver.Engine.FetchAndEnsureRemoteL2BlockWithRef(ctx, eth.Safe)
Copy link
Copy Markdown
Owner Author

@karlfloersch karlfloersch Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FetchAndEnsureRemoteL2BlockWithRef - Is a better name for this FetchAndMaybeInsertPayload ? Or FetchAndInsertPayloadIfMissing ?

if finalizedErr != nil {
return finalizedErr
}
return safeErr
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this logic be above when we check for the err != nil?

Also that means we don't need to do an if statement if safeErr == nil && finalizedErr == nil {


// Note: We don't set unsafe/safe heads here. The caller (safeSourceL2Ticker) handles
// updating all heads atomically after fetching both safe and finalized blocks.
// This ensures the invariant that safe <= unsafe is maintained.
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this comment

@opsuperchain opsuperchain force-pushed the feat/safe-source-l2 branch 2 times, most recently from dae50ed to 3f2ac3c Compare October 30, 2025 19:34
@karlfloersch karlfloersch changed the title Add safe-source=L2 enabling syncing a node without doing derivation Light CL Support in op-node Oct 30, 2025
@karlfloersch karlfloersch changed the title Light CL Support in op-node Light CL support in op-node Oct 30, 2025
opsuperchain and others added 11 commits November 4, 2025 14:28
Add comprehensive implementation plan for safe-source=l2 fast follower mode:
- ~640 LOC total (200 core, 225 infrastructure, 215 tests)
- Reuses existing EngineClient infrastructure
- Skip derivation when using L2 safe source
- Query remote safe/finalized via eth_getBlockByNumber tags
- Includes sync tester test design similar to EL sync tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Implement configuration foundation for safe-source=l2 feature:
- Add SafeSource enum (L1/L2) to sync.Config
- Add --safe-source and --safe-source.l2-rpc CLI flags
- Parse and validate safe source configuration
- Validate incompatible flag combinations (e.g., EL sync + L2 source)
- Warn users when trusting remote L2 node

Changes (~90 LOC):
- op-node/rollup/sync/config.go: Add SafeSource type and config fields
- op-node/flags/flags.go: Add CLI flags
- op-node/service.go: Parse and validate config

Tests: ✅ Build passes, packages compile successfully

Part of implementing fast follower mode where op-node can trust
another L2 node's safe/finalized heads instead of deriving from L1.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Phase 2: Wire up remote L2 client initialization
- Initialize EngineClient for remote L2 when SafeSource=L2
- Add SafeSourceL2Client field to SyncDeriver and EngineController
- Pass client through driver initialization

Phase 3: Add derivation skip logic
- Skip derivation pipeline when using L2 safe source
- Similar to EL sync, but for safe-source=l2 mode

Phase 4: Query remote safe/finalized in forkchoice
- Add fetchRemoteL2BlockHash helper to query remote L2 blocks
- Modify insertUnsafePayload to use remote safe/finalized hashes
- Update ForkchoiceState with remote block hashes

Note: Full payload fetching with NewPayload will be added later.
Currently just queries block hashes from remote L2.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Improvements to Phase 4:
- Add SafeSourceL2Client interface with PayloadByHash/PayloadByLabel
- Implement fetchAndEnsureRemoteL2Block with full payload fetching
- Use L2BlockRefByNumber to check canonical chain (not just any block)
- Detect chain divergence and trigger reorg by setting unsafe head
- Call NewPayload to insert missing blocks into local EL
- Add L2BlockRefByNumber to ExecEngine interface

Key logic:
1. Query remote L2 for safe/finalized block ref
2. Check local EL by block number (canonical chain check)
3. If hash mismatch → reorg detected, set unsafe head to remote
4. If block missing → fetch payload and insert via NewPayload
5. Use remote hash in ForkchoiceUpdate

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Phase 5: Unit tests covering:

1. SafeSource config type (op-node/rollup/sync/config_test.go):
   - Test StringToSafeSource parsing (l1, l2, uppercase, invalid)
   - Test SafeSource.String() output
   - Test SafeSource.Set() method
   - Test SafeSource.Clone() method

2. Engine controller safe-source=l2 logic (op-node/rollup/engine/engine_controller_test.go):
   - Test fetchAndEnsureRemoteL2Block when block already exists locally
   - Test fetchAndEnsureRemoteL2Block with chain divergence (triggers reorg)
   - Test fetchAndEnsureRemoteL2Block with missing block (fetches via NewPayload)
   - Mock implementations for SafeSourceL2Client

All tests pass and compilation succeeds.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Fix indentation in driver.go syncDeriver struct initialization
- Remove ResetStepBackoff call in safe-source=l2 block to allow unsafe
  head to progress at max speed. EL sync backoff is already handled by
  the IsEngineSyncing() check above.
Reset backoff to keep processing steps (especially TryUpdateEngine for
unsafe blocks) frequently, similar to EL sync. The backoff mechanism
still works for actual errors via TemporaryErrorEvent handlers.
Simplified comment to match the formatting of the IsEngineSyncing block.
Removes 157 lines of tests for boilerplate String/Set/Clone methods.
The Mode type in the same file has identical methods without tests,
and other config_test.go files test actual validation logic instead.
- Extend L2CLConfig to support SafeSource and SafeSourceL2RPC fields
- Modify WithOpNode to wire SafeSource config into sync.Config
- Add DefaultSingleChainMultiNodeWithSafeSourceL2System preset
  that configures L2CLB to use L2CL as safe source
- Add basic acceptance test verifying safe head matching
- Add DefaultSimpleSystemWithSyncTesterSafeSourceL2 preset
  that configures L2CL2 to use L2CL as safe source
- Add SimpleWithSyncTesterSafeSourceL2 preset wrapper
- Add sync_tester test verifying safe head matching with sync tester
- Test verifies L2CL2 follows L2CL's safe head via RPC
  without performing derivation
opsuperchain and others added 13 commits November 4, 2025 14:28
Required by devstack test framework to initialize orchestrator.
…sing blocks

Moved safe-source=l2 fetching logic to run in tryUpdateEngineInternal before
needFCUCall check, ensuring it executes periodically regardless of block insertion
success. This fixes a chicken-and-egg problem where safe-source couldn't run after
EL reset because blocks couldn't insert without safe-source data.

Fixed error handling in fetchAndEnsureRemoteL2BlockWithRef to treat any error from
L2BlockRefByNumber as "block not found" rather than fatal. The EL returns various
error types when blocks are missing, not just ethereum.NotFound.

These changes enable safe-source=l2 to recover when the EL has missing blocks,
which is critical for the sync tester scenario and real-world data loss recovery.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Moved safe-source=l2 sync logic from tryUpdateEngineInternal to a
dedicated safeSourceL2Ticker in the driver event loop. This provides:

- Cleaner separation of concerns
- More intuitive code organization
- Ticker fires every 4 seconds regardless of chain progress
- All-or-nothing update pattern for safe and finalized heads
- Only executes when SafeSource == SafeSourceL2

Changes:
- Made FetchAndEnsureRemoteL2BlockWithRef public in engine_controller
- Removed safe-source=l2 logic from tryUpdateEngineInternal
- Added safeSourceL2Ticker to driver.go event loop
- Ticker polls remote L2 and updates both safe/finalized atomically

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Consistent with SyncStep behavior, skip safeSourceL2Ticker updates
when the execution engine is syncing to avoid unnecessary FCU calls.
Renamed test setup files to use 'safesourcel2' naming for better
clarity and consistency with --safe-source=l2 flag.
- Remove ResetStepBackoff from safe-source=l2 block to allow unsafe head
  to progress at max speed via P2P gossip
- Change "follower node" to "L2 sourced node" for better terminology
- Remove SetUnsafeHead call during reorg to maintain safe <= unsafe invariant
  (ticker handles setting all heads atomically)
- Improve logging for diverged block insertions
Fetch finalized block first, then safe block to ensure we maintain
the invariant finalized <= safe <= unsafe when inserting blocks into
the local EL during sync.
- Extract safeSourceL2Ticker logic into syncSafeHeadFromL2 function
- Simplify SafeSource from int to string type, reducing boilerplate from 53 to 31 lines
- Update flags to use hardcoded options string

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Fetch and update unsafe head from remote L2 to maintain finalized <= safe <= unsafe invariant
- Rename FetchAndEnsureRemoteL2BlockWithRef -> FetchAndInsertRemotePayloadIfMissing for clarity
- Update comments to better describe payload insertion behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…andling

Address PR review feedback:
- Rename FetchAndEnsureRemoteL2BlockWithRef to FetchAndInsertRemotePayloadIfMissing
- Remove unsafe head fetching from remote L2 (unsafe comes from P2P gossip)
- Add reorg detection: return bool indicating if local chain diverged at safe block
- Trigger unsafe head reorg when local EL has different hash at safe block number
- Simplify error handling with early returns instead of complex multi-error check

The reorg logic now correctly:
1. Inserts any missing payloads
2. Checks if local EL block hash matches remote safe block hash
3. Sets unsafe head to safe head when divergence detected (triggers reorg)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Remove NewPayload calls from safe-source=l2 mode and rely on EL sync
triggered by forkchoice updates instead.

Changes:
- Simplified syncSafeHeadFromL2 to only fetch block refs (not payloads)
- Check if safe block is on canonical chain by comparing hashes
- Set unsafe head to safe head when block is missing or diverged
- Added FetchRemoteL2BlockRef helper for lightweight block ref fetching
- Added L2BlockRefByNumber wrapper for canonical chain checks

This approach is cleaner and allows EL sync to handle payload insertion
naturally through the forkchoice update mechanism.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
…nesis

When safe-source=l2 is enabled, the startup engine reset was causing the
first ForkchoiceUpdate to go to genesis instead of the remote L2's safe/
finalized heads. This happened because:

1. Driver.Start() immediately triggered an engine reset
2. The reset called FindL2Heads() which read the local EL's current state
3. On a fresh EL, this defaulted to genesis for safe/finalized heads
4. The reset applied these genesis heads and sent FCU(genesis)
5. This caused snap sync to complete immediately at genesis

The fix removes the startup engine reset entirely. The syncSafeHeadFromL2
function, which runs on a 4-second ticker, will handle initialization by
fetching the remote safe/finalized heads and setting them as the first
forkchoice state. This ensures the first FCU uses the correct remote heads
instead of genesis.

Tested with op-acceptance-tests:
- tests/safe_source_l2: PASS
- tests/sync_tester/sync_tester_safesourcel2: PASS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
s.SyncDeriver.Engine.SetFinalizedHead(remoteFinalizedRef)
s.SyncDeriver.Engine.SetSafeHead(remoteSafeRef)
s.SyncDeriver.Engine.SetLocalSafeHead(remoteSafeRef)
s.SyncDeriver.Engine.RequestForkchoiceUpdate(ctx)
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a race condition here with the unsafe derivation SyncStep?

opsuperchain pushed a commit that referenced this pull request Mar 8, 2026
…imism#19281)

* fix(contracts): address audit findings #14, #6, #8, #13, #19

- #14: Reuse existing DelayedWETH from SystemConfig instead of deploying
  a new one in the Migrator, preventing divergence with future upgrades
- #6: Document that hardcoded game type lists in OPCMv2 and Migrator are
  intentional and must be kept in sync when new types are added
- #8: Document that migrate() does not enforce SuperchainConfig version floor
- #13: Document why migration game config validation is deliberately minimal
- #19: Document theoretical risk in AnchorStateRegistry.isGameRegistered
  when ASR proxy is replaced non-atomically

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(contracts): add cross-reference comment to GameTypes library

Add a notice to the GameTypes library reminding developers to update
the hardcoded game type lists in OPContractsManagerMigrator and
OPContractsManagerV2's _assertValidFullConfig when adding new types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(contracts): bump OPContractsManagerV2 version for rebase

Bump OPContractsManagerV2 from 7.0.9 to 7.0.10 to account for the
comment-only source change (cross-reference note added in prior commit)
that affects the bytecode metadata hash.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(contracts): bump OPContractsManagerV2 version to 7.0.11 for semver-diff CI fix

* fix(contracts): apply forge fmt and bump versions for formatting changes

forge fmt changed OPContractsManager, FaultDisputeGame, SuperFaultDisputeGame,
and several other files. Bump patch versions for the contracts with hash changes,
and regenerate semver-lock and snapshots.

- OPContractsManager: 6.0.3 -> 6.0.4
- FaultDisputeGame: 2.4.0 -> 2.4.1
- SuperFaultDisputeGame: 0.7.0 -> 0.7.1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: smartcontracts <smartcontracts@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants