build: add `Dockerfile`. by mpaulucci · Pull Request #8 · lambdaclass/ethrex

mpaulucci · 2024-06-05T15:34:44Z

Description

- Added Dockerfile
- Added tracing
- Cleaned up dependencies
- Added job to check docker build

Closes #7

…-limb values

Store each U256 limb individually instead of using a struct assignment or copy_nonoverlapping. This prevents LLVM from grouping limbs[1..3] into a [24 x i8] stack alloca that then requires a memset + memcpy round-trip to reach the EVM stack slot.

After the fix, LLVM sees all 4 limbs as independent i64 scalars and can prove that the upper limbs are zero constants for PUSH1-PUSH31, reducing the EVM stack write from 5 memory ops (3 spills + 2 reloads) to 2 stp instructions using the hardware zero register.

Before (PUSH1 fast path):

    str  x9, [x11]        ; limb[0]
    ldur q0, [sp, #8]     ; reload zeros from frame
    stur q0, [x11, #8]    ; limbs[1..2]
    ldr  x9, [sp, #24]    ; reload zeros from frame
    str  x9, [x11, #24]   ; limb[3]

After (PUSH1 fast path):

    stp x9, xzr, [x11]        ; limb[0] + limb[1]
    stp xzr, xzr, [x11, #16]  ; limb[2] + limb[3]
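The per-limb store described in this commit message can be sketched in safe Rust as follows. This is a minimal illustration, not the code from `call_frame.rs`: the `U256` stand-in, the `Stack` layout, `STACK_LIMIT`, and the `StackOverflow` error are all assumptions made for the example, and whether a given compiler version emits the two `stp` instructions shown above is not something this sketch guarantees.

```rust
/// Stand-in for a 256-bit word: four 64-bit limbs, least significant first.
/// (Assumption: the real `U256` type has an equivalent limb layout.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct U256(pub [u64; 4]);

/// Hypothetical capacity of the EVM stack in this sketch.
pub const STACK_LIMIT: usize = 1024;

/// Hypothetical error returned when the stack is full.
#[derive(Debug, PartialEq, Eq)]
pub struct StackOverflow;

/// Hypothetical fixed-capacity stack that grows downward from `STACK_LIMIT`.
pub struct Stack {
    values: Vec<U256>,
    offset: usize,
}

impl Stack {
    pub fn new() -> Self {
        Self {
            values: vec![U256([0; 4]); STACK_LIMIT],
            offset: STACK_LIMIT,
        }
    }

    pub fn push(&mut self, value: U256) -> Result<(), StackOverflow> {
        let next = self.offset.checked_sub(1).ok_or(StackOverflow)?;
        let slot = &mut self.values[next].0;
        // One store per limb instead of a whole-struct copy, so each limb
        // can be treated as an independent 64-bit scalar by the optimizer.
        slot[0] = value.0[0];
        slot[1] = value.0[1];
        slot[2] = value.0[2];
        slot[3] = value.0[3];
        self.offset = next;
        Ok(())
    }

    pub fn pop(&mut self) -> Option<U256> {
        if self.offset == STACK_LIMIT {
            return None;
        }
        let value = self.values[self.offset];
        self.offset += 1;
        Some(value)
    }
}
```

Pushing `U256([0x60, 0, 0, 0])` and popping it back returns the same four limbs; the behaviour is identical to a struct assignment, and only the shape of the stores presented to the optimizer differs.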

## Summary

This PR applies two complementary assembly-level optimizations to the PUSH1–PUSH32 opcode handlers, identified by analyzing the generated aarch64 and x86_64 assembly output.

### Change 1: Use const-generic big-endian conversion (`push.rs`)

Replaces `U256::from_big_endian(&data[..N])` (runtime slice) with `u256_from_big_endian_const::<N>(buf)` (const-generic fixed-size array). Because `N` is a compile-time constant at each monomorphized `OpPushHandler<N>`, the compiler can:

- Compute the padding offset `32 - N` at compile time
- Operate on a fixed-size `[u8; N]` buffer instead of a runtime slice, enabling better autovectorization of the byte copy

### Change 2: Eliminate stack-frame spill in `Stack::push` (`call_frame.rs`)

Replaces the previous `copy_nonoverlapping` (and later direct struct assignment) with individual stores for each of the 4 U256 limbs.

**Root cause of the spill:** LLVM's SROA pass decomposes a `U256` value into `limb[0]` (scalar i64) and `limbs[1..3]` (a `[24 x i8]` stack alloca). A struct assignment or `copy_nonoverlapping` still uses `memcpy` for the alloca portion, causing a `memset(alloca, 0) + memcpy(alloca → slot)` round-trip. By storing each limb individually, LLVM treats all 4 as independent i64 scalars, proves the upper limbs are zero constants for PUSH1–PUSH31, and eliminates the alloca entirely.

**Before** (PUSH1 fast path, 5 memory ops):

```asm
str  x9, [x11]        ; limb[0]
ldur q0, [sp, #8]     ; reload zeros from frame  ← wasted
stur q0, [x11, #8]    ; limbs[1..2]
ldr  x9, [sp, #24]    ; reload zeros from frame  ← wasted
str  x9, [x11, #24]   ; limb[3]
```

**After** (PUSH1 fast path, 2 memory ops, no stack frame):

```asm
stp x9, xzr, [x11]        ; limb[0] + limb[1]
stp xzr, xzr, [x11, #16]  ; limb[2] + limb[3]
```

The same spill/reload pattern was confirmed on x86_64 (using `xorps`/`movaps`/`movq` through the stack frame).

## Test plan

- [x] `cargo check -p ethrex-levm`
- [x] `cargo test -p ethrex-levm --release`
- [x] `cargo test -p ethrex --release`
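To illustrate Change 1, here is a minimal sketch of what a const-generic big-endian conversion could look like. The function name comes from the PR text, but the signature, the `[u64; 4]` return type (limbs ordered least significant first), and the body are assumptions made for this example rather than the actual code in `push.rs`.

```rust
/// Converts `N` big-endian bytes (N <= 32) into four 64-bit limbs,
/// least significant limb first. Hypothetical signature for illustration.
pub fn u256_from_big_endian_const<const N: usize>(bytes: [u8; N]) -> [u64; 4] {
    assert!(N <= 32, "at most 32 bytes fit in a 256-bit word");

    // Left-pad to 32 bytes. `32 - N` is a constant in every monomorphized
    // instance, so the offset needs no runtime computation.
    let mut padded = [0u8; 32];
    padded[32 - N..].copy_from_slice(&bytes);

    let mut limbs = [0u64; 4];
    for i in 0..4 {
        let mut chunk = [0u8; 8];
        chunk.copy_from_slice(&padded[i * 8..(i + 1) * 8]);
        // The first 8 padded bytes are the most significant limb.
        limbs[3 - i] = u64::from_be_bytes(chunk);
    }
    limbs
}
```

For a PUSH1-style input, `u256_from_big_endian_const::<1>([0x60])` yields `[0x60, 0, 0, 0]`: only the lowest limb is non-zero, which is the property the second change relies on when it stores the upper limbs as zero constants.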

Run #8 still failed at switch_block+2 even with the cache-aware backend landed in f10fb7f: each post-switch block reading an account modified at a prior post-switch block (but not the latest) returned the MPT-base nonce, off-by-1.

Root cause: `BinaryMerkleizer::new` started from `BinaryTrieState::new()` (empty), ignoring `_parent_root` and the `provider` (marked dead_code). Symmetric `MptMerkleizer::new` opens 16 shard workers each rooted at `parent_state_root` and lazy-loads via the provider, so the merkleizer's trie at the new root is the FULL post-parent state. The binary side was producing diffs from empty, so the in-memory trie at root R(N) contained a path only to accounts modified at block N. Read-path gates that walk `state.trie_get` (notably `stem_has_basic_data`) returned false for any account not in the latest block, and `TransitionBackend::account` fell through to the MPT base.

Make BinaryMerkleizer mirror MptMerkleizer:

- `BinaryTrieProvider::open_state()`: new trait method returning a `BinaryTrieState`. Default impl returns `BinaryTrieState::new()` (empty, for tests + `EmptyBinaryTrieProvider` + genesis bootstrap).
- `StoreBinaryTrieProvider::open_state` overrides: opens against `CacheAwareTrieBackend`, so the trie is rooted at the live binary head (cache + disk), matching MPT's `MptTrieWrapper(state_root, trie_cache, db, last_written)` pattern.
- `BinaryMerkleizer::{new,new_bal}` open via `provider.open_state()` instead of `BinaryTrieState::new()`. The `provider` field loses its `#[allow(dead_code)]`; it is load-bearing now.
- `EmptyTrieBackend` added to `binary-trie::db` for symmetry with `EmptyBinaryTrieProvider` (used by future test paths that want the default no-op `open_state`).

After this fix, each block's merkleizer trie contains parent_state + this block's writes. `state.trie_get` works for cross-block reads. Layer-cache + on-disk fallback flows identically to the MPT pipeline.

ethrex-binary-trie 143/143, ethrex-storage 49/49, ethrex-blockchain 7/7 all pass. fmt + clippy clean on touched code.
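The provider pattern described in this commit message (a trait method with an empty default, overridden by the store-backed provider) can be sketched as below. The type and method names are taken from the message, but every body is a simplified assumption: the trie is replaced by a plain `BTreeMap`, the store-backed provider holds an in-memory snapshot instead of a `CacheAwareTrieBackend`, and the `insert`, `apply`, and `get` helpers are hypothetical.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the trie state (a key/value map, not a real trie).
#[derive(Clone, Debug, Default, PartialEq)]
pub struct BinaryTrieState {
    entries: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl BinaryTrieState {
    pub fn new() -> Self {
        Self::default()
    }

    /// Hypothetical helper used to seed state in this sketch.
    pub fn insert(&mut self, key: &[u8], value: &[u8]) {
        self.entries.insert(key.to_vec(), value.to_vec());
    }

    pub fn trie_get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.entries.get(key).cloned()
    }
}

pub trait BinaryTrieProvider {
    /// Default: an empty state (tests, empty provider, genesis bootstrap).
    fn open_state(&self) -> BinaryTrieState {
        BinaryTrieState::new()
    }
}

/// Relies on the default, empty `open_state`.
pub struct EmptyBinaryTrieProvider;
impl BinaryTrieProvider for EmptyBinaryTrieProvider {}

/// Assumption: holds a snapshot of the parent state in memory.
pub struct StoreBinaryTrieProvider {
    pub head: BinaryTrieState,
}

impl BinaryTrieProvider for StoreBinaryTrieProvider {
    fn open_state(&self) -> BinaryTrieState {
        self.head.clone()
    }
}

pub struct BinaryMerkleizer {
    state: BinaryTrieState,
}

impl BinaryMerkleizer {
    /// Starts from the provider's state rather than from an empty trie.
    pub fn new(provider: &dyn BinaryTrieProvider) -> Self {
        Self { state: provider.open_state() }
    }

    /// Hypothetical: records one write made by the current block.
    pub fn apply(&mut self, key: &[u8], value: &[u8]) {
        self.state.insert(key, value);
    }

    /// Hypothetical read path over parent state plus this block's writes.
    pub fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.state.trie_get(key)
    }
}
```

With a store-backed provider, an account written in an earlier block stays readable after the current block writes a different account. With `EmptyBinaryTrieProvider` the same read returns `None`, which corresponds to the failure mode described in the root-cause paragraph.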

Add Dockerfile.

848b405

mpaulucci requested a review from a team as a code owner June 5, 2024 15:34

juanbono approved these changes Jun 5, 2024

juanbono merged commit d19a445 into main Jun 5, 2024

juanbono deleted the add-docker branch June 5, 2024 15:54

fedacking mentioned this pull request Feb 5, 2026

docs(l1): snapsync roadmap #6112

Open

1 task

This was referenced Feb 6, 2026

test(l1): add restart stall reproduction test using eth-docker #6151

Open

perf(l1): adaptive request sizing, storage bisection, and parallel trie in snap sync #6181

Open

juanbono merged 1 commit into main from add-docker
