fix(e2e): guard alice's back-to-back gov txs against sequence-mismatc…#2268

Merged
nimrod-teich merged 1 commit into main from fix/e2e-init-account-sequence-race
Apr 15, 2026
Conversation

@NadavLevi

Contributor

…h flake

The Lava Protocol E2E job (CI run 24386640804) failed at init_e2e.sh line 20 with "account sequence mismatch, expected 2, got 1" on lavad tx gov vote 1 yes --from alice. The same failure can occur at three identical patterns in init_e2e.sh where alice submits a legacy proposal and immediately votes on it.

Root cause: lavad tx ... --broadcast-mode sync (the default) returns after CheckTx accepts the tx into the mempool — not after the tx lands in a block. A subsequent wait_next_block only waits for block height to tick; it does not guarantee the prior tx was included in that block. The follow-up vote then queries alice's on-chain sequence, gets the pre-submit value, signs the vote with that sequence, and broadcasts. If the submit lands between query and broadcast, the antehandler (cosmos-sdk@v0.47.13/x/auth/ante/sigverify.go:269) rejects the vote.
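
The race can be illustrated with a toy model. This is a minimal sketch in plain shell with no chain and no lavad involved. The names `committed_seq`, `pending`, and all four functions are illustrative assumptions. It models the sequence check only as the paragraph above describes it, not the actual Cosmos SDK antehandler.

```shell
#!/usr/bin/env sh
# Toy model of the race described above. Nothing here talks to a real chain.
committed_seq=1   # alice's sequence as of the last committed block
pending=0         # txs accepted by CheckTx but not yet included in a block

# Sync broadcast: returns as soon as the tx sits in the mempool.
broadcast_sync() { pending=$((pending + 1)); }

# Block production: every pending tx is included and advances the sequence.
produce_block() {
  committed_seq=$((committed_seq + pending))
  pending=0
}

# Stand-in for the sequence check: the signed sequence must match the chain's.
check_signature_seq() {
  if [ "$1" -ne "$committed_seq" ]; then
    echo "account sequence mismatch, expected $committed_seq, got $1"
    return 1
  fi
}

race_demo() {
  broadcast_sync                     # submit proposal (signed with sequence 1)
  vote_seq="$committed_seq"          # vote queries the sequence: still 1
  produce_block                      # submit lands before the vote is broadcast
  check_signature_seq "$vote_seq"    # vote carries a stale sequence
}
```

Calling `race_demo` prints the same message seen in CI. Moving `produce_block` ahead of the sequence query makes the check pass, which is the ordering that waiting for inclusion enforces.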

Add lavad_tx_and_wait in scripts/useful_commands.sh — a thin wrapper around lavad <subcommand> that captures the txhash from the CLI output and polls lavad q tx $hash via the existing wait_for_tx helper until the tx is actually included. Wrap the three same-signer submit-then-vote patterns in init_e2e.sh (proposals 1, 2, and 3) so alice's sequence has definitely advanced on chain before the vote queries it. The redundant wait_next_block between each submit and vote is dropped: inclusion is now explicit.
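
A minimal sketch of a wrapper of this kind, not the helper merged in this PR. It assumes the CLI prints a `txhash` field (text or JSON) and that `lavad q tx <hash>` exits non-zero until the tx is in a block. The names `tx_and_wait_sketch`, `extract_txhash`, `LAVAD_BIN`, `TX_WAIT_TRIES`, and `TX_WAIT_DELAY` are hypothetical. The real helper delegates polling to `wait_for_tx` and may differ in detail.

```shell
#!/usr/bin/env sh
# Hypothetical sketch only; see scripts/useful_commands.sh for the real helper.
LAVAD_BIN="${LAVAD_BIN:-lavad}"

# Extract a 64-hex-digit tx hash from CLI output in either text or JSON form.
extract_txhash() {
  grep -oE 'txhash"?: *"?[0-9A-Fa-f]{64}' | grep -oE '[0-9A-Fa-f]{64}$' | head -n 1
}

# Broadcast a tx, then poll until `q tx <hash>` succeeds or the tries run out.
tx_and_wait_sketch() {
  local out hash tries="${TX_WAIT_TRIES:-30}" delay="${TX_WAIT_DELAY:-1}"
  out="$("$LAVAD_BIN" "$@")" || { echo "$out" >&2; return 1; }
  hash="$(printf '%s\n' "$out" | extract_txhash)"
  [ -n "$hash" ] || { echo "no txhash found in output: $out" >&2; return 1; }
  while [ "$tries" -gt 0 ]; do
    # Assumed behavior: the query fails until the tx is included and indexed.
    "$LAVAD_BIN" q tx "$hash" >/dev/null 2>&1 && return 0
    sleep "$delay"
    tries=$((tries - 1))
  done
  echo "tx $hash was not included in time" >&2
  return 1
}
```

The sketch does not inspect the response code of the broadcast itself, so a tx rejected by CheckTx would show up only as a timeout.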

Other same-account back-to-back patterns in the script (user1, user3) are separated by wait_next_block + sleep_until_next_epoch or sleep 6, which already provide enough wall-clock gap for the CheckTx → block-inclusion race to close. Leaving those unchanged for minimum churn; they can adopt the helper if a similar flake surfaces.

Description

Closes: #XXXX


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • read the contribution guide
  • included the correct type prefix in the PR title, you can find examples of the prefixes below:
  • confirmed ! in the type prefix if API or client breaking change
  • targeted the main branch
  • provided a link to the relevant issue or specification
  • reviewed "Files changed" and left comments if necessary
  • included the necessary unit and integration tests
  • updated the relevant documentation or specification, including comments for documenting Go code
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic, API design and naming, documentation is accurate, tests and test coverage

@qodo-code-review


Review Summary by Qodo

Fix account sequence mismatch in E2E back-to-back governance transactions

🐞 Bug fix

Walkthroughs

Description
• Add lavad_tx_and_wait helper to ensure tx block inclusion before next same-signer tx
• Wrap alice's back-to-back proposal submit-vote patterns with new helper
• Eliminate race condition where the vote reads alice's pre-submit on-chain sequence while the submit is still in the mempool
• Fix "account sequence mismatch" flake in E2E initialization script
Diagram
flowchart LR
  A["lavad tx --broadcast-mode sync"] -->|returns after CheckTx| B["mempool"]
  B -->|not guaranteed included| C["wait_next_block"]
  C -->|queries sequence| D["pre-submit value"]
  D -->|signs vote| E["antehandler rejects"]
  F["lavad_tx_and_wait wrapper"] -->|polls tx inclusion| G["block confirmed"]
  G -->|sequence advanced| H["next tx safe"]

File Changes

1. scripts/useful_commands.sh ✨ Enhancement +32/-3

Add lavad_tx_and_wait helper for tx inclusion polling

• Add lavad_tx_and_wait function that wraps lavad tx commands and polls for block inclusion
• Function extracts txhash from CLI output and waits via existing wait_for_tx helper
• Ensures same-signer back-to-back transactions see consistent on-chain sequence
• Minor whitespace cleanup in wait_next_block_and_tx function

scripts/useful_commands.sh


2. scripts/test/init_e2e.sh 🐞 Bug fix +12/-6

Guard alice's governance transactions with tx inclusion waits

• Replace three lavad tx gov submit-legacy-proposal calls with lavad_tx_and_wait wrapper
• Remove redundant wait_next_block calls between submit and vote for alice's transactions
• Add detailed comment explaining the sequence mismatch race condition and solution
• Affected proposals: specs (proposal 1), plans (proposal 2), and plan removal (proposal 3)

scripts/test/init_e2e.sh


@qodo-code-review

qodo-code-review Bot commented Apr 14, 2026

Code Review by Qodo

🐞 Bugs (1)   📘 Rule violations (0)   📎 Requirement gaps (0)
🐞 ☼ Reliability (1)


Action required

1. Votes still race sequence 🐞
Description
init_e2e.sh wraps alice’s proposal submits with lavad_tx_and_wait, but the follow-up `lavad tx gov vote ... --from alice` calls still do not wait for inclusion, so a later alice tx can hit the same sequence-mismatch race if the vote is included between the sequence query and the broadcast. This can reintroduce CI flakes on the next alice transaction (e.g., vote→next submit) even though submit→vote is now guarded.
Code

scripts/test/init_e2e.sh[R28-36]

lavad tx gov vote 1 yes -y --from alice --gas-adjustment "1.5" --gas "auto" --gas-prices $GASPRICE
wait_next_block
sleep 6 # need to sleep because plan policies need the specs when setting chain policies verifications

# Plans proposal
echo ---- Plans proposal ----
wait_next_block
-lavad tx gov submit-legacy-proposal plans-add ./cookbook/plans/test_plans/default.json,./cookbook/plans/test_plans/emergency-mode.json,./cookbook/plans/test_plans/temporary-add.json -y --from alice --gas-adjustment "1.5" --gas "auto" --gas-prices $GASPRICE
-wait_next_block
+lavad_tx_and_wait tx gov submit-legacy-proposal plans-add ./cookbook/plans/test_plans/default.json,./cookbook/plans/test_plans/emergency-mode.json,./cookbook/plans/test_plans/temporary-add.json -y --from alice --gas-adjustment "1.5" --gas "auto" --gas-prices $GASPRICE
lavad tx gov vote 2 yes -y --from alice --gas-adjustment "1.5" --gas "auto" --gas-prices $GASPRICE
Evidence
The script still broadcasts votes in sync mode and only calls wait_next_block, which the repo
already documents as insufficient to guarantee tx inclusion; the next alice transaction can still be
signed with a stale on-chain sequence if the vote lands between query and broadcast. The newly-added
helper and comments in useful_commands.sh describe this exact failure mode, but it’s only applied
to the submit side, not to votes.

scripts/test/init_e2e.sh[26-46]
scripts/useful_commands.sh[122-149]
scripts/useful_commands.sh[24-61]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`init_e2e.sh` still sends `gov vote` txs with plain `lavad tx ...` (default `--broadcast-mode sync`) and only waits for block height changes. This does not guarantee the vote is included before the next alice tx, leaving a remaining sequence-mismatch race.

### Issue Context
You already introduced `lavad_tx_and_wait` specifically to guarantee inclusion before the next same-signer tx. Apply the same guarantee to the `gov vote` transactions (or alternatively capture the vote txhash output and call `wait_for_tx` on it).

### Fix Focus Areas
- scripts/test/init_e2e.sh[26-46]
- scripts/useful_commands.sh[122-149]

Comment thread scripts/test/init_e2e.sh Outdated
@codecov

codecov Bot commented Apr 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

| Flag | Coverage Δ |
| --- | --- |
| consensus | 8.98% <ø> (ø) |
| protocol | 35.20% <ø> (+0.04%) ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.
see 2 files with indirect coverage changes


…h flake

The Lava Protocol E2E job (CI run 24386640804) failed at init_e2e.sh line
20 with "account sequence mismatch, expected 2, got 1" on
`lavad tx gov vote 1 yes --from alice`. Three identical patterns in
init_e2e.sh have alice submit a legacy proposal, then vote on it; and
every vote is followed by another alice submit (or, for vote 3, the end
of alice's sequence). Both the submit→vote pair AND the vote→next-submit
pair can race.

Root cause: `lavad tx ... --broadcast-mode sync` (the default) returns
after CheckTx accepts the tx into the mempool — not after the tx lands
in a block. A subsequent `wait_next_block` only waits for block height
to tick; it does not guarantee the prior tx was included in that block.
The follow-up alice tx then queries alice's on-chain sequence, gets the
pre-prior-tx value, signs with that sequence, and broadcasts. If the
prior tx lands between query and broadcast, the antehandler
(cosmos-sdk@v0.47.13/x/auth/ante/sigverify.go:269) rejects the new tx.

The `sleep 6` after each vote is a timing assumption, not a guarantee —
under CI load a single slow block makes vote 1 → submit 2 just as racy
as submit 1 → vote 1.

Add `lavad_tx_and_wait` in scripts/useful_commands.sh — a thin wrapper
around `lavad <subcommand>` that captures the txhash from the CLI
output and polls `lavad q tx $hash` via the existing `wait_for_tx`
helper until the tx is actually included. Wrap every alice gov tx in
init_e2e.sh (three submit-legacy-proposal and three vote calls) so
every same-signer transition is guaranteed sequence-consistent. The
intermediate `wait_next_block` calls are now redundant (inclusion is
explicit) and are dropped; the `sleep 6` calls remain for their
original purpose of letting plan policies become active after voting.
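
The wrapped pattern described above can be sketched for one proposal as follows. This is a hedged illustration, not the merged script: `alice_gov`, `plans_proposal_sketch`, and `SETTLE_SECS` are hypothetical names. The fallback stub exists only so the fragment can be dry-run outside the repo, where the real `lavad_tx_and_wait` comes from scripts/useful_commands.sh. The proposal number 2 and the flags mirror the calls quoted in the review.

```shell
#!/usr/bin/env sh
# Dry-run fallback (assumption): just echo the command when the real helper
# from scripts/useful_commands.sh has not been sourced.
command -v lavad_tx_and_wait >/dev/null 2>&1 ||
  lavad_tx_and_wait() { echo "lavad $*"; }

# Hypothetical convenience wrapper: every alice gov tx waits for inclusion.
alice_gov() {
  lavad_tx_and_wait tx gov "$@" -y --from alice \
    --gas-adjustment "1.5" --gas "auto" --gas-prices "$GASPRICE"
}

# $1 is the comma-separated list of plan files passed to plans-add.
plans_proposal_sketch() {
  # Submit, and return only once the proposal tx is in a block.
  alice_gov submit-legacy-proposal plans-add "$1" || return 1
  # Vote, and again wait for inclusion before alice's next tx.
  alice_gov vote 2 yes || return 1
  # Kept from the original script: give plan policies time to become active.
  sleep "${SETTLE_SECS:-6}"
}
```

Because both calls block until inclusion, no `wait_next_block` is needed between them. The trailing sleep serves a different purpose and stays.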

Other same-account back-to-back patterns in the script (user1, user3)
are separated by `sleep_until_next_epoch` or `sleep 6`, which already
close the race in practice; leaving those unchanged for minimum churn.
They can adopt the helper if a similar flake surfaces.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@NadavLevi NadavLevi force-pushed the fix/e2e-init-account-sequence-race branch from e57d558 to 43ad824 Compare April 14, 2026 09:08
@NadavLevi NadavLevi requested a review from avitenzer April 14, 2026 09:17
@github-actions

Test Results

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
7 files   ±0   0 ❌ ±0 

Results for commit 43ad824. ± Comparison against base commit 5741773.

@nimrod-teich nimrod-teich merged commit 2d92617 into main Apr 15, 2026
30 checks passed
@nimrod-teich nimrod-teich deleted the fix/e2e-init-account-sequence-race branch April 15, 2026 14:30
3 participants