Skip to content

fix(pse): repair mainnet v7 migration TotalScore invariant + Add Error Context#129

Merged
metalarm10 merged 8 commits into
masterfrom
john/v7-migration-fix
Apr 27, 2026
Merged

fix(pse): repair mainnet v7 migration TotalScore invariant + Add Error Context#129
metalarm10 merged 8 commits into
masterfrom
john/v7-migration-fix

Conversation

@metalarm10

@metalarm10 metalarm10 commented Apr 22, 2026

Copy link
Copy Markdown
Contributor

Description

Full report: https://app.clickup.com/t/868jb9jdc
Update: The fix implemented here is applied to testnet as well and chain patched with v7patch1. Testnet PSE bug fixed and stuck distributions are retroactively resolved: #130

Summary

Fixes the v7 PSE migration bug that stalled testnet's first multi-block community distribution on 2026-04-21 (~528M utestcore overpayment, intermediary drained, circuit breaker tripped).

migrateAccountScoreSnapshots copied pre-v7 AccountScoreSnapshot entries, bypassing addToMainScore and leaving TotalScore[firstMultiBlockDistributionID] unset. Phase 2 then divided by an understated denominator, inflating every payout.

Changes

  • Migration fix: migrateAccountScoreSnapshots now sums non-excluded migrated scores into TotalScore[distID] and routes excluded entries to ExcludedAddressScore (matches steady-state v7 routing).
  • Default batch size 1000: production-tuned value, matches what testnet has been running.
  • Error-context wrappers across the distribution path: failures now include distribution_id, delegator, validator, amount, total_score, block height, and stack trace — diagnosis becomes one log line instead of hours of state forensics.

Tests

3 new/strengthened migration tests:

  • TestMigratePSEStore — happy-path now asserts TotalScore invariant
  • TestMigratePSEStore_TotalScoreInvariant — reproduces the testnet failure with real score values; fails against the unfixed migration
  • TestMigratePSEStore_RoutesExcludedAddresses — excluded entries → ExcludedAddressScore, not counted in TotalScore

Plus TestProcessOngoingTokenDistribution_ErrorContext to lock in the error wrappers.

Mainnet impact

Mainnet hasn't run v7 yet — with this PR merged, the first multi-block distribution on mainnet will process initial TotalScore calculations correctly. Testnet was already repaired separately via the v7patch1 upgrade (proposal #84); this PR doesn't touch testnet but uses identical routing semantics so the two networks stay aligned.

Reviewers checklist:

  • Try to write more meaningful comments with clear actions to be taken.
  • Nit-picking should be unblocking. Focus on core issues.

Authors checklist

  • Provide a concise and meaningful description
  • Review the code yourself first, before making the PR.
  • Annotate your PR in places that require explanation.
  • Think and try to split the PR to smaller PR if it is big.

This change is Reviewable

@metalarm10 metalarm10 changed the title [DO-NOT-MERGE]-v7-pse-migration fix fix(PSE): Repair v7 migration to populate TotalScore + route excluded entries Apr 24, 2026
@metalarm10 metalarm10 marked this pull request as ready for review April 24, 2026 07:02
@metalarm10 metalarm10 requested a review from a team as a code owner April 24, 2026 07:02
@metalarm10 metalarm10 requested review from TxCorpi0x and ysv and removed request for a team April 24, 2026 07:02
@metalarm10 metalarm10 changed the title fix(PSE): Repair v7 migration to populate TotalScore + route excluded entries fix(pse): v7 migration TotalScore invariant + Add Error Context Apr 24, 2026
@metalarm10 metalarm10 changed the title fix(pse): v7 migration TotalScore invariant + Add Error Context fix(pse): repair mainnet v7 migration TotalScore invariant + Add Error Context Apr 24, 2026

@TxCorpi0x TxCorpi0x left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TxCorpi0x reviewed 9 files and all commit messages, and made 1 comment.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on metalarm10 and ysv).


app/upgrade/v7/migrate_pse.go line 309 at r1 (raw file):

	ctx context.Context, storeService sdkstore.KVStoreService, distributionID uint64, totalScore sdkmath.Int,
) error {
	if !totalScore.IsZero() {

The migration flow may leave total score missing when the sum is zero, so phase2 does not change and halt community distribution.

distribute.go:168 totalScore, err := k.TotalScore.Get(ctx, ongoingID) treats missing Total score as zero.
distribute.go:177 if totalPSEAmount.IsPositive() && !totalScore.IsPositive() { errirs when amount is positive and total score inot positive, if migration return zero eligible score first pahse 2 run can fail instad of clean finalization of community pool.

Note: It is worth it to cover no-data migration sucess for zero/missing total score. (in migration tes)

@metalarm10 metalarm10 left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@metalarm10 made 1 comment.
Reviewable status: 6 of 11 files reviewed, 1 unresolved discussion (waiting on TxCorpi0x and ysv).


app/upgrade/v7/migrate_pse.go line 309 at r1 (raw file):

Previously, TxCorpi0x wrote…

The migration flow may leave total score missing when the sum is zero, so phase2 does not change and halt community distribution.

distribute.go:168 totalScore, err := k.TotalScore.Get(ctx, ongoingID) treats missing Total score as zero.
distribute.go:177 if totalPSEAmount.IsPositive() && !totalScore.IsPositive() { errirs when amount is positive and total score inot positive, if migration return zero eligible score first pahse 2 run can fail instad of clean finalization of community pool.

Note: It is worth it to cover no-data migration sucess for zero/missing total score. (in migration tes)

Done.

Good catch, thanks. I applied a reorder in distribute.go so that the empty-snapshot-batch finalize path now runs before the totalScore == 0 invariant check, so a distribution with no eligible recipients refunds cleanly to the community pool - without any disable of PSE. The invariant check only fires when there are snapshot entries but TotalScore is zero (which is a real bug).

Added three migration tests:

  • TestMigratePSEStore_TotalScoreFromMultipleEntries : sum of pre-v7 entries is correctly written to TotalScore (related to testnet incident mostly).
  • TestMigratePSEStore_EmptyStoreLeavesTotalScoreUnset : empty pre-v7 store, TotalScore unset, PSE stays enabled.
  • TestMigratePSEStore_AllExcludedLeavesTotalScoreUnset : every pre-v7 entry excluded, TotalScore unset, PSE stays enabled.

@metalarm10 metalarm10 requested a review from TxCorpi0x April 27, 2026 11:14

@TxCorpi0x TxCorpi0x left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TxCorpi0x reviewed 5 files and all commit messages, and resolved 1 discussion.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on ysv).

@ysv ysv left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ysv reviewed 11 files and all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on metalarm10).

@metalarm10 metalarm10 merged commit e55e0c3 into master Apr 27, 2026
21 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants