Skip to content

Ensure the interop types are serializable before commit#3802

Merged
shargon merged 28 commits intomasterfrom
fix-manifest
Mar 20, 2025
Merged

Ensure the interop types are serializable before commit#3802
shargon merged 28 commits intomasterfrom
fix-manifest

Conversation

@shargon
Copy link
Member

@shargon shargon commented Mar 5, 2025

Description

Make sure the manifest is serializable before committing it.
Before this commit, a possible error will be thrown during the commit.

Type of change

  • Optimization (the change is only an optimization)
  • Style (the change is only a code style for better maintenance or standard purpose)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • Test A
  • Test B

Test Configuration:

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Copy link
Member

@vncoelho vncoelho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@shargon shargon added Bug Used to tag confirmed bugs Critical Issues (bugs) that need to be fixed ASAP labels Mar 5, 2025
@shargon shargon marked this pull request as ready for review March 5, 2025 09:06
vncoelho
vncoelho previously approved these changes Mar 5, 2025
@dusmart
Copy link

dusmart commented Mar 5, 2025

Thank you for your PR that resolved the testnet halt issue caused by the attack. We definitely support merging it.

I’d like to propose a consideration that might lead to a new PR or issue: could we allow all possible outcomes after a successful simulation and block packaging, especially when the transaction simulation results in VMState.HALT? Instead of throwing exceptions, this could cover scenarios like those we encountered during Storage.Commit, or even within OnPersist and PostPersist, helping to mitigate similar attacks in the future.

While the likelihood of an issue like today’s happening again is quite small, logging a high-level error could serve as an early warning. Take it into today’s case: the blockchain won't be halted by the attack, the affected contract might become unreadable due to deserialization failure — which, arguably, is a slightly better outcome.

What do you think?

@dusmart
Copy link

dusmart commented Mar 5, 2025

Additionally, I’m wondering if there might be other potential risks beyond the contract checks — could other IInteroperable instances also cause similar issues during Storage.Commit? @Jim8y, could you and your teammates help take a closer look?

Shargon has also found the Oracle's problem. Great job!

@shargon
Copy link
Member Author

shargon commented Mar 5, 2025

Thank you for your PR that resolved the testnet halt issue caused by the attack. We definitely support merging it.

I’d like to propose a consideration that might lead to a new PR or issue: could we allow all possible outcomes after a successful simulation and block packaging, especially when the transaction simulation results in VMState.HALT? Instead of throwing exceptions, this could cover scenarios like those we encountered during Storage.Commit, or even within OnPersist and PostPersist, helping to mitigate similar attacks in the future.

While the likelihood of an issue like today’s happening again is quite small, logging a high-level error could serve as an early warning. Take it into today’s case: the blockchain won't be halted by the attack, the affected contract might become unreadable due to deserialization failure — which, arguably, is a slightly better outcome.

What do you think?

Yes, we will deep into it

@roman-khimov
Copy link
Contributor

the testnet halt issue caused by the attack

It's certainly not an attack, just an accident. People write contracts, people try them on testnet.

those we encountered during Storage.Commit

Each state change is bound to some execution context by design. The core of the problem is that this assumption is broken, transaction does something and fully succeeds, but then it turns out that the state change is incorrect and should've never happened.

within OnPersist and PostPersist

I don't think you can do this. It's pretty similar to the problem of insufficient GAS balance during OnPersist processing. If this happens we're toast, there is no correct behavior when we have a correctly signed block that does something wrong. This can be recovered only via a fork, throwing away a block and creating a new one.

@dusmart
Copy link

dusmart commented Mar 5, 2025

those we encountered during Storage.Commit

Each state change is bound to some execution context by design. The core of the problem is that this assumption is broken, transaction does something and fully succeeds, but then it turns out that the state change is incorrect and should've never happened.

Perhaps, at the very least, we should have a reminder for others developers — ensuring that every new IInteroperable includes a check to confirm it can be properly serialized if it will be committed to the storage during transaction simulation. Otherwise, wouldn’t this kind of issue be something only experienced developers like you and Shargon would think to catch?

Or perhaps we should consider simulating the commit process during transaction simulation as well for all kinds of txs. If any errors occur during commit, the simulation shouldn't return a HALT state.

@shargon
Copy link
Member Author

shargon commented Mar 5, 2025

Or perhaps we should consider simulating the commit process during transaction simulation as well for all kinds of txs. If any errors occur during commit, the simulation shouldn't return a HALT state.

This can be done in consensus, although the transaction may give a different result (not care because it's a commit test).

@superboyiii
Copy link
Member

We need HF flag here. @shargon

@roman-khimov
Copy link
Contributor

But why? You can't do

if old_node
    do_something()
else
    do_something_else()

here. The buggy node just stops processing and can't persist this block, nothing happened. Proper ones are perfectly aligned with this PR in behavior, they fail the transaction.

@shargon
Copy link
Member Author

shargon commented Mar 6, 2025

Agree with @roman-khimov , old node doesn't work, so no hf needed

@roman-khimov
Copy link
Contributor

It just occured to me that we can destroy contract as well in the same transaction. Can this be a problem?

Frankly I'd rather roll out the fix quickly for mainnet and be done with that.

@shargon
Copy link
Member Author

shargon commented Mar 6, 2025

@neo-project/core please review

@superboyiii
Copy link
Member

superboyiii commented Mar 6, 2025

But why? You can't do

if old_node
    do_something()
else
    do_something_else()

here. The buggy node just stops processing and can't persist this block, nothing happened. Proper ones are perfectly aligned with this PR in behavior, they fail the transaction.

It's a different behaviour and different native contract, you don't know what may happen before next release. Even stateroot is also different when resync. No doubt, it's a hardfork already.

@roman-khimov
Copy link
Contributor

IIUC you don't have any stateroot for testnet block 5557998 with the old node because to get a state root you need a state first and you don't have a state because there is an unserializable thing in the cache. Then if we're to tie this to Echidna the question is how to process testnet block 5557998, it's certainly pre-Echidna (unless you're to set it earlier and require reprocessing), but you can't have the old behavior because it just stops the node completely. Then also mainnet, I won't talk about it much here, but you understand the current state of it perfectly with the unpatched node.

@superboyiii
Copy link
Member

IIUC you don't have any stateroot for testnet block 5557998 with the old node because to get a state root you need a state first and you don't have a state because there is an unserializable thing in the cache. Then if we're to tie this to Echidna the question is how to process testnet block 5557998, it's certainly pre-Echidna (unless you're to set it earlier and require reprocessing), but you can't have the old behavior because it just stops the node completely. Then also mainnet, I won't talk about it much here, but you understand the current state of it perfectly with the unpatched node.

So nothing breaks on mainnet, but a different storage from block 0 in v3.8.0. You think it works as expected?

@shargon shargon mentioned this pull request Mar 13, 2025
16 tasks
Copy link
Member

@AnnaShaleva AnnaShaleva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. May I ask @superboyiii to test this implementation one more time on the "problem" block of Testnet and ensure that it still works properly, because the implementation has been slightly changed comparing to the initial fix.

@AnnaShaleva AnnaShaleva added this to the v3.8.0 milestone Mar 17, 2025
@shargon
Copy link
Member Author

shargon commented Mar 20, 2025

@Jim8y Merge?

@shargon shargon merged commit 4ecb1df into master Mar 20, 2025
7 checks passed
@shargon shargon deleted the fix-manifest branch March 20, 2025 10:23
Jim8y added a commit that referenced this pull request Mar 24, 2025
* master: (52 commits)
  Add SHA512 (#3845)
  Clean crypto (#3844)
  Review of #3782 (#3843)
  [`Fix`] References in `StorageKey` (#3782)
  Ensure `IInteroperable.Clone` (#3829)
  Ensure the interop types are serializable before commit (#3802)
  fix: ModInverse extension in BigIntegerExtensions (#3840)
  Style: avoid allocating emtpy array (#3841)
  Style: format json in tests/ (#3839)
  Reduce mem arguments (#3838)
  Avoid returning the entire mempool when only `maxTransactionsPerBlock` is required (#3823)
  style: use proper Assert methods (#3834)
  Nullable and fix `lock` in `SQLiteWallet` (#3816)
  [`Add`]: `Witness.Empty` for simplifing somethings (#3836)
  Style: format too-long literal strings (#3835)
  Add: NotNullWhen(true) for some out parameters (#3833)
  Update SQLiteWallet.cs (#3830)
  Revert "Optimize DeleteAccount"
  Revert "add lock to Version"
  Revert "Optimize LoadAccounts"
  ...

# Conflicts:
#	neo.sln
cschuchardt88 pushed a commit to cschuchardt88/neo that referenced this pull request Jun 8, 2025
…3802)

* Ensure the manifest is serializable before commit

* Rename to AssertIsSerializable

* Fix oracle request

* Cache when seal

* sealInterop

* clean changes

* Ensure block is serializable

* Remove Set interop

* Add new one

* Add oracle entry

* fix typo

* Revert ConsensusService.OnMessage.cs

* clean

---------

Co-authored-by: Jimmy <jinghui@wayne.edu>
@roman-khimov roman-khimov mentioned this pull request Sep 18, 2025
16 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Used to tag confirmed bugs Critical Issues (bugs) that need to be fixed ASAP Ready to Merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants