Skip to content

node: verify genesis doc hash against the file contents rather than remarshalled JSON#1293

Merged
mzabaluev merged 5 commits intomainfrom
mikhail/gen-doc-hash-from-file
Sep 6, 2023
Merged

node: verify genesis doc hash against the file contents rather than remarshalled JSON#1293
mzabaluev merged 5 commits intomainfrom
mikhail/gen-doc-hash-from-file

Conversation

@mzabaluev
Copy link
Contributor

@mzabaluev mzabaluev commented Sep 1, 2023

Closes #1287


PR checklist

  • Tests written/updated
  • Changelog entry added in .changelog (we use unclog to manage our changelog)
  • Updated relevant documentation (docs/ or spec/) and code comments

@mzabaluev mzabaluev requested a review from a team as a code owner September 1, 2023 06:34
@mzabaluev mzabaluev requested a review from a team September 1, 2023 06:34
@mzabaluev mzabaluev marked this pull request as draft September 1, 2023 06:35
@mzabaluev
Copy link
Contributor Author

I don't think this needs a changelog entry, as this is a follow-up to #1017 and we haven't released the functional changes yet.

@mzabaluev mzabaluev marked this pull request as ready for review September 1, 2023 12:14
@sergio-mena
Copy link
Collaborator

@mzabaluev PR #1295 is likely to affect this PR. Sorry for the inconveniences...

@mzabaluev
Copy link
Contributor Author

@mzabaluev PR #1295 is likely to affect this PR. Sorry for the inconveniences...

No worries, I will wait (and review) and rebase once that one is merged.

@jmalicevic
Copy link
Collaborator

Merging the PR now so you are free to rebase! Thanks Mikhail !

Marshalling the doc data back into JSON to checksum is wasteful.
In the future, the hash of the genesis file may be passed as a CLI or
config parameter to help control coordinated upgrades. We want that
hash to be the same as the hash stored in the database.
Don't put the requirement to verify the JSON content checksum
onto every genesis doc provider, do it in one place in the main code.
@mzabaluev mzabaluev force-pushed the mikhail/gen-doc-hash-from-file branch from ae9d6ea to 0f13216 Compare September 4, 2023 07:16
// DefaultGenesisDocProviderFunc returns a GenesisDocProvider that loads
// the GenesisDoc from the config.GenesisFile() on the filesystem.
func DefaultGenesisDocProviderFunc(config *cfg.Config) GenesisDocProvider {
return func() (*types.GenesisDoc, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether instead of changing the return value of this function (which is used in many other places), we can not move the checksum logic into the LoadStateFromDBOrGenesisProvider to be executed only once if there is no hash in the DB or if we have a flag provided via cli? Basically move it to line 576 and put the logic of reading the file and computing the sum there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether instead of changing the return value of this function (which is used in many other places), we can not move the checksum logic into the LoadStateFromDBOrGenesisProvider to be executed only once if there is no hash in the DB or if we have a flag provided via cli? Basically move it to line 576 and put the logic of reading the file and computing the sum there?

I have not come across any other places where the GenesisDocProvider signature was used. My concern is about eventually reworking this for streamability: the provider should not need to store the whole file content in RAM to parse the JSON or calculate the checksum. This signature allows implementing it with an incremental parser (e.g. json.Decoder from the standard library) and an incremental SHA-256 hasher. I have left a FIXME comment about this in the default provider function. Maybe I should turn it into a backlog issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed away as #1302

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DefaultGenesisProvider is being used a lot although mostly to create a new node so technically it should be fine. But I am not sure whether it is used by users as well .

So instead of changing a public interface for a future optimization, could we not do one of two things:

  • Leave this until we do the actual optimization and then we can even create a DefaultStreamingDocProvider or
  • Keep the genesisProvider returning just the doc and have a special createChecksum function that could afterwards be altered to not read the file but rather stream through it based on a given path. This function is then called to get the hash (the fact that the genesisProvider has it in memory does not change anything as we anyways still keep it in memory).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we cannot really do this without changing the signature and contract for GenesisDocProvider in some way: in main, it only returns the parsed doc, and both retrieval or JSON content from the source and parsing is supposed to happen inside the provider function.

Copy link
Contributor Author

@mzabaluev mzabaluev Sep 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a backward-compatible alternative: add a checksum field to the GenesisDoc struct and mandate every genesis doc provider set it with the computed checksum. But: 1) it feels conceptually wrong; 2) this special field has to be fenced away from normal handling of the GenesisDoc fields by the JSON marshal and unmarshal implementations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right it might be wrong.. What abut the solution of changing the signature of LoadStateFromDBOrGenesisDocProvider by adding it a path to the genesis file (as we will anyways change this function signature to pass the hash from command line)? Both are breaking, just not sure which is more (as in breaking the way people are used to use these things).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDK uses GenesisDocProvider in three places (if I didn't miss any): here, here, and here.

If we include a changelog explaining the signature change, I doesn't seem highly disruptive. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we include a changelog explaining the signature change, I doesn't seem highly disruptive. Thoughts?

I support this. But I do need to add a changelog, after all.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree with the change. Even my proposed solution about breaking the API of the other function is not going to work. As Mikhail pointed out on Slack, the whole point of the provider is to have an abstract provider so passing a path to a function is def not that.

@mzabaluev mzabaluev added the breaking A breaking change label Sep 4, 2023
Copy link
Collaborator

@sergio-mena sergio-mena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

@mzabaluev
Copy link
Contributor Author

The tests seem flaky: I've got failures in two different test buckets on subsequent pushes that did not add anything to the changes.

@mzabaluev mzabaluev added this pull request to the merge queue Sep 6, 2023
Merged via the queue into main with commit 147a47d Sep 6, 2023
@mzabaluev mzabaluev deleted the mikhail/gen-doc-hash-from-file branch September 6, 2023 12:38
melekes pushed a commit that referenced this pull request Jan 9, 2024
…emarshalled JSON (#1293)

* node: verify genesis doc hash against the file

Marshalling the doc data back into JSON to checksum is wasteful.
In the future, the hash of the genesis file may be passed as a CLI or
config parameter to help control coordinated upgrades. We want that
hash to be the same as the hash stored in the database.

* node: centralize genesis doc hash verification

Don't put the requirement to verify the JSON content checksum
onto every genesis doc provider, do it in one place in the main code.

* node: link the issue on the recent FIXME comment

* Changelog entry for #1293
melekes pushed a commit that referenced this pull request Jan 9, 2024
…emarshalled JSON (#1293)

* node: verify genesis doc hash against the file

Marshalling the doc data back into JSON to checksum is wasteful.
In the future, the hash of the genesis file may be passed as a CLI or
config parameter to help control coordinated upgrades. We want that
hash to be the same as the hash stored in the database.

* node: centralize genesis doc hash verification

Don't put the requirement to verify the JSON content checksum
onto every genesis doc provider, do it in one place in the main code.

* node: link the issue on the recent FIXME comment

* Changelog entry for #1293
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 6, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 14, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 14, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Feb 14, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Jul 7, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

backports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian.ap@gmail.com>
troian added a commit to akash-network/cometbft that referenced this pull request Jul 14, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

ports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian@users.noreply.github.com>
troian added a commit to akash-network/cometbft that referenced this pull request Jul 14, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

ports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian@users.noreply.github.com>
troian added a commit to akash-network/cometbft that referenced this pull request Jul 14, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

ports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian@users.noreply.github.com>
troian added a commit to akash-network/cometbft that referenced this pull request Jul 14, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

ports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian@users.noreply.github.com>
troian added a commit to akash-network/cometbft that referenced this pull request Jul 14, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

ports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian@users.noreply.github.com>
troian added a commit to akash-network/cometbft that referenced this pull request Jul 14, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

ports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian@users.noreply.github.com>
troian added a commit to akash-network/cometbft that referenced this pull request Oct 20, 2025
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

ports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian@users.noreply.github.com>
troian added a commit to akash-network/cometbft that referenced this pull request Jan 23, 2026
This avoids saving the genesisDoc to database.
When using goleveldb and ~4GiB+ genesis files, it causes a panic
during snappy encoding (panic: snappy: decoded block is too large).

refs akash-network/support#280

ports:
- cometbft#1017
- cometbft#1293

Signed-off-by: Artur Troian <troian@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking A breaking change

Projects

No open projects
Status: Done

Development

Successfully merging this pull request may close these issues.

Change the hash on the doc, to a hash on the genesis file itself.

3 participants