Skip to content

chore(Node): Do not keep the entire genesis file in memory during node operation#4247

Merged
alesforz merged 17 commits intomainfrom
alesforz/genesis-in-memory
Oct 25, 2024
Merged

chore(Node): Do not keep the entire genesis file in memory during node operation#4247
alesforz merged 17 commits intomainfrom
alesforz/genesis-in-memory

Conversation

@alesforz
Copy link
Collaborator

@alesforz alesforz commented Oct 4, 2024

Closes #1290.

This PR tracks the work we are doing for #1290.
For details, please refer to #1290 discussion.

ToDo


PR checklist

  • Tests written/updated
  • Changelog entry added in .changelog (we use unclog to manage our changelog)
  • Updated relevant documentation (docs/ or spec/) and code comments

@alesforz alesforz self-assigned this Oct 4, 2024
@alesforz alesforz linked an issue Oct 4, 2024 that may be closed by this pull request
3 tasks
@alesforz alesforz added backlog A prioritized task in the team's backlog P:tech-debt Priority: Technical debt that needs to be paid off to enable us to move faster, reliably P:storage-optimization Priority: Give operators greater control over storage and storage optimization labels Oct 4, 2024
@alesforz alesforz added this to the 2024-Q4 milestone Oct 14, 2024
Alessandro Sforzin added 5 commits October 16, 2024 16:57
…is (#4235)

Closes #4234.

### Changes

The [`Environment`](https://github.com/cometbft/cometbft/blob/8273634cdefb2fb22765a14f9acb9fca608e9982/rpc/core/env.go#L72) type
will:
- Store a pointer to a `GenesisDoc` if the genesis file size is below a
threshold currently set to 16 MB.
- If the genesis file exceeds this threshold, it will split it into
16 MB chunks and store the hashes of each chunk in memory in a slice. In
this case, the pointer to the `GenesisDoc` will be `nil`.

Because the `InitGenesisChunks` method of `Environment` is responsible
for creating the chunks, most of the changes of this PR focus on
refactoring this method and adding unit tests for it.

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
…alization (#4250)

Closes #4248.

### Changes
The `Node` type will:
- create a singleton instance of `Environment` when setting up the RPC
APIs.
- set the pointer to the `GenesisDoc` to `nil` once startup is complete
so that the GC can collect the in-memory `GenesisDoc` object.
 - store a `GenesisTime` field.

---

#### PR checklist

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
It was missing. Interestingly, tests did not pick up on this.
It seems that the `base64.StdEncoding.EncodeToString` function does not make a
full copy of its argument. Therefore creating each `chunk` as a direct sub-slice
of `data` was preventing the GC from deallocating the `data` slice after the
function returned (you could still see it as allocated in pprof graphs).

Creating each `chunk` as a copy of the sub-slice of `data` solves the issue.
@alesforz alesforz force-pushed the alesforz/genesis-in-memory branch from d28d4d6 to 6bfe1e1 Compare October 16, 2024 14:57
@alesforz alesforz marked this pull request as ready for review October 16, 2024 14:58
@alesforz alesforz requested a review from a team as a code owner October 16, 2024 14:58
@alesforz alesforz requested review from a team, andynog and jmalicevic October 16, 2024 14:58
@alesforz
Copy link
Collaborator Author

See here for a visualization of the changes in memory usage after applying these changes.

@alesforz
Copy link
Collaborator Author

Note: despite our efforts, these changes break the Node public API. Because Node doesn't store the genesis in memory after startup, its exported method GenesisDoc() had to change:

  • it now fetches the genesis from disk and unmarshals it into a *types.GenesisDoc.
  • its signature changed to also return an error in case fetching the genesis from disk fails (this is the breaking change)

@alesforz alesforz requested a review from jmalicevic October 17, 2024 14:07
Copy link
Collaborator

@andynog andynog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @alesforz, I think the implementation looks good, I've added comments and suggestions. I've also tested the API using a sample file and seems to work.

I didn't not validate the content returned. Just realized that we should provide maybe a CLI command (or sample app) in the future that could download the chunks and re-assemble into a genesis doc. Expecting users and operators to download encoded chunks might not be the best UX

- Added a log message in `ConfigureRPC()` to announce the start of the creation of
  the genesis file chunks
- Improved error message returned by `InitGenesisChunks()`
- `TestInitGenesisChunks() now checks that the original genesis doc and the genesis
  doc reassembled from its chuks match.
@alesforz alesforz requested a review from andynog October 18, 2024 09:53
Copy link
Collaborator

@andynog andynog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for addressing the comments and suggestions. Looks good to me.

@jmalicevic
Copy link
Collaborator

Are the genesis json files still needed for testing ?As you added strings, can they be removed from the PR ?

@alesforz
Copy link
Collaborator Author

alesforz commented Oct 23, 2024

Are the genesis json files still needed for testing ?As you added strings, can they be removed from the PR ?

@jmalicevic The files are still there to run the unit tests. They are fictitious though: genesis_ci.json is the default genesis we generate to run the e2e test ci.toml; genesis_big.json is a copy of genesis_ci.json to which I added meaningless strings to the app_state field in order to test with a big genesis. Neither have real addresses.

I can substitute the small genesis file with a string in the code, but the big genesis file needs to be there to unit test the chunking mechanism. I can reduce its size (it's about 42MB) if that's your concern.

@ValarDragon
Copy link
Contributor

Have not reviewed the PR, but am looking forward to this feature!

@alesforz
Copy link
Collaborator Author

alesforz commented Oct 24, 2024

Are the genesis json files still needed for testing ?As you added strings, can they be removed from the PR ?

@jmalicevic The files are still there to run the unit tests. They are fictitious though: genesis_ci.json is the default genesis we generate to run the e2e test ci.toml; genesis_big.json is a copy of genesis_ci.json to which I added meaningless strings to the app_state field in order to test with a big genesis. Neither have real addresses.

I can substitute the small genesis file with a string in the code, but the big genesis file needs to be there to unit test the chunking mechanism. I can reduce its size (it's about 42MB) if that's your concern.

@jmalicevic Updated the testing code to generate the big genesis docs at runtime rather than having a file in the repo 😄 See last couple of commits

Copy link
Collaborator

@jmalicevic jmalicevic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @alesforz !

Alessandro Sforzin and others added 2 commits October 25, 2024 14:44
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
@alesforz alesforz added this pull request to the merge queue Oct 25, 2024
Merged via the queue into main with commit d6a661a Oct 25, 2024
@alesforz alesforz deleted the alesforz/genesis-in-memory branch October 25, 2024 13:30
Eric-Warehime pushed a commit to Eric-Warehime/cometbft that referenced this pull request Oct 28, 2024
…e operation (cometbft#4247)

Closes cometbft#1290.

This PR tracks the work we are doing for cometbft#1290. 
For details, please refer to cometbft#1290 discussion.

### ToDo
- [x] Environment stores either a pointer to a GenesisDoc or the
genesis' chunks, but not both. Merged to this branch with cometbft#4235.
- [x]
[Node](https://github.com/cometbft/cometbft/blob/87a8c8948bf9926aff37f344b4120163f12cb4ad/node/node.go#L50)
does not keep a pointer to a GenesisDoc after initialization. Merged to
this branch with cometbft#4250.

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backlog A prioritized task in the team's backlog P:storage-optimization Priority: Give operators greater control over storage and storage optimization P:tech-debt Priority: Technical debt that needs to be paid off to enable us to move faster, reliably

Projects

No open projects
Status: Done

Development

Successfully merging this pull request may close these issues.

Do not keep the entire genesis file in memory during node operation

5 participants