Skip to content

chore(RPC): Environment stores only one in-memory copy of the Genesis#4235

Merged
alesforz merged 14 commits intoalesforz/genesis-in-memoryfrom
alesforz/environment-one-genesis-copy
Oct 4, 2024
Merged

chore(RPC): Environment stores only one in-memory copy of the Genesis#4235
alesforz merged 14 commits intoalesforz/genesis-in-memoryfrom
alesforz/environment-one-genesis-copy

Conversation

@alesforz
Copy link
Collaborator

@alesforz alesforz commented Oct 2, 2024

Closes #4234.

Changes

The Environment type will:

  • Store a pointer to a GenesisDoc if the genesis file size is below a threshold currently set to 16 MB.
  • If the genesis file exceeds this threshold, it will split it into 16 MB chunks and store the hashes of each chunk in memory in a slice. In this case, the pointer to the GenesisDoc will be nil.

Because the InitGenesisChunks method of Environment is responsible for creating the chunks, most of the changes of this PR focus on refactoring this method and adding unit tests for it.


PR checklist

  • Tests written/updated
  • Changelog entry added in .changelog (we use unclog to manage our changelog)
  • Updated relevant documentation (docs/ or spec/) and code comments

Alessandro added 8 commits October 2, 2024 16:37
- return an error if ptr to GenesisDoc is nil
- return early without chunking if GenesisDoc is <= 16MB
- initialize the chunks slice with a size hint for better performance
- set the ptr to GenesisDoc to nil if GenesisDoc > 16MB, that is, we compute and
  cache the genesis's chunks.
…e chunks slice

rather than checking if it's nil.
@alesforz alesforz self-assigned this Oct 2, 2024
@alesforz alesforz added backlog A prioritized task in the team's backlog P:tech-debt Priority: Technical debt that needs to be paid off to enable us to move faster, reliably P:storage-optimization Priority: Give operators greater control over storage and storage optimization labels Oct 2, 2024
@alesforz
Copy link
Collaborator Author

alesforz commented Oct 2, 2024

Additional Context

The changes we want to introduce in this PR would break the behavior of the /genesis_chunked endpoint.

First a reminder of how our RPC APIs that return the genesis work:

  • we have set a size threshold of 2 MB to determine whether the genesis file should be chunked.
  • /genesis: if the genesis size is smaller than the threshold, it returns the entire genesis encoded in JSON. If it is larger, it returns an error suggesting the user to query the /genesis_chunked endpoint instead.
  • /genesis_chunked: returns the genesis file one chunk at a time. The caller is responsible for making multiple calls to /genesis_chunked to rebuild the entire genesis from the chunks.

In the current implementation, the InitGenesisChunks method (IMHO incorrectly) creates a slice of chunks even if the genesis file is less than the threshold. This slice contains only one element: the entire genesis encoded in base64.

Theoretically, when the genesis file is smaller than the threshold, it should be retrieved using the /genesis endpoint since there are no chunks at all. However, because the current implementation of InitGenesisChunks creates a slice of chunks regardless of the genesis size, it is possible to retrieve a small genesis using the /genesis_chunked endpoint by passing 0 as the chunk ID. The caller will receive the entire genesis encoded in base64.

In contrast, our PR's changes prevent the creation of a slice of chunks if the genesis file is smaller than the threshold (see #4234 for reasons why). This would break the /genesis_chunked endpoint for small genesis files. Because we don't want to break public APIs that users may depend upon, we have introduced code to maintain the existing functionality of /genesis_chunked for small genesis files while implementing our changes.

The changes are here. To maintain compatibility, if the chunks are absent but the given chunk ID is 0, we encode the genesis file to base64 and send it to the user. However, if the chunks are absent and the given chunk ID is greater than 0, we return the standard error as usual.

Alessandro added 4 commits October 2, 2024 18:33
@alesforz alesforz requested a review from jmalicevic October 2, 2024 18:10
@alesforz alesforz marked this pull request as ready for review October 2, 2024 18:23
@alesforz alesforz requested a review from a team as a code owner October 2, 2024 18:23
@alesforz alesforz requested review from a team and andynog October 2, 2024 18:23
@alesforz alesforz requested a review from melekes October 4, 2024 07:40
@alesforz alesforz merged commit 5f3323a into alesforz/genesis-in-memory Oct 4, 2024
@alesforz alesforz deleted the alesforz/environment-one-genesis-copy branch October 4, 2024 10:03
alesforz pushed a commit that referenced this pull request Oct 16, 2024
…is (#4235)

Closes #4234.

### Changes

The [`Environment`](https://github.com/cometbft/cometbft/blob/8273634cdefb2fb22765a14f9acb9fca608e9982/rpc/core/env.go#L72) type
will:
- Store a pointer to a `GenesisDoc` if the genesis file size is below a
threshold currently set to 16 MB.
- If the genesis file exceeds this threshold, it will split it into
16 MB chunks and store the hashes of each chunk in memory in a slice. In
this case, the pointer to the `GenesisDoc` will be `nil`.

Because the `InitGenesisChunks` method of `Environment` is responsible
for creating the chunks, most of the changes of this PR focus on
refactoring this method and adding unit tests for it.

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
github-merge-queue bot pushed a commit that referenced this pull request Oct 25, 2024
…e operation (#4247)

Closes #1290.

This PR tracks the work we are doing for #1290. 
For details, please refer to #1290 discussion.

### ToDo
- [x] Environment stores either a pointer to a GenesisDoc or the
genesis' chunks, but not both. Merged to this branch with #4235.
- [x]
[Node](https://github.com/cometbft/cometbft/blob/87a8c8948bf9926aff37f344b4120163f12cb4ad/node/node.go#L50)
does not keep a pointer to a GenesisDoc after initialization. Merged to
this branch with #4250.

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
Eric-Warehime pushed a commit to Eric-Warehime/cometbft that referenced this pull request Oct 28, 2024
…e operation (cometbft#4247)

Closes cometbft#1290.

This PR tracks the work we are doing for cometbft#1290. 
For details, please refer to cometbft#1290 discussion.

### ToDo
- [x] Environment stores either a pointer to a GenesisDoc or the
genesis' chunks, but not both. Merged to this branch with cometbft#4235.
- [x]
[Node](https://github.com/cometbft/cometbft/blob/87a8c8948bf9926aff37f344b4120163f12cb4ad/node/node.go#L50)
does not keep a pointer to a GenesisDoc after initialization. Merged to
this branch with cometbft#4250.

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backlog A prioritized task in the team's backlog P:storage-optimization Priority: Give operators greater control over storage and storage optimization P:tech-debt Priority: Technical debt that needs to be paid off to enable us to move faster, reliably

Projects

No open projects
Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants