perf: avoid redundant slice allocation in leafHash assignment#301

Merged
tzdybal merged 1 commit into main from tzdybal/rm_leaf_copy on Aug 29, 2025

Conversation

Contributor

@tzdybal tzdybal commented Aug 28, 2025

Overview

To reduce the number of allocations, we can directly return precomputed leaf hashes (without copying them).

Benchmark results

```
$ go-perftuner benchstat before.txt after.txt
name                       old time/op    new time/op    delta
ComputeRoot/64-leaves-16      115µs ± 1%     112µs ± 0%   -2.80%  (p=0.008 n=5+5)
ComputeRoot/128-leaves-16     230µs ± 5%     223µs ± 1%        ~  (p=0.016 n=5+5)
ComputeRoot/256-leaves-16     456µs ± 1%     446µs ± 1%        ~  (p=0.008 n=5+5)
ComputeRoot/20k-leaves-16    49.7ms ± 1%    48.5ms ± 3%        ~  (p=0.095 n=5+5)

name                       old alloc/op   new alloc/op   delta
ComputeRoot/64-leaves-16     30.4kB ± 0%    27.4kB ± 0%  -10.09%  (p=0.008 n=5+5)
ComputeRoot/128-leaves-16    54.3kB ± 0%    48.1kB ± 0%  -11.32%  (p=0.008 n=5+5)
ComputeRoot/256-leaves-16     114kB ± 0%     102kB ± 0%  -10.73%  (p=0.008 n=5+5)
ComputeRoot/20k-leaves-16    12.2MB ± 0%    11.2MB ± 0%   -7.89%  (p=0.008 n=5+5)

name                       old allocs/op  new allocs/op  delta
ComputeRoot/64-leaves-16        402 ± 0%       338 ± 0%  -15.92%  (p=0.008 n=5+5)
ComputeRoot/128-leaves-16       788 ± 0%       660 ± 0%  -16.24%  (p=0.008 n=5+5)
ComputeRoot/256-leaves-16     1.56k ± 0%     1.30k ± 0%  -16.41%  (p=0.008 n=5+5)
ComputeRoot/20k-leaves-16      120k ± 0%      100k ± 0%  -16.64%  (p=0.008 n=5+5)
```

@tzdybal tzdybal self-assigned this Aug 28, 2025

@gemini-code-assist gemini-code-assist bot left a comment


Summary of Changes

Hello @tzdybal, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a performance optimization aimed at reducing memory allocations within the Namespaced Merkle Tree computation. By eliminating an unnecessary slice copy operation, the change improves efficiency and reduces the memory footprint, as evidenced by the provided benchmark results showing significant reductions in alloc/op and allocs/op.

Highlights

  • Performance Optimization: The computeRoot function in nmt.go was optimized to directly assign the leafHash instead of performing a redundant make([]byte, ...) and copy() operation. This change directly reduces memory allocations.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request aims to improve performance by avoiding redundant slice allocations when assigning leaf hashes. The change is effective, as shown by the benchmark results, reducing allocations by up to 16%. However, this performance gain comes at the cost of breaking encapsulation. By returning a direct slice of the tree's internal state, there's a risk of data corruption if the caller modifies the returned slice. My review includes a comment detailing this risk with a concrete example and suggests reverting to the safer implementation that returns a copy. While the performance improvement is notable, ensuring the library's robustness and preventing subtle bugs should be a higher priority.

```diff
 case 1:
-	leafHash := make([]byte, len(n.leafHashes[start]))
-	copy(leafHash, n.leafHashes[start])
+	leafHash := n.leafHashes[start]
```


Severity: high

This change avoids a slice allocation by returning a slice of the internal n.leafHashes. While this improves performance, it introduces a risk of data corruption. A caller of an exported method like Root() could modify the returned slice, which would alter the internal state of the tree.

For example:

  1. A tree with one leaf is created.
  2. Root() is called. It returns a slice pointing directly to n.leafHashes[0].
  3. The caller modifies the returned slice.
  4. Another leaf is pushed to the tree.
  5. Root() is called again. The new root will be computed using the modified leaf hash from step 3, resulting in an incorrect root.

This breaks the encapsulation of the NamespacedMerkleTree and can lead to subtle bugs. To prevent this, it's safer to return a copy of the leaf hash. If the performance gain is critical, this behavior should be explicitly documented in all affected public methods (e.g., Root, Get, MinNamespace) to warn users against modifying the returned slices.

Suggested change

```diff
-leafHash := n.leafHashes[start]
+leafHash := make([]byte, len(n.leafHashes[start]))
+copy(leafHash, n.leafHashes[start])
```

Member


I didn't find the issue or PR, but iirc we had some issues before similar to the above comment 🤔

in theory likely not

Contributor Author


I looked into the git history, and copying was introduced alongside pre-calculating leaf hashes on Push.
"Encapsulation" also doesn't hold for Root - we're already returning a cached value (an internal slice).

I totally agree with:

this behavior should be explicitly documented in all affected public methods (e.g., Root, Get, MinNamespace) to warn users against modifying the returned slices.

Member

@rach-id rach-id left a comment


LGTM, given that the root computation tests are passing and the proofs are generated correctly.

@tzdybal tzdybal merged commit 6f32fad into main Aug 29, 2025
8 checks passed
@tzdybal tzdybal deleted the tzdybal/rm_leaf_copy branch August 29, 2025 07:24
