
perf: use nmt buffering for root calculation #381

Merged
mcrakhman merged 2 commits into main from mcrakhman/buffered-nmt
Sep 24, 2025

Conversation


@mcrakhman mcrakhman commented Sep 19, 2025

Overview

Use buffering approach during root calculation:

  • each tree has a buffer for leaves (in the wrapper) and NMT hashes (in the tree)
  • we limit parallel operations during root computation, so each tree is reused within a single goroutine and its buffers are allocated only once
  • the trees are managed by a tree pool
  • the buffers can be reused for the next square
  • trees can be reused for smaller square sizes; for larger sizes the buffers are reallocated on the fly

On subsequent reuses this gives roughly 95% fewer allocations, about 95% less memory usage, and a ~2x speedup.

Note: this PR relies on an improved NMT version that supports buffer reuse for hashes: celestiaorg/nmt#305

Benchmarks

Without buffering

cpu: Apple M1 Max
BenchmarkEDSRootsWithErasuredNMT
BenchmarkEDSRootsWithErasuredNMT/32x32x512_ODS=0MB,_EDS=0MB
BenchmarkEDSRootsWithErasuredNMT/32x32x512_ODS=0MB,_EDS=0MB-10         	     381	   2894619 ns/op	 8566885 B/op	   44171 allocs/op
BenchmarkEDSRootsWithErasuredNMT/64x64x512_ODS=2MB,_EDS=8MB
BenchmarkEDSRootsWithErasuredNMT/64x64x512_ODS=2MB,_EDS=8MB-10         	     151	   7806400 ns/op	33918235 B/op	  170768 allocs/op
BenchmarkEDSRootsWithErasuredNMT/128x128x512_ODS=8MB,_EDS=32MB
BenchmarkEDSRootsWithErasuredNMT/128x128x512_ODS=8MB,_EDS=32MB-10      	      42	  26461898 ns/op	135224424 B/op	  670235 allocs/op
BenchmarkEDSRootsWithErasuredNMT/256x256x512_ODS=32MB,_EDS=128MB
BenchmarkEDSRootsWithErasuredNMT/256x256x512_ODS=32MB,_EDS=128MB-10    	      10	 105217712 ns/op	544944785 B/op	 2653211 allocs/op
BenchmarkEDSRootsWithErasuredNMT/512x512x512_ODS=128MB,_EDS=512MB
BenchmarkEDSRootsWithErasuredNMT/512x512x512_ODS=128MB,_EDS=512MB-10   	       3	 397317458 ns/op	2186366608 B/op	10559819 allocs/op
BenchmarkEDSRootsWithErasuredNMT/1024x1024x512_ODS=512MB,_EDS=2048MB
BenchmarkEDSRootsWithErasuredNMT/1024x1024x512_ODS=512MB,_EDS=2048MB-10         	       1	2713678708 ns/op	8714358288 B/op	42129390 allocs/op

With buffering

cpu: Apple M1 Max
BenchmarkEDSRootsWithBufferedErasuredNMT
BenchmarkEDSRootsWithBufferedErasuredNMT/32x32x512_ODS=0MB,_EDS=0MB
BenchmarkEDSRootsWithBufferedErasuredNMT/32x32x512_ODS=0MB,_EDS=0MB-10         	    1626	    839702 ns/op	  328908 B/op	    1348 allocs/op
BenchmarkEDSRootsWithBufferedErasuredNMT/64x64x512_ODS=2MB,_EDS=8MB
BenchmarkEDSRootsWithBufferedErasuredNMT/64x64x512_ODS=2MB,_EDS=8MB-10         	     384	   2951978 ns/op	 1283513 B/op	    2948 allocs/op
BenchmarkEDSRootsWithBufferedErasuredNMT/128x128x512_ODS=8MB,_EDS=32MB
BenchmarkEDSRootsWithBufferedErasuredNMT/128x128x512_ODS=8MB,_EDS=32MB-10      	     115	  10319029 ns/op	 5000122 B/op	    6404 allocs/op
BenchmarkEDSRootsWithBufferedErasuredNMT/256x256x512_ODS=32MB,_EDS=128MB
BenchmarkEDSRootsWithBufferedErasuredNMT/256x256x512_ODS=32MB,_EDS=128MB-10    	      22	  52192040 ns/op	19454644 B/op	   13828 allocs/op
BenchmarkEDSRootsWithBufferedErasuredNMT/512x512x512_ODS=128MB,_EDS=512MB
BenchmarkEDSRootsWithBufferedErasuredNMT/512x512x512_ODS=128MB,_EDS=512MB-10   	       6	 171740514 ns/op	80885168 B/op	   29700 allocs/op
BenchmarkEDSRootsWithBufferedErasuredNMT/1024x1024x512_ODS=512MB,_EDS=2048MB
BenchmarkEDSRootsWithBufferedErasuredNMT/1024x1024x512_ODS=512MB,_EDS=2048MB-10         	       2	 699427729 ns/op	329695408 B/op	   69636 allocs/op

@mcrakhman mcrakhman marked this pull request as draft September 19, 2025 11:16
@gemini-code-assist

Summary of Changes

Hello @mcrakhman, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the performance of NMT root calculations within the data square by introducing a buffered NMT pooling mechanism. This change aims to minimize memory allocations and optimize concurrent processing, leading to more efficient extended data square computations. The new implementation includes a configurable parallelism limit and is thoroughly tested for correctness and performance benefits.

Highlights

  • Performance Optimization: Introduced a buffering approach for Namespace Merkle Tree (NMT) root calculation to reduce memory allocations and improve performance, especially in high-throughput scenarios.
  • NMT Pooling: Implemented a treePool to manage and reuse resizeableBufferTree instances, which are NMTs with pre-allocated buffers for share data.
  • Controlled Parallelism: Added parallelOps to the dataSquare structure and computeRoots method, allowing for a configurable limit on concurrent operations during root computation using errgroup.Group.SetLimit.
  • New API for Buffered Computation: Provided a new function ComputeExtendedDataSquareWithBuffer that leverages the buffered NMTs and controlled parallelism for extended data square computation.
  • Comprehensive Testing and Benchmarking: Added new benchmarks (BenchmarkEDSRootsWithBufferedErasuredNMT) and correctness tests (TestNmtVsBufferedNmtSameRoot, TestComputeExtendedDataSquareVsWithBuffer) to validate the new buffered approach and compare its performance and results against the standard method.
  • Dependency Update: Updated the github.com/celestiaorg/nmt dependency to a newer version to support the new buffering features.
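The bounded-parallelism idea behind parallelOps can be sketched with the standard library alone. The PR itself uses errgroup.Group.SetLimit; this hypothetical computeRoots uses a buffered channel as a semaphore, and plain int "roots" stand in for real NMT roots:

```go
package main

import (
	"fmt"
	"sync"
)

// computeRoots runs at most parallelOps workers at once, so a pool of
// parallelOps trees suffices for the whole square. The root function here
// is a placeholder for the per-row/column NMT root computation.
func computeRoots(rows [][]byte, parallelOps int, root func([]byte) int) []int {
	roots := make([]int, len(rows))
	sem := make(chan struct{}, parallelOps)
	var wg sync.WaitGroup
	for i, row := range rows {
		wg.Add(1)
		sem <- struct{}{} // blocks while parallelOps workers are active
		go func(i int, row []byte) {
			defer wg.Done()
			defer func() { <-sem }()
			roots[i] = root(row)
		}(i, row)
	}
	wg.Wait()
	return roots
}

func main() {
	rows := [][]byte{{1, 2}, {3, 4}, {5, 6}}
	sum := func(b []byte) int {
		n := 0
		for _, v := range b {
			n += int(v)
		}
		return n
	}
	fmt.Println(computeRoots(rows, 2, sum)) // prints [3 7 11]
}
```

Capping concurrency is what makes the buffer reuse safe: each in-flight worker holds its own tree, so no buffer is ever written by two goroutines at once.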


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a buffering and pooling mechanism for NMT root calculation to improve performance by reducing memory allocations. This is achieved through a treePool of resizeableBufferTrees, with concurrency managed by a new parallelOps setting. The changes are logical and supported by new benchmarks and correctness tests that validate the buffered approach against the existing implementation.

I've found one critical issue and one of medium severity:

  1. The new file nmtbuffered_tree_test.go contains production code but is named as a test file, which will cause build issues.
  2. There's a redundant check in the Push method of resizeableBufferTree.

Details are in the review comments below. Once these are addressed, this looks like a solid performance improvement.

Comment on lines +129 to +131
if nidDataLen > t.bufferEntrySize {
	return fmt.Errorf("data is too large to be used with allocated buffer")
}


Severity: medium

This check is redundant and can be removed. nidDataLen is calculated as t.namespaceSize + len(data), and t.bufferEntrySize is t.namespaceSize + shareSize. The newDataSquare function already ensures that len(data) is equal to shareSize for all non-nil shares. Therefore, nidDataLen will always be equal to t.bufferEntrySize, and this condition will never be met.

evan-forbes previously approved these changes Sep 19, 2025
rach-id previously approved these changes Sep 19, 2025
@mcrakhman mcrakhman marked this pull request as ready for review September 22, 2025 22:08
rootulp previously approved these changes Sep 23, 2025
@mcrakhman mcrakhman dismissed stale reviews from rootulp, rach-id, and evan-forbes via f9a870d September 24, 2025 16:21
@mcrakhman mcrakhman merged commit 97ea8c7 into main Sep 24, 2025
6 checks passed
@mcrakhman mcrakhman deleted the mcrakhman/buffered-nmt branch September 24, 2025 18:30
