Skip to content

perf(txindex): Lower allocation overhead of txIndex matchRange#2839

Merged
melekes merged 3 commits intocometbft:mainfrom
osmosis-labs:dev/lower_allocation_of_match_range
Apr 24, 2024
Merged

perf(txindex): Lower allocation overhead of txIndex matchRange#2839
melekes merged 3 commits intocometbft:mainfrom
osmosis-labs:dev/lower_allocation_of_match_range

Conversation

@ValarDragon
Copy link
Contributor

@ValarDragon ValarDragon commented Apr 18, 2024

In Osmosis we see massive amounts of heap pressure/allocations coming from txIndex matchRange. (Screenshot below from ~1 hour of heap profiling)

image

This PR is expected to fully compatibly drop this down by a factor of 3. It:

  • Does not get Key() twice (160GB allocation saved)
  • Uses no heap allocations for isTagKey (120GB saved)
  • Does not string cast or do strings.Split in parsing the value (~400GB expected saved)
  • Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change. The remaining RAM overhead from extracting the value can be saved with an unsafe call for casting the output to string with no heap allocation, but we can do that in a separate PR.


PR checklist

  • Tests written/updated - All existing tests still apply
  • Changelog entry added in .changelog (we use unclog to manage our changelog)
  • Updated relevant documentation (docs/ or spec/) and code comments
  • Title follows the Conventional Commits spec

@ValarDragon ValarDragon requested a review from a team as a code owner April 18, 2024 05:42
@ValarDragon ValarDragon requested a review from a team April 18, 2024 05:42
Copy link
Collaborator

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ValarDragon ❤️

LOOP:
for ; it.Valid(); it.Next() {
if !isTagKey(it.Key()) {
// TODO: We need to make a function for getting it.Key() as a byte slice with no copies.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind opening an issue for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Do you want me to delete the comment as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cometbft-db issue opened: cometbft/cometbft-db#156

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Do you want me to delete the comment as well?

no

@melekes melekes added backport-to-v0.37.x backport-to-v0.38.x Tell Mergify to backport the PR to v0.38.x labels Apr 18, 2024
@adizere adizere added this to the 2024-Q2 milestone Apr 18, 2024
@cason cason added storage P:storage-optimization Priority: Give operators greater control over storage and storage optimization labels Apr 18, 2024
@ValarDragon
Copy link
Contributor Author

Sorry for adding another commit after review 😅 . I noticed in the heap profile another source of memory overhead. Its much much lower in magnitude, but the fix was easy so figured I'd still add it before doing a second round of post-fix re-benchmarking.
image

This commit removes the Key call within setTmpHashes and most allocations from ExtractHeightFromKey, yielding another ~5GB of allocation reduction

Copy link

@cason cason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this.

I couldn't find (no test performed, only by reading) operational differences, while the changes probably reduce the memory footprint.

return 0, errors.New("second last separator not found")
}

return strconv.ParseInt(parts[len(parts)-2], 10, 64)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did this work until now?

@melekes melekes added this pull request to the merge queue Apr 24, 2024
Merged via the queue into cometbft:main with commit e75267f Apr 24, 2024
mergify bot pushed a commit that referenced this pull request Apr 24, 2024
In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)

![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

#### PR checklist

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit e75267f)
mergify bot pushed a commit that referenced this pull request Apr 24, 2024
In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)

![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

#### PR checklist

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit e75267f)

# Conflicts:
#	state/txindex/kv/kv.go
mergify bot pushed a commit that referenced this pull request Apr 24, 2024
In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)

![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

#### PR checklist

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit e75267f)

# Conflicts:
#	.changelog/v0.37.5/improvements/2839-tx_index-lower-heap-allocation.md
#	state/txindex/kv/kv.go
melekes pushed a commit that referenced this pull request Apr 24, 2024
…ort #2839) (#2882)

In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)


![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

#### PR checklist

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request #2839 done by
[Mergify](https://mergify.com).

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
melekes added a commit that referenced this pull request Apr 24, 2024
…ort #2839) (#2883)

In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)


![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

#### PR checklist

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request #2839 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
melekes added a commit that referenced this pull request Apr 24, 2024
…ort #2839) (#2884)

In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)


![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

#### PR checklist

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request #2839 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
czarcas7ic pushed a commit to osmosis-labs/cometbft that referenced this pull request Apr 29, 2024
…ort cometbft#2839) (cometbft#2884)

In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)

![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request cometbft#2839 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
czarcas7ic added a commit to osmosis-labs/cometbft that referenced this pull request Apr 30, 2024
#27)

* perf(txindex): Lower allocation overhead of txIndex matchRange (backport cometbft#2839) (cometbft#2884)

In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)

![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request cometbft#2839 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>

* add changelog

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
mergify bot added a commit to osmosis-labs/cometbft that referenced this pull request Apr 30, 2024
#27)

* perf(txindex): Lower allocation overhead of txIndex matchRange (backport cometbft#2839) (cometbft#2884)

In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)

![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request cometbft#2839 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>

* add changelog

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
(cherry picked from commit efd1ea2)
czarcas7ic added a commit to osmosis-labs/cometbft that referenced this pull request Apr 30, 2024
#27) (#31)

* perf(txindex): Lower allocation overhead of txIndex matchRange (backport cometbft#2839) (cometbft#2884)

In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)

![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request cometbft#2839 done by
[Mergify](https://mergify.com).

---------

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>

* add changelog

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
(cherry picked from commit efd1ea2)

Co-authored-by: Adam Tucker <adam@osmosis.team>
mattac21 pushed a commit that referenced this pull request Sep 5, 2025
In Osmosis we see massive amounts of heap pressure/allocations coming
from txIndex matchRange. (Screenshot below from ~1 hour of heap
profiling)

![image](https://github.com/cometbft/cometbft/assets/6440154/bf2dfe89-56f0-4824-815b-c5822d20568b)

This PR is expected to fully compatibly drop this down by a factor of 3.
It:
- Does not get Key() twice (160GB allocation saved)
- Uses no heap allocations for isTagKey (120GB saved)
- Does not string cast or do strings.Split in parsing the value (~400GB
expected saved)
- Reuses the big.Int (24GB saved)

The remaining RAM overhead from .Key() needs a cometbft-db API change.
The remaining RAM overhead from extracting the value can be saved with
an unsafe call for casting the output to string with no heap allocation,
but we can do that in a separate PR.

---

- [x] Tests written/updated - All existing tests still apply
- [x] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-to-v0.38.x Tell Mergify to backport the PR to v0.38.x P:storage-optimization Priority: Give operators greater control over storage and storage optimization storage

Projects

No open projects
Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants