Skip to content

perf(p2p): Delete expensive debug log already slated for deletion#3412

Merged
melekes merged 1 commit intomainfrom
dev/delete_debug_log
Jul 8, 2024
Merged

perf(p2p): Delete expensive debug log already slated for deletion#3412
melekes merged 1 commit intomainfrom
dev/delete_debug_log

Conversation

@ValarDragon
Copy link
Contributor

@ValarDragon ValarDragon commented Jul 3, 2024

Delete an expensive debug log that has already been slated for deletion. On the osmosis fork (with many other debug logs removed), this debug log ended up taking 1.5% of all CPU time when we had many large blocks. (All from outbound gossip curiously enough) This caused 12GB of total memory allocation over an hour in that benchmark (out of 180GB total. With other optimizations in PR's / merged since that benchmark, would have been 12GB / ~120GB total)

I am just making a PR to delete it, since it already had a code comment indicating we should delete it later. (Rather than waiting for #2989

Note that under normal benchmarks (e.g. the last one I did), this is fairly low impact. (68MB over an hour, out of 69 GB, so .1%)


PR checklist

  • Tests written/updated
  • Changelog entry added in .changelog (we use unclog to manage our changelog) - didn't add one, since its just deleting a debug log
  • Updated relevant documentation (docs/ or spec/) and code comments
  • Title follows the Conventional Commits spec

@ValarDragon ValarDragon requested a review from a team as a code owner July 3, 2024 17:15
@ValarDragon ValarDragon requested a review from a team July 3, 2024 17:15
@ValarDragon ValarDragon added the backport-to-v0.38.x Tell Mergify to backport the PR to v0.38.x label Jul 3, 2024
Copy link
Collaborator

@hvanz hvanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@cason
Copy link

cason commented Jul 4, 2024

How often this code is hit? Because in most-cases we assume that the channel is present in both sender or receivers.

@ValarDragon
Copy link
Contributor Author

ValarDragon commented Jul 4, 2024

This is used in every single packet send! Would be great if we can avoid even needing this check in the future, this map access is actually still showing in CPU profiles after the debug log is removed, since there are so many packets.

@cason
Copy link

cason commented Jul 4, 2024

Actually, this code is only run if we don't return true, which means that the channel does not exist.

@ValarDragon
Copy link
Contributor Author

Ah your right. Here is a profile where we are seeing it taking a large amount of total osmosis load:
image

This is the same profile where we saw gossipDataRoutine taking shockingly long amounts of time (and I suspected busy waiting) #3414 , maybe this helps us tell us the problem?

Copy link
Collaborator

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ValarDragon ❤️

@melekes
Copy link
Collaborator

melekes commented Jul 5, 2024

Actually, this code is only run if we don't return true, which means that the channel does not exist.

I don't understand it either. If all nodes have the same set of channels, then this debug line shouldn't ever be reached.

@hvanz
Copy link
Collaborator

hvanz commented Jul 5, 2024

I would also change channels []byte to channels map[byte]struct{} so we don't have to traverse the array every time we send.

@cason
Copy link

cason commented Jul 5, 2024

I would also change channels []byte to channels map[byte]struct{} so we don't have to traverse the array every time we send.

I am pretty sure that traversing a byte array is faster than using a map. Almost 100% sure. We could use the bytes package though.

@melekes melekes removed backport-to-v0.38.x Tell Mergify to backport the PR to v0.38.x backport-to-v1.x labels Jul 8, 2024
@melekes
Copy link
Collaborator

melekes commented Jul 8, 2024

I'm going to merge this, but we still need to understand where inefficiency is coming from #3412 (comment)

@melekes melekes added this pull request to the merge queue Jul 8, 2024
Merged via the queue into main with commit 1964898 Jul 8, 2024
@melekes melekes deleted the dev/delete_debug_log branch July 8, 2024 08:54
@melekes
Copy link
Collaborator

melekes commented Jul 10, 2024

@mergify backport v1.x

@mergify
Copy link
Contributor

mergify bot commented Jul 10, 2024

backport v1.x

✅ Backports have been created

Details

mergify bot pushed a commit that referenced this pull request Jul 10, 2024
)

Delete an expensive debug log that has already been slated for deletion.
On the osmosis fork (with many other debug logs removed), this debug log
ended up taking 1.5% of all CPU time when we had many large blocks. (All
from outbound gossip curiously enough) This caused 12GB of total memory
allocation over an hour in that benchmark (out of 180GB total. With
other optimizations in PR's / merged since that benchmark, would have
been 12GB / ~120GB total)

I am just making a PR to delete it, since it already had a code comment
indicating we should delete it later. (Rather than waiting for #2989

Note that under normal benchmarks (e.g. the last one I did), this is
fairly low impact. (68MB over an hour, out of 69 GB, so .1%)

---

#### PR checklist

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog) - didn't add one, since its just deleting a debug log
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec

(cherry picked from commit 1964898)
melekes added a commit that referenced this pull request Jul 10, 2024
…ckport #3412) (#3488)

Delete an expensive debug log that has already been slated for deletion.
On the osmosis fork (with many other debug logs removed), this debug log
ended up taking 1.5% of all CPU time when we had many large blocks. (All
from outbound gossip curiously enough) This caused 12GB of total memory
allocation over an hour in that benchmark (out of 180GB total. With
other optimizations in PR's / merged since that benchmark, would have
been 12GB / ~120GB total)

I am just making a PR to delete it, since it already had a code comment
indicating we should delete it later. (Rather than waiting for #2989

Note that under normal benchmarks (e.g. the last one I did), this is
fairly low impact. (68MB over an hour, out of 69 GB, so .1%)

---

#### PR checklist

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog) - didn't add one, since its just deleting a debug log
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
<hr>This is an automatic backport of pull request #3412 done by
[Mergify](https://mergify.com).

Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com>
Co-authored-by: Anton Kaliaev <anton.kalyaev@gmail.com>
mattac21 pushed a commit that referenced this pull request Sep 9, 2025
)

Delete an expensive debug log that has already been slated for deletion.
On the osmosis fork (with many other debug logs removed), this debug log
ended up taking 1.5% of all CPU time when we had many large blocks. (All
from outbound gossip curiously enough) This caused 12GB of total memory
allocation over an hour in that benchmark (out of 180GB total. With
other optimizations in PR's / merged since that benchmark, would have
been 12GB / ~120GB total)

I am just making a PR to delete it, since it already had a code comment
indicating we should delete it later. (Rather than waiting for #2989

Note that under normal benchmarks (e.g. the last one I did), this is
fairly low impact. (68MB over an hour, out of 69 GB, so .1%)

---

#### PR checklist

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog) - didn't add one, since its just deleting a debug log
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
mattac21 pushed a commit that referenced this pull request Sep 10, 2025
)

Delete an expensive debug log that has already been slated for deletion.
On the osmosis fork (with many other debug logs removed), this debug log
ended up taking 1.5% of all CPU time when we had many large blocks. (All
from outbound gossip curiously enough) This caused 12GB of total memory
allocation over an hour in that benchmark (out of 180GB total. With
other optimizations in PR's / merged since that benchmark, would have
been 12GB / ~120GB total)

I am just making a PR to delete it, since it already had a code comment
indicating we should delete it later. (Rather than waiting for #2989

Note that under normal benchmarks (e.g. the last one I did), this is
fairly low impact. (68MB over an hour, out of 69 GB, so .1%)

---

#### PR checklist

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog) - didn't add one, since its just deleting a debug log
- [x] Updated relevant documentation (`docs/` or `spec/`) and code
comments
- [x] Title follows the [Conventional
Commits](https://www.conventionalcommits.org/en/v1.0.0/) spec
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

No open projects
Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants