Prioritize packet events over local events#2522
Conversation
Codecov ReportBase: 67.85% // Head: 67.21% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #2522 +/- ##
==========================================
- Coverage 67.85% 67.21% -0.65%
==========================================
Files 203 193 -10
Lines 30507 28648 -1859
Branches 5963 5628 -335
==========================================
- Hits 20701 19255 -1446
+ Misses 5113 4949 -164
+ Partials 4693 4444 -249
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
aa143e7 to
873ed9c
Compare
|
The network and simulation performance is significantly different with this change: https://github.com/shadow/benchmark-results/tree/master/tor/2022-11-07-T14-56-21 @robgjansen Any issues with this before I merge it? |
cbbc0f2 to
30da0d2
Compare
|
We compared a benchmark run of this PR (2022-11-14-T17-46-27) and the previous weekly benchmark run (2022-11-12-T03-50-23) using the host heartbeat messages and the scripts in We have found that the number of allocations and syscalls were significantly less in the PR benchmark run. PR: Weekly of main branch: |
|
With the new TGen benchmark, no change in simulated network performance: But we do still get the improvement in run time. |
|
@stevenengler please go ahead and merge at your convenience :) |
00f7b90 to
e91cb8c
Compare
|
We lowered the TGen speed threshold in #2633, so the tgen tests should now be more resilient to minor changes in packet scheduling. (We'll want to increase the speed threshold again when we think our net stack is stable enough to warrant it.) |
2cc78f3 to
5ab5d55
Compare
|
@robgjansen Re-marking you as reviewer. You've already reviewed the code once, but it's been a while so letting you have another look if you want. |
robgjansen
left a comment
There was a problem hiding this comment.
Here is my understanding of why we see the runtime performance improvement with this PR.
Prior to this PR, sometimes (depending on the host id) local events would be prioritized ahead of packet events that happen at the same instant, meaning that we could move a single packet event from the event queue to the router codel queue, which would cause a new local event to start the relay forward loop, which would get placed in front of other waiting packets events in the event queue. Effectively we would create a new relay forward event for every packet, and run the event to forward only one packet, and then repeat when the next packet gets added to the router codel queue again.
With this PR, all packets are consistently getting moved from the event queue to the router codel queue in one batch, and then after that we run the relay forward loop which can then forward as many packets as possible. This significantly reduces the number of event objects (and related push and pop operations) required to receive a set of packets that arrive at the same instant.
Previously, when the host would get the next event and there were multiple events at the same time, it would prioritize events with a lower source host ID. This means that sometimes packets would be prioritized over local events, and sometimes local events would be prioritized depending on the source host's ID relative to the current host's ID. This PR removes this inconsistency, and now packet events will always be prioritized over local events. This PR also splits the event into separate event types: packet events (that contain a packet object) and local events (that contain a task). This means that we can take a packet object directly from a packet event, which is more flexible than running an arbitrary task closure containing the packet object.
5ab5d55 to
5b72aab
Compare








Previously, when the host would get the next event and there were multiple events at the same time, it would prioritize events with a lower source host ID. This means that sometimes packets would be prioritized over local events, and sometimes local events would be prioritized depending on the source host's ID relative to the current host's ID. This PR removes this inconsistency, and now packet events will always be prioritized over local events.
This PR also splits the event into separate event types: packet events (that contain a packet object) and local events (that contain a task). This means that we can take a packet object directly from a packet event, which is more flexible than running an arbitrary task closure containing the packet object.