datapath/neighbor: Add ratelimit to desired neighbor calculation #43928
/test |
008c660 to 7babedf
|
/test |
7babedf to 565ca3d
|
/test |
|
@cilium/sig-datapath any chance of getting a review 👋 ? |
smagnani96
left a comment
Code changes LGTM.
Left a comment on the interval: if you're seeking feedback on the specific value chosen (15 seconds), I'd ask you to pull in other reviewers.
|
@ldelossa do you have bandwidth to help with @cilium/sig-datapath reviews? If not, you might consider either marking your profile as "busy" on GitHub or (temporarily) stepping out of the team until you have the bandwidth. |
joestringer
left a comment
Some nuances and nits discussed above, but I'll leave it up to you if any of the ideas discussed in the threads are useful to integrate (either in this PR or a subsequent one).
|
(Ready but deferring to @dylandreimerink to resolve comments and apply the PR) |
a2839e2 to 076a22d
|
/test |
076a22d to 97d1773
|
/test |
During investigation of a memory leak in v1.18, one of the pprof profiles showed a high amount of memory usage in `netlink/nl.(*NetlinkSocket).Receive`. See cilium#41623 (comment).

This is most likely due to a lack of rate limiting in the desired neighbor calculation which does a lot of netlink requests to get next hops.

So this commit limits desired neighbor calculation to once every 15 seconds. In the worst case scenario where the default gateway changes, XDP might not be able to forward traffic for up to 15 seconds. Such a scenario should only happen when configuration changes are made or when the network topology changes, and thus this seems an acceptable tradeoff.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
97d1773 to b32a781
|
/test |
During investigation of a memory leak in v1.18, one of the pprof profiles showed a high amount of memory usage in `netlink/nl.(*NetlinkSocket).Receive`. See #41623 (comment).
This is most likely due to a lack of rate limiting in the desired neighbor calculation which does a lot of netlink requests to get next hops.
So this PR limits desired neighbor calculation to once every 15 seconds. In the worst case scenario where the default gateway changes, XDP might not be able to forward traffic for up to 15 seconds. Such a scenario should only happen when configuration changes are made or when the network topology changes, and thus this seems an acceptable tradeoff.