policy: introduce lazy ordered/sorted selections #43368
odinuge wants to merge 4 commits into cilium:main from
Conversation
Force-pushed 9e9b35c to 3196140
/test
Force-pushed 1637272 to 35f4c45
/test
pkg/policy/selectorcache_selector.go (Outdated)

  	id    types.SelectorId
  	users map[CachedSelectionUser]struct{}
- 	cachedSelections map[identity.NumericIdentity]struct{}
+ 	cachedSelections set.Set[identity.NumericIdentity]
Looking into using part.Set[identity.NumericIdentity] here as well. It will most likely perform more individual allocations than the current version in main, but it should be faster and allocate less total memory.
Benchmarks using #43407: identity updates are much faster. The allocation count is significantly higher, but the reduction in CPU time and total allocated bytes is even more significant.
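To build intuition for why the benchmark moves this way, here is a rough, unscientific sketch comparing the per-update cost of the old approach (copy and sort the whole selections slice on every update) against the new one (touch an entry in a set). All names here are hypothetical; this is not the cilium benchmark code from #43407.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// n approximates a single selector matching ~20k identities
// (e.g. the 'cluster' entity case described in this PR).
const n = 20000

// measure times `updates` identity updates under both strategies.
func measure(updates int) (eager, lazy time.Duration) {
	base := make([]uint32, n)
	set := make(map[uint32]struct{}, n)
	for i := range base {
		base[i] = uint32(n - i) // unsorted input
		set[uint32(i)] = struct{}{}
	}

	start := time.Now()
	for i := 0; i < updates; i++ {
		ids := append([]uint32(nil), base...) // copy, as before
		ids = append(ids, uint32(n+i))
		sort.Slice(ids, func(a, b int) bool { return ids[a] < ids[b] })
	}
	eager = time.Since(start)

	start = time.Now()
	for i := 0; i < updates; i++ {
		set[uint32(i%n)] = struct{}{} // set update; sorting is deferred
	}
	lazy = time.Since(start)
	return eager, lazy
}

func main() {
	eager, lazy := measure(50)
	fmt.Printf("eager (sort per update): %v, lazy (set update): %v\n", eager, lazy)
}
```

At ~5 updates/sec against a 20k-entry selection, the eager path pays the sort on every update while the lazy path pays it only when a consumer actually asks for the ordered view.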
This introduces a separate sorted structure that can be used by the components that need it. Previously all selections were sorted, which is very costly in the hot path of selection updates, as sorting 10k+ element slices is expensive. We still create a copy of the selections, but no longer sort it.

In theory this can increase memory usage, since we now keep a set (in practice a map) plus the sorted slice when it is requested. In practice few consumers need the sorted version, notably the dnsproxy and some l7/xds configs. It is not clear that even those always need it, and in the cases where they do, they already keep it in memory, so the difference is small in practice. Most of the datapath consumes incremental updates and does not rely on the sorted version.

It is hard to benchmark this directly without cherry-picking results, but in a simple test case where a single selector selects around 20k identities (e.g. the 'cluster' entity) that are updated about 5 times per second, a lot of CPU cycles are spent sorting that slice.

Signed-off-by: Odin Ugedal <odin@ugedal.com>
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
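The design described in the commit message can be sketched as a set that invalidates a cached sorted slice on writes and rebuilds it only on demand. This is a minimal illustration of the idea, not the actual cilium code; the type and method names are hypothetical simplifications.

```go
package main

import (
	"fmt"
	"sort"
)

// NumericIdentity stands in for cilium's identity.NumericIdentity
// (hypothetical simplification for this sketch).
type NumericIdentity uint32

// selections keeps the selected identities in a plain set and only
// materializes a sorted slice when a caller actually asks for one.
type selections struct {
	set    map[NumericIdentity]struct{}
	sorted []NumericIdentity // lazily built; nil means "not computed yet"
}

func newSelections() *selections {
	return &selections{set: make(map[NumericIdentity]struct{})}
}

// add updates the set in O(1) and invalidates the cached sorted slice,
// so the hot update path never pays for sorting.
func (s *selections) add(id NumericIdentity) {
	if _, ok := s.set[id]; ok {
		return
	}
	s.set[id] = struct{}{}
	s.sorted = nil
}

func (s *selections) remove(id NumericIdentity) {
	if _, ok := s.set[id]; !ok {
		return
	}
	delete(s.set, id)
	s.sorted = nil
}

// Sorted builds (and caches) the sorted slice on demand, so only the few
// consumers that need ordering (e.g. dnsproxy, l7/xds config) pay for it.
func (s *selections) Sorted() []NumericIdentity {
	if s.sorted == nil {
		out := make([]NumericIdentity, 0, len(s.set))
		for id := range s.set {
			out = append(out, id)
		}
		sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
		s.sorted = out
	}
	return s.sorted
}

func main() {
	s := newSelections()
	for _, id := range []NumericIdentity{42, 7, 1000, 7} {
		s.add(id)
	}
	fmt.Println(s.Sorted()) // prints: [7 42 1000]
}
```

Repeated updates between two Sorted() calls then cost one sort total instead of one sort per update.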
Force-pushed 0ed7a43 to d39b47d
/test
This swaps the set.Set for a part.Set, so that we avoid cloning the whole data structure on each write. This does result in more allocations in general, but it is an overall win since it reduces the time spent cloning. The same applies to the total number of allocated bytes, which drops drastically.

Signed-off-by: Odin Ugedal <odin@ugedal.com>
Signed-off-by: Odin Ugedal <ougedal@palantir.com>
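The reason a persistent structure avoids the clone is structural sharing: a write copies only the nodes on the path it touches and shares the rest with the old version. Here is a toy illustration of that property using a path-copying binary tree; part.Set is a different (radix-tree-based) structure, and everything below is a hypothetical sketch, not the cilium code.

```go
package main

import "fmt"

// node is a sorted binary tree with path copying: insert copies only the
// O(log n) nodes on the search path and shares every untouched subtree,
// which is the property that makes persistent-set writes cheap compared
// with cloning a whole map on each write.
type node struct {
	id          uint32
	left, right *node
}

// insert returns a new version of the tree; the old version stays valid.
func insert(n *node, id uint32) *node {
	if n == nil {
		return &node{id: id}
	}
	switch {
	case id < n.id:
		return &node{id: n.id, left: insert(n.left, id), right: n.right}
	case id > n.id:
		return &node{id: n.id, left: n.left, right: insert(n.right, id)}
	default:
		return n // already present; share the whole subtree
	}
}

func contains(n *node, id uint32) bool {
	for n != nil {
		switch {
		case id < n.id:
			n = n.left
		case id > n.id:
			n = n.right
		default:
			return true
		}
	}
	return false
}

func main() {
	var old *node
	for _, id := range []uint32{5, 2, 8} {
		old = insert(old, id)
	}
	upd := insert(old, 1)
	// The old version is unchanged, and the new version shares the
	// untouched right subtree with it (same pointer).
	fmt.Println(contains(old, 1), contains(upd, 1), old.right == upd.right)
	// prints: false true true
}
```

The trade-off matches the commit message: each write allocates a few small path nodes (more allocations), but never copies the whole structure (far fewer bytes and much less time per write).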
Force-pushed d39b47d to 895b305
/test
Seems like we have conflicts. I'll wait for #42580 to be worked through and merged before rebasing this. 😄
This pull request has been automatically marked as stale because it has not had recent activity.
This pull request has not seen any activity since it was marked stale. |
I can't seem to reopen, so I'll open a new PR when I get time. I've been busy mitigating other critical bugs like #44328 over the past few weeks, and getting v1.17 working, so this hasn't been prioritized. I'm still planning to do this work!

Fixes: #issue-number