loadbalancer: Flatten the backend representation #44511

Merged
joamaki merged 2 commits into cilium:main from joamaki:pr/joamaki/lb-backend-instance-refactor
Mar 6, 2026

Contributor

@joamaki joamaki commented Feb 24, 2026

This flattens loadbalancer.Backend by storing one "backend instance" per row in the backend table instead of keeping a map of BackendParams inside the Backend object.
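
The difference between the two layouts can be sketched roughly as follows. All type and field names here (nestedBackend, flatBackend, params, PerService) are hypothetical and chosen only for illustration; they are not the actual Cilium types, and the real table schema may differ.

```go
package main

import "fmt"

// NOTE: every name in this file is a hypothetical illustration,
// not the real pkg/loadbalancer API.

// params stands in for the per-service backend parameters.
type params struct {
	Weight uint16
}

// nestedBackend mirrors the old shape: one object per address that
// carries a map with an entry for every service referencing it.
type nestedBackend struct {
	Address    string
	PerService map[string]params
}

// flatBackend mirrors the new shape: one table row per
// (service, address) pair.
type flatBackend struct {
	Service string
	Address string
	Params  params
}

// flatten expands a nested backend into one row per referencing service.
// Map iteration order is unspecified, so the row order is too.
func flatten(b nestedBackend) []flatBackend {
	rows := make([]flatBackend, 0, len(b.PerService))
	for svc, p := range b.PerService {
		rows = append(rows, flatBackend{Service: svc, Address: b.Address, Params: p})
	}
	return rows
}

func main() {
	shared := nestedBackend{
		Address: "10.0.0.1:80",
		PerService: map[string]params{
			"default/a": {Weight: 100},
			"default/b": {Weight: 100},
		},
	}
	for _, row := range flatten(shared) {
		fmt.Printf("%s -> %s (weight %d)\n", row.Service, row.Address, row.Params.Weight)
	}
}
```

In the flat form, attaching the backend to one more service means adding one small row rather than touching an object whose map grows with the number of referencing services.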

Before:

pkg/loadbalancer/writer/ > go test . -bench . -benchmem
goos: linux
goarch: arm64
pkg: github.com/cilium/cilium/pkg/loadbalancer/writer
Benchmark_UpsertServiceAndFrontends_100-6                   3687            311606 ns/op            320918 objects/sec    326431 B/op       4639 allocs/op
Benchmark_UpsertServiceAndFrontends_100_Unchanged-6        14514             81891 ns/op           1221133 objects/sec     50012 B/op       1151 allocs/op
Benchmark_UpsertServiceAndFrontends_1-6                   141879              8088 ns/op            123640 objects/sec     13087 B/op         89 allocs/op
BenchmarkInsertBackend-6                                    3007            387712 ns/op            257923 objects/sec    511640 B/op       6785 allocs/op
BenchmarkReplaceBackend-6                                3216202               377.9 ns/op         2646089 objects/sec       448 B/op          9 allocs/op
BenchmarkReplaceService-6                                2284094               524.4 ns/op         1906815 objects/sec       464 B/op         10 allocs/op
Benchmark_UpsertBackends_SharedBackendManyServices-6           1        3182529066 ns/op        962720856 B/op  16146352 allocs/op
PASS
ok      github.com/cilium/cilium/pkg/loadbalancer/writer        10.315s

After:

pkg/loadbalancer/writer/ > stg push
> loadbalancer-flatten-backend
pkg/loadbalancer/writer/ > go test . -bench . -benchmem
goos: linux
goarch: arm64
pkg: github.com/cilium/cilium/pkg/loadbalancer/writer
Benchmark_UpsertServiceAndFrontends_100-6                   3796            314436 ns/op            318030 objects/sec    328109 B/op       4840 allocs/op
Benchmark_UpsertServiceAndFrontends_100_Unchanged-6        14769             80686 ns/op           1239372 objects/sec     50012 B/op       1151 allocs/op
Benchmark_UpsertServiceAndFrontends_1-6                   144716              7964 ns/op            125565 objects/sec     13103 B/op         91 allocs/op
BenchmarkInsertBackend-6                                    3433            338560 ns/op            295369 objects/sec    439490 B/op       5285 allocs/op
BenchmarkReplaceBackend-6                                4402164               272.7 ns/op         3667044 objects/sec       226 B/op          6 allocs/op
BenchmarkReplaceService-6                                2189962               535.3 ns/op         1868165 objects/sec       464 B/op         10 allocs/op
Benchmark_UpsertBackends_SharedBackendManyServices-6        1480            756108 ns/op          552204 B/op      14062 allocs/op
PASS
ok      github.com/cilium/cilium/pkg/loadbalancer/writer        8.681s

The "Benchmark_UpsertBackends_SharedBackendManyServices" benchmark inserts a backend that is shared by 2000 services.
With the backends stored flat in the table, this is now roughly 4200x faster (3.18 s vs. 0.76 ms per op) and
allocates ~550 kB instead of ~960 MB per op.
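
As a quick check of those figures, the before/after ratios can be recomputed from the benchmark lines above. This is only a small arithmetic sketch; the result type and the ratios helper are made up for this example.

```go
package main

import "fmt"

// result holds the three figures reported per benchmark line.
// The type and helper below are illustrative, not part of the PR.
type result struct {
	NsPerOp     float64
	BytesPerOp  float64
	AllocsPerOp float64
}

// ratios returns before/after for each figure, i.e. the improvement factor.
func ratios(before, after result) result {
	return result{
		NsPerOp:     before.NsPerOp / after.NsPerOp,
		BytesPerOp:  before.BytesPerOp / after.BytesPerOp,
		AllocsPerOp: before.AllocsPerOp / after.AllocsPerOp,
	}
}

func main() {
	// Numbers copied from Benchmark_UpsertBackends_SharedBackendManyServices.
	before := result{NsPerOp: 3182529066, BytesPerOp: 962720856, AllocsPerOp: 16146352}
	after := result{NsPerOp: 756108, BytesPerOp: 552204, AllocsPerOp: 14062}

	r := ratios(before, after)
	fmt.Printf("time: %.0fx, bytes: %.0fx, allocs: %.0fx\n",
		r.NsPerOp, r.BytesPerOp, r.AllocsPerOp)
}
```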

The downside is that getting the set of backend IP addresses is now harder: a shared address appears
once per service, so consumers need to deduplicate. The backend table is also now sorted by
service name first by default.
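
Deduplicating on the consuming side could look roughly like the sketch below. It assumes a simplified row with only a service name and an address; backendRow and uniqueAddresses are hypothetical names, and the real consumers in Cilium may iterate the table quite differently.

```go
package main

import (
	"fmt"
	"sort"
)

// backendRow is a hypothetical, simplified view of one row in the
// flattened backend table.
type backendRow struct {
	Service string
	Address string
}

// uniqueAddresses collects each address once, regardless of how many
// services reference it, and returns the addresses sorted.
func uniqueAddresses(rows []backendRow) []string {
	seen := make(map[string]struct{}, len(rows))
	out := make([]string, 0, len(rows))
	for _, r := range rows {
		if _, ok := seen[r.Address]; ok {
			continue // already emitted for another service
		}
		seen[r.Address] = struct{}{}
		out = append(out, r.Address)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Rows ordered by service name first, as the table now is by default.
	rows := []backendRow{
		{Service: "default/a", Address: "10.0.0.2:80"},
		{Service: "default/a", Address: "10.0.0.1:80"},
		{Service: "default/b", Address: "10.0.0.1:80"},
	}
	fmt.Println(uniqueAddresses(rows))
}
```

Because the rows are ordered by service rather than by address, equal addresses are not adjacent, which is why the sketch tracks them in a set instead of comparing neighbours.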

Fixes: #44310

The internal representation of load-balancing backends has been refactored to efficiently support thousands of services referencing a shared backend.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label label Feb 24, 2026
@github-actions github-actions bot added the sig/policy label Feb 24, 2026
@joamaki joamaki changed the title from "loadbalancer: Proof of concept for flatting the backend representation" to "loadbalancer: Proof of concept for flatter backend representation" Feb 24, 2026
Contributor Author

joamaki commented Feb 25, 2026

/test

@joamaki joamaki force-pushed the pr/joamaki/lb-backend-instance-refactor branch from 57c450a to 0a3876f March 2, 2026 14:55
@joamaki joamaki changed the title from "loadbalancer: Proof of concept for flatter backend representation" to "loadbalancer: Flatten the backend representation" Mar 2, 2026
@joamaki joamaki force-pushed the pr/joamaki/lb-backend-instance-refactor branch from 0a3876f to 3037ecf March 2, 2026 14:56
@joamaki joamaki added the release-note/minor label Mar 2, 2026
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label label Mar 2, 2026
Contributor Author

joamaki commented Mar 2, 2026

/test

@joamaki joamaki marked this pull request as ready for review March 2, 2026 16:30
@joamaki joamaki requested review from a team as code owners March 2, 2026 16:30
@joamaki joamaki enabled auto-merge March 3, 2026 09:33
Contributor

@rastislavs rastislavs left a comment

Looks good for my codeowners

joamaki added 2 commits March 5, 2026 17:10
This benchmarks adding a backend that is shared by 2000 services.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Flatten the Backend by storing a backend instance per row in the backend table
rather than having them inside the backend object.

This makes handling backends that are shared by many services much more efficient:

Before:
jussi@macbook:~/go/src/github.com/cilium/cilium/pkg/loadbalancer/writer/ > go test . -bench . -benchmem
goos: linux
goarch: arm64
pkg: github.com/cilium/cilium/pkg/loadbalancer/writer
Benchmark_UpsertServiceAndFrontends_100-6                   3687            311606 ns/op            320918 objects/sec    326431 B/op       4639 allocs/op
Benchmark_UpsertServiceAndFrontends_100_Unchanged-6        14514             81891 ns/op           1221133 objects/sec     50012 B/op       1151 allocs/op
Benchmark_UpsertServiceAndFrontends_1-6                   141879              8088 ns/op            123640 objects/sec     13087 B/op         89 allocs/op
BenchmarkInsertBackend-6                                    3007            387712 ns/op            257923 objects/sec    511640 B/op       6785 allocs/op
BenchmarkReplaceBackend-6                                3216202               377.9 ns/op         2646089 objects/sec       448 B/op          9 allocs/op
BenchmarkReplaceService-6                                2284094               524.4 ns/op         1906815 objects/sec       464 B/op         10 allocs/op
Benchmark_UpsertBackends_SharedBackendManyServices-6           1        3182529066 ns/op        962720856 B/op  16146352 allocs/op
PASS
ok      github.com/cilium/cilium/pkg/loadbalancer/writer        10.315s

After:
jussi@macbook:~/go/src/github.com/cilium/cilium/pkg/loadbalancer/writer/ > stg push
> loadbalancer-flatten-backend
jussi@macbook:~/go/src/github.com/cilium/cilium/pkg/loadbalancer/writer/ > go test . -bench . -benchmem
goos: linux
goarch: arm64
pkg: github.com/cilium/cilium/pkg/loadbalancer/writer
Benchmark_UpsertServiceAndFrontends_100-6                   3796            314436 ns/op            318030 objects/sec    328109 B/op       4840 allocs/op
Benchmark_UpsertServiceAndFrontends_100_Unchanged-6        14769             80686 ns/op           1239372 objects/sec     50012 B/op       1151 allocs/op
Benchmark_UpsertServiceAndFrontends_1-6                   144716              7964 ns/op            125565 objects/sec     13103 B/op         91 allocs/op
BenchmarkInsertBackend-6                                    3433            338560 ns/op            295369 objects/sec    439490 B/op       5285 allocs/op
BenchmarkReplaceBackend-6                                4402164               272.7 ns/op         3667044 objects/sec       226 B/op          6 allocs/op
BenchmarkReplaceService-6                                2189962               535.3 ns/op         1868165 objects/sec       464 B/op         10 allocs/op
Benchmark_UpsertBackends_SharedBackendManyServices-6        1480            756108 ns/op          552204 B/op      14062 allocs/op
PASS
ok      github.com/cilium/cilium/pkg/loadbalancer/writer        8.681s

The "Benchmark_UpsertBackends_SharedBackendManyServices" benchmark inserts a backend that is shared by 2000 services.
With the backends stored flat in the table, this is now roughly 4200x faster (3.18 s vs. 0.76 ms per op) and
allocates ~550 kB instead of ~960 MB per op.

The downside is that getting the set of backend IP addresses is now harder: a shared address appears
once per service, so consumers need to deduplicate. The backend table is also now sorted by
service name first by default.

Fixes: cilium#44310
Signed-off-by: Jussi Maki <jussi@isovalent.com>
@joamaki joamaki force-pushed the pr/joamaki/lb-backend-instance-refactor branch from 3037ecf to 3f911a2 March 5, 2026 16:16
Contributor Author

joamaki commented Mar 5, 2026

/test

Contributor Author

joamaki commented Mar 6, 2026

/test

Member

nebril commented Mar 6, 2026

/test

@joamaki joamaki added this pull request to the merge queue Mar 6, 2026
Merged via the queue into cilium:main with commit c0b0586 Mar 6, 2026
88 of 93 checks passed
@joamaki joamaki deleted the pr/joamaki/lb-backend-instance-refactor branch March 6, 2026 14:52
Labels

release-note/minor This PR changes functionality that users may find relevant to operating Cilium. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.

Development

Successfully merging this pull request may close these issues.

Cilium v1.19.0 excessive memory usage

6 participants