[FRR] Add support for 514 BGP sessions#22390

Merged
lguohan merged 5 commits into sonic-net:master from vivekrnv:frr_514
May 28, 2025

@vivekrnv
Contributor

@vivekrnv vivekrnv commented Apr 21, 2025

Why I did it

Add support for 512 BGP sessions

How I did it

| Patch | Upstream Commit |
|-------|----------------|
| 0035-lib-Add-support-for-stream-buffer-to-expand.patch | 65b3ee4e |
| 0036-zebra-zebra-crash-for-zapi-stream.patch | c122afdb |
| 0037-bgpd-Replace-per-peer-connection-error-with-per-bgp.patch | 10c127bc |
| 0038-bgpd-remove-apis-from-bgp_route.h.patch | 1d5a8a20 |
| 0039-bgpd-batch-peer-connection-error-clearing.patch | 4baa9f2d |
| 0040-zebra-move-peer-conn-error-list-to-connection-struct.patch | 411abd6b |
| 0041-bgpd-Allow-batch-clear-to-do-partial-work-and-contin.patch | b68be906 |
| 0042-zebra-send-v6-fast-RA-at-faster-interval.patch | #18451 |
| 0043-bgpd-Paths-received-from-shutdown-peer-not-deleted.patch | 2cbfc7ec |
| 0044-bgpd-Modify-bgp-to-handle-packet-events-in-a-FIFO.patch | 12bf042 |
| 0045-zebra-Limit-reading-packets-when-MetaQ-is-full.patch | 937a9fb |
| 0046-bgpd-Delay-processing-MetaQ-in-some-events.patch | 83a92c9 |
| 0047-bgpd-Fix-holdtime-not-working-properly-when-busy.patch | 9a26a56 |
| 0048-bgpd-ensure-that-bgp_generate_updgrp_packets-shares-.patch | 681caee |
| 0049-zebra-show-command-to-display-metaq-info.patch | 751ae76 |
| 0050-bgpd-add-total-path-count-for-bgp-net-in-json-output.patch | be3c6d3 |
| 0051-lib-Add-nexthop_same_no_ifindex-comparison-function.patch | 66f552c |
| 0052-zebra-show-nexthop-count-in-nexthop-group-command.patch | da5703e |
| 0053-zebra-Allow-nhg-s-to-be-reused-when-multiple-interfa.patch | 46044a4 |
| 0054-zebra-Prevent-active-setting-if-interface-is-not-ope.patch | e5f4675 |
| 0055-zebra-Add-nexthop-group-id-to-route-dump.patch | b732ad2 |
| 0056-zebra-Display-interface-name-not-ifindex-in-nh-dump.patch | c891cd2 |

How to verify it

Verified the changes on topology with scaled BGP tests and standard test suite
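
For readers who want to repeat a scaled check, here is a minimal sketch of counting established sessions. It is not the test used for this PR. It assumes the input is the parsed output of FRR's `show bgp summary json`, shaped as one object per address family with a `peers` map whose entries carry a `state` field. The function name `count_established` and the sample data are hypothetical.

```python
import json


def count_established(summary: dict) -> int:
    """Count unique peers reported as Established in any address family.

    Assumes `summary` looks like:
    {"ipv4Unicast": {"peers": {"<addr>": {"state": "..."}}}, ...}
    """
    established = set()
    for afi_data in summary.values():
        if not isinstance(afi_data, dict):
            continue  # skip anything that is not a per-address-family object
        for peer, info in afi_data.get("peers", {}).items():
            if isinstance(info, dict) and info.get("state") == "Established":
                established.add(peer)  # a set avoids double counting across families
    return len(established)


# Hypothetical sample data, not output captured from a real device.
sample = json.loads("""
{
  "ipv4Unicast": {
    "peers": {
      "10.0.0.1": {"state": "Established"},
      "10.0.0.3": {"state": "Active"}
    }
  },
  "ipv6Unicast": {
    "peers": {
      "fc00::2": {"state": "Established"},
      "10.0.0.1": {"state": "Established"}
    }
  }
}
""")

print(count_established(sample))
```

On a scaled topology the returned count would be compared against the number of configured neighbors.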

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@bradh352
Collaborator

That is a whole lot of additional patches. It does make me wonder if the effort would be better spent collaborating on #22267.

I haven't tried to see which commits weren't already part of the 10.3 release, other than the obvious PR FRRouting/frr#18451, which is not yet merged.

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@vivekrnv
Contributor Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

r12f pushed a commit to Azure/sonic-buildimage-msft that referenced this pull request Apr 23, 2025
**Add support for 512 BGP sessions**

Backport sonic-net/sonic-buildimage#22390 to
202412

| Patch | Upstream Commit |
|-------|----------------|
| 0044-zebra-Prevent-starvation-in-dplane_thread_loop.patch | [6faad863](FRRouting/frr@6faad86) |
| 0083-bgpd-fix-vty-output-of-evpn-route-target-AS4.patch | [20b3ab48](FRRouting/frr@20b3ab4) |
| 0084-zebra-Ensure-dplane-does-not-send-work-back-to-maste.patch | [c4115522](FRRouting/frr@c411552) |
| 0085-zebra-Limit-mutex-for-obuf-to-when-we-access-obuf.patch | [c58da10d](FRRouting/frr@c58da10) |
| 0086-bgpd-backpressure-Fix-to-pop-items-off-zebra_announc.patch | [898852f](FRRouting/frr@898852f) |
| 0087-zebra-fnc-obuf-could-be-accessed-without-a-lock.patch | [e7a1fbbcf](FRRouting/frr@e7a1fbb) |
| 0088-zebra-Add-show-fpm-status-json-command.patch | [0a9e8ef49](FRRouting/frr@0a9e8ef) |
| 0089-doc-Add-show-fpm-status-json-command-to-documentatio.patch | [a0c4fe2ca](FRRouting/frr@a0c4fe2) |
| 0090-zebra-avoid-a-race-during-FPM-dplane-plugin-shutdown.patch | [277784f](FRRouting/frr@277784f) |
| 0091-zebra-add-nexthop-counter-to-show-zebra-dplane-comma.patch | [e36e570c](FRRouting/frr@e36e570) |
| 0092-zebra-Installation-success-should-not-set-NHG-as-val.patch | [910b2c5a](FRRouting/frr@910b2c5) |
| 0093-zebra-When-reinstalling-a-NHG-set-REINSTALL-flag.patch | [b2ade8e](FRRouting/frr@b2ade8e) |
| 0094-zebra-Conslidate-zebra_nhg_set_valid-invalid-functio.patch | [6ee9cc68](FRRouting/frr@8f76afd) |
| 0095-zebra-Properly-note-that-a-nhg-s-nexthop-has-gone-do.patch | [3b9428a7](FRRouting/frr@1bbbcf0) |
| 0096-zebra-be-consistent-about-v6-nexthops-for-v4-routes.patch | [c93bc371](FRRouting/frr@0221ed2) |
| 0097-lib-zebra-Modify-nexthop_cmp-to-allow-you-to-use-wei.patch | [75268f01](FRRouting/frr@b8e24a0) |
| 0098-zebra-Create-Singleton-nhg-s-without-weights.patch | [ae4a1315](FRRouting/frr@c20fa97) |
| 0099-zebra-Allow-blackhole-singleton-nexthops-to-be-v6.patch | [ae397ad9](FRRouting/frr@f90989d) |
| 0100-zebra-Allow-for-initial-deny-of-installation-of-nhe-.patch | [4bf2c11f](FRRouting/frr@0c72a78) |
| 0101-zebra-Properly-note-that-a-nhg-s-nexthop-has-gone-do.patch | [892e8179](FRRouting/frr@1bbbcf0) |
| 0102-zebra-Reinstall-nexthop-when-interface-comes-back-up.patch | [279f427c](FRRouting/frr@3be8b48) |
| 0103-zebra-Attempt-to-reuse-NHG-after-interface-up-and-ro.patch | [98d56711](FRRouting/frr@f02d76f) |
| 0104-zebra-Expose-_route_entry_dump_nh-so-it-can-be-used.patch | [4fb44993](FRRouting/frr@ce166ca) |
| 0105-zebra-Fix-resetting-valid-flags-for-NHG-dependents.patch | [6e95686b](FRRouting/frr@54ec9f3) |
| 0106-zebra-Fix-leaked-nhe.patch | [a84d2bc0](FRRouting/frr@97fa24e) |
| 0107-zebra-Uninstall-NHG-in-some-situations.patch | [d1cba73a](FRRouting/frr@4c16694) |
| 0108-zebra-Optimize-invoking-nhg-compare-func.patch | [0faa70a5](FRRouting/frr@e77954e) |
| 0109-zebra-Nexthops-need-to-be-ACTIVE-in-some-cases.patch | [df56b92b](FRRouting/frr@b61424a) |
| 0110-zebra-On-Nexthop-install-failure-don-t-set-Installat.patch | [4b2b1a9a](FRRouting/frr@ec6a000) |
| 0111-zebra-Bring-up-514-BGP-neighbor-sessions.patch | [ea399e15](FRRouting/frr@6a75d33) |
| 0112-lib-Add-support-for-stream-buffer-to-expand.patch | [65b3ee4e](FRRouting/frr@c0c46ba) |
| 0113-zebra-zebra-crash-for-zapi-stream.patch | [c122afdb](FRRouting/frr@6fe9092) |
| 0114-bgpd-Replace-per-peer-connection-error-with-per-bgp.patch | [10c127bc](FRRouting/frr@6a5962e) |
| 0115-bgpd-remove-apis-from-bgp_route.h.patch | [1d5a8a20](FRRouting/frr@020245b) |
| 0116-bgpd-batch-peer-connection-error-clearing.patch | [4baa9f2d](FRRouting/frr@58f924d) |
| 0117-zebra-move-peer-conn-error-list-to-connection-struct.patch | [411abd6b](FRRouting/frr@6206e7e) |
| 0118-bgpd-Allow-batch-clear-to-do-partial-work-and-contin.patch | [b68be906](FRRouting/frr@c527882) |
| 0119-zebra-send-v6-fast-RA-at-faster-interval.patch | [c8f12a4f](FRRouting/frr#18451) |
| 0120-lib-add-option-to-start-stop-wheel-timer.patch | [ca0adcdd](FRRouting/frr#18451) |
| 0121-bgpd-Paths-received-from-shutdown-peer-not-deleted.patch | [2cbfc7ec](FRRouting/frr@d2bec7a) |

**Verification:**
Verified the changes on topology with scaled BGP tests

---------

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
Signed-off-by: Vivek <vivekreddykarri98@gmail.com>
@ahsalam

ahsalam commented Apr 23, 2025

@vivekrnv @r12f @BYGX-wcr We are working on the FRR upgrade as we speak. The upgrade PR, #22267, has passed all CI checks.

Can we push this after the FRR upgrade?

@r12f
Contributor

r12f commented Apr 23, 2025

@lguohan and @yxieca to approve for master.

@BYGX-wcr
Contributor

@r12f, as the community mentioned, this PR introduces too many patches, which could be avoided by upgrading FRR to 10.3. Since the upgrade is almost there, if we need this urgently, why don't we do this on 202412 only?

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@vivekrnv vivekrnv marked this pull request as draft May 14, 2025 22:08
Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@vivekrnv vivekrnv marked this pull request as ready for review May 20, 2025 18:23
@vivekrnv
Contributor Author

@BYGX-wcr, @ahsalam, @bradh352 I've updated the patches to be compatible with 10.3.

@BYGX-wcr
Contributor

Thanks. Please watch the PR checker results; introducing new routing functionality may break some existing assumptions in code.

@vivekrnv
Contributor Author

> Thanks. Please watch the PR checker results; introducing new routing functionality may break some existing assumptions in code.

Looks like a generic failure across all the PRs, but I will monitor after the infra issues are fixed.

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@BYGX-wcr
Contributor

/azpw ms_conflict

@vivekrnv
Contributor Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@vivekrnv
Contributor Author

@BYGX-wcr, can you review and sign off?

Contributor

@BYGX-wcr BYGX-wcr left a comment

LGTM


@ahsalam ahsalam left a comment

LGTM

Contributor

@cscarpitta cscarpitta left a comment

LGTM

@lguohan lguohan merged commit 8fd8350 into sonic-net:master May 28, 2025
19 checks passed
@mssonicbld
Collaborator

Cherry-pick PR to 202505: #22786
