
[BGP] Tuning zebra nexthop-group keep parameter #21135

Merged
lguohan merged 1 commit into sonic-net:master from dgsudharsan:nhg_keep on Dec 16, 2024

@dgsudharsan
Collaborator

Why I did it

Set the zebra nexthop-group keep parameter to 1. This instructs zebra to retain a nexthop group for no more than 1 second after it is removed. Without this setting, zebra keeps removed nexthop groups in the system for 180 seconds.
In scaled scenarios with this parameter unset, link-flap testing caused the queue to grow so large that zebra crashed with an out-of-memory (OOM) error.

Work item tracking
  • Microsoft ADO (number only):

How I did it

Updated the zebra template to initialize nexthop-group keep to 1.
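
For illustration, a setting like this would typically appear in the rendered FRR zebra configuration as the single line below. This is a sketch rather than the actual diff: the template file and its surrounding lines are not shown in this description, and the command form is assumed from FRR's documented `zebra nexthop-group keep (1-3600)` syntax.

```
! Assumed rendering of the change; not copied from the PR diff.
! Keep an unused nexthop group for 1 second instead of the 180-second default.
zebra nexthop-group keep 1
```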

How to verify it

Run the scale test with link flapping and confirm that zebra's memory usage does not increase.

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@dgsudharsan
Collaborator Author

/azpw run Azure.sonic-buildimage

@mssonicbld
Collaborator

/AzurePipelines run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

Cherry-pick PR to msft-202412: Azure/sonic-buildimage-msft#528

VladimirKuk pushed a commit to Marvell-switching/sonic-buildimage that referenced this pull request Jan 21, 2025
tshalvi pushed a commit to tshalvi/sonic-buildimage that referenced this pull request Jan 23, 2025

5 participants