Dynamic BGP peer update and delete support#1963
Conversation
|
/azp run |
|
No pipelines are associated with this pull request. |
|
/azp run |
|
No pipelines are associated with this pull request. |
|
/azp run |
|
No pipelines are associated with this pull request. |
|
/azp run |
|
No pipelines are associated with this pull request. |
| ### New changes: | ||
| An update.conf.j2 and delete.conf.j2 will be defined in the same folder path structure as above, and similar branch out logic like the existing j2 files can be used with update.conf.j2 and delete.conf.j2. The update.conf.j2 and delete.conf.j2 will host the required logic to not only modify and delete the peers respectively, but also modify/delete associated configurations like route maps, peer groups, prefix lists etc. For the the purposes of this design we aim to simplify and only have a single j2 for update and delete each, which implies that update.j2 and delete.j2 will contain configuration that spans across all 3 types of config: instance, peer-group and policies. The alternate approach we considered was defining an update/delete version for each type of config, thereby defining multiple templates as follows: instance.update.conf.j2, policies.update.conf.j2, peer-group.update.conf.j2, instances.delete.conf.j2, policies.delete.conf.j2, peer-group.delete.conf.j2. We preferred the former approach of a single update.conf.j2 and delete.conf.j2 because of the simplicity it brings, moving to the latter model can be taken up in the future if use-cases for update and delete expand beyond and it becomes important to have more segregation of configuration across instance, policies and peer groups for the newly added update and delete operations. | ||
|
|
||
| # 3 Test Plan |
There was a problem hiding this comment.
If there is some issue or limitation in new feature, is it possible to disable it at runtime?
There was a problem hiding this comment.
We will need to remove the template file, upon removing the template file associated with update/delete, the code will not be activated.
There was a problem hiding this comment.
Can we add a config flag to disable this feature?
There was a problem hiding this comment.
IMO, this capability is already controlled by update.j2 and delete.j2, if not present, will retain the existing behavior. Prefer not to add extra knobs. If a config-db approach is preferred, then we should skip the template approach altogether. I think this can be added as phase 2.
Why I did it This change is to support enhanced CRUD operations on dynamic bgp peers using a standard template-based mechanism. Here is the High level design document explaining this feature: sonic-net/SONiC#1963 Work item tracking Microsoft ADO (number only): How I did it Upon a delete peer ocurring, we render the delete template(instead of executing the current behavior of no neighbor {{ neighbor addr}}), on the other hand if a delete template is not defined then the default behavior of no neighbor {{ neighbor addr}} applies as usual, thereby making this change backward compatible. How to verify it Create a dynamic peer like so in config_db.json: "BGP_NEIGHBOR": { "10.0.0.62": { "admin_status": "up", "asn": "65100", "holdtime": "10", "keepalive": "3", "local_addr": "10.0.0.63", "name": "vlab-01", "nhopself": "0", "rrclient": "0" } } Once the peer is up, delete the neighbor (eg. config bgp neighbor remove ). Verify that the neighbor is deleted.
|
/azp run |
|
No pipelines are associated with this pull request. |
| ## 2.3 CLI | ||
| The following CLIs will be added, each of these will have similar CLI outputs to their ```default VRF``` counterparts which are widely in use today: | ||
| ``` | ||
| 1. show ip bgp vrf {vrf_name} summary |
There was a problem hiding this comment.
Iso of adding new command can we add an option existing command show ip bgp summary --vrf <vrf_name> this way existing command show ip bgp summary will work for default VRF and can user can pass the vrf_name to get output per vrf.
There was a problem hiding this comment.
oh the existing command show ip bgp summary will still work for default vrf, intention is to have all existing commands work as-is, and support the extra vrf option in a way that it matches vtysh. Doing it this way will match exactly how the command is structured in vtysh, making it helpful for everyone who is used to that structure of cli.
| ### New changes: | ||
| An update.conf.j2 and delete.conf.j2 will be defined in the same folder path structure as above, and similar branch out logic like the existing j2 files can be used with update.conf.j2 and delete.conf.j2. The update.conf.j2 and delete.conf.j2 will host the required logic to not only modify and delete the peers respectively, but also modify/delete associated configurations like route maps, peer groups, prefix lists etc. For the the purposes of this design we aim to simplify and only have a single j2 for update and delete each, which implies that update.j2 and delete.j2 will contain configuration that spans across all 3 types of config: instance, peer-group and policies. The alternate approach we considered was defining an update/delete version for each type of config, thereby defining multiple templates as follows: instance.update.conf.j2, policies.update.conf.j2, peer-group.update.conf.j2, instances.delete.conf.j2, policies.delete.conf.j2, peer-group.delete.conf.j2. We preferred the former approach of a single update.conf.j2 and delete.conf.j2 because of the simplicity it brings, moving to the latter model can be taken up in the future if use-cases for update and delete expand beyond and it becomes important to have more segregation of configuration across instance, policies and peer groups for the newly added update and delete operations. | ||
|
|
||
| # 3 Test Plan |
There was a problem hiding this comment.
Can we add a config flag to disable this feature?
| ``` | ||
| self.templates["update"] = self.fabric.from_file(update_template_path) | ||
| ``` | ||
| 3. If an update template is supported and the peer_type is dynamic, we will identify the set of ip_ranges which are added, and those which are deleted as part of the operation, and pass in those as additional kwargs and render the update template, sample code: |
There was a problem hiding this comment.
we will identify the set of ip_ranges which are added, and those which are deleted as part of the operation
Is this done in bgpcfgd? how is this done? Can you add more details?
There was a problem hiding this comment.
Yes it is done in bgpcfgd, it is done by querying vtysh to find the current set of ranges, and diff-ing that against the newly configured set of ranges as received from config_db, the added/deleted ranges are then passed into an update.j2 template to render the required config changes. PR: sonic-net/sonic-buildimage#22261
I will update the HLD PR to add some more details on how it is done in bgpcfgd
|
@StormLiangMS , @arlakshm , would you approve this? |
|
/azp run |
|
No pipelines are associated with this pull request. |
| ###### Table 2: Scalability | ||
| | Component | Expected value | | ||
| |--------------------------|-----------------------------| | ||
| | Number of Dynamic BGP Peers each with route-map, prefix-list and peer-group|2k| |
There was a problem hiding this comment.
@anish-n Could you clarify what the "2K" refers to in this context? A scale of 2,000 is significantly larger compared to 64 neighbors and could potentially lead to performance issues or even system crashes. I recommend explicitly defining what the 2K and 4K entries represent in the table to avoid confusion.
Updated scalability requirements for dynamic BGP peers and listen ranges.
|
/azp run |
|
No pipelines are associated with this pull request. |
| # 3 Test Plan | ||
| Bgpcfgd, CLI and template changes will have corresponding UTs. | ||
|
|
||
| New SONiC-mgmt tests will be introduced to test the following: |
There was a problem hiding this comment.
Hi @anish-n Do we have a more detailed test plan for this feature?
- Scale testing – 2K peers is non-trivial, especially considering the scaled device.
- Stress testing – including reloads and reboots.
- Dynamic update/delete scenarios – what operations are supported? Could we add one section in this doc to explicitly describe what is supported and what is not? For example:
a. Updating the IP range of an existing dynamic peer.
b. Adding a new dynamic peer with route-map/prefix-list configurations.
c. Deleting an existing dynamic peer.
d. Updating the route-map/prefix-list of an existing dynamic peer.
And could you provide an update on the progress of this feature and clarify which release it is planned for?
There was a problem hiding this comment.
@StormLiangMS,
Added high level tests which will be added for scale and stress test. The intent of this HLD is supporting dynamic bgp peer changes, detailed stress test plan can be added to sonic-mgmt if required.
In terms of operations supported, this list is present in the requirements section, from config db perspective there is no change in operations beyond allowing a change in ip_ranges passed in. We handle the change in ip range and then FRR templates will render associated change in route map/prefix list as present in template, exactly like how the add peer/delete peers function today, so operationally it would be identical.
|
/azp run |
|
No pipelines are associated with this pull request. |
|
/azp run |
|
No pipelines are associated with this pull request. |
This PR contains the HLD for Dynamic BGP Peer modification, specifically it lists the design for the following: