Skip to content

[sanity][bgp] Skip default route missing recover for multi-asic#16264

Merged
wangxin merged 1 commit intosonic-net:masterfrom
yaqiangz:azure-master_sanity
Dec 31, 2024
Merged

[sanity][bgp] Skip default route missing recover for multi-asic#16264
wangxin merged 1 commit intosonic-net:masterfrom
yaqiangz:azure-master_sanity

Conversation

@yaqiangz
Copy link
Copy Markdown
Contributor

@yaqiangz yaqiangz commented Dec 31, 2024

Description of PR

Summary:
Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

Default route check in sanity is added by #16235
It only supports single asic for now. But constraint for single asic in recover stage is missed
It would cause keyError in multi-asic if there is bgp sanity check failure

    def adaptive_recover(dut, localhost, fanouthosts, nbrhosts, tbinfo, check_results, wait_time):
        outstanding_action = None
        for result in check_results:
            if result['failed']:
                if result['check_item'] == 'interfaces':
                    action = _recover_interfaces(dut, fanouthosts, result, wait_time)
                elif result['check_item'] == 'services':
                    action = _recover_services(dut, result)
                elif result['check_item'] == 'bgp':
                    # If there is only default route missing issue, only need to re-announce routes to recover
>                   if ("no_v4_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        "no_v6_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        ("no_v4_default_route" in result['bgp'] and "no_v6_default_route" in result['bgp'] and
                         len(result['bgp']) == 2)):
E                        KeyError: 'bgp'

check_results = [{'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}]
dut        = <MultiAsicSonicHost vlab-08>
fanouthosts = {}
localhost  = <tests.common.devices.local.Localhost object at 0x77f1b9270a90>
nbrhosts   = {'ARISTA01T0': <EosHost VM0129>, 'ARISTA01T2': <EosHost VM0128>}
outstanding_action = None
result     = {'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}
tbinfo     = {'auto_recover': 'False', 'comment': 'Tests multi-asic virtual switch vm', 'conf-name': 'vms-kvm-four-asic-t1-lag', 'duts': ['vlab-08'], ...}
wait_time  = 30

How did you do it?

Add single asic constraint in recover

How did you verify/test it?

Run test

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@wangxin wangxin merged commit d45bde5 into sonic-net:master Dec 31, 2024
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Dec 31, 2024
…c-net#16264)

What is the motivation for this PR?
Default route check in sanity is added by sonic-net#16235
It only supports single asic for now. But constraint for single asic in recover stage is missed
It would cause keyError in multi-asic if there is bgp sanity check failure

    def adaptive_recover(dut, localhost, fanouthosts, nbrhosts, tbinfo, check_results, wait_time):
        outstanding_action = None
        for result in check_results:
            if result['failed']:
                if result['check_item'] == 'interfaces':
                    action = _recover_interfaces(dut, fanouthosts, result, wait_time)
                elif result['check_item'] == 'services':
                    action = _recover_services(dut, result)
                elif result['check_item'] == 'bgp':
                    # If there is only default route missing issue, only need to re-announce routes to recover
>                   if ("no_v4_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        "no_v6_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        ("no_v4_default_route" in result['bgp'] and "no_v6_default_route" in result['bgp'] and
                         len(result['bgp']) == 2)):
E                        KeyError: 'bgp'

check_results = [{'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}]
dut        = <MultiAsicSonicHost vlab-08>
fanouthosts = {}
localhost  = <tests.common.devices.local.Localhost object at 0x77f1b9270a90>
nbrhosts   = {'ARISTA01T0': <EosHost VM0129>, 'ARISTA01T2': <EosHost VM0128>}
outstanding_action = None
result     = {'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}
tbinfo     = {'auto_recover': 'False', 'comment': 'Tests multi-asic virtual switch vm', 'conf-name': 'vms-kvm-four-asic-t1-lag', 'duts': ['vlab-08'], ...}
wait_time  = 30
How did you do it?
Add single asic constraint in recover

How did you verify/test it?
Run test
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Dec 31, 2024
…c-net#16264)

What is the motivation for this PR?
Default route check in sanity is added by sonic-net#16235
It only supports single asic for now. But constraint for single asic in recover stage is missed
It would cause keyError in multi-asic if there is bgp sanity check failure

    def adaptive_recover(dut, localhost, fanouthosts, nbrhosts, tbinfo, check_results, wait_time):
        outstanding_action = None
        for result in check_results:
            if result['failed']:
                if result['check_item'] == 'interfaces':
                    action = _recover_interfaces(dut, fanouthosts, result, wait_time)
                elif result['check_item'] == 'services':
                    action = _recover_services(dut, result)
                elif result['check_item'] == 'bgp':
                    # If there is only default route missing issue, only need to re-announce routes to recover
>                   if ("no_v4_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        "no_v6_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        ("no_v4_default_route" in result['bgp'] and "no_v6_default_route" in result['bgp'] and
                         len(result['bgp']) == 2)):
E                        KeyError: 'bgp'

check_results = [{'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}]
dut        = <MultiAsicSonicHost vlab-08>
fanouthosts = {}
localhost  = <tests.common.devices.local.Localhost object at 0x77f1b9270a90>
nbrhosts   = {'ARISTA01T0': <EosHost VM0129>, 'ARISTA01T2': <EosHost VM0128>}
outstanding_action = None
result     = {'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}
tbinfo     = {'auto_recover': 'False', 'comment': 'Tests multi-asic virtual switch vm', 'conf-name': 'vms-kvm-four-asic-t1-lag', 'duts': ['vlab-08'], ...}
wait_time  = 30
How did you do it?
Add single asic constraint in recover

How did you verify/test it?
Run test
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Dec 31, 2024
…c-net#16264)

What is the motivation for this PR?
Default route check in sanity is added by sonic-net#16235
It only supports single asic for now. But constraint for single asic in recover stage is missed
It would cause keyError in multi-asic if there is bgp sanity check failure

    def adaptive_recover(dut, localhost, fanouthosts, nbrhosts, tbinfo, check_results, wait_time):
        outstanding_action = None
        for result in check_results:
            if result['failed']:
                if result['check_item'] == 'interfaces':
                    action = _recover_interfaces(dut, fanouthosts, result, wait_time)
                elif result['check_item'] == 'services':
                    action = _recover_services(dut, result)
                elif result['check_item'] == 'bgp':
                    # If there is only default route missing issue, only need to re-announce routes to recover
>                   if ("no_v4_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        "no_v6_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        ("no_v4_default_route" in result['bgp'] and "no_v6_default_route" in result['bgp'] and
                         len(result['bgp']) == 2)):
E                        KeyError: 'bgp'

check_results = [{'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}]
dut        = <MultiAsicSonicHost vlab-08>
fanouthosts = {}
localhost  = <tests.common.devices.local.Localhost object at 0x77f1b9270a90>
nbrhosts   = {'ARISTA01T0': <EosHost VM0129>, 'ARISTA01T2': <EosHost VM0128>}
outstanding_action = None
result     = {'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}
tbinfo     = {'auto_recover': 'False', 'comment': 'Tests multi-asic virtual switch vm', 'conf-name': 'vms-kvm-four-asic-t1-lag', 'duts': ['vlab-08'], ...}
wait_time  = 30
How did you do it?
Add single asic constraint in recover

How did you verify/test it?
Run test
@mssonicbld
Copy link
Copy Markdown
Collaborator

Cherry-pick PR to 202405: #16267

@mssonicbld
Copy link
Copy Markdown
Collaborator

Cherry-pick PR to 202411: #16268

@mssonicbld
Copy link
Copy Markdown
Collaborator

Cherry-pick PR to 202311: #16269

mssonicbld pushed a commit that referenced this pull request Dec 31, 2024
What is the motivation for this PR?
Default route check in sanity is added by #16235
It only supports single asic for now. But constraint for single asic in recover stage is missed
It would cause keyError in multi-asic if there is bgp sanity check failure

    def adaptive_recover(dut, localhost, fanouthosts, nbrhosts, tbinfo, check_results, wait_time):
        outstanding_action = None
        for result in check_results:
            if result['failed']:
                if result['check_item'] == 'interfaces':
                    action = _recover_interfaces(dut, fanouthosts, result, wait_time)
                elif result['check_item'] == 'services':
                    action = _recover_services(dut, result)
                elif result['check_item'] == 'bgp':
                    # If there is only default route missing issue, only need to re-announce routes to recover
>                   if ("no_v4_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        "no_v6_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        ("no_v4_default_route" in result['bgp'] and "no_v6_default_route" in result['bgp'] and
                         len(result['bgp']) == 2)):
E                        KeyError: 'bgp'

check_results = [{'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}]
dut        = <MultiAsicSonicHost vlab-08>
fanouthosts = {}
localhost  = <tests.common.devices.local.Localhost object at 0x77f1b9270a90>
nbrhosts   = {'ARISTA01T0': <EosHost VM0129>, 'ARISTA01T2': <EosHost VM0128>}
outstanding_action = None
result     = {'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}
tbinfo     = {'auto_recover': 'False', 'comment': 'Tests multi-asic virtual switch vm', 'conf-name': 'vms-kvm-four-asic-t1-lag', 'duts': ['vlab-08'], ...}
wait_time  = 30
How did you do it?
Add single asic constraint in recover

How did you verify/test it?
Run test
mssonicbld pushed a commit that referenced this pull request Dec 31, 2024
What is the motivation for this PR?
Default route check in sanity is added by #16235
It only supports single asic for now. But constraint for single asic in recover stage is missed
It would cause keyError in multi-asic if there is bgp sanity check failure

    def adaptive_recover(dut, localhost, fanouthosts, nbrhosts, tbinfo, check_results, wait_time):
        outstanding_action = None
        for result in check_results:
            if result['failed']:
                if result['check_item'] == 'interfaces':
                    action = _recover_interfaces(dut, fanouthosts, result, wait_time)
                elif result['check_item'] == 'services':
                    action = _recover_services(dut, result)
                elif result['check_item'] == 'bgp':
                    # If there is only default route missing issue, only need to re-announce routes to recover
>                   if ("no_v4_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        "no_v6_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        ("no_v4_default_route" in result['bgp'] and "no_v6_default_route" in result['bgp'] and
                         len(result['bgp']) == 2)):
E                        KeyError: 'bgp'

check_results = [{'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}]
dut        = <MultiAsicSonicHost vlab-08>
fanouthosts = {}
localhost  = <tests.common.devices.local.Localhost object at 0x77f1b9270a90>
nbrhosts   = {'ARISTA01T0': <EosHost VM0129>, 'ARISTA01T2': <EosHost VM0128>}
outstanding_action = None
result     = {'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}
tbinfo     = {'auto_recover': 'False', 'comment': 'Tests multi-asic virtual switch vm', 'conf-name': 'vms-kvm-four-asic-t1-lag', 'duts': ['vlab-08'], ...}
wait_time  = 30
How did you do it?
Add single asic constraint in recover

How did you verify/test it?
Run test
yaqiangz added a commit to yaqiangz/sonic-mgmt that referenced this pull request Jan 2, 2025
…c-net#16264)

What is the motivation for this PR?
Default route check in sanity is added by sonic-net#16235
It only supports single asic for now. But constraint for single asic in recover stage is missed
It would cause keyError in multi-asic if there is bgp sanity check failure

    def adaptive_recover(dut, localhost, fanouthosts, nbrhosts, tbinfo, check_results, wait_time):
        outstanding_action = None
        for result in check_results:
            if result['failed']:
                if result['check_item'] == 'interfaces':
                    action = _recover_interfaces(dut, fanouthosts, result, wait_time)
                elif result['check_item'] == 'services':
                    action = _recover_services(dut, result)
                elif result['check_item'] == 'bgp':
                    # If there is only default route missing issue, only need to re-announce routes to recover
>                   if ("no_v4_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        "no_v6_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        ("no_v4_default_route" in result['bgp'] and "no_v6_default_route" in result['bgp'] and
                         len(result['bgp']) == 2)):
E                        KeyError: 'bgp'

check_results = [{'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}]
dut        = <MultiAsicSonicHost vlab-08>
fanouthosts = {}
localhost  = <tests.common.devices.local.Localhost object at 0x77f1b9270a90>
nbrhosts   = {'ARISTA01T0': <EosHost VM0129>, 'ARISTA01T2': <EosHost VM0128>}
outstanding_action = None
result     = {'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}
tbinfo     = {'auto_recover': 'False', 'comment': 'Tests multi-asic virtual switch vm', 'conf-name': 'vms-kvm-four-asic-t1-lag', 'duts': ['vlab-08'], ...}
wait_time  = 30
How did you do it?
Add single asic constraint in recover

How did you verify/test it?
Run test
StormLiangMS pushed a commit that referenced this pull request Jan 2, 2025
…) (#16293)

What is the motivation for this PR?
Default route check in sanity is added by #16235
It only supports single asic for now. But constraint for single asic in recover stage is missed
It would cause keyError in multi-asic if there is bgp sanity check failure

    def adaptive_recover(dut, localhost, fanouthosts, nbrhosts, tbinfo, check_results, wait_time):
        outstanding_action = None
        for result in check_results:
            if result['failed']:
                if result['check_item'] == 'interfaces':
                    action = _recover_interfaces(dut, fanouthosts, result, wait_time)
                elif result['check_item'] == 'services':
                    action = _recover_services(dut, result)
                elif result['check_item'] == 'bgp':
                    # If there is only default route missing issue, only need to re-announce routes to recover
>                   if ("no_v4_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        "no_v6_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        ("no_v4_default_route" in result['bgp'] and "no_v6_default_route" in result['bgp'] and
                         len(result['bgp']) == 2)):
E                        KeyError: 'bgp'

check_results = [{'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}]
dut        = <MultiAsicSonicHost vlab-08>
fanouthosts = {}
localhost  = <tests.common.devices.local.Localhost object at 0x77f1b9270a90>
nbrhosts   = {'ARISTA01T0': <EosHost VM0129>, 'ARISTA01T2': <EosHost VM0128>}
outstanding_action = None
result     = {'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}
tbinfo     = {'auto_recover': 'False', 'comment': 'Tests multi-asic virtual switch vm', 'conf-name': 'vms-kvm-four-asic-t1-lag', 'duts': ['vlab-08'], ...}
wait_time  = 30
How did you do it?
Add single asic constraint in recover

How did you verify/test it?
Run test
@yejianquan
Copy link
Copy Markdown
Collaborator

merged to 202405: #16293

nnelluri-cisco pushed a commit to nnelluri-cisco/sonic-mgmt that referenced this pull request Mar 15, 2025
…c-net#16264)

What is the motivation for this PR?
Default route check in sanity is added by sonic-net#16235
It only supports single asic for now. But constraint for single asic in recover stage is missed
It would cause keyError in multi-asic if there is bgp sanity check failure

    def adaptive_recover(dut, localhost, fanouthosts, nbrhosts, tbinfo, check_results, wait_time):
        outstanding_action = None
        for result in check_results:
            if result['failed']:
                if result['check_item'] == 'interfaces':
                    action = _recover_interfaces(dut, fanouthosts, result, wait_time)
                elif result['check_item'] == 'services':
                    action = _recover_services(dut, result)
                elif result['check_item'] == 'bgp':
                    # If there is only default route missing issue, only need to re-announce routes to recover
>                   if ("no_v4_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        "no_v6_default_route" in result['bgp'] and len(result['bgp']) == 1 or
                        ("no_v4_default_route" in result['bgp'] and "no_v6_default_route" in result['bgp'] and
                         len(result['bgp']) == 2)):
E                        KeyError: 'bgp'

check_results = [{'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}]
dut        = <MultiAsicSonicHost vlab-08>
fanouthosts = {}
localhost  = <tests.common.devices.local.Localhost object at 0x77f1b9270a90>
nbrhosts   = {'ARISTA01T0': <EosHost VM0129>, 'ARISTA01T2': <EosHost VM0128>}
outstanding_action = None
result     = {'bgp0': {'down_neighbors': ['2603:10e2:400:1::5', 'fc00::2']}, 'bgp3': {'down_neighbors': ['2603:10e2:400:1::6']}, 'check_item': 'bgp', 'failed': True, ...}
tbinfo     = {'auto_recover': 'False', 'comment': 'Tests multi-asic virtual switch vm', 'conf-name': 'vms-kvm-four-asic-t1-lag', 'duts': ['vlab-08'], ...}
wait_time  = 30
How did you do it?
Add single asic constraint in recover

How did you verify/test it?
Run test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants