Skip to content

[sanity-check]: Show failed check items in stack trace and error logs#2242

Merged
theasianpianist merged 1 commit intosonic-net:masterfrom
theasianpianist:sanity-check-log
Sep 21, 2020
Merged

[sanity-check]: Show failed check items in stack trace and error logs#2242
theasianpianist merged 1 commit intosonic-net:masterfrom
theasianpianist:sanity-check-log

Conversation

@theasianpianist
Copy link
Copy Markdown
Contributor

@theasianpianist theasianpianist commented Sep 18, 2020

Signed-off-by: Lawrence Lee lawlee@microsoft.com

Description of PR

Summary:
When the sanity check fails, add failed items in error log and include them in the stack trace.

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Approach

What is the motivation for this PR?

Currently when the sanity check fails, the stack trace is not able to show all of the check results since the results are too long. To see the failed items it is necessary to view the test logs, which can be annoying since they are not always displayed together (esp. in Jenkins)

How did you do it?

Select the failed check items and include them in the error log. Then also add these items to the pytest.fail message so they are printed in the stack trace.

How did you verify/test it?

Force a sanity check failure on a device and run some tests. Verified that the stack trace included the failed check items. Verify that the error log included the failed check items.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

Previous output (no check-items in stack trace)

localhost = <tests.common.devices.Localhost object at 0x7f9e38bb7cd0>
duthost = <tests.common.devices.SonicHost object at 0x7f9e3a4d0750>
request = <SubRequest 'sanity_check' for <Function test_bgp_facts>>
fanouthosts = <omitted>
testbed = <omitted>

    @pytest.fixture(scope="module", autouse=True)
    def sanity_check(localhost, duthost, request, fanouthosts, testbed):
        logger.info("Start pre-test sanity check")
    
        skip_sanity = False
        allow_recover = False
        recover_method = "adaptive"
        check_items = set(copy.deepcopy(constants.DEFAULT_CHECK_ITEMS))  # Default check items
        post_check = False
    
        customized_sanity_check = None
        for m in request.node.iter_markers():
            logger.info("Found marker: m.name=%s, m.args=%s, m.kwargs=%s" % (m.name, m.args, m.kwargs))
            if m.name == "sanity_check":
                customized_sanity_check = m
                break
    
        if customized_sanity_check:
            logger.info("Process marker %s in script. m.args=%s, m.kwargs=%s" % (m.name, str(m.args), str(m.kwargs)))
            skip_sanity = customized_sanity_check.kwargs.get("skip_sanity", False)
            allow_recover = customized_sanity_check.kwargs.get("allow_recover", False)
            recover_method = customized_sanity_check.kwargs.get("recover_method", "adaptive")
            if allow_recover and recover_method not in constants.RECOVER_METHODS:
                pytest.warning("Unsupported recover method")
                logger.info("Fall back to use default recover method 'config_reload'")
                recover_method = "config_reload"
    
            check_items = _update_check_items(check_items,
                                              customized_sanity_check.kwargs.get("check_items", []),
                                              constants.SUPPORTED_CHECK_ITEMS)
            post_check = customized_sanity_check.kwargs.get("post_check", False)
    
        if request.config.option.skip_sanity:
            skip_sanity = True
        if request.config.option.allow_recover:
            allow_recover = True
        items = request.config.getoption("--check_items")
        if items:
            items_array=str(items).split(',')
            check_items = _update_check_items(check_items, items_array, constants.SUPPORTED_CHECK_ITEMS)
    
        # ignore BGP check for particular topology type
        if testbed['topo']['type'] == 'ptf' and 'bgp' in check_items:
            check_items.remove('bgp')
    
        logger.info("Sanity check settings: skip_sanity=%s, check_items=%s, allow_recover=%s, recover_method=%s, post_check=%s" % \
            (skip_sanity, check_items, allow_recover, recover_method, post_check))
    
        if skip_sanity:
            logger.info("Skip sanity check according to command line argument or configuration of test script.")
            yield
            return
    
        if not check_items:
            logger.info("No sanity check item is specified, no pre-test sanity check")
            yield
            logger.info("No sanity check item is specified, no post-test sanity check")
            return
    
        print_logs(duthost, constants.PRINT_LOGS)
        check_results = do_checks(duthost, check_items)
        logger.info("!!!!!!!!!!!!!!!! Pre-test sanity check results: !!!!!!!!!!!!!!!!\n%s" % \
                    json.dumps(check_results, indent=4))
        if any([result["failed"] for result in check_results]):
            if not allow_recover:
                pytest.fail("Pre-test sanity check failed, allow_recover=False {}".format(check_results))
                return
    
            logger.info("Pre-test sanity check failed, try to recover, recover_method=%s" % recover_method)
            recover(duthost, localhost, fanouthosts, check_results, recover_method)
            logger.info("Run sanity check again after recovery")
            new_check_results = do_checks(duthost, check_items)
            logger.info("!!!!!!!!!!!!!!!! Pre-test sanity check after recovery results: !!!!!!!!!!!!!!!!\n%s" % \
                        json.dumps(new_check_results, indent=4))
            if any([result["failed"] for result in new_check_results]):
>               pytest.fail("Pre-test sanity check failed again after recovered by '%s'" % recover_method)
E               Failed: Pre-test sanity check failed again after recovered by 'adaptive'

allow_recover = True
check_items = set(['bgp', 'dbmemory', 'interfaces', 'processes', 'services'])
check_results = [{'check_item': 'services', 'failed': False, 'services_status': {'bgp': True, 'database': True, 'lldp': True, 'pmon': ...tatus': {'bgp': True, 'database': True, 'lldp': True, 'pmon': True, ...}}, {'check_item': 'dbmemory', 'failed': False}]
customized_sanity_check = None
duthost    = <tests.common.devices.SonicHost object at 0x7f9e3a4d0750>
fanouthosts = <omitted>
items      = False
localhost  = <tests.common.devices.Localhost object at 0x7f9e38bb7cd0>
m          = Mark(name='device_type', args=('vs',), kwargs={})
new_check_results = [{'check_item': 'services', 'failed': False, 'services_status': {'bgp': True, 'database': True, 'lldp': True, 'pmon': ...tatus': {'bgp': True, 'database': True, 'lldp': True, 'pmon': True, ...}}, {'check_item': 'dbmemory', 'failed': False}]
post_check = False
recover_method = 'adaptive'
request    = <SubRequest 'sanity_check' for <Function test_bgp_facts>>
result     = {'check_item': 'dbmemory', 'failed': False}
skip_sanity = False
testbed    = <omitted>

common/plugins/sanity_check/__init__.py:118: Failed

New output:

<same as previous>
...
if any([result["failed"] for result in new_check_results]):
                failed_items = json.dumps([result for result in new_check_results if result["failed"]], indent=4)
                logger.error("Failed check items:\n{}".format(failed_items))
>               pytest.fail("Pre-test sanity check failed again after recovered by '{}' with failed items:\n{}".format(recover_method, failed_items))
E               Failed: Pre-test sanity check failed again after recovered by 'adaptive' with failed items:
E               [
E                   {
E                       "check_item": "bgp",
E                       "failed": true,
E                       "down_neighbors": [
E                           "10.0.0.59",
E                           "10.0.0.61",
E                           "10.0.0.63",
E                           "fc00::7e",
E                           "fc00::7a",
E                           "10.0.0.57",
E                           "fc00::76",
E                           "fc00::72"
E                       ]
E                   },
E                   {
E                       "check_item": "interfaces",
E                       "failed": true,
E                       "down_ports": [
E                           "PortChannel0001",
E                           "PortChannel0003",
E                           "PortChannel0002",
E                           "PortChannel0004"
E                       ]
E                   }
E               ]
...
<same as previous>

Signed-off-by: Lawrence Lee <lawlee@microsoft.com>
@theasianpianist theasianpianist requested review from a team and daall September 18, 2020 21:16
Copy link
Copy Markdown
Contributor

@daall daall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an example of the before/after of the output?

@daall daall requested review from wangxin and yxieca September 19, 2020 00:38
@theasianpianist theasianpianist merged commit a8c79bf into sonic-net:master Sep 21, 2020
@theasianpianist theasianpianist deleted the sanity-check-log branch September 21, 2020 16:35
kazinator-arista pushed a commit to kazinator-arista/sonic-mgmt that referenced this pull request Mar 4, 2026
…#12292)

linkmgrd:
* a5ac7f6 2022-10-05 | [Active-Active] Post link prober stats to state db  (sonic-net#140) (HEAD -> 202205, github/202205) [Jing Zhang]
* f4b0e53 2022-10-05 | [Active-Active] Retry config mux mode standby (sonic-net#139) [Jing Zhang]

utilities:
* a255838 2022-10-04 | [minigraph] new workflow for golden path (sonic-net#2396) (HEAD -> 202205, github/202205) [jingwenxie]
* 99425a8 2022-10-03 | [actions] Support Semgrep by Github Actions (sonic-net#2417) [Mai Bui]
* f41e4d1 2022-09-30 | Fix for show vxlan tunnel command display issue sonic-net#11902 (sonic-net#2391) [Senthil Bhava]
* e1d827e 2022-09-29 | [VxLAN]Fix Vxlan delete command to throw error when there are references (sonic-net#2404) [Sudharsan Dhamal Gopalarathnam]
* d77acf8 2022-09-28 | [doc] add documentation on automatic techsupport based on memory (sonic-net#2411) [Stepan Blyshchak]
* 2cfc75a 2022-09-28 | [doc] update "config feature" section with "--block" option (sonic-net#2409) [Stepan Blyshchak]
* 9dc8471 2022-09-28 | [Vxlanmgrd] [CPA] Update the vxlan_tunnel name len to be under IFNAMIZ to overcome netdev creation failure (sonic-net#2398) [Vivek]
* 342589e 2022-10-03 | Added cisco config platform commands (sonic-net#2242) (sonic-net#2418) [yucgu]

swss:
* 9d9f395 2022-10-04 | [intfmgr]: Enable `accept_untracked_na` kernel param (sonic-net#2436) (HEAD -> 202205, github/202205) [Lawrence Lee]
* 6b6d25d 2022-10-04 |  [orchdaemon]: Fixed sairedis record file rotation (sonic-net#2480) [Bryan Crossland]

Signed-off-by: Ying Xie <ying.xie@microsoft.com>

Signed-off-by: Ying Xie <ying.xie@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants