Skip to content

[tests]: Fix test_MirrorDestMoveLag test failure#3639

Merged
prsunny merged 6 commits intosonic-net:masterfrom
cscarpitta:fix/fix_test_mirror_test_failure
May 31, 2025
Merged

[tests]: Fix test_MirrorDestMoveLag test failure#3639
prsunny merged 6 commits intosonic-net:masterfrom
cscarpitta:fix/fix_test_mirror_test_failure

Conversation

@cscarpitta
Copy link
Copy Markdown
Contributor

@cscarpitta cscarpitta commented May 11, 2025

test_MirrorDestMoveLag performs the following steps:

  1. Create mirror session
  2. Enable non-LAG monitor port
  3. Create LAG; move to LAG with one member
  4. Remove LAG member
  5. Create LAG member
  6. Remove LAG; move to non-LAG
  7. Disable non-LAG monitor port
  8. Remove mirror session

At Step3, we call create_port_channel and set_interface_status to create LAG and set admin_status "up".

This creates a "PortChannel080" entry in the "PORTCHANNEL" table of CONFIG_DB, which results in the creation of a team device "PortChannel080" in Linux.

When step 6 calls remove_port_channel to remove the LAG, it is expected that the "PortChannel080" entry created previously will be removed and the team device will be deleted.

However, the current code does not remove the "PortChannel080" entry from CONFIG_DB and consequently the team device is not deleted.

This issue causes a test failure. The reason is that subsequent tests try to create a "PortChannel080" and since the "PortChannel080" already exists, unexpected behavior occurs.

This PR fixes the issue by changing the remove_port_channel code to delete the PortChannel entry.

`test_MirrorDestMoveLag` performs the following steps:
1. Create mirror session
2. Enable non-LAG monitor port
3. Create LAG; move to LAG with one member
4. Remove LAG member
5. Create LAG member
6. Remove LAG; move to non-LAG
7. Disable non-LAG monitor port
8. Remove mirror session

At Step3, we call `create_port_channel` and `set_interface_status`
to create LAG and set admin_status "up".

This creates a "PortChannel080" entry in the "PORTCHANNEL" table of
CONFIG_DB, which results in the creation of a team device "PortChannel080"
in Linux.

When step 6 calls `remove_port_channel` to remove the LAG, it is expected
that the "PortChannel080" entry created previously will be removed and the
team device will be deleted.

However, the current code does not remove the "PortChannel080" entry from
CONFIG_DB and consequently the team device is not deleted.

This issue causes a test failure. The reason is that subsequent tests try
to create a "PortChannel080" and since the "PortChannel080" already exists,
unexpected behavior occurs.

This commit fixes the issue by changing the `remove_port_channel` code to
delete the PortChannel entry.

Signed-off-by: Carmine Scarpitta <cscarpit@cisco.com>
@cscarpitta cscarpitta requested a review from prsunny as a code owner May 11, 2025 10:37
@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Copy Markdown
Contributor

@BYGX-wcr BYGX-wcr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

@liuh-80 liuh-80 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified this PR can fix test case break caused by FRR upgrade.

Copy link
Copy Markdown
Contributor

@prabhataravind prabhataravind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@prsunny
Copy link
Copy Markdown
Collaborator

prsunny commented May 14, 2025

Verified this PR can fix test case break caused by FRR upgrade.

@liuh-80 , why is this related to FRR upgrade? Can you provide more context?

@dgsudharsan for viz

@liuh-80
Copy link
Copy Markdown
Contributor

liuh-80 commented May 15, 2025

Verified this PR can fix test case break caused by FRR upgrade.

@liuh-80 , why is this related to FRR upgrade? Can you provide more context?

@dgsudharsan for viz

Hi @prsunny , after FRR upgrade, sonic-swss PR blocked by that change, and after the fix FRR PR on sonic-buildimage repo merged, there still some failed test case, I verified this PR can fix the test_mirror.py issue:

2025-05-14T22:09:55.7744038Z -------- generated xml file: /agent/_work/1/s/tests/test_mirror_tr.xml ---------
2025-05-14T22:09:55.7745563Z ===Flaky Test Report===
2025-05-14T22:09:55.7745691Z
2025-05-14T22:09:55.7747419Z test_MirrorInvalidEntry passed 1 out of the required 1 times. Success!
2025-05-14T22:09:55.7747951Z test_MirrorAddRemove passed 1 out of the required 1 times. Success!
2025-05-14T22:09:55.7748374Z test_MirrorToVlanAddRemove passed 1 out of the required 1 times. Success!
2025-05-14T22:09:55.7748833Z test_MirrorToLagAddRemove passed 1 out of the required 1 times. Success!
2025-05-14T22:09:55.7749246Z test_MirrorDestMoveVlan passed 1 out of the required 1 times. Success!
2025-05-14T22:09:55.7749685Z test_MirrorDestMoveLag failed (1 runs remaining out of 2).
2025-05-14T22:09:55.7750100Z <class 'AssertionError'>
2025-05-14T22:09:55.7750437Z assert 'Ethernet64' == 'Ethernet32'
2025-05-14T22:09:55.7750757Z - Ethernet64
2025-05-14T22:09:55.7751010Z ? ^^
2025-05-14T22:09:55.7751463Z + Ethernet32
2025-05-14T22:09:55.7751895Z ? ^^
2025-05-14T22:09:55.7752309Z [<TracebackEntry /agent/_work/1/s/tests/test_mirror.py:783>, <TracebackEntry /agent/_work/1/s/tests/test_mirror.py:711>]
2025-05-14T22:09:55.7755097Z test_MirrorDestMoveLag failed; it passed 0 out of the required 1 times.
2025-05-14T22:09:55.7755584Z <class 'AssertionError'>
2025-05-14T22:09:55.7755988Z assert 'active' == 'inactive'
2025-05-14T22:09:55.7756655Z - active
2025-05-14T22:09:55.7757137Z + inactive
2025-05-14T22:09:55.7758116Z ? ++
2025-05-14T22:09:55.7758541Z [<TracebackEntry /agent/_work/1/s/tests/test_mirror.py:782>, <TracebackEntry /agent/_work/1/s/tests/test_mirror.py:665>]
2025-05-14T22:09:55.7759014Z test_AclBindMirrorPerStage failed (1 runs remaining out of 2).
2025-05-14T22:09:55.7759355Z <class 'AssertionError'>
2025-05-14T22:09:55.7759799Z assert 2 == 1
2025-05-14T22:09:55.7760059Z -2
2025-05-14T22:09:55.7760299Z +1
2025-05-14T22:09:55.7760629Z [<TracebackEntry /agent/_work/1/s/tests/test_mirror.py:843>]
2025-05-14T22:09:55.7761033Z test_AclBindMirrorPerStage failed; it passed 0 out of the required 1 times.
2025-05-14T22:09:55.7761389Z <class 'AssertionError'>
2025-05-14T22:09:55.7761670Z assert 2 == 1
2025-05-14T22:09:55.7761925Z -2
2025-05-14T22:09:55.7762163Z +1
2025-05-14T22:09:55.7762719Z [<TracebackEntry /agent/_work/1/s/tests/test_mirror.py:843>]
2025-05-14T22:09:55.7763273Z test_AclBindMirror failed (1 runs remaining out of 2).
2025-05-14T22:09:55.7763804Z <class 'AssertionError'>
2025-05-14T22:09:55.7764116Z assert 2 == 1
2025-05-14T22:09:55.7764380Z -2
2025-05-14T22:09:55.7764622Z +1
2025-05-14T22:09:55.7765041Z [<TracebackEntry /agent/_work/1/s/tests/test_mirror.py:990>, <TracebackEntry /agent/_work/1/s/tests/test_mirror.py:911>]
2025-05-14T22:09:55.7765536Z test_AclBindMirror failed; it passed 0 out of the required 1 times.
2025-05-14T22:09:55.7765939Z <class 'AssertionError'>
2025-05-14T22:09:55.7766230Z assert 2 == 1
2025-05-14T22:09:55.7766523Z -2
2025-05-14T22:09:55.7766775Z +1
2025-05-14T22:09:55.7767169Z [<TracebackEntry /agent/_work/1/s/tests/test_mirror.py:990>, <TracebackEntry /agent/_work/1/s/tests/test_mirror.py:911>]
2025-05-14T22:09:55.7767638Z test_nonflaky_dummy passed 1 out of the required 1 times. Success!
2025-05-14T22:09:55.7767872Z
2025-05-14T22:09:55.7768355Z ===End Flaky Test Report===
2025-05-14T22:09:55.7768726Z ===================== 3 failed, 6 passed in 382.05 seconds =====================

@prsunny
Copy link
Copy Markdown
Collaborator

prsunny commented May 15, 2025

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@liuh-80
Copy link
Copy Markdown
Contributor

liuh-80 commented May 15, 2025

/azpw run Azure.sonic-swss

@mssonicbld
Copy link
Copy Markdown
Collaborator

Only PR owner can use /azpw run

@liuh-80
Copy link
Copy Markdown
Contributor

liuh-80 commented May 15, 2025

/azp run Azure.sonic-swss

@azure-pipelines
Copy link
Copy Markdown

Commenter does not have sufficient privileges for PR 3639 in repo sonic-net/sonic-swss

@liuh-80
Copy link
Copy Markdown
Contributor

liuh-80 commented May 15, 2025

@cscarpitta , you need trigger PR validation again, current test failed because build docker image with #850459 which does not contain the FRR fix.

@cscarpitta
Copy link
Copy Markdown
Contributor Author

/azp run Azure.sonic-swss

@azure-pipelines
Copy link
Copy Markdown

Commenter does not have sufficient privileges for PR 3639 in repo sonic-net/sonic-swss

@cscarpitta
Copy link
Copy Markdown
Contributor Author

/azpw run Azure.sonic-swss

@mssonicbld
Copy link
Copy Markdown
Collaborator

/AzurePipelines run Azure.sonic-swss

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@cscarpitta
Copy link
Copy Markdown
Contributor Author

/azpw run Azure.sonic-swss

@mssonicbld
Copy link
Copy Markdown
Collaborator

/AzurePipelines run Azure.sonic-swss

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@cscarpitta
Copy link
Copy Markdown
Contributor Author

/azpw run Azure.sonic-swss

@mssonicbld
Copy link
Copy Markdown
Collaborator

/AzurePipelines run Azure.sonic-swss

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@prabhataravind
Copy link
Copy Markdown
Contributor

@cscarpitta there are two more tests in test_mirror.py and 6 test cases in test_warm_reboot.py that are failing (These have been passing until May 9 and no swss changes have gone in since then that could have caused these failures). Could you please check if these are related to the FRR upgrades. I have temporarily skipped them in #3664 to let pipeline runs proceed.

@cscarpitta
Copy link
Copy Markdown
Contributor Author

Hi @prabhataravind I’m already working on a different PR for the other test failures.

In the meantime we can merge this PR to fix the test_MirrorDestMoveLag test case.
I will open a different PR to fix the other test cases.

@prabhataravind
Copy link
Copy Markdown
Contributor

Hi @prabhataravind I’m already working on a different PR for the other test failures.

In the meantime we can merge this PR to fix the test_MirrorDestMoveLag test case. I will open a different PR to fix the other test cases.

This test (test_MirrorDestMoveLag) is currently skipped to unblock PRs. Could you remove the skip mark as part of this PR if you think this will pass?

Copy link
Copy Markdown
Contributor

@prabhataravind prabhataravind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the skip mark in line 782 as well..

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Copy Markdown
Contributor

@prabhataravind prabhataravind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@prsunny prsunny merged commit 3a5efa3 into sonic-net:master May 31, 2025
14 checks passed
@mssonicbld
Copy link
Copy Markdown
Collaborator

Cherry-pick PR to 202505: #3700

divyagayathri-hcl pushed a commit to divyagayathri-hcl/sonic-swss that referenced this pull request Jun 23, 2025
* [tests]: Fix `test_MirrorDestMoveLag` test failure

`test_MirrorDestMoveLag` performs the following steps:
1. Create mirror session
2. Enable non-LAG monitor port
3. Create LAG; move to LAG with one member
4. Remove LAG member
5. Create LAG member
6. Remove LAG; move to non-LAG
7. Disable non-LAG monitor port
8. Remove mirror session

At Step3, we call `create_port_channel` and `set_interface_status`
to create LAG and set admin_status "up".

This creates a "PortChannel080" entry in the "PORTCHANNEL" table of
CONFIG_DB, which results in the creation of a team device "PortChannel080"
in Linux.

When step 6 calls `remove_port_channel` to remove the LAG, it is expected
that the "PortChannel080" entry created previously will be removed and the
team device will be deleted.

However, the current code does not remove the "PortChannel080" entry from
CONFIG_DB and consequently the team device is not deleted.

This issue causes a test failure. The reason is that subsequent tests try
to create a "PortChannel080" and since the "PortChannel080" already exists,
unexpected behavior occurs.

This commit fixes the issue by changing the `remove_port_channel` code to
delete the PortChannel entry.
@mssonicbld
Copy link
Copy Markdown
Collaborator

Cherry-pick PR to 202411: #3804

balanokia pushed a commit to balanokia/sonic-swss that referenced this pull request Nov 17, 2025
* [tests]: Fix `test_MirrorDestMoveLag` test failure

`test_MirrorDestMoveLag` performs the following steps:
1. Create mirror session
2. Enable non-LAG monitor port
3. Create LAG; move to LAG with one member
4. Remove LAG member
5. Create LAG member
6. Remove LAG; move to non-LAG
7. Disable non-LAG monitor port
8. Remove mirror session

At Step3, we call `create_port_channel` and `set_interface_status`
to create LAG and set admin_status "up".

This creates a "PortChannel080" entry in the "PORTCHANNEL" table of
CONFIG_DB, which results in the creation of a team device "PortChannel080"
in Linux.

When step 6 calls `remove_port_channel` to remove the LAG, it is expected
that the "PortChannel080" entry created previously will be removed and the
team device will be deleted.

However, the current code does not remove the "PortChannel080" entry from
CONFIG_DB and consequently the team device is not deleted.

This issue causes a test failure. The reason is that subsequent tests try
to create a "PortChannel080" and since the "PortChannel080" already exists,
unexpected behavior occurs.

This commit fixes the issue by changing the `remove_port_channel` code to
delete the PortChannel entry.
baorliu pushed a commit to baorliu/sonic-swss that referenced this pull request Feb 23, 2026
* [tests]: Fix `test_MirrorDestMoveLag` test failure

`test_MirrorDestMoveLag` performs the following steps:
1. Create mirror session
2. Enable non-LAG monitor port
3. Create LAG; move to LAG with one member
4. Remove LAG member
5. Create LAG member
6. Remove LAG; move to non-LAG
7. Disable non-LAG monitor port
8. Remove mirror session

At Step3, we call `create_port_channel` and `set_interface_status`
to create LAG and set admin_status "up".

This creates a "PortChannel080" entry in the "PORTCHANNEL" table of
CONFIG_DB, which results in the creation of a team device "PortChannel080"
in Linux.

When step 6 calls `remove_port_channel` to remove the LAG, it is expected
that the "PortChannel080" entry created previously will be removed and the
team device will be deleted.

However, the current code does not remove the "PortChannel080" entry from
CONFIG_DB and consequently the team device is not deleted.

This issue causes a test failure. The reason is that subsequent tests try
to create a "PortChannel080" and since the "PortChannel080" already exists,
unexpected behavior occurs.

This commit fixes the issue by changing the `remove_port_channel` code to
delete the PortChannel entry.

Signed-off-by: Baorong Liu <96146196+baorliu@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.