
[action] [PR:633] [Smartswitch][pcied] Fix pcied handling for smartswitch during DPU detach #5 #641

Merged
mssonicbld merged 1 commit into sonic-net:202505 from mssonicbld:cherry/202505/633
Jul 9, 2025

Conversation

@mssonicbld
Collaborator

Description

This PR makes two changes:
1. Remove the deletion of PCIE_DETACH_INFO from pcied. This should be handled by the DPU power off/power on path in the module_base implementation.
2. Handle the return value of self.detach_info.get(key) in pcied correctly. The value is a tuple of the form (True, [('bus_info', '0000:03:00.1'), ('dpu_state', 'detaching')]), but pcied was expecting a dictionary.
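
A minimal sketch of the tuple handling described above. This is not the code from the PR: the helper name `is_detaching` is hypothetical, and it assumes only the return shape quoted in the description, a found flag followed by a list of (field, value) pairs.

```python
def is_detaching(get_result):
    """Return True if a detach-info lookup result marks the DPU as detaching.

    get_result is assumed to have the shape quoted in the description:
    (found, [(field, value), ...]). The function name is hypothetical.
    """
    found, field_values = get_result
    if not found:
        # The key is not present in the table, so nothing is detaching.
        return False
    # Convert the list of pairs into the dictionary the caller expected.
    fields = dict(field_values)
    return fields.get("dpu_state") == "detaching"


# Sample value copied from the PR description.
sample = (True, [("bus_info", "0000:03:00.1"), ("dpu_state", "detaching")])
print(is_detaching(sample))
print(is_detaching((False, [])))
```

Unpacking the found flag first avoids calling dictionary methods on the tuple itself, which is the kind of mismatch the second change addresses.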

Motivation and Context

This change is required because pcied should not delete entries from the PCIE_DETACH_INFO table; unlike pcied, the module_base implementation is responsible for that.

How Has This Been Tested?

Unit tests:

tests/test_DaemonPcied.py::TestDaemonPcied::test_signal_handler PASSED   [  7%]
tests/test_DaemonPcied.py::TestDaemonPcied::test_run PASSED              [ 14%]
tests/test_DaemonPcied.py::TestDaemonPcied::test_del PASSED              [ 21%]
tests/test_DaemonPcied.py::TestDaemonPcied::test_del_exception PASSED    [ 28%]
tests/test_DaemonPcied.py::TestDaemonPcied::test_is_dpu_in_detaching_mode PASSED [ 35%]
tests/test_DaemonPcied.py::TestDaemonPcied::test_check_pcie_devices PASSED [ 42%]
tests/test_DaemonPcied.py::TestDaemonPcied::test_check_pcie_devices_update_aer PASSED [ 50%]
tests/test_DaemonPcied.py::TestDaemonPcied::test_check_pcie_devices_detaching PASSED [ 57%]
tests/test_DaemonPcied.py::TestDaemonPcied::test_update_pcie_devices_status_db PASSED [ 64%]
tests/test_DaemonPcied.py::TestDaemonPcied::test_check_n_update_pcie_aer_stats PASSED [ 71%]
tests/test_DaemonPcied.py::TestDaemonPcied::test_update_aer_to_statedb PASSED [ 78%]
tests/test_pcied.py::test_main PASSED                                    [ 85%]
tests/test_pcied.py::test_read_id_file PASSED                            [ 92%]
tests/test_pcied.py::test_load_platform_pcieutil PASSED                  [100%]

Also tested manually on a smart switch system:
1. Restarted pcied (supervisorctl restart pcied in the pmon docker) and checked that the detach info table is not deleted.
2. Checked the logs to confirm that pcied does not crash due to the additional change for processing the detach table.

Additional Information (Optional)

@mssonicbld
Collaborator Author

Original PR: #633

@mssonicbld
Collaborator Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld mssonicbld merged commit b4b76a9 into sonic-net:202505 Jul 9, 2025
5 checks passed