[xcvrd] Enable periodic polling of VDM relevant data #582
prgeor merged 13 commits into sonic-net:master
Signed-off-by: Mihir Patel <patelmi@microsoft.com>
… periodic polling
…daemons into vdm_read_through_xcvrd
@mihirpat1 can you make this log a debug log?
@prgeor I have addressed this now.
sonic-xcvrd/xcvrd/xcvrd.py
Outdated
    - Last Set Time Table
    - Last Clear Time Table
"""
def update_flag_metadata_tables(logical_port_name, physical_port_name, field_name, current_value,
@mihirpat1 do we need logical port name since we are moving away from updating dom per logical portname?
@prgeor Yes, ideally we should not pass the logical port name here. I plan to remove it once we modify the DomInfoUpdateTask thread to poll based on the physical port number.
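For readers unfamiliar with the flag metadata discussed in this thread, the bookkeeping can be sketched as follows. This is a minimal illustration, not the code from this PR: the plain dicts standing in for the redis-db tables, the `new_tables` and `update_flag_metadata` names, the `"never"` placeholder, and the timestamp format are all assumptions. It only mirrors what the commit list describes: a change counter that starts at 0 on the first observation, plus last-set and last-clear times updated when a flag toggles.

```python
from datetime import datetime, timezone

NEVER = "never"  # assumed placeholder for "no timestamp recorded yet"


def new_tables():
    """Plain dicts standing in for the redis-db tables (hypothetical layout)."""
    return {"flag": {}, "change_count": {}, "last_set_time": {}, "last_clear_time": {}}


def update_flag_metadata(tables, field_name, current_value, now=None):
    """Record one polled flag value and keep its metadata in sync."""
    now = now or datetime.now(timezone.utc).strftime("%a %b %d %H:%M:%S %Y")
    previous_value = tables["flag"].get(field_name)

    if previous_value is None:
        # First observation (for example right after boot-up): start the count at 0.
        tables["change_count"][field_name] = 0
        tables["last_set_time"][field_name] = NEVER
        tables["last_clear_time"][field_name] = NEVER
    elif previous_value != current_value:
        # The flag toggled since the previous poll: count it and stamp the time.
        tables["change_count"][field_name] += 1
        key = "last_set_time" if current_value else "last_clear_time"
        tables[key][field_name] = now

    tables["flag"][field_name] = current_value
```

Keying the dicts by field name only keeps the sketch short; a per-port key (logical or physical) would be layered on top, which is the design question raised in the review comment above.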
sonic-xcvrd/xcvrd/dom/dom_mgr.py
Outdated
    else:
        return xcvrd.SFP_EEPROM_NOT_READY

def post_port_diagnostic_values_to_db(self, logical_port_name, table,
@mihirpat1 rename to post_vdm_real_values_to_db?
nit: rename dom_utilities -> utilities
@@ -0,0 +1,14 @@
class XCVRDUtils:
    """
    This class provides utility functions for managing VDM operations on transceivers.
@@ -0,0 +1,59 @@
from swsscommon import swsscommon


class DBUtils:
nit, rename common_db_utils -> db_utils?
sonic-xcvrd/xcvrd/xcvrd.py
Outdated
 class SfpStateUpdateTask(threading.Thread):
     RETRY_EEPROM_READING_INTERVAL = 60
-    def __init__(self, namespaces, port_mapping, main_thread_stop_event, sfp_error_event):
+    def __init__(self, namespaces, port_mapping, sfp_obj_dict, main_thread_stop_event, sfp_error_event, helper_logger):
Can you use SysLogger() for helper_logger?
PR Overview
This PR enables periodic polling of VDM-relevant data in xcvrd by updating database utility functions, VDM utilities, and associated state-management logic. Key changes include:
- Parameter and API signature updates (e.g. passing the API object explicitly to post_port_active_apsel_to_db and updating task constructors).
- Moving and refining modules (e.g. moving DomInfoUpdateTask into the dom subdirectory and updating its dependencies).
- Minor logging level adjustments to reduce unnecessary flooding.
Reviewed Changes
| File | Description |
|---|---|
| sonic-xcvrd/xcvrd/xcvrd.py | Updates to function calls and task constructors to ensure correct dependency injection and correct parameter ordering. |
| sonic-xcvrd/xcvrd/dom/dom_mgr.py | Updates to DomInfoUpdateTask constructor and dependency propagation with the inclusion of sfp_obj_dict. |
| sonic-xcvrd/xcvrd/xcvrd_utilities/port_event_helper.py | Change of log level from warning to debug to reduce log flooding. |
| sonic-xcvrd/xcvrd/xcvrd_utilities/xcvr_table_helper.py | Added new VDM tables and associated getter APIs to support the new polling functionality. |
Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (4)
sonic-xcvrd/xcvrd/xcvrd.py:1262
- Passing the API object as the first parameter fixes an argument mismatch in the post_port_active_apsel_to_db call. Confirm that the function signature and all call sites are updated accordingly.
self.post_port_active_apsel_to_db(api, lport, host_lanes_mask, reset_apsel=True)
sonic-xcvrd/xcvrd/xcvrd.py:1478
- Updating the SfpStateUpdateTask constructor to include 'sfp_obj_dict' and 'helper_logger' ensures proper dependency injection. Verify that all call sites supply the required parameters.
def __init__(self, namespaces, port_mapping, sfp_obj_dict, main_thread_stop_event, sfp_error_event, helper_logger):
sonic-xcvrd/xcvrd/dom/dom_mgr.py:33
- Including 'sfp_obj_dict' in the DomInfoUpdateTask constructor improves consistency with dependency requirements. Ensure that sfp_obj_dict is properly initialized and passed from the caller.
def __init__(self, namespaces, port_mapping, sfp_obj_dict, main_thread_stop_event, skip_cmis_mgr, helper_logger):
sonic-xcvrd/xcvrd/xcvrd_utilities/port_event_helper.py:147
- [nitpick] Changing the log level from warning to debug helps reduce log flooding; verify that this change aligns with the overall logging strategy in production.
self.logger.log_debug("$$$ {} handle_port_update_event() : op={} DB:{} Table:{} fvp {}".format(...))
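The constructor changes summarized above (passing `sfp_obj_dict` and a stop event into the task) follow a common pattern for a periodic polling thread. The sketch below illustrates that pattern only. `PeriodicPollTask`, `poll_fn`, and `interval_s` are hypothetical names, and the real `DomInfoUpdateTask` takes more parameters and does considerably more work per port.

```python
import threading


class PeriodicPollTask(threading.Thread):
    """Polls every SFP object at a fixed interval until the stop event is set."""

    def __init__(self, sfp_obj_dict, stop_event, poll_fn, interval_s=60.0):
        super().__init__(daemon=True)
        self.sfp_obj_dict = sfp_obj_dict  # physical port number -> SFP object
        self.stop_event = stop_event      # shared event used to end the loop
        self.poll_fn = poll_fn            # callable(physical_port, sfp), injected by the caller
        self.interval_s = interval_s

    def run(self):
        # Event.wait() returns True once the event is set, which ends the loop;
        # until then it doubles as an interruptible sleep between polls.
        while not self.stop_event.wait(self.interval_s):
            for physical_port, sfp in self.sfp_obj_dict.items():
                self.poll_fn(physical_port, sfp)
```

Injecting the dictionary and the stop event, rather than having the task look them up globally, is what lets unit tests drive the task with fake SFP objects.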
#### Description
The following traceback is seen with the latest image for DAC cables.
```
2025 Mar 17 17:24:19.889826 sonic-dut ERR pmon#xcvrd[67]: Traceback (most recent call last):
2025 Mar 17 17:24:19.889997 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet259', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet259', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.890132 sonic-dut ERR pmon#xcvrd[67]: File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1878, in run
2025 Mar 17 17:24:19.890201 sonic-dut ERR pmon#xcvrd[67]: self.task_worker(self.task_stopping_event, self.sfp_error_event)
2025 Mar 17 17:24:19.890397 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet21', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet21', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.890516 sonic-dut ERR pmon#xcvrd[67]: File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1671, in task_worker
2025 Mar 17 17:24:19.890609 sonic-dut ERR pmon#xcvrd[67]: self.init()
2025 Mar 17 17:24:19.890757 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet86', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet86', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.890810 sonic-dut ERR pmon#xcvrd[67]: File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1590, in init
2025 Mar 17 17:24:19.890856 sonic-dut ERR pmon#xcvrd[67]: self.retry_eeprom_set = self._post_port_sfp_info_and_dom_thr_to_db_once(port_mapping_data, self.xcvr_table_helper, self.main_thread_stop_event)
2025 Mar 17 17:24:19.891031 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet294', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet294', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.891059 sonic-dut ERR pmon#xcvrd[67]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.891104 sonic-dut ERR pmon#xcvrd[67]: File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1547, in _post_port_sfp_info_and_dom_thr_to_db_once
2025 Mar 17 17:24:19.891255 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet289', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet289', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.891290 sonic-dut ERR pmon#xcvrd[67]: self.vdm_db_utils.post_port_vdm_thresholds_to_db(logical_port_name)
2025 Mar 17 17:24:19.891332 sonic-dut ERR pmon#xcvrd[67]: File "/usr/local/lib/python3.11/dist-packages/xcvrd/dom/utilities/vdm/db_utils.py", line 70, in post_port_vdm_thresholds_to_db
2025 Mar 17 17:24:19.891490 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet128', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet128', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.891525 sonic-dut ERR pmon#xcvrd[67]: return self._post_port_vdm_thresholds_or_flags_to_db(logical_port_name, self.xcvr_table_helper.get_vdm_threshold_tbl,
2025 Mar 17 17:24:19.891569 sonic-dut ERR pmon#xcvrd[67]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.891721 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet2', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet2', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.891757 sonic-dut ERR pmon#xcvrd[67]: File "/usr/local/lib/python3.11/dist-packages/xcvrd/dom/utilities/vdm/db_utils.py", line 100, in _post_port_vdm_thresholds_or_flags_to_db
2025 Mar 17 17:24:19.891794 sonic-dut ERR pmon#xcvrd[67]: vdm_values_dict = get_vdm_values_func(physical_port)
2025 Mar 17 17:24:19.891839 sonic-dut ERR pmon#xcvrd[67]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.891990 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet471', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet471', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.892029 sonic-dut ERR pmon#xcvrd[67]: File "/usr/local/lib/python3.11/dist-packages/xcvrd/dom/utilities/vdm/utils.py", line 39, in get_vdm_thresholds
2025 Mar 17 17:24:19.892067 sonic-dut ERR pmon#xcvrd[67]: return self.sfp_obj_dict[physical_port].get_transceiver_vdm_thresholds()
2025 Mar 17 17:24:19.892109 sonic-dut ERR pmon#xcvrd[67]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.892262 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet118', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet118', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.892302 sonic-dut ERR pmon#xcvrd[67]: File "/usr/local/lib/python3.11/dist-packages/sonic_platform_base/sonic_xcvr/sfp_optoe_base.py", line 76, in get_transceiver_vdm_thresholds
2025 Mar 17 17:24:19.892340 sonic-dut ERR pmon#xcvrd[67]: return api.get_transceiver_vdm_thresholds() if api is not None else None
2025 Mar 17 17:24:19.892381 sonic-dut ERR pmon#xcvrd[67]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.892544 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet343', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet343', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.892585 sonic-dut ERR pmon#xcvrd[67]: File "/usr/local/lib/python3.11/dist-packages/sonic_platform_base/sonic_xcvr/api/public/cmis.py", line 2566, in get_transceiver_vdm_thresholds
2025 Mar 17 17:24:19.892625 sonic-dut ERR pmon#xcvrd[67]: vdm_raw_dict = self.get_vdm(self.vdm.VDM_THRESHOLD)
2025 Mar 17 17:24:19.892666 sonic-dut ERR pmon#xcvrd[67]: ^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.892812 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet224', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'true', 'index': '-1', 'port_name': 'Ethernet224', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.892857 sonic-dut ERR pmon#xcvrd[67]: AttributeError: 'NoneType' object has no attribute 'VDM_THRESHOLD'
2025 Mar 17 17:24:19.893100 sonic-dut NOTICE pmon#xcvrd[67]: Stop daemon main loop
2025 Mar 17 17:24:19.893330 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet230', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet230', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.893330 sonic-dut ERR pmon#xcvrd[67]: Xcvrd: exception found at child thread SfpStateUpdateTask due to AttributeError("'NoneType' object has no attribute 'VDM_THRESHOLD'")
2025 Mar 17 17:24:19.893412 sonic-dut ERR pmon#xcvrd[67]: Exiting main loop as child thread raised exception!
2025 Mar 17 17:24:19.904444 sonic-dut INFO pmon#supervisord 2025-03-17 17:24:19,904 WARN exited: xcvrd (terminated by SIGKILL; not expected)
```
#### Motivation and Context
With sonic-net/sonic-platform-daemons#582 merged, we are now updating the VDM threshold data for all types of transceivers.
However, transceivers that are CMIS compliant but have flat memory do not support VDM. The driver handler for fetching the VDM threshold data does not check whether the CMIS transceiver supports VDM, which causes xcvrd to crash.
https://github.com/sonic-net/sonic-platform-common/blob/e5aedb6bab10a16d0167488eb9e291805c397c8f/sonic_platform_base/sonic_xcvr/api/public/cmis.py#L2619
To address this issue, ensure that a transceiver is not flat memory based before reading the VDM threshold data from the transceiver.
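The kind of guard described here can be sketched as below. This is an illustration under stated assumptions, not the actual fix: `FakeCmisApi` and `FakeVdm` are hypothetical stand-ins for the platform classes, the threshold field name is made up, and returning `None` for unsupported modules is one possible convention. The caller then skips the table write instead of raising the `AttributeError` seen in the traceback.

```python
class FakeVdm:
    """Hypothetical VDM helper returning a made-up threshold field."""

    def read_thresholds(self):
        return {"laser_temperature_media_highalarm": 80.0}


class FakeCmisApi:
    """Hypothetical stand-in for a CMIS API object (not the real sonic_xcvr class)."""

    def __init__(self, flat_memory, vdm):
        self._flat_memory = flat_memory
        self.vdm = vdm  # None when the module has no VDM support

    def is_flat_memory(self):
        return self._flat_memory

    def get_transceiver_vdm_thresholds(self):
        # Guard first: flat-memory modules (such as DAC cables) have no VDM pages,
        # so dereferencing self.vdm would raise AttributeError.
        if self.is_flat_memory() or self.vdm is None:
            return None
        return self.vdm.read_thresholds()


def post_vdm_thresholds(api, table, port):
    """Write thresholds into a dict standing in for the DB table; skip if unsupported."""
    thresholds = api.get_transceiver_vdm_thresholds()
    if not thresholds:
        return False
    table[port] = thresholds
    return True
```

With this shape, a flat-memory module leaves the table untouched, which matches the expectation in the test list that the VDM threshold table is not present for such modules.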
#### How Has This Been Tested?
1. Ensured that xcvrd is stable and the VDM threshold table is present for CMIS optics supporting VDM
2. Ensured that xcvrd is stable and the VDM threshold table is present for C-CMIS optics supporting VDM
3. Ensured that xcvrd is stable and the VDM threshold table is not present for:
   3.1 CMIS optics that do not support VDM and do not have flat memory
   3.2 CMIS optics that have flat memory
   3.3 10G SFP
#### Additional Information (Optional)
MSFT ADO - 31849344
removing tag, since it is already merged.
…et#582)
* [xcvrd] Enable periodic polling of VDM relevant data
  Signed-off-by: Mihir Patel <patelmi@microsoft.com>
* Added VDM freeze and unfreeze support
* Update VDM flag change counters and set/clear time in redis-db during periodic polling
* Updated comments and initializing flag count to 0 if flag is clear upon xcvrd boot-up
* Updated comments and initializing flag count to 0
* Fixed unit-test failure in test_update_flag_metadata_tables
* Moved dom_mgr.py to xcvrd/dom/ and changed a warning to debug in port_event_helper.py
* Restructured VDM related functions to separate classes
* Created vdm_utilities and db_utilities folder
* Addressed PR comments
---------
Signed-off-by: Mihir Patel <patelmi@microsoft.com>
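The "VDM freeze and unfreeze support" item in the commit list can be pictured as bracketing each poll, presumably so that the values and flags read in one cycle come from a consistent snapshot. The sketch below is hypothetical: `FakeModule` and its method names are illustrative stand-ins rather than the platform API, and raising on a failed freeze is only one way to handle that case.

```python
from contextlib import contextmanager


class FakeModule:
    """Hypothetical transceiver object that records the order of calls."""

    def __init__(self):
        self.frozen = False
        self.calls = []

    def freeze_vdm_stats(self):
        self.frozen = True
        self.calls.append("freeze")
        return True

    def unfreeze_vdm_stats(self):
        self.frozen = False
        self.calls.append("unfreeze")
        return True

    def get_vdm_real_values(self):
        self.calls.append("real_values")
        return {"esnr_media_input1": 22.5}

    def get_vdm_flags(self):
        self.calls.append("flags")
        return {"esnr_media_input_halarm1": False}


@contextmanager
def vdm_frozen(module):
    """Freeze VDM statistics for the duration of the block, then always unfreeze."""
    if not module.freeze_vdm_stats():
        raise RuntimeError("VDM freeze failed")
    try:
        yield module
    finally:
        module.unfreeze_vdm_stats()


def poll_vdm_once(module):
    """Read values and flags while the statistics are frozen."""
    with vdm_frozen(module):
        return {
            "real_values": module.get_vdm_real_values(),
            "flags": module.get_vdm_flags(),
        }
```

Using a context manager means the unfreeze still runs if a read raises, so a failed poll does not leave the module frozen.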
[202412][xcvrd] Enable periodic polling of VDM relevant data (sonic-net#582)
* [xcvrd] Enable periodic polling of VDM relevant data

Signed-off-by: Mihir Patel <patelmi@microsoft.com>

* Added VDM freeze and unfreeze support
* Update VDM flag change counters and set/clear time in redis-db during periodic polling
* Updated comments and initializing flag count to 0 if flag is clear upon xcvrd boot-up
* Updated comments and initializing flag count to 0
* Fixed unit-test failure in test_update_flag_metadata_tables
* Moved dom_mgr.py to xcvrd/dom/ and changed a warning to debug in port_event_helper.py
* Restructured VDM related functions to separate classes
* Created vdm_utilities and db_utilities folder
* Addressed PR comments

---------

Signed-off-by: Mihir Patel <patelmi@microsoft.com>
Description
xcvrd needs to read VDM data from the transceivers and update redis-db accordingly.
Motivation and Context
VDM data from xcvrd needs to be read in the following manner so that the data can be accessed through CLI as well as Streaming telemetry dynamically.
The table and field details for the above can be found in the HLD for diagnostic monitoring of CMIS based transceivers (sonic-net/SONiC Pull Request #1828, by mihirpat1).
Freeze/Unfreeze of VDM data
Also, VDM statistics are frozen before reading VDM real values, VDM flag values and PM values and are unfrozen after the read operation is completed.
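The freeze, read, unfreeze sequence described above can be sketched as a context manager. This is a minimal illustration, not the code from this PR: `FakeVdmApi`, its method names, and the sample field name are hypothetical stand-ins for whatever the transceiver API actually exposes.

```python
from contextlib import contextmanager


class FakeVdmApi:
    """Hypothetical stand-in for a transceiver API object (not the real driver)."""

    def __init__(self):
        self.frozen = False
        self.calls = []

    def freeze_vdm_stats(self):
        self.frozen = True
        self.calls.append("freeze")
        return True

    def unfreeze_vdm_stats(self):
        self.frozen = False
        self.calls.append("unfreeze")
        return True

    def read_vdm_real_values(self):
        # In this sketch a read only returns data while statistics are frozen.
        self.calls.append("read")
        return {"laser_temperature_media1": 41.5} if self.frozen else None


@contextmanager
def vdm_frozen(api):
    """Freeze VDM statistics for the duration of the block, then unfreeze."""
    if not api.freeze_vdm_stats():
        raise RuntimeError("VDM freeze request failed")
    try:
        yield api
    finally:
        # Unfreeze even if a read raised, so the module resumes updating.
        api.unfreeze_vdm_stats()


def poll_vdm_once(api):
    """Read one consistent snapshot of VDM values."""
    with vdm_frozen(api):
        return api.read_vdm_real_values()
```

Putting the unfreeze in a `finally` clause is the design point here: a failed read should not leave the module's statistics frozen until the next polling cycle.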
VDM metadata update for VDM flags
All the VDM metadata fields for VDM flags are now being updated. This includes updating the flag change count, last set time and last clear time as part of periodic polling by the DomInfoUpdateTask thread.
The change counters and set/clear time tables will be present for only 1 subport of a port breakout group. This is in line with the future direction, in which the diagnostic tables will be maintained only for the first subport and not for the other subports of the port breakout group.
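The flag metadata bookkeeping can be illustrated with a small in-memory sketch. The function name, the dictionary layout, and the choice to treat the first observation as a baseline with a count of 0 are assumptions made for illustration; the actual tables live in redis-db and are defined in the HLD.

```python
def update_flag_metadata(state, field_name, current_value, now):
    """Track change count and last set/clear time for one flag field.

    `state` maps a field name to a dict holding the last seen value,
    the change count and the last set/clear timestamps (hypothetical layout).
    """
    entry = state.get(field_name)
    if entry is None:
        # First poll after start-up: record a baseline, count starts at 0.
        entry = {
            "value": current_value,
            "change_count": 0,
            "last_set_time": None,
            "last_clear_time": None,
        }
        state[field_name] = entry
        return entry

    if current_value != entry["value"]:
        # The flag toggled since the previous poll.
        entry["change_count"] += 1
        key = "last_set_time" if current_value else "last_clear_time"
        entry[key] = now
        entry["value"] = current_value
    return entry
```

Calling this once per polling cycle for each flag field keeps the count and timestamps current without storing any history beyond the previous value.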
Other related changes in this PR
- Moved `xcvrd/dom_mgr.py` to `xcvrd/dom/dom_mgr.py`
- Changed a warning to a debug log in `handle_port_update_event` to reduce flooding of logs
- DOM polling now stops if `task_stopping_event` is set while DOM polling for a port is in progress. This ensures that `xcvrd` can perform all deinitialization actions (including deleting relevant Redis-DB tables) when `supervisorctld` is in the process of terminating `xcvrd`. Without this fix, we have encountered issues where some VDM-related tables were not deleted because the `DomInfoUpdateTask` thread was busy polling for VDM operations for more than 10 seconds, causing `supervisorctld` to kill `xcvrd` before deinitialization was completed.
- Using `get_sfp` instead of `get_all_sfps` in the `initialize_sfp_obj_dict` function, since some platforms have customized the definition of `get_sfp` to convert a 1-based port to a 0-based port

Unrelated issue addressed in the current PR
Issue
After issuing shutdown for a port, the following traceback is seen
RCA
`api` is not passed as an argument to the function from the location below:
https://github.com/sonic-net/sonic-platform-daemons/blob/ee9da5f65de49d6fdaa25dfba668007b18bcc0c8/sonic-xcvrd/xcvrd/xcvrd.py#L1271C32-L1271C107
That call was merged via "reset active application code as 'N/A' when port shutdown" (sonic-net/sonic-platform-daemons Pull Request #550, by chiourung).
How Has This Been Tested?
Modules tested
Test cases attempted
Additional Information (Optional)
This PR needs to be merged only after the below PR is merged.
sonic-net/sonic-platform-common#527
MSFT ADO - 30598749