
[xcvrd] Enable periodic polling of VDM relevant data #582

Merged
prgeor merged 13 commits into sonic-net:master from mihirpat1:vdm_read_through_xcvrd
Mar 6, 2025

Conversation

@mihirpat1 mihirpat1 commented Dec 24, 2024

Description

xcvrd needs to read VDM data from the transceiver and update the redis-db accordingly.

Motivation and Context

xcvrd needs to read VDM data in the following manner so that the data can be accessed dynamically through the CLI as well as streaming telemetry.

  1. VDM threshold values should be read during transceiver detection. We don't need to poll this data periodically since the threshold values are static
  2. VDM real/sample values and VDM flag values should be read periodically by the DomInfoUpdateTask thread

The table and field details for the above data can be found in the HLD: "HLD for diagnostic monitoring of CMIS based transceivers" by mihirpat1 (Pull Request #1828, sonic-net/SONiC).
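The split between one-time and periodic reads described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the xcvrd implementation: `FakeTransceiver`, `on_transceiver_detected`, `poll_once`, the table names, and the dict standing in for redis-db are all hypothetical.

```python
class FakeTransceiver:
    """Hypothetical stand-in for a transceiver API object."""

    def __init__(self):
        self.sample_reads = 0

    def read_vdm_thresholds(self):
        # Thresholds are static, so one read per detection is enough.
        return {"laser_temp_high_alarm": 80.0}

    def read_vdm_samples(self):
        self.sample_reads += 1
        return {"laser_temp": 40.0 + self.sample_reads}

    def read_vdm_flags(self):
        return {"laser_temp_high_alarm_flag": False}


def on_transceiver_detected(db, port, xcvr):
    """Run once at detection: publish the static threshold values."""
    db[("VDM_THRESHOLD", port)] = xcvr.read_vdm_thresholds()


def poll_once(db, port, xcvr):
    """Run on every polling cycle: refresh the dynamic values."""
    db[("VDM_REAL_VALUE", port)] = xcvr.read_vdm_samples()
    db[("VDM_FLAG", port)] = xcvr.read_vdm_flags()


db = {}  # plain dict standing in for redis-db (assumption)
xcvr = FakeTransceiver()
on_transceiver_detected(db, "Ethernet0", xcvr)
for _ in range(3):
    poll_once(db, "Ethernet0", xcvr)
```

In this sketch the threshold entry is written once, while the sample and flag entries are overwritten on each cycle.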

Freeze/Unfreeze of VDM data
Also, VDM statistics are frozen before reading VDM real values, VDM flag values and PM values and are unfrozen after the read operation is completed.
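The freeze, read, unfreeze ordering could look roughly like the sketch below. The `FakeVdmApi` class and its `freeze`, `unfreeze`, and `read` methods are hypothetical names used only for illustration, and the `try`/`finally` is a choice of this sketch rather than something the PR description states.

```python
from contextlib import contextmanager


class FakeVdmApi:
    """Hypothetical API object that records the order of operations."""

    def __init__(self):
        self.calls = []

    def freeze(self):
        self.calls.append("freeze")

    def unfreeze(self):
        self.calls.append("unfreeze")

    def read(self, what):
        self.calls.append("read:" + what)
        return {}


@contextmanager
def vdm_frozen(api):
    """Freeze statistics for the duration of the block, then unfreeze."""
    api.freeze()
    try:
        yield api
    finally:
        # Unfreeze even if a read raises, so the module is not left frozen.
        api.unfreeze()


api = FakeVdmApi()
with vdm_frozen(api):
    api.read("vdm_real_values")
    api.read("vdm_flags")
    api.read("pm")
```

Grouping the three reads inside one frozen window means they all observe the same snapshot of the statistics.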

VDM metadata update for VDM flags
All the VDM metadata fields for VDM flags are now being updated. This includes updating the flag change count and the last set and last clear times as part of periodic polling by the DomInfoUpdateTask thread.
The change counters and set/clear time tables will be present for only one subport of a port breakout group. This is in line with the future direction wherein the diagnostic tables will be maintained only for the first subport and not for the other subports of the port breakout group.
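The flag metadata bookkeeping can be illustrated with a small sketch. It assumes a plain dict in place of the redis-db tables and integer timestamps in place of wall-clock time; `update_flag_metadata` and the field name are hypothetical and are not the PR's `update_flag_metadata_tables` function.

```python
def update_flag_metadata(metadata, field_name, current_value, now):
    """Update change count and last set/clear time for one flag."""
    entry = metadata.setdefault(
        field_name,
        {"value": None, "change_count": 0, "last_set_time": None, "last_clear_time": None},
    )
    previous = entry["value"]
    entry["value"] = current_value
    if previous is None or previous == current_value:
        # First observation or no transition: nothing to count.
        return entry
    entry["change_count"] += 1
    if current_value:
        entry["last_set_time"] = now
    else:
        entry["last_clear_time"] = now
    return entry


metadata = {}
# Four polling cycles: clear, set, still set, clear again.
for t, flag in enumerate([False, True, True, False]):
    update_flag_metadata(metadata, "rxpower_high_alarm", flag, now=t)
```

Only transitions are counted here, so a flag that stays set across several polls does not inflate the change count.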

Other related changes in this PR

  1. Skip reading all diagnostic info if transceiver is not present on a port. This is now done at the beginning of the diagnostic polling loop to optimize the polling routine
  2. Read VDM and PM data only if VDM is supported on a transceiver
  3. The del_port_sfp_dom_info_from_db function has now been modified to make it more generic and to reduce modifying the implementation while adding new tables in future
  4. Moved xcvrd/dom_mgr.py to xcvrd/dom/dom_mgr.py
  5. Modified a warning to debug in handle_port_update_event to reduce flooding of logs
  6. Break out of the DOM monitoring port handling loop if task_stopping_event is set while DOM polling for a port is in progress. This ensures that xcvrd can perform all deinitialization actions (including deleting relevant Redis-DB tables) when supervisord is in the process of terminating xcvrd. Without this fix, some VDM-related tables were not deleted because the DomInfoUpdateTask thread was busy polling for VDM operations for more than 10 seconds, causing supervisord to kill xcvrd before deinitialization was completed.
  7. Use get_sfp instead of get_all_sfps in the initialize_sfp_obj_dict function, since some platforms have customized get_sfp to convert a 1-based port to a 0-based port
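The early exit on task_stopping_event (item 6 above) can be sketched as below. `poll_ports` and `fake_poll` are hypothetical helpers; this sketch only checks the event between ports, whereas the real loop may check it at finer granularity.

```python
import threading


def poll_ports(ports, stop_event, poll_port):
    """Poll each port, bailing out as soon as a stop is requested."""
    polled = []
    for port in ports:
        if stop_event.is_set():
            # Leave the loop so the caller can run deinitialization promptly.
            break
        poll_port(port)
        polled.append(port)
    return polled


stop_event = threading.Event()


def fake_poll(port):
    # Simulate a stop request arriving while Ethernet8 is being polled.
    if port == "Ethernet8":
        stop_event.set()


polled = poll_ports(["Ethernet0", "Ethernet8", "Ethernet16"], stop_event, fake_poll)
```

Here Ethernet16 is never polled, which leaves time for cleanup before the process manager's kill timeout expires.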

Unrelated issue addressed in the current PR
Issue
After issuing shutdown for a port, the following traceback is seen:

2025 Mar  4 02:44:58.584918 str4-sn5600-2 NOTICE pmon#xcvrd[27599]: CMIS: Ethernet0 Forcing Tx laser OFF
2025 Mar  4 02:44:58.588461 str4-sn5600-2 ERR pmon#xcvrd[27599]: CMIS: Ethernet0: internal errors due to CmisManagerTask.post_port_active_apsel_to_db() missing 1 required positional argument: 'host_lanes_mask'
2025 Mar  4 02:44:58.589536 str4-sn5600-2 ERR pmon#xcvrd[27599]: Traceback (most recent call last):
2025 Mar  4 02:44:58.589576 str4-sn5600-2 ERR pmon#xcvrd[27599]:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1312, in task_worker
2025 Mar  4 02:44:58.589610 str4-sn5600-2 ERR pmon#xcvrd[27599]:     self.post_port_active_apsel_to_db(lport, host_lanes_mask, reset_apsel=True)
2025 Mar  4 02:44:58.589661 str4-sn5600-2 ERR pmon#xcvrd[27599]: TypeError: CmisManagerTask.post_port_active_apsel_to_db() missing 1 required positional argument: 'host_lanes_mask'

RCA
The api argument is not passed to the function at the location below.
https://github.com/sonic-net/sonic-platform-daemons/blob/ee9da5f65de49d6fdaa25dfba668007b18bcc0c8/sonic-xcvrd/xcvrd/xcvrd.py#L1271C32-L1271C107
This call was merged via "reset active application code as 'N/A' when port shutdown" by chiourung (Pull Request #550, sonic-net/sonic-platform-daemons).
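The reason the error names 'host_lanes_mask' rather than 'api' can be reproduced with a small sketch. The class below is a hypothetical stand-in; its signature is inferred from the corrected call in this PR, and the default value of `reset_apsel` is an assumption.

```python
class CmisManagerTaskSketch:
    """Hypothetical stand-in; only the method signature matters here."""

    def post_port_active_apsel_to_db(self, api, lport, host_lanes_mask, reset_apsel=False):
        return (api, lport, host_lanes_mask, reset_apsel)


task = CmisManagerTaskSketch()

try:
    # Old-style call: 'api' is omitted, so every positional argument shifts left
    # and the last required parameter ends up unfilled.
    task.post_port_active_apsel_to_db("Ethernet0", 0xFF, reset_apsel=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Corrected call passes the API object first.
result = task.post_port_active_apsel_to_db(object(), "Ethernet0", 0xFF, reset_apsel=True)
```

Because positional arguments bind left to right, Python reports the trailing parameter as missing even though the argument actually left out was the first one.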

How Has This Been Tested?

Modules tested

  1. CMIS module
  2. C-CMIS module
  3. SFF-8472 module

Test cases attempted

  1. Dumped VDM threshold value tables
  2. Dumped VDM real value tables
  3. Dumped VDM flag value tables
  4. Dumped VDM metadata tables
  5. Ensure all VDM related tables are empty after xcvrd stop

Additional Information (Optional)

This PR needs to be merged only after the below PR is merged.
sonic-net/sonic-platform-common#527

MSFT ADO - 30598749

Signed-off-by: Mihir Patel <patelmi@microsoft.com>
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mihirpat1 mihirpat1 added the xcvrd label Jan 7, 2025

prgeor commented Feb 13, 2025

- Last Set Time Table
- Last Clear Time Table
"""
def update_flag_metadata_tables(logical_port_name, physical_port_name, field_name, current_value,
Collaborator

@mihirpat1 do we need logical port name since we are moving away from updating dom per logical portname?

Collaborator

@mihirpat1 should this be part of dom_mgr.py

Contributor Author

@prgeor Yes - ideally, we should not pass the logical port name here. However, I will plan to remove the logical port name once we modify the DomInfoUpdateTask thread to poll based on physical port number.

else:
    return xcvrd.SFP_EEPROM_NOT_READY

def post_port_diagnostic_values_to_db(self, logical_port_name, table,
Collaborator

@mihirpat1 rename to post_vdm_real_values_to_db?

Contributor Author

@prgeor Renamed the function now

Collaborator

nit: rename dom_utilities -> utilities

Contributor Author

@prgeor Addressed this now

@@ -0,0 +1,14 @@
class XCVRDUtils:
    """
    This class provides utility functions for managing VDM operations on transceivers.
Collaborator

comment valid?

Contributor Author

@prgeor Addressed this now

@@ -0,0 +1,59 @@
from swsscommon import swsscommon

class DBUtils:
Collaborator

nit, rename common_db_utils -> db_utils?

Contributor Author

@prgeor Addressed this now

 class SfpStateUpdateTask(threading.Thread):
     RETRY_EEPROM_READING_INTERVAL = 60
-    def __init__(self, namespaces, port_mapping, main_thread_stop_event, sfp_error_event):
+    def __init__(self, namespaces, port_mapping, sfp_obj_dict, main_thread_stop_event, sfp_error_event, helper_logger):
Collaborator

Contributor Author

@prgeor Addressed this now

@prgeor prgeor requested a review from Copilot March 5, 2025 23:25
Contributor

Copilot AI left a comment

PR Overview

This PR enables periodic polling of VDM-relevant data in xcvrd by updating database utility functions, VDM utilities, and associated state-management logic. Key changes include:

  • Parameter and API signature updates (e.g. passing the API object explicitly to post_port_active_apsel_to_db and updating task constructors).
  • Moving and refining modules (e.g. moving DomInfoUpdateTask into the dom subdirectory and updating its dependencies).
  • Minor logging level adjustments to reduce unnecessary flooding.

Reviewed Changes

| File | Description |
| --- | --- |
| sonic-xcvrd/xcvrd/xcvrd.py | Updates to function calls and task constructors to ensure correct dependency injection and correct parameter ordering. |
| sonic-xcvrd/xcvrd/dom/dom_mgr.py | Updates to DomInfoUpdateTask constructor and dependency propagation with the inclusion of sfp_obj_dict. |
| sonic-xcvrd/xcvrd/xcvrd_utilities/port_event_helper.py | Change of log level from warning to debug to reduce log flooding. |
| sonic-xcvrd/xcvrd/xcvrd_utilities/xcvr_table_helper.py | Added new VDM tables and associated getter APIs to support the new polling functionality. |

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (4)

sonic-xcvrd/xcvrd/xcvrd.py:1262

  • Passing the API object as the first parameter fixes an argument mismatch in the post_port_active_apsel_to_db call. Confirm that the function signature and all call sites are updated accordingly.
self.post_port_active_apsel_to_db(api, lport, host_lanes_mask, reset_apsel=True)

sonic-xcvrd/xcvrd/xcvrd.py:1478

  • Updating the SfpStateUpdateTask constructor to include 'sfp_obj_dict' and 'helper_logger' ensures proper dependency injection. Verify that all call sites supply the required parameters.
def __init__(self, namespaces, port_mapping, sfp_obj_dict, main_thread_stop_event, sfp_error_event, helper_logger):

sonic-xcvrd/xcvrd/dom/dom_mgr.py:33

  • Including 'sfp_obj_dict' in the DomInfoUpdateTask constructor improves consistency with dependency requirements. Ensure that sfp_obj_dict is properly initialized and passed from the caller.
def __init__(self, namespaces, port_mapping, sfp_obj_dict, main_thread_stop_event, skip_cmis_mgr, helper_logger):

sonic-xcvrd/xcvrd/xcvrd_utilities/port_event_helper.py:147

  • [nitpick] Changing the log level from warning to debug helps reduce log flooding; verify that this change aligns with the overall logging strategy in production.
self.logger.log_debug("$$$ {} handle_port_update_event() : op={} DB:{} Table:{} fvp {}".format(...))
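The sfp_obj_dict dependency mentioned in the suppressed comments above is, per the PR description, built with get_sfp rather than get_all_sfps because some platforms convert a 1-based port to a 0-based one inside get_sfp. A minimal sketch of that idea follows; `FakeChassis` and `build_sfp_obj_dict` are hypothetical and do not reproduce the actual initialize_sfp_obj_dict function.

```python
class FakeChassis:
    """Hypothetical platform chassis whose get_sfp() takes 1-based indices."""

    def __init__(self, num_ports):
        self._sfps = ["sfp%d" % i for i in range(num_ports)]

    def get_sfp(self, index):
        # Platform-specific conversion from a 1-based port to a 0-based list slot.
        return self._sfps[index - 1]


def build_sfp_obj_dict(chassis, physical_ports):
    """Map each physical port number to its SFP object via get_sfp()."""
    return {port: chassis.get_sfp(port) for port in physical_ports}


sfp_obj_dict = build_sfp_obj_dict(FakeChassis(4), [1, 2, 3, 4])
```

Going through get_sfp lets the platform's own index translation apply, which indexing a raw list of all SFPs would bypass.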

@prgeor prgeor merged commit 05d2068 into sonic-net:master Mar 6, 2025
5 checks passed
mssonicbld added a commit to mssonicbld/sonic-platform-daemons.msft that referenced this pull request Mar 18, 2025

#### Description
The following traceback is seen with the latest image for DAC cables.
```
2025 Mar 17 17:24:19.889826 sonic-dut ERR pmon#xcvrd[67]: Traceback (most recent call last):
2025 Mar 17 17:24:19.889997 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet259', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet259', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.890132 sonic-dut ERR pmon#xcvrd[67]:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1878, in run
2025 Mar 17 17:24:19.890201 sonic-dut ERR pmon#xcvrd[67]:     self.task_worker(self.task_stopping_event, self.sfp_error_event)
2025 Mar 17 17:24:19.890397 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet21', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet21', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.890516 sonic-dut ERR pmon#xcvrd[67]:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1671, in task_worker
2025 Mar 17 17:24:19.890609 sonic-dut ERR pmon#xcvrd[67]:     self.init()
2025 Mar 17 17:24:19.890757 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet86', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet86', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.890810 sonic-dut ERR pmon#xcvrd[67]:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1590, in init
2025 Mar 17 17:24:19.890856 sonic-dut ERR pmon#xcvrd[67]:     self.retry_eeprom_set = self._post_port_sfp_info_and_dom_thr_to_db_once(port_mapping_data, self.xcvr_table_helper, self.main_thread_stop_event)
2025 Mar 17 17:24:19.891031 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet294', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet294', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.891059 sonic-dut ERR pmon#xcvrd[67]:                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.891104 sonic-dut ERR pmon#xcvrd[67]:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1547, in _post_port_sfp_info_and_dom_thr_to_db_once
2025 Mar 17 17:24:19.891255 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet289', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet289', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.891290 sonic-dut ERR pmon#xcvrd[67]:     self.vdm_db_utils.post_port_vdm_thresholds_to_db(logical_port_name)
2025 Mar 17 17:24:19.891332 sonic-dut ERR pmon#xcvrd[67]:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/dom/utilities/vdm/db_utils.py", line 70, in post_port_vdm_thresholds_to_db
2025 Mar 17 17:24:19.891490 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet128', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet128', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.891525 sonic-dut ERR pmon#xcvrd[67]:     return self._post_port_vdm_thresholds_or_flags_to_db(logical_port_name, self.xcvr_table_helper.get_vdm_threshold_tbl,
2025 Mar 17 17:24:19.891569 sonic-dut ERR pmon#xcvrd[67]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.891721 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet2', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet2', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.891757 sonic-dut ERR pmon#xcvrd[67]:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/dom/utilities/vdm/db_utils.py", line 100, in _post_port_vdm_thresholds_or_flags_to_db
2025 Mar 17 17:24:19.891794 sonic-dut ERR pmon#xcvrd[67]:     vdm_values_dict = get_vdm_values_func(physical_port)
2025 Mar 17 17:24:19.891839 sonic-dut ERR pmon#xcvrd[67]:                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.891990 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet471', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet471', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.892029 sonic-dut ERR pmon#xcvrd[67]:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/dom/utilities/vdm/utils.py", line 39, in get_vdm_thresholds
2025 Mar 17 17:24:19.892067 sonic-dut ERR pmon#xcvrd[67]:     return self.sfp_obj_dict[physical_port].get_transceiver_vdm_thresholds()
2025 Mar 17 17:24:19.892109 sonic-dut ERR pmon#xcvrd[67]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.892262 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet118', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet118', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.892302 sonic-dut ERR pmon#xcvrd[67]:   File "/usr/local/lib/python3.11/dist-packages/sonic_platform_base/sonic_xcvr/sfp_optoe_base.py", line 76, in get_transceiver_vdm_thresholds
2025 Mar 17 17:24:19.892340 sonic-dut ERR pmon#xcvrd[67]:     return api.get_transceiver_vdm_thresholds() if api is not None else None
2025 Mar 17 17:24:19.892381 sonic-dut ERR pmon#xcvrd[67]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.892544 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet343', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet343', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.892585 sonic-dut ERR pmon#xcvrd[67]:   File "/usr/local/lib/python3.11/dist-packages/sonic_platform_base/sonic_xcvr/api/public/cmis.py", line 2566, in get_transceiver_vdm_thresholds
2025 Mar 17 17:24:19.892625 sonic-dut ERR pmon#xcvrd[67]:     vdm_raw_dict = self.get_vdm(self.vdm.VDM_THRESHOLD)
2025 Mar 17 17:24:19.892666 sonic-dut ERR pmon#xcvrd[67]:                                 ^^^^^^^^^^^^^^^^^^^^^^
2025 Mar 17 17:24:19.892812 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet224', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'true', 'index': '-1', 'port_name': 'Ethernet224', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.892857 sonic-dut ERR pmon#xcvrd[67]: AttributeError: 'NoneType' object has no attribute 'VDM_THRESHOLD'
2025 Mar 17 17:24:19.893100 sonic-dut NOTICE pmon#xcvrd[67]: Stop daemon main loop
2025 Mar 17 17:24:19.893330 sonic-dut WARNING pmon#xcvrd[67]: *** ('Ethernet230', 'STATE_DB', 'PORT_TABLE') handle_port_update_event() fvp {'host_tx_ready': 'false', 'index': '-1', 'port_name': 'Ethernet230', 'asic_id': 0, 'op': 'SET'}
2025 Mar 17 17:24:19.893330 sonic-dut ERR pmon#xcvrd[67]: Xcvrd: exception found at child thread SfpStateUpdateTask due to AttributeError("'NoneType' object has no attribute 'VDM_THRESHOLD'")
2025 Mar 17 17:24:19.893412 sonic-dut ERR pmon#xcvrd[67]: Exiting main loop as child thread raised exception!
2025 Mar 17 17:24:19.904444 sonic-dut INFO pmon#supervisord 2025-03-17 17:24:19,904 WARN exited: xcvrd (terminated by SIGKILL; not expected)
```

#### Motivation and Context
With sonic-net/sonic-platform-daemons#582 merged, we are now updating the VDM threshold data for all types of transceivers.

However, transceivers that are CMIS compliant but have flat memory do not support VDM. The driver handler for fetching the VDM threshold data does not check whether the CMIS transceiver supports VDM, which causes xcvrd to crash.
https://github.com/sonic-net/sonic-platform-common/blob/e5aedb6bab10a16d0167488eb9e291805c397c8f/sonic_platform_base/sonic_xcvr/api/public/cmis.py#L2619

To address this issue, ensure that a transceiver is not flat memory based before reading the VDM threshold data from the transceiver.
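A guard of this kind, consistent with the test cases listed below (no VDM threshold table for flat-memory optics), might look like the following sketch. `FakeCmisApi`, its method names, and `read_vdm_thresholds_safely` are hypothetical stand-ins and are not the sonic-platform-common API.

```python
class FakeCmisApi:
    """Hypothetical stand-in for a CMIS API object."""

    def __init__(self, flat_memory, thresholds=None):
        self._flat_memory = flat_memory
        self._thresholds = thresholds

    def is_flat_memory(self):
        return self._flat_memory

    def get_vdm_thresholds(self):
        if self._flat_memory:
            # Mimic the crash: no VDM handler exists on flat-memory modules.
            raise AttributeError("'NoneType' object has no attribute 'VDM_THRESHOLD'")
        return self._thresholds


def read_vdm_thresholds_safely(api):
    """Return threshold data, or None when the module cannot provide it."""
    if api is None or api.is_flat_memory():
        return None
    return api.get_vdm_thresholds()


dac = FakeCmisApi(flat_memory=True)
optic = FakeCmisApi(flat_memory=False, thresholds={"laser_temp_high_alarm": 80.0})
```

With the check in place, a flat-memory module such as a DAC cable yields no threshold data instead of raising inside the polling thread.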

#### How Has This Been Tested?
1. Ensured that xcvrd is stable and VDM threshold table is present for CMIS optics supporting VDM
2. Ensured that xcvrd is stable and VDM threshold table is present for C-CMIS optics supporting VDM
3. Ensured that xcvrd is stable and VDM threshold table is not present for:
3.1 CMIS optics that do not support VDM and do not have flat memory
3.2 CMIS optics that have flat memory
3.3 10G SFP

#### Additional Information (Optional)
MSFT ADO - 31849344
mssonicbld added a commit to Azure/sonic-platform-daemons.msft that referenced this pull request Mar 18, 2025

r12f commented Mar 22, 2025

removing tag, since it is already merged.

Junchao-Mellanox pushed a commit to Junchao-Mellanox/sonic-platform-daemons that referenced this pull request Apr 22, 2025
…et#582)

* [xcvrd] Enable periodic polling of VDM relevant data

Signed-off-by: Mihir Patel <patelmi@microsoft.com>

* Added VDM freeze and unfreeze support

* Update VDM flag change counters and set/clear time in redis-db during periodic polling

* Updated comments and initializing flag count to 0 if flag is clear upon xcvrd boot-up

* Updated comments and initializing flag count to 0

* Fixed unit-test failure in test_update_flag_metadata_tables

* Moved dom_mgr.py to xcvrd/dom/ and changed a warning to debug in port_event_helper.py

* Restructured VDM related functions to separate classes

* Created vdm_utilities and db_utilities folder

* Addressed PR comments

---------

Signed-off-by: Mihir Patel <patelmi@microsoft.com>
Junchao-Mellanox pushed a commit to Junchao-Mellanox/sonic-platform-daemons that referenced this pull request Apr 22, 2025
[202412][xcvrd] Enable periodic polling of VDM relevant data (sonic-net#582)
mihirpat1 pushed a commit to mihirpat1/sonic-platform-daemons.msft that referenced this pull request Apr 30, 2025
…ure#11)

mihirpat1 added a commit to mihirpat1/sonic-platform-daemons that referenced this pull request May 6, 2025