Skip to content

[cmis] Separate Flag-Specific Fields for DOM and Status APIs#556

Merged
prgeor merged 10 commits intosonic-net:masterfrom
mihirpat1:dom_status_tbl_enhancement
May 6, 2025
Merged

[cmis] Separate Flag-Specific Fields for DOM and Status APIs#556
prgeor merged 10 commits intosonic-net:masterfrom
mihirpat1:dom_status_tbl_enhancement

Conversation

@mihirpat1
Copy link
Copy Markdown
Contributor

@mihirpat1 mihirpat1 commented Apr 1, 2025

Description

The DOM and Status related APIs need to be modified to ensure that the fields which have latched registers are moved to the flag specific API.
Also, all the VDM specific fields have now been removed from DOM and Status related APIs since separate APIs pertaining to processing VDM data have now been created.

This change is in line to the requirements of SONiC/doc/platform_api/CMIS_Diagnostic_Monitoring_Overview_in_SONiC.md at master · sonic-net/SONiC

Motivation and Context

N/A

How Has This Been Tested?

Please refer to sonic-net/sonic-utilities#3844 for the test details

Additional Information (Optional)

MSFT ADO - 30090754

Related PRs
sonic-net/SONiC#1954
sonic-net/sonic-platform-daemons#604
sonic-net/sonic-utilities#3844

Signed-off-by: Mihir Patel <patelmi@microsoft.com>
@mihirpat1 mihirpat1 requested a review from Copilot April 1, 2025 03:00
@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the handling of flag‐specific fields by separating DOM and Status API fields and removing VDM-specific fields from the DOM and Status APIs. Key changes include:

  • Removal of VDM‐specific fields from test expectations and API implementations.
  • Renaming and restructuring of flag fields (e.g. from snake_case to camelCase) in both the DOM and status APIs.
  • Updates to test mocks and parameterized tests to reflect the new field separations.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
tests/sonic_xcvr/test_ccmis.py Updated test expectations and mock assignments; removed VDM-specific fields.
sonic_platform_base/sonic_xcvr/api/public/cmis.py Removed VDM dict updates and updated flag field naming in DOM and status APIs.
sonic_platform_base/sonic_xcvr/api/public/c_cmis.py Removed VDM delta mappings and streamlined bulk status and status flag methods.
Comments suppressed due to low confidence (3)

tests/sonic_xcvr/test_ccmis.py:278

  • Ensure that the updated mock return value indices for get_laser_config_freq, get_current_laser_freq, and get_tx_config_power reflect the new ordering of mock_response elements; verify that the intended data is being used in each assignment.
self.api.get_laser_config_freq.return_value = mock_response[1]

sonic_platform_base/sonic_xcvr/api/public/cmis.py:2188

  • [nitpick] When fault values are missing, the default value 'N/A' is used, which may be inconsistent with the expected boolean type; consider using a boolean default (e.g., false) instead.
key = fault_type_template.format(lane_num=lane)

tests/sonic_xcvr/test_ccmis.py:496

  • [nitpick] Verify that the expected results in test_get_transceiver_status_flags have been updated to match the new flag naming conventions and field separations implemented in the API.
assert result == expected

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@mihirpat1 mihirpat1 requested a review from Copilot April 4, 2025 05:08
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (4)

tests/sonic_xcvr/test_ccmis.py:258

  • Ensure that the removal of VDM-specific fields (e.g., 'laser_temperature') from the expected dictionary is consistently reflected in all tests to match the updated API outputs.
                    'laser_temperature': 40,

sonic_platform_base/sonic_xcvr/api/public/sff8636.py:102

  • [nitpick] Consider updating the docstring to explicitly state that this method accesses latched registers, clarifying its difference from get_transceiver_status.
def get_transceiver_status_flags(self):

sonic_platform_base/sonic_xcvr/api/public/cmis.py:385

  • The removal of the VDM dictionary assignment may affect downstream processing; verify that no components depend on VDM data from get_transceiver_bulk_status.
self.vdm_dict = self.get_vdm(self.vdm.VDM_REAL_VALUE)

sonic_platform_base/sonic_xcvr/api/public/c_cmis.py:400

  • The method name has been changed from get_transceiver_threshold_info to get_transceiver_status; ensure that related documentation and client usages are updated accordingly.
def get_transceiver_status(self):

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

bias_yp = FLOAT ; modulator bias yp
========================================================================
Dictionary
"""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihirpat1 should we rename this bulk api to dom status or as appropriate per the HLD table

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prgeor I have addressed this now

def get_transceiver_status_flags(self):
"""
Retrieves transceiver status of this SFP
Retrieves transceiver status flags for this SFP module.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihirpat1 SFP -> Active modules?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prgeor This API can is also used for non-active modules. Hence, planning to keep this as is per our discussion.

Copy link
Copy Markdown
Collaborator

@prgeor prgeor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihirpat1 for each of the APIs can we put the name of the corresponding STATE_DB table name also?

'vcclowalarm': voltage_flags['voltage_low_alarm_flag'],
'vcchighwarning': voltage_flags['voltage_high_warn_flag'],
'vcclowwarning': voltage_flags['voltage_low_warn_flag']
'tempHAlarm': case_temp_flags['case_temp_high_alarm_flag'],
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mihirpat1 can we rename get_transceiver_bulk_status to get_dom_sensor() and state that this will fetch real time dom sensor number value. get_transceiver_dom_flags() will only get flag T/F values. Please make this change for SFF8436, SFF8636 as well

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prgeor I have addressed this now

@mssonicbld
Copy link
Copy Markdown
Collaborator

Cherry-pick PR to msft-202412: Azure/sonic-platform-common.msft#82

mssonicbld added a commit to Azure/sonic-utilities.msft that referenced this pull request May 9, 2025
…es (#176)

<!--
 Please make sure you've read and understood our contributing guidelines:
 https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

 failure_prs.log skip_prs.log Make sure all your commits include a signature generated with `git commit -s` **

 If this is a bug fix, make sure your description includes "closes #xxxx",
 "fixes #xxxx" or "resolves #xxxx" so that GitHub automatically closes the related
 issue when the PR is merged.

 If you are adding/modifying/removing any command or utility script, please also
 make sure to add/modify/remove any unit tests from the tests
 directory as appropriate.

 If you are modifying or removing an existing 'show', 'config' or 'sonic-clear'
 subcommand, or you are adding a new subcommand, please make sure you also
 update the Command Line Reference Guide (doc/Command-Reference.md) to reflect
 your changes.

 Please provide the following information:
-->

#### What I did
1. With the recent transceiver DOM and STATUS table related changes, the handles of the following CLIs need to be modified to ensure that the CLI function inline to the existing behavior
 1. show int transceiver eeprom -d
 2. show int transceiver status
 3. show int transceiver pm
 4. sfputil show eeprom -d -p EthernetXX

2. Also, all the diagnostics related tables will not be present only for the first subport for breakout ports. Hence, made the corresponding CLI handler related changes to support this
3. Modified the `sfputil show error-status` CLI handler to use `TRANSCEIVER_STATUS_SW` instead of `TRANSCEIVER_STATUS` table

#### How I did it
1. Replaced function call for get_transceiver_bulk_status with get_transceiver_dom_real_value
5. As part of displaying o/p for "show int transceiver status" CLI, we are now fetching data from TRANSCEIVER_DOM_FLAG, VDM flag tables and TRANSCEIVER_STATUS_FLAG tables as well since the relevant fields have been moved to these new tables
6. As part of display o/p for "show int transceiver pm" CLI, we are now fetching data from VDM threshold tables as well since the relevant fields have been moved to these new tables

Related PRs
sonic-net/SONiC#1954
[[cmis] Separate Flag-Specific Fields for DOM and Status APIs by mihirpat1 · Pull Request #556 · sonic-net/sonic-platform-common](sonic-net/sonic-platform-common#556)
[[xcvrd] Re-organize transceiver DOM and STATUS tables by mihirpat1 · Pull Request #604 · sonic-net/sonic-platform-daemons](sonic-net/sonic-platform-daemons#604)

#### How to verify it
Following tests were attempted and confirmed to pass

1. Diagnostic table related verification
	1. Compare tables with nightly image and ensure non-CMIS transceivers do not change their mapping for DOM and status tables
	2. Compare DOM sensor, threshold and status tables between current and nightly. Ensure units are not shown
	3. Dump TRANSCEIVER_INFO table for CMIS transceiver to ensure is_replaceable field is present
2. CLI verification
	1. sudo sfputil show error-status - This should dump output for all the ports and the behavior should not change between nightly and private image
	2. Use invalid port name (Ethernet2033) to dump show int transceiver info, status CLI. Ensure to note new errors in PR description for invalid ports
	3. For breakout scenario, keep 2 subports in different CMIS_STATES and ensure the CLI o/p is different for status CLI for both subports
3. Run transceiver onboarding test due to TRANSCEIVER_INFO table deletion change
4. XCVRD stop/start
	1. Stop xcvrd and ensure all diagnostic tables are deleted
	2. Start xcvrd and ensure all diagnostic tables are populated
5. Link flap event handling
	1. Change link flap count for different types of transceivers and ensure no crash is seen. This is to ensure that the link event handler code does not cause any crash for non-CMIS transceivers
	2. Shutdown peer port and ensure the tables get updated which are handled via link event update handler.
		Also, execute sfputil reset on 10 different physical ports and ensure that all the corresponding flags on the remote side are updated for all 8 lanes of a DR8 optic
		Specifically, we need to ensure that the following flags are set wherein X represents the lane number
		1. LOSFlagRxX (page 11, byte 147)
		2. CDRLOLFlagRxX (page 11, byte 148)
		3. OpticalPowerLowAlarmFlagRxX (page 11, byte 150)
		4. OpticalPowerLowWarningFlagRxX (page 11, byte 152)
6. Transceiver OIR test
	1. Remove transceiver and ensure all diagnostic related tables are deleted
	2. Insert transceiver and ensure all diagnostic tables are populated
	3. Stop xcvrd and remove transceiver and ensure TRANSCEIVER_INFO table is present. Start xcvrd and ensure that the TRANSCEIVER_INFO table is now deleted
	4. Insert the transceiver now
7. Ensure firmware upgrade is successfull and TRANSCEIVER_FIRMWARE_INFO table changes only for the primary subport eventhough, the upgrade was triggered for a non-primary subport
8. Ensure SFF manager crash is seen with nightly and is resolved with this PR (since the PR has the fix for [SffLoggerForPortUpdateEvent class does not have log_debug function · Issue #605 · sonic-net/sonic-platform-daemons](sonic-net/sonic-platform-daemons#605))

#### Previous command output (if the output of a command-line utility has changed)
The output of show int transceiver info and show int transceiver eeprom will change if an invalid port name is passed to the CLI

```
root@str4-sn5600-3:/home/admin# show int transceiver info Ethernet8
Ethernet8: SFP EEPROM Not detected
root@str4-sn5600-3:/home/admin#
```

#### New command output (if the output of a command-line utility has changed)
```
root@str4-sn5600-3:/home/admin# show int transceiver info Ethernet8
Error: Found KeyError while getting first subport for Ethernet8
Error: Unable to get first subport for Ethernet8 while converting SFP info
Ethernet8: SFP EEPROM Not detected
root@str4-sn5600-3:/home/admin#
```
MSFT ADO - 32238285
@infn-ke
Copy link
Copy Markdown
Contributor

infn-ke commented Jun 19, 2025

@mihirpat1 this PR removes several alarm flags, trying to understand if they are renamed or completed removed? E.g. where did osnrhighalarm_flag, esnrlowalarm_flag, rxsigpowerlowalarm_flag etc. go?

@mihirpat1
Copy link
Copy Markdown
Contributor Author

mihirpat1 commented Jun 19, 2025

@mihirpat1 this PR removes several alarm flags, trying to understand if they are renamed or completed removed? E.g. where did osnrhighalarm_flag, esnrlowalarm_flag, rxsigpowerlowalarm_flag etc. go?

@infn-ke All these flags were moved to VDM related flag tables. Please refer to the below HLD for further details.

https://github.com/sonic-net/SONiC/blob/master/doc/platform_api/CMIS_Diagnostic_Monitoring_Overview_in_SONiC.md#3-state_db-schema-for-cmis-diagnostic-monitoring

Specifically, the VDM flags are now moved to TRANSCEIVER_VDM_HALARM_FLAG, TRANSCEIVER_VDM_LALARM_FLAG etc tables.

@infn-ke
Copy link
Copy Markdown
Contributor

infn-ke commented Jun 19, 2025

Specifically, the VDM flags are now moved to TRANSCEIVER_VDM_HALARM_FLAG, TRANSCEIVER_VDM_LALARM_FLAG etc tables.

I see, does that mean they were duplicated in two apis earlier? So they would be in the get_transceiver_vdm_flags api now? And thanks for the reference!

@mihirpat1
Copy link
Copy Markdown
Contributor Author

Specifically, the VDM flags are now moved to TRANSCEIVER_VDM_HALARM_FLAG, TRANSCEIVER_VDM_LALARM_FLAG etc tables.

I see, does that mean they were duplicated in two apis earlier? So they would be in the get_transceiver_vdm_flags api now? And thanks for the reference!

@infn-ke Yes, that's correct.

@yejianquan
Copy link
Copy Markdown

Hi @mihirpat1 , please fix the conflict and create PR to 202505 branch

@yejianquan
Copy link
Copy Markdown

202505 branched out around 5/16, this commit is included when created, removing request labels.
and please ignore the conflict notification

mssonicbld added a commit to mssonicbld/sonic-platform-common that referenced this pull request Mar 9, 2026
<!-- Provide a general summary of your changes in the Title above -->

#### Description
<!--
     Describe your changes in detail
-->

Remove calls to setting a syslog tag as there was no logging actually used in this module. The existing call to log a message was in dead code, therefore there was no need to setup a syslogger at all.

#### Motivation and Context
<!--
     Why is this change required? What problem does it solve?
     If this pull request closes/resolves an open Issue, make sure you
     include the text "fixes #xxxx", "closes #xxxx" or "resolves #xxxx" here
-->
In python services using `sonic_py_common.logger` the syslog identifier already set up by the executable/service can be overwritten when importing other modules.

This log hijack is coming from a library: `sonic_platform_base/sonic_xcvr/api/public/c_cmis.py`
```
SYSLOG_IDENTIFIER = "CCmisApi"
...
helper_logger = logger.Logger(SYSLOG_IDENTIFIER)
# This is the only log that this file has:
    helper_logger.log_debug('key {} not present in VDM'.format(new_key))
```

Searching in the syslog of a switch running SONiC we see logs with the `CCmisApi` that shouldn't:
```
2025-11-11T13:20:03.440479+00:00 sonic CCmisApi: Failed to load platform Pcie module. Warning : No module named 'sonic_platform.pcie', fallback to load Pcie common utility.
2025 Nov 11 13:20:45.179454 gold228 NOTICE CCmisApi: System is ready
```

If any code imports c_cmis.py (indirectly for the most part) and has a logger session open already, their existing session will be re-opened with a new log tag, `CCmisApi` which is not desired.

To fix this we remove this logging code entirely because the function where it is used became dead in sonic-net#556

Libraries (rather than executables) probably shouldn't be syslogging on their own, and they certainly shouldn't hijack the log tag of the including executable.

Resolves sonic-net/sonic-buildimage#24489

#### How Has This Been Tested?
<!--
     Please describe in detail how you tested your changes.
     Include details of your testing environment, and the tests you ran to
     see how your change affects other areas of the code, etc.
-->
Before the fix:
Open an interactive python session and run this code:
```
from sonic_py_common import logger
log = logger.Logger("NATE")
log.log_notice("Log 1")
import sonic_platform
log.log_notice("Log 2")
```
Watch the syslog output:
```
2025 Nov 11 13:47:29.002667 gold228-dut NOTICE NATE: Log 1
2025 Nov 11 13:48:28.406481 gold228-dut NOTICE CCmisApi: Log 2
```
Clearly the second log should also have had the tag `NATE`

After the fix:
Syslog entries should have been:
```
2025 Nov 11 13:47:29.002667 gold228-dut NOTICE NATE: Log 1
2025 Nov 11 13:48:28.406481 gold228-dut NOTICE NATE: Log 2
```

Signed-off-by: Sonic Build Admin <sonicbld@microsoft.com>
#### Additional Information (Optional)
mssonicbld added a commit that referenced this pull request Mar 9, 2026
<!-- Provide a general summary of your changes in the Title above -->

#### Description
<!--
 Describe your changes in detail
-->

Remove calls to setting a syslog tag as there was no logging actually used in this module. The existing call to log a message was in dead code, therefore there was no need to setup a syslogger at all.

#### Motivation and Context
<!--
 Why is this change required? What problem does it solve?
 If this pull request closes/resolves an open Issue, make sure you
 include the text "fixes #xxxx", "closes #xxxx" or "resolves #xxxx" here
-->
In python services using `sonic_py_common.logger` the syslog identifier already set up by the executable/service can be overwritten when importing other modules.

This log hijack is coming from a library: `sonic_platform_base/sonic_xcvr/api/public/c_cmis.py`
```
SYSLOG_IDENTIFIER = "CCmisApi"
...
helper_logger = logger.Logger(SYSLOG_IDENTIFIER)
# This is the only log that this file has:
 helper_logger.log_debug('key {} not present in VDM'.format(new_key))
```

Searching in the syslog of a switch running SONiC we see logs with the `CCmisApi` that shouldn't:
```
2025-11-11T13:20:03.440479+00:00 sonic CCmisApi: Failed to load platform Pcie module. Warning : No module named 'sonic_platform.pcie', fallback to load Pcie common utility.
2025 Nov 11 13:20:45.179454 gold228 NOTICE CCmisApi: System is ready
```

If any code imports c_cmis.py (indirectly for the most part) and has a logger session open already, their existing session will be re-opened with a new log tag, `CCmisApi` which is not desired.

To fix this we remove this logging code entirely because the function where it is used became dead in #556

Libraries (rather than executables) probably shouldn't be syslogging on their own, and they certainly shouldn't hijack the log tag of the including executable.

Resolves sonic-net/sonic-buildimage#24489

#### How Has This Been Tested?
<!--
 Please describe in detail how you tested your changes.
 Include details of your testing environment, and the tests you ran to
 see how your change affects other areas of the code, etc.
-->
Before the fix:
Open an interactive python session and run this code:
```
from sonic_py_common import logger
log = logger.Logger("NATE")
log.log_notice("Log 1")
import sonic_platform
log.log_notice("Log 2")
```
Watch the syslog output:
```
2025 Nov 11 13:47:29.002667 gold228-dut NOTICE NATE: Log 1
2025 Nov 11 13:48:28.406481 gold228-dut NOTICE CCmisApi: Log 2
```
Clearly the second log should also have had the tag `NATE`

After the fix:
Syslog entries should have been:
```
2025 Nov 11 13:47:29.002667 gold228-dut NOTICE NATE: Log 1
2025 Nov 11 13:48:28.406481 gold228-dut NOTICE NATE: Log 2
```

Signed-off-by: Sonic Build Admin <sonicbld@microsoft.com>
#### Additional Information (Optional)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants