Skip to content

add exception handling in get_chassis#652

Merged
judyjoseph merged 17 commits intosonic-net:masterfrom
LinJin23:get_chassis_add_exception
Oct 28, 2025
Merged

add exception handling in get_chassis#652
judyjoseph merged 17 commits intosonic-net:masterfrom
LinJin23:get_chassis_add_exception

Conversation

@LinJin23
Copy link
Copy Markdown
Contributor

@LinJin23 LinJin23 commented Jul 31, 2025

Description

add exception handling in get_chassis()

Motivation and Context

Improve exception handling during chassis loading to prevent thermalctld crashes (e.g., ValueError: invalid literal for int() with base 16: 'read error').

How Has This Been Tested?

Unit tests were extended to test the chassis exception handling functionality:

Added test for chassis initialization failure:

  • TestThermalControlDaemon.test_get_chassis_exception()

Added tests for None chassis handling:

  • TestFanUpdater.test_update_with_none_chassis()
  • TestTemperatureUpdater.test_init_with_none_chassis()
  • TestTemperatureUpdater.test_update_with_none_chassis()

Verified:
Daemon continues running and logs appropriate warnings when chassis initialization fails.

Additionally, tested on a physical platform to verify that when an exception occurs in get_chassis, the thermalctld daemon doesn’t crash.

Additional Information (Optional)

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@LinJin23 LinJin23 changed the title add exception in get chassis add exception handling in get_chassis Aug 4, 2025
@azure-pipelines
Copy link
Copy Markdown

Commenter does not have sufficient privileges for PR 652 in repo sonic-net/sonic-platform-daemons

@LinJin23
Copy link
Copy Markdown
Contributor Author

/Azpw run

@mssonicbld
Copy link
Copy Markdown
Collaborator

/AzurePipelines run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@LinJin23 LinJin23 requested a review from vvolam August 13, 2025 01:08
try:
self.chassis = sonic_platform.platform.Platform().get_chassis()
except Exception as e:
self.log_warning("Failed to get chassis due to {}".format(repr(e)))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use log_error instead here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated


self.chassis = sonic_platform.platform.Platform().get_chassis()
try:
self.chassis = sonic_platform.platform.Platform().get_chassis()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, please check if these lines have code coverage in sonic-thermalctld/tests/test_thermalctld.py

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added code coverage.

@vvolam
Copy link
Copy Markdown
Contributor

vvolam commented Aug 21, 2025

@LinJin23 could you please test this exception path is hit by testing on any platform and simulating the exception

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@LinJin23 LinJin23 marked this pull request as draft September 23, 2025 16:22
…lControlDaemon

Signed-off-by: Lin Jin <linjin@microsoft.com>
@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@LinJin23 LinJin23 marked this pull request as ready for review September 29, 2025 16:04
@LinJin23 LinJin23 requested a review from vvolam September 29, 2025 16:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Adds exception handling to the get_chassis() method call in the ThermalControlDaemon initialization to prevent crashes when chassis loading fails.

  • Wraps the platform chassis initialization in a try-except block to catch and log exceptions
  • Adds comprehensive test coverage for both successful and failed chassis initialization scenarios
  • Updates mock library to include missing methods and classes for improved test compatibility

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
sonic-thermalctld/scripts/thermalctld Adds try-except wrapper around get_chassis() call with error logging
sonic-thermalctld/tests/test_thermalctld.py Adds test class with comprehensive test cases for chassis initialization scenarios
sonic-thermalctld/tests/mocked_libs/swsscommon/swsscommon.py Updates mock implementation with missing classes and methods for test compatibility

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

try:
self.chassis = sonic_platform.platform.Platform().get_chassis()
except Exception as e:
self.log_error("Failed to get chassis due to {}".format(repr(e)))
Copy link

Copilot AI Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The exception handling catches the error but doesn't prevent the subsequent code from trying to use self.chassis. Line 829 will fail with AttributeError because self.chassis is never assigned when an exception occurs. Consider either setting self.chassis = None in the except block or adding a return/raise to prevent further initialization.

Suggested change
self.log_error("Failed to get chassis due to {}".format(repr(e)))
self.log_error("Failed to get chassis due to {}".format(repr(e)))
raise

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With self.chassis initialized, this comment could be ignored.

Comment on lines 5 to 17
# Mock classes that are imported by sonic_py_common
class ConfigDBConnector:
def __init__(self):
pass

class SonicV2Connector:
def __init__(self):
pass

class SonicDBConfig:
def __init__(self):
pass

Copy link

Copilot AI Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of the import statement from swsssdk import ConfigDBConnector, SonicDBConfig, SonicV2Connector and replacement with local mock classes creates potential issues. If other parts of the test suite depend on the imported classes from swsssdk, this change could break existing functionality. Consider keeping both approaches or ensuring all dependencies are properly mocked.

Suggested change
# Mock classes that are imported by sonic_py_common
class ConfigDBConnector:
def __init__(self):
pass
class SonicV2Connector:
def __init__(self):
pass
class SonicDBConfig:
def __init__(self):
pass
# Try to import real swsssdk classes, fallback to mocks if not available
try:
from swsssdk import ConfigDBConnector, SonicDBConfig, SonicV2Connector
except ImportError:
class ConfigDBConnector:
def __init__(self):
pass
class SonicV2Connector:
def __init__(self):
pass
class SonicDBConfig:
def __init__(self):
pass

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@vvolam
Copy link
Copy Markdown
Contributor

vvolam commented Sep 30, 2025

Description

add exception handling in get_chassis()

Motivation and Context

Improve exception handling during chassis loading to prevent thermalctld crashes (e.g., ValueError: invalid literal for int() with base 16: 'read error').

How Has This Been Tested?

Additional Information (Optional)

Please updated "How Has This Been Tested?" step as well

@@ -821,7 +821,10 @@ class ThermalControlDaemon(daemon_base.DaemonBase):

self.wait_time = self.INTERVAL

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initialize self.chassis to None

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Lin Jin <linjin@microsoft.com>
@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Lin Jin <linjin@microsoft.com>
@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@LinJin23 LinJin23 requested a review from Copilot October 23, 2025 00:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +823 to +824
assert any(
"Failed to get chassis due to" in call_args[0] and "Failed to initialize chassis" in call_args[0]
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion logic checks if both substrings appear in the same call, but the actual log message format is 'Failed to get chassis due to {repr(e)}'. The check for 'Failed to initialize chassis' should verify it appears within the exception representation, not as a separate substring. Consider using a single contains check or verifying the full expected message format.

Suggested change
assert any(
"Failed to get chassis due to" in call_args[0] and "Failed to initialize chassis" in call_args[0]
expected_msg = "Failed to get chassis due to Exception('Failed to initialize chassis')"
assert any(
expected_msg in call_args[0]

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@LinJin23 LinJin23 requested a review from Copilot October 23, 2025 06:20
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

mock_platform_instance.get_chassis.side_effect = Exception("Failed to initialize chassis")
mock_platform_class.return_value = mock_platform_instance

# ThermalControlDaemon should exit with error code when chassis initialization fails
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The comment states 'should exit with error code' but doesn't mention the SystemExit behavior. Consider clarifying: 'ThermalControlDaemon should raise SystemExit with CHASSIS_GET_ERROR code when chassis initialization fails'.

Suggested change
# ThermalControlDaemon should exit with error code when chassis initialization fails
# ThermalControlDaemon should raise SystemExit with CHASSIS_GET_ERROR code when chassis initialization fails

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +845 to +848
if mock_log_error.called:
for call_args in mock_log_error.call_args_list:
args, _ = call_args
assert "Failed to get chassis due to" not in args[0]
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conditional check for mock_log_error.called before iterating is unnecessary. If the mock wasn't called, call_args_list would be empty and the loop wouldn't execute. Consider simplifying by removing the conditional and directly iterating over mock_log_error.call_args_list.

Suggested change
if mock_log_error.called:
for call_args in mock_log_error.call_args_list:
args, _ = call_args
assert "Failed to get chassis due to" not in args[0]
for call_args in mock_log_error.call_args_list:
args, _ = call_args
assert "Failed to get chassis due to" not in args[0]

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Signed-off-by: Lin Jin <linjin@microsoft.com>
@LinJin23 LinJin23 requested a review from Copilot October 23, 2025 07:10
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +825 to +826
expected_msg in call_args[0]
for call_args, _ in mock_log_error.call_args_list
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The assertion iterates through call_args_list but extracts call_args[0] assuming positional arguments. This could be fragile if the logging call format changes. Consider using mock_log_error.assert_any_call(mock.ANY) with a custom matcher or assert_called_with() to make the test more maintainable.

Suggested change
expected_msg in call_args[0]
for call_args, _ in mock_log_error.call_args_list
expected_msg in call.args[0]
for call in mock_log_error.call_args_list

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not changing it for now.

@mssonicbld
Copy link
Copy Markdown
Collaborator

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@LinJin23 LinJin23 requested a review from Copilot October 24, 2025 04:17
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +807 to +808
with mock.patch('thermalctld.sonic_platform.platform.Platform') as mock_platform_class, \
mock.patch.object(thermalctld.ThermalControlDaemon, 'log_error') as mock_log_error:
Copy link

Copilot AI Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The test patches 'log_error' but doesn't verify the call arguments precisely. Consider using mock_log_error.assert_called_with() or mock_log_error.assert_called_once_with() instead of the current approach with any() to make the assertion more explicit and maintainable.

Copilot uses AI. Check for mistakes.
@LinJin23 LinJin23 requested review from Copilot and vvolam October 24, 2025 11:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants