Skip to content

fix(mqtt): Map mosquitto bridge service health status updates#2903

Merged
albinsuresh merged 1 commit intothin-edge:mainfrom
albinsuresh:fix/2902/convert-bridge-service-health-status-updates
May 27, 2024
Merged

fix(mqtt): Map mosquitto bridge service health status updates#2903
albinsuresh merged 1 commit intothin-edge:mainfrom
albinsuresh:fix/2902/convert-bridge-service-health-status-updates

Conversation

@albinsuresh
Copy link
Copy Markdown
Contributor

@albinsuresh albinsuresh commented May 24, 2024

Proposed changes

  • Map health status updates from mosquitto bridge services
  • Avoid redundant service creation messages while mapping the health status updates
  • Ensure that the initial health status is propagated as-is without assuming it to be up

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
  • Documentation Update (if none of the other choices apply)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue

#2902

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA (in all commits with git commit -s)
  • I ran cargo fmt as mentioned in CODING_GUIDELINES
  • I used cargo clippy as mentioned in CODING_GUIDELINES
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments

Comment on lines -362 to -366
if self.config.bridge_in_mapper && entity_topic_id.is_bridge_health_topic() {
// Skip service creation for the mapper-inbuilt bridge, as it is part of the mapper service itself
return Ok(vec![]);
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced this with a better logic higher up the stack on the auto-registration path, which was the only path where this logic was relevant.

@codecov
Copy link
Copy Markdown

codecov bot commented May 24, 2024

Codecov Report

Attention: Patch coverage is 94.28571% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 78.0%. Comparing base (8a91efb) to head (d52bbdf).
Report is 2 commits behind head on main.

Additional details and impacted files
Files Coverage Δ
crates/core/tedge_api/src/health.rs 92.6% <ø> (ø)
crates/core/tedge_api/src/mqtt_topics.rs 86.9% <100.0%> (-0.6%) ⬇️
...s/extensions/c8y_mapper_ext/src/service_monitor.rs 97.6% <100.0%> (+1.2%) ⬆️
crates/extensions/c8y_mapper_ext/src/converter.rs 83.6% <77.7%> (+<0.1%) ⬆️

... and 1 file with indirect coverage changes

@github-actions
Copy link
Copy Markdown
Contributor

Robot Results

✅ Passed ❌ Failed ⏭️ Skipped Total Pass % ⏱️ Duration
443 0 3 443 100 1h0m32.488258999s

Copy link
Copy Markdown
Contributor

@didier-wenzek didier-wenzek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new system tests related to bridge health status are clear and extensive enough to provide the confidence that this PR fix the recently-discovered issues.

However, the code related to health status is still fragile because unstructured.

  • There are two similar HealthStatus structs defined by c8y related crates (c8y_api and c8y_mapper_ext).
  • The respective roles of tedge_api and c8y_api is not clear. Several methods should be moved into tedge_api to encapsulate all the tedge conventions.
  • This PR introduces UP_STATUS and DOWN_STATUS constants, but "up" and "down" literals are used everywhere.

Hence, I'm okay to merge this PR if the point if to have an urgent fix. But to avoid other future regressions, it would be really better to take the time to clean-up health status handling.

Comment on lines +42 to +49
fn entity_is_mosquitto_bridge_service(entity_topic_id: &EntityTopicId) -> bool {
entity_topic_id
.default_service_name()
.filter(|name| {
name.starts_with(MOSQUITTO_BRIDGE_PREFIX) && name.ends_with(MOSQUITTO_BRIDGE_SUFFIX)
})
.is_some()
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this function be part of tedge_api?

Comment on lines +1233 to +1237
// If the auto-registration is done on a health status message,
// no need to map it to a C8y service creation message here,
// as the status message itself is mapped into a service creation message
// in try_convert_data_message called after this auto-registration.
// This avoids redundant service status creation/mapping
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. It seems that something can still be improved here.

return vec![];
}

if entity.topic_id.is_bridge_health_topic() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beware that the is_bridge_health_topic method is now unused and has to be removed.

Comment on lines +31 to +35
let status = match payload {
MQTT_BRIDGE_UP_PAYLOAD => UP_STATUS,
MQTT_BRIDGE_DOWN_PAYLOAD => DOWN_STATUS,
_ => UNKNOWN_STATUS,
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic should be moved into tedge_api. It would even be better to have a function taking both the topic and the payload. Indeed, with the current API (a bunch of constants), this is up to the caller to figure out that this is a bridge health topic and that the payload has to be decoded differently).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a comprehensive cleanup of the currently fragmented health monitoring code will be done a follow-up PR as mentioned in this comment

Copy link
Copy Markdown
Member

@rina23q rina23q left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed that this PR fixes the reported issues. Thanks Albin.
I approve this state since it fixes the issues and robot tests coverage is good, although I agree with the comments from Didier about c8y_api vs c8y_mapper_ext.

@albinsuresh
Copy link
Copy Markdown
Contributor Author

However, the code related to health status is still fragile because unstructured.

  • There are two similar HealthStatus structs defined by c8y related crates (c8y_api and c8y_mapper_ext).
  • The respective roles of tedge_api and c8y_api is not clear. Several methods should be moved into tedge_api to encapsulate all the tedge conventions.
  • This PR introduces UP_STATUS and DOWN_STATUS constants, but "up" and "down" literals are used everywhere.

I completely agree and this was the cleanup that I wanted to do as well. But, kept the minimal changes in this PR because of the urgency in fixing the bug. Will definitely do a follow-up PR with all that cleanup.

@albinsuresh albinsuresh added this pull request to the merge queue May 27, 2024
Merged via the queue into thin-edge:main with commit 7a18df8 May 27, 2024
@albinsuresh albinsuresh deleted the fix/2902/convert-bridge-service-health-status-updates branch May 28, 2024 05:48
Copy link
Copy Markdown
Contributor

@gligorisaev gligorisaev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These test cases are well-designed for verifying the mapping of service statuses to Cumulocity managed objects.
They ensure that both valid and invalid inputs are handled appropriately, and the expected outcomes are clearly defined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants