Skip to content

fix: manually republish messages on built-in bridge reconnect#3717

Merged
jarhodes314 merged 1 commit intothin-edge:mainfrom
jarhodes314:fix/bridge-republish
Jul 11, 2025
Merged

fix: manually republish messages on built-in bridge reconnect#3717
jarhodes314 merged 1 commit intothin-edge:mainfrom
jarhodes314:fix/bridge-republish

Conversation

@jarhodes314
Copy link
Copy Markdown
Contributor

@jarhodes314 jarhodes314 commented Jun 30, 2025

Proposed changes

Manually republish messages on bridge reconnect to simulate non-clean sessions for Cumulocity.

Since v0.24, rumqttc has fixed how it handles reconnection to a broker, only republishing the existing packets when a session is present (i.e. we have connected to a broker with clean_session=false and the broker has acknowledged this in the connack packet). For Cumulocity, we ideally want a non-clean (i.e. persistent) session, but this is explicitly recommended against.

Still TODO:

  • Test this in a reproducible way, with a broker we can control the responses of, so we can ensure that the messages get published exactly once

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
  • Documentation Update (if none of the other choices apply)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA (in all commits with git commit -s. You can activate automatic signing by running just prepare-dev once)
  • I ran just format as mentioned in CODING_GUIDELINES
  • I used just check as mentioned in CODING_GUIDELINES
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments

@codecov
Copy link
Copy Markdown

codecov bot commented Jun 30, 2025

Codecov Report

Attention: Patch coverage is 82.75862% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...s/extensions/tedge_mqtt_bridge/src/test_helpers.rs 37.50% 5 Missing ⚠️

📢 Thoughts on this report? Let us know!

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@reubenmiller reubenmiller added the theme:bridge Built-in (Rust) bridge topics label Jul 2, 2025
@jarhodes314 jarhodes314 marked this pull request as draft July 2, 2025 09:09
@jarhodes314 jarhodes314 force-pushed the fix/bridge-republish branch from 9954874 to fe29324 Compare July 2, 2025 15:51
@jarhodes314 jarhodes314 force-pushed the fix/bridge-republish branch from fe29324 to f65270e Compare July 2, 2025 16:00
@jarhodes314 jarhodes314 temporarily deployed to Test Pull Request July 2, 2025 16:00 — with GitHub Actions Inactive
@jarhodes314 jarhodes314 marked this pull request as ready for review July 2, 2025 16:00
@jarhodes314 jarhodes314 requested a review from reubenmiller as a code owner July 2, 2025 16:00
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is a mod.rs so it doesn't get compiled into its own test binary. It will be compiled as a submodule of republish.rs so the tests in this file will run as part of this.

The first draft of this code was created by gemini, hence the heavy commenting. I intend to remove some of the less useful ones, although the details of what the broker should be expecting are broadly useful.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Jul 2, 2025

Robot Results

✅ Passed ❌ Failed ⏭️ Skipped Total Pass % ⏱️ Duration
654 0 3 654 100 1h50m22.795886s

Copy link
Copy Markdown
Contributor

@didier-wenzek didier-wenzek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like these new tests driven by small precise actions on the local and remote brokers.

The code of the test_broker used underneath is a bit weird but that's okay as the method to control its behavior from the test suite are clear enough.

I will be happy to approve this PR once the clippy warnings fixed (or simply marked as ignored, since they are related to the test_broker).

Copy link
Copy Markdown
Contributor

@albinsuresh albinsuresh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some queries to better understand some aspects of the fix.

let mut published = 0; // Count of messages published (by the companion)
let mut acknowledged = 0; // Count of messages acknowledged (by the MQTT end-point of the companion)

let mut session_present: Option<bool> = None;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering why this is not initiated by the clean_session value used in the mqtt config used to establish the connection. Got the answer from the code below that it is decided based on the broker response, since some brokers like C8Y can ignore the client's request for a persistent session. A comment would be helpful for future readers.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now added a comment

// We have to subscribe to this asynchronously (i.e. in a task) since we might at
// this point have filled our cloud event loop with outgoing messages
tokio::spawn(async move { recv_client.subscribe_many(topics).await.unwrap() });
{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this code moved into a sub-block? To drop the recv_client handle as quickly as possible? Any subtle impact on the fix with this change?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think that's a relic of the initial implementation where I attempted to resend the messages manually and needed multiple clones of the client and topics, so I put this into a block to allow me to reuse names.

Signed-off-by: James Rhodes <jarhodes314@gmail.com>
Copy link
Copy Markdown
Contributor

@didier-wenzek didier-wenzek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved.

@jarhodes314 jarhodes314 added this pull request to the merge queue Jul 11, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 11, 2025
@jarhodes314 jarhodes314 added this pull request to the merge queue Jul 11, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 11, 2025
@jarhodes314 jarhodes314 added this pull request to the merge queue Jul 11, 2025
Merged via the queue into thin-edge:main with commit 944a3cc Jul 11, 2025
47 of 52 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

theme:bridge Built-in (Rust) bridge topics

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants