Skip to content

fix: aws and az fail when no response on connection check#3656

Merged
Bravo555 merged 3 commits intothin-edge:mainfrom
Bravo555:fix/3640/untrusted-passes-aws-conncheck
Jun 9, 2025
Merged

fix: aws and az fail when no response on connection check#3656
Bravo555 merged 3 commits intothin-edge:mainfrom
Bravo555:fix/3640/untrusted-passes-aws-conncheck

Conversation

@Bravo555
Copy link
Copy Markdown
Member

@Bravo555 Bravo555 commented May 30, 2025

TODO

  • tests

Proposed changes

When connecting to aws and azure, during "Verifying device is connected to cloud..." stage, if we don't get response from a remote broker, mark the spinner as failed instead of OK.

It was being marked as OK, because the code was a bit too c8y-centric: for c8y we connect directly to the remote broker to create a device and after that switch to the bridge. When using the bridge, if remote broker doesn't respond, we know that the bridge is connected with remote broker because it's using the same configuration as the direct connection, which was successful, so we consider device status unknown instead of failing the connection.

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
  • Documentation Update (if none of the other choices apply)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue

#3640

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA (in all commits with git commit -s. You can activate automatic signing by running just prepare-dev once)
  • I ran just format as mentioned in CODING_GUIDELINES
  • I used just check as mentioned in CODING_GUIDELINES
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments

@Bravo555 Bravo555 temporarily deployed to Test Pull Request May 30, 2025 13:56 — with GitHub Actions Inactive
@codecov
Copy link
Copy Markdown

codecov bot commented May 30, 2025

Codecov Report

Attention: Patch coverage is 0% with 9 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/core/tedge/src/cli/connect/command.rs 0.00% 6 Missing ⚠️
crates/core/tedge/src/cli/connect/aws.rs 0.00% 1 Missing ⚠️
crates/core/tedge/src/cli/connect/azure.rs 0.00% 1 Missing ⚠️
crates/core/tedge/src/cli/connect/c8y.rs 0.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Bravo555 Bravo555 force-pushed the fix/3640/untrusted-passes-aws-conncheck branch from 3004ea2 to 5669bfe Compare June 2, 2025 11:24
@Bravo555 Bravo555 had a problem deploying to Test Pull Request June 2, 2025 11:24 — with GitHub Actions Failure
@Bravo555 Bravo555 marked this pull request as ready for review June 2, 2025 11:27
@Bravo555
Copy link
Copy Markdown
Member Author

Bravo555 commented Jun 2, 2025

NOTE: The new test that checks the connection has to be run manually because the suite is marked test:on_demand because we don't have AWS cloud configured in CI.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Jun 2, 2025

Robot Results

✅ Passed ❌ Failed ⏭️ Skipped Total Pass % ⏱️ Duration
641 0 3 641 100 1h47m37.623212s

Comment on lines +273 to +277
if connection_check_success {
Ok(())
} else {
Err(anyhow::anyhow!("Connection check failed").into())
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: To make the tedge connect aws return a non-zero exit code when the connection check failed had to add this here, which can influence c8y where failing the connection check is apparently not a failure? I'm not sure why, and if it's intended or accidental.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tedge connect c8y ignores connection check errors since #218.
@rina23q do you remember the motivation for this behavior?

That said this is not my main concern regarding connect/command.rs. There is so many alternative paths for c8y, aws and az that its really difficult to tell what is done for each. For instance, a few lines above this one, the agent is restarted for c8y but not for aws and az. Why? Also, you have to know that enable_software_management means restart the agent.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I remember the decision.
At the time, the purpose of tedge connect was to establish the bridge to the cloud, and the connection check was considered an optional check because it may fail, for example, when the server is not responded, that's not thin-edge side of failure. That's why we decided to have exit code 0.

However, now we have --offline option. So, I don't mind if we return non-zero exist code when connection check fails.

Bravo555 added 2 commits June 2, 2025 13:51
When connecting to aws and azure, during "Verifying device is connected
to cloud..." stage, if we don't get response from a remote broker, mark
the spinner as failed instead of OK.

It was being marked as OK, because the code was a bit too c8y-centric:
for c8y we connect directly to the remote broker to create a device and
after that switch to the bridge. When using the bridge, if remote broker
doesn't respond, we know that the bridge is connected with remote broker
because it's using the same configuration as the direct connection,
which was successful, so we consider device status unknown instead of
failing the connection.

Signed-off-by: Marcel Guzik <marcel.guzik@cumulocity.com>
Signed-off-by: Marcel Guzik <marcel.guzik@cumulocity.com>
@Bravo555 Bravo555 force-pushed the fix/3640/untrusted-passes-aws-conncheck branch from 5669bfe to 2f09de6 Compare June 2, 2025 13:52
@Bravo555 Bravo555 temporarily deployed to Test Pull Request June 2, 2025 13:52 — with GitHub Actions Inactive
@Bravo555 Bravo555 requested a review from reubenmiller June 3, 2025 12:23
Copy link
Copy Markdown
Contributor

@didier-wenzek didier-wenzek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved

Signed-off-by: reubenmiller <reuben.d.miller@gmail.com>
@reubenmiller reubenmiller temporarily deployed to Test Pull Request June 4, 2025 17:06 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@reubenmiller reubenmiller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved. Tested with AWS and it was good (both negative and positive test cases).

I pushed a minor commit to improve the readability of some of the error messages for all clouds, b347575

@Bravo555 Bravo555 added this pull request to the merge queue Jun 9, 2025
Merged via the queue into thin-edge:main with commit 3273c82 Jun 9, 2025
51 of 52 checks passed
@Bravo555 Bravo555 deleted the fix/3640/untrusted-passes-aws-conncheck branch June 10, 2025 08:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants