fix: Don't fail if only some operations could not be loaded by Bravo555 · Pull Request #3236 · thin-edge/thin-edge.io

Bravo555 · 2024-11-08T09:22:22Z

TODO

stop loading all the operations when registering a single operation
~~read c8y operations directory only once~~(fix: Don't fail if only some operations could not be loaded #3236 (comment))
~~clean up Operations struct and its usage in converter~~ deferred to another PR
fix integration test errors
unit test coverage

Proposed changes

When loading operations from $CONFIG_DIR/operations/c8y, log an error for operations we were unable to load but return other successfully loaded operations.
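The behaviour described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the `Operation` struct, the toy `command=<value>` file format, and the `parse_operation` and `load_operations` helpers are all assumptions made for the example, and errors are collected into a list where the real mapper would log them.

```rust
/// Hypothetical, simplified stand-in for a parsed operation file.
#[derive(Debug, Clone, PartialEq)]
struct Operation {
    name: String,
    command: String,
}

/// Toy parser (assumption): a file is valid if it contains a
/// `command=<value>` line with a non-empty value.
fn parse_operation(name: &str, content: &str) -> Result<Operation, String> {
    let command = content
        .lines()
        .find_map(|line| line.trim().strip_prefix("command="))
        .map(str::trim)
        .filter(|cmd| !cmd.is_empty())
        .ok_or_else(|| "missing `command` entry".to_string())?;
    Ok(Operation {
        name: name.to_string(),
        command: command.to_string(),
    })
}

/// Loads every file it can and reports the rest, instead of failing
/// on the first invalid file.
fn load_operations(files: &[(&str, &str)]) -> (Vec<Operation>, Vec<String>) {
    let mut loaded = Vec::new();
    let mut errors = Vec::new();
    for (name, content) in files {
        match parse_operation(name, content) {
            Ok(op) => loaded.push(op),
            // The caller would log these at error level.
            Err(err) => errors.push(format!("failed to load operation `{name}`: {err}")),
        }
    }
    (loaded, errors)
}
```

One invalid file then costs one error entry while the remaining operations stay available to the caller.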

Because of #3160, this resulted in a lot of logs being emitted, so the PR also reworks operation registration so that it no longer loads and re-registers all the operations from disk when only a single operation is registered. Even so, invalid operation files are still logged twice on startup, because subscriptions cannot be changed at runtime; see #3236 (comment).

Types of changes

Bugfix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
Documentation Update (if none of the other choices apply)
Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue

tedge-mapper-c8y panics on startup if an invalid operations file exists #3234

Checklist

I have read the CONTRIBUTING doc
I have signed the CLA (in all commits with git commit -s)
I ran cargo fmt as mentioned in CODING_GUIDELINES
I used cargo clippy as mentioned in CODING_GUIDELINES
I have added tests that prove my fix is effective or that my feature works
I have added necessary documentation (if appropriate)

Further comments

codecov · 2024-11-08T09:33:18Z

Codecov Report

Attention: Patch coverage is 84.54106% with 32 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
crates/extensions/c8y_mapper_ext/src/converter.rs	82.69%	14 Missing and 13 partials ⚠️
crates/core/c8y_api/src/smartrest/operations.rs	90.19%	3 Missing and 2 partials ⚠️


github-actions · 2024-11-10T23:59:02Z

Robot Results

✅ Passed	❌ Failed	⏭️ Skipped	Total	Pass %	⏱️ Duration
524	0	2	524	100	1h26m23.606578999s

Bravo555 · 2024-11-13T12:31:06Z

This PR changes the logic of loading c8y operations in the CumulocityConverter so that it no longer panics; instead, invalid operation files are logged at error level. However, due to #3160, the output gets spammed by multiple errors about the same file.

To ensure that only a single log is emitted for a given operation file, this PR originally aimed to address #3160 as well, by refactoring the operation registration flow so that the operations directory would be read only once. However, this optimal solution is impossible because an MQTT subscription cannot be added after the actor has been spawned, so we're forced to scan the directory when creating C8yConverterConfig to collect all MQTT topics that custom c8y operations may define. In addition, some tests assert implementation details, which makes it impossible to completely remove duplicate traversals of the c8y operations directory.
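The constraint can be illustrated with a small sketch: if the subscription set is fixed when the actor is spawned, every topic has to be known while the configuration is built. The `Operation` struct, the `subscription_topics` helper, and the topic strings used with it are hypothetical and only stand in for whatever the mapper actually collects.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified operation definition: a custom operation
/// may declare an MQTT topic it wants to be triggered from.
struct Operation {
    name: String,
    topic: Option<String>,
}

/// Collects every topic that must be subscribed to up front, because
/// (by assumption) no subscription can be added after startup.
fn subscription_topics(builtin: &[&str], operations: &[Operation]) -> BTreeSet<String> {
    builtin
        .iter()
        .map(|topic| topic.to_string())
        .chain(operations.iter().filter_map(|op| op.topic.clone()))
        .collect()
}
```

Because the result is a set, operations that share a topic contribute it only once.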

As such, this PR removes duplicate traversals where feasible but does not completely address #3160, so the log output will still contain some duplicate errors, which will be annoying for users. Solving this may require bigger changes to our MQTT actor, so it is left for another PR.

albinsuresh · 2024-11-15T07:10:36Z

crates/core/c8y_api/src/smartrest/operations.rs

+        self.operations.dedup();
+        let pos = self.operations.iter().position(|op| op.name == name);
+        pos.map(|pos| self.operations.remove(pos))


If the switch to a BTreeSet is very minimal, I'm even okay with you making that switch in this PR itself, to avoid complications like this. Or if you prefer a standalone PR, that's also fine.

Yes, better to switch to a BTreeSet right now.

I would really prefer to avoid big changes to Operations struct in this PR because of parallel PR #3225, after one of these merges we expect to see conflicts which I'd like to minimise.
I'd also like to reorganize the module a bit in the refactor, so I'd rather do it as a separate PR.

albinsuresh · 2024-11-15T07:30:03Z

crates/extensions/c8y_mapper_ext/src/converter.rs

+        let operation = c8y_api::smartrest::operations::get_operation(
+            ops_file.as_std_path(),
+            &self.config.bridge_config,
+        )?;
+        let operations = self
+            .operations_for_device_mut(target)
+            .expect("entity should've been checked before that it's not a service");

-        let need_cloud_update = self.update_operations(ops_dir.as_std_path())?;
+        let prev_operation = operations.remove_operation(&operation.name);
+        // even if the body of the operation is different, as long as it has the same name, supported operations message
+        // will be the same, so we don't need to resend
+        let need_cloud_update = prev_operation.is_none();


Why not move this logic into the existing update_operations function, as that function already has the contract to return the boolean flag whether the cloud update is required or not?

Yeah, the logic of checking if operations changed is a bit duplicated, but also slightly different:

update_operations reads all operations from the c8y operations dir and returns true if there were any updates in there. It compares two different instances of Operations.

In register_operation, because we register only a single operation, we know the update is required when prev_operation is None. We only mutate a single Operations value.

So update_operations has the side effect of creating an entirely new Operations value, and it checks more than is necessary, so I can't really reuse it as-is. But I agree that splitting the logic that decides whether we publish a 114 message is suboptimal.

The function changed a bit in a313601, and I ended up with two branches, in the case of a child device I just call update_operations outright. Let me know if this addresses it, or if you had something more particular in mind.
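The single-operation check discussed in this thread can be sketched roughly as below. The `Operations` and `Operation` types here are simplified stand-ins written for illustration, and `register_operation` is a free function in this sketch, not the converter method from the PR.

```rust
/// Simplified stand-in for an operation definition.
#[derive(Debug, Clone, PartialEq)]
struct Operation {
    name: String,
    command: String,
}

/// Simplified stand-in for the set of operations of one device.
#[derive(Debug, Default)]
struct Operations {
    operations: Vec<Operation>,
}

impl Operations {
    /// Removes and returns the operation with the given name, if any.
    fn remove_operation(&mut self, name: &str) -> Option<Operation> {
        let pos = self.operations.iter().position(|op| op.name == name)?;
        Some(self.operations.remove(pos))
    }

    fn add_operation(&mut self, operation: Operation) {
        self.operations.push(operation);
    }
}

/// Registers one operation and reports whether the list of supported
/// operation names changed, i.e. whether the cloud needs an update.
fn register_operation(current: &mut Operations, operation: Operation) -> bool {
    let prev_operation = current.remove_operation(&operation.name);
    current.add_operation(operation);
    // Same name means the supported-operations message is unchanged,
    // even if the body of the operation differs.
    prev_operation.is_none()
}
```

Only one `Operations` value is mutated, so no second instance has to be built and compared.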

albinsuresh · 2024-11-15T07:40:17Z

crates/extensions/c8y_mapper_ext/src/tests.rs


 #[tokio::test]
+// TODO: fix or remove test
+#[ignore = "asserts that publishing a single operation capability message causes full rescan of c8y operations directory, which is undesirable behaviour"]


I understand that the rescan is not optimal. But, we shouldn't be skipping the whole test for that reason. The dynamic ops update feature is still relevant and must be tested.

Yeah, this will need to be addressed before the merge.
Just that from my perspective, it looks like the primary point of this test is to assert the exact behaviour I'm removing (registering any operation via MQTT causes full rescan of c8y operations directory), so I'm not really sure what behaviour we could be checking instead.

Okay, I discovered why the test checked that behaviour (#2614) and found that my implementation was incorrect: we still need to maintain the old behaviour for child devices.

Fixed in a313601

rina23q

I confirmed that it no longer panics, so I approve it. There are still many small things to improve in the Operations struct; however, I believe it's time to move on.

As such, this PR will remove duplicates traversals where it's feasible, but will not completely address #3160, and as such, the log output will still be a bit spammed by duplicate errors, which will be annoying for users. Solving this may require bigger changes to our MQTT actor, so it will be left for another PR.

I see far fewer error messages now (2 now versus around 7 before, the number of operation files at tedge-mapper-c8y's startup). Thanks for the improvement.

albinsuresh

Despite the shortcomings of the Operations structure, this is sufficient for the bug fix.

albinsuresh · 2024-11-18T05:52:42Z

crates/extensions/c8y_mapper_ext/src/converter.rs

-        let operations = self
-            .operations_for_device_mut(target)
-            .expect("entity should've been checked before that it's not a service");
+        let need_cloud_update = match is_child_operation_path(ops_dir) {


Not a request for change, just curious why you chose a match statement here instead of a simple if-else.

albinsuresh · 2024-11-18T06:11:28Z

crates/extensions/c8y_mapper_ext/src/converter.rs

-            } else {
-                C8yTopic::upstream_topic(&c8y_prefix)
-            };
+                let prev_operation = current_operations.remove_operation(&operation.name);


In your refactoring PR, consider updating the signature of the Operations::add_operation function to return the previous operation, so that we can avoid removing the operation first and adding it back one step later.
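One possible shape for that suggestion, assuming operations are keyed by name in a `BTreeMap` (an assumption for this sketch, not something the PR does): `BTreeMap::insert` already returns the previous value for a key, so `add_operation` can forward it. The `Operation` and `Operations` types are again simplified stand-ins.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for an operation definition.
#[derive(Debug, Clone, PartialEq)]
struct Operation {
    name: String,
    command: String,
}

/// Hypothetical variant of `Operations` keyed by operation name.
#[derive(Debug, Default)]
struct Operations {
    by_name: BTreeMap<String, Operation>,
}

impl Operations {
    /// Inserts the operation and returns the one it replaced, if any.
    fn add_operation(&mut self, operation: Operation) -> Option<Operation> {
        self.by_name.insert(operation.name.clone(), operation)
    }
}
```

With this signature the caller could compute `need_cloud_update` as `operations.add_operation(operation).is_none()` in a single step.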

When loading operations from $CONFIG_DIR/operations/c8y, log an error for operations we were unable to load but return other successfully loaded operations. Currently this spams the log output a bit, because we statically register some operations by name, but it causes full directory scan and traversal of all operation files anyway. Will be fixed in next commit. Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>

Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>

… operation [1] disabled dynamic operation reload for child devices because it needed to support nested child devices as well. For this reason, when receiving an MQTT command metadata message and registering that operation, we still need to read the operation directory and register and send all the new operations. [1]: thin-edge#2614 Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>

Bravo555 had a problem deploying to Test Pull Request November 8, 2024 09:22 — with GitHub Actions Failure

Bravo555 changed the title ~~Don't fail if only some operations could not be loaded~~ fix: Don't fail if only some operations could not be loaded Nov 8, 2024

Bravo555 temporarily deployed to Test Pull Request November 8, 2024 18:25 — with GitHub Actions Inactive

Bravo555 had a problem deploying to Test Auto November 10, 2024 22:48 — with GitHub Actions Failure

Bravo555 temporarily deployed to Test Pull Request November 12, 2024 14:13 — with GitHub Actions Inactive

Bravo555 had a problem deploying to Test Auto November 12, 2024 14:19 — with GitHub Actions Failure

Bravo555 force-pushed the fix/3234/tedge-mapper-c8y-panics-on-startup branch from 7c40908 to 5dd7588 Compare November 13, 2024 12:15

Bravo555 had a problem deploying to Test Pull Request November 13, 2024 12:16 — with GitHub Actions Failure

Bravo555 force-pushed the fix/3234/tedge-mapper-c8y-panics-on-startup branch from 5dd7588 to c024494 Compare November 13, 2024 12:17

Bravo555 temporarily deployed to Test Pull Request November 13, 2024 12:17 — with GitHub Actions Inactive

Bravo555 had a problem deploying to Test Auto November 13, 2024 12:26 — with GitHub Actions Failure

Bravo555 force-pushed the fix/3234/tedge-mapper-c8y-panics-on-startup branch from c024494 to 95dc831 Compare November 13, 2024 12:54

Bravo555 temporarily deployed to Test Pull Request November 13, 2024 12:54 — with GitHub Actions Inactive

Bravo555 had a problem deploying to Test Auto November 13, 2024 13:01 — with GitHub Actions Failure

Bravo555 force-pushed the fix/3234/tedge-mapper-c8y-panics-on-startup branch from 95dc831 to ea2da72 Compare November 13, 2024 21:58

Bravo555 temporarily deployed to Test Pull Request November 13, 2024 21:58 — with GitHub Actions Inactive

Bravo555 temporarily deployed to Test Auto November 13, 2024 22:05 — with GitHub Actions Inactive

Bravo555 force-pushed the fix/3234/tedge-mapper-c8y-panics-on-startup branch from ea2da72 to 51f6823 Compare November 14, 2024 12:15

Bravo555 temporarily deployed to Test Pull Request November 14, 2024 12:15 — with GitHub Actions Inactive

Bravo555 had a problem deploying to Test Auto November 14, 2024 12:48 — with GitHub Actions Failure

Bravo555 had a problem deploying to Test Pull Request November 14, 2024 13:38 — with GitHub Actions Failure

Bravo555 force-pushed the fix/3234/tedge-mapper-c8y-panics-on-startup branch from 61c89ab to bf9f88a Compare November 14, 2024 13:54

Bravo555 temporarily deployed to Test Pull Request November 14, 2024 13:54 — with GitHub Actions Inactive

Bravo555 had a problem deploying to Test Auto November 14, 2024 14:00 — with GitHub Actions Failure

Bravo555 force-pushed the fix/3234/tedge-mapper-c8y-panics-on-startup branch from bf9f88a to ac65be8 Compare November 14, 2024 14:34

Bravo555 temporarily deployed to Test Pull Request November 14, 2024 14:34 — with GitHub Actions Inactive

Bravo555 had a problem deploying to Test Auto November 14, 2024 14:39 — with GitHub Actions Failure

Bravo555 requested review from jarhodes314 and rina23q as code owners November 14, 2024 15:07

Bravo555 had a problem deploying to Test Auto November 14, 2024 15:08 — with GitHub Actions Failure

Bravo555 had a problem deploying to Test Auto November 14, 2024 15:26 — with GitHub Actions Failure

Bravo555 temporarily deployed to Test Auto November 14, 2024 17:33 — with GitHub Actions Inactive

Bravo555 requested a review from Ruadhri17 November 14, 2024 20:54

albinsuresh reviewed Nov 15, 2024

Bravo555 self-assigned this Nov 15, 2024

Bravo555 temporarily deployed to Test Pull Request November 15, 2024 10:49 — with GitHub Actions Inactive

Bravo555 had a problem deploying to Test Auto November 15, 2024 10:56 — with GitHub Actions Failure

Bravo555 temporarily deployed to Test Pull Request November 15, 2024 14:42 — with GitHub Actions Inactive

Bravo555 removed their assignment Nov 15, 2024

rina23q approved these changes Nov 15, 2024

Bravo555 had a problem deploying to Test Auto November 18, 2024 00:12 — with GitHub Actions Failure

albinsuresh approved these changes Nov 18, 2024

Bravo555 force-pushed the fix/3234/tedge-mapper-c8y-panics-on-startup branch from a313601 to e4e1d22 Compare November 18, 2024 08:50

Bravo555 temporarily deployed to Test Pull Request November 18, 2024 08:50 — with GitHub Actions Inactive

Bravo555 had a problem deploying to Test Auto November 18, 2024 08:56 — with GitHub Actions Failure

Bravo555 had a problem deploying to Test Auto November 18, 2024 09:16 — with GitHub Actions Failure

Bravo555 added 3 commits November 19, 2024 10:49

Don't read entire c8y operations dir when registering a single operation

c25966d

Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>

Bravo555 force-pushed the fix/3234/tedge-mapper-c8y-panics-on-startup branch from e4e1d22 to d66f2ca Compare November 19, 2024 09:49

Bravo555 temporarily deployed to Test Pull Request November 19, 2024 09:49 — with GitHub Actions Inactive

Bravo555 temporarily deployed to Test Auto November 19, 2024 09:55 — with GitHub Actions Inactive

Bravo555 added this pull request to the merge queue Nov 19, 2024

Merged via the queue into thin-edge:main with commit 944dce0 Nov 19, 2024

Bravo555 deleted the fix/3234/tedge-mapper-c8y-panics-on-startup branch November 19, 2024 11:48

Bravo555 mentioned this pull request Nov 21, 2024

refactor: Operations struct refactor #3256

Open

7 tasks
