refactor: Simplify the operation handling code by replacing fragmented control flow with regular async/await#2904
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files
|
4f53e99 to
df72f91
Compare
df72f91 to
faecc49
Compare
faecc49 to
b51176e
Compare
Robot Results
|
|
Was a regression introduced by the PR, fixed. |
I would be surprised by that. The tests are green on other PRs and the errors are related to this PR's changes. |
ee20ab7 to
80b918b
Compare
albinsuresh
left a comment
There was a problem hiding this comment.
Great effort. It was nice to see the OperationHandler abstraction with only the stuff that really needs to be shared for operation handling. And I really liked that subtle trick you used with the Entity abstraction, eliminating the need for sharing the entity_store as well.
I'm still a bit confused to see the locking required around the proxy. I was under the impression that it wouldn't be needed anymore. But still, it's good to see that the locking is limited to the upload path and it doesn't influence the processing of any other parallel requests.
crates/extensions/c8y_mapper_ext/src/operations/config_snapshot.rs
Outdated
Show resolved
Hide resolved
crates/extensions/c8y_mapper_ext/src/operations/firmware_update.rs
Outdated
Show resolved
Hide resolved
didier-wenzek
left a comment
There was a problem hiding this comment.
I will be really happy to merge this PR!
However, I see this work as half-done:
- All the operations must be handled by the
OperationHandler. - This is an opportunity to move the code for
Restart,SoftwareListandSoftwareUpdateinto their own modules underoperations.
albinsuresh
left a comment
There was a problem hiding this comment.
A giant leap in terms of streamlining the bloated responsibilities of the converter. The code looks really good and consider this my stamp of approval of the refactoring.
Yes, there are still some minor niggles in how the clear messages are a handled etc. But, to avoid blocking this PR for much longer, I'm happy to approve this PR in its current state, as the improvements made far outweighs such minor niggles. Will give the "formal" approval once Didier's comments are also addressed.
Agree. The benefits are great and I will be happy to merge this PR despite being not fully happy with the design. See #2904 (comment).
I'm afraid my comments are not straightforward to address:
Hence, I'm happy to address these pending comments in a following PR. |
didier-wenzek
left a comment
There was a problem hiding this comment.
Approved. Thank you for this sustained effort to untangle the c8y converter. A great leap forward.
albinsuresh
left a comment
There was a problem hiding this comment.
There's some test failure, but looks unrelated.
To be more sure about future changes to the c8y mapper and the converter, we need tests checking that operations happen concurrently. This change introduces a test that only covers config_snapshot operation, but it expectedly fails when regressions were introduced by the previous commits in this PR. Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
Creates a new `OperationHandler` struct with methods that handle operations. This is done to both pull stuff out of `CumulocityConverter`, which is too large and does too many things, and to make it possible to handle operations concurrently. The current design of the `CumulocityConverter` is that all the incoming MQTT messages go into the converter's `convert()` method, which converts in both directions between messages from the cloud and local MQTT broker, and also needs to return a converted message without blocking. In contrast, when handling operations, it can take some time before an MQTT response is ready to be sent. As such, it makes sense to spawn a task when starting the operation, and have the converter return no messages, so other messages can be converted. Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
To make sure `OperationHandler` can be used concurrently, its handler methods were made `&self`. Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
Before the change, `pending_upload_operations` and `pending_fts_download_operations` hashmaps lived inside `CumulocityConverter`, so the main actor would need to access them via the converter. After the change, the hashmaps were put behind `Arc<Mutex<>>`, so the main actors would be able to access them without going through the converter. Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
Created a new struct `OperationContext` from the fields of `OperationHandler` that were used by operations tasks. Thanks to that: - mutex on the `running_operations` field was removed - `OperationContext` no longer needs to be behind `Arc` to be useful, but uses `Arc`s internally - data flow in `RunningOperation::spawn` is clearer - `OperationContext` is private, so constructing `OperationHandler` is easier, because we don't need to pass every field, instead we just take what we need from `C8yMapperConfig` Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
d0674c1 to
fe3d76a
Compare
TODO
add more tests to see which operations are blocked/executed concurrentlyaddedlog_upload, no other operations downloading from the FTS (from the perspective of the mapper)remove remaining mutexes/locksreplaced with ordinary sender clonesclean up operation handling codewill be done in next PRFollow-up tasks in: #2880 (comment)
Proposed changes
This PR is a 2nd attempt at simplifying operation control flow. I found that the top-down approach in #2881 of spawning a task for every main actor message would be too tedious to get working, because to remove the lock on the converter would require to change
convertmethod to&self, which at this moment is infeasible because of how many things it does/how many other functions it calls.Instead, the approach taken here was a bottom-up one:
convertmethod's assumptions:convert()needs to convert MQTT messages in both directions without blockingCumulocityConverterand placed in a newly createdOperationHandler. The operation handler was later made totokio::spawnthe operation handlers. This separates the scope of operation handling from the converter, and makes it so the converter can immediately return.pending_upload_operationsandpending_fts_download_operationshashmaps were removed.In contrast to the previous attempt, the stated aims, i.e.: simplifying the control flow without impacting the ability to execute multiple operations concurrently, appear to have been achieved.
cont. in Further comments
Types of changes
Paste Link to the issue
Checklist
cargo fmtas mentioned in CODING_GUIDELINEScargo clippyas mentioned in CODING_GUIDELINESFurther comments
Caveats
Testing and correctness
There was a test created, but it only tests
config_snapshotandlog_uploadcommand, and it only tests if it's concurrent by checking if the download requests were created. Ideally we'd spawn operations of all types and check if they're progressing equally, but the only view into their execution is messages they send to other actors, but some don't do this at all.https://github.com/Bravo555/thin-edge.io/blob/4f53e990e160924aec0931ba1afe771f9301195f/crates/extensions/c8y_mapper_ext/src/tests.rs#L2421-L2469