Skip to content

fix: use FTS for log_upload and config_snapshot operations#2440

Merged
Bravo555 merged 2 commits intothin-edge:mainfrom
Bravo555:fix/2390/decouple-mapper-c8y-from-agent
Nov 14, 2023
Merged

fix: use FTS for log_upload and config_snapshot operations#2440
Bravo555 merged 2 commits intothin-edge:mainfrom
Bravo555:fix/2390/decouple-mapper-c8y-from-agent

Conversation

@Bravo555
Copy link
Copy Markdown
Member

@Bravo555 Bravo555 commented Nov 10, 2023

TODO

  • fix blocking using a downloader actor
  • add a test which confirms that download no longer blocks will not be done because it's too complex for the current architecture @reubenmiller for questions

Proposed changes

Makes tedge-mapper-c8y use the File Transfer Service (FTS) when processing log_upload or config_snapshot operations.

Previously tedge-mapper-c8y implicitly assumed that it's installed on the same device as the tedge-agent and tried to access the file-transfer directory directly. Now it correctly downloads the files in the operation's tedgeUrl parameter via HTTP.

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
  • Documentation Update (if none of the other choices apply)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue

#2390

Checklist

  • I have read the CONTRIBUTING doc
  • I have signed the CLA (in all commits with git commit -s)
  • I ran cargo fmt as mentioned in CODING_GUIDELINES
  • I used cargo clippy as mentioned in CODING_GUIDELINES
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)

Further comments

The 1st commit introduces the FTS download operation by using the Downloader directly, fixing the bug.

The 2nd commit in this PR removes the use of Downloader inside log_upload and config_snapshot handlers to use DownloaderActor instead.

The advantage of this is that downloading from the FTS no longer blocks and other such operations can be processed in the meantime.

I will argue, however, that this also had negative effects on code complexity and maintainability. Whereas previously the download was in a single place and one could trace the control flow easily, now the state with maintaining the state of the download is spread very thin across many different functions and variables.

The effect of the current architecture is that for every complex operation where multiple steps are required, for every transition step1 -> step2 (e.g. before the download and after the download) one has to create:

  • a piece of state where we await the response from the actor that performs the effect associated with the transition, e.g. pending_fts_download_operations for DownloaderActor
  • a function step1 which starts the transition to step2 by sending a message request
  • a function step2 which gets the result response
  • a function which listens to response from the actor, and inspects the response to dispatch it to correct function which performs the next step

I believe that Rust's async/await was created to solve, among others, exactly this problem, by transforming every async function into a state machine which automatically handles transistions between .await points which block.

@Bravo555 Bravo555 force-pushed the fix/2390/decouple-mapper-c8y-from-agent branch from a3f27ad to 329f6b5 Compare November 10, 2023 11:27
@codecov
Copy link
Copy Markdown

codecov bot commented Nov 10, 2023

Codecov Report

Merging #2440 (3d9c6d9) into main (87cd1a2) will decrease coverage by 0.1%.
The diff coverage is 75.8%.

❗ Current head 3d9c6d9 differs from pull request most recent head 752bb22. Consider uploading reports for the commit 752bb22 to get more accurate results

Additional details and impacted files
Files Coverage Δ
crates/extensions/c8y_mapper_ext/src/converter.rs 81.1% <100.0%> (+<0.1%) ⬆️
crates/extensions/c8y_mapper_ext/src/error.rs 0.0% <ø> (ø)
crates/extensions/c8y_mapper_ext/src/tests.rs 91.6% <100.0%> (+<0.1%) ⬆️
crates/extensions/c8y_mapper_ext/src/config.rs 46.3% <50.0%> (+0.1%) ⬆️
crates/extensions/c8y_mapper_ext/src/actor.rs 77.5% <77.4%> (+0.7%) ⬆️
...extensions/c8y_mapper_ext/src/config_operations.rs 86.7% <75.6%> (-1.5%) ⬇️
crates/extensions/c8y_mapper_ext/src/log_upload.rs 87.7% <75.0%> (-2.7%) ⬇️

... and 7 files with indirect coverage changes

@Bravo555 Bravo555 temporarily deployed to Test Pull Request November 10, 2023 11:40 — with GitHub Actions Inactive
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Nov 10, 2023

Robot Results

✅ Passed ❌ Failed ⏭️ Skipped Total Pass % ⏱️ Duration
359 0 3 359 100 56m43.825s

@reubenmiller reubenmiller added this to the 0.13.1 milestone Nov 10, 2023
@Bravo555 Bravo555 marked this pull request as ready for review November 10, 2023 14:36
@Bravo555 Bravo555 temporarily deployed to Test Pull Request November 10, 2023 14:40 — with GitHub Actions Inactive
@Bravo555 Bravo555 force-pushed the fix/2390/decouple-mapper-c8y-from-agent branch from 793a3bf to c7849f5 Compare November 10, 2023 15:12
@Bravo555 Bravo555 force-pushed the fix/2390/decouple-mapper-c8y-from-agent branch 2 times, most recently from e0bb75f to ac7dcd6 Compare November 10, 2023 15:23
@Bravo555 Bravo555 changed the title fix: use FTS for necessary operations fix: use FTS for log_upload and config_snapshot operations Nov 10, 2023
@Bravo555 Bravo555 temporarily deployed to Test Pull Request November 10, 2023 18:43 — with GitHub Actions Inactive
@Bravo555 Bravo555 force-pushed the fix/2390/decouple-mapper-c8y-from-agent branch from ac7dcd6 to be9fa00 Compare November 13, 2023 09:10
@Bravo555 Bravo555 self-assigned this Nov 13, 2023
@Bravo555 Bravo555 temporarily deployed to Test Pull Request November 13, 2023 10:40 — with GitHub Actions Inactive
@reubenmiller reubenmiller removed this from the 0.13.1 milestone Nov 13, 2023
Copy link
Copy Markdown
Member

@rina23q rina23q left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. The most important for me is where tedge-mapper-c8y should keep the files by FTS.

@didier-wenzek
Copy link
Copy Markdown
Contributor

I will argue, however, that this also had negative effects on code complexity and maintainability. Whereas previously the download was in a single place and one could trace the control flow easily, now the state with maintaining the state of the download is spread very thin across many different functions and variables.

I will not argue the contrary. The c8y mapper was already complex and the current design adds complexity.
Not only there is some complexity at the functional level, but also this complexity is made worse - as you highlight well - by this idea to persist the current state of any operation that must be resumed later when a response is received.
To be concrete, for a log upload request, the mapper has to forward the request to the child device, to await its response, to download the child-device log excerpts from the file-transfer service then upload these log excerpts to c8y. Athe code level, this leads to 3 states that have to be materialized, stored, handled, cleaned ("request sent to the child", "fts download request sent" and "c8y upload request sent".

I believe that Rust's async/await was created to solve, among others, exactly this problem, by transforming every async function into a state machine which automatically handles transistions between .await points which block.

This is correct up to the point one needs to share some mutable state. With async/await, one has a nice view of the sequence of steps (sent a request to a child device, await for its response, download the log excerpt from fts, upload this excerpt to c8y). And tasks can be spawned in the background to handle such sequences concurrently ... as long as there is no mutable shared state.

Do we have such mutable shared state here? Requests are sent over senders, which are Clone and hence not an issue. However to receive responses such a task needs to own its receiver. Why not? No straightforward to quickly implement now given the current state of the c8y mapper, however I see no blocker.

Can code clarity be improved without a deep refactoring? I see some improvements, but I would be happy to merge this PR as is, to meet the 0.13.1 release, and address improvements later.

  1. The code is currently scattered. The pending_fts_download_operations cache is defined and created in converter.rs, populated in config_operations and log_upload.rs and finally cleared in actor.rs.
  2. Sending a request to the downloader (or the uploader) works in pair with caching the request. Receiving a response and resuming a pending download (upload) operation also works in pair. Can this be abstracted by some wrapper that cache and send requests as well as received and resume responses with their context?

@Bravo555
Copy link
Copy Markdown
Member Author

Can code clarity be improved without a deep refactoring? I see some improvements, but I would be happy to merge this PR as is, to meet the 0.13.1 release, and address improvements later.

That was my intention as well, to merge the PR as it is now, since it contains crucial fixes, and look at addressing this complexity later, when we're less pressed for 1.0. But on my side I can say that this spreading of the state has slowed me down noticeably, so I feel this is a bit of a tech debt and will need to be paid back later.

@reubenmiller reubenmiller modified the milestone: 0.13.1 Nov 13, 2023
/// Used to denote as type of what operation was the file downloaded from the FTS.
///
/// Used to dispatch download result to the correct operation handler.
pub enum FtsDownloadOperationType {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could just reuse CumulocitySupportedOperations instead of this new struct.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By using a different enum, I wanted to underline that these are the only operations in which we need to download from the FTS. By having a new enum with only these 2 variants, I don't have to do a catch-all _ => {} in matches, so that when a new variant is possibly added, the compiler will automatically highlight all the matches where this variant is not handled.

Comment on lines +178 to +181
pub url: String,

// used to automatically remove the temporary file after operation is finished
pub destination_dir: tempfile::TempDir,
Copy link
Copy Markdown
Contributor

@albinsuresh albinsuresh Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be we could store the DownloadRequest itself here instead of the url and destination_dir separately.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree, except that DownloadRequest contains a plain PathBuf and here we have tempdile::TempDir, which is a temporary directory that is automatically freed on destruction. I've chosed to use this instead of manually removing the parent directory download of the file as then we don't have to remember to do it in the download result handler, but but it's still not ideal.

How to handle is a bit tricky because not all operations need to create a temporary directory, so we don't want to automatically create one for every operation, and I'd rather use something which is automatically removed on destruction, so we don't forget. The most awkward thing is storing it between different different stages of the operation (handle download, handle upload, etc.), because we don't use it for the data, we use it for the behaviour.

clear_cmd_topic: message.topic.clone(),
c8y_binary_url: binary_upload_event_url.to_string(),
operation: CumulocitySupportedOperations::C8yLogFileRequest,
event_response_id,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no point creating/sending the event before this download. So, it's better to move that to handle_fts_log_download right before the binary_upload_event_url is created.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 5e1124f.

@Bravo555 Bravo555 temporarily deployed to Test Pull Request November 14, 2023 11:48 — with GitHub Actions Inactive
@Bravo555 Bravo555 temporarily deployed to Test Pull Request November 14, 2023 11:58 — with GitHub Actions Inactive
@Bravo555 Bravo555 temporarily deployed to Test Pull Request November 14, 2023 14:04 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@didier-wenzek didier-wenzek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved.

The code is more complex than one might wish - but this is not due to this PR which only adds on complexity already here. To make things simpler we have to rethink the way the mapper interact with the file transfer service. This design work will be done in the context of firmware management.

@Bravo555 Bravo555 temporarily deployed to Test Pull Request November 14, 2023 14:49 — with GitHub Actions Inactive
Copy link
Copy Markdown
Contributor

@albinsuresh albinsuresh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Makes tedge-mapper-c8y use the File Transfer Service (FTS) when
processing `log_upload` or `config_snapshot` operations.

Previously tedge-mapper-c8y implicitly assumed that it's installed on
the same device as the tedge-agent and tried to access the file-transfer
directory directly. Now it correctly downloads the files in the
operation's `tedgeUrl` parameter via HTTP.

This implementation directly uses downloader and waits until the
download is done, which means it will block if the download takes a long
time.
This commit changes the use of `Downloader` inside `log_upload` and
`config_snapshot` handlers to use `DownloaderActor` instead.

The advantage of this is that downloading from the FTS no longer blocks
and other such operations can be processed in the meantime.

I will argue, however, that this also had negative effects on code
complexity and maintainability. Whereas previously the download was in a
single place and one could trace the control flow easily, now the state
with maintaining the state of the download is spread very thin across
many different functions and variables.

The effect of the current architecture is that for every complex
operation where multiple steps are required, for every transition `step1
-> step2` (e.g. before the download and after the download) one has to
create:

- a piece of state where we await the response from the actor that
  performs the effect associated with the transition, e.g.
  `pending_fts_download_operations` for `DownloaderActor`
- a function `step1` which starts the transition to `step2` by sending
  a message `request`
- a function `step2` which gets the result `response`
- a function which listens to `response` from the actor, and inspects
  the response to dispatch it to correct function which performs the
  next step

I believe that Rust's async/await was created to solve, among others,
exactly this problem, by transforming every async function into a state
machine which automatically handles transistions between `.await` points
which block.
@Bravo555 Bravo555 force-pushed the fix/2390/decouple-mapper-c8y-from-agent branch from 3d9c6d9 to 752bb22 Compare November 14, 2023 15:02
@Bravo555 Bravo555 temporarily deployed to Test Pull Request November 14, 2023 15:27 — with GitHub Actions Inactive
@Bravo555 Bravo555 merged commit 2278afe into thin-edge:main Nov 14, 2023
@Bravo555 Bravo555 deleted the fix/2390/decouple-mapper-c8y-from-agent branch November 14, 2023 15:47
@reubenmiller reubenmiller added this to the 0.13.1 milestone Nov 14, 2023
@Bravo555 Bravo555 removed their assignment Dec 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants