Skip to content

Conversation

@e-sumin
Copy link
Contributor

@e-sumin e-sumin commented Feb 18, 2024

When restoring huge file(s), the progress reporting is done in a bit weird way:

kopia_test % kopia snapshot restore ka2084d263182164b6cf3456668e6b6da /Users/eugen.sumin/kopia_test/2
Restoring to local filesystem (/Users/eugen.sumin/kopia_test/2) with parallelism=8...
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Restored 5 files, 1 directories and 0 symbolic links (5.4 GB).

In fact, the amount of restored data is dumped when particular file completely restored.

This PR contains the least invasive change, which allows us to see progress update while file is downloaded from object storage.

Restoring to local filesystem (/Users/eugen.sumin/kopia_test/55) with parallelism=8...
Processed 2 (3.1 MB) of 5 (1.8 GB).
Processed 4 (459.6 MB) of 5 (1.8 GB) 270.3 MB/s (25.2%) remaining 4s.
Processed 4 (468.7 MB) of 5 (1.8 GB) 269 MB/s (25.7%) remaining 4s.
Processed 4 (741.6 MB) of 5 (1.8 GB) 269 MB/s (40.6%) remaining 3s.
Processed 4 (1.1 GB) of 5 (1.8 GB) 280 MB/s (57.6%) remaining 2s.
Processed 5 (1.4 GB) of 5 (1.8 GB) 291.1 MB/s (75.2%) remaining 1s.
Processed 5 (1.4 GB) of 5 (1.8 GB) 289.8 MB/s (75.6%) remaining 1s.
Processed 5 (1.6 GB) of 5 (1.8 GB) 270.2 MB/s (85.3%) remaining 0s.
Processed 5 (1.7 GB) of 5 (1.8 GB) 256.3 MB/s (95.0%) remaining 0s.
Processed 6 (1.8 GB) of 5 (1.8 GB) 251 MB/s (100.0%) remaining 0s.
Processed 6 (1.8 GB) of 5 (1.8 GB) 251 MB/s (100.0%) remaining 0s.
Restored 5 files, 1 directories and 0 symbolic links (1.8 GB).

@e-sumin
Copy link
Contributor Author

e-sumin commented Feb 18, 2024

@KastenMike @julio-lopez could you please review ?


// FileReadingProgressCallback is a callback intended to be used during file copying
// to report amount of data sent to destination.
type FileReadingProgressCallback func(chunkSize int64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to change object reader at all, can't we just emit the progress as we're reading the bytes during restore?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does not seem used anywhere, the rest looks good

@codecov
Copy link

codecov bot commented Feb 19, 2024

Codecov Report

Attention: Patch coverage is 83.20000% with 21 lines in your changes are missing coverage. Please review.

Project coverage is 77.18%. Comparing base (cb455c6) to head (3b49d9a).
Report is 133 commits behind head on master.

Files Patch % Lines
cli/cli_progress.go 77.58% 8 Missing and 5 partials ⚠️
snapshot/restore/local_fs_output.go 63.15% 7 Missing ⚠️
snapshot/restore/restore.go 95.23% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3655      +/-   ##
==========================================
+ Coverage   75.86%   77.18%   +1.31%     
==========================================
  Files         470      479       +9     
  Lines       37301    28756    -8545     
==========================================
- Hits        28299    22194    -6105     
+ Misses       7071     4660    -2411     
+ Partials     1931     1902      -29     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

}
defer r.Close() //nolint:errcheck

type withProgressTracking interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's easier to wrap "r" so that it reports bytes written in the callback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the idea, I've changed the code, is it what you meant ?

@e-sumin
Copy link
Contributor Author

e-sumin commented Mar 4, 2024

@jkowalski do you think the changes could be merged ? Or something else has to be done before ?

@e-sumin
Copy link
Contributor Author

e-sumin commented Mar 11, 2024

@jkowalski do you think the changes could be merged ? Or something else has to be done before ?

Don't want to be annoying, just a friendly reminder :)

@denisvmedia
Copy link
Collaborator

@jkowalski @redgoat650 @julio-lopez could you please check this PR?

}

return write(targetPath, r, f.Size(), o.copier)
return write(targetPath, wr, f.Size(), o.copier)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@e-sumin Question about the progress reporting using the reader tracking mechanism:

What is the best way to evaluate progress, as in the definition of done on processing X bytes ? When we have read X bytes from the reader does that mean we are done writing X bytes to the target ? In other words, would it be possible to get in a scenario where we have read 100% of the data from repository and are still stuck writing the final block due to a network issue (NFS) or similar ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, but this way of progress update was chosen because it is less intrusive. Also, when reading, usually, the block is very small in comparison with total amount of data. So, we, anyway, can get 100% earlier than last block will be read. So, I'd not consider this as critical flaw.

bytesWritten := int64(0)
progressCallback := func(chunkSize int64) {
bytesWritten += chunkSize
c.stats.RestoredTotalFileSize.Add(chunkSize)
Copy link
Collaborator

@Shrekster Shrekster Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit; this callback is hit each time we have read chunkSize from the repository, not necessarily "bytes written". Is that correct ? This is w.r.t. to the my other comment as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct. This progress calculation way treats bytes read as bytes written.

@Shrekster
Copy link
Collaborator

Shrekster commented Apr 4, 2024

@e-sumin and I discussed this over a call and he explained why the reader based approach is correct / simpler for now. I also agree with his comments above.

@e-sumin is looking into adding unit-test coverage with a custom test-only callback plumbing to make it non-flaky.

@e-sumin
Copy link
Contributor Author

e-sumin commented May 6, 2024

I did some refactoring of snapshot restore:

  • progress reporting interval is taken from command line, as it was done for upload progress reporting
    • currently upload progress instance is keeping progress flag and interval and restore progress just utilizes its value (this was originally discussed with @Shrekster as less invasive thing)
    • cli progress refactoring should be done (but probably it would be better to have it as a separate PR). We need to extract flag and interval and throttling mechanism to reused part and create progress instance depending on operation (upload or restore)
      • this also will allow to write separate tests for progress reporting format and progress reporting intervals
  • probably it would be better to pass restore progress interface to copier and increase counters separately when needed instead of keeping another copy of stats in copier and passing its values to progress reporter.

Some of changes I've done was done to be able to have an unit test which verifies that progress is reported properly.

@e-sumin e-sumin requested a review from Shrekster May 6, 2024 10:08
@e-sumin
Copy link
Contributor Author

e-sumin commented May 9, 2024

I've fixed the test and merged master into my branch. The tests should pass now.

@Shrekster Shrekster merged commit 2b92388 into kopia:master May 10, 2024

// Progress is invoked by copier to report status of snapshot restoration.
type Progress interface {
SetCounters(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too late to comment here, but I believe having an args struct would be better.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@denisvmedia In the context of how this function is called, it does make sense to use a struct as an argument, more specifically restore.Stats. That would allow simply passing the struct as is, and would make the call just slightly simpler.

However, this pattern of using a struct to pass a bunch of arguments should be avoided in general, as it essentially makes all the arguments "optional". This becomes error prone as it makes it easy to miss setting a value and simply passing 0 (the default value).

Copy link
Collaborator

@julio-lopez julio-lopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@e-sumin see couple of minor comments


// Progress is invoked by copier to report status of snapshot restoration.
type Progress interface {
SetCounters(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@denisvmedia In the context of how this function is called, it does make sense to use a struct as an argument, more specifically restore.Stats. That would allow simply passing the struct as is, and would make the call just slightly simpler.

However, this pattern of using a struct to pass a bunch of arguments should be avoided in general, as it essentially makes all the arguments "optional". This becomes error prone as it makes it easy to miss setting a value and simply passing 0 (the default value).

julio-lopez added a commit that referenced this pull request Aug 27, 2024
Followups to #3655

* wrap fs.Reader
* nit: remove unnecessary intermediate variable
* nit: rename local variable
* cleanup: move restore.Progress interface to cli pkg
* move cliRestoreProgress to a separate file
* refactor(general): replace switch with if/else for clarity
  Removes a tautology for `err == nil`, which was guaranteed
  to be true in the second case statement for the switch.
  Replacing the switch statement with and if/else block is clearer.
* initialize restoreProgress in restore command
* fix: use error.Wrapf with format string and args


Simplify SetCounters signature:

Pass arguments in a `restore.Stats` struct.
  `SetCounters(s restore.Stats)`
Simplifies call sites and implementation.
In this case it makes sense to pass all the values
using the restore.Stats struct as it simplifies
the calls.
However, this pattern should be avoided in general
as it essentially makes all the arguments "optional".
This makes it easy to miss setting a value and simply
passing 0 (the default value), thus it becomes error
prone.
In this particular case, the struct is being passed
through verbatim, thus eliminating the risk of
missing a value, at least in the current state of
the code.
Lyndon-Li pushed a commit to project-velero/kopia that referenced this pull request Aug 28, 2024
When restoring huge file(s), the progress reporting is done in a bit
weird way:

```
kopia_test % kopia snapshot restore ka2084d263182164b6cf3456668e6b6da /Users/eugen.sumin/kopia_test/2
Restoring to local filesystem (/Users/eugen.sumin/kopia_test/2) with parallelism=8...
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Restored 5 files, 1 directories and 0 symbolic links (5.4 GB).
```
In fact, the amount of restored data is dumped when particular file
completely restored.

This PR contains the least invasive change, which allows us to see
progress update while file is downloaded from object storage.
```
Restoring to local filesystem (/Users/eugen.sumin/kopia_test/55) with parallelism=8...
Processed 2 (3.1 MB) of 5 (1.8 GB).
Processed 4 (459.6 MB) of 5 (1.8 GB) 270.3 MB/s (25.2%) remaining 4s.
Processed 4 (468.7 MB) of 5 (1.8 GB) 269 MB/s (25.7%) remaining 4s.
Processed 4 (741.6 MB) of 5 (1.8 GB) 269 MB/s (40.6%) remaining 3s.
Processed 4 (1.1 GB) of 5 (1.8 GB) 280 MB/s (57.6%) remaining 2s.
Processed 5 (1.4 GB) of 5 (1.8 GB) 291.1 MB/s (75.2%) remaining 1s.
Processed 5 (1.4 GB) of 5 (1.8 GB) 289.8 MB/s (75.6%) remaining 1s.
Processed 5 (1.6 GB) of 5 (1.8 GB) 270.2 MB/s (85.3%) remaining 0s.
Processed 5 (1.7 GB) of 5 (1.8 GB) 256.3 MB/s (95.0%) remaining 0s.
Processed 6 (1.8 GB) of 5 (1.8 GB) 251 MB/s (100.0%) remaining 0s.
Processed 6 (1.8 GB) of 5 (1.8 GB) 251 MB/s (100.0%) remaining 0s.
Restored 5 files, 1 directories and 0 symbolic links (1.8 GB).
```

---------

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
Lyndon-Li pushed a commit to Lyndon-Li/kopia that referenced this pull request Aug 28, 2024
When restoring huge file(s), the progress reporting is done in a bit
weird way:

```
kopia_test % kopia snapshot restore ka2084d263182164b6cf3456668e6b6da /Users/eugen.sumin/kopia_test/2
Restoring to local filesystem (/Users/eugen.sumin/kopia_test/2) with parallelism=8...
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Restored 5 files, 1 directories and 0 symbolic links (5.4 GB).
```
In fact, the amount of restored data is dumped when particular file
completely restored.

This PR contains the least invasive change, which allows us to see
progress update while file is downloaded from object storage.
```
Restoring to local filesystem (/Users/eugen.sumin/kopia_test/55) with parallelism=8...
Processed 2 (3.1 MB) of 5 (1.8 GB).
Processed 4 (459.6 MB) of 5 (1.8 GB) 270.3 MB/s (25.2%) remaining 4s.
Processed 4 (468.7 MB) of 5 (1.8 GB) 269 MB/s (25.7%) remaining 4s.
Processed 4 (741.6 MB) of 5 (1.8 GB) 269 MB/s (40.6%) remaining 3s.
Processed 4 (1.1 GB) of 5 (1.8 GB) 280 MB/s (57.6%) remaining 2s.
Processed 5 (1.4 GB) of 5 (1.8 GB) 291.1 MB/s (75.2%) remaining 1s.
Processed 5 (1.4 GB) of 5 (1.8 GB) 289.8 MB/s (75.6%) remaining 1s.
Processed 5 (1.6 GB) of 5 (1.8 GB) 270.2 MB/s (85.3%) remaining 0s.
Processed 5 (1.7 GB) of 5 (1.8 GB) 256.3 MB/s (95.0%) remaining 0s.
Processed 6 (1.8 GB) of 5 (1.8 GB) 251 MB/s (100.0%) remaining 0s.
Processed 6 (1.8 GB) of 5 (1.8 GB) 251 MB/s (100.0%) remaining 0s.
Restored 5 files, 1 directories and 0 symbolic links (1.8 GB).
```

---------

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
Lyndon-Li pushed a commit to Lyndon-Li/kopia that referenced this pull request Aug 28, 2024
When restoring huge file(s), the progress reporting is done in a bit
weird way:

```
kopia_test % kopia snapshot restore ka2084d263182164b6cf3456668e6b6da /Users/eugen.sumin/kopia_test/2
Restoring to local filesystem (/Users/eugen.sumin/kopia_test/2) with parallelism=8...
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Restored 5 files, 1 directories and 0 symbolic links (5.4 GB).
```
In fact, the amount of restored data is dumped when particular file
completely restored.

This PR contains the least invasive change, which allows us to see
progress update while file is downloaded from object storage.
```
Restoring to local filesystem (/Users/eugen.sumin/kopia_test/55) with parallelism=8...
Processed 2 (3.1 MB) of 5 (1.8 GB).
Processed 4 (459.6 MB) of 5 (1.8 GB) 270.3 MB/s (25.2%) remaining 4s.
Processed 4 (468.7 MB) of 5 (1.8 GB) 269 MB/s (25.7%) remaining 4s.
Processed 4 (741.6 MB) of 5 (1.8 GB) 269 MB/s (40.6%) remaining 3s.
Processed 4 (1.1 GB) of 5 (1.8 GB) 280 MB/s (57.6%) remaining 2s.
Processed 5 (1.4 GB) of 5 (1.8 GB) 291.1 MB/s (75.2%) remaining 1s.
Processed 5 (1.4 GB) of 5 (1.8 GB) 289.8 MB/s (75.6%) remaining 1s.
Processed 5 (1.6 GB) of 5 (1.8 GB) 270.2 MB/s (85.3%) remaining 0s.
Processed 5 (1.7 GB) of 5 (1.8 GB) 256.3 MB/s (95.0%) remaining 0s.
Processed 6 (1.8 GB) of 5 (1.8 GB) 251 MB/s (100.0%) remaining 0s.
Processed 6 (1.8 GB) of 5 (1.8 GB) 251 MB/s (100.0%) remaining 0s.
Restored 5 files, 1 directories and 0 symbolic links (1.8 GB).
```

---------

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
Lyndon-Li pushed a commit to Lyndon-Li/kopia that referenced this pull request Oct 10, 2024
Followups to kopia#3655

* wrap fs.Reader
* nit: remove unnecessary intermediate variable
* nit: rename local variable
* cleanup: move restore.Progress interface to cli pkg
* move cliRestoreProgress to a separate file
* refactor(general): replace switch with if/else for clarity
  Removes a tautology for `err == nil`, which was guaranteed
  to be true in the second case statement for the switch.
  Replacing the switch statement with and if/else block is clearer.
* initialize restoreProgress in restore command
* fix: use error.Wrapf with format string and args


Simplify SetCounters signature:

Pass arguments in a `restore.Stats` struct.
  `SetCounters(s restore.Stats)`
Simplifies call sites and implementation.
In this case it makes sense to pass all the values
using the restore.Stats struct as it simplifies
the calls.
However, this pattern should be avoided in general
as it essentially makes all the arguments "optional".
This makes it easy to miss setting a value and simply
passing 0 (the default value), thus it becomes error
prone.
In this particular case, the struct is being passed
through verbatim, thus eliminating the risk of
missing a value, at least in the current state of
the code.
Lyndon-Li pushed a commit to Lyndon-Li/kopia that referenced this pull request Oct 16, 2024
Followups to kopia#3655

* wrap fs.Reader
* nit: remove unnecessary intermediate variable
* nit: rename local variable
* cleanup: move restore.Progress interface to cli pkg
* move cliRestoreProgress to a separate file
* refactor(general): replace switch with if/else for clarity
  Removes a tautology for `err == nil`, which was guaranteed
  to be true in the second case statement for the switch.
  Replacing the switch statement with and if/else block is clearer.
* initialize restoreProgress in restore command
* fix: use error.Wrapf with format string and args


Simplify SetCounters signature:

Pass arguments in a `restore.Stats` struct.
  `SetCounters(s restore.Stats)`
Simplifies call sites and implementation.
In this case it makes sense to pass all the values
using the restore.Stats struct as it simplifies
the calls.
However, this pattern should be avoided in general
as it essentially makes all the arguments "optional".
This makes it easy to miss setting a value and simply
passing 0 (the default value), thus it becomes error
prone.
In this particular case, the struct is being passed
through verbatim, thus eliminating the risk of
missing a value, at least in the current state of
the code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants