Skip to content

filed: parallel compression/reading/sending#1533

Merged
BareosBot merged 31 commits intobareos:masterfrom
sebsura:dev/ssura/master/parallel-send
Nov 6, 2023
Merged

filed: parallel compression/reading/sending#1533
BareosBot merged 31 commits intobareos:masterfrom
sebsura:dev/ssura/master/parallel-send

Conversation

@sebsura
Copy link
Contributor

@sebsura sebsura commented Aug 22, 2023

Thank you for contributing to the Bareos Project!

Instead of serially doing read chunk -> compress chunk -> send chunk we instead try to do as much as possible in parallel.

Please check

  • Short description and the purpose of this PR is present above this paragraph
  • Your name is present in the AUTHORS file (optional)

If you have any questions or problems, please give a comment in the PR.

Helpful documentation and best practices

Checklist for the reviewer of the PR (will be processed by the Bareos team)

Make sure you check/merge the PR using devtools/pr-tool to have some simple automated checks run and a proper changelog record added.

General
  • Is the PR title usable as CHANGELOG entry?
  • Purpose of the PR is understood
  • Commit descriptions are understandable and well formatted
  • Check backport line
  • Required backport PRs have been created
Source code quality
  • Source code changes are understandable
  • Variable and function names are meaningful
  • Code comments are correct (logically and spelling)
  • Required documentation changes are present and part of the PR
Tests
  • Decision taken that a test is required (if not, then remove this paragraph)
  • The choice of the type of test (unit test or systemtest) is reasonable
  • Testname matches exactly what is being tested
  • On a fail, output of the test leads quickly to the origin of the fault

@sebsura sebsura force-pushed the dev/ssura/master/parallel-send branch from 8ea29bc to b8bffe5 Compare August 29, 2023 06:20
arogge pushed a commit to bareos-testing/bareos that referenced this pull request Aug 29, 2023
If leftover snapshots exist from previous backups, they are now cleaned
up before taking a new snapshot. Additionally, the snapshot cleanup
after backup now works more reliable by reconnecting to the vCenter
server if necessary. Code refactored to consequently use methods from
pyVim.task.

- fix a bug when folder=/ is used.

Fixes: bareos#1533: Restoring vmware vms keeps failing, can't restore the data

- Fix restore when bootOrder was set

When bootOrder was set to a virtualDisk at backup time, ensure the
restore works by setting the bootOrder now after disks were added to
the recreated VM.

- Fix restore when virtualCdrom path is invalid

Restore fails when recreating a VM and the backing file of a virtual
CDROM is inaccessible, this can occur when it was connected to an ISO
file on a NFS datastore which is not accessible at restore time.
This change will then create a disconnected virtual CDROM to allow
the restore to finish successfully.

Co-authored-by: Philipp Storz <philipp.storz@bareos.com>
sebsura pushed a commit to sebsura/bareos that referenced this pull request Aug 31, 2023
If leftover snapshots exist from previous backups, they are now cleaned
up before taking a new snapshot. Additionally, the snapshot cleanup
after backup now works more reliable by reconnecting to the vCenter
server if necessary. Code refactored to consequently use methods from
pyVim.task.

- fix a bug when folder=/ is used.

Fixes: bareos#1533: Restoring vmware vms keeps failing, can't restore the data

- Fix restore when bootOrder was set

When bootOrder was set to a virtualDisk at backup time, ensure the
restore works by setting the bootOrder now after disks were added to
the recreated VM.

- Fix restore when virtualCdrom path is invalid

Restore fails when recreating a VM and the backing file of a virtual
CDROM is inaccessible, this can occur when it was connected to an ISO
file on a NFS datastore which is not accessible at restore time.
This change will then create a disconnected virtual CDROM to allow
the restore to finish successfully.

Co-authored-by: Philipp Storz <philipp.storz@bareos.com>
(cherry picked from commit 0e7f740)
@sebsura sebsura force-pushed the dev/ssura/master/parallel-send branch from b8bffe5 to 88b37ef Compare September 5, 2023 07:53
@sebsura sebsura force-pushed the dev/ssura/master/parallel-send branch 3 times, most recently from 1704007 to 70e2ab5 Compare September 18, 2023 05:34
Copy link
Member

@arogge arogge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few remarks already. However, I guess I don't fully understand everything yet. So maybe some of the remarks are a bit "dull"...

@sebsura sebsura force-pushed the dev/ssura/master/parallel-send branch from 70e2ab5 to 6b0f8ad Compare September 22, 2023 09:54
@sebsura sebsura force-pushed the dev/ssura/master/parallel-send branch from c808c61 to ac057cf Compare October 9, 2023 07:20
@sebsura sebsura requested a review from arogge October 10, 2023 07:12
@sebsura sebsura force-pushed the dev/ssura/master/parallel-send branch 2 times, most recently from 5ee59b7 to 2f2ebda Compare October 16, 2023 07:20
Copy link
Member

@arogge arogge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty complicated, but great!
I have some minor remarks, am still not happy with the parameter naming and you have an endianess issue.

@sebsura sebsura force-pushed the dev/ssura/master/parallel-send branch 2 times, most recently from fea9cc3 to 1383a99 Compare October 23, 2023 13:31
@sebsura sebsura requested a review from arogge October 24, 2023 04:40
@sebsura sebsura force-pushed the dev/ssura/master/parallel-send branch from 683699b to 34e6e31 Compare November 6, 2023 06:25
Because the jcr is not fully created after new_jcr (the daemon
specific pointer is still null), it is not 100% safe to register it
since a lot of jcr-chain-walkers expect that pointer to be set when
they walk the chain.
This introduces a new register_jcr function which handles just the
registering so that it can be done after the initialisation is
actually over.
This should make it easier to implement parallelisation.
When we want to to this in parallel we clearly cannot use the buffers
in bctx but have to allocate our own.  As preperation for that we do
that even now in serial mode.
Instead of reading the file into the send buffer directly, we read it
into its own buffer that is shared with every worker thread.  Then we
either compress the file data into the send buffer or just memcpy it
-- in case compression is not enabled -- in a worker thread.
Previously thread_pool took care of two different concerns:
1) Being able to reuse threads for work
2) Sharing threads for concurrent non-blocking computations

To be able to support both of these concerns we needed to add a lot of
complexity to the class.  Splitting it into two, a thread_pool, that
only lets us borrow threads for a while, and a work group, with which
you can share threads to do non-blocking computations, made the code
much easier to understand.
Also adds zero-copy sending to the non-compression path.
@sebsura sebsura force-pushed the dev/ssura/master/parallel-send branch from 34e6e31 to edd1b8e Compare November 6, 2023 07:52
@sebsura sebsura force-pushed the dev/ssura/master/parallel-send branch from 9cd2812 to 24cab2d Compare November 6, 2023 09:48
@BareosBot BareosBot merged commit 0b40c5d into bareos:master Nov 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants