Move all Snapshot Master Node Steps to SnapshotsService (#56365) #59373

Merged
original-brownbear merged 1 commit into elastic:7.x from original-brownbear:56365-7.x
Jul 12, 2020

@original-brownbear
Contributor

This refactoring has three motivations:

  1. Separate the master node steps of snapshot operations from the data node steps in the code.
  2. Prepare for concurrent repository operations and general improvements by centralizing the tracking of each shard's state in the repository in `SnapshotsService`, so that operations for each shard can be linearized efficiently, i.e. without inspecting the full snapshot state for all shards on every cluster state (CS) update. This allows tracking more in memory and falling back to inspecting the full cluster state only on master failover, as the snapshot shards service does.
    • This PR already contains some best-effort examples of this. They could still be improved considerably, but that is left out of this PR to limit its complexity.
  3. Make `SnapshotsService` less expensive on the cluster state thread for large snapshots.
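
To make the idea in item 2 concrete, here is a minimal sketch of per-shard tracking with linearized operations. It is not the code in this PR: `ShardOpTracker`, its methods, and the string shard and operation names are all hypothetical, and the real `SnapshotsService` works with cluster state entries rather than plain strings. The sketch only shows the shape of the approach: each update touches one shard's queue, and a full scan is needed only when rebuilding, for example after a master failover.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; none of these names exist in Elasticsearch. */
public class ShardOpTracker {
    // Per-shard FIFO of operation names; the head is the operation currently running.
    private final Map<String, Deque<String>> queues = new HashMap<>();

    /** Queues an operation for a shard. Returns true if the shard was idle, so the operation may start now. */
    public boolean enqueue(String shardId, String op) {
        Deque<String> queue = queues.computeIfAbsent(shardId, k -> new ArrayDeque<>());
        queue.addLast(op);
        return queue.size() == 1;
    }

    /** Marks the running operation on a shard as finished. Returns the next operation to start, or null if idle. */
    public String complete(String shardId) {
        Deque<String> queue = queues.get(shardId);
        if (queue == null || queue.isEmpty()) {
            throw new IllegalStateException("no running operation for " + shardId);
        }
        queue.removeFirst();
        if (queue.isEmpty()) {
            queues.remove(shardId); // drop idle shards so the map only holds active ones
            return null;
        }
        return queue.peekFirst();
    }

    /** Returns the operation currently running on a shard, or null if the shard is idle. */
    public String running(String shardId) {
        Deque<String> queue = queues.get(shardId);
        return queue == null ? null : queue.peekFirst();
    }

    /** Fallback path: rebuilds the in-memory view from a full state scan (assumed to happen only on failover). */
    public static ShardOpTracker rebuild(Map<String, List<String>> fullState) {
        ShardOpTracker tracker = new ShardOpTracker();
        fullState.forEach((shardId, ops) -> ops.forEach(op -> tracker.enqueue(shardId, op)));
        return tracker;
    }
}
```

In this sketch, `enqueue` and `complete` cost the same regardless of how many shards are being tracked, which is the property the description is after: only the shard named in an update is looked at.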

backport of #56365

@original-brownbear added the :Distributed/Snapshot/Restore and backport labels Jul 12, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@elasticmachine added the Team:Distributed label Jul 12, 2020
@original-brownbear
Contributor Author

Jenkins run elasticsearch-ci/2 (GCS port infra issue)

@original-brownbear merged commit 4833861 into elastic:7.x Jul 12, 2020
@original-brownbear deleted the 56365-7.x branch July 12, 2020 20:19
Labels

backport, :Distributed/Snapshot/Restore, Team:Distributed
