[7.x] Resilient saved object migration algorithm (#78413) #86284

Merged: rudolf merged 4 commits into elastic:7.x from rudolf:backport/7.x/pr-78413 on Dec 18, 2020

@rudolf commented on Dec 17, 2020

Backports the following commits to 7.x:

* Initial structure of migration state-action machine

* Fix type import

* Retries with exponential backoff

* Use discriminated union for state type

* Either type for actions

* Test exponential retries

* TaskEither types for actions

* Fetch indices instead of aliases so we can collect all index state in one request

* Log document id if transform fails

* WIP: Legacy pre-migrations

* UPDATE_TARGET_MAPPINGS

* WIP OUTDATED_DOCUMENTS_TRANSFORM

* Narrow res types depending on control state

* OUTDATED_DOCUMENTS_TRANSFORM

* Use .kibana instead of .kibana_current

* rename control states TARGET_DOCUMENTS* -> OUTDATED_DOCUMENTS*

* WIP MARK_VERSION_INDEX_READY

* Fix and expand INIT -> * transition tests

* Add alias/index name helper functions

* Add feature flag for enabling v2 migrations

* split state_action_machine, reindex legacy indices

* Don't use a scroll search for migrating outdated documents

* model: test control state progressions

* Action integration tests

* Fix existing tests and type errors

* snapshot_in_progress_exception can only happen when closing/deleting an index

* Retry steps up to 10 times

* Update api.md documentation files

* Further actions integration tests

* Action unit tests

* Fix actions integration tests

* Rename actions to be more domain-specific

* Apply suggestions from code review

Co-authored-by: Josh Dover <me@joshdover.com>

* Review feedback: polish and flesh out inline comments

* Fix unhandled rejections in actions unit tests

* model: only delay retryable_es_client_error, reset for other left responses

* Actions unit tests

* More inline comments

* Actions: Group index settings under 'index' key

* bulkIndex -> bulkOverwriteTransformedDocuments to be more domain specific

* state_action_machine tests, fix and add additional tests

* Action integration tests: updateAndPickupMappings, searchForOutdatedDocuments

* oops: uncomment commented out code

* actions integration tests: rejection for createIndex

* update state properties: clearer names, mark all as readonly

* add state properties currentAlias, versionAlias, legacyIndex and test for invalid version scheme in index names

* Use CONSTANTS for constants :D

* Actions: Clarify behaviour and impact of acknowledged: false responses

* Use consistent vocabulary for action responses

* KibanaMigrator test for migrationsV2

* KibanaMigrator test for FATAL state and action exceptions in v2 migrations

* Fix ts error in test

* Refactor: split index file up into a file per model, next, types

* next: use partial application so we don't generate a nextActionMap on every call

* move logic from index.ts to migrations_state_action_machine.ts and test

* add test

* use `Root` to allow specifying oss mode

* Add fix and todo tests for reindexing with preMigrationScript

* Dump execution log of state transitions and responses if we hit FATAL

* add 7.3 xpack tests

* add 100k test data

* Reindex instead of cloning for migrations

* Skip 100k x-pack integration test

* MARK_VERSION_INDEX_READY_CONFLICT for dealing with different versions migrating in parallel

* Track elapsed time

* Fix tests

* Model: make exhaustiveness checks more explicit

* actions integration tests: add additional tests from CR

* migrations_state_action_machine fix flaky test

* Fix flaky integration test

* Reserve FATAL termination for situations we can never recover from, such as a later version having already migrated the index

* Handle incompatible_mapping_exception caused by another instance

* Cleanup logging

* Fix/stabilize integration tests

* Add REINDEX_SOURCE_TO_TARGET_VERIFY step

* Strip tests archives of */.DS_Store and __MAC_OSX

* Task manager migrations: remove invalid kibana property when converting legacy indices

* Add disabled mappings for removed field in map saved object type

* verifyReindex action: use count API

* REINDEX_BLOCK_* to prevent lost deletes (needs tests)

* Split out 100k docs integration test so that it has its own kibana process

* REINDEX_BLOCK_* action tests

* REINDEX_BLOCK_* model tests

* Include original error message when migration_state_machine throws

* Address some CR nits

* Fix TS errors

* Fix bugs

* Reindex then clone to prevent lost deletes

* Fix tests

Co-authored-by: Josh Dover <me@joshdover.com>
Co-authored-by: pgayvallet <pierre.gayvallet@elastic.co>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
# Conflicts:
#	rfcs/text/0013_saved_object_migrations.md
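
Several of the commits above (discriminated union for the state type, Either types for actions, exponential retries, "Retry steps up to 10 times", explicit exhaustiveness checks) describe a pure model function that takes the current state plus an action response and returns the next state. What follows is a minimal sketch of that idea, not the code from this PR: the transition order, the 1 second base delay, the 64 second cap, and every helper name (`ActionResponse`, `delayRetry`, `simulate`, and so on) are assumptions made for illustration.

```typescript
// Hypothetical, simplified sketch: names echo the commit messages above,
// but none of this is copied from Kibana's source.

// Minimal Either: Left carries a failure, Right carries a success.
type Either<L, R> =
  | { readonly _tag: 'Left'; readonly left: L }
  | { readonly _tag: 'Right'; readonly right: R };

interface RetryableError {
  readonly type: 'retryable_es_client_error';
  readonly message: string;
}
type ActionResponse = Either<RetryableError, 'ok'>;

const retryableError = (message: string): ActionResponse => ({
  _tag: 'Left',
  left: { type: 'retryable_es_client_error', message },
});
const okResponse: ActionResponse = { _tag: 'Right', right: 'ok' };

// Discriminated union: `controlState` is the tag, and only FATAL has a reason.
interface RetryFields {
  readonly retryCount: number;
  readonly retryDelay: number; // milliseconds to wait before the next attempt
}
type State =
  | (RetryFields & { readonly controlState: 'INIT' })
  | (RetryFields & { readonly controlState: 'UPDATE_TARGET_MAPPINGS' })
  | (RetryFields & { readonly controlState: 'DONE' })
  | (RetryFields & { readonly controlState: 'FATAL'; readonly reason: string });

const MAX_RETRY_ATTEMPTS = 10; // from "Retry steps up to 10 times"
const MAX_RETRY_DELAY_MS = 64000; // assumed cap, not stated in the PR

function assertNever(x: never): never {
  throw new Error(`Unhandled control state: ${JSON.stringify(x)}`);
}

// Stay in the same control state, doubling the delay, until retries run out.
function delayRetry(state: State, err: RetryableError): State {
  if (state.retryCount >= MAX_RETRY_ATTEMPTS) {
    return {
      controlState: 'FATAL',
      reason: `${state.controlState} failed after ${MAX_RETRY_ATTEMPTS} retries: ${err.message}`,
      retryCount: state.retryCount,
      retryDelay: 0,
    };
  }
  return {
    ...state,
    retryCount: state.retryCount + 1,
    retryDelay: Math.min(1000 * 2 ** state.retryCount, MAX_RETRY_DELAY_MS),
  };
}

// Pure model: (state, response) -> next state. Success resets the retry fields.
function model(state: State, res: ActionResponse): State {
  switch (state.controlState) {
    case 'INIT':
      if (res._tag === 'Left') return delayRetry(state, res.left);
      return { controlState: 'UPDATE_TARGET_MAPPINGS', retryCount: 0, retryDelay: 0 };
    case 'UPDATE_TARGET_MAPPINGS':
      if (res._tag === 'Left') return delayRetry(state, res.left);
      return { controlState: 'DONE', retryCount: 0, retryDelay: 0 };
    case 'DONE':
    case 'FATAL':
      return state; // terminal states ignore further responses
    default:
      return assertNever(state); // compile error if a control state is added but not handled
  }
}

// Fold a list of canned responses through the model (no real I/O or sleeping).
function simulate(initial: State, responses: ActionResponse[]): State {
  return responses.reduce<State>((state, res) => model(state, res), initial);
}

const start: State = { controlState: 'INIT', retryCount: 0, retryDelay: 0 };
const fail = retryableError('connection timeout');
const finalState = simulate(start, [fail, fail, okResponse, okResponse]);
console.log(finalState.controlState); // prints "DONE"
```

Keeping the model free of side effects is what makes control state progressions easy to unit test: a real driver would run the action for the current state, wait `retryDelay` milliseconds if it is non-zero, and feed the response back into `model`.
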

@rudolf added the `backport` label (This PR is a backport of another PR) on Dec 17, 2020

* Attempt to stabilize cloneIndex integration tests

* Unskip test

* return resolves/rejects and add assertions counts to each test

* Await, don't return, expect promises

* Await, don't return, expect promises for other tests too
@kibanamachine

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| triggersActionsUi | 1.5MB | 1.5MB | -26.9KB |

Distributable file count

| id | before | after | diff |
| --- | --- | --- | --- |
| default | 47558 | 48333 | +775 |
| oss | 27818 | 28137 | +319 |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| triggersActionsUi | 162.0KB | 162.1KB | +102.0B |

Unknown metric groups

async chunk count

| id | before | after | diff |
| --- | --- | --- | --- |
| triggersActionsUi | 31 | 32 | +1 |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@rudolf merged commit 6a8726b into elastic:7.x on Dec 18, 2020

@rudolf deleted the backport/7.x/pr-78413 branch on December 18, 2020 22:30