Skip to content

[Upgrade Assistant] Update logic for handling reindex failures#124571

Merged
alisonelizabeth merged 5 commits intoelastic:7.17from
alisonelizabeth:ua/reindex_error_handling
Feb 10, 2022
Merged

[Upgrade Assistant] Update logic for handling reindex failures#124571
alisonelizabeth merged 5 commits intoelastic:7.17from
alisonelizabeth:ua/reindex_error_handling

Conversation

@alisonelizabeth
Copy link
Copy Markdown
Contributor

@alisonelizabeth alisonelizabeth commented Feb 3, 2022

Fixes #124153

The Upgrade Assistant was incorrectly marking a reindexing process as a failure. This PR adjusts the logic for determining a reindex failure to only check for the existence of failures in the task API response.

It also fixes a minor spacing issue when rendering an error callout.

How to test

This is a hard one to add automated tests for (open to suggestions!), as it involves restarting Kibana to reproduce.

I was able to reproduce the bug (pre change) fairly consistently using the manual steps documented in #123817 (comment). With this change, the reindexing process should no longer fail.

To trigger an actual reindexing failure (post change):

  1. Start up 6.x Elasticsearch and create some indices that contain a property with a text type. For example:
PUT /test/_doc/1
{
  "field": "my_string"
}

Alternatively, you can use this zip file, which already contains some indices: 6.8.16-data.zip

  1. Start Elasticsearch with the 6.x snapshot created in step 1: yarn es snapshot --license=trial -E path.data=./path_to_6.x
  2. Hack the UA code so that it creates the destination index with a bad mapping. This will trigger a reindex failure:
      createIndex = await esClient.indices.create({
        index: newIndexName,
        body: {
          settings,
          mappings: {
            properties: {
              field: { type: 'date' },
            },
          },
        },
      });
  1. Start Kibana and navigate to Stack Management -> Upgrade Assistant -> ES deprecations
  2. Start reindexing one of the indices. Note reindexing should fail and trigger error callout.

Screenshots

Before/after of spacing issue.

Before:
Upgrade-Assistant-Elastic

After:
Screen Shot 2022-02-03 at 1 32 48 PM

@alisonelizabeth alisonelizabeth added release_note:fix Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more t// Feature:Upgrade Assistant v7.17.1 labels Feb 3, 2022
body: { count },
} = await esClient.count({ index: reindexOp.attributes.indexName });

if (taskResponse.task.status!.created < count) {
Copy link
Copy Markdown
Contributor Author

@alisonelizabeth alisonelizabeth Feb 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a reliable check, as the documents could end up in an updated state. This could occur if Kibana is restarted and a user resumes the reindex process in Upgrade Assistant, which actually kicks off the reindex again. As the code is here, we would actually throw an error when there was not.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alisonelizabeth
Copy link
Copy Markdown
Contributor Author

@elasticmachine merge upstream

@alisonelizabeth alisonelizabeth marked this pull request as ready for review February 7, 2022 16:02
@elasticmachine
Copy link
Copy Markdown
Contributor

Pinging @elastic/platform-deployment-management (Team:Deployment Management)

@alisonelizabeth
Copy link
Copy Markdown
Contributor Author

@elasticmachine merge upstream

@alisonelizabeth
Copy link
Copy Markdown
Contributor Author

@elasticmachine merge upstream

@kibana-ci
Copy link
Copy Markdown

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Copy Markdown
Contributor

@sebelga sebelga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM! I tried to reproduce locally the issue reverting your changes but could not get the error to be thrown. I tried both stopping Kibana at the beginning of the reindexing and in the middle of the 3rd step ("Reindex documents"). In both cases UA resumed and completed the reindex.

The code change make sense to me though. But it would be great to get to the bottom as exactly when this issue occurs. Could it be ES related?

@alisonelizabeth
Copy link
Copy Markdown
Contributor Author

Thanks for the review @sebelga!

The code change make sense to me though. But it would be great to get to the bottom as exactly when this issue occurs. Could it be ES related?

The issue is likely occurring because we are actually kicking off another reindex operation when the user clicks "resume" in the UI so the documents may have already been created, and end up in an updated state in the ES task response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Feature:Upgrade Assistant release_note:fix Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more t// v7.17.1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants