[UA] Tight worker loop can cause high CPU usage#60950
Merged
jloleysens merged 9 commits into elastic:master on Mar 31, 2020
Pinging @elastic/es-ui (Team:Elasticsearch UI)
joshdover
reviewed
Mar 23, 2020
x-pack/plugins/upgrade_assistant/server/lib/reindexing/worker.ts
The worker scheduler should only sleep when it cannot process any in-progress operations. Additionally, logic has been added to handle operations that have been in the queue for a long time and may still be viewed as being within a small window of time by workers that do not have the credentials to process those reindex operations.
@elasticmachine merge upstream
sebelga
approved these changes
Mar 31, 2020
LGTM! Tested locally and works as expected.
) {
  // TODO: This tight loop needs something to relax potentially high CPU demands so this padding is added.
  // This scheduler should be revisited in future.
  await new Promise(res => setTimeout(res, WORKER_PADDING_MS));
nit: Do you mind using resolve instead of res? I first read it as response (which I always shorten to res! 😄)
💚 Build Succeeded
jloleysens
added a commit
to jloleysens/kibana
that referenced
this pull request
Mar 31, 2020
* Added worker padding to save some CPU
* Updated comments
* Update worker scheduler and add a new util

  The worker scheduler should only sleep when it cannot process any in-progress operations. Additionally, logic has been added to handle operations that have been in the queue for a long time and may still be viewed as being within a small window of time by workers that do not have the credentials to process those reindex operations.
* res 👉🏻resolve

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
gmmorris
added a commit
to gmmorris/kibana
that referenced
this pull request
Mar 31, 2020
* upstream/master: (69 commits) Adding PagerDuty icon to connectors cards (elastic#60805) Fix drag and drop flakiness (elastic#61993) Grok debugger migration (elastic#60658) Endpoint: Fix resolver SVG position issue (elastic#61886) [SIEM] version 7.7 rule import (elastic#61903) Added styles to make combobox list items wider for alerting flyout (elastic#61894) [UA] Tight worker loop can cause high CPU usage (elastic#60950) [ML] DF Analytics results table: use index pattern field format if one exists (elastic#61709) [ML] Catching unknown index pattern errors (elastic#61935) [Discover] Deangularize and euificate sidebar (elastic#47559) Endpoint: Add ts-node dev dependency (elastic#61884) Add an onBlur handler for the kuery bar. Only resubmit when input changes. (elastic#61901) [ML] Handle Empty Partition Field Values in Single Metric Viewer (elastic#61649) Auto interval on date histogram is getting displayed as timestamp per… (elastic#59171) [Maps] Explicitly pass fetch function to ems-client (elastic#61846) [SIEM][CASE] Fix aria-labels and translations (elastic#61670) [ML] Settings: Increase number of items that can be paged in calendars and filters lists (elastic#61842) [EPM] update epm filepath route (elastic#61910) APM] Set ignore_above to 1024 for telemetry saved object (elastic#61732) [Logs UI] Log stream row rendering (elastic#60773) ...
jloleysens
added a commit
that referenced
this pull request
Apr 1, 2020
* Added worker padding to save some CPU
* Updated comments
* Update worker scheduler and add a new util

  The worker scheduler should only sleep when it cannot process any in-progress operations. Additionally, logic has been added to handle operations that have been in the queue for a long time and may still be viewed as being within a small window of time by workers that do not have the credentials to process those reindex operations.
* res 👉🏻resolve

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
gmmorris
added a commit
to gmmorris/kibana
that referenced
this pull request
Apr 1, 2020
* master: (64 commits) Adding PagerDuty icon to connectors cards (elastic#60805) Fix drag and drop flakiness (elastic#61993) Grok debugger migration (elastic#60658) Endpoint: Fix resolver SVG position issue (elastic#61886) [SIEM] version 7.7 rule import (elastic#61903) Added styles to make combobox list items wider for alerting flyout (elastic#61894) [UA] Tight worker loop can cause high CPU usage (elastic#60950) [ML] DF Analytics results table: use index pattern field format if one exists (elastic#61709) [ML] Catching unknown index pattern errors (elastic#61935) [Discover] Deangularize and euificate sidebar (elastic#47559) Endpoint: Add ts-node dev dependency (elastic#61884) Add an onBlur handler for the kuery bar. Only resubmit when input changes. (elastic#61901) [ML] Handle Empty Partition Field Values in Single Metric Viewer (elastic#61649) Auto interval on date histogram is getting displayed as timestamp per… (elastic#59171) [Maps] Explicitly pass fetch function to ems-client (elastic#61846) [SIEM][CASE] Fix aria-labels and translations (elastic#61670) [ML] Settings: Increase number of items that can be paged in calendars and filters lists (elastic#61842) [EPM] update epm filepath route (elastic#61910) APM] Set ignore_above to 1024 for telemetry saved object (elastic#61732) [Logs UI] Log stream row rendering (elastic#60773) ...
Summary
In Upgrade Assistant, when there are multiple Kibana instances sharing an ES cluster, the worker loop can consume a lot of CPU under certain conditions.
How to reproduce on master
In x-pack/plugins/upgrade_assistant/server/routes/reindex_indices/reindex_handler.ts, comment out the line that reads credentialStore.set(reindexOp, headers);. This simulates a Kibana instance that does not have the user credentials required to advance the reindex operation, which is the key to triggering the performance bug.
Solution
The simplest solution was to add some padding to the worker loop in the form of a simulated sleep.
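As a rough illustration of the padding idea, the sketch below shows a polling loop that only sleeps when it finds nothing it can process. It is not the code from worker.ts: the Operation shape, the runWorkerLoop function, its callback parameters, and the 1000 ms value given to WORKER_PADDING_MS are all assumptions made for this example.

```typescript
// Assumed value for illustration; the value used in the PR is not shown in this conversation.
const WORKER_PADDING_MS = 1000;

// Simulated sleep: resolves after `ms` milliseconds without blocking the event loop.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical shape of an in-progress operation.
interface Operation {
  id: string;
  // True when this instance holds the credentials needed to advance the operation.
  canProcess: boolean;
}

// Hypothetical loop: process what we can, and pad with a sleep only when
// there is nothing this instance is able to work on.
async function runWorkerLoop(
  fetchInProgress: () => Promise<Operation[]>,
  processOp: (op: Operation) => Promise<void>,
  shouldContinue: () => boolean,
  paddingMs: number = WORKER_PADDING_MS
): Promise<number> {
  let sleeps = 0;
  while (shouldContinue()) {
    const ops = await fetchInProgress();
    const processable = ops.filter((op) => op.canProcess);
    if (processable.length === 0) {
      // Without this padding the loop would spin and burn CPU.
      await sleep(paddingMs);
      sleeps++;
      continue;
    }
    for (const op of processable) {
      await processOp(op);
    }
  }
  // Returned only so the behaviour is easy to observe in a test.
  return sleeps;
}
```

With the reproduction above (no credentials stored), every fetched operation would have canProcess set to false, so each iteration of such a loop would end in a sleep rather than an immediate retry.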
Additional
There was also a (small) potential issue with queued items that could still be seen as stale (see #60770). Workers that lack the credentials to update a reindex op now double-check queued operations.
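One way to picture the double check is a small age-based predicate that any worker can run against queued operations, whether or not it holds credentials. This is only a sketch under assumptions: the QueuedOp shape, the function names, and the 60-second threshold are hypothetical and are not taken from the util added in this PR.

```typescript
// Assumed threshold for illustration only.
const QUEUE_STALE_AFTER_MS = 60 * 1000;

// Hypothetical shape of a queued reindex operation.
interface QueuedOp {
  id: string;
  queuedAt: number; // epoch milliseconds at which the op entered the queue
}

// True when the op has been waiting in the queue longer than the threshold.
function isQueuedOpStale(
  op: QueuedOp,
  now: number = Date.now(),
  staleAfterMs: number = QUEUE_STALE_AFTER_MS
): boolean {
  return now - op.queuedAt > staleAfterMs;
}

// Returns the queued ops that should be looked at again.
function findStaleQueuedOps(
  ops: QueuedOp[],
  now: number = Date.now(),
  staleAfterMs: number = QUEUE_STALE_AFTER_MS
): QueuedOp[] {
  return ops.filter((op) => isQueuedOpStale(op, now, staleAfterMs));
}
```

Passing now explicitly keeps the check deterministic, which makes it straightforward to exercise with fixed timestamps.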