Task Manager Workload aggregator goes into a retry loop when index is yellow #82928

@gmmorris

When the .kibana_task_manager index goes yellow, Task Manager's WorkloadAggregator enters a failure loop, retrying every few milliseconds.
It should instead retry only at the next monitored_aggregated_stats_refresh_rate interval.

server    log   [00:16:43.235] [error][elasticsearch][taskManager] [search_phase_execution_exception]: all shards failed
server    log   [00:16:43.235] [error][plugins][taskManager] [WorkloadAggregator]: ResponseError: search_phase_execution_exception
server    log   [00:16:43.238] [error][elasticsearch][taskManager] [search_phase_execution_exception]: all shards failed
server    log   [00:16:43.238] [error][plugins][taskManager] [WorkloadAggregator]: ResponseError: search_phase_execution_exception
server    log   [00:16:43.240] [error][elasticsearch][taskManager] [search_phase_execution_exception]: all shards failed
server    log   [00:16:43.243] [error][plugins][taskManager] [WorkloadAggregator]: ResponseError: search_phase_execution_exception
server    log   [00:16:43.245] [error][elasticsearch][taskManager] [search_phase_execution_exception]: all shards failed
server    log   [00:16:43.245] [error][plugins][taskManager] [WorkloadAggregator]: ResponseError: search_phase_execution_exception
server    log   [00:16:43.248] [error][elasticsearch][taskManager] [search_phase_execution_exception]: all shards failed
server    log   [00:16:43.248] [error][plugins][taskManager] [WorkloadAggregator]: ResponseError: search_phase_execution_exception
server    log   [00:16:43.250] [error][elasticsearch][taskManager] [search_phase_execution_exception]: all shards failed
server    log   [00:16:43.250] [error][plugins][taskManager] [WorkloadAggregator]: ResponseError: search_phase_execution_exception

To reproduce, raise the number of replicas on the .kibana_task_manager index using the index settings API, for example:

PUT /.kibana_task_manager/_settings
{
  "index" : {
    "number_of_replicas" : 2
  }
}
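
For illustration, the expected behavior (a failed aggregation waits for the next refresh interval instead of retrying immediately) could be sketched roughly as follows. This is not Task Manager's actual code: `startIntervalPoller`, `PollerOptions`, and the injectable `schedule` function are hypothetical names, and `refreshRateMs` merely stands in for monitored_aggregated_stats_refresh_rate.

```typescript
// Hypothetical sketch; none of these names come from the Kibana codebase.
type Schedule = (fn: () => void, delayMs: number) => void;

interface PollerOptions<T> {
  refreshRateMs: number; // stands in for monitored_aggregated_stats_refresh_rate
  fetch: () => Promise<T>; // e.g. the workload aggregation query
  onResult: (result: T) => void;
  onError: (error: Error) => void;
  schedule?: Schedule; // injectable for tests; defaults to setTimeout
}

// Starts polling and returns a function that stops it.
function startIntervalPoller<T>(opts: PollerOptions<T>): () => void {
  const schedule: Schedule =
    opts.schedule ?? ((fn, ms) => { setTimeout(fn, ms); });
  let stopped = false;

  const tick = async (): Promise<void> => {
    if (stopped) return;
    try {
      opts.onResult(await opts.fetch());
    } catch (e) {
      // Report the failure, but do NOT retry here.
      opts.onError(e instanceof Error ? e : new Error(String(e)));
    } finally {
      // Success or failure, the next attempt waits one full refresh interval.
      if (!stopped) schedule(() => { void tick(); }, opts.refreshRateMs);
    }
  };

  void tick();
  return () => { stopped = true; };
}
```

The key design choice is that rescheduling happens in one place (`finally`), so an error such as search_phase_execution_exception is logged once per interval rather than triggering a new attempt as soon as the previous one fails.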

Labels: Feature:Task Manager, Team:ResponseOps, bug
