
Make GetTrainedModelsStatsAction cancellable #87931

@DaveCTurner


I encountered a Cloud cluster with an overworked master due (partly) to processing multiple calls to GET /_ml/anomaly_detectors/_all/_stats originating from an external Metricbeat monitoring process. Metricbeat imposes a 10s timeout after which it closes the HTTP connection and tries again. However, GetTrainedModelsStatsAction does not notice if the client connection closes (i.e. the REST handler does not use RestCancellableNodeClient and the resulting transport task is not a CancellableTask) so it carries on wastefully processing the request even after the client timeout.
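To illustrate the behaviour being asked for, here is a minimal, self-contained sketch of cooperative cancellation. It does not use the Elasticsearch APIs: `SketchTask`, `SketchCancelledException` and `collectStats` are hypothetical names invented for this example, and the `afterEachItem` hook only simulates the client closing its connection partway through. In the real fix the cancellation signal would presumably come from `RestCancellableNodeClient` cancelling a `CancellableTask`, as mentioned above.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class CancellableStatsSketch {

    // Hypothetical stand-in for a cancellable task; not an Elasticsearch class.
    static final class SketchTask {
        private final AtomicBoolean cancelled = new AtomicBoolean(false);

        void cancel() {
            cancelled.set(true);
        }

        boolean isCancelled() {
            return cancelled.get();
        }
    }

    // Signals that work stopped early and records how many items were finished.
    static final class SketchCancelledException extends RuntimeException {
        final int processedBeforeCancel;

        SketchCancelledException(int processedBeforeCancel) {
            super("cancelled after " + processedBeforeCancel + " item(s)");
            this.processedBeforeCancel = processedBeforeCancel;
        }
    }

    // Processes each id in turn, checking the cancellation flag before every
    // unit of work so that an abandoned request stops consuming resources.
    static int collectStats(List<String> ids, SketchTask task, Runnable afterEachItem) {
        int processed = 0;
        for (String id : ids) {
            if (task.isCancelled()) {
                throw new SketchCancelledException(processed);
            }
            processed++; // placeholder for the real per-item stats work
            afterEachItem.run(); // test hook: simulates external events
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("a", "b", "c");

        // Client stays connected: every item is processed.
        SketchTask completed = new SketchTask();
        System.out.println("processed: " + collectStats(ids, completed, () -> {}));

        // Client "disconnects" after the first item: remaining work is skipped.
        SketchTask abandoned = new SketchTask();
        try {
            collectStats(ids, abandoned, abandoned::cancel);
        } catch (SketchCancelledException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only that the work loop has to consult the task's cancelled state at regular checkpoints; without such checks, cancelling the task on connection close would have no effect on the work already in flight.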

Relates #55550

Labels

:ml, >bug, Team:ML
