Improve task manager warning log messaging when all workers are used #54920

@peterschretlen

Description

From #54697 (comment)

The message from TM that gets logged when all the workers are in use seems ... not great.

[Task Ownership]: Task Manager has skipped Claiming Ownership \
of available tasks at it has ran out Available Workers. \
If this happens often, consider adjusting the \
"xpack.task_manager.max_workers" configuration
  • it's very long
  • it's annoying enough that customers will raise max_workers, which can cause a thundering herd problem, as SIEM has seen

I don't really have a great list of concrete alternatives, but here are some ideas:

  • have TM print some basic stats in one line, every minute, if any tasks have run in that minute; start basic - total number of executions, failures, timeouts, etc
  • have TM print some basic stats in one line, every threshold number of executions (100, 1000, not sure)

I think I like the first, with the interval made configurable. The default might be 10 minutes; for dev I'd want to set it to 1 minute, or maybe even 30 seconds.
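
To make the two ideas concrete, here is a rough sketch of what such a stats line could look like. Everything in it is hypothetical: `TaskStatsLogger`, `TaskOutcome`, the `flushEveryExecutions` option, and the output format are made up for illustration and are not existing Task Manager code or settings. It counts executions, failures, and timeouts, and emits a single line only if something ran since the last flush.

```typescript
// Hypothetical sketch only: none of these names exist in Task Manager.
type TaskOutcome = 'success' | 'failure' | 'timeout';

interface TaskStats {
  executions: number;
  failures: number;
  timeouts: number;
}

class TaskStatsLogger {
  private stats: TaskStats = { executions: 0, failures: 0, timeouts: 0 };

  constructor(
    // Where the one-line summary goes (e.g. a logger's info method).
    private readonly log: (line: string) => void,
    // Second idea: also flush after this many executions. Off by default.
    private readonly flushEveryExecutions: number = Infinity
  ) {}

  // Record one finished task run.
  record(outcome: TaskOutcome): void {
    this.stats.executions += 1;
    if (outcome === 'failure') this.stats.failures += 1;
    if (outcome === 'timeout') this.stats.timeouts += 1;
    if (this.stats.executions >= this.flushEveryExecutions) this.flush();
  }

  // Emit one line if anything ran since the last flush, then reset.
  flush(): void {
    const { executions, failures, timeouts } = this.stats;
    if (executions === 0) return; // stay quiet when nothing ran
    this.log(
      `[Task Manager]: executions=${executions} failures=${failures} timeouts=${timeouts}`
    );
    this.stats = { executions: 0, failures: 0, timeouts: 0 };
  }

  // First idea: flush on a timer. Returns a function that stops the timer.
  start(intervalMs: number): () => void {
    const timer = setInterval(() => this.flush(), intervalMs);
    return () => clearInterval(timer);
  }
}
```

With this shape, the 10 minute default would be something like `const stop = stats.start(10 * 60 * 1000)`, and dev could pass `60 * 1000` or `30 * 1000` instead. Keeping `flush()` separate from the timer means the same class covers both the time-based and the count-based idea.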

Metadata

Labels

Feature:Alerting, Team:ResponseOps
