[RAC][Rule Registry] UI/UX around timeouts and errors during index bootstrapping #111170

@banderror

Parent ticket: #101016

Summary

Background: #108115 (comment)

During index bootstrapping, two kinds of situations related to network conditions can occur:

  • Timeouts, for example when the network or the ES cluster is under load. Currently we have a 20-minute timeout for installing the common ES resources shared between all indices, plus a 20-minute timeout for installing index-specific resources for each index separately (e.g. for .alerts-security.alerts). That is 40 minutes in total.

    • During these 20-40 minutes, rules will be blocked while attempting to write alerts and will hang. In the Rule Management table in Security this will look like a "going to run" status, with no logs or other messages.
  • Errors, like network errors or errors from Elasticsearch.

    • In this case, errors will be re-thrown as exceptions; the rule status will change to "failed" and there will be some Kibana logs available.
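
To make the two cases above concrete, here is a minimal TypeScript sketch of how each installation stage could report a distinct outcome instead of only hanging or throwing. All names here (`bootstrapIndex`, `raceWithTimeout`, `BootstrapOutcome`, the `'common'` / `'index'` stage labels) are hypothetical and are not the rule registry's actual API. The sketch also assumes that giving up on a stage only stops waiting for it; it does not cancel the underlying request.

```typescript
// Hypothetical sketch, not the rule registry's real code.
type Stage = 'common' | 'index';

type BootstrapOutcome =
  | { status: 'installed' }
  | { status: 'timed_out'; stage: Stage; waitedMs: number }
  | { status: 'failed'; stage: Stage; error: Error };

// Sentinel used to tell "timed out" apart from a real result or an error.
const TIMED_OUT = Symbol('timed_out');

// Resolves with the task's value, or with TIMED_OUT once `ms` has elapsed.
// Rejections from the task are passed through unchanged.
function raceWithTimeout<T>(task: Promise<T>, ms: number): Promise<T | typeof TIMED_OUT> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve(TIMED_OUT), ms);
    task.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// Runs the common stage, then the index-specific stage, each with its own timeout.
// The installer callbacks stand in for whatever actually talks to Elasticsearch.
async function bootstrapIndex(
  installCommon: () => Promise<void>,
  installIndexSpecific: () => Promise<void>,
  timeoutMs: number
): Promise<BootstrapOutcome> {
  const stages: Array<[Stage, () => Promise<void>]> = [
    ['common', installCommon],
    ['index', installIndexSpecific],
  ];
  for (const [stage, install] of stages) {
    try {
      const result = await raceWithTimeout(install(), timeoutMs);
      if (result === TIMED_OUT) {
        return { status: 'timed_out', stage, waitedMs: timeoutMs };
      }
    } catch (e) {
      const error = e instanceof Error ? e : new Error(String(e));
      return { status: 'failed', stage, error };
    }
  }
  return { status: 'installed' };
}
```

With an outcome value like this, a caller could log or display a timeout differently from an Elasticsearch error, for example by showing which stage was still pending rather than leaving the rule in "going to run" with no message. Whether that is the right UX is exactly the open question of this issue.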

Do we need to build a better UX around that?

Labels: Team:ResponseOps, Theme: rac
