[Alerting] Introduces an ActionSubGroup which allows for more granular action group scheduling #84751

Merged
gmmorris merged 16 commits into elastic:master from gmmorris:alerts/force-schedule
Dec 10, 2020

Contributor

@gmmorris gmmorris commented Dec 2, 2020

Summary

Closes #83980

This PR introduces a new concept, the Action Subgroup (naming is open for discussion), which an Alert Type can use when scheduling actions.
An Action Subgroup can be specified dynamically, unlike Action Groups, which have to be declared on the AlertType definition.
When scheduling actions, an AlertType can specify an Action Subgroup alongside the scheduled Action Group, denoting that the alert instance falls into a narrower grouping within the action group.

For example, given a Contained Action Group that denotes that the instance is contained by something else, the Alert Type can provide a unique identifier for the container.
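
As a rough illustration of the scheduling flow described above, the sketch below models an alert instance that accepts an optional subgroup next to a statically declared action group. The `SketchAlertInstance` class, its method names, the `notContained` group, and the `truck-7` / `region-1` values are assumptions made for illustration; they are not taken from this PR's diff.

```typescript
// Hypothetical, simplified model of an alert instance (not Kibana's real class).
type Context = Record<string, unknown>;

interface ScheduledAction {
  actionGroup: string; // must be one of the groups declared on the alert type
  subgroup?: string; // free-form value chosen at execution time
  context: Context;
}

// Action groups are fixed up front on the alert type definition.
const DECLARED_ACTION_GROUPS: string[] = ['contained', 'notContained'];

class SketchAlertInstance {
  private scheduled?: ScheduledAction;

  // Schedule actions for a declared action group only.
  scheduleActions(actionGroup: string, context: Context = {}): this {
    this.assertDeclared(actionGroup);
    this.scheduled = { actionGroup, context };
    return this;
  }

  // Schedule actions for a declared action group plus a dynamic subgroup.
  scheduleActionsWithSubGroup(actionGroup: string, subgroup: string, context: Context = {}): this {
    this.assertDeclared(actionGroup);
    this.scheduled = { actionGroup, subgroup, context };
    return this;
  }

  getScheduledAction(): ScheduledAction | undefined {
    return this.scheduled;
  }

  // Only the action group is validated; the subgroup is never checked against a list.
  private assertDeclared(actionGroup: string): void {
    if (DECLARED_ACTION_GROUPS.indexOf(actionGroup) === -1) {
      throw new Error(`Unknown action group: ${actionGroup}`);
    }
  }
}

// The container's identifier is passed as the subgroup.
const truck = new SketchAlertInstance().scheduleActionsWithSubGroup('contained', 'region-1', {
  entityId: 'truck-7',
});
const scheduledForTruck = truck.getScheduledAction();
```

The point of the split is that the action group stays a closed, declared set, while the subgroup can carry any value the alert type computes at run time.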

Specifying a Subgroup enables the following:

  1. The framework uses the specified Subgroup to identify cases where the instance has changed but is still in the same action group. In such cases the framework fires the actions as if the instance had changed action groups (bypassing throttling etc., just as it would for a different action group).
  2. A new message variable, {{alertActionSubgroup}}, becomes available; it holds the specified subgroup and can be used in the actions.
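
The two effects listed above can be sketched roughly as follows. This is a minimal model, assuming a simple time-based throttle and a naive `{{name}}` substitution in place of the real template engine; every type and function name here is hypothetical and not taken from the framework.

```typescript
// Hypothetical shapes, for illustration only.
interface ScheduledGroup {
  actionGroup: string;
  subgroup?: string;
}

interface LastFired extends ScheduledGroup {
  firedAt: number; // epoch milliseconds of the last time actions fired
}

// A change in either the action group or the subgroup counts as a change.
function groupOrSubgroupChanged(previous: ScheduledGroup, next: ScheduledGroup): boolean {
  return previous.actionGroup !== next.actionGroup || previous.subgroup !== next.subgroup;
}

// Throttling only applies while the instance stays in the same group and subgroup.
function shouldFireActions(
  previous: LastFired | undefined,
  next: ScheduledGroup,
  now: number,
  throttleMs: number
): boolean {
  if (previous === undefined || groupOrSubgroupChanged(previous, next)) {
    return true; // bypass throttling
  }
  return now - previous.firedAt >= throttleMs;
}

// Naive stand-in for template rendering: replaces {{name}} with variables[name].
function renderTemplate(template: string, variables: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => variables[name] ?? '');
}

const lastFired: LastFired = { actionGroup: 'contained', subgroup: 'region-1', firedAt: 0 };
const movedRegion = shouldFireActions(
  lastFired,
  { actionGroup: 'contained', subgroup: 'region-2' },
  1000,
  60000
);
const sameRegion = shouldFireActions(
  lastFired,
  { actionGroup: 'contained', subgroup: 'region-1' },
  1000,
  60000
);
const renderedMessage = renderTemplate('Entity is now in {{alertActionSubgroup}}', {
  alertActionSubgroup: 'region-2',
});
```

In this sketch, moving from `region-1` to `region-2` fires immediately even though the throttle window has not elapsed, while staying in `region-1` remains throttled.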

This also adds the subgroup into the Event Log messages, such as:

alert: test.patternFiring:123: 'abc' active instance: 'instance' in actionGroup(subgroup): 'contained(region-1)' action: test.noop:123
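
A message of this shape could be assembled along the lines below. This is only a sketch: the function and parameter names are hypothetical, and the fallback format used when no subgroup is present is an assumption, since the description shows only the subgroup form.

```typescript
// Hypothetical helper that builds the event log message shown above.
interface ActiveInstanceMessageParams {
  alertLabel: string; // e.g. "test.patternFiring:123: 'abc'"
  instanceId: string;
  actionGroup: string;
  subgroup?: string;
  actionLabel: string; // e.g. "test.noop:123"
}

function formatActiveInstanceMessage(params: ActiveInstanceMessageParams): string {
  const { alertLabel, instanceId, actionGroup, subgroup, actionLabel } = params;
  // Assumption: without a subgroup, only the action group is reported.
  const groupPart =
    subgroup !== undefined
      ? `actionGroup(subgroup): '${actionGroup}(${subgroup})'`
      : `actionGroup: '${actionGroup}'`;
  return `alert: ${alertLabel} active instance: '${instanceId}' in ${groupPart} action: ${actionLabel}`;
}

const exampleEventLogMessage = formatActiveInstanceMessage({
  alertLabel: "test.patternFiring:123: 'abc'",
  instanceId: 'instance',
  actionGroup: 'contained',
  subgroup: 'region-1',
  actionLabel: 'test.noop:123',
});
```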

@gmmorris gmmorris added Feature:Alerting release_note:enhancement Team:ResponseOps Platform ResponseOps team (formerly the Cases and Alerting teams) t// v7.11.0 v8.0.0 labels Dec 2, 2020
* master: (72 commits)
  Make alert status fetching more resilient (elastic#84676)
  [APM] Refactor hooks and context (elastic#84615)
  Added word break styles to the texts in the item details card. (elastic#84654)
  [Search] Disable "send to background" when auto-refresh is enabled (elastic#84106)
  Add readme for new palette service (elastic#84512)
  Make all providers to preserve original URL when session expires. (elastic#84229)
  [Lens] Show color in flyout instead of auto (elastic#84532)
  [Lens] Use index pattern through service instead of reading saved object (elastic#84432)
  Make it possible to use Kibana anonymous authentication provider with ES anonymous access. (elastic#84074)
  TelemetryCollectionManager: Use X-Pack strategy as an OSS overwrite (elastic#84477)
  migrate away from rest_total_hits_as_int (elastic#84508)
  [Input Control] Custom renderer (elastic#84423)
  Attempt to more granularly separate App Search vs Workplace Search vs shared GitHub notifications (elastic#84713)
  [Security Solutino][Case] Case connector alert UI (elastic#82405)
  [Maps] Support runtime fields in tooltips (elastic#84377)
  [CCR] Fix row actions in follower index and auto-follow pattern tables (elastic#84433)
  [Enterprise Search] Migrate shared Indexing Status component (elastic#84571)
  [maps] remove fields from index-pattern test artifacts (elastic#84379)
  Add routes for use in Sources Schema (elastic#84579)
  Changes UI links for drilldowns (elastic#83971)
  ...
* master: (40 commits)
  fix: 🐛 don't add separator befor group on no main items (elastic#83166)
  [Security Solution][Detections] Implements indicator match rule cypress test (elastic#84323)
  [APM] Add APM agent config options (elastic#84678)
  Fixed a11y issue on rollup jobs table selection (elastic#84567)
  [Discover] Refactor getContextUrl to separate file (elastic#84503)
  [Embeddable] Export CSV action for Lens embeddables in dashboard (elastic#83654)
  [TSVB] [Cleanup] Remove extra dateFormat props (elastic#84749)
  [Lens] Migrate legacy es client and remove total hits as int (elastic#84340)
  Improve logging pipeline in @kbn/legacy-logging (elastic#84629)
  Catch @hapi/podium errors (elastic#84575)
  [Discover] Unskip date histogram test (elastic#84727)
  Rename server.xsrf.whitelist to server.xsrf.allowlist (elastic#84791)
  [Enterprise Search] Fix schema errors button (elastic#84842)
  [APM] Removes react-sticky dependency in favor of using CSS (elastic#84589)
  [Maps] Always initialize routes on server-startup (elastic#84806)
  [Fleet] EPM support to handle uploaded file paths (elastic#84708)
  [Snapshot Restore] Fix initial policy form state (elastic#83928)
  Upgrade Node.js to version 14 (elastic#83425)
  [Security Solution] Keep Endpoint policies up to date with license changes (elastic#83992)
  [Security Solution][Exceptions] Implement exceptions for ML rules (elastic#84006)
  ...
@pmuellr
Contributor

pmuellr commented Dec 4, 2020

Haven't reviewed this yet, but did peek to see if we're adding the new subgroup in the event log, and it doesn't appear we are. I think we should. Once we have that, we also likely want to make the subgroup available in the instance summary, which we would also display in the alert details view:

status.actionGroupId = event?.kibana?.alerting?.action_group_id;

I think all this could be done in a follow-on PR, but kinda worried about the timing - seems like this PR is do-able within 7.11, not sure the follow-on PR I described could be done as well, so might have to wait for 7.12. Probably fine.
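
For the instance summary idea above, one possible shape is sketched here. It assumes the event would carry the subgroup in a field named `action_subgroup` next to `action_group_id`; that field name, the `actionSubgroup` status property, and the types are assumptions for illustration, not something this conversation confirms.

```typescript
// Hypothetical, trimmed-down event and status shapes.
interface SketchEvent {
  kibana?: {
    alerting?: {
      action_group_id?: string;
      action_subgroup?: string; // assumed field name
    };
  };
}

interface SketchInstanceStatus {
  actionGroupId?: string;
  actionSubgroup?: string;
}

// Copy the group and subgroup from an event into a new status object.
function applyEventToStatus(status: SketchInstanceStatus, event?: SketchEvent): SketchInstanceStatus {
  return {
    ...status,
    actionGroupId: event?.kibana?.alerting?.action_group_id,
    actionSubgroup: event?.kibana?.alerting?.action_subgroup,
  };
}

const summaryStatus = applyEventToStatus(
  {},
  { kibana: { alerting: { action_group_id: 'contained', action_subgroup: 'region-1' } } }
);
```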

* master: (119 commits)
  [Uptime] Fix headers io-ts type (elastic#84089)
  [fleet] Add config options to accepted docker env vars (elastic#84338)
  [Fleet] Support URL query state in agent logs UI (elastic#84298)
  [basePathProxy] include query in redirect (elastic#84356)
  [Security Solution] Add Endpoint policy feature checks (elastic#83972)
  Fix issues with show_license_expiration (elastic#84361)
  [Security Solution][Resolver] Add support for predefined schemas for endpoint and winlogbeat (elastic#84103)
  [cli/dev] log a warning when --no-base-path is used with --dev (elastic#84354)
  [Fleet] Support input-level vars & templates (elastic#83878)
  [APM] Elastic chart issues (elastic#84238)
  [Time to Visualize] Fix Unlink Action via Rollback of ReplacePanel (elastic#83873)
  redirect to visualize listing page when by value visualization editor doesn't have a value input (elastic#84287)
  add live region for field search (elastic#84310)
  [ML] Persisted URL state for Anomalies table (elastic#84314)
  [dev/cli] detect worker type using env, not cluster module (elastic#83977)
  [Workplace Search] Migrate DisplaySettings tree (elastic#84283)
  Deprecate `xpack.task_manager.index` setting (elastic#84155)
  [Search] Search batching using bfetch (again) (elastic#84043)
  Use .kibana instead of .kibana_current to mark migration completion (elastic#83373)
  [Monitoring] Only look at ES for the missing data alert for now (elastic#83839)
  ...
@gmmorris
Contributor Author

gmmorris commented Dec 9, 2020

> Haven't reviewed this yet, but did peek to see if we're adding the new subgroup in the event log, and it doesn't appear we are. I think we should. Once we have that, we also likely want to make the subgroup available in the instance summary, which we would also display in the alert details view:
>
> status.actionGroupId = event?.kibana?.alerting?.action_group_id;
>
> I think all this could be done in a follow-on PR, but kinda worried about the timing - seems like this PR is do-able within 7.11, not sure the follow-on PR I described could be done as well, so might have to wait for 7.12. Probably fine.

Yeah, it's on my list... part of why this is still in draft mode 👍
The UI I wasn't planning for this PR, just the event log change.

@gmmorris gmmorris marked this pull request as ready for review December 9, 2020 17:19
@gmmorris gmmorris requested a review from a team as a code owner December 9, 2020 17:19
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

Contributor

@YulNaumenko YulNaumenko left a comment

LGTM. This should be a great extension for Maps use cases!

* master: (53 commits)
  Fixing recovered instance reference bug (elastic#85412)
  Switch to new elasticsearch client for Visualizations (elastic#85245)
  Switch to new elasticsearch client for TSVB (elastic#85275)
  Switch to new elasticsearch client for Vega (elastic#85280)
  [ILM] Add shrink field to hot phase (elastic#84087)
  Add rolling-file appender to core logging (elastic#84735)
  [APM] Service overview: Dependencies table (elastic#83416)
  [Uptime ]Update empty message for certs list (elastic#78575)
  [Graph] Fix graph saved object references (elastic#85295)
  [APM] Create new API's to return Latency and Throughput charts (elastic#85242)
  [Advanced settings] Reset to default for empty strings (elastic#85137)
  [SECURITY SOLUTION] Bundles _source -> Fields + able to sort on multiple fields in Timeline (elastic#83761)
  [Fleet] Update agent listing for better status reporting (elastic#84798)
  [APM] enable 'sanitize_field_names' for Go (elastic#85373)
  Update dependency @elastic/charts to v24.4.0 (elastic#85452)
  Introduce external url service (elastic#81234)
  Deprecate disabling the security plugin (elastic#85159)
  [FLEET] New Integration Policy Details page for use in Integrations section (elastic#85355)
  [Security Solutions][Detection Engine] Fixes one liner access control with find_rules REST API
  chore: 🤖 remove extraPublicDirs (elastic#85454)
  ...
Contributor

@ymao1 ymao1 left a comment

LGTM! Just a question about the event log mappings.json file

@kibanamachine
Contributor

💚 Build Succeeded

Metrics [docs]

Distributable file count

| id | before | after | diff |
| --- | --- | --- | --- |
| default | 47010 | 47770 | +760 |

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

| id | before | after | diff |
| --- | --- | --- | --- |
| alerts | 67.7KB | 67.9KB | +157.0B |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@gmmorris gmmorris merged commit 015f3c9 into elastic:master Dec 10, 2020
gmmorris added a commit to gmmorris/kibana that referenced this pull request Dec 10, 2020
… action group scheduling (elastic#84751)


gmmorris added a commit that referenced this pull request Dec 10, 2020
… action group scheduling (#84751) (#85585)

@mikecote mikecote added needs_docs release_note:skip Skip the PR/issue when compiling release notes and removed release_note:enhancement labels Dec 16, 2020
@gmmorris
Contributor Author

Decided to remove the needs_docs label on this: it is documented in the Developer Docs, and since end users can't use Action Subgroups (they are used by Alert Type implementors), it doesn't make sense to add them to the user docs.

Labels

Feature:Alerting release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Platform ResponseOps team (formerly the Cases and Alerting teams) t// v7.11.0 v8.0.0

Development

Successfully merging this pull request may close these issues.

[Alerting] alert types can't schedule actions within the same action group without being throttled

7 participants