Add elastic agent alerting rule templates #15572
Merged
MichelLosier merged 15 commits into elastic:main on Oct 23, 2025
Pinging @elastic/elastic-agent (Team:Elastic-Agent)
pierrehilbert approved these changes on Oct 8, 2025
Putting this back in draft temporarily to avoid accidental merge. We want to validate these more against running agents -- but still open for config review.
…agent path matching
💚 Build Succeeded
Package elastic_agent - 2.6.4 containing this change is available at https://epr.elastic.co/package/elastic_agent/2.6.4/
agithomas pushed a commit to agithomas/integrations that referenced this pull request on Oct 30, 2025:

Add alerting rule templates to the Elastic Agent package:
* CPU usage spike
* Excessive memory usage
* High pipeline queue
* Dropped events
* Output errors
* Excessive restarts
* Unhealthy status
tehbooom pushed a commit to tehbooom/integrations that referenced this pull request on Nov 19, 2025, with the same commit message.
Proposed commit message
Extended description
Here is an initial exploration of alerting rule templates for monitoring elastic agent health. This PR can just include the ones we feel the most confident about, and defer others for further refinement and exploration.
Install the rules
How to install the rules:
* Go to `your-local-dir/integrations/packages/elastic_agent`
* Change the version in `packages/elastic_agent/manifest.yml` from 2.6.4 to 2.6.3
* `elastic-package build --skip-validation`. Run this in the `elastic_agent` package directory
* The built package is `build/packages/elastic_agent-2.6.3.zip`
* Click the `Create new integration` CTA at the top right
* Click the `upload it as a .zip` link, and upload the zip you built
* Use `Elastic Agent` for filtering.

Rule templates:
So that the ESQL is clear, here is a summary of their definitions.
Resource Utilization
CPU usage spike: processes matching `*elastic*agent*` are above 80% of total CPU utilization. Calculate the max for 1 minute buckets and check if there are 5 occurrences when looking back 7 minutes. Rows are distinct by agent id and process name.
```
FROM metrics-*, *:metrics-*
| WHERE process.executable RLIKE ".*[Ee]lastic.*[Aa]gent.*" AND agent.name NOT LIKE "agentless"
| STATS cpu_process_pct = MAX(system.process.cpu.total.pct) * 100
    BY elastic_agent.id, process.name,
       time_bucket = BUCKET(@timestamp, 1 minute)
// Count the 1 minute time buckets that are at or above 80% by process and agent
| WHERE cpu_process_pct >= 80
| STATS count_above_threshold = COUNT(*)
    BY elastic_agent.id, process.name
// Alert if there are 5 or more occurrences
| WHERE count_above_threshold >= 5
```
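To make the bucketing logic of the CPU rule easier to follow, here is a minimal Python sketch of the same three steps (max per 1 minute bucket, count of buckets at or above the threshold, alert at 5 or more). The function name `cpu_spike_alerts` and the tuple layout of `samples` are hypothetical and only for illustration. The sketch assumes the samples are already limited to the rule's 7 minute lookback window and that CPU is reported as a fraction, as in the query above.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def cpu_spike_alerts(samples, threshold_pct=80.0, min_buckets=5):
    """Illustrative re-implementation of the CPU rule's aggregation.

    samples: iterable of (timestamp, agent_id, process_name, cpu_fraction)
    tuples (hypothetical layout), assumed to be pre-filtered to the
    lookback window.
    """
    # Step 1: max CPU percentage per (agent, process, 1 minute bucket).
    bucket_max = defaultdict(float)
    for ts, agent_id, process_name, cpu_fraction in samples:
        minute = ts.replace(second=0, microsecond=0)
        key = (agent_id, process_name, minute)
        bucket_max[key] = max(bucket_max[key], cpu_fraction * 100)

    # Step 2: count the buckets at or above the threshold per (agent, process).
    counts = defaultdict(int)
    for (agent_id, process_name, _minute), pct in bucket_max.items():
        if pct >= threshold_pct:
            counts[(agent_id, process_name)] += 1

    # Step 3: keep only the rows with enough buckets over the threshold.
    return {key: n for key, n in counts.items() if n >= min_buckets}
```

With one sample per minute at 90% CPU, an agent with five such minutes is returned and an agent with only four is not, which mirrors the final `WHERE count_above_threshold >= 5` clause.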
Excessive memory usage: processes matching `*elastic*agent*` are above 50% of total memory usage. Rows are distinct by agent id.
Beats Pipelines and Queues
High pipeline queue: `beat.stats.libbeat.pipeline.queue.filled.pct` exceeds 90%. Rows are distinct by agent id and component id.
Agent Stability
`elastic_agent.status_change` datastream
Checklist
`changelog.yml` file.
Author's Checklist
How to test this PR locally
Build and install the elastic agent package locally:
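The version change and build step listed under "Install the rules" could be scripted roughly as follows. This is a sketch under assumptions, not part of the PR: the helper names `set_manifest_version` and `build_package` are hypothetical, and the code assumes `manifest.yml` has a single top-level `version:` line and that `elastic-package` is on the `PATH`. Only the `elastic-package build --skip-validation` command and the 2.6.3 version come from the steps above.

```python
import re
import subprocess
from pathlib import Path


def set_manifest_version(manifest_text, new_version):
    """Return the manifest text with its top-level `version:` line replaced.

    Assumes a line that starts with `version:`; lines such as
    `format_version:` are left untouched.
    """
    updated, replaced = re.subn(
        r"(?m)^version:[ \t]*.*$",
        f"version: {new_version}",
        manifest_text,
        count=1,
    )
    if replaced == 0:
        raise ValueError("no top-level version field found in manifest")
    return updated


def build_package(package_dir, new_version="2.6.3"):
    """Rewrite the version in manifest.yml, then build from the package directory."""
    manifest = Path(package_dir) / "manifest.yml"
    manifest.write_text(set_manifest_version(manifest.read_text(), new_version))
    # Same command as in the install steps, run in the package directory.
    subprocess.run(
        ["elastic-package", "build", "--skip-validation"],
        cwd=package_dir,
        check=True,
    )
```

Calling `build_package("your-local-dir/integrations/packages/elastic_agent")` would then cover the manifest edit and the build; uploading the resulting zip in Kibana remains a manual step.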
Related issues
Screenshots