Add new out of box stack monitoring alert for CCR Read Exceptions:
Provide a way for the user to get alerted when CCR is enabled on the cluster and some indices are NOT getting replicated:
Condition
- What to alert on: Is there a read exception in the latest CCR stats call indices.shards.read_exceptions
- Alert re-notify interval => 6 hours
- Group => Accumulate 1 alert for the cluster(include all indices/shards that have exceptions)
Schedule
- Check done every 1 minute
Troubleshooting actions
- Link them to the “CCR” tab is stack monitoring showing information such as the leader index, an indication of how much the follower index is lagging behind the leader index, the last fetch time, the number of operations synced, and error messages.
- If you select a shard, you can see graphs for the fetch and operation delays. You can also see advanced information, which contains the results from the get follower stats API
- Link them to stack management Manage cross-cluster replication
- Inspect replication statistics in Stack Management side navigation Cross-Cluster Replication.
- Manage auto-follow pattern
Helpful Notes:
Add new out of box stack monitoring alert for CCR Read Exceptions:
Provide a way for the user to get alerted when CCR is enabled on the cluster and some indices are NOT getting replicated:
Condition
Schedule
Troubleshooting actions
Helpful Notes: