Support "cluster" scope in Metricbeat elasticsearch module #18547
ycombinator merged 26 commits into elastic:master from ycombinator:mb-es-cluster-mode
Pinging @elastic/integrations (Team:Integrations)
this logic repeats across the board, I wonder if it should go into a helper function
Refactored in 348c9b9.
I guess this TODO is done by GetMasterNodeID
Removed in 34cd849.
metricbeat/metricbeat.reference.yml
I wonder if `mode` is enough; in other places we have used `scope` too. Anyway, I don't have a strong opinion here.
I was reserving `mode` for this change: #9424 (comment).
Using `scope` here instead of `hosts_mode` sounds good to me.
Changed in f2a5618.
exekias
left a comment
This is looking great, Shaunak! Thank you for working on this.
I tested the changes in this PR for heap usage with respect to the number of nodes in the ES cluster being monitored. There was no significant difference in heap usage, no matter how many nodes were in the monitored cluster. Details below.
Setup
Results
https://docs.google.com/spreadsheets/d/1Boi0uw846OSY3vnGqC604Dj8Lh28lD3G9OsKFgPGz8I/edit?usp=sharing
Pinging @elastic/stack-monitoring (Stack monitoring)
Co-authored-by: DeDe Morton <dede.morton@elastic.co>
…20413)
* Adding configuration for hosts_mode
* Only perform master check in HostsModeNode
* Only ask the node if it's the master node if we're in HostsModeNode
* Unpack host_mode string into enum
* Adding some specific TODOs in node_stats code
* Updating x-pack/metricbeat reference config
* Set correct service URI
* Get master node ID
* Adding CHANGELOG entry
* Rename hosts_mode => scope
* Removing stale TODO comment
* Adding docs
* Refactoring common code into helper method
* Do not set service URI up front
* Updating documentation per review
* Remove comments from doc examples
* Adding configuration for hosts_mode
* Set correct service URI
* Adding CHANGELOG entry
* Rename hosts_mode => scope
* Do not set service URI up front
* Update metricbeat/docs/modules/elasticsearch.asciidoc
  Co-authored-by: DeDe Morton <dede.morton@elastic.co>
* Update metricbeat/module/elasticsearch/_meta/docs.asciidoc
  Co-authored-by: DeDe Morton <dede.morton@elastic.co>
* Update reference config
* Cleaning up CHANGELOG
* Updating generated files
Co-authored-by: DeDe Morton <dede.morton@elastic.co>
What does this PR do?
This PR introduces a new `scope` setting for the `elasticsearch` Metricbeat module. This setting can take one of two values:
* `node` (default): indicates that each item in the `hosts` list points to a distinct Elasticsearch node in a cluster, or
* `cluster`: indicates that each item in the `hosts` list points to a single endpoint for a distinct Elasticsearch cluster (e.g. a load-balancing proxy fronting the cluster).
Why is it important?
Sometimes it may not be possible for Metricbeat to reach individual Elasticsearch nodes. It might only have access to a single endpoint that fronts the entire Elasticsearch cluster.
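To make the two `scope` values concrete, here is a minimal Python sketch of the decision they drive. It is an illustration only: the module itself is written in Go, and the names `Scope`, `parse_scope` and `should_fetch_cluster_level` are hypothetical. The behavior is inferred from the commit messages in this PR ("Unpack host_mode string into enum", "Only perform master check in HostsModeNode"), assuming the master check decides whether cluster-level data is collected from a given host.

```python
from enum import Enum


class Scope(Enum):
    """The two values the `scope` setting accepts."""
    NODE = "node"        # each host is one Elasticsearch node (default)
    CLUSTER = "cluster"  # each host is one endpoint fronting a whole cluster


def parse_scope(value=None):
    """Convert the configured string into a Scope; a missing value means node."""
    if value is None:
        return Scope.NODE
    try:
        return Scope(value)
    except ValueError:
        raise ValueError(
            f"invalid scope {value!r}: expected 'node' or 'cluster'"
        ) from None


def should_fetch_cluster_level(scope, host_is_master):
    """Hypothetical helper: decide whether to collect cluster-level data.

    Assumption: in node scope only the master node reports cluster-level
    data, so the master check applies; in cluster scope the check is
    skipped because the host is a single endpoint for the whole cluster.
    """
    if scope is Scope.CLUSTER:
        return True
    return host_is_master
```

With `scope: cluster`, the endpoint is always asked for cluster-level data, whichever node happens to serve the request.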
Checklist
`CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.
Manual testing
Testing the new `scope: cluster` functionality introduced in this PR requires an Elasticsearch cluster with multiple nodes. The easiest way is probably to spin up an Elastic Cloud cluster.
Enable the `elasticsearch-xpack` module.
Configure the module (`./modules.d/elasticsearch-xpack.yml`) like so:
Replace the `XXXXX` placeholders with your own values. The key is that there should be exactly one item under `hosts`, pointing to a single endpoint for your cluster. This could be the Elasticsearch endpoint obtained from Elastic Cloud or the address of a single node in your on-prem/local Elasticsearch cluster.
Configure Metricbeat to send the collected stats to an Elasticsearch cluster. This will act as your Monitoring Cluster. It could be the same cluster you're using to collect stats from (as configured in your `elasticsearch-xpack` module configuration above) or it could be an entirely separate cluster.
Start Metricbeat.
Let Metricbeat run for ~30 seconds. Make sure there are no errors in the Metricbeat logs.
Perform the following query against your Monitoring Cluster.
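A query along these lines could be built and its response checked with the Python sketch below. It is not necessarily the query used in the PR. The aggregation names (`by_cluster_uuid`, `cluster_stats`, `node_stats`, `by_period`) come from the description in this section; the field names `cluster_uuid`, `type` and `timestamp`, and the helper names `build_query` and `check_response`, are assumptions made for illustration.

```python
def build_query(period="10s"):
    """Build a search body with the three aggregations described here."""
    # Assumed field names: cluster_uuid, type, timestamp.
    by_period = {
        "by_period": {
            "date_histogram": {"field": "timestamp", "fixed_interval": period}
        }
    }
    return {
        "size": 0,
        "track_total_hits": True,
        "aggs": {
            "by_cluster_uuid": {"terms": {"field": "cluster_uuid"}},
            "cluster_stats": {
                "filter": {"term": {"type": "cluster_stats"}},
                "aggs": by_period,
            },
            "node_stats": {
                "filter": {"term": {"type": "node_stats"}},
                "aggs": by_period,
            },
        },
    }


def check_response(resp, node_count):
    """Return a list of problems found in a parsed search response."""
    problems = []
    aggs = resp["aggregations"]

    # Check 1: exactly one cluster UUID, present on every document.
    uuid_buckets = aggs["by_cluster_uuid"]["buckets"]
    if len(uuid_buckets) != 1:
        problems.append(f"expected 1 cluster UUID bucket, got {len(uuid_buckets)}")
    elif uuid_buckets[0]["doc_count"] != resp["hits"]["total"]["value"]:
        problems.append("bucket doc_count differs from hits.total.value")

    # Checks 2 and 3: at most 1 cluster_stats document and at most
    # node_count node_stats documents per collection period.
    limits = {"cluster_stats": 1, "node_stats": node_count}
    for name, limit in limits.items():
        for bucket in aggs[name]["by_period"]["buckets"]:
            if bucket["doc_count"] > limit:
                problems.append(
                    f"{name}: {bucket['doc_count']} docs in one period (max {limit})"
                )
    return problems
```

Send the body returned by `build_query()` to the search API of your Monitoring Cluster, then pass the parsed JSON response and the number of nodes `N` to `check_response`; an empty list means all three checks passed.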
This query checks 3 things:
* The `aggs.by_cluster_uuid` aggregation checks that we are only seeing data for a single cluster and that all documents contain that single cluster UUID.
  * `aggregations.by_cluster_uuid.buckets` only contains a single bucket, for the single cluster UUID of the Elasticsearch cluster you are monitoring.
  * `aggregations.by_cluster_uuid.buckets[0].doc_count` is the same as `hits.total.value`.
* The `aggs.cluster_stats` aggregation checks that `type: cluster_stats` documents are only indexed once every collection period (10 seconds) and that there is at most one document per collection period. We are using `type: cluster_stats` here as an example; the same should be true for any `type` other than `type: node_stats`.
  * `aggregations.cluster_stats.by_period.buckets` has several buckets, each corresponding to a time period. Each bucket should be 10 seconds "wide". Within each bucket, verify that `doc_count` is <= 1.
* The `aggs.node_stats` aggregation checks that `type: node_stats` documents are only indexed once every collection period (10 seconds) and that there are at most `N` documents per collection period, where `N` is the number of nodes in the cluster you are monitoring.
  * `aggregations.node_stats.by_period.buckets` has several buckets, each corresponding to a time period. Each bucket should be 10 seconds "wide". Within each bucket, verify that `doc_count` is <= `N`, where `N` is the number of nodes in the cluster you are monitoring.
Related issues
`elasticsearch` module should be able to collect from a single cluster endpoint #18539.