Updated monitoring-stack #91
Conversation
playbooks/files/grafana.yaml
Outdated
```yaml
type: prometheus
uid: prometheus
access: proxy
url: http://kube-prometheus-stack-1735-prometheus.monitoring.svc.cluster.local:9090
```
Is this really a fixed URL?
I guess it will change, but I need to test it.
Thank you for pointing this out.
Initially, I was facing a problem while setting the URL, so I followed this recommendation, and it worked.
But yes, you're right: the URL can change because of the suffix (1735). I used the --generate-name option while installing kube-prometheus-stack, and this option adds a unique suffix to the release name.
To avoid this, we can use the fixed release name "kube-prometheus-stack" instead of --generate-name. This creates a consistent service name: kube-prometheus-stack-prometheus
helm install kube-prometheus-stack --version {{ prometheus_stack }} prometheus-community/kube-prometheus-stack --create-namespace --namespace monitoring --values {{ ansible_user_dir }}/kube-prometheus-stack.values
With this, the URL will always be:
http://kube-prometheus-stack-prometheus.monitoring:9090, as described in this kube-prometheus-stack Grafana datasource file as well.
I tried this approach, and it worked.
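For reference, here is a minimal sketch of what a Grafana datasource provisioning file with the fixed URL could look like, based on the snippet quoted above. The `apiVersion`, `name`, and `isDefault` fields are assumptions beyond the quoted lines; check them against the actual playbooks/files/grafana.yaml.

```yaml
# Hypothetical Grafana datasource provisioning sketch.
# Only type/uid/access/url come from the reviewed snippet;
# the rest follows Grafana's standard provisioning schema.
apiVersion: 1
datasources:
  - name: Prometheus          # assumed display name
    type: prometheus
    uid: prometheus
    access: proxy
    # Stable service name once the release is installed without --generate-name:
    url: http://kube-prometheus-stack-prometheus.monitoring:9090
    isDefault: true           # assumed
```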
@singh-kalpana I'm not sure why you removed the additional scrape config; because of that, GPU metrics are not visible on the Grafana dashboard. I will fix that, but it's FYI.
We don't need to add an additional scrape config, because the ServiceMonitor for the DCGM exporter is enabled, which allows Prometheus to discover and scrape metrics from the DCGM exporter. It takes about a minute for the metrics to show up on the Grafana dashboard.
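As an illustration of what "enabling the ServiceMonitor" means here, a dcgm-exporter Helm values snippet could look like the sketch below. The exact key names and the scrape interval are assumptions; verify them against the dcgm-exporter chart's values.yaml.

```yaml
# Hypothetical dcgm-exporter Helm values sketch.
# With a ServiceMonitor enabled, the Prometheus Operator discovers the
# exporter automatically, so no additionalScrapeConfigs entry is needed.
serviceMonitor:
  enabled: true
  interval: 15s   # assumed scrape interval
```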
What this PR does / why we need it?
Install the Prometheus Operator and Grafana Operator instead of a standalone Prometheus and Grafana instance.
The standalone instances do not include important CRDs like ServiceMonitor ( prometheus-community/helm-charts#3010 ), which are needed to monitor user-defined applications without directly modifying the Prometheus configuration. The absence of these CRDs also limits flexibility and prevents us from migrating monitoring resources to a production environment.
Changes:
This PR #2172 suggests including the Grafana Operator within the kube-prometheus-stack rather than the Grafana instance, but the Prometheus community recommends disabling Grafana in the stack and installing the Grafana Operator separately.
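Following that recommendation, disabling the bundled Grafana is a small values change. A sketch of the relevant fragment of kube-prometheus-stack.values (the `grafana.enabled` key exists in the kube-prometheus-stack chart; confirm against the chart's values.yaml for the pinned version):

```yaml
# Sketch: disable the Grafana bundled with kube-prometheus-stack so the
# Grafana Operator can be installed and managed separately.
grafana:
  enabled: false
```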