Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

release/appliance: resource def grafana#63535

Merged
DaedalusG merged 9 commits into
mainfrom
wg/rel/app-grafana-def
Jul 4, 2024
Merged

release/appliance: resource def grafana#63535
DaedalusG merged 9 commits into
mainfrom
wg/rel/app-grafana-def

Conversation

@DaedalusG

Copy link
Copy Markdown
Contributor

appliance reconciler resource definition for grafana

Test plan

Golden Tests

Changelog

@DaedalusG DaedalusG requested a review from craigfurman June 28, 2024 03:36
@cla-bot cla-bot Bot added the cla-signed label Jun 28, 2024

@craigfurman craigfurman left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start 🎊

There is some meaningful drift compared to helm I think, and I think I found the cause - see notes.

Similar to the other service definition PR today, can we also update dev_mode.go? My bad on this being totally undocumented, I only fixed dev mode the other day after having forgotten about it for weeks.

Comment thread internal/appliance/config/defaults.go Outdated
Comment thread internal/appliance/reconciler/grafana_test.go
@@ -0,0 +1,270 @@
resources:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

go run ./internal/appliance/dev/compare-helm \
  -deploy-sourcegraph-helm-path ../../deploy-sourcegraph-helm \
  -component grafana \
  -golden-file internal/appliance/reconciler/testdata/golden-fixtures/grafana/default.yaml

Against v5.3.9104 checkout of the helm chart:

2c2
< # helm: ServiceAccount/grafana
---
> # golden: ServiceAccount/grafana
5a6,8
>   annotations:
>     appliance.sourcegraph.com/configHash: 230d0f8eb733e287c9f84ab7eb488598073e7ef3db0a7da09e44c65de850f082
>   creationTimestamp: "2024-04-19T00:00:00Z"
7,8d9
<     app.kubernetes.io/component: grafana
<     category: rbac
10a12,21
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
12c23
< # helm: ConfigMap/grafana
---
> # golden: ConfigMap/grafana
15,28c26,28
<   datasources.yml: |
<     apiVersion: 1
< 
<     datasources:
<       - name: Prometheus
<         type: prometheus
<         access: proxy
<         url: http://prometheus:30090
<         isDefault: true
<         editable: false
<       - name: Jaeger
<         type: Jaeger
<         access: proxy
<         url: http://jaeger-query:16686/-/debug/jaeger
---
>   extra_rules.yml: ""
>   grafana.yml: ""
> immutable: false

This looks like meaningful drift. I can see you're embedding the config file though, let's take a look at the code and figure out what's happening.

30a31,33
>   annotations:
>     appliance.sourcegraph.com/configHash: 230d0f8eb733e287c9f84ab7eb488598073e7ef3db0a7da09e44c65de850f082
>   creationTimestamp: "2024-04-19T00:00:00Z"
32d34
<     app.kubernetes.io/component: grafana
34a37,46
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
36c48
< # helm: Service/grafana
---
> # golden: Service/grafana
39a52,54
>   annotations:
>     appliance.sourcegraph.com/configHash: 230d0f8eb733e287c9f84ab7eb488598073e7ef3db0a7da09e44c65de850f082
>   creationTimestamp: "2024-04-19T00:00:00Z"
44a60,69
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
45a71,77
>   clusterIP: NORMALIZED_FOR_TESTING
>   clusterIPs:
>     - NORMALIZED_FOR_TESTING
>   internalTrafficPolicy: Cluster
>   ipFamilies:
>     - IPv4
>   ipFamilyPolicy: SingleStack
48c80,81
<       port: 30070
---
>       port: 3181
>       protocol: TCP
49a83,86
>     - name: debug
>       port: 6060
>       protocol: TCP
>       targetPort: debug
52,53c89
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/name: sourcegraph
---
>   sessionAffinity: None
54a91,92
> status:
>   loadBalancer: {}
56c94
< # helm: StatefulSet/grafana
---
> # golden: StatefulSet/grafana
61c99,101
<     description: Metrics/monitoring dashboards and alerts.
---
>     appliance.sourcegraph.com/configHash: 230d0f8eb733e287c9f84ab7eb488598073e7ef3db0a7da09e44c65de850f082
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   generation: 1
64,65d103
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/managed-by: Helm
69d106
<     helm.sh/chart: sourcegraph-5.3.9104
70a108,117
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
71a119,124
>   minReadySeconds: 10
>   persistentVolumeClaimRetentionPolicy:
>     whenDeleted: Retain
>     whenScaled: Retain
>   podManagementPolicy: OrderedReady
>   replicas: 0
80a134
>       creationTimestamp: null
83,84d136
<         app.kubernetes.io/instance: release-name
<         app.kubernetes.io/name: sourcegraph
85a138
>       name: grafana
87d139
<       affinity: null
89,90c141
<         - env: null
<           image: index.docker.io/sourcegraph/grafana:5.3.9104@sha256:3105bba2de3f33498c78e8b39fd5543653f096c268d17790014ecb5997617ea4
---
>         - image: index.docker.io/sourcegraph/grafana:5.3.9104
95a147
>               protocol: TCP
107a160
>           terminationMessagePath: /dev/termination-log
114c167,169
<       nodeSelector: null
---
>       dnsPolicy: ClusterFirst
>       restartPolicy: Always
>       schedulerName: default-scheduler
119a175
>       serviceAccount: grafana
121c177
<       tolerations: null
---
>       terminationGracePeriodSeconds: 30
130c186,191
<     - metadata:
---
>     - apiVersion: v1
>       kind: PersistentVolumeClaim
>       metadata:
>         creationTimestamp: null
>         labels:
>           deploy: sourcegraph
131a193
>         namespace: NORMALIZED_FOR_TESTING
138c200,279
<         storageClassName: sourcegraph
---
>         volumeMode: Filesystem
>       status:
>         phase: Pending
> status:
>   availableReplicas: 0
>   replicas: 0
> ---
> # golden: ConfigMap/sg
> apiVersion: v1
> data:
>   spec: |
>     spec:
>       requestedVersion: "5.3.9104"
> 
>       blobstore:
>         disabled: true
> 
>       codeInsights:
>         disabled: true
> 
>       codeIntel:
>         disabled: true
> 
>       frontend:
>         disabled: true
> 
>       gitServer:
>         disabled: true
> 
>       indexedSearch:
>         disabled: true
> 
>       openTelemetry:
>         disabled: true
> 
>       pgsql:
>         disabled: true
> 
>       postgresExporter:
>         disabled: true
> 
>       preciseCodeIntel:
>         disabled: true
> 
>       redisCache:
>         disabled: true
> 
>       redisStore:
>         disabled: true
> 
>       repoUpdater:
>         disabled: true
> 
>       searcher:
>         disabled: true
> 
>       symbols:
>         disabled: true
> 
>       syntectServer:
>         disabled: true
> 
>       worker:
>         disabled: true
> 
>       prometheus:
>         disabled: true
> 
>       grafana:
>         disabled: false
> kind: ConfigMap
> metadata:
>   annotations:
>     appliance.sourcegraph.com/currentVersion: 5.3.9104
>     appliance.sourcegraph.com/managed: "true"
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   name: sg
>   namespace: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING

@DaedalusG DaedalusG Jun 28, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm still getting this drift after fixing the way the datasource file is embedded?

12,34d22
< # helm: ConfigMap/grafana
< apiVersion: v1
< data:
<   datasources.yml: |
<     apiVersion: 1
<
<     datasources:
<       - name: Prometheus
<         type: prometheus
<         access: proxy
<         url: http://prometheus:30090
<         isDefault: true
<         editable: false
<       - name: Jaeger
<         type: Jaeger
<         access: proxy
<         url: http://jaeger-query:16686/-/debug/jaeger
< kind: ConfigMap
< metadata:
<   labels:
<     app.kubernetes.io/component: grafana
<     deploy: sourcegraph
<   name: grafana
36c24
< # helm: Service/grafana
---
> # golden: Service/grafana

fwiw I checked the prometheus helm diff and it also says drift for this specific setup. Is it maybe expected? Despite having fixed the previous errors?

Comment thread internal/appliance/config/embed.go Outdated
@Chickensoupwithrice Chickensoupwithrice marked this pull request as ready for review July 2, 2024 21:44

@Chickensoupwithrice Chickensoupwithrice left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paired with @DaedalusG and @jdpleiness

Wrong PR 🤦🏽

@Chickensoupwithrice Chickensoupwithrice self-requested a review July 2, 2024 21:47
@Chickensoupwithrice

Copy link
Copy Markdown
Contributor

In it's current state I can't run go test -args appliance-update-golden-tests getting an infinite loop with this error sticking out:

2024/07/03 15:52:35 "msg"="Reconciler error" "error"="failed to reconcile grafana: reconciling ConfigMap: parsing default grafana config template: template: grafana-config:13: function \"default\" not defined" "controller"="configmap" "controllerGroup"="" "controllerKind"="ConfigMap" "ConfigMap"={"name"="sg" "namespace"="test-appliance-3c7a89"} "namespace"="test-appliance-3c7a89" "name"="sg" "reconcileID"="8c1776db-4d30-4497-8567-1e1dfc34c76a"

Investigating further

@Chickensoupwithrice

Chickensoupwithrice commented Jul 3, 2024

Copy link
Copy Markdown
Contributor

Updated the files with the new diff below. No longer missing the ConfigMap

Details
2c2
< # helm: ServiceAccount/grafana
---
> # golden: ServiceAccount/grafana
5a6,8
>   annotations:
>     appliance.sourcegraph.com/configHash: c7ada10d28d26c809f8b7e71aabb5a65420c3c2f306a364d3b9b966189d84a05
>   creationTimestamp: "2024-04-19T00:00:00Z"
7,8d9
<     app.kubernetes.io/component: grafana
<     category: rbac
10a12,21
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
12c23
< # helm: ConfigMap/grafana
---
> # golden: ConfigMap/grafana
28a40,41
>   extra_rules.yml: ""
> immutable: false
30a44,46
>   annotations:
>     appliance.sourcegraph.com/configHash: c7ada10d28d26c809f8b7e71aabb5a65420c3c2f306a364d3b9b966189d84a05
>   creationTimestamp: "2024-04-19T00:00:00Z"
32d47
<     app.kubernetes.io/component: grafana
34a50,59
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
36c61
< # helm: Service/grafana
---
> # golden: Service/grafana
39a65,67
>   annotations:
>     appliance.sourcegraph.com/configHash: c7ada10d28d26c809f8b7e71aabb5a65420c3c2f306a364d3b9b966189d84a05
>   creationTimestamp: "2024-04-19T00:00:00Z"
44a73,82
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
45a84,90
>   clusterIP: NORMALIZED_FOR_TESTING
>   clusterIPs:
>     - NORMALIZED_FOR_TESTING
>   internalTrafficPolicy: Cluster
>   ipFamilies:
>     - IPv4
>   ipFamilyPolicy: SingleStack
48c93,94
<       port: 30070
---
>       port: 3181
>       protocol: TCP
49a96,99
>     - name: debug
>       port: 6060
>       protocol: TCP
>       targetPort: debug
52,53c102
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/name: sourcegraph
---
>   sessionAffinity: None
54a104,105
> status:
>   loadBalancer: {}
56c107
< # helm: StatefulSet/grafana
---
> # golden: StatefulSet/grafana
61c112,114
<     description: Metrics/monitoring dashboards and alerts.
---
>     appliance.sourcegraph.com/configHash: c7ada10d28d26c809f8b7e71aabb5a65420c3c2f306a364d3b9b966189d84a05
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   generation: 1
64,65d116
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/managed-by: Helm
67c118
<     app.kubernetes.io/version: 5.3.2
---
>     app.kubernetes.io/version: 5.3.9104
69d119
<     helm.sh/chart: sourcegraph-5.3.2
70a121,130
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
71a132,136
>   minReadySeconds: 10
>   persistentVolumeClaimRetentionPolicy:
>     whenDeleted: Retain
>     whenScaled: Retain
>   podManagementPolicy: OrderedReady
81a147
>       creationTimestamp: null
84,85d149
<         app.kubernetes.io/instance: release-name
<         app.kubernetes.io/name: sourcegraph
86a151
>       name: grafana
88d152
<       affinity: null
90,91c154
<         - env: null
<           image: index.docker.io/sourcegraph/grafana:5.3.2@sha256:afc26fbaad8ccf0b7205a05d8505bad62b93042ad17facaf240d1287467439f8
---
>         - image: index.docker.io/sourcegraph/grafana:5.3.9104
96a160
>               protocol: TCP
108a173
>           terminationMessagePath: /dev/termination-log
115c180,182
<       nodeSelector: null
---
>       dnsPolicy: ClusterFirst
>       restartPolicy: Always
>       schedulerName: default-scheduler
120a188
>       serviceAccount: grafana
122c190
<       tolerations: null
---
>       terminationGracePeriodSeconds: 30
131c199,204
<     - metadata:
---
>     - apiVersion: v1
>       kind: PersistentVolumeClaim
>       metadata:
>         creationTimestamp: null
>         labels:
>           deploy: sourcegraph
132a206
>         namespace: NORMALIZED_FOR_TESTING
139c213,292
<         storageClassName: sourcegraph
---
>         volumeMode: Filesystem
>       status:
>         phase: Pending
> status:
>   availableReplicas: 0
>   replicas: 0
> ---
> # golden: ConfigMap/sg
> apiVersion: v1
> data:
>   spec: |
>     spec:
>       requestedVersion: "5.3.9104"
> 
>       blobstore:
>         disabled: true
> 
>       codeInsights:
>         disabled: true
> 
>       codeIntel:
>         disabled: true
> 
>       frontend:
>         disabled: true
> 
>       gitServer:
>         disabled: true
> 
>       indexedSearch:
>         disabled: true
> 
>       openTelemetry:
>         disabled: true
> 
>       pgsql:
>         disabled: true
> 
>       postgresExporter:
>         disabled: true
> 
>       preciseCodeIntel:
>         disabled: true
> 
>       redisCache:
>         disabled: true
> 
>       redisStore:
>         disabled: true
> 
>       repoUpdater:
>         disabled: true
> 
>       searcher:
>         disabled: true
> 
>       symbols:
>         disabled: true
> 
>       syntectServer:
>         disabled: true
> 
>       worker:
>         disabled: true
> 
>       prometheus:
>         disabled: true
> 
>       grafana:
>         disabled: false
> kind: ConfigMap
> metadata:
>   annotations:
>     appliance.sourcegraph.com/currentVersion: 5.3.9104
>     appliance.sourcegraph.com/managed: "true"
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   name: sg
>   namespace: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING

@Chickensoupwithrice

Copy link
Copy Markdown
Contributor

Hmmm, seems because of the grafana statefulset, all the other reconciler tests also get updated when running go test -args appliance-update-golden-files :/

Would recommend reviewing without that commit to begin with.

@DaedalusG DaedalusG merged commit b98c7d0 into main Jul 4, 2024
@DaedalusG DaedalusG deleted the wg/rel/app-grafana-def branch July 4, 2024 16:46
@craigfurman

Copy link
Copy Markdown
Contributor

@Chickensoupwithrice https://github.com/sourcegraph/sourcegraph/pull/63659 contains a fix for the comment above 😁

craigfurman referenced this pull request Jul 9, 2024
)

Disable grafana in all non-grafana input fixtures, which causes grafana
to be removed from most golden fixtures. This follows a pattern we've
been using for most service tests: disabling all on-by-default services
except the one under test, to make compare-helm diffs easier to read.

```
find internal/appliance/reconciler/testdata/sg -name '*.yaml' | \
  xargs sed -Ei '/gitServer:/i\  grafana:\n    disabled: true\n'
```

Fixes
https://github.com/sourcegraph/sourcegraph/pull/63535#issuecomment-2207724318
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants