Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

release/appliance: resource def jaeger#63534

Merged
Chickensoupwithrice merged 7 commits into
mainfrom
wg/rel/jaeger
Jul 3, 2024
Merged

release/appliance: resource def jaeger#63534
Chickensoupwithrice merged 7 commits into
mainfrom
wg/rel/jaeger

Conversation

@DaedalusG

Copy link
Copy Markdown
Contributor

appliance reconciler resource definition for jaeger

Test plan

Golden Tests

Changelog

@DaedalusG DaedalusG requested a review from craigfurman June 28, 2024 01:24
@cla-bot cla-bot Bot added the cla-signed label Jun 28, 2024

@craigfurman craigfurman left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start! There's a couple of helm drift line items in the comments below that do stop this deploying, but we're almost there.

Random note: please add sg.Spec.Jaeger.ContainerConfig = setBestEffortQOSOnContainer(sg.Spec.Jaeger.ContainerConfig, "jaeger" to the function in appliance/dev_mode.go. This strips resources from the container when dev mode is enabled in the web UI, which allows us to kick the tires on a real deployment. There's no way you could have known to do this (that's my bad), so we should probably add it to development.md 🙏

You'll almost certainly want to rebase before doing that to avoid a merge conflict afterwards.

The build is failing because we need to run sg bazel configure and check in the BUILD.bazel changes.

Comment thread internal/appliance/config/spec.go Outdated
StandardConfig

// Replicas defines the number of Symbols pod replicas.
// Replicas defines the number of GitServer pod replicas.

@craigfurman craigfurman Jun 28, 2024

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional nit: my preference would just be to delete the comment altogether. Comments can be amazing, but GitServerSpec.Replicas is pretty unambiguous IMO

for _, tc := range []struct {
name string
}{
{name: "jaeger/default"},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For completeness (and only because it's easy 😁) let's add a "with-replicas" test case, since there is a config element for that. You can then diff the golden fixtures to get easy confidence that this works.

@@ -0,0 +1,278 @@
resources:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me a little bit to figure out how to run compare helm for this. In our helm chart, the component label for this is "all-in-one" rather than "jaeger". Some annotated output of:

go run ./internal/appliance/dev/compare-helm \
  -deploy-sourcegraph-helm-path ../../deploy-sourcegraph-helm \
  -component all-in-one \
  -golden-file internal/appliance/reconciler/testdata/golden-fixtures/jaeger/default.yaml \
  -helm-template-extra-args '--set jaeger.enabled=true'

I had the v5.3.9104 tag of deploy-sourcegraph-helm checked out in the specified directory btw.

See inline comments 👇

2c2
< # helm: Service/jaeger-collector
---
> # golden: Service/jaeger-collector
6c6,8
<   annotations: null
---
>   annotations:
>     appliance.sourcegraph.com/configHash: a975cc1b1a01d8f09ff9fb298bfbfa5a75ffd4b82636e8628255be58a34bba3b
>   creationTimestamp: "2024-04-19T00:00:00Z"
8,10c10,11
<     app: jaeger
<     app.kubernetes.io/component: all-in-one
<     app.kubernetes.io/name: jaeger
---
>     app: jaeger-collector
>     app.kubernetes.io/component: jaeger-collector
12a14,23
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
13a25,31
>   clusterIP: NORMALIZED_FOR_TESTING
>   clusterIPs:
>     - NORMALIZED_FOR_TESTING
>   internalTrafficPolicy: Cluster
>   ipFamilies:
>     - IPv4
>   ipFamilyPolicy: SingleStack
30,31c48
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/name: jaeger
---
>   sessionAffinity: None
32a50,51
> status:
>   loadBalancer: {}
34c53
< # helm: Service/jaeger-query
---
> # golden: Service/jaeger-query
38c57,59
<   annotations: null
---
>   annotations:
>     appliance.sourcegraph.com/configHash: a975cc1b1a01d8f09ff9fb298bfbfa5a75ffd4b82636e8628255be58a34bba3b
>   creationTimestamp: "2024-04-19T00:00:00Z"
40,42c61,62
<     app: jaeger
<     app.kubernetes.io/component: all-in-one
<     app.kubernetes.io/name: jaeger
---
>     app: jaeger-query
>     app.kubernetes.io/component: jaeger-query
44a65,74
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
45a76,82
>   clusterIP: NORMALIZED_FOR_TESTING
>   clusterIPs:
>     - NORMALIZED_FOR_TESTING
>   internalTrafficPolicy: Cluster
>   ipFamilies:
>     - IPv4
>   ipFamilyPolicy: SingleStack
54,55c91
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/name: jaeger
---
>   sessionAffinity: None
56a93,94
> status:
>   loadBalancer: {}
58c96
< # helm: Deployment/jaeger
---
> # golden: Deployment/jaeger
61a100,103
>   annotations:
>     appliance.sourcegraph.com/configHash: a975cc1b1a01d8f09ff9fb298bfbfa5a75ffd4b82636e8628255be58a34bba3b
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   generation: 1
63,66c105,106
<     app: jaeger
<     app.kubernetes.io/component: all-in-one
<     app.kubernetes.io/managed-by: Helm
<     app.kubernetes.io/name: jaeger
---
>     app.kubernetes.io/component: jaeger
>     app.kubernetes.io/name: sourcegraph
69d108
<     helm.sh/chart: sourcegraph-5.3.9104
70a110,119
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
71a121,122
>   minReadySeconds: 10
>   progressDeadlineSeconds: 600
77,78d127
<       app.kubernetes.io/component: all-in-one
<       app.kubernetes.io/name: jaeger
85,86c134
<         prometheus.io/port: "16686"
<         prometheus.io/scrape: "true"
---
>       creationTimestamp: null

We should add these prometheus annotations. StandardConfig allows us to declare a prometheus port in defaults.go, but this causes the annotations to be added to the service. That would probably be fine, as long as we pick the correct service, whose endpoints return the same set of pods as are in the labelSelector for this deployment.

If we want to take the path of least resistance, we can just hardcode these annotations in the service definition. After all, an annotation that no system pays attention to (e.g. if prometheus is not deployed) still does no harm!

89,91d136
<         app.kubernetes.io/component: all-in-one
<         app.kubernetes.io/instance: release-name
<         app.kubernetes.io/name: jaeger
92a138
>       name: jaeger
94d139
<       affinity: null
96,99c141
<         - args:
<             - --memory.max-traces=20000
<           env: null
<           image: index.docker.io/sourcegraph/jaeger-all-in-one:5.3.9104@sha256:e59f9bfcd0430d932f0755ec0271e695c183aeeabfacc94dd1424d2034ed5a44
---
>         - image: index.docker.io/sourcegraph/jaeger:5.3.9104

We probably want to squash those args and image diffs. For the image, pass "jaeger-all-in-one" into getDefaultImage() instead of the name variable, whose value is "jaeger".

115a158
>             failureThreshold: 3
118,119c161,166
<               port: 14269
<             initialDelaySeconds: 5
---
>               port: 16686
>               scheme: HTTP
>             initialDelaySeconds: 10
>             periodSeconds: 10
>             successThreshold: 1
>             timeoutSeconds: 1

The probe port looks wrong here. The rest of the diff is probably fine (most of it is defaults)

132,136c179,287
<           volumeMounts: null
<       nodeSelector: null
<       securityContext: {}
<       tolerations: null
<       volumes: null
---
>           terminationMessagePath: /dev/termination-log
>           terminationMessagePolicy: FallbackToLogsOnError
>       dnsPolicy: ClusterFirst
>       restartPolicy: Always
>       schedulerName: default-scheduler
>       securityContext:
>         fsGroup: 101
>         fsGroupChangePolicy: OnRootMismatch
>         runAsGroup: 101
>         runAsUser: 100
>       serviceAccount: jaeger
>       serviceAccountName: jaeger
>       terminationGracePeriodSeconds: 30
> status: {}
> ---
> # golden: ConfigMap/sg
> apiVersion: v1
> data:
>   spec: |
>     spec:
>       requestedVersion: "5.3.9104"
> 
>       blobstore:
>         disabled: true
> 
>       codeInsights:
>         disabled: true
> 
>       codeIntel:
>         disabled: true
> 
>       frontend:
>         disabled: true
> 
>       gitServer:
>         disabled: true
> 
>       indexedSearch:
>         disabled: true
> 
>       openTelemetry:
>         disabled: true
> 
>       pgsql:
>         disabled: true
> 
>       postgresExporter:
>         disabled: true
> 
>       preciseCodeIntel:
>         disabled: true
> 
>       redisCache:
>         disabled: true
> 
>       redisStore:
>         disabled: true
> 
>       repoUpdater:
>         disabled: true
> 
>       searcher:
>         disabled: true
> 
>       symbols:
>         disabled: true
> 
>       syntectServer:
>         disabled: true
> 
>       worker:
>         disabled: true
> 
>       prometheus:
>         disabled: true
> 
>       jaeger:
>         disabled: false
> kind: ConfigMap
> metadata:
>   annotations:
>     appliance.sourcegraph.com/currentVersion: 5.3.9104
>     appliance.sourcegraph.com/managed: "true"
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   name: sg
>   namespace: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
> ---
> # golden: ServiceAccount/jaeger
> apiVersion: v1
> kind: ServiceAccount
> metadata:
>   annotations:
>     appliance.sourcegraph.com/configHash: a975cc1b1a01d8f09ff9fb298bfbfa5a75ffd4b82636e8628255be58a34bba3b
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   labels:
>     deploy: sourcegraph
>   name: jaeger
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING

@Chickensoupwithrice

Copy link
Copy Markdown
Contributor

We paired on this, and worked it out.

Here's the diff:

2c2
< # helm: Service/jaeger-collector
---
> # golden: Service/jaeger-collector
6c6,8
<   annotations: null
---
>   annotations:
>     appliance.sourcegraph.com/configHash: a975cc1b1a01d8f09ff9fb298bfbfa5a75ffd4b82636e8628255be58a34bba3b
>   creationTimestamp: "2024-04-19T00:00:00Z"
8,10c10,11
<     app: jaeger
<     app.kubernetes.io/component: all-in-one
<     app.kubernetes.io/name: jaeger
---
>     app: jaeger-collector
>     app.kubernetes.io/component: jaeger-collector
12a14,23
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
13a25,31
>   clusterIP: NORMALIZED_FOR_TESTING
>   clusterIPs:
>     - NORMALIZED_FOR_TESTING
>   internalTrafficPolicy: Cluster
>   ipFamilies:
>     - IPv4
>   ipFamilyPolicy: SingleStack
30,31c48
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/name: jaeger
---
>   sessionAffinity: None
32a50,51
> status:
>   loadBalancer: {}
34c53
< # helm: Service/jaeger-query
---
> # golden: Service/jaeger-query
38c57,59
<   annotations: null
---
>   annotations:
>     appliance.sourcegraph.com/configHash: a975cc1b1a01d8f09ff9fb298bfbfa5a75ffd4b82636e8628255be58a34bba3b
>   creationTimestamp: "2024-04-19T00:00:00Z"
40,42c61,62
<     app: jaeger
<     app.kubernetes.io/component: all-in-one
<     app.kubernetes.io/name: jaeger
---
>     app: jaeger-query
>     app.kubernetes.io/component: jaeger-query
44a65,74
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
45a76,82
>   clusterIP: NORMALIZED_FOR_TESTING
>   clusterIPs:
>     - NORMALIZED_FOR_TESTING
>   internalTrafficPolicy: Cluster
>   ipFamilies:
>     - IPv4
>   ipFamilyPolicy: SingleStack
54,55c91
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/name: jaeger
---
>   sessionAffinity: None
56a93,94
> status:
>   loadBalancer: {}
58c96
< # helm: Deployment/jaeger
---
> # golden: Deployment/jaeger
61a100,103
>   annotations:
>     appliance.sourcegraph.com/configHash: a975cc1b1a01d8f09ff9fb298bfbfa5a75ffd4b82636e8628255be58a34bba3b
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   generation: 1
63,67c105,107
<     app: jaeger
<     app.kubernetes.io/component: all-in-one
<     app.kubernetes.io/managed-by: Helm
<     app.kubernetes.io/name: jaeger
<     app.kubernetes.io/version: 5.3.2
---
>     app.kubernetes.io/component: jaeger
>     app.kubernetes.io/name: sourcegraph
>     app.kubernetes.io/version: 5.3.9104
69d108
<     helm.sh/chart: sourcegraph-5.3.2
70a110,119
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
71a121,122
>   minReadySeconds: 10
>   progressDeadlineSeconds: 600
77,78d127
<       app.kubernetes.io/component: all-in-one
<       app.kubernetes.io/name: jaeger
84d132
<         kubectl.kubernetes.io/default-container: jaeger
86a135
>       creationTimestamp: null
89,91d137
<         app.kubernetes.io/component: all-in-one
<         app.kubernetes.io/instance: release-name
<         app.kubernetes.io/name: jaeger
92a139
>       name: jaeger
94d140
<       affinity: null
98,99c144
<           env: null
<           image: index.docker.io/sourcegraph/jaeger-all-in-one:5.3.2@sha256:b68de2304ea2db0187aee9e2d0bfce02365583b2ad25f1780ea64ca135a8253b
---
>           image: index.docker.io/sourcegraph/jaeger-all-in-one:5.3.9104
115a161
>             failureThreshold: 3
118a165
>               scheme: HTTP
119a167,169
>             periodSeconds: 10
>             successThreshold: 1
>             timeoutSeconds: 1
132,136c182,290
<           volumeMounts: null
<       nodeSelector: null
<       securityContext: {}
<       tolerations: null
<       volumes: null
---
>           terminationMessagePath: /dev/termination-log
>           terminationMessagePolicy: FallbackToLogsOnError
>       dnsPolicy: ClusterFirst
>       restartPolicy: Always
>       schedulerName: default-scheduler
>       securityContext:
>         fsGroup: 101
>         fsGroupChangePolicy: OnRootMismatch
>         runAsGroup: 101
>         runAsUser: 100
>       serviceAccount: jaeger
>       serviceAccountName: jaeger
>       terminationGracePeriodSeconds: 30
> status: {}
> ---
> # golden: ConfigMap/sg
> apiVersion: v1
> data:
>   spec: |
>     spec:
>       requestedVersion: "5.3.9104"
> 
>       blobstore:
>         disabled: true
> 
>       codeInsights:
>         disabled: true
> 
>       codeIntel:
>         disabled: true
> 
>       frontend:
>         disabled: true
> 
>       gitServer:
>         disabled: true
> 
>       indexedSearch:
>         disabled: true
> 
>       openTelemetry:
>         disabled: true
> 
>       pgsql:
>         disabled: true
> 
>       postgresExporter:
>         disabled: true
> 
>       preciseCodeIntel:
>         disabled: true
> 
>       redisCache:
>         disabled: true
> 
>       redisStore:
>         disabled: true
> 
>       repoUpdater:
>         disabled: true
> 
>       searcher:
>         disabled: true
> 
>       symbols:
>         disabled: true
> 
>       syntectServer:
>         disabled: true
> 
>       worker:
>         disabled: true
> 
>       prometheus:
>         disabled: true
> 
>       jaeger:
>         disabled: false
> kind: ConfigMap
> metadata:
>   annotations:
>     appliance.sourcegraph.com/currentVersion: 5.3.9104
>     appliance.sourcegraph.com/managed: "true"
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   name: sg
>   namespace: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
> ---
> # golden: ServiceAccount/jaeger
> apiVersion: v1
> kind: ServiceAccount
> metadata:
>   annotations:
>     appliance.sourcegraph.com/configHash: a975cc1b1a01d8f09ff9fb298bfbfa5a75ffd4b82636e8628255be58a34bba3b
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   labels:
>     deploy: sourcegraph
>   name: jaeger
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>     - apiVersion: v1
>       blockOwnerDeletion: true
>       controller: true
>       kind: ConfigMap
>       name: sg
>       uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING

@Chickensoupwithrice Chickensoupwithrice marked this pull request as ready for review July 2, 2024 21:48
@Chickensoupwithrice

Copy link
Copy Markdown
Contributor

Paired with @DaedalusG and @jdpleiness on this one...

@Chickensoupwithrice Chickensoupwithrice enabled auto-merge (squash) July 2, 2024 21:50
@Chickensoupwithrice Chickensoupwithrice merged commit 0d004a0 into main Jul 3, 2024
@Chickensoupwithrice Chickensoupwithrice deleted the wg/rel/jaeger branch July 3, 2024 01:08
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants