appliance: add pgsql service definition#62512
Conversation
craigfurman
left a comment
There was a problem hiding this comment.
A few issues with the compare-helm diff, some more notes and replies. A great start, this is definitely nearly ready!
There was a problem hiding this comment.
This is something I was thinking of making a standard feature out of, but that can be done later in a refactor. If it introduces weird-looking if-statements, better to have a little duplication. No action needed, just a note.
There was a problem hiding this comment.
Annotated output of compare helm:
go run ./internal/appliance/dev/compare-helm \
-component pgsql \
-golden-file internal/appliance/testdata/golden-fixtures/pgsql/default.yaml
2c2
< # helm: Secret/pgsql-auth
---
> # golden: Secret/pgsql-auth
9a10
> immutable: false
11a13,15
> annotations:
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> creationTimestamp: "2024-04-19T00:00:00Z"
13,14d16
< app: pgsql
< app.kubernetes.io/component: pgsql
16a19,28
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
19c31
< # helm: ConfigMap/pgsql-conf
---
> # golden: ConfigMap/pgsql-conf
78a91
> immutable: false
82c95,96
< description: Configuration for PostgreSQL
---
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> creationTimestamp: "2024-04-19T00:00:00Z"
84d97
< app.kubernetes.io/component: pgsql
86a100,109
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
88c111
< # helm: PersistentVolumeClaim/pgsql
---
> # golden: PersistentVolumeClaim/pgsql
91a115,119
> annotations:
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> creationTimestamp: "2024-04-19T00:00:00Z"
> finalizers:
> - kubernetes.io/pvc-protection
93d120
< app.kubernetes.io/component: pgsql
95a123,132
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
102a140,142
> volumeMode: Filesystem
> status:
> phase: Pending
104c144
< # helm: Service/pgsql
---
> # golden: Service/pgsql
109c149,150
< prometheus.io/port: "9187"
---
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> prometheus.io/port: "6060"Noted in defaults.go
110a152
> creationTimestamp: "2024-04-19T00:00:00Z"
115a158,167
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
116a169,175
> clusterIP: NORMALIZED_FOR_TESTING
> clusterIPs:
> - NORMALIZED_FOR_TESTING
> internalTrafficPolicy: Cluster
> ipFamilies:
> - IPv4
> ipFamilyPolicy: SingleStack
119a179
> protocol: TCP
123,124c183
< app.kubernetes.io/instance: release-name
< app.kubernetes.io/name: sourcegraph
---
> sessionAffinity: None
125a185,186
> status:
> loadBalancer: {}
127c188
< # helm: StatefulSet/pgsql
---
> # golden: StatefulSet/pgsql
132c193,195
< description: Postgres database for various data.
---
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> creationTimestamp: "2024-04-19T00:00:00Z"
> generation: 1
135,136d197
< app.kubernetes.io/instance: release-name
< app.kubernetes.io/managed-by: Helm
138c199
< app.kubernetes.io/version: 5.3.2
---
> app.kubernetes.io/version: 5.3.9104
140d200
< helm.sh/chart: sourcegraph-5.3.2
141a202,211
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
142a213,217
> minReadySeconds: 10
> persistentVolumeClaimRetentionPolicy:
> whenDeleted: Retain
> whenScaled: Retain
> podManagementPolicy: OrderedReady
148,149d222
< app.kubernetes.io/instance: release-name
< app.kubernetes.io/name: sourcegraph
154d226
< checksum/pgsql.secret: 180940fdb956526d197a8efaf15bc2f14a3db83e09610917f8b9040fa5232d39
155a228
> creationTimestamp: null
158,159d230
< app.kubernetes.io/instance: release-name
< app.kubernetes.io/name: sourcegraph
161c232
< group: backend
---
> name: pgsql
163d233
< affinity: null
198a269
> failureThreshold: 3
199a271,273
> periodSeconds: 10
> successThreshold: 1
> timeoutSeconds: 1
203a278
> protocol: TCP
207a283,286
> failureThreshold: 3
> periodSeconds: 10
> successThreshold: 1
> timeoutSeconds: 1
218,219c297,298
< runAsGroup: 999
< runAsUser: 999
---
> runAsGroup: 101
> runAsUser: 100This user ID diff might stop the process from running successfully, not 100%
sure though. Depends on what's in the rootFS. May as well squash the diff?
Similar diffs for other containers although I haven't commented on these.
225a305,307
> successThreshold: 1
> timeoutSeconds: 1
> terminationMessagePath: /dev/termination-log
261a344
> imagePullPolicy: IfNotPresent
266c349
< memory: 50Mi
---
> memory: 50MDefault resources are often differing by M vs Mi in this diff. Very minor note,
probably not a real problem, but may as well squash the diff.
269,270c352,358
< memory: 50Mi
< securityContext: null
---
> memory: 50M
> securityContext:
> allowPrivilegeEscalation: false
> readOnlyRootFilesystem: true
> runAsGroup: 101
> runAsUser: 100
> terminationMessagePath: /dev/termination-log
271a360
> dnsPolicy: ClusterFirst
283c372
< memory: 50Mi
---
> memory: 50M
286c375
< memory: 50Mi
---
> memory: 50M
290,291c379,382
< runAsGroup: 999
< runAsUser: 999
---
> runAsGroup: 101
> runAsUser: 100
> terminationMessagePath: /dev/termination-log
> terminationMessagePolicy: FallbackToLogsOnError
295c386,387
< nodeSelector: null
---
> restartPolicy: Always
> schedulerName: default-scheduler
297c389
< fsGroup: 999
---
> fsGroup: 101
299,302c391,395
< runAsGroup: 999
< runAsUser: 999
< terminationGracePeriodSeconds: 120
< tolerations: null
---
> runAsGroup: 101
> runAsUser: 100
> serviceAccount: pgsql
> serviceAccountName: pgsql
> terminationGracePeriodSeconds: 30terminationGracePeriodSeconds differs here (it's longer in helm).
Looks like we also unconditionally use a ServiceAccount here but not in helm,
which is probably harmless! I probably should have passed an extra-arg to my
compare-helm invocation to account for this.
303a397,402
> - emptyDir: {}
> name: lockdir
> - emptyDir:
> medium: Memory
> sizeLimit: 1Gi
> name: dshm
311,316d409
< - emptyDir:
< medium: Memory
< sizeLimit: 1G
< name: dshm
< - emptyDir: {}
< name: lockdir
318a412,505
> status:
> availableReplicas: 0
> replicas: 0
> ---
> # golden: ConfigMap/sg
> apiVersion: v1
> data:
> spec: |
> spec:
> requestedVersion: "5.3.9104"
>
> blobstore:
> disabled: true
>
> codeInsights:
> disabled: true
>
> codeIntel:
> disabled: true
>
> frontend:
> disabled: true
>
> gitServer:
> disabled: true
>
> indexedSearch:
> disabled: true
>
> indexedSearchIndexer:
> disabled: true
>
> pgsql: {}
>
> postgresExporter:
> disabled: true
>
> preciseCodeIntel:
> disabled: true
>
> redisCache:
> disabled: true
>
> redisExporter:
> disabled: true
>
> redisStore:
> disabled: true
>
> repoUpdater:
> disabled: true
>
> searcher:
> disabled: true
>
> symbols:
> disabled: true
>
> syntectServer:
> disabled: true
>
> worker:
> disabled: true
> kind: ConfigMap
> metadata:
> annotations:
> appliance.sourcegraph.com/currentVersion: 5.3.9104
> appliance.sourcegraph.com/managed: "true"
> creationTimestamp: "2024-04-19T00:00:00Z"
> name: sg
> namespace: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTING
> ---
> # golden: ServiceAccount/pgsql
> apiVersion: v1
> kind: ServiceAccount
> metadata:
> annotations:
> appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
> creationTimestamp: "2024-04-19T00:00:00Z"
> labels:
> deploy: sourcegraph
> name: pgsql
> namespace: NORMALIZED_FOR_TESTING
> ownerReferences:
> - apiVersion: v1
> blockOwnerDeletion: true
> controller: true
> kind: ConfigMap
> name: sg
> uid: NORMALIZED_FOR_TESTING
> resourceVersion: NORMALIZED_FOR_TESTING
> uid: NORMALIZED_FOR_TESTINGThere was a problem hiding this comment.
Default resources are often differing by M vs Mi in this diff. Very minor note,
probably not a real problem, but may as well squash the diff.
I've been doing this intentionally just to standardize on the units we use. Most these limits/requests are guesses at best. If we find otherwise, I'll revert.
This user ID diff might stop the process from running successfully, not 100%
sure though. Depends on what's in the rootFS. May as well squash the diff?
Similar diffs for other containers although I haven't commented on these.
This was another attempt to standardize as well, but I'm a bit more leery of this for the reasons you mentioned and I'm not 100% sure myself.
There was a problem hiding this comment.
re: resources standardisation: 👍. This will technically cause a small (2%-ish) downscale, but that's relatively unlikely to tip anyone over into saturation. Users can configure custom resources anyway.
re: uid/gid standardisation: in general, that would require modifying the root filesystems of the container images to chown certain files. Otherwise the unprivileged user we exec as, might not be able to read/exec files that are needed at runtime. This is a general weird point about linux containers, and a few projects (e.g. shiftfs) exist to try to standardise this, mapping runtime UIDs to a different namespace of in-image UIDs, but I'm not sure any of these are compatible with common k8s container runtimes.
This is taking me back, I remember discussing similar UID-mapping mechanisms when we were introducing Docker image support to Cloud Foundry in 2015 😁
745e79e to
6614d3b
Compare
2ea52e9 to
3916a08
Compare
craigfurman
left a comment
There was a problem hiding this comment.
🚀
FYI: I added checksum/pgsql.secret to the list in https://github.com/sourcegraph/sourcegraph/issues/62227
3916a08 to
b19732f
Compare
Add
pgsqlservice definition.Test plan
Golden files