Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

appliance: add pgsql service definition#62512

Merged
jdpleiness merged 8 commits into
mainfrom
jdp/appliance/pgsql-service-def
May 10, 2024
Merged

appliance: add pgsql service definition#62512
jdpleiness merged 8 commits into
mainfrom
jdp/appliance/pgsql-service-def

Conversation

@jdpleiness

Copy link
Copy Markdown
Contributor

Add pgsql service definition.

Test plan

Golden files

@cla-bot cla-bot Bot added the cla-signed label May 8, 2024
Comment thread internal/appliance/config/embed.go Outdated
Comment thread internal/appliance/defaults.go Outdated
Comment thread internal/k8s/resource/container/container.go Outdated
Comment thread internal/k8s/resource/container/container.go Outdated
Comment thread internal/appliance/defaults.go Outdated
@jdpleiness jdpleiness requested a review from craigfurman May 8, 2024 00:49
Comment thread internal/appliance/pgsql.go Outdated

@craigfurman craigfurman left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few issues with the compare-helm diff, some more notes and replies. A great start, this is definitely nearly ready!

Comment thread internal/appliance/config/embed.go Outdated
Comment thread internal/appliance/defaults.go Outdated
Comment thread internal/appliance/defaults.go Outdated
Comment thread internal/appliance/pgsql.go Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is something I was thinking of making a standard feature out of, but that can be done later in a refactor. If it introduces weird-looking if-statements, better to have a little duplication. No action needed, just a note.

Comment thread internal/k8s/resource/container/container.go Outdated
Comment thread internal/appliance/spec.go Outdated
Comment thread internal/appliance/defaults.go Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotated output of compare helm:

go run ./internal/appliance/dev/compare-helm \
  -component pgsql \
  -golden-file internal/appliance/testdata/golden-fixtures/pgsql/default.yaml

2c2
< # helm: Secret/pgsql-auth
---
> # golden: Secret/pgsql-auth
9a10
> immutable: false
11a13,15
>   annotations:
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>   creationTimestamp: "2024-04-19T00:00:00Z"
13,14d16
<     app: pgsql
<     app.kubernetes.io/component: pgsql
16a19,28
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
19c31
< # helm: ConfigMap/pgsql-conf
---
> # golden: ConfigMap/pgsql-conf
78a91
> immutable: false
82c95,96
<     description: Configuration for PostgreSQL
---
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>   creationTimestamp: "2024-04-19T00:00:00Z"
84d97
<     app.kubernetes.io/component: pgsql
86a100,109
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
88c111
< # helm: PersistentVolumeClaim/pgsql
---
> # golden: PersistentVolumeClaim/pgsql
91a115,119
>   annotations:
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   finalizers:
>   - kubernetes.io/pvc-protection
93d120
<     app.kubernetes.io/component: pgsql
95a123,132
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
102a140,142
>   volumeMode: Filesystem
> status:
>   phase: Pending
104c144
< # helm: Service/pgsql
---
> # golden: Service/pgsql
109c149,150
<     prometheus.io/port: "9187"
---
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>     prometheus.io/port: "6060"

Noted in defaults.go

110a152
>   creationTimestamp: "2024-04-19T00:00:00Z"
115a158,167
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
116a169,175
>   clusterIP: NORMALIZED_FOR_TESTING
>   clusterIPs:
>   - NORMALIZED_FOR_TESTING
>   internalTrafficPolicy: Cluster
>   ipFamilies:
>   - IPv4
>   ipFamilyPolicy: SingleStack
119a179
>     protocol: TCP
123,124c183
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/name: sourcegraph
---
>   sessionAffinity: None
125a185,186
> status:
>   loadBalancer: {}
127c188
< # helm: StatefulSet/pgsql
---
> # golden: StatefulSet/pgsql
132c193,195
<     description: Postgres database for various data.
---
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   generation: 1
135,136d197
<     app.kubernetes.io/instance: release-name
<     app.kubernetes.io/managed-by: Helm
138c199
<     app.kubernetes.io/version: 5.3.2
---
>     app.kubernetes.io/version: 5.3.9104
140d200
<     helm.sh/chart: sourcegraph-5.3.2
141a202,211
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
142a213,217
>   minReadySeconds: 10
>   persistentVolumeClaimRetentionPolicy:
>     whenDeleted: Retain
>     whenScaled: Retain
>   podManagementPolicy: OrderedReady
148,149d222
<       app.kubernetes.io/instance: release-name
<       app.kubernetes.io/name: sourcegraph
154d226
<         checksum/pgsql.secret: 180940fdb956526d197a8efaf15bc2f14a3db83e09610917f8b9040fa5232d39
155a228
>       creationTimestamp: null
158,159d230
<         app.kubernetes.io/instance: release-name
<         app.kubernetes.io/name: sourcegraph
161c232
<         group: backend
---
>       name: pgsql
163d233
<       affinity: null
198a269
>           failureThreshold: 3
199a271,273
>           periodSeconds: 10
>           successThreshold: 1
>           timeoutSeconds: 1
203a278
>           protocol: TCP
207a283,286
>           failureThreshold: 3
>           periodSeconds: 10
>           successThreshold: 1
>           timeoutSeconds: 1
218,219c297,298
<           runAsGroup: 999
<           runAsUser: 999
---
>           runAsGroup: 101
>           runAsUser: 100

This user ID diff might stop the process from running successfully, not 100%
sure though. Depends on what's in the rootFS. May as well squash the diff?
Similar diffs for other containers although I haven't commented on these.

225a305,307
>           successThreshold: 1
>           timeoutSeconds: 1
>         terminationMessagePath: /dev/termination-log
261a344
>         imagePullPolicy: IfNotPresent
266c349
<             memory: 50Mi
---
>             memory: 50M

Default resources are often differing by M vs Mi in this diff. Very minor note,
probably not a real problem, but may as well squash the diff.

269,270c352,358
<             memory: 50Mi
<         securityContext: null
---
>             memory: 50M
>         securityContext:
>           allowPrivilegeEscalation: false
>           readOnlyRootFilesystem: true
>           runAsGroup: 101
>           runAsUser: 100
>         terminationMessagePath: /dev/termination-log
271a360
>       dnsPolicy: ClusterFirst
283c372
<             memory: 50Mi
---
>             memory: 50M
286c375
<             memory: 50Mi
---
>             memory: 50M
290,291c379,382
<           runAsGroup: 999
<           runAsUser: 999
---
>           runAsGroup: 101
>           runAsUser: 100
>         terminationMessagePath: /dev/termination-log
>         terminationMessagePolicy: FallbackToLogsOnError
295c386,387
<       nodeSelector: null
---
>       restartPolicy: Always
>       schedulerName: default-scheduler
297c389
<         fsGroup: 999
---
>         fsGroup: 101
299,302c391,395
<         runAsGroup: 999
<         runAsUser: 999
<       terminationGracePeriodSeconds: 120
<       tolerations: null
---
>         runAsGroup: 101
>         runAsUser: 100
>       serviceAccount: pgsql
>       serviceAccountName: pgsql
>       terminationGracePeriodSeconds: 30

terminationGracePeriodSeconds differs here (it's longer in helm).
Looks like we also unconditionally use a ServiceAccount here but not in helm,
which is probably harmless! I probably should have passed an extra-arg to my
compare-helm invocation to account for this.

303a397,402
>       - emptyDir: {}
>         name: lockdir
>       - emptyDir:
>           medium: Memory
>           sizeLimit: 1Gi
>         name: dshm
311,316d409
<       - emptyDir:
<           medium: Memory
<           sizeLimit: 1G
<         name: dshm
<       - emptyDir: {}
<         name: lockdir
318a412,505
> status:
>   availableReplicas: 0
>   replicas: 0
> ---
> # golden: ConfigMap/sg
> apiVersion: v1
> data:
>   spec: |
>     spec:
>       requestedVersion: "5.3.9104"
> 
>       blobstore:
>         disabled: true
> 
>       codeInsights:
>         disabled: true
> 
>       codeIntel:
>         disabled: true
> 
>       frontend:
>         disabled: true
> 
>       gitServer:
>         disabled: true
> 
>       indexedSearch:
>         disabled: true
> 
>       indexedSearchIndexer:
>         disabled: true
> 
>       pgsql: {}
> 
>       postgresExporter:
>         disabled: true
> 
>       preciseCodeIntel:
>         disabled: true
> 
>       redisCache:
>         disabled: true
> 
>       redisExporter:
>         disabled: true
> 
>       redisStore:
>         disabled: true
> 
>       repoUpdater:
>         disabled: true
> 
>       searcher:
>         disabled: true
> 
>       symbols:
>         disabled: true
> 
>       syntectServer:
>         disabled: true
> 
>       worker:
>         disabled: true
> kind: ConfigMap
> metadata:
>   annotations:
>     appliance.sourcegraph.com/currentVersion: 5.3.9104
>     appliance.sourcegraph.com/managed: "true"
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   name: sg
>   namespace: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING
> ---
> # golden: ServiceAccount/pgsql
> apiVersion: v1
> kind: ServiceAccount
> metadata:
>   annotations:
>     appliance.sourcegraph.com/configHash: 86fa1e5cbb702a914107796b2236964ac6042a4f399010458f077f8a854c0cc8
>   creationTimestamp: "2024-04-19T00:00:00Z"
>   labels:
>     deploy: sourcegraph
>   name: pgsql
>   namespace: NORMALIZED_FOR_TESTING
>   ownerReferences:
>   - apiVersion: v1
>     blockOwnerDeletion: true
>     controller: true
>     kind: ConfigMap
>     name: sg
>     uid: NORMALIZED_FOR_TESTING
>   resourceVersion: NORMALIZED_FOR_TESTING
>   uid: NORMALIZED_FOR_TESTING

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default resources are often differing by M vs Mi in this diff. Very minor note,
probably not a real problem, but may as well squash the diff.

I've been doing this intentionally just to standardize on the units we use. Most these limits/requests are guesses at best. If we find otherwise, I'll revert.

This user ID diff might stop the process from running successfully, not 100%
sure though. Depends on what's in the rootFS. May as well squash the diff?
Similar diffs for other containers although I haven't commented on these.

This was another attempt to standardize as well, but I'm a bit more leery of this for the reasons you mentioned and I'm not 100% sure myself.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: resources standardisation: 👍. This will technically cause a small (2%-ish) downscale, but that's relatively unlikely to tip anyone over into saturation. Users can configure custom resources anyway.

re: uid/gid standardisation: in general, that would require modifying the root filesystems of the container images to chown certain files. Otherwise the unprivileged user we exec as, might not be able to read/exec files that are needed at runtime. This is a general weird point about linux containers, and a few projects (e.g. shiftfs) exist to try to standardise this, mapping runtime UIDs to a different namespace of in-image UIDs, but I'm not sure any of these are compatible with common k8s container runtimes.

This is taking me back, I remember discussing similar UID-mapping mechanisms when we were introducing Docker image support to Cloud Foundry in 2015 😁

@jdpleiness jdpleiness force-pushed the jdp/appliance/pgsql-service-def branch from 745e79e to 6614d3b Compare May 8, 2024 16:05
@jdpleiness jdpleiness marked this pull request as ready for review May 8, 2024 18:14
@jdpleiness jdpleiness force-pushed the jdp/appliance/pgsql-service-def branch from 2ea52e9 to 3916a08 Compare May 8, 2024 18:28
@jdpleiness jdpleiness requested a review from craigfurman May 9, 2024 00:43

@craigfurman craigfurman left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

FYI: I added checksum/pgsql.secret to the list in https://github.com/sourcegraph/sourcegraph/issues/62227

@jdpleiness jdpleiness force-pushed the jdp/appliance/pgsql-service-def branch from 3916a08 to b19732f Compare May 10, 2024 17:38
@jdpleiness jdpleiness merged commit 5b7c98e into main May 10, 2024
@jdpleiness jdpleiness deleted the jdp/appliance/pgsql-service-def branch May 10, 2024 17:59
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants