Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

fix(appliance): do not delete unowned resources#63826

Merged
craigfurman merged 1 commit into
mainfrom
appliance-dont-delete-unowned-objects
Jul 16, 2024
Merged

fix(appliance): do not delete unowned resources#63826
craigfurman merged 1 commit into
mainfrom
appliance-dont-delete-unowned-objects

Conversation

@craigfurman

Copy link
Copy Markdown
Contributor

The appliance checks owner references, and only deletes (or updates) resources whose owner references matches the configmap being reconciled.

The current user interface to external databases, is for the admin to create secrets with a well-known name out of band (e.g. "pgsql-auth") and then disable the relevant backing services (e.g. pgsql). This commit fixes a bug in which the appliance would have deleted such secrets, leaving the admin unable to use external databases.

Discovered while investigating
https://linear.app/sourcegraph/issue/REL-14/ensure-pods-roll-when-referenced-secret-changes, but does not close it.

Test plan

Test against a real k8s API server (envtest) included. Tests that exercise deletion, e.g. TestApplianceTestSuite/TestResourcesDeletedWhenDisabled, still pass.

Changelog

@craigfurman craigfurman added the no-changelog Exclude this PR from the next changelog. label Jul 15, 2024
@cla-bot cla-bot Bot added the cla-signed label Jul 15, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 105 to 108

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Chickensoupwithrice I threw this in, since it seemed that if we refuse to delete non-owned resources, we should also refuse to update such resources. Depending on the direction we choose to go with accommodating admin config changes to frontend's service and ingress, we might either want to drop these lines, or add a flag or something to allow dropping this constraint for a few resources. This would be done so that we can "adopt" these 2 resources that were helm-provisioned, in the appliance.

Conversations on https://linear.app/sourcegraph/issue/REL-218/ingressservice-consistency-for-helm-deployment and. https://github.com/sourcegraph/sourcegraph/pull/63627 seem to have stalled. I think we should decide how we want to proceed there, and then make sure I don't break our decision here 😁

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can think of a few broad options on how to balance the the risks of updating non-owned resources with the need (if it indeed exists) for the appliance to "adopt" frontend's service and ingress that were provisioned by helm. Bear in mind that in the future we may want to "adopt" all helm-provisioned resources to offer a migration path from helm:

  1. Remove these lines. In general, the appliance will update resources that it expects to own by GVKNN, regardless of whether or not it owns them
  2. Keep these lines, and treat frontend's service and ingress as special cases in code, allowing them to be "adopted"
    1. We could even SetControllerReference() on these 2 resources, making it very clear in kube terms that the appliance ConfigMap owns them
    2. This punts the decision on adoption more generally (i.e. helm migration path) - which is fine, that's out-of-scope for EAP. 2-way door decisions etc etc

Feel free to suggest more @Chickensoupwithrice @jdpleiness

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per https://sourcegraph.slack.com/archives/C05EH3JP15Z/p1721144468116559?thread_ts=1721122657.775889&cid=C05EH3JP15Z, let's roll forward with this. We'll need to figure out how ingress adoption should work at some point soon, and we can always change this. Classic 2-way door decision

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'm a fan of 2. Explicitly setting ownership to Appliance for objects we know we want to adopt seems like a reasonable path forward.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM 👍

@craigfurman craigfurman Jul 15, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scenario fails without the changes in kubernetes.go. Note the use of envtest without golden fixtures - I think it was clearer in this particular case. We could always do this in this package, I'm only calling it out because I think it's the first test that doesn't use golden fixtures here.

The appliance checks owner references, and only deletes (or updates)
resources whose owner references matches the configmap being reconciled.

The current user interface to external databases, is for the admin to
create secrets with a well-known name out of band (e.g. "pgsql-auth")
and then disable the relevant backing services (e.g. pgsql). This commit
fixes a bug in which the appliance would have deleted such secrets,
leaving the admin unable to use external databases.

Discovered while investigating
https://linear.app/sourcegraph/issue/REL-14/ensure-pods-roll-when-referenced-secret-changes,
but does not close it.
@craigfurman craigfurman merged commit 9186428 into main Jul 16, 2024
@craigfurman craigfurman deleted the appliance-dont-delete-unowned-objects branch July 16, 2024 16:18
craigfurman pushed a commit that referenced this pull request Jul 31, 2024
The appliance checks owner references, and only deletes (or updates)
resources whose owner references matches the configmap being reconciled.

The current user interface to external databases, is for the admin to
create secrets with a well-known name out of band (e.g. "pgsql-auth")
and then disable the relevant backing services (e.g. pgsql). This commit
fixes a bug in which the appliance would have deleted such secrets,
leaving the admin unable to use external databases.

Discovered while investigating

https://linear.app/sourcegraph/issue/REL-14/ensure-pods-roll-when-referenced-secret-changes,
but does not close it.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

cla-signed no-changelog Exclude this PR from the next changelog.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants