Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

[Backport 5.5.x] fix(appliance): reliably redirect to site-admin post-install#64220

Merged
craigfurman merged 1 commit into
5.5.xfrom
backport-64216-to-5.5.x
Aug 1, 2024
Merged

[Backport 5.5.x] fix(appliance): reliably redirect to site-admin post-install#64220
craigfurman merged 1 commit into
5.5.xfrom
backport-64216-to-5.5.x

Conversation

@sourcegraph-release-bot

Copy link
Copy Markdown
Collaborator

When the admin has first installed Code Search Suite, the appliance waits for the admin to click an "I'm ready" button. This causes the appliance to unblock a background thread that periodically checks the health of sg-frontend. When it is healthy, it ensures that the ingress-facing frontend is pointed to sg-frontend. And when it is not, it points to the appliance-frontend. Pointing to the appliance-frontend is its initial state pre-install, and given that we've just installed sg, the appliance switches the service over quickly.

Meanwhile, clicking this button transitions the frontend to a "refresh" state (this being one of the states in its state machine). This causes the UI to reload the web page. The reason we have to do this is that it is a way to "redirect to yourself". If the ingress-facing service has been repointed, refreshing like this will show site-admin, which is the desired behavior. The issue this commit fixes, is that this is racy: upon refresh, the browser tab queries the appliance (via an nginx proxy hosted on the same domain serving appliance-frontend) for its state. We have to store state on the backend (specifically, we use a ConfigMap annotation), so that the appliance can do the right thing if it has been rebooted at any time. This will help power future features such as UI-driven upgrades. The race occurs if, upon refresh, the ingress-facing service has been flipped over to sg-frontend. The appliance API that answered the state questions is no longer available!

In general, we can't tell the difference between this expected turn of events, and a state in which the backend can't be reached. This commit mitigates the race by setting the appliance UI to refresh if it cannot reach the appliance API. This looks no different to a "disconnected" state if things really are broken, but in the expected path, it will resolve the race by retrying.

This commit reliably causes the appliance-driven installation flow to redirect to site-admin after clicking "ready", according to my experimentation in minikube. I suspect that this would be the case even without https://github.com/sourcegraph/sourcegraph/pull/64213, which fixes an unrelated performance issue. I suspect we need both, otherwise the appliance UI will regularly disconnect for prolonged periods of time, which is confusing.

Closes https://linear.app/sourcegraph/issue/REL-308/appliance-frontend-seems-to-disconnect-the-backend-during-installation

Test plan

Manual testing described above

Changelog


Backport e54407d from #64216

@craigfurman craigfurman enabled auto-merge (squash) August 1, 2024 16:51
When the admin has first installed Code Search Suite, the appliance
waits for the admin to click an "I'm ready" button. This causes the
appliance to unblock a background thread that periodically checks the
health of sg-frontend. When it is healthy, it ensures that the
ingress-facing frontend is pointed to sg-frontend. And when it is not,
it points to the appliance-frontend. Pointing to the appliance-frontend
is its initial state pre-install, and given that we've just installed
sg, the appliance switches the service over quickly.

Meanwhile, clicking this button transitions the frontend to a "refresh"
state (this being one of the states in its state machine). This causes
the UI to reload the web page. The reason we have to do this is that it
is a way to "redirect to yourself". If the ingress-facing service has
been repointed, refreshing like this will show site-admin, which is the
desired behavior. The issue this commit fixes, is that this is racy:
upon refresh, the browser tab queries the appliance (via an nginx proxy
hosted on the same domain serving appliance-frontend) for its state. We
have to store state on the backend (specifically, we use a ConfigMap
annotation), so that the appliance can do the right thing if it has been
rebooted at any time. This will help power future features such as
UI-driven upgrades. The race occurs if, upon refresh, the ingress-facing
service has been flipped over to sg-frontend. The appliance API that
answered the state questions is no longer available!

In general, we can't tell the difference between this expected turn of
events, and a state in which the backend can't be reached. This commit
mitigates the race by setting the appliance UI to refresh if it cannot
reach the appliance API. This looks no different to a "disconnected"
state if things really are broken, but in the expected path, it will
resolve the race by retrying.

This commit reliably causes the appliance-driven installation flow to
redirect to site-admin after clicking "ready", according to my
experimentation in minikube. I suspect that this would be the case even
without https://github.com/sourcegraph/sourcegraph/pull/64213, which
fixes an unrelated performance issue. I suspect we need both, otherwise
the appliance UI will regularly disconnect for prolonged periods of
time, which is confusing.

Closes
https://linear.app/sourcegraph/issue/REL-308/appliance-frontend-seems-to-disconnect-the-backend-during-installation

(cherry picked from commit e54407d)
@craigfurman craigfurman force-pushed the backport-64216-to-5.5.x branch from 5b20b05 to bc7740d Compare August 1, 2024 16:58
@craigfurman craigfurman merged commit e1e2029 into 5.5.x Aug 1, 2024
@craigfurman craigfurman deleted the backport-64216-to-5.5.x branch August 1, 2024 17:06
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants