Skip to content

Make the Postgres healthchecks more lenient#824

Merged
chrismwendt merged 3 commits into
masterfrom
more-lenient-postgres-healthcheck
Jun 2, 2022
Merged

Make the Postgres healthchecks more lenient#824
chrismwendt merged 3 commits into
masterfrom
more-lenient-postgres-healthcheck

Conversation

@chrismwendt

Copy link
Copy Markdown
Contributor

See sourcegraph/deploy-sourcegraph#4136

Checklist

Test plan

N/A

Comment thread docker-compose/docker-compose.yaml Outdated
interval: 10s
timeout: 1s
retries: 10
retries: 360

@caugustus-sourcegraph caugustus-sourcegraph Jun 1, 2022

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the health of a running container can fail for a full hour, is this healthcheck really even doing anything valuable? Perhaps modifying the start_period is a better option here, if this is intended to address the same slow startup issue as in sourcegraph/deploy-sourcegraph#4136.

https://docs.docker.com/engine/reference/builder/#healthcheck

start period provides initialization time for containers that need time to bootstrap. Probe failure during that period will not be counted towards the maximum number of retries. However, if a health check succeeds during the start period, the container is considered started and all consecutive failures will be counted towards the maximum number of retries.

An hour feels too long here (and on the Kubernetes startup probe), but I don't have any hard data to gauge typical recovery startup times. It might not make any difference - most of the the non-OOM failures we see aren't recoverable from a restart (example: bad file system permissions).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 switched to start_period.

Comment thread docker-compose/docker-compose.yaml Outdated
interval: 10s
timeout: 1s
retries: 10
retries: 360

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that codeinsights-db isn't included in this change?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated codeinsights-db, too.

@chrismwendt chrismwendt merged commit 4bcdf13 into master Jun 2, 2022
@chrismwendt chrismwendt deleted the more-lenient-postgres-healthcheck branch June 2, 2022 02:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants