Add a resilient sidekiq client (like resilient logged webhooks) #965
Merged
rgalanakis merged 4 commits into main on Jun 9, 2025
The `avoid_writes?` query was using COUNT with LIMIT 1, which doesn't do anything.
There are generally four situations the app can be in when a webhook comes in:

- Stable, everything is handled well.
- Programming error, which should 500.
- Postgres is unavailable; in this case, we use the resilient logged webhooks, which will automatically retry the webhook later. It stores the webhook in another Postgres in the meantime.
- Redis is unavailable; in this case, we were 500ing, but because the LoggedWebhook insert succeeded, we would never retry the webhook.

This last condition means we weren't meeting the uptime guarantees we should be, which are supposed to let us ingest webhooks for processing even when data stores are unavailable.

To solve this, we add a new 'resilient' sidekiq client that will push to a Postgres database when Redis is unavailable, the same way logged webhooks write to another Postgres database if the initial insert fails. Using the same logic as resilient logged webhooks, we replay these events when Redis becomes available.

In this case, because the job is done async, we can still 200 from the webhook handler; it doesn't matter to the caller whether we are processing the job directly via Sidekiq, or through the resilient Postgres datastore and then eventually onto Sidekiq.
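The fallback-and-replay flow described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: `FakeRedisQueue`, `FakeFallbackStore`, `ResilientClient`, and `RedisDown` are hypothetical in-memory stand-ins and are not the classes added in this PR, and no real Sidekiq, Redis, or Postgres calls are made.

```ruby
require "json"

# Hypothetical error standing in for a Redis connection failure.
class RedisDown < StandardError; end

# Hypothetical stand-in for the normal Redis-backed Sidekiq push path.
class FakeRedisQueue
  attr_accessor :available
  attr_reader :jobs

  def initialize
    @available = true
    @jobs = []
  end

  def push(job)
    raise RedisDown, "redis unavailable" unless @available
    @jobs << job
    job["jid"]
  end
end

# Hypothetical stand-in for the fallback Postgres table of serialized jobs.
class FakeFallbackStore
  def initialize
    @rows = []
  end

  def insert(job)
    @rows << JSON.generate(job)
  end

  def size
    @rows.size
  end

  # Yields each stored job in order; a row is removed only after the
  # block returns without raising, so a failed replay keeps the row.
  def drain
    until @rows.empty?
      yield JSON.parse(@rows.first)
      @rows.shift
    end
  end
end

# Hypothetical client: try the normal push, fall back to the store.
class ResilientClient
  def initialize(queue:, fallback:)
    @queue = queue
    @fallback = fallback
  end

  def push(job)
    @queue.push(job)
  rescue RedisDown
    # The caller still gets a job id back, so the webhook handler can 200.
    @fallback.insert(job)
    job["jid"]
  end

  # Re-enqueue stored jobs once Redis is reachable again.
  def replay
    @fallback.drain { |job| @queue.push(job) }
  end
end

queue = FakeRedisQueue.new
store = FakeFallbackStore.new
client = ResilientClient.new(queue: queue, fallback: store)

queue.available = false
jid = client.push({"jid" => "abc", "class" => "WebhookJob", "args" => [1]})
stored_while_down = store.size

queue.available = true
client.replay
```

In this sketch the caller of `push` cannot tell which path was taken, which mirrors the point that the webhook handler can respond the same way in both cases.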
Codecov Report: All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #965 +/- ##
==========================================
+ Coverage 97.07% 97.57% +0.49%
==========================================
Files 488 490 +2
Lines 31046 30991 -55
==========================================
+ Hits 30139 30240 +101
+ Misses 907 751 -156
- Delete unused code
- Cover some easy code missing coverage
- Rewrite NotImplemented to single line methods that pass coverage
Add a resilient sidekiq client
Refactor LoggedWebhook::Resilient into reusable base class and helper
Fix non-optimal query in `avoid_writes?`
Was using COUNT with LIMIT 1, which doesn't do anything.
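Why the LIMIT has no effect: an aggregate like COUNT without GROUP BY returns exactly one row, and LIMIT is applied to the result rows after aggregation, so every matching row is still counted. The commit text doesn't show the replacement query; the toy model below (a hypothetical `ToyTable`, not Postgres and not code from this PR) only contrasts counting with an existence-style check that can stop at the first match.

```ruby
# Hypothetical in-memory "table" that records how many rows a scan visits.
class ToyTable
  attr_reader :rows_scanned

  def initialize(rows)
    @rows = rows
    @rows_scanned = 0
  end

  def scan
    return enum_for(:scan) unless block_given?
    @rows.each do |row|
      @rows_scanned += 1
      yield row
    end
  end
end

rows = Array.new(1000) { |i| {id: i, pending: true} }

# Models "SELECT COUNT(*) ... LIMIT 1": the aggregate visits every row,
# then LIMIT 1 is applied to the single result row.
counted = ToyTable.new(rows)
count_result = [counted.scan.count { |r| r[:pending] }].first(1)

# Models an existence-style check: stop at the first matching row.
checked = ToyTable.new(rows)
any_pending = checked.scan.any? { |r| r[:pending] }
```

Here `counted.rows_scanned` ends up equal to the table size while `checked.rows_scanned` stays at 1; real query plans depend on indexes and the planner, so treat this as intuition rather than a benchmark.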