Idempotent counter caches, fix concurrency issues with counter caches#14735
tenderlove merged 2 commits into rails:master from
Conversation
This will cause uninitialized variable warnings.
Not if called only in after_destroy, which is the case here since this method is supposed to remain purely private.
I could actually access the variable directly.
Anyway, as I said, I think it would be better to integrate counter cache code in the CounterCache module instead of using callbacks.
@tenderlove besides the uninitialized variable thing, are you interested in this PR? Should I go ahead?
Have you thought about adding the expected cache value to the `UPDATE` statement? Right now the query looks like this:

```sql
UPDATE "topics"
SET "replies_count" = COALESCE("replies_count", 0) - 1
WHERE "topics"."id" = 7
```

But we could change it to something like this:

```sql
UPDATE "topics"
SET "replies_count" = COALESCE("replies_count", 0) - 1
WHERE "topics"."id" = 7
  AND COALESCE("replies_count", 0) = 10
```

If something else had updated the count, then this query would fail. Also, it looks like we have access to the expected count, we just don't use it. Here is a non-working proof-of-concept patch to show you the gist. WDYT?
Also, I am obligated to say that if you don't make your apps acquire a lock when updating the count, it doesn't matter what Ruby changes we make, it's possible for the count to be incorrect.
With all due respect, what you propose is a terrible idea. The
Don't forget that the destroy and the decrement happen in a transaction. So when you destroy the record, it implicitly locks it. If another transaction tries to destroy the same record while the first one is not committed, it will block.
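To illustrate the point above, here is a toy plain-Ruby sketch (no database; a `Mutex` stands in for the implicit row lock): the first "transaction" holds the lock until it commits, so the second destroy blocks until then.

```ruby
# Toy model of the implicit row lock taken by DELETE inside a transaction.
# row_lock plays the role of the database row lock; the event order shows
# that the second destroy cannot run until the first transaction commits.
row_lock = Mutex.new
events = []

t1 = Thread.new do
  row_lock.synchronize do
    events << :t1_destroyed
    sleep 0.2               # transaction still open, lock still held
    events << :t1_committed
  end
end

sleep 0.05                  # let t1 acquire the lock first
t2 = Thread.new do
  row_lock.synchronize { events << :t2_destroyed }  # blocks until t1 commits
end

[t1, t2].each(&:join)
events  # => [:t1_destroyed, :t1_committed, :t2_destroyed]
```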
That escalated quickly.
Ah, makes sense. This should be fine then I think.
Sorry :/ I didn't know how to say it...
For what it's worth, we used to have thousands of negative counters, not to mention the slightly desynchronized ones. So far, with this monkey patch, our monitoring task has not detected a single desynchronized counter in 3 months. Also, what do you think of only fixing the destroy? Should I fix the update too, even if it means one more SQL query per counter cache? Or is that not acceptable? And how would you feel about moving the code out of the callbacks?
Awesome, I'll merge this patch in.
Honestly, we don't use the counter caches at work, so I'm not 100% sure. It seems fine to me, will it be OK for your systems?
👍
Idempotent counter caches, fix concurrency issues with counter caches
Woo! Awesome! Thanks! ❤️
We never update the foreign key, so we don't really care about fixing this part... But my conscience says that we should not fix only half of the feature... I'll submit one or two other PRs to refactor counter cache and maybe fix the update too. Thanks again!
Previously we have just overridden the behavior of `ActiveRecord::Persistence#destroy_row` to apply soft-delete functionality. This does not work, since Rails depends on inheritance to add extra functionality to `#destroy_row`. In Rails 4.2, this includes `ActiveRecord::CounterCache#destroy_row` calling `super`, as well as `ActiveRecord::Locking::Optimistic#destroy_row` also calling `super`. See also rails/rails#14735 rails/rails#14765
Fixes #7965 among others that I remember but can't find anymore.
Problem
In SQL, a `DELETE` statement is inherently idempotent: you can repeat it indefinitely and it will always succeed. Counter cache decrementation, on the other hand, is not. So if you have two processes following this scenario:
Then the counter cache is decremented twice.
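The race can be simulated without a database. In this plain-Ruby sketch (all names are illustrative), `FakeTable#delete` behaves like SQL `DELETE` and returns the number of affected rows, while the naive callback decrements the counter unconditionally:

```ruby
# Minimal simulation of the double-decrement problem.
class FakeTable
  def initialize(ids)
    @rows = ids.to_a
  end

  # Like SQL DELETE: idempotent, returns the number of affected rows.
  def delete(id)
    @rows.delete(id) ? 1 : 0
  end
end

table   = FakeTable.new([7])
counter = 10

# Two processes both destroy record 7:
2.times do
  table.delete(7)  # succeeds both times (0 affected rows the second time)
  counter -= 1     # naive callback decrements regardless
end

counter # => 8, although only one row was ever deleted
```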
Solution
This patch makes the decrement happen in `after_destroy`, so it can check the `affected_rows` value returned by the database and only perform the decrement if a record was actually deleted. We (Shopify) have been running a monkey-patched version of this PR for 3 months now, and so far the problem has disappeared.
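Reusing the same toy simulation (plain Ruby, illustrative names, no Rails API), the fix amounts to guarding the decrement on the affected-row count:

```ruby
# Guarded decrement: only touch the counter when a row was actually deleted.
class FakeTable
  def initialize(ids)
    @rows = ids.to_a
  end

  # Like SQL DELETE: idempotent, returns the number of affected rows.
  def delete(id)
    @rows.delete(id) ? 1 : 0
  end
end

table   = FakeTable.new([7])
counter = 10

2.times do
  affected = table.delete(7)
  counter -= 1 if affected == 1  # the second delete affects 0 rows
end

counter # => 9: decremented exactly once
```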
Implementation
So far I'm not very happy with my implementation, but I think the only way to make it better is to make the counter cache logic happen in `destroy_row`, `insert_row`, etc., instead of doing it in callbacks. But that means moving a lot of code, so I prefer to validate the idea before attempting it.
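As a rough sketch of that direction (not actual Rails code; `Base`, `CounterCacheRow`, and `decrement_counters` are stand-ins invented for illustration), the counter module could override `destroy_row`, call `super` to perform the `DELETE`, and only decrement when a row was actually removed:

```ruby
# Hypothetical module that wraps row deletion with a guarded decrement.
module CounterCacheRow
  def destroy_row
    affected = super            # perform the DELETE, get affected row count
    decrement_counters if affected > 0
    affected
  end
end

# Plain-Ruby stand-ins to make the sketch runnable:
class Base
  attr_reader :counter_events

  def initialize
    @exists = true
    @counter_events = []
  end

  def destroy_row
    deleted = @exists ? 1 : 0   # like DELETE: 0 affected rows if already gone
    @exists = false
    deleted
  end

  def decrement_counters
    @counter_events << :decremented
  end
end

class Reply < Base
  prepend CounterCacheRow       # module override sits above the base method
end

reply = Reply.new
reply.destroy_row             # deletes the row, decrements once
reply.destroy_row             # no row deleted, no decrement
reply.counter_events          # => [:decremented]
```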
Non addressed issues
The counter update that happens when you change the foreign key is subject to the same issue, but it is much harder to fix using this pattern. It would mean updating the foreign key in a distinct `UPDATE` query before the main update, and checking `affected_rows` for it before incrementing / decrementing the counters. So it's feasible, but again I prefer to validate the idea before attempting it.
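A plain-Ruby sketch of that idea (all names hypothetical, not Rails API): the foreign-key change runs as its own guarded "UPDATE", and the counters move only if a row was actually changed.

```ruby
# Hypothetical guarded foreign-key reassignment: behaves like
# UPDATE replies SET topic_id = to_id WHERE id = ... AND topic_id = from_id,
# returning the number of affected rows.
def reassign_parent(row, from_id, to_id)
  return 0 unless row[:topic_id] == from_id  # stale expectation: 0 rows
  row[:topic_id] = to_id
  1
end

counters = { 1 => 5, 2 => 3 }   # replies_count per topic id
reply    = { id: 42, topic_id: 1 }

# Only touch the counters when the guarded UPDATE actually changed a row:
if reassign_parent(reply, 1, 2) == 1
  counters[1] -= 1
  counters[2] += 1
end

counters # => { 1 => 4, 2 => 4 }
```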
/cc @ArthurN @jonleighton @tenderlove : What do you think?