Skip to content
This repository was archived by the owner on Apr 26, 2024. It is now read-only.
This repository was archived by the owner on Apr 26, 2024. It is now read-only.

Prefill and invalidate getEvent cache are happening out of order when persisting transactions. #15757

@realtyem

Description

@realtyem

While working on learning how the _get_event_cache work, I spotted a strange condition of invalidation and prefill being added to a transaction to be run after the transaction was finished, but it was running the prefill, then invalidating it. Subsequently, it would just re-get the event the next time it was retrieved from the database.

In PersistEventsStore.persist_event_txn():

# Once the txn completes, invalidate all of the relevant caches. Note that we do this
# up here because it captures all the events_and_contexts before any are removed.
for event, _ in events_and_contexts:
self.store.invalidate_get_event_cache_after_txn(txn, event.event_id)
if event.redacts:
self.store.invalidate_get_event_cache_after_txn(txn, event.redacts)

Which uses txn.call_after() and txn.async_call_after() under the hood.

Lower down(inside the _update_metadata_tables_txn() which then calls add_to_cache()) near the end, there is a database read of the events and then the prefill bits(which are inside add_to_cache()):

async def prefill() -> None:
for cache_entry in to_prefill:
await self.store._get_event_cache.set(
(cache_entry.event.event_id,), cache_entry
)
txn.async_call_after(prefill)

This uses txn.async_call_after() (as you can see at the bottom)

The docustring for LoggingTransaction has the information on what happens with txn.call_after() and txn.async_call_after():

after_callbacks: A list that callbacks will be appended to
that have been added by `call_after` which should be run on
successful completion of the transaction. None indicates that no
callbacks should be allowed to be scheduled to run.
async_after_callbacks: A list that asynchronous callbacks will be appended
to by `async_call_after` which should run, before after_callbacks, on
successful completion of the transaction. None indicates that no
callbacks should be allowed to be scheduled to run.

...which summarizes nicely as 'async_call_after() runs before call_after()'.
So:

  1. The invalidation of events is added with call_after() and async_call_after()
  2. The prefill is added with async_call_after()
  3. The transaction finishes
  4. The async invalidation runs, purging the event from the cache
  5. The prefill runs(because it's async, and those run before the non-async)
  6. The non-async invalidation runs, purging the cache of the event we just loaded.
  7. ...time passes(probably not much)
  8. The event is reloaded from the database, because it's not in the cache.

Which, I think, means it's hitting the database for a read twice as often as necessary.

Bonus points attempt:
These were added to assist in the preparatory work for an external caching system, and for that reason I want to take that into account when looking at this. The external system will require the code to run async/await, so this will have to be maintained in any fix. I recommend an additional method to be added to AsyncLruCache that will explicitly handle setting a value for the external cache. However that isn't actually implemented yet(or upstreamed, whatever), so will no-op just as get_external does now.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions