Scratch changes to fix have_seen_event not being invalidated#13861
Conversation
…vent Fix for #13856 Fixed by calling `_invalidate_caches_for_event` when we persist an event. And an additional fix in `_invalidate_caches_for_event` to make sure it uses the correct cache key. This seems like it would be an easy foot-gun for any `tree=True` cache. Wrong: ```py self.have_seen_event.invalidate((room_id, event_id)) ``` Correct: ```py self.have_seen_event.invalidate(((room_id, event_id),)) ```
| ) | ||
| self._invalidate_local_get_event_cache(event_id) | ||
| self.have_seen_event.invalidate((room_id, event_id)) | ||
| self.have_seen_event.invalidate(((room_id, event_id),)) |
There was a problem hiding this comment.
This key lookup was wrong and we were never invalidating the have_seen_event cache even over replication.
Yes, the cache key literally a set wrapped in set. Something weird with the TreeCache I assume
ex. (('!TnCIJPKzdQdUlIyXdQ:test', '$Iu0eqEBN7qcyF1S9B3oNB3I91v2o5YOgRNPwi_78s-k'),)
LruCache cache values before pop {(('!TnCIJPKzdQdUlIyXdQ:test', '$Iu0eqEBN7qcyF1S9B3oNB3I91v2o5YOgRNPwi_78s-k'),): False}
We should probably check all other instances of this for the same problem. And ideally fix the cache so it uses the expected (room_id, event_id) key tuple instead.
| assert event.internal_metadata.stream_ordering | ||
|
|
||
| relation = relation_from_event(event) | ||
| self.store._invalidate_caches_for_event( |
There was a problem hiding this comment.
This needs to move because we can't touch the cache objects from this background queue thread processing and persisting the events.
Otherwise we run into ValueError: Cache objects can only be accessed from the main thread
synapse_main | 2022-09-21 20:53:42,362 - synapse.http.server - 123 - ERROR - POST-3 - Failed handle request via 'RoomCreateRestServlet': <SynapseRequest at 0x7f0fde7ec160 method='POST' uri='/_matrix/client/v3/createRoom' clientproto='HTTP/1.0' site='8080'>
synapse_main | Traceback (most recent call last):
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/http/server.py", line 306, in _async_render_wrapper
synapse_main | callback_return = await self._async_render(request)
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/http/server.py", line 512, in _async_render
synapse_main | callback_return = await raw_callback_return
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/rest/client/room.py", line 161, in on_POST
synapse_main | info, _ = await self._room_creation_handler.create_room(
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/handlers/room.py", line 909, in create_room
synapse_main | ) = await self._send_events_for_new_room(
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/handlers/room.py", line 1105, in _send_events_for_new_room
synapse_main | await send(etype=EventTypes.Create, content=creation_content)
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/handlers/room.py", line 1081, in send
synapse_main | ) = await self.event_creation_handler.create_and_send_nonmember_event(
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/handlers/message.py", line 990, in create_and_send_nonmember_event
synapse_main | ev = await self.handle_new_client_event(
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/util/metrics.py", line 113, in measured_func
synapse_main | r = await func(self, *args, **kwargs)
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/handlers/message.py", line 1324, in handle_new_client_event
synapse_main | result, _ = await make_deferred_yieldable(
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/controllers/persist_events.py", line 287, in handle_queue_loop
synapse_main | ret = await self._per_item_callback(room_id, item.task)
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/controllers/persist_events.py", line 359, in _process_event_persist_queue_task
synapse_main | return await self._persist_event_batch(room_id, task)
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/controllers/persist_events.py", line 748, in _persist_event_batch
synapse_main | await self.persist_events_store._persist_events_and_state_updates(
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/logging/tracing.py", line 851, in _wrapper
synapse_main | return await func(*args, **kwargs) # type: ignore[misc]
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/databases/main/events.py", line 214, in _persist_events_and_state_updates
synapse_main | await self.db_pool.runInteraction(
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 881, in runInteraction
synapse_main | return await delay_cancellation(_runInteraction())
synapse_main | File "/usr/local/lib/python3.9/site-packages/twisted/internet/defer.py", line 1692, in _inlineCallbacks
synapse_main | result = context.run(
synapse_main | File "/usr/local/lib/python3.9/site-packages/twisted/python/failure.py", line 518, in throwExceptionIntoGenerator
synapse_main | return g.throw(self.type, self.value, self.tb)
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 848, in _runInteraction
synapse_main | result = await self.runWithConnection(
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 972, in runWithConnection
synapse_main | return await make_deferred_yieldable(
synapse_main | File "/usr/local/lib/python3.9/site-packages/twisted/python/threadpool.py", line 244, in inContext
synapse_main | result = inContext.theWork() # type: ignore[attr-defined]
synapse_main | File "/usr/local/lib/python3.9/site-packages/twisted/python/threadpool.py", line 260, in <lambda>
synapse_main | inContext.theWork = lambda: context.call( # type: ignore[attr-defined]
synapse_main | File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 117, in callWithContext
synapse_main | return self.currentContext().callWithContext(ctx, func, *args, **kw)
synapse_main | File "/usr/local/lib/python3.9/site-packages/twisted/python/context.py", line 82, in callWithContext
synapse_main | return func(*args, **kw)
synapse_main | File "/usr/local/lib/python3.9/site-packages/twisted/enterprise/adbapi.py", line 282, in _runWithConnection
synapse_main | result = func(conn, *args, **kw)
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 965, in inner_func
synapse_main | return func(db_conn, *args, **kwargs)
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/database.py", line 710, in new_transaction
synapse_main | r = func(cursor, *args, **kwargs)
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/databases/main/events.py", line 442, in _persist_events_txn
synapse_main | self.store._invalidate_caches_for_event(
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/storage/databases/main/cache.py", line 237, in _invalidate_caches_for_event
synapse_main | self.get_latest_event_ids_in_room.invalidate((room_id,))
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/util/caches/descriptors.py", line 370, in <lambda>
synapse_main | wrapped.invalidate = lambda key: cache.invalidate(key[0])
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/util/caches/deferred_cache.py", line 391, in invalidate
synapse_main | self.check_thread()
synapse_main | File "/usr/local/lib/python3.9/site-packages/synapse/util/caches/deferred_cache.py", line 136, in check_thread
synapse_main | raise ValueError(
synapse_main | ValueError: Cache objects can only be accessed from the main thread
There was a problem hiding this comment.
I could move the cache invalidation to the persist_events and persist_event layer. Is that problematic/flakey if something fails in the middle? 🤔
persist_events or persist_event or update_current_state (not really)
self._event_persist_queue
_process_event_persist_queue_task
_persist_event_batch
_persist_events_and_state_updates
_persist_events_txn
| assert event.internal_metadata.stream_ordering | ||
| # Invalidate related caches after we persist a new event | ||
| relation = relation_from_event(event) | ||
| self.main_store._invalidate_caches_for_event( |
⏩ Moved to #13863
Scratch changes to fix
have_seen_eventnot being invalidatedSee #13856
The problem
_invalidate_caches_for_eventdoesn't run in monolith mode which means we never even tried to clear thehave_seen_eventand other caches. And even in worker mode, it only runs on the workers, not the master (AFAICT).Additionally there is bug with the key being wrong so
_invalidate_caches_for_eventnever invalidates thehave_seen_eventcache even when it does run.Dev notes
Persist events
Replication stuff
https://github.com/matrix-org/synapse/blob/1a1abdda42551dad3aadc04a169c25f4cc651a2c/docs/replication.md
Pull Request Checklist
EventStoretoEventWorkerStore.".code blocks.(run the linters)