Fix missing sync events during historical batch imports#12319
Fix missing sync events during historical batch imports#12319erikjohnston merged 14 commits intomatrix-org:developfrom
Conversation
This can be used to flip betweek topological ordering (the default) and stream ordering as needed by the caller.
fb38e08 to
e974d6c
Compare
| end_token=stream_position.room_key, | ||
| limit=1, | ||
| order_by="stream", | ||
| ) |
This aligns with automatic selection of bounds depending on the token passed into the pagination function.
order_by argument|
@MadLittleMods I'm a bit confused on next steps here - this seems to have unearthed another (complex looking) bug I don't have time (currently) to work on a fix for. My suggestion would be to roll back to the explicit argument fix which resolves this issue and create a new issue to tackle the |
@Fizzadar Go for it 👍 We can get some feedback from the Synapse team whether that is acceptable |
…rovided Part of #12281 Context: #12319 (comment)
As an update, I've fixed up the other complex bug via #12370 So we can probably leave this PR as-is. I've assigned it back to the Synapse core for review ⏩ |
|
|
||
| assert ( | ||
| "m.room.create" in state_event_types | ||
| ), "Missing room full state in sync response" |
There was a problem hiding this comment.
Tested and fails on develop but passes with this PR ✅ (as expected)
|
Thanks @Fizzadar and @MadLittleMods for tracking this down! My main concern is how this changes clients back paginating using a I don't think that is a change we really want to make. My understanding is that that behaviour works correctly currently? Or does that also have issues with backfilled/historical events? |
| @@ -175,10 +175,6 @@ async def get_state_events( | |||
| state_filter = state_filter or StateFilter.all() | |||
There was a problem hiding this comment.
My main concern is how this changes clients back paginating using a
prev_batchtoken from a/synctimelinesection. Those tokens are currently stream tokens, but/messageswill return them in topological order with a topological next token. This change means that the first/messageswill return events in stream order but will still return a topological next token, meaning that the next/messageswill return events in a topological order.
/messages could be updated to always return in topological_ordering regardless of what type of pagination token is given. That's what it was doing before this PR anyway.
I don't think that is a change we really want to make. My understanding is that that behaviour works correctly currently? Or does that also have issues with backfilled/historical events?
I think we have two assumptions that we should probably enforce:
/messagesshould always sort bytopological_orderingregardless of pagination token given/syncshould always sort bystream_orderingregardless of pagination token given
This also aligns with what we have documented:
/syncreturns things in the order they arrive at the server (stream_ordering)./messages(and/backfillin the federation API) return them in the order determined by the event graph(topological_ordering, stream_ordering).
The current behavior of /messages and /sync returns events as expected but not state.
There is a bug in how incremental /sync returns state which this PR aims to fix. The best way to understand the problem is probably reading #12281 (comment) - but basically when calculating what state to return, /sync is ordering events by topological_ordering even though it should be stream_ordering.
I initially assumed we would want flexibility in how these endpoints sorted events according to the type of pagination token given but it's sounding like we actually want to enforce a given sort according to the endpoint. In which case, we can revert to @Fizzadar initial approach of plumbing a order_by argument which we can set.
There was a problem hiding this comment.
(Sorry for not getting to this yesterday, it sat at about the second thing on my todo list all day :/)
Thanks for the clarification. There's the added bonus that /sync will return events in pagination order for the first batch of events for the room, i.e. in an initial sync or when a user joins the room. (The rationale being that it's equivalent to getting no events then immediately doing a back pagination). This is relevant as we use get_recent_events_for_room for that case as well.
I initially assumed we would want flexibility in how these endpoints sorted events according to the type of pagination token given but it's sounding like we actually want to enforce a given sort according to the endpoint. In which case, we can revert to @Fizzadar initial approach of plumbing a
order_byargument which we can set.
I wonder if we're just making this harder for ourselves by re-using the same query to do both /messages and when looking up the state for /sync? I think the query we want to run is simply:
SELECT event_id FROM events
WHERE room_id = ? AND stream_ordering <= ?
ORDER BY stream_ordering DESC
LIMIT 1So it might be simpler to just have a separate get_last_event_in_room_before function that we call when getting the state instead? Rather than extending the already slightly horrific _paginate_room_events_txn function?
Entirely separately, and I'm not suggesting we necessarily do this now, but it occurs to me that we have a get_forward_extremities_for_room_at_stream_ordering store function that maps from room_id to the list of forward extremities in the room at the time. This can then be used to calculate the "current" state of the room at the time more accurately than the current method. The downside of this approach is that get_forward_extremities_for_room_at_stream_ordering will fail if the stream ordering is too old.
Another approach that has also literally just now occurred to me is that we could use the current_state_delta_stream to work out the changes in state between two stream orderings in the room, as the stream_id column correspond to the stream orderings of the changes (I think?).
Anyway, just wanted to record my thoughts here before I forget them
There was a problem hiding this comment.
@Fizzadar sorry for going around the houses really slowly on this 😞
There was a problem hiding this comment.
No worries at all @erikjohnston - totally agree a separate function makes sense here; I've pushed that up in 9a78d14. I just wrapped an existing function get_room_event_before_stream_ordering (which doesn't actually return an event) and pulled out the event.
Have undone the pagination ordering changes too!
There was a problem hiding this comment.
Thank you! Will have a look after I've finished doing the v1.57.0rc1 release ❤️
This uses an existing method, `get_last_event_in_room_before_stream_ordering`, to retrieve the most recent (stream ordered) event in a room and replaces uses of the complex pagination txn.
2a5ee07 to
8e319cc
Compare
Co-authored-by: Erik Johnston <erikj@jki.re>
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
|
Thank you again for bearing with us @Fizzadar ❤️ |
Synapse 1.58.0rc1 (2022-04-26) ============================== As of this release, the groups/communities feature in Synapse is now disabled by default. See [\#11584](matrix-org/synapse#11584) for details. As mentioned in [the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1580), this feature will be removed in Synapse 1.61. Features -------- - Implement [MSC3383](matrix-org/matrix-spec-proposals#3383) for including the destination in server-to-server authentication headers. Contributed by @Bubu and @jcgruenhage for Famedly. ([\#11398](matrix-org/synapse#11398)) - Docker images and Debian packages from matrix.org now contain a locked set of Python dependencies, greatly improving build reproducibility. ([Board](https://github.com/orgs/matrix-org/projects/54), [\#11537](matrix-org/synapse#11537)) - Enable processing of device list updates asynchronously. ([\#12365](matrix-org/synapse#12365), [\#12465](matrix-org/synapse#12465)) - Implement [MSC2815](matrix-org/matrix-spec-proposals#2815) to allow room moderators to view redacted event content. Contributed by @tulir. ([\#12427](matrix-org/synapse#12427)) - Build Debian packages for Ubuntu 22.04 "Jammy Jellyfish". ([\#12543](matrix-org/synapse#12543)) Bugfixes -------- - Prevent a sync request from removing a user's busy presence status. ([\#12213](matrix-org/synapse#12213)) - Fix bug with incremental sync missing events when rejoining/backfilling. Contributed by Nick @ Beeper. ([\#12319](matrix-org/synapse#12319)) - Fix a long-standing bug which incorrectly caused `GET /_matrix/client/v3/rooms/{roomId}/event/{eventId}` to return edited events rather than the original. ([\#12476](matrix-org/synapse#12476)) - Fix a bug introduced in Synapse 1.27.0 where the admin API for [deleting forward extremities](https://github.com/matrix-org/synapse/blob/erikj/fix_delete_event_response_count/docs/admin_api/rooms.md#deleting-forward-extremities) would always return a count of 1, no matter how many extremities were deleted. ([\#12496](matrix-org/synapse#12496)) - Fix a long-standing bug where the image thumbnails embedded into email notifications were broken. ([\#12510](matrix-org/synapse#12510)) - Fix a bug in the implementation of [MSC3202](matrix-org/matrix-spec-proposals#3202) where Synapse would use the field name `device_unused_fallback_keys`, rather than `device_unused_fallback_key_types`. ([\#12520](matrix-org/synapse#12520)) - Fix a bug introduced in Synapse 0.99.3 which could cause Synapse to consume large amounts of RAM when back-paginating in a large room. ([\#12522](matrix-org/synapse#12522)) Improved Documentation ---------------------- - Fix rendering of the documentation site when using the 'print' feature. ([\#12340](matrix-org/synapse#12340)) - Add a manual documenting config file options. ([\#12368](matrix-org/synapse#12368), [\#12527](matrix-org/synapse#12527)) - Update documentation to reflect that both the `run_background_tasks_on` option and the options for moving stream writers off of the main process are no longer experimental. ([\#12451](matrix-org/synapse#12451)) - Update worker documentation and replace old `federation_reader` with `generic_worker`. ([\#12457](matrix-org/synapse#12457)) - Strongly recommend [Poetry](https://python-poetry.org/) for development. ([\#12475](matrix-org/synapse#12475)) - Add some example configurations for workers and update architectural diagram. ([\#12492](matrix-org/synapse#12492)) - Fix a broken link in `README.rst`. ([\#12495](matrix-org/synapse#12495)) - Add HAProxy delegation example with CORS headers to docs. ([\#12501](matrix-org/synapse#12501)) - Remove extraneous comma in User Admin API's device deletion section so that the example JSON is actually valid and works. Contributed by @olmari. ([\#12533](matrix-org/synapse#12533)) Deprecations and Removals ------------------------- - The groups/communities feature in Synapse is now disabled by default. ([\#12344](matrix-org/synapse#12344)) - Remove unstable identifiers from [MSC3440](matrix-org/matrix-spec-proposals#3440). ([\#12382](matrix-org/synapse#12382)) Internal Changes ---------------- - Preparation for faster-room-join work: start a background process to resynchronise the room state after a room join. ([\#12394](matrix-org/synapse#12394)) - Preparation for faster-room-join work: Implement a tracking mechanism to allow functions to wait for full room state to arrive. ([\#12399](matrix-org/synapse#12399)) - Remove an unstable identifier from [MSC3083](matrix-org/matrix-spec-proposals#3083). ([\#12395](matrix-org/synapse#12395)) - Run CI in the locked [Poetry](https://python-poetry.org/) environment, and remove corresponding `tox` jobs. ([\#12425](matrix-org/synapse#12425), [\#12434](matrix-org/synapse#12434), [\#12438](matrix-org/synapse#12438), [\#12441](matrix-org/synapse#12441), [\#12449](matrix-org/synapse#12449), [\#12478](matrix-org/synapse#12478), [\#12514](matrix-org/synapse#12514), [\#12472](matrix-org/synapse#12472)) - Change Mutual Rooms' `unstable_features` flag to `uk.half-shot.msc2666.mutual_rooms` which matches the current iteration of [MSC2666](matrix-org/matrix-spec-proposals#2666). ([\#12445](matrix-org/synapse#12445)) - Fix typo in the release script help string. ([\#12450](matrix-org/synapse#12450)) - Fix a minor typo in the Debian changelogs generated by the release script. ([\#12497](matrix-org/synapse#12497)) - Reintroduce the list of targets to the linter script, to avoid linting unwanted local-only directories during development. ([\#12455](matrix-org/synapse#12455)) - Limit length of `device_id` to less than 512 characters. ([\#12454](matrix-org/synapse#12454)) - Dockerfile-workers: reduce the amount we install in the image. ([\#12464](matrix-org/synapse#12464)) - Dockerfile-workers: give the master its own log config. ([\#12466](matrix-org/synapse#12466)) - complement-synapse-workers: factor out separate entry point script. ([\#12467](matrix-org/synapse#12467)) - Back out experimental implementation of [MSC2314](matrix-org/matrix-spec-proposals#2314). ([\#12474](matrix-org/synapse#12474)) - Fix grammatical error in federation error response when the room version of a room is unknown. ([\#12483](matrix-org/synapse#12483)) - Remove unnecessary configuration overrides in tests. ([\#12511](matrix-org/synapse#12511)) - Refactor the relations code for clarity. ([\#12519](matrix-org/synapse#12519)) - Add type hints so `docker` and `stubs` directories pass `mypy --disallow-untyped-defs`. ([\#12528](matrix-org/synapse#12528)) - Update `delay_cancellation` to accept any awaitable, rather than just `Deferred`s. ([\#12468](matrix-org/synapse#12468)) - Handle cancellation in `EventsWorkerStore._get_events_from_cache_or_db`. ([\#12529](matrix-org/synapse#12529))
Synapse 1.58.0 (2022-05-03) =========================== As of this release, the groups/communities feature in Synapse is now disabled by default. See [\#11584](matrix-org/synapse#11584) for details. As mentioned in [the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1580), this feature will be removed in Synapse 1.61. No significant changes since 1.58.0rc2. Synapse 1.58.0rc2 (2022-04-26) ============================== This release candidate fixes bugs related to Synapse 1.58.0rc1's logic for handling device list updates. Bugfixes -------- - Fix a bug introduced in Synapse 1.58.0rc1 where the main process could consume excessive amounts of CPU and memory while handling sentry logging failures. ([\#12554](matrix-org/synapse#12554)) - Fix a bug introduced in Synapse 1.58.0rc1 where opentracing contexts were not correctly sent to whitelisted remote servers with device lists updates. ([\#12555](matrix-org/synapse#12555)) Internal Changes ---------------- - Reduce unnecessary work when handling remote device list updates. ([\#12557](matrix-org/synapse#12557)) Synapse 1.58.0rc1 (2022-04-26) ============================== Features -------- - Implement [MSC3383](matrix-org/matrix-spec-proposals#3383) for including the destination in server-to-server authentication headers. Contributed by @Bubu and @jcgruenhage for Famedly. ([\#11398](matrix-org/synapse#11398)) - Docker images and Debian packages from matrix.org now contain a locked set of Python dependencies, greatly improving build reproducibility. ([Board](https://github.com/orgs/matrix-org/projects/54), [\#11537](matrix-org/synapse#11537)) - Enable processing of device list updates asynchronously. ([\#12365](matrix-org/synapse#12365), [\#12465](matrix-org/synapse#12465)) - Implement [MSC2815](matrix-org/matrix-spec-proposals#2815) to allow room moderators to view redacted event content. Contributed by @tulir @ Beeper. ([\#12427](matrix-org/synapse#12427)) - Build Debian packages for Ubuntu 22.04 "Jammy Jellyfish". ([\#12543](matrix-org/synapse#12543)) Bugfixes -------- - Prevent a sync request from removing a user's busy presence status. ([\#12213](matrix-org/synapse#12213)) - Fix bug with incremental sync missing events when rejoining/backfilling. Contributed by Nick @ Beeper. ([\#12319](matrix-org/synapse#12319)) - Fix a long-standing bug which incorrectly caused `GET /_matrix/client/v3/rooms/{roomId}/event/{eventId}` to return edited events rather than the original. ([\#12476](matrix-org/synapse#12476)) - Fix a bug introduced in Synapse 1.27.0 where the admin API for [deleting forward extremities](https://github.com/matrix-org/synapse/blob/erikj/fix_delete_event_response_count/docs/admin_api/rooms.md#deleting-forward-extremities) would always return a count of 1, no matter how many extremities were deleted. ([\#12496](matrix-org/synapse#12496)) - Fix a long-standing bug where the image thumbnails embedded into email notifications were broken. ([\#12510](matrix-org/synapse#12510)) - Fix a bug in the implementation of [MSC3202](matrix-org/matrix-spec-proposals#3202) where Synapse would use the field name `device_unused_fallback_keys`, rather than `device_unused_fallback_key_types`. ([\#12520](matrix-org/synapse#12520)) - Fix a bug introduced in Synapse 0.99.3 which could cause Synapse to consume large amounts of RAM when back-paginating in a large room. ([\#12522](matrix-org/synapse#12522)) Improved Documentation ---------------------- - Fix rendering of the documentation site when using the 'print' feature. ([\#12340](matrix-org/synapse#12340)) - Add a manual documenting config file options. ([\#12368](matrix-org/synapse#12368), [\#12527](matrix-org/synapse#12527)) - Update documentation to reflect that both the `run_background_tasks_on` option and the options for moving stream writers off of the main process are no longer experimental. ([\#12451](matrix-org/synapse#12451)) - Update worker documentation and replace old `federation_reader` with `generic_worker`. ([\#12457](matrix-org/synapse#12457)) - Strongly recommend [Poetry](https://python-poetry.org/) for development. ([\#12475](matrix-org/synapse#12475)) - Add some example configurations for workers and update architectural diagram. ([\#12492](matrix-org/synapse#12492)) - Fix a broken link in `README.rst`. ([\#12495](matrix-org/synapse#12495)) - Add HAProxy delegation example with CORS headers to docs. ([\#12501](matrix-org/synapse#12501)) - Remove extraneous comma in User Admin API's device deletion section so that the example JSON is actually valid and works. Contributed by @olmari. ([\#12533](matrix-org/synapse#12533)) Deprecations and Removals ------------------------- - The groups/communities feature in Synapse is now disabled by default. ([\#12344](matrix-org/synapse#12344)) - Remove unstable identifiers from [MSC3440](matrix-org/matrix-spec-proposals#3440). ([\#12382](matrix-org/synapse#12382)) Internal Changes ---------------- - Preparation for faster-room-join work: start a background process to resynchronise the room state after a room join. ([\#12394](matrix-org/synapse#12394)) - Preparation for faster-room-join work: Implement a tracking mechanism to allow functions to wait for full room state to arrive. ([\#12399](matrix-org/synapse#12399)) - Remove an unstable identifier from [MSC3083](matrix-org/matrix-spec-proposals#3083). ([\#12395](matrix-org/synapse#12395)) - Run CI in the locked [Poetry](https://python-poetry.org/) environment, and remove corresponding `tox` jobs. ([\#12425](matrix-org/synapse#12425), [\#12434](matrix-org/synapse#12434), [\#12438](matrix-org/synapse#12438), [\#12441](matrix-org/synapse#12441), [\#12449](matrix-org/synapse#12449), [\#12478](matrix-org/synapse#12478), [\#12514](matrix-org/synapse#12514), [\#12472](matrix-org/synapse#12472)) - Change Mutual Rooms' `unstable_features` flag to `uk.half-shot.msc2666.mutual_rooms` which matches the current iteration of [MSC2666](matrix-org/matrix-spec-proposals#2666). ([\#12445](matrix-org/synapse#12445)) - Fix typo in the release script help string. ([\#12450](matrix-org/synapse#12450)) - Fix a minor typo in the Debian changelogs generated by the release script. ([\#12497](matrix-org/synapse#12497)) - Reintroduce the list of targets to the linter script, to avoid linting unwanted local-only directories during development. ([\#12455](matrix-org/synapse#12455)) - Limit length of `device_id` to less than 512 characters. ([\#12454](matrix-org/synapse#12454)) - Dockerfile-workers: reduce the amount we install in the image. ([\#12464](matrix-org/synapse#12464)) - Dockerfile-workers: give the master its own log config. ([\#12466](matrix-org/synapse#12466)) - complement-synapse-workers: factor out separate entry point script. ([\#12467](matrix-org/synapse#12467)) - Back out experimental implementation of [MSC2314](matrix-org/matrix-spec-proposals#2314). ([\#12474](matrix-org/synapse#12474)) - Fix grammatical error in federation error response when the room version of a room is unknown. ([\#12483](matrix-org/synapse#12483)) - Remove unnecessary configuration overrides in tests. ([\#12511](matrix-org/synapse#12511)) - Refactor the relations code for clarity. ([\#12519](matrix-org/synapse#12519)) - Add type hints so `docker` and `stubs` directories pass `mypy --disallow-untyped-defs`. ([\#12528](matrix-org/synapse#12528)) - Update `delay_cancellation` to accept any awaitable, rather than just `Deferred`s. ([\#12468](matrix-org/synapse#12468)) - Handle cancellation in `EventsWorkerStore._get_events_from_cache_or_db`. ([\#12529](matrix-org/synapse#12529))
Discovered after much in-depth investigation in #12281. Adding an explicit
order_byargument to change the ordering of results returned from room pagination queries and updating use in the sync & message handlers.Closes: #12281
Closes: #3305
Signed off by: Nick Mills-Barrett nick@beeper.com
Pull Request Checklist
(run the linters)