(this speeds up LL by about 2-3x)
erikjohnston left a comment
I've only had a chance to look at the lazy loading bits of this. Can you split out the fixes to get_filtered_current_state_ids into a separate PR, please?
Things look broadly fine, other than that it's quite confusing trying to keep track of what r is in any given situation. Hopefully the suggested changes will help clear things up a bit.
The only major thing is whether we are currently pulling the members out of the database. It feels odd to pull out 6 arbitrary users regardless of membership, while the sync code seems to assume that, e.g., if there are joined users in the room, MemberSummary.members will be nonempty.
synapse/storage/roommember.py
Outdated
| " AND m.user_id = c.state_key" | ||
| " WHERE c.type = 'm.room.member' AND c.room_id = ?" | ||
| " GROUP BY m.membership" | ||
| ) |
FYI you can write:
sql = """
    SELECT count(*), m.membership FROM room_memberships as m
    INNER JOIN current_state_events as c
    ON m.event_id = c.event_id
    AND m.room_id = c.room_id
    AND m.user_id = c.state_key
    WHERE c.type = 'm.room.member' AND c.room_id = ?
    GROUP BY m.membership
"""
rather than faffing around with quotes. It has the added benefit that it makes it much easier to c+p :)
agreed; this was just trying to be consistent with what was already there.
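For reference, a minimal runnable sketch of the triple-quoted form. It uses an in-memory SQLite database and a heavily simplified, hypothetical schema (only the columns the query touches), so it illustrates the string style and the shape of the result rather than the real storage layer:

```python
import sqlite3

# Hypothetical, simplified schema: only the columns the query needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE room_memberships (
        event_id TEXT, room_id TEXT, user_id TEXT, membership TEXT
    );
    CREATE TABLE current_state_events (
        event_id TEXT, room_id TEXT, type TEXT, state_key TEXT
    );
""")

# Made-up sample data: two joined users and one invited user in room "!r".
sample = [
    ("$e1", "!r", "@a:x", "join"),
    ("$e2", "!r", "@b:x", "join"),
    ("$e3", "!r", "@c:x", "invite"),
]
for event_id, room_id, user_id, membership in sample:
    conn.execute(
        "INSERT INTO room_memberships VALUES (?, ?, ?, ?)",
        (event_id, room_id, user_id, membership),
    )
    conn.execute(
        "INSERT INTO current_state_events VALUES (?, ?, 'm.room.member', ?)",
        (event_id, room_id, user_id),
    )

# The query as one triple-quoted string: easy to read and to paste into a SQL shell.
sql = """
    SELECT count(*), m.membership FROM room_memberships as m
    INNER JOIN current_state_events as c
    ON m.event_id = c.event_id
    AND m.room_id = c.room_id
    AND m.user_id = c.state_key
    WHERE c.type = 'm.room.member' AND c.room_id = ?
    GROUP BY m.membership
"""
counts = {membership: count for count, membership in conn.execute(sql, ("!r",))}
```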
synapse/storage/roommember.py
Outdated
| txn.execute(sql, (room_id,)) | ||
| res = {} | ||
| for r in txn: | ||
| summary = res.setdefault(to_ascii(r[1]), MemberSummary([], r[0])) |
It'd be nicer to write
for count, membership in txn:
    ...
so that it's easier to verify that things match the intent
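A small sketch of the unpacked loop. The `rows` list is a stand-in for the database cursor, and the `MemberSummary` field order (`members`, `count`) is inferred from the call in the diff; the `to_ascii` conversion is left out here:

```python
from collections import namedtuple

# Field order inferred from MemberSummary([], r[0]) in the diff above.
MemberSummary = namedtuple("MemberSummary", ("members", "count"))

# Stand-in for the cursor: each row is (count, membership).
rows = [(2, "join"), (1, "invite")]

res = {}
for count, membership in rows:
    # The names make it obvious which column ends up where.
    res.setdefault(membership, MemberSummary([], count))
```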
synapse/storage/roommember.py
Outdated
| dict of membership states, pointing to a MemberSummary named tuple. | ||
| """ | ||
|
|
||
| def f(txn): |
Please name these something like _get_room_summary_txn or whatever, it makes stack traces a lot easier to understand
(this was following the pattern already present, fwiw)
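To illustrate why the name matters: the inner function's name is what shows up in a traceback. In this sketch `run_interaction` is a hypothetical stand-in for whatever runs the transaction function, not the real storage API:

```python
import traceback


def run_interaction(func, *args):
    """Hypothetical stand-in for a database runner; passes None as the txn."""
    return func(None, *args)


def get_room_summary(room_id):
    def _get_room_summary_txn(txn, room_id):
        # Simulate a failure inside the transaction function.
        raise RuntimeError("query failed")

    return run_interaction(_get_room_summary_txn, room_id)


try:
    get_room_summary("!room:example.org")
except RuntimeError:
    trace = traceback.format_exc()

# The traceback names _get_room_summary_txn, where an anonymous "f" would
# give no hint about which storage function failed.
```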
| txn.execute(sql, (room_id, 6)) | ||
| for r in txn: | ||
| summary = res.get(to_ascii(r[1])) | ||
| members = summary.members |
Won't this explode if summary is None?
I suppose it'll definitely not be None due to the shape of the queries, but then we should probably not use get, to make the intent a bit clearer.
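A quick sketch of the difference in failure mode, using made-up data and hypothetical helper names. With `get`, a missing key surfaces one line later as an `AttributeError` on `None`; with plain indexing it fails immediately with a `KeyError` naming the key:

```python
from collections import namedtuple

MemberSummary = namedtuple("MemberSummary", ("members", "count"))
res = {"join": MemberSummary([], 2)}


def add_member_with_get(membership, user_id, event_id):
    summary = res.get(membership)  # None when the key is missing
    summary.members.append((user_id, event_id))  # fails here, on None


def add_member_with_index(membership, user_id, event_id):
    summary = res[membership]  # fails here, naming the missing key
    summary.members.append((user_id, event_id))


add_member_with_index("join", "@a:x", "$e1")

# Record which exception each variant raises for a missing membership.
errors = {}
for name, func in (("get", add_member_with_get), ("index", add_member_with_index)):
    try:
        func("invite", "@b:x", "$e2")
    except (AttributeError, KeyError) as e:
        errors[name] = type(e).__name__
```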
synapse/handlers/sync.py
Outdated
| Membership.BAN | ||
| ): | ||
| for r in details.get(m, ([], 0))[0]: | ||
| member_ids[r[0]] = r[1] |
I'm really not following what this is doing. Can we choose variable names and/or unpack tuples to help clarity please?
Unpacking this a bit seems to give:
for user_id, event_id in details.get(membership, empty_ms).members:
    member_ids[user_id] = event_id
I think?
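Spelled out as a self-contained sketch, with made-up sample data. The membership strings stand in for the `Membership` constants, and only `Membership.BAN` is visible in the diff, so the rest of the tuple is an assumption:

```python
from collections import namedtuple

MemberSummary = namedtuple("MemberSummary", ("members", "count"))

# Made-up summary: membership -> MemberSummary of (user_id, event_id) tuples.
details = {
    "join": MemberSummary([("@a:x", "$e1"), ("@b:x", "$e2")], 2),
    "invite": MemberSummary([("@c:x", "$e3")], 1),
}

# Default for memberships that are absent from the summary.
empty_ms = MemberSummary([], 0)

member_ids = {}
# Assumed membership list; plain strings stand in for Membership.* constants.
for membership in ("join", "invite", "leave", "ban"):
    for user_id, event_id in details.get(membership, empty_ms).members:
        member_ids[user_id] = event_id
```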
|
|
||
| # "members" points to a truncated list of (user_id, event_id) tuples for users of | ||
| # a given membership type, suitable for use in calculating heroes for a room. | ||
| # "count" points to the total number of users of a given membership type. |
Is there a reason that members is a list of tuples rather than a dict?
yes; i was worried about creating a new large cache in addition to the one we already maintain of room members, and so wanted it to be small (i'm assuming that tuples take up less room as they don't include key strings).
(i'm assuming that tuples take up less room as they don't include key strings)
Well this tuple is including the same number of strings, right? It might save a bit by not including the hash map I guess
If we're worried about the size of the cache then what we can do is have get_room_summary return an object with a custom __len__ implementation that returns the sum of the sizes of MemberSummary.members. This would ensure that the total number of tuples stored by the cache is less than the given cache size (as opposed to the number of room summaries being less than the given cache size)
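A rough sketch of that idea. `RoomSummary` is a hypothetical wrapper name, and the sketch assumes the cache measures an entry by calling `len()` on the cached value, which would need checking against the actual cache implementation:

```python
from collections import namedtuple

MemberSummary = namedtuple("MemberSummary", ("members", "count"))


class RoomSummary:
    """Hypothetical wrapper around a dict of membership -> MemberSummary.

    len() reports the total number of (user_id, event_id) tuples held, so a
    size-aware cache would count tuples rather than room summaries.
    """

    def __init__(self, by_membership):
        self.by_membership = by_membership

    def __len__(self):
        return sum(len(ms.members) for ms in self.by_membership.values())


summary = RoomSummary({
    "join": MemberSummary([("@a:x", "$e1"), ("@b:x", "$e2")], 250),
    "invite": MemberSummary([("@c:x", "$e3")], 1),
})
size = len(summary)
```

One thing to watch with this approach: an object whose `__len__` returns 0 is falsy, so a summary with no members would fail a bare `if summary:` check.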
synapse/handlers/sync.py
Outdated
| # XXX: this may include false positives in the form of LL | ||
| # members which have snuck into state | ||
| batch.limited and | ||
| any(t == EventTypes.Member for (t, k) in state.keys()) |
Please don't use .keys() or .items() etc, they are expensive in py2 due to them copying the data into a list. Either use six.iteritems etc to get iterators, or in this case simply drop .keys() as iterating over a dict iterates over the keys anyway.
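For this case, a minimal sketch with made-up state and a plain string standing in for `EventTypes.Member`. Iterating the dict directly yields its `(type, state_key)` keys with no intermediate list:

```python
# Made-up state map: (event type, state_key) -> event_id.
state = {
    ("m.room.member", "@a:x"): "$e1",
    ("m.room.name", ""): "$e2",
}

# Iterating over a dict iterates over its keys, so .keys() is unnecessary.
has_member_event = any(t == "m.room.member" for (t, k) in state)
```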
| any(t == EventTypes.Member for (t, k) in state.keys()) | ||
| ) or | ||
| since_token is None | ||
| ) |
Also, can this have a comment to explain why we're not always including the summary, and when we do include one? The logic is non-trivial and it's hard to know whether it's right or not without a guiding comment.
done, it's back in #3792
Right. i guess it needs an
@erikjohnston ptal
i got an IRL lgtm from @erikjohnston on this. i think.
n.b. this is now twinned with matrix-org/sytest#489
erikjohnston left a comment
Modulo perhaps wanting to further pin down the ordering of heroes
the failed test here seems to be unrelated; merging anyway.
Features
--------

- Python 3.5 and 3.6 support is now in beta. ([\#3576](#3576))
- Implement `event_format` filter param in `/sync` ([\#3790](#3790))
- Add synapse_admin_mau:registered_reserved_users metric to expose number of real reaserved users ([\#3846](#3846))

Bugfixes
--------

- Remove connection ID for replication prometheus metrics, as it creates a large number of new series. ([\#3788](#3788))
- guest users should not be part of mau total ([\#3800](#3800))
- Bump dependency on pyopenssl 16.x, to avoid incompatibility with recent Twisted. ([\#3804](#3804))
- Fix existing room tags not coming down sync when joining a room ([\#3810](#3810))
- Fix jwt import check ([\#3824](#3824))
- fix VOIP crashes under Python 3 (#3821) ([\#3835](#3835))
- Fix manhole so that it works with latest openssh clients ([\#3841](#3841))
- Fix outbound requests occasionally wedging, which can result in federation breaking between servers. ([\#3845](#3845))
- Show heroes if room name/canonical alias has been deleted ([\#3851](#3851))
- Fix handling of redacted events from federation ([\#3859](#3859))
- ([\#3874](#3874))
- Mitigate outbound federation randomly becoming wedged ([\#3875](#3875))

Internal Changes
----------------

- CircleCI tests now run on the potential merge of a PR. ([\#3704](#3704))
- http/ is now ported to Python 3. ([\#3771](#3771))
- Improve human readable error messages for threepid registration/account update ([\#3789](#3789))
- Make /sync slightly faster by avoiding needless copies ([\#3795](#3795))
- handlers/ is now ported to Python 3. ([\#3803](#3803))
- Limit the number of PDUs/EDUs per federation transaction ([\#3805](#3805))
- Only start postgres instance for postgres tests on Travis CI ([\#3806](#3806))
- tests/ is now ported to Python 3. ([\#3808](#3808))
- crypto/ is now ported to Python 3. ([\#3822](#3822))
- rest/ is now ported to Python 3. ([\#3823](#3823))
- add some logging for the keyring queue ([\#3826](#3826))
- speed up lazy loading by 2-3x ([\#3827](#3827))
- Improved Dockerfile to remove build requirements after building reducing the image size. ([\#3834](#3834))
- Disable lazy loading for incremental syncs for now ([\#3840](#3840))
- federation/ is now ported to Python 3. ([\#3847](#3847))
- Log when we retry outbound requests ([\#3853](#3853))
- Removed some excess logging messages. ([\#3855](#3855))
- Speed up purge history for rooms that have been previously purged ([\#3856](#3856))
- Refactor some HTTP timeout code. ([\#3857](#3857))
- Fix running merged builds on CircleCI ([\#3858](#3858))
- Fix typo in replication stream exception. ([\#3860](#3860))
- Add in flight real time metrics for Measure blocks ([\#3871](#3871))
- Disable buffering and automatic retrying in treq requests to prevent timeouts. ([\#3872](#3872))
- mention jemalloc in the README ([\#3877](#3877))
- Remove unmaintained "nuke-room-from-db.sh" script ([\#3888](#3888))
Rather than pulling room member events out of state storage to calculate room summaries, generate the summaries from the room_memberships table.