Bulk-invalidate e2e cached queries after claiming keys#16613
DMRobertson merged 15 commits into develop from
Conversation
Missed in #7436
The order of the keys must match the order of each values iterable, but collections have no ordering guarantees.
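A minimal sketch of the index-pairing contract being described (the helper name here is a toy stand-in, not Synapse's actual bulk-update helper): building the keys list and the values list in a single pass over the same mapping keeps the i-th key aligned with the i-th value.

```python
from typing import Dict, List, Tuple

def bulk_update(
    key_values: List[Tuple[str, str]],
    value_values: List[Tuple[str]],
) -> Dict[Tuple[str, str], Tuple[str]]:
    # Toy stand-in for a bulk UPDATE helper: entries are paired purely by
    # index, so mismatched ordering would silently update the wrong rows.
    return dict(zip(key_values, value_values))

updates = {
    ("@alice:test", "dev1"): ("key_json_1",),
    ("@bob:test", "dev2"): ("key_json_2",),
}

# Build both lists in one pass so the indices stay aligned.
key_values: List[Tuple[str, str]] = []
value_values: List[Tuple[str]] = []
for key, value in updates.items():
    key_values.append(key)
    value_values.append(value)

result = bulk_update(key_values, value_values)
```

Iterating the mapping once and appending to both lists in the same loop body is what guarantees the pairing; calling `.keys()` and `.values()` on two different collections gives no such guarantee.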
gold star please
for keys in key_tuples:
    txn.call_after(cache_func.invalidate, keys)
@DMRobertson wondered whether it would be better to do:
-for keys in key_tuples:
-    txn.call_after(cache_func.invalidate, keys)
+def invalidate():
+    for keys in key_tuples:
+        cache_func.invalidate(keys)
+txn.call_after(invalidate)
But I don't think it matters much since the `txn.call_after` callbacks are just called in a loop anyway? Maybe one way would use less memory. 🤷
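A toy model of the mechanism being discussed (a sketch only, not Synapse's real `LoggingTransaction`): `call_after` just records callbacks, and after commit they run in a loop, which is why the two variants above do essentially the same work.

```python
from typing import Any, Callable, List, Tuple

class Txn:
    """Toy model of a transaction's after-callback queue; a sketch,
    not Synapse's actual LoggingTransaction."""

    def __init__(self) -> None:
        self._after_callbacks: List[Tuple[Callable[..., Any], tuple]] = []

    def call_after(self, callback: Callable[..., Any], *args: Any) -> None:
        # call_after only records the callback; nothing runs yet.
        self._after_callbacks.append((callback, args))

    def commit(self) -> None:
        # After commit, the recorded callbacks run in a loop, so N separate
        # registrations and one registration that loops internally end up
        # doing essentially the same work.
        for callback, args in self._after_callbacks:
            callback(*args)

invalidated: List[tuple] = []
txn = Txn()
key_tuples = [("@a:test", "d1"), ("@b:test", "d2")]
for keys in key_tuples:
    txn.call_after(invalidated.append, keys)
txn.commit()
```

The main difference between the variants is how many entries sit in the callback list before commit, which is the memory point raised above.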
Should we update the non-bulk versions of these to actually require tuples?
I was wondering if we should actually be requiring List[str], since the keys column is []text. It'd save a conversion tuple->list here and there? 🤷
I think tuples are used, though, because they're immutable, so that's what's used as the cache keys?
ahh, that makes sense!
I'd be in favour of specifying Tuple[str, ...], but in another PR.
Do they have to be strings? I'd be kind of surprised if we don't have other types in there?
Anyway, yes sounds like a different PR.
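The immutability point above can be shown directly: Python dict (and hence cache) keys must be hashable, which tuples are and lists are not.

```python
# Tuples are hashable, so they work as dict/cache keys.
cache = {}
cache[("@user:test", "device")] = "cached-result"

# Lists are mutable and unhashable, so using one as a key raises TypeError.
try:
    cache[["@user:test", "device"]] = "cached-result"  # type: ignore[index]
    raised = False
except TypeError:
    raised = True
```

This also fits the observation that the elements need not all be strings: any hashable values (ints, etc.) work inside the tuple.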
# TODO Can we call this for just the last position or somehow batch
# _add_persisted_position.
I think this is a question for @erikjohnston -- _add_persisted_position seems fairly heavy and if we could optimize what we call it with (or take multiple IDs at once) I think that'd be a win.
I think we certainly could make it take multiple IDs? It does expect to be called with every position (though I don't think anything would break, it would just be less efficient).
I guess if it has to be called w/ every position maybe it isn't a huge deal...
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
def _mark_id_as_finished(self, next_id: int) -> None:
    """The ID has finished being processed so we should advance the
def get_next_mult_txn(self, txn: LoggingTransaction, n: int) -> List[int]:
I wonder if get_next_txn should call this instead of duplicating some of the logic? (Same kind of goes for the cache & stream functions above TBH)
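A sketch of the deduplication suggested here, under the assumption of a much-simplified ID generator (no `LoggingTransaction` or database; the class and its internals are hypothetical, not Synapse's `StreamIdGenerator`): the single-ID path can delegate to the bulk path with `n=1` instead of duplicating the allocation logic.

```python
from typing import List

class IdGenerator:
    """Hypothetical, heavily simplified stream ID generator."""

    def __init__(self) -> None:
        self._current = 0

    def get_next_mult_txn(self, n: int) -> List[int]:
        # Allocate the next n IDs in one go.
        ids = list(range(self._current + 1, self._current + n + 1))
        self._current += n
        return ids

    def get_next_txn(self) -> int:
        # Reuse the bulk path rather than duplicating the logic.
        return self.get_next_mult_txn(1)[0]
```

The trade-off is a small amount of list overhead on the single-ID path in exchange for one copy of the allocation (and invalidation/stream bookkeeping) logic.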
erikjohnston left a comment
Looks sane from my side, without having looked too closely
As threatened in #16554 (comment)
History:
/keys/claim is surprisingly slow #16554

I wrote a dirty test for the new bulk invalidation that I'm not particularly happy with. Completely untested otherwise.
Nominating Erik because a) streams and b) PC helped me write this.