Fix post-GC purging of streams view time histogram#3364
Merged
Conversation
jleibs
approved these changes
Sep 19, 2023
jleibs
reviewed
Sep 19, 2023
a32d34a to
c7ebb4c
Compare
Member
Author
|
Thanks for that @jleibs, it wasn't too bad keeping track of the individual components, and the code makes much more sense now imho! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
--memory-limit) not correctly applied to streams histogram #2517The datastore is not the only place we store data. For each node in the entity tree we store a time histogram of where we have data. This was never properly purged post GC - a very rough heuristic was instead used (throw away everything up to the oldest time in the store - which will be an even worse heursistic after #3357).
With this PR, the store gc will book-keep exactly what was deleted, and it will be properly purged from all secondary indices.
The resulting code is a bit complicated, because the store has no idea about the hierarchical nature of the entity paths, but we store the time histograms (and other book-keeping) for each node. That is, logging to
/foo/barwe will note the data on/,/fooand/foo/bar, but the data will only be registered in the store for/foo/bar.I also fixed a bunch of other smaller things that came up.
Best reviewed commit-by-commit.
Checklist