Skip to content

Always protect at least one value on the timeline when running GC#3357

Merged
jleibs merged 1 commit intomainfrom
jleibs/gc_protections
Sep 19, 2023
Merged

Always protect at least one value on the timeline when running GC#3357
jleibs merged 1 commit intomainfrom
jleibs/gc_protections

Conversation

@jleibs
Copy link
Copy Markdown
Contributor

@jleibs jleibs commented Sep 19, 2023

What

Now that GC has the abillity to protect data, turn the feature on for our normal purge_fraction_of_ram operations.

Resolves: #1803

Checklist

@jleibs jleibs added the ⛃ re_datastore affects the datastore itself label Sep 19, 2023
@jleibs jleibs marked this pull request as ready for review September 19, 2023 02:06
@nikolausWest
Copy link
Copy Markdown
Member

Does this also work for "Save loop selection"?

@jleibs
Copy link
Copy Markdown
Contributor Author

jleibs commented Sep 19, 2023

Does this also work for "Save loop selection"?

I don't believe so but creating a ticket to do something similar there: #3366

@jleibs jleibs merged commit b1fcd57 into main Sep 19, 2023
@jleibs jleibs deleted the jleibs/gc_protections branch September 19, 2023 12:16
emilk added a commit that referenced this pull request Sep 20, 2023
### What
* Closes #2517

The datastore is not the only place we store data. For each node in the
entity tree we store a _time histogram_ of where we have data. This was
never properly purged post GC - a very rough heuristic was instead used
(throw away everything up to the oldest time in the store - which will
be an even worse heursistic after
#3357).

With this PR, the store gc will book-keep exactly what was deleted, and
it will be properly purged from all secondary indices.

The resulting code is a bit complicated, because the store has no idea
about the hierarchical nature of the entity paths, but we store the time
histograms (and other book-keeping) for each node. That is, logging to
`/foo/bar` we will note the data on `/`, `/foo` and `/foo/bar`, but the
data will only be registered in the store for `/foo/bar`.

I also fixed a bunch of other smaller things that came up.

Best reviewed commit-by-commit.

### Checklist
* [x] I have read and agree to [Contributor
Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and
the [Code of
Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md)
* [x] I've included a screenshot or gif (if applicable)
* [x] I have tested [demo.rerun.io](https://demo.rerun.io/pr/3364) (if
applicable)

- [PR Build Summary](https://build.rerun.io/pr/3364)
- [Docs
preview](https://rerun.io/preview/18cbc53fe96798cbe45ab84ddeb66aa183253e12/docs)
<!--DOCS-PREVIEW-->
- [Examples
preview](https://rerun.io/preview/18cbc53fe96798cbe45ab84ddeb66aa183253e12/examples)
<!--EXAMPLES-PREVIEW-->
- [Recent benchmark results](https://ref.rerun.io/dev/bench/)
- [Wasm size tracking](https://ref.rerun.io/dev/sizes/)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

include in changelog ⛃ re_datastore affects the datastore itself

Projects

None yet

Development

Successfully merging this pull request may close these issues.

datastore: bake latest-at semantics into the garbage collector

4 participants