datastore: serialize back into a stream of MsgBundles #1527
Closed
Just realized we'll ultimately want to be able to optionally specify a time range (e.g. for saving a selection); this is trivial to do at this point and will come in due time in another PR.
Putting this on hold for now, see #1535 for rationale.
Whether this is on hold or not, I need to backport the changes to
This PR adds `DataStore::as_msg_bundles()`, which serializes a `DataStore` back into a stream of `MsgBundle`s that is functionally equivalent to the original stream it was built from.

Some shortcuts are taken, which is why the output stream is functionally equivalent but not yet identical to the original input stream: in particular, autogenerated cluster keys are dumped as if they were user-defined. That's for another PR.
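To make "functionally equivalent but not identical" concrete, here is a minimal toy sketch of the round-trip property. It does not use the real `DataStore` or `MsgBundle` types: `ToyBundle`, `ToyStore`, `from_bundles`, and `as_bundles` are hypothetical stand-ins invented for illustration, and cluster keys are not modeled at all.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a message bundle: one value for one entity at one time.
#[derive(Debug, Clone, PartialEq)]
struct ToyBundle {
    time: i64,
    entity: String,
    value: f64,
}

/// Hypothetical stand-in for the store: values indexed by (entity, time).
#[derive(Debug, Default, PartialEq)]
struct ToyStore {
    rows: BTreeMap<(String, i64), Vec<f64>>,
}

impl ToyStore {
    /// Indexes a single bundle, appending to any values already stored under its key.
    fn insert(&mut self, bundle: &ToyBundle) {
        self.rows
            .entry((bundle.entity.clone(), bundle.time))
            .or_default()
            .push(bundle.value);
    }

    /// Builds a store from a stream of bundles.
    fn from_bundles<'a>(bundles: impl IntoIterator<Item = &'a ToyBundle>) -> Self {
        let mut store = Self::default();
        for bundle in bundles {
            store.insert(bundle);
        }
        store
    }

    /// Dumps the store back into a stream of bundles, in index order.
    /// The original arrival order across different keys is not preserved.
    fn as_bundles(&self) -> Vec<ToyBundle> {
        self.rows
            .iter()
            .flat_map(|((entity, time), values)| {
                values.iter().map(move |value| ToyBundle {
                    time: *time,
                    entity: entity.clone(),
                    value: *value,
                })
            })
            .collect()
    }
}
```

In this toy model, rebuilding a store from the dumped stream yields a store equal to the first one, even though the dumped stream may list the bundles in a different order than the input did. That is the sense of "functionally equivalent" assumed here.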
This gets us most of the way towards #1394, although we'll still need another PR to integrate it all into the save-to-file logic.
This also fixes a nasty instability issue when sorting the dataframes used for testing, which cost me some hair while writing this and might possibly be the reason for the flaky `gc_correct` test we've seen for a while.
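The exact fix is not shown in this description, so the following is only a generic illustration of how sorting can make test comparisons flaky, not the code from this PR. It assumes a hypothetical `Row` type: when the sort key is not unique, rows that tie on it can end up in different relative orders depending on the input order, whereas sorting on a key that fully orders the rows gives the same result for any input permutation.

```rust
/// Hypothetical row, standing in for one line of a test dataframe.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Row {
    time: i64,
    entity: &'static str,
}

/// Sorts on `time` only. Rows sharing a timestamp tie on the key, so their
/// relative order after sorting is not determined by the key alone.
fn sort_by_time_only(rows: &mut [Row]) {
    rows.sort_unstable_by_key(|row| row.time);
}

/// Sorts on `(time, entity)`. With no ties left on the key, the result is
/// the same for every permutation of the same input rows.
fn sort_by_full_key(rows: &mut [Row]) {
    rows.sort_unstable_by_key(|row| (row.time, row.entity));
}
```

Comparing two tables after `sort_by_full_key` is safe regardless of how each was produced; comparing them after `sort_by_time_only` can fail spuriously whenever two rows share a timestamp.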