-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Recover Txs Faster [0/?]: Combine spill-by-value and spill-by-reference into one file/SharedTLog #2208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recover Txs Faster [0/?]: Combine spill-by-value and spill-by-reference into one file/SharedTLog #2208
Conversation
This prepares us for incoming modifications to the TLog that can't easily coexist with our current on-disk state.
Advancing the MIN_RECRUITABLE and DEFAULT is just following the standard progression for 7.0. It was convenient to do while adding the V5 so that we can hook TLogServer back into being used.
The spilling type is now pulled out of the request, and then stored on LogData for later access, and persisted in the tlog metadata per tlog generation. It turns out that serializing types as Unversioned is a bit wonky.
...and test it in simulation, but not combined yet. It turns out that because of txsTag, we basically had to support spill-by-value anyway. Thus, if we treat all tags like txsTag when spilling and peeking, then we have an easy way to bring the two spilling types back into one implementation.
I can't figure out why I made this branch on version, and it's breaking having value and reference tlogs in the same SharedTLog
This actually results in TLog generations of different spill types in the same disk queue. We also drop the LS_ parameter from the filename to signify that there is no log spill configuration per file anymore.
| <ActorCompiler Include="MemoryPager.actor.cpp" /> | ||
| <ActorCompiler Include="LogRouter.actor.cpp" /> | ||
| <ClCompile Include="LatencyBandConfig.cpp" /> | ||
| <ActorCompiler Include="OldTLogServer_4_6.actor.cpp" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When can we remove this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4_6 actually has a reasonable story towards deprecation, since it was never publicly released. I'll start a conversation with Evan about if we could remove it for 7.0. 6_0 and 6_2 will likely be with us for quite some time...
const correctness and file renaming in comment. Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
Which lets us revert the unversioned serilaization of TLogSpillType
I forgot there were two
no idea what happened here Co-Authored-By: Jingyu Zhou <jingyuzhou@gmail.com>
In preparing to do some TLog overhauling again for #1031 , I started pondering what to do about spill-by-value and spill-by-reference being in two separate files. It turns out that I had done 98% of the work to have spill-by-value and spill-by-reference coexist in the same source file and disk queue file, and just didn't realize it. This PR leaves us with TLogServer being the only file from which one can recruit a new TLog, and introduces V5 of the log system, with which I'll continue making future changes.
I initially tried in this PR to fully fix my mistake, and combine OldTLogServer_6_0, OldTLogServer_6_2, and TLogServer into one file. However, it turns out that the key format that we use when storing metadata into the sqlite btree is rather unfriendly to a future restarting test that would run 7.0 -> 6.2 -> 7.0, so I abandoned this. As I was typing this, I partially unconvinced myself, and might be able to combine OldTLogServer_6_0 and OldTLogServer_6_2 into one file in the a later PR.
The concrete downgrade/upgrade issue is that we store metadata keys as (Name, LogID), and ~LogData issues clears for the range it owns in each metadata key. This means that I can't introduce an extra metadata key into a V3 file in 7.0, because 6.2 won't know to clear the key when it destroys a log generation, and this leaked key will come back and cause havoc in recovery code if we re-upgrade to 7.0. We should instead store all of this data as (LogID, Name) so that it can be cleared without knowing all of the key names, and that will be a future PR also.
The majority of the "change" in this PR is just copying TLogServer into TLogServer_6_2. I'd recommend reading the PR commit-by-commit instead of all the changes at once.
@jzhou77, I'm planning on using you as a first pass for "Recover Txs Faster" reviews before sending them to Evan, so that someone else is somewhat aware of the work. If you don't object to this, then whenever you approve, it'd make things a bit faster for me if you untag yourself and assign to Evan.