implement the base library for compacting logs #17632
ti-chi-bot[bot] merged 7 commits into tikv:master
Signed-off-by: hillium <yujuncen@pingcap.com>
    };

    #[derive(Clone)]
    struct CompactionSpy(Sender<SubcompactionResult>);
Should we move the CompactionSpy into mod test, since it is only used in the test?
I guess the file itself is in the mod test. Do you mean the test_util mod?
    self.items.drain().map(|(key, c)| {
        // Hacking: update the statistic when we really yield the compaction.
        // (At `poll_next`.)
        c.form(&key, &self.cfg)
We need to update self.stat here, too.
The stat will be updated in poll_next, so it is only updated when the user actually reads the compaction from the stream.
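To make the reply above concrete, here is a minimal sketch of "count on yield, not on construction". It is not the PR's code: `Stat`, `Compactions`, and the `yielded` field are hypothetical names, and a plain `Iterator` stands in for the async stream (with `next` playing the role of `poll_next`) so the sketch needs only `std`.

```rust
/// Hypothetical statistics counter; the field name is an assumption.
#[derive(Debug, Default)]
struct Stat {
    yielded: usize,
}

/// Hypothetical stand-in for the compaction stream. A synchronous
/// `Iterator` replaces the async `Stream` to keep the sketch std-only.
struct Compactions<I> {
    inner: I,
    stat: Stat,
}

impl<I: Iterator<Item = String>> Iterator for Compactions<I> {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // Return early (without counting) once the inner source is drained.
        let item = self.inner.next()?;
        // Counted at the moment the consumer receives the item,
        // not when the lazy inner iterator was built.
        self.stat.yielded += 1;
        Some(item)
    }
}
```

With this shape, items that are prepared but never read by the consumer do not show up in the statistics.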
    fn before_a_subcompaction_start(&mut self, _cid: CId, cx: SubcompactionStartCtx<'_>) {
        let hash = cx.subc.crc64();
        if self.loaded.contains(&hash) {
What if two sub-compactions have the same crc64?
Then the second one will not be executed. Thankfully, its input will probably be preserved, since the final subcompaction won't be written.
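The collision behaviour discussed above can be sketched as follows. This is an illustration only: `Checkpoint`, `should_run`, and `mark_finished` are hypothetical names, and the digest is passed in as a bare `u64` rather than computed by a real `crc64()`.

```rust
use std::collections::HashSet;

/// Hypothetical checkpoint: remembers the digests of subcompactions
/// that already finished in a previous run.
struct Checkpoint {
    loaded: HashSet<u64>,
}

impl Checkpoint {
    /// Returns `false` when a subcompaction with this digest was already
    /// recorded, meaning the caller should skip it.
    fn should_run(&self, digest: u64) -> bool {
        !self.loaded.contains(&digest)
    }

    /// Records a finished subcompaction by its digest only.
    fn mark_finished(&mut self, digest: u64) {
        self.loaded.insert(digest);
    }
}
```

Because only the digest is stored, two different subcompactions that happen to share a digest are indistinguishable here: once one is recorded, the other is skipped as well.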
    let key = *o.key();
    let u = o.get_mut();
    u.add_file(file);
    if u.size > self.cfg.subcompaction_size_threshold {
Could self.items grow large enough to cause an OOM if each entry of self.items stays small?
Perhaps. We may add something like a memory quota in the future. But for now, in fact, we cannot do better.
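For readers following the memory concern, a rough sketch of the collect-until-threshold pattern is shown here. It is a simplification under assumptions: `Collector`, `Pending`, the `u64` key, and `size_threshold` are hypothetical stand-ins for the real types and for `cfg.subcompaction_size_threshold`.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical group of files waiting to be compacted together.
#[derive(Debug, Default, PartialEq)]
struct Pending {
    files: Vec<String>,
    size: u64,
}

/// Hypothetical collector keyed by an opaque `u64`.
struct Collector {
    items: HashMap<u64, Pending>,
    size_threshold: u64,
}

impl Collector {
    /// Adds a file under `key`. Returns the whole group once its
    /// accumulated size exceeds the threshold.
    fn add_file(&mut self, key: u64, name: &str, size: u64) -> Option<Pending> {
        match self.items.entry(key) {
            Entry::Occupied(mut o) => {
                let u = o.get_mut();
                u.files.push(name.to_owned());
                u.size += size;
                if u.size > self.size_threshold {
                    // Emitting removes the entry, which frees its memory.
                    return Some(o.remove());
                }
                None
            }
            Entry::Vacant(v) => {
                // In this sketch the first file of a group is only buffered.
                v.insert(Pending { files: vec![name.to_owned()], size });
                None
            }
        }
    }
}
```

Groups that never cross the threshold stay in `items` until the collector is drained, which is exactly where the unbounded growth raised in the question would come from.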
Signed-off-by: hillium <yujuncen@pingcap.com>
    /// Finishing one tiny task. This will yield the current carrier thread
    /// when needed.
    pub fn step(&mut self) -> Step {
Can we use a wrapper around tokio::task::yield_now().await here?
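As a rough illustration of what such a wrapper might look like, the sketch below yields once every few steps. Everything in it is an assumption rather than the PR's code: `Cooperative`, `every`, and `run_counting_yields` are hypothetical, `step` is made `async` (the original returns a `Step` value), and a hand-written `YieldNow` future stands in for `tokio::task::yield_now()` so the sketch stays std-only.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Stand-in for `tokio::task::yield_now()`: pending once, then ready.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            return Poll::Ready(());
        }
        self.yielded = true;
        // Ask to be polled again so the task is rescheduled, not parked.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Hypothetical wrapper: counts steps and yields every `every` steps.
struct Cooperative {
    steps: usize,
    every: usize,
}

impl Cooperative {
    async fn step(&mut self) {
        self.steps += 1;
        if self.steps % self.every == 0 {
            YieldNow { yielded: false }.await;
        }
    }
}

/// Waker that does nothing; enough for the busy-polling driver below.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls `fut` to completion and returns how often it yielded.
fn run_counting_yields<F: Future>(fut: F) -> usize {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    let mut yields = 0;
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(_) => return yields,
            Poll::Pending => yields += 1,
        }
    }
}
```

On a real runtime, the `YieldNow` future would simply be replaced by the runtime's own yield primitive; the counting driver exists only to make the yields observable.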
/hold
Signed-off-by: hillium <yujuncen@pingcap.com>
/unhold
@YuJuncen: Your PR was out of date, I have automatically updated it for you. If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: 3pointer, iosmanthus, Leavrth
close tikv#17631

Added a new crate named `compact-log-backup`. Now it can merge some log files generated by log backup and make them become SSTs.

Signed-off-by: hillium <yujuncen@pingcap.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* br: batch download and merge download sst before ingest (#19062)
  close #19086
  Add a new RPC method called batch-download to download batch SST.
  Signed-off-by: RidRisR <79858083+RidRisR@users.noreply.github.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* fix build
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* make format
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* implement the base library for compacting logs (#17632)
  close #17631
  Added a new crate named `compact-log-backup`. Now it can merge some log files generated by log backup and make them become SSTs.
  Signed-off-by: hillium <yujuncen@pingcap.com>
  Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* added `compact-log-bakcup` to `tikv-ctl` (#17845)
  close #17844
  Signed-off-by: hillium <yujuncen@pingcap.com>
  Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* compact_log_backup: record `min_input_ts` and `max_input_ts` in Compaction (#18085)
  close #18084
  `min_input_ts` and `max_input_ts` will present in a log files compaction.
  Signed-off-by: hillium <yu745514916@live.com>
  Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* compact_log_backup: fix typo (#18090)
  ref #15990
  Fixed a typo: `Migartion` -> `Migration`.
  Signed-off-by: hillium <yu745514916@live.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* compact_log_backup: filter out meta files by migration (#18123)
  close #18122
  Now, `StreamMetaStorage` is able to filter out files by meta edits.
  Signed-off-by: hillium <yu745514916@live.com>
  Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* compact_log_backup: added minimal compactions size (#18235)
  close #18234
  Added `--minimal-compact-size` to `compact-log-backup`.
  Signed-off-by: hillium <yujuncen@pingcap.com>
  Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* log backup: fix several issues during compact log backup. (#18298)
  close #18308
  log backup compact: fix several issues during compact a log backup
  Signed-off-by: 3pointer <luancheng@pingcap.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* compact_log_backup: correct version assignment in subcompaction metadata (#18389)
  close #18390
  Fixed a bug that caused the time range of compaction generated SSTs are too huge.
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* compact_log_backup: add new field to track fully compacted data KV files and fix metafile filtering (#18837)
  close #18843
  compact_log_backup: add new field to track fully compacted data KV files and fix metafile filtering
  Signed-off-by: 3pointer <luancheng@pingcap.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* compact_log_backup: use max ts among all storage checkpoint ts (#18848)
  close #18847
  Now, `consistency` hook checks the storage checkpoint by the max value among them.
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* compact_log_backup: fix compact meta edit filter (#18842)
  close #18843
  Merge the same meta edit from different migrations instead of replacing.
  Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* compact_log_backup: offload reading meta to diff cpus (#18885)
  close #18884
  This PR spawns read s3 file tasks to remote threads.
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* compact_log_backup: read meta from checkpoint (#19068)
  close #19069
  This PR makes `compact-log-backup` fills the migration with subcompactions skipped by checkpoint.
  Signed-off-by: hillium <yu745514916@live.com>
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
* fix build
  Signed-off-by: Juncen Yu <yujuncen@pingcap.com>

---------

Signed-off-by: RidRisR <79858083+RidRisR@users.noreply.github.com>
Signed-off-by: Juncen Yu <yujuncen@pingcap.com>
Signed-off-by: hillium <yu745514916@live.com>
Signed-off-by: 3pointer <luancheng@pingcap.com>
Signed-off-by: Jianjun Liao <jianjun.liao@outlook.com>
Signed-off-by: 山岚 <36239017+YuJuncen@users.noreply.github.com>
Co-authored-by: ris <79858083+RidRisR@users.noreply.github.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Co-authored-by: 3pointer <luancheng@pingcap.com>
Co-authored-by: Jianjun Liao <36503113+Leavrth@users.noreply.github.com>
What is changed and how it works?
Issue Number: Close #17631
What's Changed:
The directory hierarchy:
Check List
Tests
Release note