Reduce the memory cost when there are stale snapshots for PageStorage #2199

@JaySon-Huang

Description

We have encountered some extreme cases/bugs that block PageStorage GC from running normally.

These extreme cases leave a large number of PageFiles on disk. When GC runs, we open many (several hundred to thousands of) PageFiles and read all of their meta parts from the data in PageFile::MetaMergingReader::initialize:

https://github.com/pingcap/tics/blob/ec5f976a8fb85db497d3f9f67cd0717885d8075a/dbms/src/Storages/Page/gc/LegacyCompactor.cpp#L110-L119
https://github.com/pingcap/tics/blob/ec5f976a8fb85db497d3f9f67cd0717885d8075a/dbms/src/Storages/Page/PageFile.cpp#L249-L255

Assume the meta part of each PageFile is 500 KiB; if there are 2000 PageFiles left on disk, then each round of GC needs to scan roughly 1 GiB.
Instead of reading all meta parts of (thousands of) PageFiles at once, we can allocate a smaller buffer in PageFile::MetaMergingReader::initialize and read the remaining data from disk while running PageFile::MetaMergingReader::moveNext.
This change keeps some file descriptors open for a while and calls ::read several times, but it reduces the memory cost when there are lots of PageFiles.
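The idea above can be sketched as follows. This is a minimal, hypothetical illustration of chunked meta reading, not the real TiFlash API: the class name, next(), and peakBufferBytes() are all made up for this sketch. It shows how peak buffer memory becomes bounded by the chunk size rather than by the full meta-part size; in the actual PageFile::MetaMergingReader, each next() would correspond to another ::read on the file descriptor kept open across moveNext calls.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical sketch: instead of loading a PageFile's whole meta part
// into one large buffer in initialize(), hold only a fixed-size window
// and fetch the next slice on each call, so peak memory is bounded by
// the chunk size instead of the meta-part size.
class ChunkedMetaReader
{
public:
    ChunkedMetaReader(std::string meta, size_t chunk_size)
        : meta_(std::move(meta)), chunk_size_(chunk_size) {}

    // Analogue of moveNext(): return the next slice of the meta part,
    // or an empty string once everything has been consumed.
    std::string next()
    {
        if (offset_ >= meta_.size())
            return {};
        const size_t n = std::min(chunk_size_, meta_.size() - offset_);
        std::string chunk = meta_.substr(offset_, n);
        offset_ += n;
        return chunk;
    }

    // Peak buffer memory is bounded by the chunk size, not the meta size.
    size_t peakBufferBytes() const { return chunk_size_; }

private:
    std::string meta_;
    size_t chunk_size_;
    size_t offset_ = 0;
};
```

With a 64 KiB chunk size, reading 2000 meta parts of 500 KiB each would touch the same total bytes, but the resident buffer per reader stays at 64 KiB instead of 500 KiB.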
