Query cache on disk (RFC) #58228
Description
On-disk configuration for query cache can be provided in addition to the already existing in-memory configuration.
The on-disk cache works almost independently of the in-memory cache. When data is written to the cache, it is written to both of them (write-through). On lookup, the in-memory cache is checked first, then the on-disk cache. If the data is found only on disk, it is also put in memory.
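The read and write paths described above can be sketched as a toy model. This is only an illustration: the class name `TwoTierCache` is hypothetical, and plain dictionaries stand in for both the in-memory cache and the files on disk.

```python
from typing import Dict, Optional


class TwoTierCache:
    """Toy model of the read/write path; plain dicts stand in for both tiers."""

    def __init__(self) -> None:
        self.memory: Dict[str, bytes] = {}
        self.disk: Dict[str, bytes] = {}  # stand-in for files on disk

    def put(self, key: str, value: bytes) -> None:
        # Write-through: every insert goes to both tiers.
        self.memory[key] = value
        self.disk[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # 1. Look in memory first.
        value = self.memory.get(key)
        if value is not None:
            return value
        # 2. Then look on disk; a disk hit is also put in memory.
        value = self.disk.get(key)
        if value is not None:
            self.memory[key] = value
        return value
```

Eviction and size limits are left out on purpose; each tier would apply its own limits independently.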
Therefore, the on-disk cache can have independent configurations of max size, max elements, max element size, compression method, compression level, etc.
The on-disk cache is organized as a directory with subdirectories named after the first letters of the hexadecimal cache key, and files named after the hexadecimal cache keys, similar to the filesystem cache. Eviction removes the files but leaves the subdirectories in place.
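A minimal sketch of this layout follows. The RFC does not say how many leading letters form the subdirectory name, so `PREFIX_LEN = 3` is an assumption, and the helper names `entry_path` and `evict` are hypothetical.

```python
from pathlib import Path

PREFIX_LEN = 3  # assumption: the RFC does not say how many leading letters are used


def entry_path(cache_root: Path, key_hex: str) -> Path:
    """Return <root>/<first PREFIX_LEN hex chars>/<full hex key>."""
    key_hex = key_hex.lower()
    if len(key_hex) < PREFIX_LEN or any(c not in "0123456789abcdef" for c in key_hex):
        raise ValueError(f"not a usable hexadecimal key: {key_hex!r}")
    return cache_root / key_hex[:PREFIX_LEN] / key_hex


def evict(cache_root: Path, key_hex: str) -> None:
    """Remove the entry file but leave its subdirectory in place."""
    entry_path(cache_root, key_hex).unlink(missing_ok=True)
```

Leaving the subdirectories in place bounds their number by the prefix space (16^PREFIX_LEN here) and avoids a race between removing a directory and writing a new entry into it.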
It should be possible to have no in-memory cache, but only on-disk cache.
The on-disk cache can be configured with custom disks (e.g., s3_plain), and allows co-locating inside the filesystem cache, similar to how it is done for temporary data on disk.
The metadata (the set of keys existing in the on-disk cache) is loaded into memory at server startup, and the LRU information is maintained in memory. When the server is restarted, the information for the eviction policy (last access times) is lost, because it is not persisted to disk.
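The startup scan and the in-memory LRU bookkeeping could look roughly like this. The class `DiskCacheIndex` and its methods are hypothetical, and the sketch assumes a two-level `<root>/<prefix>/<key>` layout.

```python
from collections import OrderedDict
from pathlib import Path
from typing import Optional


class DiskCacheIndex:
    """In-memory index of on-disk entries: key -> size, ordered from least to most recently used."""

    def __init__(self, cache_root: Path) -> None:
        self.entries: "OrderedDict[str, int]" = OrderedDict()
        # Startup scan over <root>/<prefix>/<key>. Last-access times are not stored
        # on disk, so the initial order is arbitrary (sorted by path here).
        for path in sorted(cache_root.glob("*/*")):
            if path.is_file():
                self.entries[path.name] = path.stat().st_size

    def touch(self, key: str) -> bool:
        """Record an access; return False if the key is unknown."""
        if key not in self.entries:
            return False
        self.entries.move_to_end(key)
        return True

    def pick_victim(self) -> Optional[str]:
        """Return the least recently used key, or None if the index is empty."""
        return next(iter(self.entries), None)
```

Because the order is rebuilt from scratch on every start, the first evictions after a restart are effectively arbitrary until enough accesses have been recorded.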
The server may additionally check (during the lookup) if the file suddenly appeared on disk - to support scenarios with a shared disk space between multiple servers.
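One possible shape for that extra check, as a sketch only: on an index miss, probe the expected path and adopt the file if it is there. The function `locate`, the `known_keys` set, and the default `prefix_len` of 3 are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional, Set


def locate(cache_root: Path, known_keys: Set[str], key_hex: str, prefix_len: int = 3) -> Optional[Path]:
    """Return the entry's path if it is indexed or has appeared on disk, else None."""
    path = cache_root / key_hex[:prefix_len] / key_hex
    if key_hex in known_keys:
        return path
    # Index miss: another server sharing the disk may have written the file.
    if path.is_file():
        known_keys.add(key_hex)  # adopt the entry into the local index
        return path
    return None
```

The probe costs one extra filesystem call per miss, which may be noticeable on object storage, so it probably makes sense as an opt-in setting.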
If there is an exception related to a wrong file size (empty) or a wrong checksum during reading, the cache entry should be discarded.
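The discard-on-corruption rule can be illustrated as follows. The on-disk entry format is not specified in this RFC, so the layout used here (a 4-byte big-endian CRC32 followed by the payload) is purely hypothetical, as are the helper names `write_entry` and `read_entry`.

```python
import zlib
from pathlib import Path
from typing import Optional

HEADER_LEN = 4  # hypothetical layout: 4-byte big-endian CRC32, then the payload


def write_entry(path: Path, payload: bytes) -> None:
    """Write the checksum header followed by the payload."""
    path.write_bytes(zlib.crc32(payload).to_bytes(HEADER_LEN, "big") + payload)


def read_entry(path: Path) -> Optional[bytes]:
    """Return the payload, or None (after deleting the file) if the entry is missing or corrupt."""
    try:
        raw = path.read_bytes()
    except FileNotFoundError:
        return None
    payload = raw[HEADER_LEN:]
    stored = int.from_bytes(raw[:HEADER_LEN], "big")
    # An empty or truncated file has no complete header; a mismatch means corruption.
    if len(raw) < HEADER_LEN or zlib.crc32(payload) != stored:
        path.unlink(missing_ok=True)  # discard the bad entry; the caller sees a miss
        return None
    return payload
```

Treating a bad entry as a cache miss means the query is simply executed again and its result rewritten, so corruption never surfaces as a query error.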