Support for records bigger than a page #332

@lvca

Discussed in #320

Originally posted by lvca February 7, 2022
Currently (version <= 22.02.1) ArcadeDB does not support storing records larger than a page. This limitation must be overcome to:

  • store large records, especially binary records (blobs)
  • avoid worrying about re-importing the database if business requirements change and larger records are needed.

First, a quick introduction to how records are stored on a page. The first part of the page contains a header of slots, each holding a pointer to where the record content is located on the page. A page can hold at most 2048 records (Bucket.DEF_MAX_RECORDS_IN_PAGE), one per slot. The record content is prefixed by its size, stored as a varint (variable-length integer). Normally the size is the length of the record, but some values have a special meaning:

  • 0 = deleted record
  • -1 = placeholder pointer that points to another record on another page
  • <-1 = placeholder content pointed to by another record on another page
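
As a rough illustration of the size values listed above, the following sketch classifies a decoded record size. The class, enum, and method names are hypothetical and are not taken from the ArcadeDB code base; only the numeric meanings come from this description.

```java
// Hypothetical helper: names are illustrative, not ArcadeDB's actual API.
final class RecordSizeKind {
    enum Kind { DELETED, PLACEHOLDER_POINTER, PLACEHOLDER_CONTENT, REGULAR }

    // Maps the size prefix of a record (already decoded from its varint)
    // to the meaning described in the list above.
    static Kind classify(long size) {
        if (size == 0) return Kind.DELETED;              // 0 = deleted record
        if (size == -1) return Kind.PLACEHOLDER_POINTER; // -1 = points to another record
        if (size < -1) return Kind.PLACEHOLDER_CONTENT;  // <-1 = content pointed to from elsewhere
        return Kind.REGULAR;                             // positive = plain record length
    }
}
```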

The minimum size of a record stored on a page is 5 bytes. If the record is smaller than 5 bytes, it is padded with blanks.
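
To make the varint size prefix more concrete, here is a minimal sketch of a signed variable-length integer codec. It assumes the common ZigZag plus base-128 scheme; this description does not say which varint variant ArcadeDB uses, so the byte layout and the class name `VarIntSketch` should be read as assumptions.

```java
import java.io.ByteArrayOutputStream;

// Assumption: ZigZag + base-128 varint. ArcadeDB's real encoding may differ.
final class VarIntSketch {
    static byte[] encode(long value) {
        // ZigZag maps small negative numbers to small unsigned ones (-1 -> 1, -2 -> 3).
        long v = (value << 1) ^ (value >> 63);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80)); // 7 payload bits, high bit = "more bytes follow"
            v >>>= 7;
        }
        out.write((int) v);
        return out.toByteArray();
    }

    static long decode(byte[] bytes) {
        long raw = 0;
        int shift = 0;
        for (byte b : bytes) {
            raw |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break; // last byte of this varint
            shift += 7;
        }
        return (raw >>> 1) ^ -(raw & 1); // undo ZigZag
    }
}
```

Under this assumed encoding, special sizes such as 0, -1 and -2 each fit in a single byte, while larger lengths grow by one byte per 7 bits.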

To store large records, the record must be split into chunks that are saved in sequence as a linked list. To let the bucket know that a slot holds a chunk of a larger record, a new size value of -2 is used.

NOTE: This doesn't interfere with existing content. Negative sizes below -1 are treated as placeholder content, but because a record cannot be smaller than 5 bytes, the closest to zero such a size can be is -5. A placeholder content record with record size = -2 therefore cannot exist.

So a record with size -2 will contain the first chunk of the record. The record content will have the following information before the actual chunk of data:

  • the chunk size in bytes as varint
  • the location of the next chunk, stored as a placeholder pointer containing the record slot in the same bucket
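
The chunk layout above can be sketched as a simple split routine. This is only an in-memory model: the `Chunk` class, the `split` method, the `maxChunkBytes` parameter, and the choice of consecutive slot numbers are all hypothetical. A next pointer of 0 marks the last chunk, as described in this proposal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model of chunking; not ArcadeDB's storage code.
final class ChunkSplitSketch {
    static final class Chunk {
        final byte[] data;   // chunk payload (its length is the "chunk size")
        final int nextSlot;  // slot of the next chunk, 0 = no further chunk
        Chunk(byte[] data, int nextSlot) { this.data = data; this.nextSlot = nextSlot; }
    }

    // Splits payload into chunks of at most maxChunkBytes.
    // Assumption: chunks land in consecutive slots starting at firstSlot (>= 1).
    static List<Chunk> split(byte[] payload, int maxChunkBytes, int firstSlot) {
        if (maxChunkBytes <= 0) throw new IllegalArgumentException("maxChunkBytes must be positive");
        List<Chunk> chunks = new ArrayList<>();
        int slot = firstSlot;
        for (int from = 0; from < payload.length; from += maxChunkBytes) {
            int to = Math.min(from + maxChunkBytes, payload.length);
            boolean last = (to == payload.length);
            chunks.add(new Chunk(Arrays.copyOfRange(payload, from, to), last ? 0 : ++slot));
        }
        return chunks;
    }
}
```

For example, a 10-byte payload with `maxChunkBytes = 4` and `firstSlot = 1` yields three chunks of 4, 4 and 2 bytes whose next pointers are 2, 3 and 0.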

To read the entire record, it is rebuilt in memory chunk by chunk, following the next-chunk pointer from page to page until that pointer is 0 (zero).
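
The read path might look like the following sketch, which walks the linked list until the next pointer is 0. Pages are modelled here as a plain map from slot number to chunk; the `Chunk` class and `readAll` method are hypothetical names, and real code would load pages from the bucket instead.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;

// Hypothetical in-memory model of reading a chunked record; not ArcadeDB's API.
final class ChunkReadSketch {
    static final class Chunk {
        final byte[] data;   // chunk payload
        final int nextSlot;  // slot of the next chunk, 0 = end of the record
        Chunk(byte[] data, int nextSlot) { this.data = data; this.nextSlot = nextSlot; }
    }

    // Rebuilds the full record by following next pointers from firstSlot.
    static byte[] readAll(Map<Integer, Chunk> slots, int firstSlot) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int slot = firstSlot;
        while (slot != 0) {
            Chunk chunk = slots.get(slot);
            if (chunk == null) throw new IllegalStateException("Broken chunk chain at slot " + slot);
            out.write(chunk.data, 0, chunk.data.length);
            slot = chunk.nextSlot;
        }
        return out.toByteArray();
    }
}
```

This sketch does not guard against cycles in a corrupted chain; a production reader would probably want a bound on the number of hops.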

The first chunk has a record size = -2, while the other chunks will have a record size = -3. This allows the scan() and count() methods to skip those records once encountered.
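
One possible reading of how count() could use these markers is sketched below. It assumes that a first chunk (-2) counts as one logical record, that continuation chunks (-3) are skipped, and that placeholder pointers (-1) are counted while placeholder content (<= -5) is not. These rules and all names are assumptions made for illustration, not confirmed ArcadeDB behavior.

```java
// Hypothetical counting rule over decoded size prefixes; not ArcadeDB's count().
final class ScanSkipSketch {
    static final long FIRST_CHUNK = -2; // first chunk of a multi-page record
    static final long NEXT_CHUNK = -3;  // any following chunk

    // Assumption: each logical record is counted exactly once, at its entry slot.
    static boolean countsAsRecord(long size) {
        if (size > 0) return true;             // regular record
        if (size == -1) return true;           // placeholder pointer stands for a moved record
        if (size == FIRST_CHUNK) return true;  // multi-page record, counted at its first chunk
        return false;                          // deleted (0), NEXT_CHUNK, placeholder content
    }

    static int count(long[] sizes) {
        int total = 0;
        for (long size : sizes) {
            if (countsAsRecord(size)) total++;
        }
        return total;
    }
}
```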

Labels: enhancement (New feature or request)