Skip to content

Vision: Host functions and database support for non-Merklised-persistent data structures #245

@gavofyork

Description

@gavofyork

Related: #278
Related: #359

NOTE: This is intended to evolve into a full RFC, but is at present more of a bare collection of notes.

New overlay items:

  • Vec<u8>, keyed by name: Name.
  • BTreeMap<Vec<u8>, Vec<u8>>, keyed by name: Name.

These obey transactional principles. They may, at the direction of the runtime through a host function, be archived in a regular, non-Merklised database for later off-chain querying. An RPC may be provided in order to query the contents (if stored).

There are two sets of new host functions for creating, manipulating and querying these items; one set for the Vec<_>s and one for the BTreeMap<_>s.

The set for the Vec<_>s are prefixed blob_ and are:

  • blob_new(name: Name, mode: Mode)
  • blob_set(name: Name, value: &[u8])
  • blob_clone(name: Name, target_name: Name)
  • blob_rename(name: Name, target_name: Name)
  • blob_edit(name: Name, data: &[u8], offset: u32) -> u32
  • blob_append(name: Name, suffix: &[u8])
  • blob_exists(name: Name) -> bool
  • blob_get(name: Name) -> Option<Vec<u8>>
  • blob_len(name: Name) -> Option<u32>
  • blob_hash32(name: Name, algorithm: Hash32Algorithm) -> Option<[u8; 32]>
  • blob_delete(name: Name)

The set for the BTreeMap<_>s are prefixed map_ and are:

  • map_new(name: Name, mode: Mode): Creates a new map name with no items. It is cleared if it already exists.
  • map_clone(name: Name, target_name: Name): Creates a clone of the map name with name target_name.
  • map_rename(name: Name, target_name: Name): Alters the name of map name to target_name.
  • map_insert(name: Name, key: &[u8], value: &[u8]): Inserts a single (key, value) pair into map name, creating the map if it did not previously exist and overwriting the item if it did.
  • map_remove_item(name: Name, key: &[u8]): Removes the pair with the given key from the map name, if the map exists and contains the item. Does nothing otherwise.
  • map_exists(name: Name, key: &[u8]) -> bool: Returns true iff the map name is present in execution state.
  • map_contains(name: Name, key: &[u8]) -> Option<bool>: Returns Some iff the map name exists, None otherwise. The inner value of Some is true iff the map name contains key.
  • map_item_get(name: Name, key: &[u8]) -> Option<Vec<u8>>: Returns Some iff the map name exists and contains key. If so, the inner value is that associated with key.
  • map_item_len(name: Name, key: &[u8]) -> Option<u32>: Returns Some iff the map name exists and contains key. If so, the inner value is the length of the value associated with key.
  • map_item_hash32(name: Name, key: &[u8], algorithm: Hash32Algorithm) -> Option<[u8; 32]>: Returns Some iff the map name exists and contains key. If so, the inner value is the value associated with key when hashed with algorithm.
  • map_count(name: Name) -> Option<u32>: Returns Some iff the map name exists, None otherwise. If Some, then the inner value is the number of items in the map name.
  • map_root32(name: Name, structure: Root32Structure) -> Option<[u8; 32]>: Calculates and returns the root of the data structure structure containing the items held in the map name. Returns None if map name does not exist.
  • map_dump(name: Name) -> Option<Vec<(Vec<u8>, Vec<u8>)>>: Returns Some of a Vec of all items in the map name, sorted; or None if the map name does not exist.
  • map_dump_hashed(name: Name, algorithm: Hash32Algorithm) -> Option<Vec<([u8; 32], [u8; 32])>>: Returns Some of a Vec of all pairs of keys and values in the map name hashed with algorithm and in order of the (unhashed) key; or None if the map name does not exist.
  • map_next_key(name: Name, key: &[u8], count: u32) -> Option<Vec<Vec<u8>>>: Returns up to the next count keys in map name immediately following key. If fewer items exist after key than count, then only the remaining items are returned. If the map name does not exist then None is returned.
  • map_delete(name: Name): Delete the map name, clearing all data.

There exists Hash32Algorithm and Root32Structure which should ultimately be defined in separate RFCs.

Mode is defined as:

enum Mode {
    #[codec(index = 0)]
    Drop,
    #[codec(index = 1)]
    Archive,
}
  • Drop: The data may be discarded at the end of block execution without possibility of later querying, on-chain or off-chain. If creation of an item occurs without an explicit mode being given, then the default mode ThrowAway is assumed.
  • Archive: The data should be retained in an database associated with the block which is executing. It will not be accessible on-chain in later blocks (unless e.g. oraclised in some way).

It is intended that future RFCs introduce additional items to this as needed.

Runtime interfaces may exist allowing the runtime to generate proofs for light-clients.

These items can be used to ween ourselves from (mis-)using the Main State Trie, especially for large items which should never be in the PoV.

This functionality can be used to remove system events and code (:CODE) from the main state trie, avoiding possible PoV issues and the problems with storing events as a Vec<Event> (effectively concatenating encoded items). It allows for custom indexing and proof systems, improving efficiency and efficacy.

The host function map_dump_hashed allows for arbitrary digest ("Merkle hash") calculation within the runtime, allowing e.g. indexing by event topic and for proofs to use a base-2 Merkle structure. Multiple roots could be calculated to optimise for multiple indexing systems.

The host function blob_hash32 allows for comparison and retrieval of large blobs, e.g. of runtime code. The location of the "actual" code could even be moved outside of the main state into one of these items.

Storage values and maps may be marked as non-persistent and these host functions may be used, resulting in better performance as well as guaranteed non-persistence. This is useful for block-specific data (like block height), contextual information (e.g. transaction index) as well safe "thread-local" storage.

Future Work

The Mode may be extended to include the possibility of persisting the regular (non-Merklised) key/value database between blocks. Additional configuration may allow a map's keys to be of fixed length.

Metadata

Metadata

Assignees

Labels

I5-enhancementAn additional feature request.T1-FRAMEThis PR/Issue is related to core FRAME, the framework.

Type

No type

Projects

Status

Draft

Status

Backlog

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions