Substrate Runtime Interface #2334
Description
The current substrate runtime API is not the easiest to work with:
- We have a couple of different runtime environments (`with_std`/`without_std`, or rather, statically linked native code and the wasm runtime), however the interface should be the same.
- Error handling might differ between environments.
- Implementation and interface are located in different places but they are really tightly coupled and I mean it: say, a resulting pointer from runtime API must have a certain alignment or we depend on that in unsafe code. TBH I won't be surprised if there is broken unsafe code out there since it is so easy to break it, so hard to check, and we don't really treat this code as unsafe (e.g. there are no proofs on safety).
- These unsafe invariants are not usually stated anywhere, they are just in code. There is also a lack of documentation, and the reason for that might be that it is just not clear where this doc belongs.
- There is this weird behavior of `impl_function_executor`, which allows you to declare parameters such as `usize` which, however, will have the `u32` type; this might be baffling for new people working with this code. There was a PR that introduces an honest Rust interface, but unfortunately it has a downside: there is a desire for signatures of host functions to be trivially copyable without any changes, to reduce error-proneness.
- There are some differences and pitfalls to keep in mind when designing Substrate Runtime APIs. For example, multiple returns are not supported, so if there is a need to return multiple values it has to be worked around somehow. And yeah, there are some inconsistencies at the moment in the current implementation.
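To make the multiple-return pitfall concrete, here is a minimal sketch of one possible workaround: packing a pointer and a length into a single `u64`. The function names and the bit layout (length in the high half, pointer in the low half) are assumptions for illustration, not what the current implementation does.

```rust
/// Pack a 32-bit wasm pointer and a 32-bit length into one `u64`, so that a
/// function limited to a single return value can still hand back both.
/// Layout (assumed for this sketch): high 32 bits = length, low 32 bits = pointer.
pub fn pack_ptr_len(ptr: u32, len: u32) -> u64 {
    (u64::from(len) << 32) | u64::from(ptr)
}

/// Inverse of `pack_ptr_len`: returns `(ptr, len)`.
pub fn unpack_ptr_len(packed: u64) -> (u32, u32) {
    let ptr = (packed & 0xFFFF_FFFF) as u32;
    let len = (packed >> 32) as u32;
    (ptr, len)
}
```

Both sides have to agree on this layout by hand today, which is exactly the kind of invariant that is easy to break.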
Basically, this is very error-prone and boilerplate-heavy code. This sounds like a good use case for code generation. Here is my strawman proposal for such a code generator:
We could introduce a special AST that describes an interface between the substrate runtime and the substrate host. It would support rather high-level types, e.g.: bytes_vec/bytes, bool, (T, J) (a tuple), *T (a raw pointer), [T; N] for parameters and return values. We could also declare if a function traps, maybe with a type Result or a special annotation.
Here is an example of how it could look:
# Not a rust file
# usize is deliberately not supported, use u32.
fn malloc(size: u32) -> *const u8;
fn free(ptr: *const u8);
# `bytes` is converted to `Vec<u8>` on the substrate host side, and `&[u8]` on the rust side.
fn storage_exists(key: bytes) -> bool;
# `bytes?` - notation for optional value (for denoting absence of a value under the given key)
# note that we don't deal with problems such as how to return a bytes vector (which is two components, `ptr` and `len`) via wasm: all such ABI details will be handled by code generation.
# note that we also don't need to care about details such as creating slices with `from_raw_parts` and upholding all of its invariants.
fn child_storage(storage_key: bytes, key: bytes) -> bytes?;
and so on.
(Note that a new language is not necessary for this; we only need a model definition, which could be a Rust expression creating the model struct, or it could be YAML.)
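As a rough sketch of the "Rust expression creating the model struct" option, the interface above could be described with plain data types. All of the type and function names here (`Ty`, `Function`, `example_interface`) are hypothetical and only illustrate the shape of such a model.

```rust
/// Types that the interface model supports (hypothetical subset).
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    U8,
    U32,
    Bool,
    /// `bytes`: `Vec<u8>` on the host side, `&[u8]` on the runtime side.
    Bytes,
    /// `*const T`
    ConstPtr(Box<Ty>),
    /// `T?`: an optional value.
    Optional(Box<Ty>),
}

/// One function of the runtime interface.
#[derive(Debug, Clone, PartialEq)]
pub struct Function {
    pub name: &'static str,
    pub doc: &'static str,
    pub params: Vec<(&'static str, Ty)>,
    pub ret: Option<Ty>,
}

/// The example interface from the text, written as a Rust expression.
pub fn example_interface() -> Vec<Function> {
    let u8_ptr = Ty::ConstPtr(Box::new(Ty::U8));
    vec![
        Function {
            name: "malloc",
            doc: "Allocate `size` bytes.",
            params: vec![("size", Ty::U32)],
            ret: Some(u8_ptr.clone()),
        },
        Function {
            name: "free",
            doc: "Free a previously allocated pointer.",
            params: vec![("ptr", u8_ptr)],
            ret: None,
        },
        Function {
            name: "storage_exists",
            doc: "Check whether a value exists under `key`.",
            params: vec![("key", Ty::Bytes)],
            ret: Some(Ty::Bool),
        },
        Function {
            name: "child_storage",
            doc: "Read a value from child storage, if present.",
            params: vec![("storage_key", Ty::Bytes), ("key", Ty::Bytes)],
            ret: Some(Ty::Optional(Box::new(Ty::Bytes))),
        },
    ]
}
```

The `doc` field is there because the model would also be the single place where the interface gets documented.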
Having this model, we can use some build.rs code generation for building definitions for several crates, such as:
- `sri-guest`: declarations of every API function for usage from the runtime side (hence guest). Probably has `without_std` and `with_std` versions generated. Supersedes most of `sr-io`, the allocator part of `sr-std`, and the externs part of `srml-sandbox`.
- `sri-host-wasmi`: glue code that dispatches a call to `Externals` to the appropriate function in some trait; supersedes the current `impl_function_executor`. This trait would be implemented in today's `wasm_executor.rs`.
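A minimal sketch of what the generator for the guest side could do: lower each high-level type to wasm-level integers once, then emit a declaration string per function. The lowering rules (`bytes` as a pointer/length pair of `u32`, `bool` as `u32`, a returned `bytes` as a packed `u64`) and the `ext_` prefix are assumptions made for this example only.

```rust
/// A reduced, hypothetical version of the model types.
#[derive(Debug, Clone, Copy)]
pub enum Ty {
    U32,
    Bool,
    Bytes,
}

pub struct Function {
    pub name: &'static str,
    pub params: Vec<(&'static str, Ty)>,
    pub ret: Option<Ty>,
}

/// Lower one high-level parameter to one or more wasm-level parameters.
fn lower_param(name: &str, ty: Ty) -> Vec<String> {
    match ty {
        Ty::U32 | Ty::Bool => vec![format!("{}: u32", name)],
        // `bytes` crosses the boundary as a (pointer, length) pair.
        Ty::Bytes => vec![format!("{}_ptr: u32", name), format!("{}_len: u32", name)],
    }
}

/// Lower a high-level return type to a single wasm-level value.
fn lower_ret(ty: Ty) -> &'static str {
    match ty {
        Ty::U32 | Ty::Bool => "u32",
        // Assumed: pointer and length packed into one `u64`.
        Ty::Bytes => "u64",
    }
}

/// Emit the low-level declaration that the guest crate would contain.
pub fn guest_extern_decl(f: &Function) -> String {
    let params: Vec<String> = f
        .params
        .iter()
        .flat_map(|(name, ty)| lower_param(name, *ty))
        .collect();
    let ret = match f.ret {
        Some(ty) => format!(" -> {}", lower_ret(ty)),
        None => String::new(),
    };
    format!("fn ext_{}({}){};", f.name, params.join(", "), ret)
}
```

In a real setup this would run from `build.rs` and write into the generated crate; a host-side generator would walk the same model with a different set of lowering functions.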
Code generation gives a lot of benefits, here are some:
- As was said, it removes duplication and decreases error-proneness.
- The API is in one place, so it could be more consistent.
- One single place for documentation of the Substrate Runtime Interface.
- Every implementation quirk is handled only once per type, not in every individual case.
- Definitions could potentially be used by other implementations of the Substrate Runtime Interface.
- We would be able to experiment more easily. For example, we could benchmark different ways of passing values, see what works best, etc.
- It would let us decouple from wasmi and integrate other engines much more easily. The thing is, we most likely want to have a couple of wasm engines at the same time (i.e. we want compilers for performance reasons, but we also want wasmi because it is super robust). So we could just generate different boilerplate for calling some trait methods and dispatch them differently depending on the engine (`Externals` for wasmi, potentially `extern "C" fn` for others).
- We would be able to easily change our wasm ABI: there is a proposal for multi-value returns in wasm. Code generation would make it easier to migrate to this.
- It might give us the ability to introduce versioning (akin to the `impl_runtime_apis` macro, but the other way around? :) ).
- I might be wrong, but maybe it will provide us an easier way to implement cumulus and maybe make this code a bit cleaner.
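To illustrate the engine-decoupling point, here is a small sketch of a host-side trait plus interpreter-style glue that dispatches by function index. It deliberately does not use wasmi's real API: `HostFunctions`, `InMemoryHost`, `dispatch_by_index` and the index constant are all hypothetical names, and guest memory is modelled as a plain byte slice.

```rust
use std::collections::HashSet;

/// The engine-independent trait that generated glue would call into.
pub trait HostFunctions {
    fn storage_exists(&mut self, key: &[u8]) -> bool;
}

/// A toy host implementation backed by a set of existing keys.
pub struct InMemoryHost {
    pub keys: HashSet<Vec<u8>>,
}

impl HostFunctions for InMemoryHost {
    fn storage_exists(&mut self, key: &[u8]) -> bool {
        self.keys.contains(key)
    }
}

/// Index assigned to `storage_exists` by the (hypothetical) generator.
pub const STORAGE_EXISTS_INDEX: usize = 0;

/// Glue for an interpreter-style engine: decode raw `u32` arguments,
/// bounds-check guest memory, then call the trait method.
pub fn dispatch_by_index<H: HostFunctions>(
    host: &mut H,
    memory: &[u8],
    index: usize,
    args: &[u32],
) -> Result<u32, String> {
    match index {
        STORAGE_EXISTS_INDEX => {
            let (ptr, len) = match args {
                [ptr, len] => (*ptr as usize, *len as usize),
                _ => return Err("storage_exists expects 2 arguments".to_string()),
            };
            let end = ptr
                .checked_add(len)
                .ok_or_else(|| "pointer overflow".to_string())?;
            let key = memory
                .get(ptr..end)
                .ok_or_else(|| "key out of bounds".to_string())?;
            // `bool` is lowered to `u32` at the wasm boundary.
            Ok(host.storage_exists(key) as u32)
        }
        _ => Err(format!("unknown host function index {}", index)),
    }
}
```

A compiled engine would get a differently generated entry point (for example an `extern "C" fn`) that ends up calling the same trait method, so the host implementation itself would not need to change.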