This repository was archived by the owner on Aug 31, 2023. It is now read-only.
Compile wasmtime with effect-handlers/wasm-tools:func-ref-2#3
Merged
dhil merged 37 commits into effect-handlers:func-ref from Aug 5, 2022
Instead of a regular `Vec`. These vectors are usually very small, for example here is the histogram of sizes when running Sightglass's `pulldown-cmark` benchmark:

```
;; Number of samples = 10332
;; Min = 0
;; Max = 11
;;
;; Mean = 2.496128532713901
;; Standard deviation = 2.2859559855427243
;; Variance = 5.225594767838607
;;
;; Each ∎ is a count of 62
;;
0 .. 1 [ 3134 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
1 .. 2 [ 2032 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
2 .. 3 [ 159 ]: ∎∎
3 .. 4 [ 838 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎
4 .. 5 [ 970 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
5 .. 6 [ 2566 ]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
6 .. 7 [ 303 ]: ∎∎∎∎
7 .. 8 [ 272 ]: ∎∎∎∎
8 .. 9 [ 40 ]:
9 .. 10 [ 18 ]:
```

By using a `SmallVec` with capacity of 6 we avoid the vast majority of heap allocations and get some nice benchmark wins of up to ~1.11x faster compilation.

<h3>Sightglass Benchmark Results</h3>

```
compilation :: nanoseconds :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 340361395.90 ± 63384608.15 (confidence = 99%)

  main.so is 0.88x to 0.92x faster than smallvec.so!
  smallvec.so is 1.09x to 1.13x faster than main.so!

  [3101467423 3425524333.41 4060621653] main.so
  [2820915877 3085162937.51 3375167352] smallvec.so

compilation :: cycles :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 988446098.59 ± 184075718.89 (confidence = 99%)

  main.so is 0.88x to 0.92x faster than smallvec.so!
  smallvec.so is 1.09x to 1.13x faster than main.so!

  [9006994951 9948091070.66 11792481990] main.so
  [8192243090 8959644972.07 9801848982] smallvec.so

compilation :: nanoseconds :: benchmarks/bz2/benchmark.wasm

  Δ = 7854567.87 ± 2215491.16 (confidence = 99%)

  main.so is 0.89x to 0.94x faster than smallvec.so!
  smallvec.so is 1.07x to 1.12x faster than main.so!

  [80354527 93864666.76 119789198] main.so
  [77554917 86010098.89 94726994] smallvec.so

compilation :: cycles :: benchmarks/bz2/benchmark.wasm

  Δ = 22810509.85 ± 6434024.63 (confidence = 99%)

  main.so is 0.89x to 0.94x faster than smallvec.so!
  smallvec.so is 1.07x to 1.12x faster than main.so!

  [233358190 272593088.57 347880715] main.so
  [225227821 249782578.72 275097380] smallvec.so

compilation :: nanoseconds :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 10849521.41 ± 4324757.85 (confidence = 99%)

  main.so is 0.90x to 0.96x faster than smallvec.so!
  smallvec.so is 1.04x to 1.10x faster than main.so!

  [133875427 156859544.47 222455440] main.so
  [126073854 146010023.06 181611647] smallvec.so

compilation :: cycles :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 31508176.97 ± 12559561.91 (confidence = 99%)

  main.so is 0.90x to 0.96x faster than smallvec.so!
  smallvec.so is 1.04x to 1.10x faster than main.so!

  [388788638 455536988.31 646034523] main.so
  [366132033 424028811.34 527419755] smallvec.so
```
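As a rough check on the "vast majority" claim, the sketch below sums the histogram buckets shown above. It assumes that each bucket `k .. k+1` counts vectors of length exactly `k`, which the histogram output does not state explicitly. The names `BUCKETS` and `fraction_fitting_inline` are made up for illustration and are not part of the patch.

```rust
/// Bucket counts copied from the `pulldown-cmark` histogram above.
/// Assumption: bucket `k .. k+1` holds the vectors of length `k`.
const BUCKETS: [u64; 10] = [3134, 2032, 159, 838, 970, 2566, 303, 272, 40, 18];

/// Fraction of sampled vectors whose length is at most `inline_capacity`,
/// i.e. vectors that an inline buffer of that capacity could hold without
/// spilling to the heap.
fn fraction_fitting_inline(inline_capacity: usize) -> f64 {
    let total: u64 = BUCKETS.iter().sum();
    // Lengths 0..=inline_capacity correspond to the first
    // `inline_capacity + 1` buckets under the assumption above.
    let fitting: u64 = BUCKETS.iter().take(inline_capacity + 1).sum();
    fitting as f64 / total as f64
}
```

Under that assumption, `fraction_fitting_inline(6)` evaluates to 10002 / 10332, or roughly 97% of the samples.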
These are always length 1 for Wasm benchmarks.

<h3>Sightglass Benchmark Results</h3>

```
compilation :: nanoseconds :: benchmarks/spidermonkey/benchmark.wasm

  Δ = 328624015.86 ± 40274677.93 (confidence = 99%)

  main.so is 0.88x to 0.91x faster than slots-smallvec.so!
  slots-smallvec.so is 1.10x to 1.13x faster than main.so!

  [3070752447 3203778792.55 3446269274] main.so
  [2503544039 2875154776.69 3197966713] slots-smallvec.so

compilation :: nanoseconds :: benchmarks/pulldown-cmark/benchmark.wasm

  Δ = 9685705.06 ± 3221286.87 (confidence = 99%)

  main.so is 0.91x to 0.96x faster than slots-smallvec.so!
  slots-smallvec.so is 1.05x to 1.09x faster than main.so!

  [129356493 145594942.79 165038803] main.so
  [118555011 135909237.73 188780619] slots-smallvec.so

compilation :: nanoseconds :: benchmarks/bz2/benchmark.wasm

  No difference in performance.

  [79080493 86757564.46 112649639] main.so
  [78083384 85934125.69 94992743] slots-smallvec.so
```
dhil added a commit to dhil/wasm-tools that referenced this pull request on Aug 3, 2022
This patch implements (most parts of) the frontend for the function
references proposal (currently we have not implemented initialised
locals). Moreover, the crates shrink, smith, and mutate have yet to
be made compatible with the new features; once the changeset has been
approved we can adjust those crates.
Broadly, the implementation of function references follows the
proposed spec closely, and as such, this patch introduces some
important changes to both the type structure and the validation
algorithm.
The structure of `ValType` is no longer flat; instead, it contains a
new constructor `Ref(RefType)` which is parameterised by `RefType`,
which is a new struct type. This type is itself parameterised by a
`HeapType`, which is a new enumeration type. In addition, `ValType` is
extended with the constructor `Bot` to represent the special
bottom type. It is special in the sense that it can only appear during
validation as there is no surface syntax for it (this is in
accordance with the spec). Concretely, `ValType`, `RefType`, and
`HeapType` are now defined as follows.
```rust
pub enum ValType {
    /// The value type is i32.
    I32,
    /// The value type is i64.
    I64,
    /// The value type is f32.
    F32,
    /// The value type is f64.
    F64,
    /// The value type is v128.
    V128,
    /// The value type is a reference. Which type of reference is decided by
    /// RefType. This is a change in syntax from the function references proposal,
    /// which now provides FuncRef and ExternRef as sugar for the generic ref
    /// construct.
    Ref(RefType),
    /// Special bottom type.
    Bot,
}

pub struct RefType {
    /// Whether it's nullable
    pub nullable: bool,
    /// The relevant heap type
    pub heap_type: HeapType,
}

pub enum HeapType {
    /// It seems by example that u32s are directly used for arbitrary indexes,
    /// but maybe a higher-level structure like TypeRef is relevant here?
    Index(u32),
    /// From reference types
    Func,
    /// From reference types
    Extern,
    /// Special bottom heap type
    Bot,
}
```
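To make the "sugar" remark in the doc comment concrete, here is a minimal sketch of how `funcref` and `externref` could be spelled with the generic constructor, assuming both are read as nullable references as the proposal overview describes. The types are local copies of the definitions above; the `derive` attributes, the constant names `FUNC_REF` and `EXTERN_REF`, and the helper `non_null_ref_to` are hypothetical and may not match the patch.

```rust
// Local copies of the definitions above, with derives added for the example.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HeapType {
    Index(u32),
    Func,
    Extern,
    Bot,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RefType {
    pub nullable: bool,
    pub heap_type: HeapType,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ValType {
    I32,
    I64,
    F32,
    F64,
    V128,
    Ref(RefType),
    Bot,
}

/// `funcref` read as sugar for a nullable reference to `func` (hypothetical name).
pub const FUNC_REF: ValType = ValType::Ref(RefType {
    nullable: true,
    heap_type: HeapType::Func,
});

/// `externref` read as sugar for a nullable reference to `extern` (hypothetical name).
pub const EXTERN_REF: ValType = ValType::Ref(RefType {
    nullable: true,
    heap_type: HeapType::Extern,
});

/// Hypothetical helper: a non-nullable reference to the defined type at `index`.
pub fn non_null_ref_to(index: u32) -> ValType {
    ValType::Ref(RefType {
        nullable: false,
        heap_type: HeapType::Index(index),
    })
}
```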
The validation algorithm has been adapted to follow the pseudocode
algorithm in the appendix of the spec. We have made a few
administrative changes such as plumbing an instance of
`WasmModuleResources` around, because whenever a reference type has
heap type `Index(i)` we need to look up the defined type at index
`i`. The interface of `WasmModuleResources` has been extended with the
following new methods.
```rust
fn type_index_of_function(&self, func_idx: u32) -> Option<u32>;
fn matches(&self, t1: ValType, t2: ValType) -> bool;
fn check_value_type(
    &self,
    t: ValType,
    features: &WasmFeatures,
    offset: usize,
) -> Result<(), BinaryReaderError>;
```
These methods are used during validation:
* `type_index_of_function` is necessary to obtain the result type of `RefFunc`.
* `matches` implements the subtyping relation on types; it is used in
  both operand validation and in element validation. It may need to
  peek into context to retrieve function types.
* `check_value_type` is again used all over and needs to be shared. It now
  needs access to the context because a value type with an invalid heap index
  (depending on the number of types in the context) is not valid.
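The subtyping relation behind `matches` can be sketched as a free function. This is a simplified illustration, not the patch's implementation: it assumes every defined type `Index(i)` is a function type, and it uses index equality as a stand-in for type equivalence, whereas the real method would consult the module context. The helper names `heap_matches` and `ref_matches` are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum HeapType {
    Index(u32),
    Func,
    Extern,
    Bot,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct RefType {
    nullable: bool,
    heap_type: HeapType,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ValType {
    I32,
    I64,
    F32,
    F64,
    V128,
    Ref(RefType),
    Bot,
}

/// Heap subtyping, simplified: bottom is below everything, a defined type
/// (assumed to be a function type) is below `func`, otherwise types must be equal.
fn heap_matches(h1: HeapType, h2: HeapType) -> bool {
    match (h1, h2) {
        (HeapType::Bot, _) => true,
        (HeapType::Index(_), HeapType::Func) => true,
        (a, b) => a == b,
    }
}

/// A non-nullable reference may be used where a nullable one is expected,
/// but not the other way round.
fn ref_matches(r1: RefType, r2: RefType) -> bool {
    (!r1.nullable || r2.nullable) && heap_matches(r1.heap_type, r2.heap_type)
}

/// Returns true when `t1` is a subtype of `t2` under the simplifications above.
fn matches(t1: ValType, t2: ValType) -> bool {
    match (t1, t2) {
        (ValType::Bot, _) => true,
        (ValType::Ref(r1), ValType::Ref(r2)) => ref_matches(r1, r2),
        (a, b) => a == b,
    }
}
```

For instance, a non-nullable reference to a defined function type matches `funcref` in this sketch, while `funcref` does not match the non-nullable typed reference.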
The validation procedure has obviously been extended with the new
instructions; the only previous instruction whose validation needed to
be updated is `Select`.
The validation helper methods have changed too.
```rust
fn pop_operand(
    &mut self,
    expected: Option<ValType>,
    resources: &impl WasmModuleResources,
) -> OperatorValidatorResult<ValType>

fn pop_ref(
    &mut self,
    resources: &impl WasmModuleResources,
) -> OperatorValidatorResult<RefType>
```
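For readers unfamiliar with the role of the bottom type in `pop_operand`, the sketch below is loosely modelled on the operand-popping step of the spec's validation pseudocode. It is heavily reduced: there is a single control frame, no `resources` parameter, and a toy `matches` that only knows about bottom. The `Validator` struct, its fields, and the `String` error type are hypothetical and do not reflect the patch.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ValType {
    I32,
    I64,
    Bot,
}

/// Stand-in for the `matches` method; only bottom subtyping is modelled here.
fn matches(actual: ValType, expected: ValType) -> bool {
    actual == ValType::Bot || actual == expected
}

/// Hypothetical, heavily reduced validator state with a single control frame.
struct Validator {
    operands: Vec<ValType>,
    /// Operand-stack height at entry to the current control frame.
    frame_height: usize,
    /// Set once the remainder of the current frame is unreachable.
    unreachable: bool,
}

impl Validator {
    fn pop_operand(&mut self, expected: Option<ValType>) -> Result<ValType, String> {
        let actual = if self.operands.len() == self.frame_height {
            if self.unreachable {
                // In unreachable code an empty frame yields the bottom type,
                // which matches any expectation.
                ValType::Bot
            } else {
                return Err("type mismatch: operand stack is empty".to_string());
            }
        } else {
            self.operands
                .pop()
                .ok_or_else(|| "operand stack underflow".to_string())?
        };
        match expected {
            Some(want) if !matches(actual, want) => Err(format!(
                "type mismatch: expected {:?}, found {:?}",
                want, actual
            )),
            _ => Ok(actual),
        }
    }
}
```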
Comments and feedback are most welcome. We have tried to follow the
structure of the code base to the best of our abilities; however, we
are not completely sure we have put the shared logic
(e.g. `type_index_of_function`, `matches`, and `check_value_type`) in
the right places. Also, help with getting the crates shrink, smith,
and mutate up to date would be much appreciated.
Additionally, we have another patch that makes wasmtime compile and
pass the testsuite (without function references) with our modified
wasm-tools (cf. effect-handlers/wasmtime#3).
Co-authored-by: Daniel Hillerström <daniel.hillerstrom@ed.ac.uk>
Co-authored-by: cosine <CosineP@users.noreply.github.com>
On oss-fuzz a test case was found that executes 30k iterations of a wasm trap. With a 60s timeout that leaves 2ms for each invocation, which is a bit of a stretch under fuzzing instrumentation with its ~20x slowdown. This commit limits the number of inputs to the fuzzer to 200 to keep it reasonably sized.
* Support CLI parameters for string encoding
* Fix `--skip-validate`
* Fix printing binary to stdout
This is a collection of some minor renamings, refactorings, sharing of code, etc. This was all discovered during my addition of string support to adapter functions and I figured it'd be best to frontload this and land it ahead of the full patch since it's getting complex.
This commit aims to improve the readability of supporting the memory64 proposal in the `fact` adapter trampoline compiler. Previously there were a few sprinkled blocks that used `if` to generate different instructions inline, but as I've worked on support for strings this has become pretty unwieldy as strings do far more memory manipulation than other type conversions. A pattern that's easier to read is to have small instruction helpers that take the pointer width as an argument and internally dispatch to the correct instruction. This keeps the main translation code branch-free and a bit easier to follow. Additionally for more complicated branching logic it allows for deduplicating the main translation path by having lots of little branches instead of one large branch with everything duplicated on both halves.
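The helper pattern described above can be sketched as follows. This is an illustration under stated assumptions, not code from the `fact` compiler: `PtrWidth`, `Instr`, the `ptr_*` helpers, and `load_ptr_at_offset` are all hypothetical names, and the instruction enum is a tiny stand-in for a real encoder.

```rust
/// Hypothetical pointer width of a linear memory: 32-bit, or 64-bit under memory64.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PtrWidth {
    W32,
    W64,
}

/// Tiny stand-in for an instruction type; a real encoder has many more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Instr {
    I32Const(i32),
    I64Const(i64),
    I32Add,
    I64Add,
    I32Load,
    I64Load,
}

impl PtrWidth {
    /// Pointer-sized constant; the width decides which instruction is used.
    fn ptr_const(self, value: u32) -> Instr {
        match self {
            PtrWidth::W32 => Instr::I32Const(value as i32),
            PtrWidth::W64 => Instr::I64Const(i64::from(value)),
        }
    }

    /// Pointer-sized addition.
    fn ptr_add(self) -> Instr {
        match self {
            PtrWidth::W32 => Instr::I32Add,
            PtrWidth::W64 => Instr::I64Add,
        }
    }

    /// Load of a pointer-sized value.
    fn ptr_load(self) -> Instr {
        match self {
            PtrWidth::W32 => Instr::I32Load,
            PtrWidth::W64 => Instr::I64Load,
        }
    }
}

/// The translation path itself contains no `if`: all width-specific choices
/// live inside the small helpers above.
fn load_ptr_at_offset(width: PtrWidth, offset: u32) -> Vec<Instr> {
    vec![width.ptr_const(offset), width.ptr_add(), width.ptr_load()]
}
```

The design choice is that each branch on the pointer width is confined to a helper a few lines long, so the main translation code reads the same for 32-bit and 64-bit memories.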
x86 does not have dedicated instructions for scalar FMA, so lower to a libcall, which seems to be what LLVM does.
Give the user the option to sign and to authenticate function return addresses with the operations introduced by the Pointer Authentication extension to the Arm instruction set architecture. Copyright (c) 2021, Arm Limited.
…dealliance#4585)

This adds support for StructArgument on s390x. The ABI for this platform requires that the address of the buffer holding the copy of the struct argument is passed from caller to callee as a hidden pointer, using a register or overflow stack slot.

To implement this, I've added an optional "pointer" field to ABIArg::StructArg, and code to handle the pointer both in common abi_impl code and the s390x back-end.

One notable change necessary to make this work involved the "copy_to_arg_order" mechanism. Currently, for struct args we only need to copy the data (and that needs to happen before setting up any other args), while for non-struct args we only need to set up the appropriate registers or stack slots. This order is ensured by sorting the arguments appropriately into a "copy_to_arg_order" list.

However, for struct args with explicit pointers we need to *both* copy the data (again, before everything else), *and* set up a register or stack slot. Since we now need to touch the argument twice, we cannot solve the ordering problem by a simple sort. Instead, the abi_impl common code now provides *two* callbacks, emit_copy_regs_to_buffer and emit_copy_regs_to_arg, and expects the back end to first call emit_copy_regs_to_buffer for all args, and then call emit_copy_regs_to_arg for all args. This required updates to all back ends.

In the s390x back end, in addition to the new ABI code, I'm now adding code to actually copy the struct data, using the MVC instruction (for small buffers) or a memcpy libcall (for larger buffers). This also requires a bit of new infrastructure:
- MVC is the first memory-to-memory instruction we use, which needed a bit of memory argument tweaking
- We also need to set up the infrastructure to emit libcalls.

(This implements the first half of issue bytecodealliance#4565.)
CosineP (Collaborator) approved these changes on Aug 3, 2022 and left a comment:
This is great, and looks like the tests pass too? Cool. Alright looks like we have our work cut out for us :)
crates/types/src/lib.rs (Outdated)

      let externref_returns_count = returns
          .iter()
    -     .filter(|r| **r == WasmType::ExternRef)
    +     .filter(|r| match **r { WasmType::Ref(WasmRefType { heap_type: WasmHeapType::Extern, .. }) => true, _ => false })
Collaborator
Speaking of which do non-nullable extern refs exist?
Author
I don't think so, as externref and funcref are interpreted as nullable refs; cf. https://github.com/WebAssembly/function-references/blob/main/proposals/function-references/Overview.md#Reference-Types.
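As a side note on the filter discussed in this thread, the `match ... => true, _ => false` form can also be written with the standard `matches!` macro. The sketch below uses stand-in types: only `WasmType::Ref`, `WasmRefType`, its `heap_type` field, and `WasmHeapType::Extern` appear in the diff, while the `nullable` field, the other variants, and the wrapping function are assumptions made so the example is self-contained.

```rust
// Stand-in types; only the names used in the diff are taken from the source.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum WasmHeapType {
    Func,
    Extern,
    Index(u32),
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct WasmRefType {
    nullable: bool, // assumed field, not shown in the diff
    heap_type: WasmHeapType,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum WasmType {
    I32,
    I64,
    Ref(WasmRefType),
}

/// Counts return types that are references to `extern`, regardless of nullability.
fn externref_returns_count(returns: &[WasmType]) -> usize {
    returns
        .iter()
        .filter(|r| {
            matches!(
                **r,
                WasmType::Ref(WasmRefType {
                    heap_type: WasmHeapType::Extern,
                    ..
                })
            )
        })
        .count()
}
```

Because the pattern ignores the nullability field with `..`, the count stays the same whether or not non-nullable extern references can occur.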
@@ -321,7 +321,7 @@ unsafe impl WasmTy for Option<ExternRef> {
Collaborator
Where does ExternRef come from here?
Includes a modest improvement in memory usage and performance by removing analysis that was only used during fuzzing.
…liance#4601)

* Wasmtime: Add a pointer to `VMRuntimeLimits` in component contexts
* Save exit Wasm FP and PC in component-to-host trampolines

  Fixes bytecodealliance#4535
* Add comment about why we deref the trampoline's FP
* Update some tests to use new `vmruntime_limits_*` methods
* Port `Shuffle` to ISLE (AArch64)

  Ported the existing implementation of `Shuffle` for AArch64 to ISLE.

  Copyright (c) 2022 Arm Limited
* Cleanup by shadowing `rn`, `rn2`, and `_`

  Copyright (c) 2022 Arm Limited
This addresses bytecodealliance#4307. For the static API we generate 100 arbitrary test cases at build time, each of which includes 0-5 parameter types, a result type, and a WAT fragment containing an imported function and an exported function. The exported function calls the imported function, which is implemented by the host. At runtime, the fuzz test selects a test case at random and feeds it zero or more sets of arbitrary parameters and results, checking that values which flow host-to-guest and guest-to-host make the transition unchanged. The fuzz test for the dynamic API follows a similar pattern, the only difference being that test cases are generated at runtime. Signed-off-by: Joel Dice <joel.dice@fermyon.com>
This adds full i128 support to the s390x target, including new filetests and enabling the existing i128 runtest on s390x. The ABI requires that i128 is passed and returned via implicit pointer, but the front end still generates direct i128 types in call. This means we have to implement ABI support to implicitly convert i128 types to pointers when passing arguments. To do so, we add a new variant ABIArg::ImplicitArg. This acts like StructArg, except that the value type is the actual target type, not a pointer type. The required conversions have to be inserted in the prologue and at function call sites. Note that when dereferencing the implicit pointer in the prologue, we may require a temp register: the pointer may be passed on the stack so it needs to be loaded first, but the value register may be in the wrong class for pointer values. In this case, we use the "stack limit" register, which should be available at this point in the prologue. For return values, we use a mechanism similar to the one used for supporting multiple return values in the Wasmtime ABI. The only difference is that the hidden pointer to the return buffer must be the *first*, not last, argument in this case. (This implements the second half of issue bytecodealliance#4565.)
)

* Add a dataflow-based representation of components

  This commit updates the inlining phase of compiling a component to create a dataflow-based representation of a component instead of creating a final `Component` with a linear list of initializers. This dataflow graph is then linearized in a final step to create the actual final `Component`.

  The motivation for this commit stems primarily from my work implementing strings in fused adapters. In doing this my plan is to defer most low-level transcoding to the host itself rather than implementing that in the core wasm adapter modules. This means that small cranelift-generated trampolines will be used for adapter modules to call which then call "transcoding libcalls". The cranelift-generated trampolines will get raw pointers into linear memory and pass those to the libcall which core wasm doesn't have access to when passing arguments to an import.

  Implementing this with the previous representation of a `Component` was becoming too tricky to bear. The initialization of a transcoder needed to happen at just the right time: before the adapter module which needed it was instantiated but after the linear memories referenced had been extracted into the `VMComponentContext`. The difficulty here is further compounded by the current adapter module injection pass already being quite complicated. Adapter modules are already renumbering the index space of runtime instances and shuffling items around in the `GlobalInitializer` list. Perhaps the worst part of this was that memories could already be referenced by host function imports or exports to the host, and if adapters referenced the same memory it shouldn't be referenced twice in the component. This meant that `ExtractMemory` initializers ideally needed to be shuffled around in the initializer list to happen as early as possible instead of wherever they happened to show up during translation.

  Overall I did my best to implement the transcoders but everything always came up short. I have decided to throw my hands up in the air and try a completely different approach to this, namely the dataflow-based representation in this commit. This makes it much easier to edit the component after initial translation for injection of adapters, injection of transcoders, adding dependencies on possibly-already-existing items, etc. The adapter module partitioning pass in this commit was greatly simplified to something which I believe is functionally equivalent but is probably an order of magnitude easier to understand.

  The biggest downside of this representation I believe is having a duplicate representation of a component. The `component::info` was largely duplicated into the `component::dfg` module in this commit. Personally though I think this is a more appropriate tradeoff than before because it's very easy to reason about "convert representation A to B" code whereas it was very difficult to reason about shuffling around `GlobalInitializer` items in optimal fashions. This may also have a cost at compile-time in terms of shuffling data around, but my hope is that we have lots of other low-hanging fruit to optimize if it ever comes to that which allows keeping this easier-to-understand representation.

  Finally, to reiterate, the final representation of components is not changed by this PR. To the runtime internals everything is still the same.

* Fix compile of factc
…lliance#4604)

* components: ignore export aliases to types in translation.

  Currently, translation is ignoring type exports from components during translation by skipping over them before adding them to the exports map. If a component instantiates an inner component and aliases a type export of that instance, it will cause wasmtime to panic with a failure to find the export in the exports map. The fix is to add a representation for exported types to the map that is simply ignored when encountered. This also makes it easier to track places where we would have to support type exports in translation in the future.

* Keep type information for type exports.

  This commit keeps the type information for type exports so that types can be properly aliased from an instance export and thereby adjusting the type index space accordingly.

* Add a simple test case for type exports for the component model.
Since we do not have an instruction for this, this is a simple open-coded implementation. Needed by the cg_clif frontend.
Co-authored-by: Wasmtime Publish <wasmtime-publish@users.noreply.github.com>
When an adapter module depends on a particular core wasm instance this means that it actually depends on not only that instance but all prior core wasm instances as well. This is because core wasm instances must be instantiated in the specified order within a component and that cannot change depending on the dataflow between adapters. This commit fixes a possible panic from linearizing the component dfg where an adapter module tried to depend on an instance that hadn't been instantiated yet because the ordering dependency between core wasm instances hadn't been modeled.
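The ordering rule described above can be sketched in a few lines. This is only an illustration of the idea that a dependency on one core instance implies a dependency on all earlier ones; the function name and the use of plain `usize` indices in instantiation order are assumptions, not the representation used in the component dfg.

```rust
/// Given the core instances an adapter module refers to directly (as indices
/// in instantiation order), returns every instance that must already be
/// instantiated before the adapter: all instances up to and including the
/// latest one referenced.
fn instances_required_by_adapter(explicit_deps: &[usize]) -> Vec<usize> {
    match explicit_deps.iter().max() {
        Some(&last) => (0..=last).collect(),
        None => Vec::new(),
    }
}
```

For example, an adapter that refers only to instance 3 is treated as depending on instances 0, 1, 2, and 3.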
dhil pushed a commit that referenced this pull request on Apr 10, 2023
* Integrate experimental HTTP into wasmtime.
* Reset Cargo.lock
* Switch to bail!, plumb options partially.
* Implement timeouts.
* Remove generated files & wasm, add Makefile
* Remove generated code textfile
* Update crates/wasi-http/Cargo.toml

  Co-authored-by: Eduardo de Moura Rodrigues <16357187+eduardomourar@users.noreply.github.com>
* Update crates/wasi-http/Cargo.toml

  Co-authored-by: Eduardo de Moura Rodrigues <16357187+eduardomourar@users.noreply.github.com>
* Extract streams from request/response.
* Fix read for len < buffer length.
* Formatting.
* types impl: swap todos for traps
* streams_impl: idioms, and swap todos for traps
* component impl: idioms, swap all unwraps for traps, swap all todos for traps
* http impl: idiom
* Remove an unnecessary mut.
* Remove an unsupported function.
* Switch to the tokio runtime for the HTTP request.
* Add a rust example.
* Update to latest wit definition
* Remove example code.
* wip: start writing a http test...
* finish writing the outbound request example havent executed it yet
* better debug output
* wasi-http: some stubs required for rust rewrite of the example
* add wasi_http tests to test-programs
* CI: run the http tests
* Fix some warnings.
* bump new deps to latest releases (#3)
* Add tests for wasi-http to test-programs (#2)
* wip: start writing a http test...
* finish writing the outbound request example havent executed it yet
* better debug output
* wasi-http: some stubs required for rust rewrite of the example
* add wasi_http tests to test-programs
* CI: run the http tests
* bump new deps to latest releases h2 0.3.16 http 0.2.9 mio 0.8.6 openssl 0.10.48 openssl-sys 0.9.83 tokio 1.26.0

---------

Co-authored-by: Brendan Burns <bburns@microsoft.com>

* Update crates/test-programs/tests/http_tests/runtime/wasi_http_tests.rs
* Update crates/test-programs/tests/http_tests/runtime/wasi_http_tests.rs
* Update crates/test-programs/tests/http_tests/runtime/wasi_http_tests.rs
* wasi-http: fix cargo.toml file and publish script to work together (#4)

  unfortunately, the publish script doesn't use a proper toml parser (in order to not have any dependencies), so the whitespace has to be the trivial expected case. then, add wasi-http to the list of crates to publish.
* Update crates/test-programs/build.rs
* Switch to rustls
* Cleanups.
* Merge switch to rustls.
* Formatting
* Remove libssl install
* Fix tests.
* Rename wasi-http -> wasmtime-wasi-http
* prtest:full Conditionalize TLS on riscv64gc.
* prtest:full Fix formatting, also disable tls on s390x
* prtest:full Add a path parameter to wit-bindgen, remove symlink.
* prtest:full Fix tests for places where SSL isn't supported.
* Update crates/wasi-http/Cargo.toml

---------

Co-authored-by: Eduardo de Moura Rodrigues <16357187+eduardomourar@users.noreply.github.com>
Co-authored-by: Pat Hickey <phickey@fastly.com>
Co-authored-by: Pat Hickey <pat@moreproductive.org>
This patch makes `wasmtime` compile with branch `func-ref-2` in our `wasm-tools` fork. There are many todos left to be completed. The test suite compiles and passes.