Skip to content

wasmtime: Fix resetting stack-walking registers when entering/exiting Wasm#6321

Merged
fitzgen merged 2 commits intobytecodealliance:mainfrom
fitzgen:save-restore-stack-walking-registers
May 3, 2023
Merged

wasmtime: Fix resetting stack-walking registers when entering/exiting Wasm#6321
fitzgen merged 2 commits intobytecodealliance:mainfrom
fitzgen:save-restore-stack-walking-registers

Conversation

@fitzgen
Copy link
Copy Markdown
Member

@fitzgen fitzgen commented May 1, 2023

Fixes a regression from #6262, originally reported in bytecodealliance/wasmtime-dotnet#245

The issue was that we would enter Wasm and save the stack-walking registers but never clear them after Wasm returns. Then if a host-to-host call tried to capture a stack, we would mistakenly attempt to use those stale registers to start the stack walk. This mistake would be caught by an assertion, triggering a panic.

This commit fixes the issue by managing the save/restore in the CallThreadState construction/drop, rather than in the old set_prev method.

@fitzgen fitzgen requested a review from a team as a code owner May 1, 2023 23:08
@fitzgen fitzgen requested review from jameysharp and removed request for a team May 1, 2023 23:08
@fitzgen fitzgen force-pushed the save-restore-stack-walking-registers branch from bc28367 to e4c1d36 Compare May 2, 2023 17:13
@fitzgen fitzgen enabled auto-merge May 2, 2023 17:14
@alexcrichton
Copy link
Copy Markdown
Member

Oh one thing I thought of this morning -- this means that the backtrace code is no longer guaranteed to walk the stack in-order of activations. I don't believe there's any issues with that, but it's worth being aware of. The reason for this is that while the activations are ordered when we walk each activation we walk its old_* pointers which are the previous pointers for the limits.

If you had A call B call A which trapped, then you'd get the first contiguous region in A but then when you walk over the CallThreadState the first one points to the first A on the stack. The next CallThreadState would... actually I think B would be dropped entirely?

Hm ok so now as I write this out, I think it means that Backtrace is no longer guaranteed to capture all frames on the stack, no in order. That's ok though since it's guaranteed to capture everything within one store's limits in order which is all we need.

@fitzgen fitzgen added this pull request to the merge queue May 2, 2023
@jameysharp
Copy link
Copy Markdown
Contributor

I don't understand the issue you're raising, Alex, but my profiling work will produce misleading results if the subset of stack frames which are part of the user-selected modules are returned out of order. Should I be concerned about this?

@alexcrichton
Copy link
Copy Markdown
Member

Sort of yes and sort of no. Within a single Store<T> there's nothing to worry about, everything will always be in-order. What this means though is that if there are multiple stores on the stack then the frames will be out of order. Or more precisely, the originating store will have in-order frames and the other stores may or may not have frames interspersed in whatever order (depends on exact details).

This is ok for WasmBacktrace since it goes through a symbolication phase that discards anything it can't symbolicate, and it can only symbolicate modules registered in the store. In theory this discards any non-store frames, although as I'm typing this I'm now realizing that the same module can be instantiated in different stores so this isn't quite true.

For the profiler just added, however, it doesn't do any filtering, so it does mean that frames will be out of order since non-store-local frames are not filtered out.

@fitzgen I'm actually going to bump this out of the queue now that I think on this more. I'm worried about the case where a module is instantiated in two stores and then the store are interleaved. All frames will symbolicate in either store but the stacks are going to be quite odd. I think we should actually explicitly update backtracing to take a *mut VMRuntimeLimits parameter and only trace activations for those limits?

@alexcrichton alexcrichton removed this pull request from the merge queue due to a manual request May 2, 2023
@jameysharp
Copy link
Copy Markdown
Contributor

The new profiler actually does filter to only frames from a specified set of modules, but your point still stands because it has the same problem that WasmBacktrace does, that the same module might be instantiated in multiple stores.

fitzgen and others added 2 commits May 2, 2023 15:07
… Wasm

Fixes a regression from bytecodealliance#6262, originally reported in
bytecodealliance/wasmtime-dotnet#245

The issue was that we would enter Wasm and save the stack-walking registers but
never clear them after Wasm returns. Then if a host-to-host call tried to
capture a stack, we would mistakenly attempt to use those stale registers to
start the stack walk. This mistake would be caught by an assertion, triggering a
panic.

This commit fixes the issue by managing the save/restore in the
`CallThreadState` construction/drop, rather than in the old `set_prev`
method.

Co-Authored-By: Alex Crichton <alex@alexcrichton.com>
This way we can differentiate between the same module loaded in different stores
and avoid leaking other stores' frames into our backtraces.

Co-Authored-By: Jamey Sharp <jsharp@fastly.com>
@fitzgen fitzgen force-pushed the save-restore-stack-walking-registers branch from e4c1d36 to 6927bbf Compare May 2, 2023 22:42
@github-actions github-actions bot added wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:ref-types Issues related to reference types and GC in Wasmtime labels May 2, 2023
@github-actions
Copy link
Copy Markdown

github-actions bot commented May 2, 2023

Subscribe to Label Action

cc @fitzgen, @peterhuene

Details This issue or pull request has been labeled: "wasmtime:api", "wasmtime:ref-types"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: wasmtime:ref-types
  • peterhuene: wasmtime:api

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@fitzgen fitzgen added this pull request to the merge queue May 3, 2023
Merged via the queue into bytecodealliance:main with commit 60ce6f5 May 3, 2023
@fitzgen fitzgen deleted the save-restore-stack-walking-registers branch May 3, 2023 06:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:ref-types Issues related to reference types and GC in Wasmtime

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants