Add support for ruby 3.0.0 #285

Merged
acj merged 12 commits into rbspy:master from acj:add-ruby-3-support
Mar 22, 2021

Conversation

@acj
Member

@acj acj commented Feb 13, 2021

I'm splitting this out of #282 as it may take a bit of spelunking. I've run bindgen and updated the usual places, but it doesn't currently work on 3.0.0:

Linux:

$ ruby -v
ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-linux]
$ ruby ci/ruby-programs/infinite.rb &
$ sudo env RUST_LOG=debug cargo run -- record --pid $(pgrep ruby)
Press Ctrl+C to stop
[2021-02-13T23:21:34Z DEBUG rbspy::core::address_finder::os_impl] symbol: Symbol: Value: 0x00317ac9 Size: 0x0006 Type: data object Bind: global Vis: default Section: 14 Name: ruby_version
[2021-02-13T23:21:34Z DEBUG rbspy::core::address_finder::os_impl] load header: Program Header: Type: LOAD Offset: 0x00030000 VirtAddr: 0x00030000 PhysAddr: 0x00030000 FileSize: 0x2a42a1 MemSize: 0x2a42a1 Flags: R E Align: 0x1000
[2021-02-13T23:21:34Z DEBUG rbspy::core::initialize] version: 3.0.0
[2021-02-13T23:21:34Z DEBUG rbspy::core::address_finder::os_impl] Trying to find address location another way
[2021-02-13T23:21:34Z DEBUG rbspy::core::address_finder::os_impl] bss_section header: SectionHeader { name: ".bss", shtype: 0x8, flags: 0x3, addr: 4095616, offset: 4091516, size: 67656, link: 0, info: 0, addralign: 32, entsize: 0 }
[2021-02-13T23:21:34Z DEBUG rbspy::core::address_finder::os_impl] read_addr: 7f9f675bde80
[2021-02-13T23:21:34Z DEBUG rbspy::core::address_finder::os_impl] successfully read data
Wrote raw data to /root/.cache/rbspy/records/rbspy-2021-02-13-KdNndnPMno.raw.gz
Writing formatted output to /root/.cache/rbspy/records/rbspy-2021-02-13-pxa4oxdRyE.flamegraph.svg
[2021-02-13T23:21:34Z ERROR inferno::flamegraph] No stack counts found
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Custom { kind: InvalidData, error: "No stack counts found" })', src/ui/flamegraph.rs:38:89

macOS:

$ ruby -v
ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-darwin20]
$ ruby ci/ruby-programs/infinite.rb &
$ sudo env RUST_LOG=debug cargo run -- record --pid $(pgrep ruby)
Press Ctrl+C to stop
[2021-02-13T23:23:52Z DEBUG rbspy::core::initialize] version: 3.0.0
Wrote raw data to /Users/acj/.cache/rbspy/records/rbspy-2021-02-13-jqQNIRXiW6.raw.gz
Writing formatted output to /Users/acj/.cache/rbspy/records/rbspy-2021-02-13-cDlx202Und.flamegraph.svg
[2021-02-13T23:23:52Z ERROR inferno::flamegraph] No stack counts found
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Custom { kind: InvalidData, error: "No stack counts found" })', src/ui/flamegraph.rs:38:9

If someone is interested in working on this, please feel free to jump in. Maybe it's something simple. I'm happy to help with testing but won't have time to investigate further until the C symbols work is done.

Resolves #278 (eventually)

@acj acj mentioned this pull request Feb 15, 2021
7 tasks
@acj acj force-pushed the add-ruby-3-support branch 2 times, most recently from 4010710 to 3fb2c26 Compare February 22, 2021 02:25
@acj
Member Author

acj commented Feb 26, 2021

Status update: There are significant thread-related changes in ruby 3.x due to the introduction of ractors (ruby actors). This breaks several assumptions in rbspy when we try to look up the current execution context and use it to find stack frames. I've been slowly working through these issues and trying to fix them without making the version-specific code harder to reason about.

This code should currently work on Linux with ruby 2.7+. It builds but is broken (no stack frames) on macOS, which is next on my list to debug. I haven't tried Windows or FreeBSD yet.

@acj
Member Author

acj commented Mar 2, 2021

It's able to find stack traces on macOS if I re-run bindgen on that platform, but then it's broken on Linux. A diff of the bindings turns up several changes but nothing directly related to our stack frame code (afaict). I wouldn't be surprised if there's a type size or alignment issue, though. I'll try to narrow it down, and if it can't be resolved then we may need to fork the bindings based on platform -- or find a way to avoid the problematic types.
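For readers unfamiliar with the "fork the bindings based on platform" option mentioned above, here is a minimal sketch of what it could look like using Rust's `cfg` attributes. The module name `bindings`, the `NAME` constant, and `active_bindings` are hypothetical and only illustrate the selection mechanism; they are not rbspy's actual layout, and real modules would hold bindgen output instead of a label.

```rust
// Hypothetical per-platform binding modules. In a real project each module
// would contain the bindgen output generated on that platform; here each one
// only carries a label so the selection can be observed.
#[cfg(target_os = "linux")]
mod bindings {
    pub const NAME: &str = "linux";
}

#[cfg(target_os = "macos")]
mod bindings {
    pub const NAME: &str = "macos";
}

// Any other platform falls back to a single shared set of bindings.
#[cfg(not(any(target_os = "linux", target_os = "macos")))]
mod bindings {
    pub const NAME: &str = "fallback";
}

/// Returns the label of the bindings module that was compiled in.
pub fn active_bindings() -> &'static str {
    bindings::NAME
}
```

Exactly one `bindings` module is compiled for any target, so the rest of the code can refer to `bindings::...` without caring which platform it was generated on. The cost is the one noted in the comment: bindgen has to be run on every platform for every new ruby version.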

@acj acj force-pushed the add-ruby-3-support branch from b64373b to b8dacbc Compare March 4, 2021 12:24
@acj acj force-pushed the add-ruby-3-support branch 2 times, most recently from fd1a75e to e868e89 Compare March 9, 2021 03:59
@acj
Member Author

acj commented Mar 9, 2021

This works on macOS now.

I tried it on Windows tonight and didn't have much luck. The tests pass, but it seems like we're looking at the wrong memory locations when we try to get stack frames. Makes me wonder if we'll need to run bindgen separately on each platform to get the correct types (like I've already done for macOS and Linux). The main sticking point, afaict, is that the pthread primitives are platform-specific and differ pretty significantly in their size (e.g. struct vs union), so it's easy to jump to the wrong location in memory if the types and platform are mismatched. We need to step over a few of those pthread types as we're walking through the ractor and threads structs to get the execution context. If anyone knows of a shortcut to get the EC, please let me know.
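To illustrate why mismatched platform types send the reader to the wrong memory location, here is a small sketch. `LockA`, `LockB`, and `Thread` are hypothetical stand-ins, and the sizes are made up for illustration; they are not the real pthread types or ruby's thread struct. The point is only that a field placed after a platform-specific type moves when that type's size changes.

```rust
// Hypothetical stand-ins for a lock type whose size differs per platform.
// The sizes (40 and 64 bytes) are invented for this example.
#[repr(C)]
pub struct LockA(pub [u64; 5]);

#[repr(C)]
pub struct LockB(pub [u64; 8]);

// A simplified "thread" struct: a platform-specific lock followed by the
// field we actually want (standing in for the execution context pointer).
#[repr(C)]
pub struct Thread<L> {
    pub lock: L,
    pub ec: usize,
}

/// Returns the byte offset of `ec` inside `Thread<L>` by comparing the
/// address of the field with the address of the struct.
pub fn ec_offset<L>(lock: L) -> usize {
    let t = Thread { lock, ec: 0 };
    let base = &t as *const Thread<L> as usize;
    let field = &t.ec as *const usize as usize;
    field - base
}
```

With these made-up sizes, `ec_offset(LockA([0; 5]))` is 40 and `ec_offset(LockB([0; 8]))` is 64. A profiler built with the first layout but attached to a process using the second would read `ec` 24 bytes too early and get lock internals instead of a pointer.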

@acj acj force-pushed the add-ruby-3-support branch from e868e89 to 11b27a5 Compare March 14, 2021 18:39
@acj
Member Author

acj commented Mar 14, 2021

> If anyone knows of a shortcut to get the EC

It's the opposite of a shortcut, but I found a way to get the EC address without needing to step over pthread types. The VM struct has a pointer to the current main thread struct, which in turn has a pointer to the EC.

It works on Linux, macOS, and Windows without requiring separate bindings for each platform, which is a relief. rbspy seems to be generally broken on FreeBSD 12 for me (can't get stack frames, even on master with 2.7.2), so I'm keeping the build healthy but otherwise not spending time on it.

EDIT: This approach works but always profiles the main thread, which isn't what we want. Next puzzle is to find a way to refresh the EC pointer.
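The pointer chain described in this comment (VM struct, then main thread struct, then EC) can be sketched against a fake memory snapshot. Everything here is an assumption made for illustration: the offsets `VM_MAIN_THREAD_OFFSET` and `THREAD_EC_OFFSET`, the helper names, and the use of a byte slice in place of reading another process's memory. It is not the code in this PR.

```rust
use std::convert::TryInto;

// Hypothetical field offsets, chosen for illustration only. The real offsets
// depend on the ruby version and come from the generated bindings.
const VM_MAIN_THREAD_OFFSET: usize = 16;
const THREAD_EC_OFFSET: usize = 8;

/// Reads a little-endian 64-bit pointer from a snapshot of "remote" memory.
/// Returns None if the read would go out of bounds.
pub fn read_ptr(memory: &[u8], addr: usize) -> Option<usize> {
    let end = addr.checked_add(8)?;
    let bytes = memory.get(addr..end)?;
    let raw = u64::from_le_bytes(bytes.try_into().ok()?);
    Some(raw as usize)
}

/// Follows VM -> main thread -> EC and returns the EC address.
/// A null pointer at either hop is treated as "not found".
pub fn execution_context_addr(memory: &[u8], vm_addr: usize) -> Option<usize> {
    let main_thread = read_ptr(memory, vm_addr.checked_add(VM_MAIN_THREAD_OFFSET)?)?;
    if main_thread == 0 {
        return None;
    }
    let ec = read_ptr(memory, main_thread.checked_add(THREAD_EC_OFFSET)?)?;
    if ec == 0 {
        None
    } else {
        Some(ec)
    }
}

/// Builds a 128-byte fake memory image: a VM struct at address 0 whose
/// main-thread pointer is 64, and a thread struct at 64 whose EC pointer is 96.
pub fn demo_memory() -> Vec<u8> {
    let mut mem = vec![0u8; 128];
    mem[16..24].copy_from_slice(&64u64.to_le_bytes());
    mem[72..80].copy_from_slice(&96u64.to_le_bytes());
    mem
}
```

Calling `execution_context_addr(&demo_memory(), 0)` follows both hops and yields `Some(96)`. Because the walk always starts from the main thread pointer, it has the limitation noted in the EDIT: it can only ever land on the main thread's EC.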

@acj acj force-pushed the add-ruby-3-support branch from 11b27a5 to 89d8b7f Compare March 15, 2021 01:17
@acj acj marked this pull request as ready for review March 15, 2021 01:24
@acj
Member Author

acj commented Mar 15, 2021

@jvns When you have time, this is ready for a look please. The logic in the get_execution_context_from_vm macro is a bit hacky, but it's worked well for me so far, and it sidesteps the maintenance headache of needing to run bindgen on all platforms each time we add a new version of ruby 3+.

If we need to fall back to the bindgen approach (now or later), I think it would be good to invest some effort in automating all of it through GH Actions, maybe building on Igor's docker work, so that generating new bindings is as simple as running a CI job. (As I'm writing this, it seems like a good thing to do regardless, hah. :)

@jvns
Collaborator

jvns commented Mar 15, 2021

lgtm! This seems very reasonable to me and it feels less hacky than some of the other things we do in rbspy like is_maybe_thread (this follows way fewer pointers!). Thanks for the comments too :)

I'm so happy you did this!

@acj acj force-pushed the add-ruby-3-support branch from 7ef6330 to f926d22 Compare March 20, 2021 19:43
@acj acj force-pushed the add-ruby-3-support branch from f926d22 to 2d27464 Compare March 20, 2021 19:48
@acj
Member Author

acj commented Mar 20, 2021

This is in good shape now, I think. The Travis build gets through the tests and then fails when it uses libjemalloc, but I'm able to use libjemalloc and libtcmalloc locally with ruby 2.4. The malloc tests have been flaky in CI for me before.

If anyone has time to help test this, please do.

I'll probably merge tomorrow if there are no objections, and then cut a release next week.

@acj acj merged commit 35f4538 into rbspy:master Mar 22, 2021
@acj acj deleted the add-ruby-3-support branch March 22, 2021 00:03
acj added a commit that referenced this pull request Mar 27, 2021
- Merge pull request #295 from benfred/update_remoteprocess
- Merge pull request #285 from acj/add-ruby-3-support
- readme: Mention ruby-structs version parity in maintainer instructions
- Bump ruby-structs version to match rbspy
Development

Successfully merging this pull request may close these issues.

Ruby 3 support

2 participants