-
Notifications
You must be signed in to change notification settings - Fork 104
Read all necessary attributes to properly get labels for CME, Iseq, and cfunc #400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| byteorder = "1.4.3" | ||
| rbspy-testdata = "0.2.4" | ||
| rbspy-testdata = { git = "https://github.com/dalehamel/rbspy-testdata.git", branch = "update-coredumps" } | ||
| rstest = "0.24" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to facilitate table testing of the label calculations
| @@ -0,0 +1,207 @@ | |||
| # frozen_string_literal: true | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Used with permission from @ivoanjo #399 (comment)
| @@ -0,0 +1,26 @@ | |||
| require_relative './interesting_backtrace_helper' | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some preamble to test different cases, then load the test file from backtracie
.github/workflows/coredump.yml
Outdated
| - name: Create core dump | ||
| shell: sudo bash {0} | ||
| run: | | ||
| echo 0x7f > /proc/${{ steps.ruby-script.outputs.pid }}/coredump_filter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that 0xff also seems to work, and it might also be a good idea to do gcore -a
It drove me crazy trying to figure out why my string reads would work on a local process, but fail in the coredump - the memory region the string was in was not in the coredump lol.
Github uses 0x33 by default for coredump filter.
| let cfps = get_cfps(thread.cfp as usize, stack_base(&thread) as usize, source)?; | ||
| for cfp in cfps.iter() { | ||
| if cfp.iseq as usize == 0 { | ||
| let cfunc = ruby_frame_cfunc(&cfp, source)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I flipped the conditional around here and got rid of the frame flag filtering to match what ruby's own backtrace function is doing.
| // I don't know what the -1 is about. Also note that the stack_size is *not* in bytes! stack is a | ||
| // VALUE*, and so stack_size is in units of sizeof(VALUE). | ||
| // We must skip two dummy frames: | ||
| // https://github.com/ruby/ruby/blob/5445e0435260b449decf2ac16f9d09bae3cafe72/vm_backtrace.c#L477-L485%C2%A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that i think the other -1 is occurring implicitly in our calculation of how many CFPs to collect, hence why this is -1 and not -2 here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any particular feedback on this, but It's great to see this "I don't know what the -1 is about" comment finally being removed and replaced with an explanation :)
|
|
||
| macro_rules! get_classpath( | ||
| () => ( | ||
| fn rb_class_real<T>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See variable.c for this and the other helper functions.
| //if iseq_struct.body == std::ptr::null_mut() { | ||
| // return Err(format_err!("iseq body is null")); | ||
| //} | ||
| let cme = locate_method_entry(&cfp.ep, source)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we mimic vm_backtrace in checking if the CFP contains a CME, and if so, prefer that.
The iseq body passed in is only used if we do NOT have a CME containing imemo_ment.
| rstring.basic.klass as usize, | ||
| source | ||
| ).context("couldn't get ruby string from iseq body")?; | ||
| ).context(format!("couldn't get ruby path string array from iseq body: {:X}", frame_flag)).unwrap_or(("FAILED".to_string(), "FAILED".to_string())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the unwrap_or, we may frequently drop frames if the paths are invalid. I think it is better to indicate the failure instead.
| } | ||
|
|
||
| fn get_cfunc_name<T: ProcessMemory>( | ||
| fn ruby_frame_cfunc<T>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if a frame is a cfunc via the flags
| Ok((frame_flag & VM_FRAME_MAGIC_MASK) == VM_FRAME_MAGIC_CFUNC) | ||
| } | ||
|
|
||
| pub fn qualified_method_name(class_path: &str, method_name: &str, singleton: bool) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be a better place for this function, for now we may as well put it here since cfuncs can use it too.
| return Ok(me); | ||
| } | ||
| unsafe { | ||
| ep = ep.offset(0) as *mut usize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking the method entry was previously completely broken. This would never iterate, and it could have actually gotten stuck in an infinite loop, but we were only using it on local method entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think we've been lucky with this code. Much of this was based on an old-even-then gdb script and involved a lot of feeling around in the dark until it did something useful. And even after it was working, a lot of it was (and still is) mysterious and inscrutable to anybody who's not familiar with cruby internals. Thanks for taking the time to fix and expand it 👍
| if env_flags & VM_ENV_FLAG_LOCAL == 0 { | ||
| //println!("ALREADY LOCAL {:X}", env_me_cref); | ||
| } | ||
| while env_flags & VM_ENV_FLAG_LOCAL != 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note the conditional should have been checking the flags, not the specval.
| Ok(me) | ||
| } | ||
|
|
||
| fn check_method_entry<T: ProcessMemory>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function was also pretty broken. I rewrote it to mimic the logic in ruby's vm_insnhelper.c
| vec![ | ||
| StackFrame { | ||
| name: "sleep [c function]".to_string(), | ||
| name: "Kernel#sleep [c function]".to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With my changes we can get the cfunc class name in 3.3.0+
| absolute_path: Some("unknown".to_string()), | ||
| lineno: Some(192), | ||
| }, | ||
| StackFrame { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The coredump was invalid and this frame was dropped entirely. I redid the coredump and added the missing frame.
| ] | ||
| } | ||
|
|
||
| fn real_complex_trace_with_classes_3_4_5() -> Vec<StackFrame> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A backtrace using the helper inspired by Ivo's script.
There are a couple of cases where the frames don't match stackprof however, and I cannot for the life of me figure out why.
In particular, the "" frames have a frame flag of 0x7777... which is the "eval" type. There is no CME, the EP is local, and i don't see how we can possible get a class from them. I wonder if this is an issue with how we are reading the structs in rbspy or something, as stackprof should basically be running the same algorithm now :/
But it handles a bunch of other stuff correctly, such as anonymous modules, singletons, etc. So all in all it is worth it to keep for now, we can hopefully improve the fidelity even more later.
| #[test] | ||
| fn test_get_ruby_stack_trace_3_3_1() { | ||
| let source = coredump_3_3_0(); | ||
| let vm_addr = 0x7f7ff21f1868; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These and others needed to be updated since I replaced the coredump with a non-truncated on.
| assert_eq!(real_stack_trace_3_3_0(), stack_trace.unwrap().trace); | ||
| } | ||
|
|
||
| #[rstest] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added rstest here for table tests on these label calculation functions.
The logic of the functions themselves should match their counterparts in cruby.
In case we find cases where they don't, we can fix it and add regression tests here based on the input combos.
There is no need to update the test, just add new cases to the macro with the inputs to use and expected output.
01aa92e to
97a880a
Compare
| let imemo: rb_method_entry_struct = source.copy_struct(cme).context(cme)?; | ||
| let method_type = source.copy(imemo.def as usize, 1).context(imemo.def as usize)?; | ||
| // https://github.com/ruby/ruby/blob/v3_3_0/internal/imemo.h#L21 | ||
| let method_type = method_type[0] & 0xf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note 0xf is because it is a bitfield of 4 bits, so we read 1 byte and then mask out the upper bits.
|
Freebsd failure seems to be because of bumping the rbspy testdata crate, not sure why that is. |
|
Please take a look when you get a chance @acj |
63a8a21 to
3be9ec9
Compare
|
Returned rbspy-testdata to versioned crate, and rebased after #402 merged, removed references to removed |
acj
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Really appreciate the effort to fix and clean up the various helpers, and also the detailed comments along the way. Go ahead and clean up the commits when you have time, and I'll do some local testing to hopefully shake loose any edge cases. Then I'm good with merging.
|
|
||
| module ModuleMethod | ||
| def call_instance() | ||
| InstanceMethod.new.work() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tiny nit: indentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 70c01cf
| end | ||
|
|
||
| loop do | ||
| work_main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here too
| return Ok(me); | ||
| } | ||
| unsafe { | ||
| ep = ep.offset(0) as *mut usize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think we've been lucky with this code. Much of this was based on an old-even-then gdb script and involved a lot of feeling around in the dark until it did something useful. And even after it was working, a lot of it was (and still is) mysterious and inscrutable to anybody who's not familiar with cruby internals. Thanks for taking the time to fix and expand it 👍
3be9ec9 to
70c01cf
Compare
Hey @acj hope you are well, i'm back from vacation and i've addressed your style nits and squashed a few commits. LMK if you found anything that needs to be addressed. |
|
|
||
| const RUBY_QNIL: usize = 0x04; | ||
|
|
||
| //let mut ep = ep.clone() as *mut usize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, I missed this line in the earlier review. I'll clean it up after merging - just making a note here so that I remember.
|
Thanks! Released in 0.38.1 |
Fix for #399
<main>) to be elidedrb_profile_framesandrb_profile_frame_full_labellogic that stackprof uses.Depends on rbspy/rbspy-testdata#8 which updates the coredumps
I'll squash and tidy up some commits once this has been reviewed.