ObjectSpace.dump_all: dump shapes as well#6868
Conversation
|
I'm using this script to test this feature for now: require 'objspace'
ObjectSpace.trace_object_allocations_start
class Foo
def initialize
@a = 1
@b = 2
@c = 3
end
def foo
@foo ||= 1
end
def bar
@bar ||= 1
end
end
GC.start
gen = GC.count
f1 = Foo.new
f1.foo
f1.bar
f2 = Foo.new
f2.bar
f2.foo
ObjectSpace.dump_all(output: STDOUT, since: gen)
puts '-' * 40
puts "f1: #{ObjectSpace.dump(f1)}"
puts "f2: #{ObjectSpace.dump(f2)}"
s = RubyVM::Shape.of(f1)
p [s, s.id, s.type, s.depth, s.edges]The output looks like this: @jemmaissroff @tenderlove @k0kubun I'd love to hear your opinion on this. |
2221805 to
072c353
Compare
|
I ran |
40edcdd to
32579f4
Compare
|
@byroot @jemmaissroff and I were working on the same thing (but you beat us to opening a PR). I think this is a good idea.
I added
We had this considered this too, but since the JSON is basically backwards compatible, I figured it's fine to add to the normal dump. Jemma and I added the shape id for each object in #6864, so I think it makes sense to combine the normal heap dump with shape information.
Seems fine to me. |
|
Sounds good, then I guess I only need to add an argument for partial dump of shapes, I can do that tomorrow. |
|
I just realized we shouldn't dump the parent_id on the root shape: |
jemmaissroff
left a comment
There was a problem hiding this comment.
Thanks for working on this!
|
Thanks for this! Incidentally, we started working on this exact same idea EOD yesterday. To answer your questions:
Agreed.
@tenderlove added it to vm stats in #6798!
We thought about this question yesterday too. It seems like because
If we don't go with the |
Sorry for the duplicated work, I guess we got startled by the same growing shape count report 😄. BTW I think this PR is already in a good enough state to debug that issue @k0kubun, if you wanna try it? |
Ah yes, I'd love to try it. How do you intend to use this on our app? Just sending an entire dump to the log storage on the Rack middleware? |
32579f4 to
64a8ce0
Compare
No worries at all! And I didn't notice Aaron had already commented above too so more duplication 🤦♀️ |
|
So after experimenting with this, I think one interesting field missing is |
6a0dc17 to
f169ea9
Compare
template/id.h.tmpl
Outdated
| #define ID_JUNK RUBY_ID_JUNK | ||
| #define ID_INTERNAL RUBY_ID_INTERNAL | ||
|
|
||
| #define ID_INTERNAL_P(id) ((id & RUBY_ID_SCOPE_MASK) == RUBY_ID_INTERNAL) |
There was a problem hiding this comment.
I extracted this so at least it feels a bit less dirty to check for internal ids.
There was a problem hiding this comment.
Actually it already exists in symbol.h as is_junk_id().
f169ea9 to
0159e1e
Compare
|
Hum, I still have a segfault when printing I don't think I'll have much more time to debug this tonight. |
0159e1e to
0ca146a
Compare
0ca146a to
49b634f
Compare
shape.c
Outdated
| rb_shape_t * | ||
| rb_shape_get_next_iv_shape(rb_shape_t* shape, ID id) | ||
| { | ||
| RUBY_ASSERT(is_junk_id(id) || RTEST(rb_sym2str(ID2SYM(id)))); |
There was a problem hiding this comment.
So this helped crash when the unexpected IDs are set:
-- C level backtrace information -------------------------------------------
/home/runner/work/ruby/ruby/build/ruby(rb_print_backtrace+0x818) [0x55e04d7b1eb8] ../src/vm_dump.c:770
/home/runner/work/ruby/ruby/build/ruby(rb_vm_bugreport) ../src/vm_dump.c:1065
/home/runner/work/ruby/ruby/src/tool/lib/test/unit/parallel.rb: TestRubyLiteral#test_debug_frozen_string_in_array_literal(rb_assert_failure+0x77) [0x55e04d5bb0b2]
/home/runner/work/ruby/ruby/build/ruby(rb_shape_get_next_iv_shape+0x1f) [0x55e04d5a50e8] ../src/shape.c:212
/home/runner/work/ruby/ruby/build/ruby(rb_multi_ractor_p) ../src/shape.c:210
/home/runner/work/ruby/ruby/build/ruby(rb_vm_lock_enter) ../src/vm_sync.h:74
/home/runner/work/ruby/ruby/build/ruby(get_next_shape_internal) ../src/shape.c:122
/home/runner/work/ruby/ruby/build/ruby(rb_shape_get_next_iv_shape) ../src/shape.c:213
/home/runner/work/ruby/ruby/build/ruby(rb_shape_get_next) ../src/shape.c:219
/home/runner/work/ruby/ruby/build/ruby(generic_ivar_set+0x1ae) [0x55e04d77ab0e] ../src/variable.c:1286
/home/runner/work/ruby/ruby/build/ruby(ivar_set) ../src/variable.c:1547
/home/runner/work/ruby/ruby/build/ruby(rb_ivar_set) ../src/variable.c:1556
/home/runner/work/ruby/ruby/build/ruby(compile_array+0x531) [0x55e04d90e101]
/home/runner/work/ruby/ruby/build/ruby(iseq_compile_each0+0x261e) [0x55e04d9099ae]
The ID in question seem to be id_debug_created_info, which I can't really figure out what type it is.
I could use some help, but I think RUBY_ID_INTERNAL is far from covering all ids we shouldn't try to map to a string.
There was a problem hiding this comment.
id_debug_created_info looks like one of those special ids that is < tLAST_OP_ID, if that helps at all. is_notop_id checks for those.
There was a problem hiding this comment.
It's just some value in an enum. There is no associated string for that particular ID. Not all IDs have string representations, so we need to dump something else as the edge name.
It's defined in id.h. Unfortunately that file is generated, but you can find the source for it here.
When we derive the key name, we should probably look at the shape type. Only IVAR shapes will have an edge name that may have a string representation, but it's not guaranteed. As @composerinteralia, I think is_notop_id is the right check.
There was a problem hiding this comment.
We looked through id.def and id.c.tmpl and it looks like id_debug_created_info is the only id that works like this 😓. It's defined in id.def to be -, which carries the special meaning that it has no defined string which looks like it was done to hide them from marshal, which skips them by just checking if there is a string associated.
The easiest path forward is probably to duplicate that check and skip any id where !rb_id2str(id);
In the future we might consider having a different way for marshal to skip undesirable ivars (I'd be interested in looking into this, but not so close to release 😅).
There was a problem hiding this comment.
Actually, if we wanted to be conservative we could print out only ids where is_instance_id(x) (only the ones which start with an @), we might miss some internal things, but that should be really safe to do (and since this is for debugging we may not need it perfect)
There was a problem hiding this comment.
Ok, I tried a few more things, but ultimately ended up using is_instance_id like you suggested. Thanks for the help!
60b750f to
436e025
Compare
I see several arguments in doing so. First they use a non trivial amount of memory, so for various memory profiling/mapping tools it is relevant to have visibility of the space occupied by shapes. Then, some pathological code can create a tons of shape, so it is valuable to have a way to have a way to observe shapes without having to compile Ruby with `SHAPE_DEBUG=1`. And additionally it's likely much faster to dump then this way than to use `RubyVM::Shape`. There are however a few open questions: - Shapes can't respect the `since:` argument. Not sure what to do when it is provided. Would probably make sense to not dump them. - Maybe it would make more sense to have a separate `ObjectSpace.dump_shapes`? - Maybe instead `dump_all` should take a `shapes: false` argument? Additionally, `ObjectSpace.dump_shapes` is added for the use case of debugging the evolution of the shape tree.
436e025 to
a986aa5
Compare
|
I think this is good to go now. Anyone wants to review? |
jemmaissroff
left a comment
There was a problem hiding this comment.
✅ LGTM, thanks for writing! (I don't have approve permissions)
|
Tested and working in our app 🚀 |
I see several arguments in doing so.
First they use a non trivial amount of memory, so for various memory profiling/mapping tools it is relevant to have visibility of the space occupied by shapes.
Then, some pathological code can create a tons of shape, so it is valuable have a way to observe shapes without having to compile Ruby with
SHAPE_DEBUG=1.And additionally it's likely much faster to dump them this way than to use
RubyVM::Shape.There are however a few open questions:
since:argument. Not sure what to do when it is provided. Would probably make sense to not dump them.RubyVM::Shape.next_shape_idwas available on all builds, we could use that as a cursor for differential dumps.ObjectSpace.dump_shapes?dump_allshould take ashapes: falseargument?Additionally,
ObjectSpace.dump_shapesis added for the use case of debugging the evolution of the shape tree.