ysbaddaden
left a comment
That's pretty nice! I've been pondering about this for a while.
I see just a couple issues:
- We should stop the world while we dump the heap so it's MT compatible (that one's easy).
- We can't allocate anything, but stdlib will happily allocate anywhere... Maybe we could introduce `Crystal::System.read(fd, slice)` and `.write(fd, slice)` methods that would directly read from and write to a system `fd` or `handle`?
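
To make the idea of writing straight to a system `fd` concrete, here is a minimal sketch, assuming a POSIX target. `raw_write_all` is a hypothetical helper name, not an existing `Crystal::System` method; it only calls `LibC.write`, so it bypasses `IO` and the event loop, does not allocate on the GC heap, and reports failure through its return value instead of raising.

```crystal
# Hypothetical helper (not part of Crystal's stdlib): writes the whole slice
# to a raw POSIX file descriptor, retrying on partial writes and EINTR.
# Returns true on success and false on any other error; it never raises.
def raw_write_all(fd : Int32, slice : Bytes) : Bool
  until slice.empty?
    written = LibC.write(fd, slice, slice.size)
    if written < 0
      # Retry only when the call was interrupted by a signal.
      return false unless Errno.value == Errno::EINTR
      next
    end
    # Slice is a struct, so advancing it does not touch the GC heap.
    slice += written
  end
  true
end

# Usage: write a few bytes directly to standard output (fd 1).
raw_write_all(1, "heap dump goes here\n".to_slice)
```

A Windows counterpart would presumably wrap `LibC.WriteFile` on a `HANDLE` in the same way.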
Do you mean that |
ysbaddaden
left a comment
Thank you, it looks great to me for the initial step 🙇
It is possible to eliminate GC allocations from the `IO` itself:

```crystal
class RawIO < IO
  def initialize(@handle : LibC::HANDLE)
  end

  def read(slice : Bytes)
    LibC.ReadFile(@handle, slice, slice.size, out read_count, nil)
    read_count.to_i32
  end

  def write(slice : Bytes) : Nil
    LibC.WriteFile(@handle, slice, slice.size, nil, nil)
  end
end

macro open_raw_file(path, mode = "r", &block)
  {% write = mode == "w" ? true : mode == "r" ? false : mode.raise "Unknown file mode: #{mode.id}" %}

  # if `path.to_utf16` uses a fixed-size stack buffer then
  # this macro could even be turned into a regular method
  %handle = ::LibC.CreateFileW(
    {{ path.to_utf16 }},
    {% if write %} ::LibC::FILE_GENERIC_WRITE {% else %} ::LibC::FILE_GENERIC_READ {% end %},
    ::LibC::DEFAULT_SHARE_MODE,
    nil,
    {% if write %} ::LibC::CREATE_ALWAYS {% else %} ::LibC::OPEN_EXISTING {% end %},
    ::LibC::FILE_FLAG_BACKUP_SEMANTICS,
    nil,
  )
  if %handle == ::LibC::INVALID_HANDLE_VALUE
    ::raise(::RuntimeError.new("CreateFileW"))
  end

  begin
    # The RawIO instance lives in stack storage, so it is not GC-allocated
    %raw_io_buf = uninitialized ::ReferenceStorage(::RawIO)
    {{ block.args[0] }} = ::RawIO.unsafe_construct(pointerof(%raw_io_buf), %handle)
    {{ block.body }}
  ensure
    ::LibC.CloseHandle(%handle)
  end
end

open_raw_file("heap.bin", "w") do |f|
  PerfTools::DumpHeap.full(f)
end
```
If the environment variable `CRYSTAL_DUMP_TYPE_INFO` is set, at build time the compiler will emit a bunch of type information to a JSON file at that path. The JSON looks something like:
```json
{
"types": [
{
"name": "Regex",
"id": 46,
"min_subtype_id": 46,
"supertype_id": 188,
"has_inner_pointers": true,
"size": 8,
"align": 8,
"instance_size": 56,
"instance_align": 8,
"instance_vars": [
{
"name": "@re",
"type_name": "Pointer(LibPCRE2::Code)",
"offset": 8,
"size": 8
},
{
"name": "@jit",
"type_name": "Bool",
"offset": 16,
"size": 1
},
{
"name": "@source",
"type_name": "String",
"offset": 24,
"size": 8
},
{
"name": "@match_data",
"type_name": "Crystal::ThreadLocalValue(Pointer(LibPCRE2::MatchData))",
"offset": 32,
"size": 16
},
{
"name": "@options",
"type_name": "Regex::Options",
"offset": 48,
"size": 8
}
]
}
]
}
```
At the moment, this is intended to be an internal tool that supplements the similarly named `CRYSTAL_DUMP_TYPE_ID` environment variable. I originally made this to generate human-readable reports from [GC heap dumps](crystal-lang/perf-tools#30), but there are probably other good uses like enhancing the debugger support scripts.
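
As a rough illustration of how a tool might consume this file, the sketch below indexes the `types` array by `id` and prints the field layout of one type. It embeds a trimmed copy of the sample above instead of reading a real dump, and `describe` and `types_by_id` are hypothetical names chosen for this example, not part of any existing tool.

```crystal
require "json"

# Trimmed sample in the format shown above. A real file would be the one the
# compiler writes to the path given in CRYSTAL_DUMP_TYPE_INFO.
TYPE_INFO = <<-JSON
  {"types": [{"name": "Regex", "id": 46, "instance_size": 56,
    "instance_vars": [
      {"name": "@re", "type_name": "Pointer(LibPCRE2::Code)", "offset": 8, "size": 8},
      {"name": "@jit", "type_name": "Bool", "offset": 16, "size": 1}
    ]}]}
  JSON

# Index the entries by type ID, which is how a heap dump reader would
# typically look them up.
types_by_id = {} of Int32 => JSON::Any
JSON.parse(TYPE_INFO)["types"].as_a.each do |type|
  types_by_id[type["id"].as_i] = type
end

# Hypothetical helper: renders one type entry as a small field layout listing.
def describe(type : JSON::Any) : String
  String.build do |io|
    io << type["name"].as_s << " (" << type["instance_size"].as_i << " bytes)\n"
    type["instance_vars"].as_a.each do |ivar|
      io << "  +" << ivar["offset"].as_i << " " << ivar["name"].as_s
      io << " : " << ivar["type_name"].as_s << "\n"
    end
  end
end

puts describe(types_by_id[46])
```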
    Thread.stop_world
    GC.lock_write

I think the order can deadlock: a stopped thread might hold the read or write lock, and we won't be able to acquire it.
We probably don't need to lock the GC anyway: the current thread stopped the world, and we own the process.

Suggested change: drop `GC.lock_write` and keep only `Thread.stop_world`.

    fn.call(obj, bytes.to_u64!)
    end, pointerof(block))
    ensure
    GC.unlock_write

Suggested change: drop `GC.unlock_write`.
The `RawIO` is interesting, but it only works for file descriptors, and it still involves the event loop, and thus the fiber scheduler, which can raise, etc. We only need to write
Resolves #5.
This is a very primitive binary dump and, as mentioned in the issue, requires compiler support to emit the appropriate type info at build time so that other tools can comprehend those dumps (`CRYSTAL_DUMP_TYPE_ID=1` will let you visually identify some allocations in a hex editor at the moment).

Apart from these custom formats, maybe we could try to produce industry or community standard memory dumps in the future...?