8273967: gtest os.dll_address_to_function_and_library_name_vm fails on macOS12#6193
8273967: gtest os.dll_address_to_function_and_library_name_vm fails on macOS12#6193dcubed-ojdk wants to merge 2 commits intoopenjdk:masterfrom
Conversation
|
/label add hotspot-runtime |
|
/label add serviceability |
|
👋 Welcome back dcubed! A progress list of the required criteria for merging this PR into |
|
@dcubed-ojdk |
|
@dcubed-ojdk |
Webrevs
|
|
Ping! Any takers? |
tstuefe
left a comment
There was a problem hiding this comment.
Hi Daniel,
looks good. Small remarks below. I leave it up to you if you take my suggestions.
Cheers, Thomas
| // The Mach-O binary format does not contain a "list of files" with address | ||
| // ranges like ELF. That makes sense since Mach-O can contain binaries for | ||
| // than one instruction set so there can be more than one address range for | ||
| // each "file". |
There was a problem hiding this comment.
Small nit, it seems confusing to have a Mac-specific comment in the BSD section.
Maybe this would be better in MachDecoder? E.g. implement the 6-arg version of decode() but stubbed out returning false, and with your comment there.
There was a problem hiding this comment.
It's actually fairly common to have Mac-specific stuff in the BSD files. The macOS
port was built on top of the BSD port and the BSD port was built by copying a LOT
of code from Linux into BSD specific files with modifications as needed.
If I pushed this change down into MachDecoder, then I would have to lose the
ShouldNotReachHere() call in order to not assert in non-release bits. I don't
think I want to do that since this may not be the only place that calls the
6-arg version of decode().
There was a problem hiding this comment.
It's actually fairly common to have Mac-specific stuff in the BSD files. The macOS port was built on top of the BSD port and the BSD port was built by copying a LOT of code from Linux into BSD specific files with modifications as needed.
I always wondered whether anyone actually builds the BSDs in head. I assume Oracle does not, right? I know there are downstream porters somewhere but only for old releases, or?
If I pushed this change down into MachDecoder, then I would have to lose the
ShouldNotReachHere()call in order to not assert in non-release bits. I don't think I want to do that since this may not be the only place that calls the 6-arg version of decode().
Fair enough, thanks for the clarification.
There was a problem hiding this comment.
Oracle does not build BSD in head. At one point, Dmitry Samersoff used to build BSD
in his lab, but I don't know if he still does that.
src/hotspot/os/bsd/os_bsd.cpp
Outdated
| if (offset) *offset = -1; | ||
| return false; | ||
| } | ||
| #endif |
There was a problem hiding this comment.
We use dladdr() in several places in this code. I wonder whether it would make sense to fix all of those with a wrapper instead:
static int my_dladdr(const void* addr, Dl_info* info) {
if (addr != (void*)-1) {
return dladdr(addr, info);
} else {
// add comment here
return 0;
}
}
#define dladdr my_dladdr
There was a problem hiding this comment.
I'll take a look at the other calls to dladdr(). I'm trying to limit what I change
here to things that actually failed in our test on macOS12 on X64 and aarch64.
There was a problem hiding this comment.
I took a quick look at the other calls to dladdr() in src/hotspot/os/bsd/os_bsd.cpp
and I'm not comfortable with changing those uses without having a specific test
case that I can use to investigate those code paths.
We are fairly early in our testing on macOS12 so I may run into a reason to revisit
this choice down the road.
There was a problem hiding this comment.
I took a quick look at the other calls to
dladdr()in src/hotspot/os/bsd/os_bsd.cpp and I'm not comfortable with changing those uses without having a specific test case that I can use to investigate those code paths.We are fairly early in our testing on macOS12 so I may run into a reason to revisit this choice down the road.
Okay!
There was a problem hiding this comment.
I've had a chance to think about this overnight and I'm not liking
my duplication of code so I'm going to look at adding a wrapper
that is called by the two calls sites where know I need the special
handling.
dcubed-ojdk
left a comment
There was a problem hiding this comment.
@tstuefe - Thanks for your review.
src/hotspot/os/bsd/os_bsd.cpp
Outdated
| if (offset) *offset = -1; | ||
| return false; | ||
| } | ||
| #endif |
There was a problem hiding this comment.
I'll take a look at the other calls to dladdr(). I'm trying to limit what I change
here to things that actually failed in our test on macOS12 on X64 and aarch64.
| // The Mach-O binary format does not contain a "list of files" with address | ||
| // ranges like ELF. That makes sense since Mach-O can contain binaries for | ||
| // than one instruction set so there can be more than one address range for | ||
| // each "file". |
There was a problem hiding this comment.
It's actually fairly common to have Mac-specific stuff in the BSD files. The macOS
port was built on top of the BSD port and the BSD port was built by copying a LOT
of code from Linux into BSD specific files with modifications as needed.
If I pushed this change down into MachDecoder, then I would have to lose the
ShouldNotReachHere() call in order to not assert in non-release bits. I don't
think I want to do that since this may not be the only place that calls the
6-arg version of decode().
gerard-ziemski
left a comment
There was a problem hiding this comment.
Looks good, just a few optional nitpicks that I personally would have done, if it were me doing the change.
src/hotspot/os/bsd/os_bsd.cpp
Outdated
| #endif | ||
|
|
||
| #if defined(__APPLE__) |
There was a problem hiding this comment.
Why not just do:
#else
here instead and collapse these 3 lines into 1?
There was a problem hiding this comment.
Hmmm... I'll take a look at doing that.
| #endif | ||
|
|
||
| #if defined(__APPLE__) | ||
| char localbuf[MACH_MAXSYMLEN]; |
There was a problem hiding this comment.
This __APPLE__ section is the only one, that I can see, using MACH_MAXSYMLEN, why not move:
#if defined(__APPLE__)
#define MACH_MAXSYMLEN 256
#endif
here (i.e. just the #define MACH_MAXSYMLEN 256 and minimize the need for __APPLE__ sections?
There was a problem hiding this comment.
Hmmm.... I'll take a look at cleaning this up a bit.
src/hotspot/os/bsd/os_bsd.cpp
Outdated
| if (addr == (address)(intptr_t)-1) { | ||
| // dladdr() in macOS12/Monterey returns success for -1, but that addr | ||
| // value should not be allowed to work to avoid confusion. | ||
| buf[0] = '\0'; | ||
| if (offset) *offset = -1; | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Do we need this here? Wouldn't the earlier call to Decoder::decode(addr, localbuf, MACH_MAXSYMLEN, offset, dlinfo.dli_fbase)) catch this with ShouldNotReachHere?
There was a problem hiding this comment.
That's the 5-parameter version of decode() and it doesn't have ShouldNotReachHere.
So if that code site is called and returns false, then we get into
dll_address_to_library_name() and reach this dladdr() call which
will accept the "-1"...
|
@dcubed-ojdk This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 85 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
dcubed-ojdk
left a comment
There was a problem hiding this comment.
@tstuefe - Thanks for closing the loop on my previous replies.
@gerard-ziemski - Thanks for the review!
I'm going to make more tweaks to this fix and will update the
PR after my test cycle is complete.
src/hotspot/os/bsd/os_bsd.cpp
Outdated
| if (offset) *offset = -1; | ||
| return false; | ||
| } | ||
| #endif |
There was a problem hiding this comment.
I've had a chance to think about this overnight and I'm not liking
my duplication of code so I'm going to look at adding a wrapper
that is called by the two calls sites where know I need the special
handling.
| // The Mach-O binary format does not contain a "list of files" with address | ||
| // ranges like ELF. That makes sense since Mach-O can contain binaries for | ||
| // than one instruction set so there can be more than one address range for | ||
| // each "file". |
There was a problem hiding this comment.
Oracle does not build BSD in head. At one point, Dmitry Samersoff used to build BSD
in his lab, but I don't know if he still does that.
| #endif | ||
|
|
||
| #if defined(__APPLE__) | ||
| char localbuf[MACH_MAXSYMLEN]; |
There was a problem hiding this comment.
Hmmm.... I'll take a look at cleaning this up a bit.
src/hotspot/os/bsd/os_bsd.cpp
Outdated
| #endif | ||
|
|
||
| #if defined(__APPLE__) |
There was a problem hiding this comment.
Hmmm... I'll take a look at doing that.
src/hotspot/os/bsd/os_bsd.cpp
Outdated
| if (addr == (address)(intptr_t)-1) { | ||
| // dladdr() in macOS12/Monterey returns success for -1, but that addr | ||
| // value should not be allowed to work to avoid confusion. | ||
| buf[0] = '\0'; | ||
| if (offset) *offset = -1; | ||
| return false; | ||
| } |
There was a problem hiding this comment.
That's the 5-parameter version of decode() and it doesn't have ShouldNotReachHere.
So if that code site is called and returns false, then we get into
dll_address_to_library_name() and reach this dladdr() call which
will accept the "-1"...
|
@tstuefe and @gerard-ziemski - please re-review when you get the chance. |
|
This version has been tested with Mach5 Tier1 and with runs of the specific |
|
@tstuefe - Thanks for the re-review! |
|
Thank you Dan for the fix! |
|
@gerard-ziemski - Thanks for the re-review! |
|
/integrate |
|
Going to push as commit 92d2176.
Your commit was automatically rebased without conflicts. |
|
@dcubed-ojdk Pushed as commit 92d2176. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
|
@dcubed-ojdk: I just ran into this in the Julia runtime and reached a similar conclusion/fix. Did you find out anything more about why this happens? At a glance, I didn't see any blatant |
|
@dnadlinger - No I haven't chased this any further. Also please see this note in the bug report: |
|
@dcubed-ojdk: Thanks for taking the time to replyr regardless! The dlopen/dladdr/… man pages all show up fine on my machine, by the way (macOS 12.1). |
macOS12 has changed the dladdr() function to accept "-1" as a valid address and
we have functions that use dladdr() to convert DLL addresses into function or
library names. We also have a gtest that verifies that "-1" is not a valid value to use
as a symbol address.
As you might imagine, existing code that uses
os::dll_address_to_function_name()or
os::dll_address_to_library_name()can get quite confused (and sometimes crash)if an
addrparameter of-1was allowed to be used.I've also made two cleanup changes as part of this fix:
In
src/hotspot/os/bsd/os_bsd.cppthere is some macOS specific code that shouldbe properly
#ifdef'ed. There is also some code that makes sense for ELF formatfiles, but not for Mach-O format files so that code needs to be excluded on macOS.
In
src/hotspot/share/runtime/os.cppI noticed a simple typo in a comment on an#endifthat I fixed. That typo does not appear anywhere else in the HotSpot codebase so I'd like to fix it with this bug ID since I'm in related areas.
This fix has been tested with Mach5 Tier[1-6].
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6193/head:pull/6193$ git checkout pull/6193Update a local copy of the PR:
$ git checkout pull/6193$ git pull https://git.openjdk.java.net/jdk pull/6193/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 6193View PR using the GUI difftool:
$ git pr show -t 6193Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6193.diff