process_collector: fill in most statistics on macOS#1600
Merged
bwplotka merged 3 commits into prometheus:main from Sep 4, 2024
Unfortunately, the virtual memory, resident memory, and network stats will require access to undocumented C functions. I was warned off of cgo in IRC because it would then have to be enabled in a bunch of different projects that use this module, but I was already against it because that would break the ability to cross-compile. There is no interface to `dlopen` built into Go. The `github.com/ebitengine/purego` module looks promising (I can cross-compile and call these methods), but I'm currently getting unexpected results. I'll follow up with that separately if I can get it working, but hopefully this stuff is pretty uncontroversial.

Tested on macOS 10.14.6 (amd64), macOS 14.6.1 (amd64), and macOS 15.0 (arm64) by spawning `/usr/bin/ulimit -a -S` and `/usr/sbin/lsof -c $my_process` from the test exporter process, and `ps -o lstart,vsize,rss,utime,stime,command` from the shell, and comparing results with the exported metrics.

I can't find documentation for `RLIMIT_AS` on macOS (specifically whether it's in bytes or pages). It's currently being reported back as `RLIM_INFINITY`, which seems reasonable, because I've come across reports that the value is ignored anyway[1]. The bash 3.2 code for the built-in `ulimit` divides the value reported by `getrlimit(2)` by 1024 when printing, as it does for `RLIMIT_DATA`, which is documented as being in bytes in `getrlimit(2)`. The help for `ulimit` indicates it prints both in kbytes, so it's reasonable to assume this is already in bytes.

[1] https://issues.chromium.org/issues/40581251#comment3

Signed-off-by: Matt Harbison <mharbison72@gmail.com>
8bb6fc6 to 5836830
Contributor
Author
For context about the memory stats future work, it looks like No idea if the "try to determine if this task has the split libraries mapped in..." extra processing around line 130 is relevant to Go processes. Most blog and SO references I've seen for getting virtual memory counts from
Co-authored-by: Ben Kochie <superq@gmail.com> Signed-off-by: Matt Harbison <57785103+mharbison72@users.noreply.github.com>
SuperQ
approved these changes
Aug 30, 2024
Member
SuperQ
left a comment
LGTM, will leave the final review up to the maintainers.
d0708ad to 5d98e83
Contributor
Author
I guess I forgot to tag a maintainer.
bwplotka
approved these changes
Sep 3, 2024
Member
bwplotka
left a comment
Amazing starting point, thanks! Tiny suggestions, but not blocking, up to you. LGTM
ArthurSens
reviewed
Sep 3, 2024
Member
Thanks!
mharbison72
added a commit
to mharbison72/client_golang
that referenced
this pull request
Sep 17, 2024
prometheus#1600)

Unfortunately, these values aren't available from getrusage(2), or any other builtin Go API. Using cgo is one alternative. It's possible to conditionalize everything such that cgo can remain disabled on non-Darwin platforms, or even when cross-compiling Darwin executables on a non-Darwin platform (and stub in code that causes the metrics to not be exported). `CGO_ENABLED=1` is set by default on macOS, but unfortunately is off for the non-host architecture, even when gcc supports cross-compiling (e.g. building with GOARCH=amd64 on an M2 mac skipped the cgo code). I think that's too subtle of a distinction to rely on cgo.

There's no builtin equivalent on macOS of the `syscall.NewLazyDLL()` and `.NewProc()` that Go provides for Windows, so we're stuck with a 3rd party dependency. But it seems stable, maintained, and getting a fair amount of usage. I'm avoiding their struct deserialization because these native structs are packed differently than the equivalent Go structs, which was causing bad values to be returned.

The code is heavy with inline comments, and I tried keeping the type names the same as the C code to make it easier to search for them.

I'm not sure that we need to do the `mach_vm_region()` call to adjust the `task_info()` values, because I've never seen that conditional evaluate to true on amd64, on arm64, or when amd64 is run under Rosetta. But this is what `ps(1)` does, and I think it's reasonable to try to match that unless somebody knows it's dead code.

Signed-off-by: Matt Harbison <mharbison72@gmail.com>
mharbison72
added a commit
to mharbison72/client_golang
that referenced
this pull request
Sep 24, 2024
prometheus#1600)

Unfortunately, these values aren't available from getrusage(2), or any other builtin Go API. Go itself doesn't provide a mechanism (like on Windows) to call into system libraries. Using a 3rd party package[1] to dynamically call system libraries was proposed and rejected, to avoid adding to the number of dependencies.

That leaves using cgo, which is used here when available. When not available (either because of cross compiling or explicitly disabling it), a stub function is linked instead, and the metrics are not exported. That way, cross compiling of other platforms is unaffected (and can also still be done with Darwin too, but at the cost of not exporting these metrics).

Note that building an amd64 image on an arm64 mac or vice-versa is cross compiling, and will use the stub method by default. This can be avoided by setting `CGO_ENABLED=1` in the environment to force the use of cgo for both architectures.

I'm unsure of the usefulness of the potential adjustment made to the virtual memory value after calling `mach_vm_region()`. I've not seen that code get run with a native amd64 or arm64 image, or with an amd64 image running under Rosetta. But that's what the `ps(1)` command does, and I think we should report what the system tools do.

When I was testing this on a beta of macOS 15 with Go 1.21.13 (the current minimum support for this module), the amd64 image ran fine under Rosetta, but the arm64 image immediately printed a message that it was killed, even prior to the cgo call. This seems to be a recurring issue on macOS[2][3], and passing `-ldflags -s` to `go build` avoided the issue. Go 1.23.1 worked out of the box, without fiddling with linker flags, so I don't think this is an issue; Go 1.21 is simply too old to support macOS 15, but I thought it was worth noting. I suppose we could gate the cgo code with an additional build flag, if anyone is concerned about this.

[1] https://github.com/ebitengine/purego
[2] golang/go#19841 (comment)
[3] golang/go#11887 (comment)
amberpixels
pushed a commit
to amberpixels/prometheus_client_golang
that referenced
this pull request
Nov 29, 2024
* process_collector: fill in most statistics on macOS

Unfortunately, the virtual memory, resident memory, and network stats will require access to undocumented C functions. I was warned off of cgo in IRC because it would then have to be enabled in a bunch of different projects that use this module, but I was already against it because that would break the ability to cross-compile. There is no interface to `dlopen` built into Go. The `github.com/ebitengine/purego` module looks promising (I can cross-compile and call these methods), but I'm currently getting unexpected results. I'll follow up with that separately if I can get it working, but hopefully this stuff is pretty uncontroversial.

Tested on macOS 10.14.6 (amd64), macOS 14.6.1 (amd64), and macOS 15.0 (arm64) by spawning `/usr/bin/ulimit -a -S` and `/usr/sbin/lsof -c $my_process` from the test exporter process, and `ps -o lstart,vsize,rss,utime,stime,command` from the shell, and comparing results with the exported metrics.

I can't find documentation for `RLIMIT_AS` on macOS (specifically whether it's in bytes or pages). It's currently being reported back as `RLIM_INFINITY`, which seems reasonable, because I've come across reports that the value is ignored anyway[1]. The bash 3.2 code for the built-in `ulimit` divides the value reported by `getrlimit(2)` by 1024 when printing, as it does for `RLIMIT_DATA`, which is documented as being in bytes in `getrlimit(2)`. The help for `ulimit` indicates it prints both in kbytes, so it's reasonable to assume this is already in bytes.

[1] https://issues.chromium.org/issues/40581251#comment3

Signed-off-by: Matt Harbison <mharbison72@gmail.com>

* Update prometheus/process_collector_darwin.go

Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: Matt Harbison <57785103+mharbison72@users.noreply.github.com>

---------

Signed-off-by: Matt Harbison <mharbison72@gmail.com>
Signed-off-by: Matt Harbison <57785103+mharbison72@users.noreply.github.com>
Co-authored-by: Ben Kochie <superq@gmail.com>
Co-authored-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Eugene <eugene@amberpixels.io>