Skip to content

process.GetInfoForPid collects the total thread count#96

Merged
AndersonQ merged 1 commit intoelastic:mainfrom
AndersonQ:34461-beats-collect-total-thread-count
Sep 29, 2023
Merged

process.GetInfoForPid collects the total thread count#96
AndersonQ merged 1 commit intoelastic:mainfrom
AndersonQ:34461-beats-collect-total-thread-count

Conversation

@AndersonQ
Copy link
Copy Markdown
Member

@AndersonQ AndersonQ commented Jul 28, 2023

What does this PR do?

Collect the total thread count, num_threads, as part of the information collected by process.GetInfoForPid

  • On linux:
    • it extends the information parsed from /proc/[PID]/stat to get the total thread count
    • unit tests added, they mock /proc/[PID]/stat
  • On darwin:
    • it uses the syscall proc_pidinfo
    • unit test only validates GetInfoForPid does not return an error and the total thread count ins't zero
  • On windows:

Why is it important?

To enable metricbeat to collect the total thread count per process as requested by elastic/beats#34461.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.md

Related issues

@AndersonQ AndersonQ requested a review from a team as a code owner July 28, 2023 15:35
@AndersonQ AndersonQ requested review from belimawr and fearful-symmetry and removed request for a team July 28, 2023 15:35
@AndersonQ AndersonQ self-assigned this Jul 28, 2023
@AndersonQ AndersonQ added enhancement New feature or request Team:Elastic-Agent Label for the Agent team labels Jul 28, 2023
@AndersonQ AndersonQ marked this pull request as draft July 28, 2023 15:37
@elasticmachine
Copy link
Copy Markdown
Collaborator

elasticmachine commented Jul 28, 2023

💔 Build Failed

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-08-04T12:44:55.052+0000

  • Duration: 4 min 16 sec

Steps errors 1

Expand to view the steps failures

Mage check
  • Took 0 min 9 sec . View more details here
  • Description: mage -v check

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@AndersonQ AndersonQ force-pushed the 34461-beats-collect-total-thread-count branch from 26b107e to 3b97f2f Compare August 1, 2023 16:45
@AndersonQ AndersonQ force-pushed the 34461-beats-collect-total-thread-count branch 2 times, most recently from 8924cb8 to 7de1ac7 Compare September 13, 2023 09:50
@AndersonQ AndersonQ marked this pull request as ready for review September 13, 2023 09:51
@AndersonQ AndersonQ changed the title WIP: add threads num_threads per process Collect the total thread count on process.GetInfoForPid Sep 13, 2023
@AndersonQ AndersonQ changed the title Collect the total thread count on process.GetInfoForPid process.GetInfoForPid collects the total thread count Sep 13, 2023
@AndersonQ AndersonQ force-pushed the 34461-beats-collect-total-thread-count branch 3 times, most recently from fc1b925 to ef72b6f Compare September 13, 2023 11:43
@AndersonQ AndersonQ force-pushed the 34461-beats-collect-total-thread-count branch 2 times, most recently from b298151 to c3fa121 Compare September 13, 2023 12:01
Comment on lines +20 to +31
// Windows Process Snapshotting docs:
// - https://learn.microsoft.com/en-us/previous-versions/windows/desktop/proc_snap/overview-of-process-snapshotting
// PssCaptureSnapshot docs in https://learn.microsoft.com/en-us/windows/win32/api/processsnapshot/nf-processsnapshot-psscapturesnapshot
// PssQuerySnapshot docs in https://learn.microsoft.com/en-us/windows/win32/api/processsnapshot/nf-processsnapshot-pssquerysnapshot

// Use golang.org/x/sys/windows/mkwinsyscall instead of adriansr/mksyscall
// below once https://github.com/golang/go/issues/42373 is fixed.
//go:generate go get github.com/adriansr/mkwinsyscall
//go:generate $GOPATH/bin/mkwinsyscall.exe -systemdll -output zsyscall_windows.go syscall_windows.go

//sys PssCaptureSnapshot(processHandle syscall.Handle, captureFlags PSSCaptureFlags, threadContextFlags uint32, snapshotHandle *syscall.Handle) (err error) [failretval!=0] = kernel32.PssCaptureSnapshot
//sys PssQuerySnapshot(snapshotHandle syscall.Handle, informationClass uint32, buffer *PssThreadInformation, bufferLength uint32) (err error) [failretval!=0] = kernel32.PssQuerySnapshot
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anyone has a suggestion on how to better organise it, I'd appreciate. I don't quite like how it's now, but also I'm not sure which other organisation could be better.

assert.Equal(t, want, got)
}

func TestParsePIDInfos(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully get the name of this test. If I understood it correctly, it's testing parseProcStat, however it is called ParsePIDInfos.

Suggested change
func TestParsePIDInfos(t *testing.T) {
func TestParseProcStat(t *testing.T) {

@AndersonQ AndersonQ force-pushed the 34461-beats-collect-total-thread-count branch from c3fa121 to c0d90b9 Compare September 18, 2023 14:36
@AndersonQ AndersonQ requested a review from belimawr September 18, 2023 14:37

// TestFetchNumThreads only checks some value is returned for NumThreads. However,
// it cannot verify the correctness of the value.
func TestFetchNumThreads(t *testing.T) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could test this by writing a program that runs a known number of threads. The trick is that it can't be a Go program to get control over it. A simple C program with a main function that just sleeps until it is terminated is guaranteed to be only one thread, or you could spawn a known number of threads directly with the pthreads API, or you could write it in C++ and use with std:thread which is probably easier to do.

Hardest part would be getting CI to compile the program but this repository already depends on CGO so there must be a C compiler available already.

Copy link
Copy Markdown
Member

@cmacknz cmacknz Sep 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would also work for Linux, and if you used std:thread in C++ it would also work for Windows without writing a separate test program. As mentioned, the biggest hurdle will be ensuring you have a C++ toolchain available with the appropriate version of C++ available for std:thread support but I hope this is already the case.

@@ -0,0 +1,168 @@
// from https://github.com/zodiacon/Win10SysProgBookSamples/blob/56883f5126f5e29a03d01896f87ce7b82f51fa10/Chapter20/snapproc/snapproc.cpp
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the license of this code? Are you allowed to copy it here?

It seems like you might be better off with just linking to the book with instructions on how to build it unless you've modified this significantly.

Copy link
Copy Markdown
Contributor

@belimawr belimawr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good from my side, but I like Craig's idea to have a simple C++ program spawning a known number of threads for testing.

@AndersonQ AndersonQ force-pushed the 34461-beats-collect-total-thread-count branch 2 times, most recently from c9b166f to a913cc8 Compare September 27, 2023 16:23
@cmacknz
Copy link
Copy Markdown
Member

cmacknz commented Sep 28, 2023

The images used for CI are defined in

LINUX_AGENT_IMAGE: "golang:${GO_VERSION}"
WINDOWS_AGENT_IMAGE: "family/ci-windows-2022"

For Linux at least the Go image contains a pre-installed version of g++ you could use.

I have no idea what is in that Windows container, but you could modify the powershell script that runs the tests to find out by trying to invoke a few of the toolchains you expect might exist.

function withGolang($version) {
Write-Host "-- Install golang --"
choco install -y golang --version $version
$env:ChocolateyInstall = Convert-Path "$((Get-Command choco).Path)\..\.."
Import-Module "$env:ChocolateyInstall\helpers\chocolateyProfile.psm1"
refreshenv
go version
go env
}

We have a dependency manager called chocolately, it's possible you could use it to install mingw.

As for actually compiling the binaries it might be easier to do it with an explicit mage target like mage buildTestBinaries similar to the mage buildSystemTestBinaries that exists in Beats.

Even if you can't get it to work in CI you can document the commands used to build the executables in mage and just not invoke them in CI (although I hope doing it for Linux at least won't be that bad).

@AndersonQ AndersonQ force-pushed the 34461-beats-collect-total-thread-count branch from 954635a to afd1b55 Compare September 29, 2023 12:56
* On linux:
  -  it extends the information parsed from /proc/[PID]/stat to get the total thread count
  -  unit tests added, they mock /proc/[PID]/stat
* On darwin:
  - it uses the syscall proc_pidinfo
  - unit test only validates GetInfoForPid does not return an error and the total thread count ins't zero
* On windows:
  - it uses the Process Snapshotting APIs: https://learn.microsoft.com/en-us/previous-versions/windows/desktop/proc_snap/overview-of-process-snapshotting
  - unit test only validates GetInfoForPid does not return an error and the total thread count ins't
* Add a unit test using a C++ program which runs exactly 42 threads so we can assert the data read by process.GetInfoForPid
  On windows the and newer the process start with more threads than the program requests to parallel load needed tabels/dll or some windows specific things. Therefore the test on windows accepts 42 or 45 for the number of threads allocated to the process. See the following for details https://stackoverflow.com/questions/42789199/why-there-are-three-unexpected-worker-threads-when-a-win32-console-application-s/42789684#42789684
@AndersonQ AndersonQ force-pushed the 34461-beats-collect-total-thread-count branch from 878ae11 to 2f28d53 Compare September 29, 2023 15:13
@AndersonQ AndersonQ merged commit a5fb65f into elastic:main Sep 29, 2023
@AndersonQ AndersonQ deleted the 34461-beats-collect-total-thread-count branch September 29, 2023 16:11
// is launching. It seems to be due to it loading dll/tables in parallel
// or de default thread pool. See the following for more info:
// https://stackoverflow.com/questions/42789199/why-there-are-three-unexpected-worker-threads-when-a-win32-console-application-s/42789684#42789684
expected := 45
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you ensure it's always 45 threads?

One option is to run the program for a while and use the eventually assertion to wait for when the number of threads goes back to 42

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it takes around 30s on my machine to have the number go back to 42. As it's been pretty deterministic, on my machine and on CI, I'm doing this double test. It accounts exactly for the extra threads, if something changes in the future the test sill fails and there should be enough explanation in the test for anyone else o understand the problem and fix it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request Team:Elastic-Agent Label for the Agent team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants