Fix cpu core count detection#5625
Conversation
src/Shared/NativeMethodsShared.cs
Outdated
| IntPtr ptr = IntPtr.Zero; | ||
| try | ||
| { | ||
| ptr = Marshal.AllocHGlobal((int)ReturnLength); |
There was a problem hiding this comment.
Any reason not to make this part of the original definition? If it fails, you don't have to free it.
There was a problem hiding this comment.
I usually put Marshal.AllocHGlobal inside try/catch statements to catch potential OOM exceptions. Calling free with IntPtr.Zero is safe. Can change it, if you prefer.
src/Shared/NativeMethodsShared.cs
Outdated
| { | ||
| count++; | ||
| } | ||
| pos += Marshal.ReadInt32(ptr, pos + 4); |
There was a problem hiding this comment.
nit:
Can you add this to the for statement?
Also, is this right? Maybe I'm misunderstanding, but it looks like you're increasing it by whatever 32-bit integer happens to be four bytes along? Is that the size of the current struct?
src/Shared/NativeMethodsShared.cs
Outdated
| int count = 0; | ||
| for(int pos = 0; pos < ReturnLength; ) | ||
| { | ||
| LOGICAL_PROCESSOR_RELATIONSHIP Type = (LOGICAL_PROCESSOR_RELATIONSHIP)Marshal.ReadInt16(ptr, pos); |
There was a problem hiding this comment.
https://docs.microsoft.com/windows/win32/api/winnt/ns-winnt-system_logical_processor_information seems to say this is just a byte, not an int16.
src/Shared/NativeMethodsShared.cs
Outdated
| LOGICAL_PROCESSOR_RELATIONSHIP Type = (LOGICAL_PROCESSOR_RELATIONSHIP)Marshal.ReadInt16(ptr, pos); | ||
| if(Type == LOGICAL_PROCESSOR_RELATIONSHIP.RelationProcessorCore) | ||
| { | ||
| count++; |
There was a problem hiding this comment.
Looking at the documentation:
The specified logical processors share a single processor core. The ProcessorCore member contains additional information.
I switched this to run whenever numberOfCpus exceeded 1 and printed out the number of cpus it calculated. It printed 6. I have 6 dual cores. I think we can get a more accurate count if we use the ProcessorCore to get processors/core as well.
There was a problem hiding this comment.
Ok, reused previous corefx code, count should be more accurate now.
Don't think so as Also, shouldn't this cpu checking code be located in https://github.com/dotnet/runtime/blob/96f178d32b7ba62485917ac46ef1edcfd3c2d10d/src/coreclr/src/classlibnative/bcltype/system.cpp#L327 instead? |
src/Shared/NativeMethodsShared.cs
Outdated
|
|
||
| internal unsafe struct GROUP_RELATIONSHIP | ||
| { | ||
| private byte MaximumGroupCount; |
There was a problem hiding this comment.
nit: This is defined as
WORD MaximumGroupCount;in the unmanaged world so should be an ushort here.
src/Shared/NativeMethodsShared.cs
Outdated
| } | ||
|
|
||
| internal enum LOGICAL_PROCESSOR_RELATIONSHIP | ||
| { |
src/Shared/NativeMethodsShared.cs
Outdated
| } | ||
| } | ||
|
|
||
| public unsafe static int GetPhysicalCoreCount() |
There was a problem hiding this comment.
Can you please add a comment with a short <summary>?
src/Shared/NativeMethodsShared.cs
Outdated
| int groupCount = current->Group.ActiveGroupCount; | ||
| for (int i = 0; i < groupCount; i++) | ||
| { | ||
| processorCount += (groupInfo + i)->ActiveProcessorCount; |
There was a problem hiding this comment.
Add a check that this dereference is not overrunning the containing buffer, just in case.
There was a problem hiding this comment.
I'm not quite sure about the kind of check you have in mind here?
byte* ptr = bufferPtr, endPtr = bufferPtr + len;
while (ptr < endPtr)
This seems correct from what I can tell.
What check would you add to processorCount += (groupInfo + i)->ActiveProcessorCount;?
There was a problem hiding this comment.
If groupCount is a bogus number, (groupInfo + i) could be pointing outside of the bufferPtr <-> endPtr block. I believe it's a good security practice to not trust the data and check that all memory accesses are within the bounds when using unsafe C# code.
Strictly speaking, this should also include current->Relationship and current->Group.ActiveGroupCount because ptr being less than endPtr is not sufficient for these reads to be from within the buffer.
There was a problem hiding this comment.
Since this code was directly taken from dotnet/runtime unmodified, and has been pushed to production, I'm inclined to believe it works and does not need additional checks. I'll add the checks since you prefer though.
There was a problem hiding this comment.
Added some checks 4223cc2, please feel free to let me know if that's not what you had in mind.
There was a problem hiding this comment.
These checks will now just ignore invalid data which is, to our understanding, logically impossible. Should they not be asserting or throwing?
src/MSBuild/XMake.cs
Outdated
| String.Equals(switchName, "maxcpucount", StringComparison.OrdinalIgnoreCase)) | ||
| { | ||
| int numberOfCpus = Environment.ProcessorCount; | ||
| if (numberOfCpus == 32 && NativeMethodsShared.IsWindows) // 32-bit process, 32-bit Windows had a 32-core limit |
There was a problem hiding this comment.
I think it would be better if either:
- This code ran always, i.e. remove the
numberOfCpus == 32condition. It would help test the new code on more machines. Unless calling the Win32 API has a measurable perf impact. - Or this code ran only if the process is actually 32-bit. Users may choose to run 64-bit msbuild.exe and my understanding is that this fix is not needed there.
|
Closing to reopen and trigger a CI build. See #5646 |
src/Shared/NativeMethodsShared.cs
Outdated
|
|
||
| internal unsafe struct GROUP_RELATIONSHIP | ||
| { | ||
| private byte MaximumGroupCount; |
There was a problem hiding this comment.
There was a build error related to the MaximumGroupCount:
error CS0169: The field 'NativeMethodsShared.GROUP_RELATIONSHIP.MaximumGroupCount' is never used
and ActiveGroupCount:
error CS0649: Field 'NativeMethodsShared.GROUP_RELATIONSHIP.ActiveGroupCount' is never assigned to, and will always have its default value 0
and SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX.Size:
error CS0649: Field 'NativeMethodsShared.SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX.Size' is never assigned to, and will always have its default value 0
and SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX.Relationship:
error CS0649: Field 'NativeMethodsShared.SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX.Relationship' is never assigned to, and will always have its default value
475840d to
ff27678
Compare
Forgind
left a comment
There was a problem hiding this comment.
I think this works—thank you!
src/MSBuild/XMake.cs
Outdated
| String.Equals(switchName, "maxcpucount", StringComparison.OrdinalIgnoreCase)) | ||
| { | ||
| int numberOfCpus = Environment.ProcessorCount; | ||
| if (IntPtr.Size == 4 && NativeMethodsShared.IsWindows) // 32-bit process, 32-bit Windows had a 32-core limit |
There was a problem hiding this comment.
Is there a more intuitive way to get that it's 32-bit? If not, this is fine. I was only confused until I read the comment.
There was a problem hiding this comment.
If there isn't another way, can we comment with a link to https://docs.microsoft.com/en-us/dotnet/api/system.intptr.size?view=netcore-3.1, or the excerpt from that page
The size of a pointer or handle in this process, measured in bytes. The value of this property is 4 in a 32-bit process, and 8 in a 64-bit process. You can define the process type by setting the /platform switch when you compile your code with the C# and Visual Basic compilers.
There was a problem hiding this comment.
To be clear, I understood immediately after reading the comment. Although I'd count this as somewhat subtle but definitely simple and functional.
There was a problem hiding this comment.
Maybe extract it to a readonly static property is32bit? Or use an enum?
There was a problem hiding this comment.
Good point! You reminded me of https://github.com/dotnet/msbuild/blob/master/src/Shared/EnvironmentUtilities.cs#L11
We should follow this model.
There was a problem hiding this comment.
Just use this but test for the negation?
There was a problem hiding this comment.
Also it looks like there is a static bool in environment.
https://github.com/dotnet/runtime/blob/ea2b09beea535abf960651c59359e40f99f2fecf/src/libraries/System.Private.CoreLib/src/System/Environment.cs#L138
https://docs.microsoft.com/en-us/dotnet/api/system.environment.is64bitprocess?view=netcore-3.1
There was a problem hiding this comment.
Either way works, but I'd make a new Is32Bit version. We probably don't have to worry about 16-bit machines, and we don't currently have to worry about 128-bit machines, but just in case this code lasts a while...
There was a problem hiding this comment.
EnvironmentUtilities is internal to this assembly thus cannot be referenced without adding an InternalsVisibleTo which I don't think we want to do. Used !Environment.Is64BitProcess instead.
src/Shared/NativeMethodsShared.cs
Outdated
| // Walk each SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX in the buffer, where the Size of each dictates how | ||
| // much space it's consuming. For each group relation, count the number of active processors in each of its group infos. | ||
| int processorCount = 0; | ||
| byte* ptr = bufferPtr, endPtr = bufferPtr + len; |
There was a problem hiding this comment.
nit:
split onto two lines
I also kinda liked your for loop better, but this is fine.
There was a problem hiding this comment.
split onto two lines
done
| RelationGroup, | ||
| RelationAll = 0xffff | ||
| } | ||
| internal struct SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX |
There was a problem hiding this comment.
Question for myself. I noticed this struct has more members: https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-system_logical_processor_information_ex.
Can you simply not define all members and it will populate only what you have defined without raising some sort of error?
There was a problem hiding this comment.
Can you simply not define all members and it will populate only what you have defined without raising some sort of error?
Fair question. I think it depends on what you need to marshal. In case of union, this way of doing things seems to be fine and the original writer had a comment about it: https://github.com/dotnet/runtime/blob/3ab9c0b79300a5b3d954421aa4da15a9c2ca1271/src/libraries/Common/src/Interop/Windows/kernel32/Interop.GetLogicalProcessorInformationEx.cs#L27
src/Shared/NativeMethodsShared.cs
Outdated
| /// Useful for getting the exact core count in 32 bits processes, | ||
| /// as Environment.ProcessorCount has a 32-core limit in that case. | ||
| /// </summary> | ||
| public unsafe static int GetPhysicalCoreCount() |
There was a problem hiding this comment.
Ideally this should be copy pasted code from a trustworthy repo (dotnet runtime, corefx, etc) and we just copy paste it and add a comment citing the source. If that's the case, can you please cite the source? (github file line permalink)
Well, ideally ideally it should be in a shared library of some sorts, but let's not get too ideal :)
src/Shared/NativeMethodsShared.cs
Outdated
| if (groupInfo != null) | ||
| { | ||
| int groupCount = current->Group.ActiveGroupCount; | ||
| for (int i = 0; i < groupCount; i++) |
There was a problem hiding this comment.
I may have misunderstood, but I thought @ladipro was thinking something more like this?
| for (int i = 0; i < groupCount; i++) | |
| for (int i = 0; i < groupCount && groupInfo + i < endPtr; i++) |
There was a problem hiding this comment.
Thanks, yes, I was thinking something like this, not the null checks added in 4223cc2. It would have to be:
for (int i = 0; i < groupCount && groupInfo + i + 1 <= endPtr; i++)i.e. the entire structure pointed to by (groupInfo + i) fits in the buffer.
I mentioned this to @danmosemsft offline and he doesn't consider this kind of hardening necessary when calling a trusted API. @mfkl, feel free to revert the 4223cc2. Apologies for randomizing you.
There was a problem hiding this comment.
Yeah, clearly we don't do it in the code this came from in dotnet/runtime and I think it would be simply "testing the OS has no bugs". But if there's a security context, don't take my word for it - we should ask others.
src/MSBuild/XMake.cs
Outdated
| String.Equals(switchName, "maxcpucount", StringComparison.OrdinalIgnoreCase)) | ||
| { | ||
| int numberOfCpus = Environment.ProcessorCount; | ||
| if (!Environment.Is64BitProcess && NativeMethodsShared.IsWindows) // 32-bit process, 32-bit Windows had a 32-core limit |
There was a problem hiding this comment.
64-bit Windows had a 64-core limit, too. I think this should be called when we're running on .NET Framework regardless of bitness (on .NET Core/.NET 5, we can rely on Environment.ProcessorCount to do the trick for us, right?).
There was a problem hiding this comment.
64-bit Windows had a 64-core limit, too. I think this should be called when we're running on .NET Framework regardless of bitness (on .NET Core/.NET 5, we can rely on
Environment.ProcessorCountto do the trick for us, right?).
.NET 5 Environment.ProcessorCount calls the native GetProcessorCount function. So I think its all good.
https://github.com/dotnet/runtime/blob/ea2b09beea535abf960651c59359e40f99f2fecf/src/libraries/System.Private.CoreLib/src/System/Environment.cs#L13
This reverts commit 4223cc2.
Forgind
left a comment
There was a problem hiding this comment.
This looks how I imagined it, thanks! I don't know whether Environment.ProcessorCount works properly on .NET Core/.NET 5+.
rainersigwald
left a comment
There was a problem hiding this comment.
I think this looks good to me. I'm trying to pull some strings internally to borrow a machine big enough to test it on (looks like Azure will charge me merely $50,000/month for a big enough VM!).
Co-authored-by: Rainer Sigwald <raines@microsoft.com>
Co-authored-by: Rainer Sigwald <raines@microsoft.com>
@rainersigwald I reached out to some people within the dotnet foundation to see if there might be funds for something like this. I'll let you know if I hear anything. |
|
@rainersigwald I emailed AMD and they have offered to test on their 64-core ThreadRipper 3990X processor. |
|
Current status: I got access to a Standard_F72s_v2 VM, installed a private Visual Studio with this fix and tested it out. It almost works! In 64-bit everything looks great. In 32-bit, the new code correctly identifies 2 processor groups (2 NUMA nodes), but Windows lies to us again and claims only 32 cores in each, when really there are 36--so we're artificially limited to 64 MSBuild nodes. This is in the docs:
I'll start a thread up with some Windows folks internally to see if this is the best we can do. |
Works around win32 limitation--a group will report only up to 32 processors on a 32-bit process.
|
Latest push seems to work, at least until AMD blows things up. |
|
However, depending on how dotnet/runtime#41902 is resolved, we may need to do this on .NET Core/Windows too. |
|
@rainersigwald "AMD's Milan CPUs for the third generation of Epyc products are unlikely to support quadruple smt." I imagine in the future it might happen, but lucky not so soon. |
rainersigwald
left a comment
There was a problem hiding this comment.
Tested my latest commit on the 72-core machine in dotnet build and msbuild.exe scenarios and both successfully started and used 72 nodes. Looks good to me!
|
Thank you @mfkl! This should be available starting in 16.8 preview 4. |
|
Thank you for picking it up and finishing this :-) |
|
By the way, GitHub somehow mangled the squash/merge commit--I did not intentionally erase you from the commit history. Sorry about that! |
|
No worries. |
Makes three PRs (#5669, #5625, and #5653) use Change Waves (#5710) to opt out as a unit. Also added a test to ensure that change waves worked properly for one of the three PRs and corrected a minor issue with tests for changes under a change wave that assume it is enabled when there exist tests that disable the relevant change wave without resetting it to empty at the end.
|
This PR for the MSBuild cpu core count detection has an issue with debugging processor groups. I've created a new issue here and included some extra information for debugging without physical hardware: #7943 |
Attempt to fix #435
Potentially relevant: