Add NUMA and thread affinity support for Unix #10938
Conversation
Fixes #1986
40819cf to 2b3318a
How are we going to test this? (The NUMA part in particular.)
src/pal/src/numa/numa.cpp
```cpp
BOOL success = TRUE;

LOGEXIT("GetNumaProcessorNodeEx returns BOOL %d\n", success);
PERF_EXIT(GetNumaProcessorNodeEx);
```
nit: is this supposed to be GetNumaHighestNodeNumber?
Thank you for spotting that, I have fixed it.
@dotnet-bot test tizen armel cross debug build
@jkotas, as for the individual functions, I have tested them locally on my native Linux box, which has 8 cores in two NUMA nodes. Some testing of different NUMA configurations can be done on a VM. As for performance testing, I don't have a clear plan yet; ideally, we could run a real-world benchmark like TechEmpower on a physical machine with more processors / NUMA nodes and measure performance with the NUMA support enabled and disabled.
2b3318a to fb6129e
This change adds new PAL functions for NUMA and thread affinity support on Unix and also enables the related code in the GC and VM for FEATURE_PAL. It doesn't yet reflect the limits imposed by cgroups on systems with cgroups enabled.
@dotnet-bot test Windows_NT x64 Debug Build and Test please
@dotnet-bot test Ubuntu arm Cross Release Build please
@dotnet-bot test Ubuntu arm Cross Release Build |
@swgillespie, @adityamandaleeka could you please review?
src/pal/src/numa/numa.cpp
```cpp
/*++
Function:
  FreeLookupArrays
```
Sigh. Thank you for spotting that.
```cpp
for (int i = 0; i < numaNodesCount; i++)
{
    int st = numa_node_to_cpus(i, mask);
```
Looks like we're not doing anything with this return value. Assert that it's 0?
I'm adding the assert. The only way this function can fail is when the mask is not large enough, but since the mask is allocated by the numa_allocate_cpumask function, that cannot happen.
```cpp
if (currentGroupCpus != 0)
{
    g_groupToCpuCount[currentGroup] = currentGroupCpus;
```
g_groupToCpuCount[currentGroup] is set here for whatever currentGroup is when we get here. What if we filled a group and moved on to the next one in the for loop above (near line 175)?
Do we not need to set g_groupToCpuCount for those groups (and also the affinity mask below)?
Yes, we do. It is missing. Thank you for spotting it.
```cpp
else
{
    // The process has affinity in more than one group, in such case
    // the function needs to return zero in both masks.
```
The comment here says we need to return 0 in both masks, but unless I'm missing something we're not setting systemMask to 0, right?
@adityamandaleeka I've addressed your feedback.