Fix division by zero on AIX with fewer logProcs than physProcs #2229
Conversation
Hi @dbwiddis. Thank you for looking into this. The case is closed, but I would like to give my view on physical processors, so I am writing my thoughts here. I have no experience with Raptor or other Power-based systems, but let me try to explain how this works on IBM Power systems. I might not be 100% accurate on this - feel free to correct me if I have gotten something wrong :)

Apart from very old systems and a few special models, all Power servers ship with the PowerVM hypervisor in firmware. There is no way to run “bare-metal” - the hypervisor is always present. On top of PowerVM you install any number of AIX, Linux or IBM i logical partitions (LPARs, or what everyone else calls a VM or guest).

Power servers exist in configurations with 1, 2, 3 or 4 processor sockets, and up to 16 processor sockets in a single system. Each socket is populated with a Power processor (the latest generation is Power10), which can have from 4 to 16 processor cores. So a very small system could have 1 socket with 4 Power cores, and the largest could have 16 sockets with 240 cores.

Each Power core is able to execute independent hardware threads, similar to Hyper-Threading (SMT-2) on Intel. The Power cores can run in SMT-2, SMT-4 or SMT-8 mode, independently for each running LPAR/VM. Each of these hardware threads shows up as a logical processor / vCPU in the LPAR/VM. So the large system with 240 Power10 cores would be able to provide up to 1920 logical processors (in SMT-8), assigned to any number of LPARs/VMs.

To my understanding, the number of physical processors (or cores) is not really relevant for any LPAR/VM, and I don’t think oshi would have to take it into account. It is true that you can see this number from within an LPAR with the ‘lparstat’ command (and possibly in other ways). If you could run bare-metal, then ‘logical processors’ > cores > sockets.
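The core and thread arithmetic described above can be sketched in a few lines. This is only an illustration of the multiplication, not code from oshi or PowerVM: the function name `logical_processors` is made up for the example, and allowing a single-thread mode (SMT-1) next to the SMT-2/4/8 modes mentioned above is an assumption.

```python
# SMT modes named in the comment above, plus single-thread mode (1),
# which is an assumption added for this sketch.
VALID_SMT_MODES = (1, 2, 4, 8)


def logical_processors(cores: int, smt: int) -> int:
    """Return the number of hardware threads (logical processors / vCPUs)
    that `cores` cores expose when running in SMT-`smt` mode."""
    if cores < 0:
        raise ValueError("cores must not be negative")
    if smt not in VALID_SMT_MODES:
        raise ValueError(f"unsupported SMT mode: {smt}")
    return cores * smt


# Largest system from the comment: 240 cores in SMT-8.
print(logical_processors(240, 8))  # 1920
# Smallest system from the comment: 4 cores, here in SMT-2.
print(logical_processors(4, 2))    # 8
```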
Metrics on the physical system, I/O adapters and virtualisation are provided through the HMC (Hardware Management Console), which is where you can look at the overall system allocation and resources.

When you configure an LPAR you can specify its processor mode as ‘dedicated’, in which case you state a minimum, allocated and maximum number of cores assigned to this specific LPAR. You cannot over-subscribe the number of allocated processors. The min and max values are the range within which you can dynamically change the allocation without having to reboot the LPAR. So if you allocate 2 cores (in SMT-8), this shows up as 2*8 = 16 logical processors in your AIX or Linux LPAR.

You can also configure an LPAR with the processor mode ‘shared’, and things get a little more complicated here. You still allocate processor cores and have the min and max, which work in the same way, and the cores show up in AIX or Linux as logical processors. The extra dimension is entitlement. As an example, you can allocate 2 cores but only entitle 1.5 cores, which results in 2*8 = 16 logical processors being available (and visible) to AIX or Linux, but only 1.5*8 = 12 of the 16 logical processors being guaranteed (entitled) to the LPAR. If the 0.5 processor that is not entitled is not in use elsewhere, then PowerVM will allow the LPAR to use it. PowerVM looks at a ‘weight’, which you can set for each LPAR, to determine who gets extra resources first in case of constraints.

When you run in this type of setup, you can see from ‘lparstat’ that you might be using 100% of your logical processors but 200% of your entitlement. In this case you could have given the LPAR e.g. 4 cores, but only 2.0 as entitlement. You could also allocate 1 core but only entitle 0.25 cores or even less, which is called micro-partitioning - in this case you could end up using 100% of your entitlement but only 25% of the visible logical processors. There is also a min and max for the entitlement.

/Mark
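The entitlement examples above can be checked with a small sketch. It only reproduces the arithmetic from the comment; the function names are hypothetical, and nothing here reflects how ‘lparstat’ or PowerVM actually compute their figures.

```python
def guaranteed_logical_processors(entitled_cores: float, smt: int) -> float:
    """Logical processors backed by the entitlement (entitled cores * SMT mode)."""
    return entitled_cores * smt


def utilisation(cores_used: float, allocated_cores: float, entitled_cores: float):
    """Return (percent of allocated/visible capacity, percent of entitlement)."""
    if allocated_cores <= 0 or entitled_cores <= 0:
        raise ValueError("allocated and entitled cores must be positive")
    return (100 * cores_used / allocated_cores,
            100 * cores_used / entitled_cores)


# 2 cores allocated, 1.5 entitled, SMT-8: 16 visible, 12 guaranteed.
print(guaranteed_logical_processors(1.5, 8))  # 12.0

# 4 cores allocated, 2.0 entitled, all 4 cores busy.
print(utilisation(4, 4, 2.0))                 # (100.0, 200.0)

# Micro-partition: 1 core allocated, 0.25 entitled, 0.25 of a core busy.
print(utilisation(0.25, 1, 0.25))             # (25.0, 100.0)
```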
Fixes #2228
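The diff itself is not shown in this conversation, so the following is only a guess at the class of bug named in the title. It assumes the failure comes from integer-dividing the logical processor count by the physical processor count, getting 0 when there are fewer logical than physical processors, and then using that 0 as a divisor. The function names and the clamp-to-1 guard are hypothetical and are not taken from the oshi source.

```python
def logical_per_physical(log_procs: int, phys_procs: int) -> int:
    """Logical processors per physical processor, clamped to at least 1
    so that the result is always safe to divide by (hypothetical guard)."""
    if phys_procs <= 0:
        return 1
    return max(1, log_procs // phys_procs)


def core_index(logical_index: int, log_procs: int, phys_procs: int) -> int:
    """Map a logical processor index to a core index."""
    return logical_index // logical_per_physical(log_procs, phys_procs)


# An LPAR that sees 2 logical processors on a machine with 4 physical ones:
# the unguarded ratio rounds down to 0 and would raise ZeroDivisionError
# if used as a divisor.
print(2 // 4)                    # 0
print(core_index(1, 2, 4))       # 1

# The usual case: 16 logical processors on 2 cores (SMT-8).
print(core_index(9, 16, 2))      # 1
```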