Fixed low-high base Power Table address handling. Added 0x1Ah CPUs #33
Conversation
|
I'm not sure I understand this. What's the point of querying the high half if it's always discarded? |
|
Oh, you mean it's not low and high part, but begin and end address? |
|
Not exactly. To be precise, we don't really know what the upper part of the address is for. And the bottom part is actually the beginning of the table. That's what we need |
|
It's basically 2 addresses, the "hi" part is the second base address and I believe it might be some extra metrics for the GPU, however I haven't really done anything with that second address and I believe noone does. The code definitely needs to map the low 32bits only, at least for now. If we need the second base we have to handle it separately or map the whole range for both tables at once. I don't really know if the second address is always exactly after the end of first table. Presumably it is not, thus they provide both addresses. It might be also some legacy behavior for an idea that they had and then abandoned. |
|
Oh ok, just noticed that the +620 on it sure does look close to the recognized size of 6AC. Anyway, I'd prefer if masking was done on the resolution side, the function is called "get_pm_table_base" and not "get_pm_table_base_in_low_and_garbage_in_the_high_dword". The way it's done here will inevitably surprise someone if AMD ever decides to move the table to some 64 bit address in a future cpu and the function gets naively updated with a new "case". |
|
The masking is done in ioctl_read_pm_table, isn't that the correct place? Otherwise it is trying to map a huge address (the long value). You are right about the potential 64-bit address, but as it is right now we have to mask the value to map the correct address, maybe conditionally? PS: I'm not entirely sure if there aren't such cases with 64bit addresses already, Iso maybe the right thing to do is to use condition, based on the CPU "class". I believe my old code used a 64-bit address for desktop and 32-bit for older APUs. |
|
@namazso review the new commit. Made it to suit your requirements. Since we still don't know what the high address is for, I removed it |
|
here's my equivalent old code private float[] ReadTableFromMemory(int tableSizeInBytes)
{
float[] table = new float[tableSizeInBytes / 4];
if (Utils.Is64Bit)
{
IntPtr dramBaseAddress = smu.SMU_TYPE >= SMU.SmuType.TYPE_CPU4 && smu.SMU_TYPE < SMU.SmuType.TYPE_CPU9 || smu.SMU_TYPE == SMU.SmuType.TYPE_APU2
? new IntPtr((long)DramBaseAddressHi << 32 | DramBaseAddressLo)
: new IntPtr(DramBaseAddressLo);
byte[] bytes = io.ReadMemory(dramBaseAddress, tableSizeInBytes);
if (bytes != null && bytes.Length > 0)
{
Buffer.BlockCopy(bytes, 0, table, 0, bytes.Length);
}
else
{
Console.WriteLine("Error: ReadMemory returned null or empty byte array.");
}
}
else
{
try
{
for (int i = 0; i < table.Length; ++i)
{
int offset = i * sizeof(float);
if (io.GetPhysLong((UIntPtr)(DramBaseAddress + offset), out uint data))
{
byte[] bytes = BitConverter.GetBytes(data);
Buffer.BlockCopy(bytes, 0, table, offset, bytes.Length);
}
else
{
Console.WriteLine($"Error: GetPhysLong failed at offset {offset}.");
}
}
}
catch (Exception ex)
{
Console.WriteLine("Error occurred while reading table: " + ex.Message);
}
}
return table;
}So this change would probably work for RavenRidge or other APU from Zen and Zen+ generations, but but might not for newer ones and desktop/server. On my 6800HS, win11, uefi mode that would clamp the address and won't work. |
|
@Erruar Thanks! I noticed the original masked class 1 as well, but the updated doesn't. Does class 1 not need the masking, or we don't know and better leave it alone?
No, not really. If g_table_base contained, well, the base, you wouldn't need any masking. The problem is it's not true its name, it maybe contains garbage in the upper half. That is, until AMD makes a new CPU, someone adds a new
Yeah, physical addresses can be up to 52 bits, so that's correct. The problem is that g_table_base did not contain the table base, not that physical addresses cannot possibly be longer than 32 bits. |
|
I can try to modify the code to roughly match my old implementation as it proven to work and has been tested by many users with ZenTimings. I just need to extract my cpu "classes" and match to the 3 classes this module has. Where should I add the condition - in the same ioctl for reading the table? My old code assumes any newer cpu/apu would use 64bit and might add that as the default case. What do you think? Currently trying to test my modules with all the cpus I have - 1800X, 1600AF, 240GE, 3600, 3900X, 5300g, 5600X, 7950X, epyc 4124p, 9600X, 9900X, 9950X3D, |
|
IMO the masking definitely goes in get_pm_table_base as it's part of resolving the base. It'd be wrong to have a function named "get_pm_table_base" that doesn't, in fact, get the PM table's base. We should probably just split |
|
IMO all current codenames in class 3 have the same behaviour - they have main base address and a secondary one. They can be retreived by sending the same command, but with different args. If we're going to split, then there should be a second base address (table_base_alt/table_base_secondary). Then we should also map the secondary address and read second table? Then maybe combine them? The thing is I don't know if they are always consecutive bytes in memory or the secondary may be shifted and have empty bytes betwen both tables. The reutnr value in class 3 is definitely different type than the other bitshift to get 64bit address in class 1. Clients like LHM use the base size only. I think it is a mistake on their side to return it as a 64bit value (if that is the original code). I think we can leave the low address only and use that to read the table. If the client application requests the full size, then both tables will be read as one array of data (if the secondary is always right "after" the main one, otherwise we have to concat them in the module and hardcode the sizes for these codenames). It seems RavenRidge derivates don't have table versions with different sizes, but the main table is always 0x608, and the secondary one is 0xA4. However, looking at current RM, only first one is used and the buffer size is 0x608. Maybe things changed with newer versions or I'm reading it wrong. The particular code in ryzen_smu is really old, from back in the day when we started to figure out the commands. That is several years ago. PS: The other options might be to provide a ioctl to read second table if the address is resolved (only in class 3). Would return "not supported" for the rest. |
|
@namazso I tested my changes on various processors, and came to the conclusion that everything works as intended. All classes give the correct values for their code names. I think it's safe to make changes to your project now |

#30