Skip to content

Fixed low-high base Power Table address handling. Added 0x1Ah CPUs #33

Merged
namazso merged 3 commits into
namazso:mainfrom
Erruar:main
Oct 15, 2025
Merged

Fixed low-high base Power Table address handling. Added 0x1Ah CPUs #33
namazso merged 3 commits into
namazso:mainfrom
Erruar:main

Conversation

@Erruar

@Erruar Erruar commented Oct 8, 2025

Copy link
Copy Markdown
Contributor

#30

@namazso

namazso commented Oct 8, 2025

Copy link
Copy Markdown
Owner

I'm not sure I understand this. What's the point of querying the high half if it's always discarded?

@Erruar

Erruar commented Oct 8, 2025

Copy link
Copy Markdown
Contributor Author

I'm not sure I understand this. What's the point of querying the high half if it's always discarded?

On low-high base address systems its in low (first 32) and high (second 32 bits) format.
Screenshot_20251008-222622

@namazso

namazso commented Oct 8, 2025

Copy link
Copy Markdown
Owner

Oh, you mean it's not low and high part, but begin and end address?

@Erruar

Erruar commented Oct 8, 2025

Copy link
Copy Markdown
Contributor Author

Not exactly. To be precise, we don't really know what the upper part of the address is for. And the bottom part is actually the beginning of the table. That's what we need

@irusanov

irusanov commented Oct 8, 2025

Copy link
Copy Markdown
Contributor

It's basically 2 addresses, the "hi" part is the second base address and I believe it might be some extra metrics for the GPU, however I haven't really done anything with that second address and I believe noone does. The code definitely needs to map the low 32bits only, at least for now. If we need the second base we have to handle it separately or map the whole range for both tables at once. I don't really know if the second address is always exactly after the end of first table. Presumably it is not, thus they provide both addresses.

It might be also some legacy behavior for an idea that they had and then abandoned.

@namazso

namazso commented Oct 8, 2025

Copy link
Copy Markdown
Owner

Oh ok, just noticed that the +620 on it sure does look close to the recognized size of 6AC. Anyway, I'd prefer if masking was done on the resolution side, the function is called "get_pm_table_base" and not "get_pm_table_base_in_low_and_garbage_in_the_high_dword". The way it's done here will inevitably surprise someone if AMD ever decides to move the table to some 64 bit address in a future cpu and the function gets naively updated with a new "case".

@irusanov

irusanov commented Oct 8, 2025

Copy link
Copy Markdown
Contributor

The masking is done in ioctl_read_pm_table, isn't that the correct place? Otherwise it is trying to map a huge address (the long value). You are right about the potential 64-bit address, but as it is right now we have to mask the value to map the correct address, maybe conditionally?

PS: I'm not entirely sure if there aren't such cases with 64bit addresses already, Iso maybe the right thing to do is to use condition, based on the CPU "class". I believe my old code used a 64-bit address for desktop and 32-bit for older APUs.

@Erruar

Erruar commented Oct 8, 2025

Copy link
Copy Markdown
Contributor Author

@namazso review the new commit. Made it to suit your requirements. Since we still don't know what the high address is for, I removed it

@irusanov

irusanov commented Oct 8, 2025

Copy link
Copy Markdown
Contributor

here's my equivalent old code

        private float[] ReadTableFromMemory(int tableSizeInBytes)
        {
            float[] table = new float[tableSizeInBytes / 4];

            if (Utils.Is64Bit)
            {
                IntPtr dramBaseAddress = smu.SMU_TYPE >= SMU.SmuType.TYPE_CPU4 && smu.SMU_TYPE < SMU.SmuType.TYPE_CPU9 || smu.SMU_TYPE == SMU.SmuType.TYPE_APU2
                        ? new IntPtr((long)DramBaseAddressHi << 32 | DramBaseAddressLo)
                        : new IntPtr(DramBaseAddressLo);

                byte[] bytes = io.ReadMemory(dramBaseAddress, tableSizeInBytes);

                if (bytes != null && bytes.Length > 0)
                {
                    Buffer.BlockCopy(bytes, 0, table, 0, bytes.Length);
                }
                else
                {
                    Console.WriteLine("Error: ReadMemory returned null or empty byte array.");
                }
            }
            else
            {
                try
                {
                    for (int i = 0; i < table.Length; ++i)
                    {
                        int offset = i * sizeof(float);
                        if (io.GetPhysLong((UIntPtr)(DramBaseAddress + offset), out uint data))
                        {
                            byte[] bytes = BitConverter.GetBytes(data);
                            Buffer.BlockCopy(bytes, 0, table, offset, bytes.Length);
                        }
                        else
                        {
                            Console.WriteLine($"Error: GetPhysLong failed at offset {offset}.");
                        }
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine("Error occurred while reading table: " + ex.Message);
                }
            }

            return table;
        }

So this change would probably work for RavenRidge or other APU from Zen and Zen+ generations, but but might not for newer ones and desktop/server.

On my 6800HS, win11, uefi mode
00000003 FE300000

that would clamp the address and won't work.

@namazso

namazso commented Oct 8, 2025

Copy link
Copy Markdown
Owner

@Erruar Thanks! I noticed the original masked class 1 as well, but the updated doesn't. Does class 1 not need the masking, or we don't know and better leave it alone?

@irusanov

The masking is done in ioctl_read_pm_table, isn't that the correct place?

No, not really. If g_table_base contained, well, the base, you wouldn't need any masking. The problem is it's not true its name, it maybe contains garbage in the upper half. That is, until AMD makes a new CPU, someone adds a new case, tries putting in a large physical address, and gets surprised.

Otherwise it is trying to map a huge address (the long value).

Yeah, physical addresses can be up to 52 bits, so that's correct. The problem is that g_table_base did not contain the table base, not that physical addresses cannot possibly be longer than 32 bits.

@irusanov

irusanov commented Oct 8, 2025

Copy link
Copy Markdown
Contributor

@Erruar, @namazso check my comment above, we need to mask it for specific cpu class only, otherwise it will break the rest of the CPUs

@Erruar

Erruar commented Oct 8, 2025

Copy link
Copy Markdown
Contributor Author

@namazso class 1 should be fine. As @irusanov said class 1 (newer CPUs) have 64 bit addresses and it's okay to have this bit shift with high low. On older CPUs it works incorrectly. I fixed it and now its good

@irusanov

irusanov commented Oct 8, 2025

Copy link
Copy Markdown
Contributor

I can try to modify the code to roughly match my old implementation as it proven to work and has been tested by many users with ZenTimings. I just need to extract my cpu "classes" and match to the 3 classes this module has. Where should I add the condition - in the same ioctl for reading the table? My old code assumes any newer cpu/apu would use 64bit and might add that as the default case.

What do you think?

Currently trying to test my modules with all the cpus I have - 1800X, 1600AF, 240GE, 3600, 3900X, 5300g, 5600X, 7950X, epyc 4124p, 9600X, 9900X, 9950X3D,

@namazso

namazso commented Oct 8, 2025

Copy link
Copy Markdown
Owner

IMO the masking definitely goes in get_pm_table_base as it's part of resolving the base. It'd be wrong to have a function named "get_pm_table_base" that doesn't, in fact, get the PM table's base. We should probably just split 3 up to two different classes since it's quite obviously different behavior on the hardware side. Most of this code was ported directly from LHM, so I'm not sure about the rationale behind the original code.

@irusanov

irusanov commented Oct 9, 2025

Copy link
Copy Markdown
Contributor

IMO all current codenames in class 3 have the same behaviour - they have main base address and a secondary one. They can be retreived by sending the same command, but with different args. If we're going to split, then there should be a second base address (table_base_alt/table_base_secondary). Then we should also map the secondary address and read second table? Then maybe combine them? The thing is I don't know if they are always consecutive bytes in memory or the secondary may be shifted and have empty bytes betwen both tables.

The reutnr value in class 3 is definitely different type than the other bitshift to get 64bit address in class 1.

Clients like LHM use the base size only. I think it is a mistake on their side to return it as a 64bit value (if that is the original code).

I think we can leave the low address only and use that to read the table. If the client application requests the full size, then both tables will be read as one array of data (if the secondary is always right "after" the main one, otherwise we have to concat them in the module and hardcode the sizes for these codenames).

It seems RavenRidge derivates don't have table versions with different sizes, but the main table is always 0x608, and the secondary one is 0xA4.
Source: https://github.com/amkillam/ryzen_smu/blob/172c316f53ac8f066afd7cb9e1da517084273368/smu.c#L1026

However, looking at current RM, only first one is used and the buffer size is 0x608. Maybe things changed with newer versions or I'm reading it wrong. The particular code in ryzen_smu is really old, from back in the day when we started to figure out the commands. That is several years ago.

PS: The other options might be to provide a ioctl to read second table if the address is resolved (only in class 3). Would return "not supported" for the rest.

@Erruar

Erruar commented Oct 14, 2025

Copy link
Copy Markdown
Contributor Author

@namazso I tested my changes on various processors, and came to the conclusion that everything works as intended. All classes give the correct values for their code names. I think it's safe to make changes to your project now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants