[Arm64] For reflection invoke, use ArgLocDesc#10257
Conversation
c74ab3e to
870d324
Compare
|
Fixes #10107 |
870d324 to
9788f94
Compare
|
@dotnet-bot test Windows_NT Arm64 Checked |
|
/cc @dotnet/arm64-contrib @RussKeldorph |
|
@dotnet-bot test Windows_NT Arm64 Checked |
9788f94 to
906d03e
Compare
|
@dotnet-bot test Windows_NT Arm64 Checked |
|
@jashook #10107 looks like it is a pri1 test to me. I have not been running these yet. Are there a significant number of pri1 fails? Should I be looking at these for Arm64/Unix? The default unix test documentation does not really recommend them on unix. Please advise. |
src/vm/callingconvention.h
Outdated
| m_argLocDescForStructInRegs.m_cFloatReg = cFPRegs; | ||
| m_argLocDescForStructInRegs.m_idxGenReg = m_idxGenReg; | ||
| m_argLocDescForStructInRegs.m_idxFloatReg = m_idxFPReg; | ||
| m_argLocDescForStructInRegs.m_isSinglePrecision = isSinglePrecision; |
There was a problem hiding this comment.
I understand that you have modelled your changes as done for unix_amd64. I don't really like the way this has been implemented for amd64_unix. I dont think above needs to be done here in GetNextOffset(). GetNextOffset() only returns the offset in TransitionBlock. In order to get ArgLocDesc we have another function GetArgLoc(). So first you get the offset and then use that to get ArgLocDesc. This would also eliminate the need for fields m_hasArgLocDescForStructInRegs & m_argLocDescForStructInRegs in ArgIterator which I think is not required.
There was a problem hiding this comment.
@rahku the reason why I have implemented it this way for unix amd64 is that there is no single offset in the TransitionBlock for structs passed in registers on Unix amd64. A struct can be split between upto two floating point and two general purpose registers. In this case, we return a special offset value TransitionBlock::StructInRegsOffset. I have originally refactored the code to let the GetNextOffset return a bitfield that would contain both the floating point registers and general purpose registers offsets, but it turned out to complicate code in many callers of the function and so it was decided during the code review to revert that change and rather use this side storage for these details.
But unless ARM64 can also split a single argument between multiple general purpose and floating point registers, I agree with you that we should use a single offset like we do e.g. for AMD64 on Windows.
src/vm/argdestination.h
Outdated
| } | ||
|
|
||
| #if defined(UNIX_AMD64_ABI) && defined(FEATURE_UNIX_AMD64_STRUCT_PASSING) | ||
| #if defined(UNIX_AMD64_ABI) && defined(FEATURE_UNIX_AMD64_STRUCT_PASSING) || defined(_TARGET_ARM64_) |
There was a problem hiding this comment.
changing above to
#if (defined(UNIX_AMD64_ABI) && defined(FEATURE_UNIX_AMD64_STRUCT_PASSING)) || defined(TARGET_ARM64)
will make it clear what the real conditions are
| #if defined(_TARGET_AMD64_) && defined(UNIX_AMD64_ABI) && defined(FEATURE_UNIX_AMD64_STRUCT_PASSING) | ||
| #if defined(_TARGET_AMD64_) && defined(UNIX_AMD64_ABI) && defined(FEATURE_UNIX_AMD64_STRUCT_PASSING) || defined(_TARGET_ARM64_) | ||
| ArgLocDesc m_argLocDescForStructInRegs; | ||
| bool m_hasArgLocDescForStructInRegs; |
There was a problem hiding this comment.
I don't think this is a good design. I would not do it for arm64. I have explained details below.
|
Thanks for debugging and fixing this. |
906d03e to
c75bfa3
Compare
|
@dotnet-bot test Windows_NT Arm64 Checked |
|
@sdmaclea for correctness testing we should be trying to run as many tests as we can, both Pri1 and Pri0. |
src/vm/argdestination.h
Outdated
| #endif // (UNIX_AMD64_ABI && FEATURE_UNIX_AMD64_STRUCT_PASSING) || _TARGET_ARM64_ | ||
|
|
||
| #if defined(UNIX_AMD64_ABI) && defined(FEATURE_UNIX_AMD64_STRUCT_PASSING) | ||
|
|
There was a problem hiding this comment.
shouldn't function "void ZeroStructInRegisters(int fieldBytes)" also be moved out of ifdef arm64
|
LGTM otherwise |
src/vm/reflectioninvocation.cpp
Outdated
| } | ||
| #endif | ||
|
|
||
| #ifdef _TARGET_ARM64_ |
There was a problem hiding this comment.
Why is this different for ARM64? It seems you can make it the same and then you would not need the change in the argdestination.h in IsStructPassedInRegs.
There was a problem hiding this comment.
Are you suggesting to add fields m_hasArgLocDescForStructInRegs & m_argLocDescForStructInRegs in ArgIterator for arm64 as well? These are not required for any other architecture except unix amd64. It is possible on arm64 to get ArgLocDesc with just the offset address. This is how it is for other architectures as well you first get the offset and then get ArgLocDesc. As you said that on amd64 unix that is not possible so I would let the special case be just for amd64 unix and let the logic for arm64 be same as other arch.
There was a problem hiding this comment.
No, I was suggesting that to keep the code uniform for all platforms. All platforms now have ArgIterator::GetArgLocDescForStructInRegs and it just always returns NULL on non-amd64. So it seems strange to deviate from that for ARM64.
src/vm/argdestination.h
Outdated
| { | ||
| LIMITED_METHOD_CONTRACT; | ||
| #if defined(_TARGET_ARM64_) | ||
| return m_argLocDescForStructInRegs != NULL; |
There was a problem hiding this comment.
I don't understand why we need to handle this differently on ARM64 and why NULL m_argLocDescForStructInRegs cannot indicate that there is no struct passed in regs.
There was a problem hiding this comment.
for amd64 we seem to always be setting m_argLocDescForStructInRegs for all arguments irrespective of whether the offset corresponds to struct_in_regs....which i don't think is right. You could refactor amd64 unix code to not do that and then this can be made same for both arch.
There was a problem hiding this comment.
That's not the case. It is set only for structs in registers, for others, it is NULL. The ArgDestination::m_argLocDescForStructInRegs is set in the constructor and we pass ArgIterator::GetArgLocDescForStructInRegs result to the constructor. And the ArgIterator::GetArgLocDescForStructInRegs looks like this:
ArgLocDesc* GetArgLocDescForStructInRegs()
{
#if defined(UNIX_AMD64_ABI) && defined(FEATURE_UNIX_AMD64_STRUCT_PASSING)
return m_hasArgLocDescForStructInRegs ? &m_argLocDescForStructInRegs : NULL;
#else
return NULL;
#endif
}So it only returns non-NULL if the argument is a struct passed in registers.
There was a problem hiding this comment.
by "you could" i meant "one would need to "
There was a problem hiding this comment.
I'm not sure I understand your comment. I've just shown that we are not setting m_argLocDescForStructInRegs for all arguments.
There was a problem hiding this comment.
never mind regarding the above comment. Yes I now see that you re-initialize m_hasArgLocDescForStructInRegs to be false at the start of every call to ArgIterator::NextOffset(). So yes this function can be made to
return m_argLocDescForStructInRegs != NULL;
for all applicable arch (amd64 & arm64)
src/vm/argdestination.h
Outdated
| if (typeFloat) | ||
| { | ||
| // Zero out memory before copying. | ||
| memset(dest, 0, floatRegCount * sizeof(double)); |
There was a problem hiding this comment.
Why do we zero the memory when we use 64 bit stores in the loop below that overwrite all the zeroed bytes anyways?
There was a problem hiding this comment.
This was mostly used when debugging, thank you for catching it. Will remove.
872d225 to
efeb173
Compare
|
@dotnet-bot test Windows_NT Arm64 Checked |
|
@jashook as discussed can you decouple this from FEATURE_UNIX_AMD64_STRUCT_PASSING so that it is evident that this only for HFA's. We can look at merging with FEATURE_UNIX_AMD64_STRUCT_PASSING later. |
efeb173 to
d4245d1
Compare
|
@rahku ptal the change has been done as discussed. |
|
@dotnet-bot test Windows_NT Arm64 Checked |
8e2af8e to
06e2dcb
Compare
src/vm/reflectioninvocation.cpp
Outdated
| #endif | ||
|
|
||
| #ifdef _TARGET_ARM64_ | ||
| ArgLocDesc argLocDescForStructInRegs; |
There was a problem hiding this comment.
I still wonder why we have to differ for ARM64 here. Based on the IsHFA function above, the ArgDestination should have the m_argLocDescForStructInRegs set to NULL for non-HFA args. So if you used the original code at this place and just made sure that m_hasArgLocDescForStructInRegs in the ArgIterator is set only for HFA args, it seems it should work.
There was a problem hiding this comment.
Yes and that seems like unification with unix amd64 which we decided to do later and not as part of this work.
There was a problem hiding this comment.
@jashook I am sorry for the delay, I was OOF for the last four days. Looking at the change again, I am not sure I understand one thing related to my last comment here. It seems that IsHFA now always returns true, since you always pass the ArgDestination constructor a non-null argLocDescForStructInRegs argument. Is that correct?
|
@dotnet-bot test Windows_NT Arm64 Checked |
1 similar comment
|
@dotnet-bot test Windows_NT Arm64 Checked |
|
@dotnet-bot test Ubuntu16.04 arm Cross Debug Build |
|
@dotnet-bot test Ubuntu arm Cross Release Build |
06e2dcb to
3674e1a
Compare
|
@dotnet-bot test Windows_NT Arm64 Checked |
|
Looks good |
|
ping @janvorli |
1 similar comment
|
ping @janvorli |
3674e1a to
a7e677d
Compare
|
@janvorli @rahku @gkhanna79 ptal. I have modified arm64 to use m_argLocDescForStructInRegs based on @janvorli and @gkhanna79's suggestions. /cc @RussKeldorph |
|
@dotnet-bot test Windows_NT Arm64 Checked |
src/vm/callingconvention.h
Outdated
| int cbArg = StackElemSize(argSize); | ||
| int cArgSlots = cbArg / STACK_ELEM_SIZE; | ||
|
|
||
| auto setupArgLocDesc = [argType, &thValueType, this](int argOfs) |
There was a problem hiding this comment.
It seems that you can do even this the way it is done for amd64. Instead of having this lambda and calling it from three code paths below, it seems you can just add code to initialize the m_argLocDescForStructInRegs inside the if (thValueType.IsHFA()) under the case ELEMENT_TYPE_VALUETYPE above.
At that place, instead of translating the just generated offset back to the specific m_idxXXXReg, you would just call m_argLocDescForStructInRegs.Init() and set the m_argLocDescForStructInRegs.m_idxFloatReg to m_idxFPReg and m_argLocDescForStructInRegs.m_cFloatReg to cFPRegs. Since the lambda below sets the m_argLocDescForStructInRegs for HFA only, it should get the equivalent result.
It looks the change would also be simpler. I hope I have not missed something.
a7e677d to
9ccbaf7
Compare
|
@dotnet-bot test Windows_NT Arm64 Checked |
|
Thank you for the review @janvorli |
|
Thanks for getting this through |
|
@jashook the Windows ARM64 build has failed in the passing struct with floats. Looking at the change again, it seems that you are missing settings m_argLocDescForStructInRegs.m_isSinglePrecision at the place where you set the m_argLocDescForStructInRegs.m_cFloatReg and m_argLocDescForStructInRegs.m_idxFloatReg. 17:28:30 Error: passing struct with floats via reflection. Callee received [1.2, 5.6, 0, 0] instead of [1.2, 3.4, 5.6, 7.8] |
For arm64 if the struct needs to be passed in registers, then pass that information along when we do the reflection invoke. Note this adds more test cases for passing different struct types via reflection. I am guessing similar work would need to be done for Arm32.
9ccbaf7 to
aa0eee4
Compare
| m_argLocDescForStructInRegs.m_cFloatReg = cFPRegs; | ||
| m_argLocDescForStructInRegs.m_idxFloatReg = m_idxFPReg; | ||
|
|
||
| m_argLocDescForStructInRegs.m_isSinglePrecision = isFloatType; |
There was a problem hiding this comment.
The type information was missing.
|
@dotnet-bot test Windows_NT Arm64 Checked. I have already run testing on these changes here; however, to be pedantic I will run them one last time before checking in. |
|
Test failure is unrelated and requires a test drop update. |
For arm64 if the struct needs to be passed in registers, then pass
that information along when we do the reflection invoke.
Note this adds more test cases for passing different struct types via reflection.
I am guessing similar work would need to be done for Arm32.