Skip to content

Fix compilation on some ARM architectures.#22080

Merged
opencv-pushbot merged 1 commit intoopencv:3.4from
vrabaud:fix_arm
Jun 13, 2022
Merged

Fix compilation on some ARM architectures.#22080
opencv-pushbot merged 1 commit intoopencv:3.4from
vrabaud:fix_arm

Conversation

@vrabaud
Copy link
Copy Markdown
Contributor

@vrabaud vrabaud commented Jun 8, 2022

This condition is the same as the line above.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch

have[CV_CPU_NEON] = true;
#endif
#if (defined __ARM_FP && (((__ARM_FP & 0x2) != 0) && defined __ARM_NEON__))
#if (defined __ARM_FP && (((__ARM_FP & 0x2) != 0) && (defined __ARM_NEON__ || (defined __ARM_NEON && defined __aarch64__))))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If condition is reused then it makes sense to make setting of this flag as nested code.

#if (defined __ARM_NEON__ || (defined __ARM_NEON && defined __aarch64__))
    have[CV_CPU_NEON] = true;
    #if (defined __ARM_FP && ((__ARM_FP & 0x2) != 0)
    have[CV_CPU_FP16] = true;
    #endif
#endif

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things are getting complicated for Armv7/Armv8 and build/run time.
__ARM_NEON__ and __ARM_NEON is defined in Armv7, but only __ARM_NEON is defined in Arm8
Moreover, what the predefined macro express is if the compiler is ready for NEON/FP16 instructions and intended to enable them, when the have array express is if they are available on actual device at run time.

It's much more easier to separate the block in Armv7 part and Armv8 part like following.

#elif (defined __clang__ || defined __APPLE__)
#if defined __arm__
    // Armv7
    // Check if NEON/FP16 instructions are provided on the processor (not by compiler)
    // i.e. something equivalent to check at line 563 or 581
    have[CV_CPU_NEON] = some_flags & some_check1;
    have[CV_CPU_FP16] = some_flags & some_check2;
#elif defined __aarch64__
    // Armv8
    have[CV_CPU_NEON] = true;
    have[CV_CPU_FP16] = true;
    // It's guaranteed that NEON/FP16 are supported on Armv8 processor
    // So this flag can be decided at compile time.
#endif
#endif

Now, there are places to argue, since this section is clearly for iOS.
Since the variety of processor is limited and relatively few, you may go and check EVERY iPhone/iPad processor ever made and if EVERY processor supports NEON/FP16, it's safe to enable the flag at compile time.

This part is quite a mess, and you are trying to place another rock on an unstable piled stone tower.
If you are trying to modify this part, I suggest to totally re-write this (for iOS) part.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition looks strange: https://github.com/opencv/opencv/blame/4.6.0/modules/core/src/system.cpp#L607

#elif (defined __clang__ || defined __APPLE__)

Perhaps modern Apple platforms are already NEON/FP16 capable.


No idea how __clang__ check could help with detection.
Code has been added here (8+ years): https://github.com/opencv/opencv/blame/bdb088dccab11c9031e92414e02a0964ff614562/modules/core/src/system.cpp#L340


@vrabaud Do you observe issues with Apple? Or with other ARM platforms?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition looks strange.

Yes, that's true.

Perhaps modern Apple platforms are already NEON/FP16 capable.

I've done a quick search and the last "Armv7 iPhone" is iPhone5C, which is released on 2013
ref: https://en.wikipedia.org/wiki/IPhone_5C

First released September 20, 2013
CPU 1.3 GHz dual core 32-bit ARMv7-A "Swift"

Probably it's safe to drop "32bit Arm iPhone" support, but I vote to keep the support situation, since it's 3.4 branch, and drop it in "next" branch

No idea how clang check could help with detection.

It's not helping, probably it's better to refactor this part.
Also, the build for latest apple M1 & M2 arrives here, so it's a quite big surgery to refactor this part.

So, I feel the same question to ask

@vrabaud Do you observe issues with Apple? Or with other ARM platforms?

If @vrabaud couldl describe more in detail about the issue you are facing, that would help to decide which way to go.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, I think I found the correct API to call on iOS
API sysctlbyname
Detail of API
Sample on SO

According to the sample code

#if defined __arm__
    // Armv7
    size_t size;
    int neon;
    int half;
    size = sizeof(neon);
    sysctlbyname("hw.optional.AdvSIMD", &neon, &size, NULL, 0);
    sysctlbyname("hw.optional.AdvSIMD_HPFPCvt", &half, &size, NULL, 0);
    have[CV_CPU_NEON] = neon == 1;
    have[CV_CPU_FP16] = half == 1;
#elif defined __aarch64__
    // Armv8
    have[CV_CPU_NEON] = true;
    have[CV_CPU_FP16] = true;
#endif

But still, covering with __clang__ && __APPLE__ might not work due to Mac OS on Apple M1&M2 CPU, which makes things more complicated....

Copy link
Copy Markdown
Contributor Author

@vrabaud vrabaud Jun 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, my initial issue is that I have an internal ARM toolchain that defines __ARM_NEON but not __ARM_NEON__ (which is legacy according to https://gcc.gnu.org/legacy-ml/gcc-patches/2015-01/msg02717.html). My problem has nothing to do with iOS.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, thank you
If possible, could you describe more about your platform please?
Probably is better to create another "if" section for your platform rather than populating this iOS part

Any specific macro to hook your platform?
Is it Armv8 or Armv7?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I unfortunately have no idea, sorry, this is all internal. Could a fix then be the following where we leave __APPLE__ untouched but just modify __clang__ ?

    #elif (defined __APPLE__)
    #if (defined __ARM_NEON__ || (defined __ARM_NEON && defined __aarch64__))
        have[CV_CPU_NEON] = true;
    #endif
    #if (defined __ARM_FP  && (((__ARM_FP & 0x2) != 0) && defined __ARM_NEON__))
        have[CV_CPU_FP16] = true;
    #endif
    #elif (defined __clang__)
    #if (defined __ARM_NEON__ || (defined __ARM_NEON && defined __aarch64__))
        have[CV_CPU_NEON] = true;
        #if (defined __ARM_FP  && ((__ARM_FP & 0x2) != 0))
        have[CV_CPU_FP16] = true;
        #endif
    #endif
    #endif

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a right to decide, but I vote for making another ifdef at this part.
No one knows what side effect could happen when we modify the iOS part, so better to keep it as-is for now.

Doesn't you compiler accepts -dM -E option? which should dump hundreds of macros.
If you have anything specific and unique, that will help the affect to be minimum.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed the updated fix.

This condition is the same as the line above.
Copy link
Copy Markdown
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@tomoaki0705 Thank you for the review!

@opencv-pushbot opencv-pushbot merged commit cac8641 into opencv:3.4 Jun 13, 2022
@vrabaud vrabaud deleted the fix_arm branch June 14, 2022 11:37
@alalek alalek mentioned this pull request Jun 16, 2022
@alalek alalek mentioned this pull request Aug 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: build/install platform: arm ARM boards related issues: RPi, NVIDIA TK/TX, etc platform: ios/osx

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants