Features/expand microarch for aarch64 by t-karatsu · Pull Request #13780 · spack/spack

t-karatsu · 2019-11-19T05:06:40Z

Add microarchitecture branching from aarch64
  Add the thunderx2 and a64fx microarchitectures.
Add a process to determine the aarch64 microarchitecture
  The fields reported in /proc/cpuinfo differ between aarch64 and x86_64.
  On aarch64, feature information is read from the Features field of /proc/cpuinfo.
  On aarch64, vendor information is read from the CPU implementer field of /proc/cpuinfo.
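As a rough illustration of the field difference described above, the following sketch reads the first processor entry of a /proc/cpuinfo-style text and picks the vendor and feature fields under either naming. It is not the code in this pull request: the function names, the `generic` fallback, and the sample text are assumptions made for the example.

```python
def parse_cpuinfo(text):
    """Return the key/value pairs of the first processor entry."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor entry
            continue
        key, _, value = line.partition(':')
        info[key.strip()] = value.strip()
    return info


def vendor_and_features(info):
    """Pick vendor and features, whichever field names the kernel used."""
    # aarch64 reports 'CPU implementer' and 'Features';
    # x86_64 reports 'vendor_id' and 'flags'.
    # The 'generic' fallback is an assumption of this sketch.
    vendor = info.get('CPU implementer', info.get('vendor_id', 'generic'))
    features = info.get('Features', info.get('flags', ''))
    return vendor, set(features.split())


# Illustrative sample only, not copied from a real machine.
SAMPLE = """\
processor\t: 0
Features\t: fp asimd aes sha2 crc32
CPU implementer\t: 0x43

processor\t: 1
"""

vendor, features = vendor_and_features(parse_cpuinfo(SAMPLE))
print(vendor, sorted(features))
```

Reading only the first entry keeps the sketch short; it assumes all cores report the same implementer, which may not hold on heterogeneous systems.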

* Add process to determine aarch64 microarchitecture

* Add optimize flags for gcc on aarch64 family.

becker33

This all looks good.

Can you add a test to lib/spack/spack/test/llnl/util/cpu.py? That will probably require adding a file to lib/spack/spack/test/data/targets named linux-<OS>-thunderx2 to test the target detection.

The tests in lib/spack/spack/test/llnl/util/cpu.py are mostly already parameterized by microarchitecture, so you shouldn't need to write new test code, just add parameter sets that test the new architectures.
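A minimal sketch of what "adding a parameter set" can look like, assuming the test data file name encodes the expected target as its last dash-separated field. The helper names and the stand-in detector are hypothetical; the real tests in lib/spack/spack/test/llnl/util/cpu.py may be organized differently.

```python
def split_target_filename(name):
    """Split 'linux-<OS>-<target>' into its three fields."""
    platform, operating_system, target = name.split('-', 2)
    return platform, operating_system, target


def check_detection(detect, cases):
    """Run `detect` on every case and collect (name, expected, found) mismatches."""
    failures = []
    for name in cases:
        _, _, expected = split_target_filename(name)
        found = detect(name)
        if found != expected:
            failures.append((name, expected, found))
    return failures


# Adding a new architecture then amounts to adding one entry here
# (plus the matching data file).
CASES = ['linux-centos7-thunderx2']


def fake_detect(name):
    """Stand-in for real detection; it just echoes the expected target."""
    return name.rsplit('-', 1)[-1]


print(check_detection(fake_detect, CASES))
```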

tgamblin · 2019-11-19T19:13:12Z

@NickRF FYI

t-karatsu · 2019-11-20T05:45:42Z

Thanks for your comments. I have added unit tests for this pull request, with linux-centos7-thunderx2 as the test data file. Could you take a look?

alalazo · 2019-11-20T07:05:10Z

lib/spack/llnl/util/cpu/microarchitectures.json

    },
+    "thunderx2": {
+      "from": "aarch64",
+      "vendor": "0x43",


Apologies for arriving late at this. Are we keeping vendor names non human-readable? This will be displayed as 0x43 - aarch64 when people ask for:

$ spack arch --known-targets

Well... the CPU implementer field in /proc/cpuinfo only reports a numeric code as vendor information, so I implemented the vendor as a number. 0x43 is the code for Cavium, the vendor name. (I will double-check 0x46.)

I ran $ spack arch --known-targets and confirmed the following:

0x43 - aarch64 thunderx2
0x46 - aarch64 a64fx

Is there a way to change the vendor name displayed by this command?

Well... the CPU implementer field in /proc/cpuinfo only reports a numeric code as vendor information, so I implemented the vendor as a number.

No worries, I was wondering whether we want to map the code to the name. It probably makes sense to do it when we detect the raw information, in more or less the same way that we map a few instruction sets from Darwin to what we currently have in the JSON file:

spack/lib/spack/llnl/util/cpu/detect.py

Lines 114 to 123 in 66cf530

    if 'sse4.1' in info['flags']:
        info['flags'] += ' sse4_1'
    if 'sse4.2' in info['flags']:
        info['flags'] += ' sse4_2'
    if 'avx1.0' in info['flags']:
        info['flags'] += ' avx'
    if 'clfsopt' in info['flags']:
        info['flags'] += ' clflushopt'
    if 'xsave' in info['flags']:
        info['flags'] += ' xsavec xsaveopt'

@t-karatsu Do you have a recent version of lscpu from util-linux installed on your system? I am wondering if they do this mapping...

Oh, thanks! We will consider implementations of vendor name mapping.

@t-karatsu Do you have a recent version of lscpu from util-linux installed on your system? I am wondering if they do this mapping...

On some of the ThunderX2 systems we have access to, the lscpu command shows the vendor name (Cavium).

Thanks for confirming @t-karatsu! I'll look through their code and see if I can find where they keep the mapping. We might be able to reuse that.

@becker33 @t-karatsu @tgamblin This file can be interesting for us: lscpu-arm.c. The part that is most relevant to this discussion is:

    static const struct hw_impl hw_implementer[] = {
        { 0x41, arm_part, "ARM" },
        { 0x42, brcm_part, "Broadcom" },
        { 0x43, cavium_part, "Cavium" },
        { 0x44, dec_part, "DEC" },
        { 0x48, hisi_part, "HiSilicon" },
        { 0x4e, nvidia_part, "Nvidia" },
        { 0x50, apm_part, "APM" },
        { 0x51, qcom_part, "Qualcomm" },
        { 0x53, samsung_part, "Samsung" },
        { 0x56, marvell_part, "Marvell" },
        { 0x66, faraday_part, "Faraday" },
        { 0x69, intel_part, "Intel" },
        { -1, unknown_part, "unknown" },
    };
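One way the table above could be reused at detection time is a plain dictionary lookup from the raw CPU implementer string to a vendor name. This is only a sketch: the names `ARM_IMPLEMENTERS` and `vendor_name` are made up for illustration, and returning the raw code for unlisted implementers is an assumption rather than a decision taken in this thread.

```python
# Codes and names copied from the lscpu-arm.c table quoted above.
ARM_IMPLEMENTERS = {
    '0x41': 'ARM',
    '0x42': 'Broadcom',
    '0x43': 'Cavium',
    '0x44': 'DEC',
    '0x48': 'HiSilicon',
    '0x4e': 'Nvidia',
    '0x50': 'APM',
    '0x51': 'Qualcomm',
    '0x53': 'Samsung',
    '0x56': 'Marvell',
    '0x66': 'Faraday',
    '0x69': 'Intel',
}


def vendor_name(implementer):
    """Map a raw 'CPU implementer' value to a readable vendor name.

    Codes missing from the table are returned unchanged (assumption),
    so nothing is lost for implementers the table does not list.
    """
    code = implementer.strip().lower()
    return ARM_IMPLEMENTERS.get(code, code)


print(vendor_name('0x43'))  # Cavium
print(vendor_name('0x46'))  # not in the quoted table, so printed as-is
```

Falling back to the raw code keeps targets whose implementer is not in the quoted table, such as the 0x46 mentioned earlier, distinguishable from each other.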

alalazo

A few comments and possibly a couple of typos spotted.

alalazo · 2019-11-20T08:04:06Z

lib/spack/llnl/util/cpu/microarchitectures.json

+        ],
+        "clang": {
+          "versions": ":",
+          "flags": "-march=armv8-a -mcpu=generic"


Was this copied verbatim? Do we need to check the options for clang?

The setting "-march=armv8xx" seems to be correct on aarch64 machines, but the settings for each version are still under scrutiny. Perhaps you can follow this file to see what you can set.
https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Support/AArch64TargetParser.def

I think we can have the same values displayed by:

$ llc -march=aarch64 -mcpu=help

Thanks. I ran llc from LLVM 9.0, and it prints the following:

    [n0013@apollo13 bin]$ ./llc -march=aarch64 -mcpu=help
    Available CPUs for this target:
      apple-latest - Select the apple-latest processor.
      cortex-a35 - Select the cortex-a35 processor.
      cortex-a53 - Select the cortex-a53 processor.
      cortex-a55 - Select the cortex-a55 processor.
      cortex-a57 - Select the cortex-a57 processor.
      cortex-a72 - Select the cortex-a72 processor.
      cortex-a73 - Select the cortex-a73 processor.
      cortex-a75 - Select the cortex-a75 processor.
      cortex-a76 - Select the cortex-a76 processor.
      cortex-a76ae - Select the cortex-a76ae processor.
      cyclone - Select the cyclone processor.
      exynos-m1 - Select the exynos-m1 processor.
      exynos-m2 - Select the exynos-m2 processor.
      exynos-m3 - Select the exynos-m3 processor.
      exynos-m4 - Select the exynos-m4 processor.
      exynos-m5 - Select the exynos-m5 processor.
      falkor - Select the falkor processor.
      generic - Select the generic processor.
      kryo - Select the kryo processor.
      saphira - Select the saphira processor.
      thunderx - Select the thunderx processor.
      thunderx2t99 - Select the thunderx2t99 processor.
      thunderxt81 - Select the thunderxt81 processor.
      thunderxt83 - Select the thunderxt83 processor.
      thunderxt88 - Select the thunderxt88 processor.
      tsv110 - Select the tsv110 processor.
    Available features for this target:
      many features...
    Use +feature to enable a feature, or -feature to disable it.
    For example, llc -mcpu=mycpu -mattr=+feature1,-feature2
    [n0013@apollo13 bin]$

If the corresponding CPU can be specified, it seems better to use the -mcpu option. In this case, I will set -mcpu=thunderx2t99.
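The choice above (use a specific -mcpu value when the toolchain knows the CPU) can be sketched as a small parser over text shaped like the llc help output. The function names, the `generic` fallback, and the shortened sample text are assumptions for illustration; how the help text is captured from llc is left out on purpose.

```python
def available_cpus(help_text):
    """Extract CPU names from text shaped like `llc -mcpu=help` output."""
    cpus = []
    in_cpu_section = False
    for line in help_text.splitlines():
        stripped = line.strip()
        if stripped.startswith('Available CPUs'):
            in_cpu_section = True
            continue
        if stripped.startswith('Available features'):
            break  # the CPU list ends where the feature list starts
        if in_cpu_section and ' - ' in stripped:
            cpus.append(stripped.split(' - ', 1)[0].strip())
    return cpus


def mcpu_flag(cpu, help_text, fallback='generic'):
    """Return -mcpu=<cpu> if the CPU is listed, else the fallback (assumption)."""
    name = cpu if cpu in available_cpus(help_text) else fallback
    return '-mcpu=' + name


# Shortened sample in the same shape as the output quoted above.
SAMPLE_HELP = """\
Available CPUs for this target:
  cortex-a72 - Select the cortex-a72 processor.
  generic - Select the generic processor.
  thunderx2t99 - Select the thunderx2t99 processor.
Available features for this target:
  many features...
"""

print(mcpu_flag('thunderx2t99', SAMPLE_HELP))
```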

alalazo · 2019-11-20T08:06:57Z

lib/spack/llnl/util/cpu/microarchitectures.json

+          },
+          {
+            "versions": "7:7.9",
+            "flags": "-arch=armv8.2a+crc+crypt+fp16"


Is -arch a typo (instead of -march) here and below?

Same question for +crypt instead of +crypto

I'm sorry, I should have checked more carefully. I will send a new pull request to fix these typos.

@t-karatsu No worries, these typos are easy to miss.

alalazo · 2019-11-20T08:09:47Z

lib/spack/llnl/util/cpu/microarchitectures.json

+          },
+          {
+            "versions": "8:",
+            "flags": "-arch=armv8.2a+crc+aes+sh2+fp16+sve -msve-vector-bits=512"


Curious why crypto was omitted here but not above.

In gcc 8 changes (https://gcc.gnu.org/gcc-8/changes.html):

The Armv8-A +crypto extension has now been split into two extensions for finer grained control:

+aes which contains the Armv8-A AES cryptographic instructions.

+sha2 which contains the Armv8-A SHA2 and SHA1 cryptographic instructions.

From what I understand from reading the GCC 8 man pages and the page you linked, crypto is still available, and it is equivalent to setting +aes+sha2+simd.

Pasting the snippet I read below for ease of reference:

    crypto
        Enable Crypto extension. This also enables Advanced SIMD and floating-point instructions.
    fp
        Enable floating-point instructions. This is on by default for all possible values for options -march and -mcpu.
    simd
        Enable Advanced SIMD instructions. This also enables floating-point instructions. This is on by default for all possible values for options -march and -mcpu.
    sve
        Enable Scalable Vector Extension instructions. This also enables Advanced SIMD and floating-point instructions.
    lse
        Enable Large System Extension instructions. This is on by default for -march=armv8.1-a.
    rdma
        Enable Round Double Multiply Accumulate instructions. This is on by default for -march=armv8.1-a.
    fp16
        Enable FP16 extension. This also enables floating-point instructions.
    fp16fml
        Enable FP16 fmla extension. This also enables FP16 extensions and floating-point instructions. This option is enabled by default for -march=armv8.4-a. Use of this option with architectures prior to Armv8.2-A is not supported.
    rcpc
        Enable the RcPc extension. This does not change code generation from GCC, but is passed on to the assembler, enabling inline asm statements to use instructions from the RcPc extension.
    dotprod
        Enable the Dot Product extension. This also enables Advanced SIMD instructions.
    aes
        Enable the Armv8-a aes and pmull crypto extension. This also enables Advanced SIMD instructions.
    sha2
        Enable the Armv8-a sha2 crypto extension. This also enables Advanced SIMD instructions.
    sha3
        Enable the sha512 and sha3 crypto extension. This also enables Advanced SIMD instructions. Use of this option with architectures prior to Armv8.2-A is not supported.
    sm4
        Enable the sm3 and sm4 crypto extension. This also enables Advanced SIMD instructions. Use of this option with architectures prior to Armv8.2-A is not supported.

    Feature crypto implies aes, sha2, and simd, which implies fp. Conversely, nofp implies nosimd, which implies nocrypto, noaes and nosha2.
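The implication rules in the excerpt can be checked mechanically. The sketch below encodes only the implications stated in the quoted man page text and computes their transitive closure; the table name `IMPLIES` and the function `expand` are made up for the example and are not part of GCC or Spack.

```python
# Direct implications, taken from the man page excerpt quoted above.
IMPLIES = {
    'crypto': {'aes', 'sha2', 'simd'},
    'simd': {'fp'},
    'sve': {'simd', 'fp'},
    'fp16': {'fp'},
    'aes': {'simd'},
    'sha2': {'simd'},
}


def expand(features):
    """Return the features plus everything they transitively imply."""
    result = set(features)
    pending = list(features)
    while pending:
        feature = pending.pop()
        for implied in IMPLIES.get(feature, ()):
            if implied not in result:
                result.add(implied)
                pending.append(implied)
    return result


# Apart from the 'crypto' name itself, +crypto and +aes+sha2+simd
# enable the same set of features under these rules.
print(expand({'crypto'}) - {'crypto'} == expand({'aes', 'sha2', 'simd'}))
```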

Suggested change

-            "flags": "-arch=armv8.2a+crc+aes+sh2+fp16+sve -msve-vector-bits=512"
+            "flags": "-arch=armv8.2a+crc+aes+sha2+fp16+sve -msve-vector-bits=512"
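Typos like the ones raised in this review (-arch for -march, +crypt for +crypto, +sh2 for +sha2) could be caught by a small check over the flag strings. The following is a sketch under assumptions: the set of extension names is limited to those in the man page excerpt above plus crc, and the helper name `check_flags` is hypothetical. It does not validate the architecture name itself.

```python
# Extension names from the quoted GCC man page excerpt, plus 'crc'.
KNOWN_EXTENSIONS = {
    'crc', 'crypto', 'fp', 'simd', 'sve', 'lse', 'rdma', 'fp16',
    'fp16fml', 'rcpc', 'dotprod', 'aes', 'sha2', 'sha3', 'sm4',
}


def check_flags(flags):
    """Return a list of human-readable problems found in a flags string."""
    problems = []
    for token in flags.split():
        option, _, value = token.partition('=')
        if option == '-arch':
            problems.append("'-arch' should probably be '-march'")
            option = '-march'  # keep checking the extensions anyway
        if option != '-march':
            continue
        # value looks like '<arch>+ext1+ext2'; skip the architecture part
        for ext in value.split('+')[1:]:
            name = ext[2:] if ext.startswith('no') else ext
            if name not in KNOWN_EXTENSIONS:
                problems.append("unknown extension '+%s'" % ext)
    return problems


print(check_flags('-arch=armv8.2a+crc+aes+sh2+fp16+sve -msve-vector-bits=512'))
```

Run against every "flags" value in the JSON file, a check of this kind would report both review findings without needing a compiler on the machine.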

t-karatsu added 3 commits November 19, 2019 13:30

* Add microarch branching from aarch64

9afe426

* Add process to determine aarch64 microarchitecture

fix typo & fix for flake8.

f9760f5

* Add process for checking vendor

d6c84de

* Add optimize flags for gcc on aarch64 family.

alalazo self-assigned this Nov 19, 2019

alalazo added arm compilers microarchitectures labels Nov 19, 2019

becker33 requested changes Nov 19, 2019


t-karatsu added 2 commits November 20, 2019 13:54

* Fix version range of gcc and optimization flags of clang.

20e0b96

* Add process for unit test and test data file.

60a5022

becker33 approved these changes Nov 20, 2019


becker33 merged commit 513fe55 into spack:develop Nov 20, 2019

alalazo reviewed Nov 20, 2019


t-karatsu deleted the features/expand_microarch_for_aarch64 branch November 21, 2019 07:54

alalazo mentioned this pull request Nov 21, 2019

Display human readable text for AArch64 vendors #13825

Merged

tgamblin mentioned this pull request Dec 2, 2019

Fixed detection for cascadelake microarchitecture #13820

Merged
