[BPF] Make llvm-objdump disasm default cpu v4 #102166
Conversation
Currently, with the following example,
$ cat t.c
void foo(int a, _Atomic int *b)
{
*b &= a;
}
$ clang --target=bpf -O2 -c -mcpu=v3 t.c
$ llvm-objdump -d t.o
t.o: file format elf64-bpf
Disassembly of section .text:
0000000000000000 <foo>:
0: c3 12 00 00 51 00 00 00 <unknown>
1: 95 00 00 00 00 00 00 00 exit
Basically, the default cpu for llvm-objdump is v1, and it cannot
decode the instruction properly.
If we add --mcpu=v3 to llvm-objdump command line, we will have
$ llvm-objdump -d --mcpu=v3 t.o
t.o: file format elf64-bpf
Disassembly of section .text:
0000000000000000 <foo>:
0: c3 12 00 00 51 00 00 00 w1 = atomic_fetch_and((u32 *)(r2 + 0x0), w1)
1: 95 00 00 00 00 00 00 00 exit
The atomic_fetch_and insn can be decoded properly.
Using the latest cpu version, --mcpu=v4, also decodes it properly,
just like --mcpu=v3 above.
To avoid the above '<unknown>' decoding with a plain 'llvm-objdump -d t.o',
this patch marks the default cpu for llvm-objdump as the current
highest cpu version, v4, in ELFObjectFileBase::tryGetCPUName(). The
cpu version in ELFObjectFileBase::tryGetCPUName() will need to be adjusted
in the future whenever the cpu version increases, e.g. to v5. Such an approach
also aligns with gcc-bpf, as discussed in [1].
Six bpf unit tests are affected by this change. I changed the test output
for three of them and added --mcpu=v1 to the other three,
to demonstrate the default (cpu v4) behavior and the explicit --mcpu=v1
behavior.
[1] https://lore.kernel.org/bpf/6f32c0a1-9de2-4145-92ea-be025362182f@linux.dev/T/#m0f7e63c390bc8f5a5523e7f2f0537becd4205200
@llvm/pr-subscribers-mc @llvm/pr-subscribers-llvm-binary-utilities
Author: None (yonghong-song)
Full diff: https://github.com/llvm/llvm-project/pull/102166.diff
7 Files Affected:
diff --git a/llvm/lib/Object/ELFObjectFile.cpp b/llvm/lib/Object/ELFObjectFile.cpp
index 53c3de06d118c..f79c233d93fe8 100644
--- a/llvm/lib/Object/ELFObjectFile.cpp
+++ b/llvm/lib/Object/ELFObjectFile.cpp
@@ -441,6 +441,8 @@ std::optional<StringRef> ELFObjectFileBase::tryGetCPUName() const {
case ELF::EM_PPC:
case ELF::EM_PPC64:
return StringRef("future");
+ case ELF::EM_BPF:
+ return StringRef("v4");
default:
return std::nullopt;
}
diff --git a/llvm/test/CodeGen/BPF/objdump_atomics.ll b/llvm/test/CodeGen/BPF/objdump_atomics.ll
index 3ec364f7368b5..c4cb16b2c3641 100644
--- a/llvm/test/CodeGen/BPF/objdump_atomics.ll
+++ b/llvm/test/CodeGen/BPF/objdump_atomics.ll
@@ -2,7 +2,7 @@
; CHECK-LABEL: test_load_add_32
; CHECK: c3 21
-; CHECK: r2 = atomic_fetch_add((u32 *)(r1 + 0), r2)
+; CHECK: w2 = atomic_fetch_add((u32 *)(r1 + 0), w2)
define void @test_load_add_32(ptr %p, i32 zeroext %v) {
entry:
atomicrmw add ptr %p, i32 %v seq_cst
diff --git a/llvm/test/CodeGen/BPF/objdump_cond_op.ll b/llvm/test/CodeGen/BPF/objdump_cond_op.ll
index 3b2e6c1922fc4..c64a0f2f29382 100644
--- a/llvm/test/CodeGen/BPF/objdump_cond_op.ll
+++ b/llvm/test/CodeGen/BPF/objdump_cond_op.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=bpfel -filetype=obj -o - %s | llvm-objdump --no-print-imm-hex -d - | FileCheck %s
+; RUN: llc -mtriple=bpfel -filetype=obj -o - %s | llvm-objdump --no-print-imm-hex --mcpu=v1 -d - | FileCheck %s
; Source Code:
; int gbl;
diff --git a/llvm/test/CodeGen/BPF/objdump_imm_hex.ll b/llvm/test/CodeGen/BPF/objdump_imm_hex.ll
index 1760bb6b6c521..38b93e8a39b55 100644
--- a/llvm/test/CodeGen/BPF/objdump_imm_hex.ll
+++ b/llvm/test/CodeGen/BPF/objdump_imm_hex.ll
@@ -53,8 +53,8 @@ define i32 @test(i64, i64) local_unnamed_addr #0 {
%14 = phi i32 [ %12, %10 ], [ %7, %4 ]
%15 = phi i32 [ 2, %10 ], [ 1, %4 ]
store i32 %14, ptr @gbl, align 4
-; CHECK-DEC: 63 12 00 00 00 00 00 00 *(u32 *)(r2 + 0) = r1
-; CHECK-HEX: 63 12 00 00 00 00 00 00 *(u32 *)(r2 + 0x0) = r1
+; CHECK-DEC: 63 12 00 00 00 00 00 00 *(u32 *)(r2 + 0) = w1
+; CHECK-HEX: 63 12 00 00 00 00 00 00 *(u32 *)(r2 + 0x0) = w1
br label %16
; <label>:16: ; preds = %13, %8
diff --git a/llvm/test/CodeGen/BPF/objdump_static_var.ll b/llvm/test/CodeGen/BPF/objdump_static_var.ll
index a91074ebddd46..b743d82fe5e3d 100644
--- a/llvm/test/CodeGen/BPF/objdump_static_var.ll
+++ b/llvm/test/CodeGen/BPF/objdump_static_var.ll
@@ -1,5 +1,5 @@
-; RUN: llc -mtriple=bpfel -filetype=obj -o - %s | llvm-objdump --no-print-imm-hex -d - | FileCheck --check-prefix=CHECK %s
-; RUN: llc -mtriple=bpfeb -filetype=obj -o - %s | llvm-objdump --no-print-imm-hex -d - | FileCheck --check-prefix=CHECK %s
+; RUN: llc -mtriple=bpfel -filetype=obj -o - %s | llvm-objdump --no-print-imm-hex --mcpu=v1 -d - | FileCheck --check-prefix=CHECK %s
+; RUN: llc -mtriple=bpfeb -filetype=obj -o - %s | llvm-objdump --no-print-imm-hex --mcpu=v1 -d - | FileCheck --check-prefix=CHECK %s
; src:
; static volatile long a = 2;
diff --git a/llvm/test/MC/BPF/insn-unit.s b/llvm/test/MC/BPF/insn-unit.s
index 84735d196030d..e0a4864837798 100644
--- a/llvm/test/MC/BPF/insn-unit.s
+++ b/llvm/test/MC/BPF/insn-unit.s
@@ -34,9 +34,9 @@
r6 = *(u16 *)(r1 + 8) // BPF_LDX | BPF_H
r7 = *(u32 *)(r2 + 16) // BPF_LDX | BPF_W
r8 = *(u64 *)(r3 - 30) // BPF_LDX | BPF_DW
-// CHECK-64: 71 05 00 00 00 00 00 00 r5 = *(u8 *)(r0 + 0)
-// CHECK-64: 69 16 08 00 00 00 00 00 r6 = *(u16 *)(r1 + 8)
-// CHECK-64: 61 27 10 00 00 00 00 00 r7 = *(u32 *)(r2 + 16)
+// CHECK-64: 71 05 00 00 00 00 00 00 w5 = *(u8 *)(r0 + 0)
+// CHECK-64: 69 16 08 00 00 00 00 00 w6 = *(u16 *)(r1 + 8)
+// CHECK-64: 61 27 10 00 00 00 00 00 w7 = *(u32 *)(r2 + 16)
// CHECK-32: 71 05 00 00 00 00 00 00 w5 = *(u8 *)(r0 + 0)
// CHECK-32: 69 16 08 00 00 00 00 00 w6 = *(u16 *)(r1 + 8)
// CHECK-32: 61 27 10 00 00 00 00 00 w7 = *(u32 *)(r2 + 16)
@@ -47,9 +47,9 @@
*(u16 *)(r1 + 8) = r8 // BPF_STX | BPF_H
*(u32 *)(r2 + 16) = r9 // BPF_STX | BPF_W
*(u64 *)(r3 - 30) = r10 // BPF_STX | BPF_DW
-// CHECK-64: 73 70 00 00 00 00 00 00 *(u8 *)(r0 + 0) = r7
-// CHECK-64: 6b 81 08 00 00 00 00 00 *(u16 *)(r1 + 8) = r8
-// CHECK-64: 63 92 10 00 00 00 00 00 *(u32 *)(r2 + 16) = r9
+// CHECK-64: 73 70 00 00 00 00 00 00 *(u8 *)(r0 + 0) = w7
+// CHECK-64: 6b 81 08 00 00 00 00 00 *(u16 *)(r1 + 8) = w8
+// CHECK-64: 63 92 10 00 00 00 00 00 *(u32 *)(r2 + 16) = w9
// CHECK-32: 73 70 00 00 00 00 00 00 *(u8 *)(r0 + 0) = w7
// CHECK-32: 6b 81 08 00 00 00 00 00 *(u16 *)(r1 + 8) = w8
// CHECK-32: 63 92 10 00 00 00 00 00 *(u32 *)(r2 + 16) = w9
@@ -57,7 +57,7 @@
lock *(u32 *)(r2 + 16) += r9 // BPF_STX | BPF_W | BPF_XADD
lock *(u64 *)(r3 - 30) += r10 // BPF_STX | BPF_DW | BPF_XADD
-// CHECK-64: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += r9
+// CHECK-64: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += w9
// CHECK-32: c3 92 10 00 00 00 00 00 lock *(u32 *)(r2 + 16) += w9
// CHECK: db a3 e2 ff 00 00 00 00 lock *(u64 *)(r3 - 30) += r10
diff --git a/llvm/test/MC/BPF/load-store-32.s b/llvm/test/MC/BPF/load-store-32.s
index 826b13b1a48cc..996d696e91a0c 100644
--- a/llvm/test/MC/BPF/load-store-32.s
+++ b/llvm/test/MC/BPF/load-store-32.s
@@ -1,6 +1,6 @@
# RUN: llvm-mc -triple bpfel -filetype=obj -o %t %s
# RUN: llvm-objdump --no-print-imm-hex --mattr=+alu32 -d -r %t | FileCheck --check-prefix=CHECK-32 %s
-# RUN: llvm-objdump --no-print-imm-hex -d -r %t | FileCheck %s
+# RUN: llvm-objdump --no-print-imm-hex --mcpu=v1 -d -r %t | FileCheck %s
// ======== BPF_LDX Class ========
w5 = *(u8 *)(r0 + 0) // BPF_LDX | BPF_B
cc @jemarch
@4ast PPC uses a 'future' cpu, so they do not need to update tryGetCPUName(). Do we need to add a 'latest' cpu flavor to avoid updating tryGetCPUName()? I am not 100% sure about this, since we may update tryGetCPUName() very infrequently, as we do not increase the cpu version very often. WDYT?
// CHECK-64: 71 05 00 00 00 00 00 00 r5 = *(u8 *)(r0 + 0)
// CHECK-64: 69 16 08 00 00 00 00 00 r6 = *(u16 *)(r1 + 8)
// CHECK-64: 61 27 10 00 00 00 00 00 r7 = *(u32 *)(r2 + 16)
// CHECK-64: 71 05 00 00 00 00 00 00 w5 = *(u8 *)(r0 + 0)
Orthogonal to this change, but I find this disassembly difference between CPU versions quite annoying. It seems that it is better to avoid multiple textual representations for the same instruction encoding.
eddyz87 left a comment:
Otherwise the change looks good, but I agree that having "latest" would be a tad nicer.
As discussed in [1], introduce BPF instructions with load-acquire and
store-release semantics under -mcpu=v5.
A "load_acquire" is a BPF_LDX instruction with a new mode modifier,
BPF_MEMACQ ("acquiring atomic load"). Similarly, a "store_release" is a
BPF_STX instruction with another new mode modifier, BPF_MEMREL
("releasing atomic store").
BPF_MEMACQ and BPF_MEMREL share the same numeric value, 0x7 (or 0b111).
For example:
long foo(long *ptr) {
return __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
}
foo() can be compiled to:
f9 10 00 00 00 00 00 00 r0 = load_acquire((u64 *)(r1 + 0x0))
95 00 00 00 00 00 00 00 exit
Opcode 0xf9, or 0b11111001, can be decoded as:
0b 111 11 001
BPF_MEMACQ BPF_DW BPF_LDX
Similarly:
void bar(short *ptr, short val) {
__atomic_store_n(ptr, val, __ATOMIC_RELEASE);
}
bar() can be compiled to:
eb 21 00 00 00 00 00 00 store_release((u16 *)(r1 + 0x0), w2)
95 00 00 00 00 00 00 00 exit
Opcode 0xeb, or 0b11101011, can be decoded as:
0b 111 01 011
BPF_MEMREL BPF_H BPF_STX
Inline assembly is also supported. For example:
asm volatile("%0 = load_acquire((u64 *)(%1 + 0x0))" :
"=r"(ret) : "r"(ptr) : "memory");
Let 'llvm-objdump -d' use -mcpu=v5 by default, just like commit
0395868 ("[BPF] Make llvm-objdump disasm default cpu v4
(llvm#102166)").
Add two macros, __BPF_FEATURE_LOAD_ACQUIRE and
__BPF_FEATURE_STORE_RELEASE, to let developers detect these new features
in source code. They can also be disabled using two new llc options,
-disable-load-acquire and -disable-store-release, respectively.
Also use ACQUIRE or RELEASE if the user requested a weaker memory order
(RELAXED or CONSUME), until we actually support them. Requesting a
stronger memory order (i.e. SEQ_CST) will cause an error.
[1] https://lore.kernel.org/all/20240729183246.4110549-1-yepeilin@google.com/