-
-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Add scalar support for offload #150288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add scalar support for offload #150288
Conversation
d99cf66 to
0330a0e
Compare
|
@Sa4dUs, what's the current state of this? Given the omp/ol expectation of scalars being 64 bit integers I guess this code only works for i64 and usize and everything else will need the casts (or only works by chance)? |
|
it works for all 64-bit scalars. for f64 it's not what omp expects but somehow it just works, not sure it it will break in some weird edge case |
| } | ||
| }; | ||
|
|
||
| let gep = builder.inbounds_gep(cx.type_f32(), gep_base, &[i32_0]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is generally to move away from typed geps here, e.g.
https://www.npopov.com/2025/01/05/This-year-in-LLVM-2024.html
llvm/llvm-project#68882
Since we only index at 0 it doesn't differ (4*0=0), but can you change it from f32 to i8 to not confuse other readers about the meaning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am also wondering if our index logic here is correct in general. It's derived from what clang generates, but that makes it more verbose than it needs to be, I'm just testing if further simplifications here help with the benchmark behaving weird.
|
☔ The latest upstream changes made this pull request unmergeable. Please resolve the merge conflicts. |
|
☔ The latest upstream changes (presumably #150606) made this pull request unmergeable. Please resolve the merge conflicts. |
0330a0e to
ea71b7b
Compare
ea71b7b to
8b7e6b6
Compare
5189bbb to
45ccec5
Compare
45ccec5 to
38f7fc7
Compare
|
rustbot has assigned @jdonszelmann. Use |
| llvm::set_value_name(a0, CString::new("dyn_ptr").unwrap().as_bytes()); | ||
|
|
||
| let bb = | ||
| unsafe { llvm::LLVMAppendBasicBlockInContext(cx.llcx(), new_fn, c"entry".as_ptr()) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use cx.append_block to make this safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as i'm using SimpleCx and SBuilder i don't have access to that safe abstractions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, we can't change the trait to accept a simpleCx, but those functions should work with a SimpleCx, right? In that case you can duplicate the function, see e.g. memset which we have twice. Not ideal, but at least no unsafe.
| let num_bits = scalar_width(cx, old_ty); | ||
|
|
||
| let trunc = unsafe { | ||
| llvm::LLVMBuildTrunc( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cx.trunc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
| let target_symbol = symbol_name_for_instance_in_crate(tcx, fn_target, LOCAL_CRATE); | ||
|
|
||
| let sig = tcx.fn_sig(fn_target.def_id()).skip_binder().skip_binder(); | ||
| let sig = tcx.fn_sig(fn_target.def_id()).instantiate_identity(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What changed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was meant to give the compiler enough info to infer types, but it seems that just using skip_binder() works. i had left it while i was trying to make that work
38f7fc7 to
5e980e1
Compare
This comment has been minimized.
This comment has been minimized.
5e980e1 to
272a1a6
Compare
This PR adds scalar support to the offload feature. The scalar management has two main parts:
On the host side, each scalar arg is casted to
ixtype, zero extended toi64and passed to the kernel like that.On the device, the each scalar arg (
i64at that point), is truncated toixand then casted to the original type.r? @ZuseZ4