In C, when I pass a pointer to a function, the compiler always seems to assume that the data pointed to by that pointer might be continuously modified in another thread, even though in actual API usage patterns this is usually not the case.
Problem Description
Consider the following typical API usage pattern:
int xxx_create(int *p_xxx);
int xxx_do_something(int xxx);
int entry(void) {
    int xxx;
    xxx_create(&xxx);
    xxx_do_something(xxx);
    xxx_do_something(xxx);
    xxx_do_something(xxx);
    return 0;
}
When compiled with gcc -S -Ofast, the generated assembly code shows that the compiler reloads the value of xxx from the stack each time xxx_do_something is called. (Clang and MSVC are essentially equivalent; Godbolt.)
entry:
subq $24, %rsp
leaq 12(%rsp), %rdi
call xxx_create
movl 12(%rsp), %edi # reload
call xxx_do_something
movl 12(%rsp), %edi # reload
call xxx_do_something
movl 12(%rsp), %edi # reload
call xxx_do_something
xorl %eax, %eax
addq $24, %rsp
ret
Desired Behavior
In actual API design, the xxx_create function typically initializes the data but does not keep modifying it in a background thread. I want the compiler to recognize this and keep the value of xxx in a register instead of repeatedly loading it from memory:
entry:
push %rbx # save a call-preserved register
subq $16, %rsp # space for xxx while keeping RSP aligned by 16
leaq 12(%rsp), %rdi
call xxx_create
movl 12(%rsp), %edi # load into EDI as a function arg
movl %edi, %ebx # and save a copy in EBX
call xxx_do_something
movl %ebx, %edi # copy a register instead of reloading
call xxx_do_something
movl %ebx, %edi # ditto
call xxx_do_something
xorl %eax, %eax # return 0
addq $16, %rsp
pop %rbx # restore our caller's RBX
ret
Problems with Manual Copying
Manually saving the value works:
int _xxx;
xxx_create(&_xxx);
int xxx = _xxx;
// do other work
xxx_do_something(xxx);
xxx_do_something(xxx);
xxx_do_something(xxx);
It has several disadvantages:
Compilation ordering constraints: In the generated assembly, the load for int xxx = _xxx must occur before // do other work. The compiler cannot schedule that load any later, because it assumes that other threads might change the value of _xxx during // do other work, even though in the source int xxx = _xxx is placed before // do other work.
Unnecessary stack usage: The compiler might allocate stack space to save xxx (if // do other work is lengthy), whereas if the compiler knew that _xxx wouldn't be modified again, it would only need to dereference once before xxx_do_something(_xxx).
Performance overhead for complex types: If xxx is an array or structure, copying it also costs time.
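To make the last point concrete, here is a minimal sketch of the same manual-copy pattern with a structure instead of an int. The type struct xxx_info and the functions xxx_info_create and xxx_info_do_something are hypothetical names invented for this illustration. They are given trivial bodies only so the sketch compiles and runs on its own; in the real scenario they would be opaque external functions, which is what makes the address of _info escape.

```c
/* Hypothetical structure-valued result; not part of the original API. */
struct xxx_info {
    int id;
    int flags[4];
};

/* Hypothetical stand-in for an initializer that writes through a pointer. */
static int xxx_info_create(struct xxx_info *p_info)
{
    p_info->id = 42;
    for (int i = 0; i < 4; i++)
        p_info->flags[i] = i;
    return 0;
}

/* Hypothetical stand-in for a consumer that takes the value by copy. */
static int xxx_info_do_something(struct xxx_info info)
{
    return info.id + info.flags[3];
}

int entry_struct(void)
{
    struct xxx_info _info;
    xxx_info_create(&_info);      /* the address of _info is passed out here */

    /* Manual copy: a whole-structure assignment that copies every member,
       including the embedded array. The address of info is never taken. */
    struct xxx_info info = _info;

    int sum = 0;
    sum += xxx_info_do_something(info);
    sum += xxx_info_do_something(info);
    sum += xxx_info_do_something(info);
    return sum;
}
```

With an int, the extra copy is a single register move; here it covers sizeof(struct xxx_info) bytes, and that cost grows with the size of the type.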
Question
Is there a standard way to tell the C compiler that:
The data pointed to by the pointer passed to the function won't be continuously modified by that function in a background thread?
Or, the function might modify the data, but after the function returns, the data won't be continuously changed by other threads?
I'm using GCC and Clang. I would prefer a cross-compiler solution, but compiler-specific extensions are also acceptable.