ld-linux-x86-64.so.2 (dynamic linker/loader) calls set_tid_address, returning the native pid, before shadow gets control, and sometimes uses it after. (tgen, rust runtime) #3537

@sporksmith

Description

If we look at the strace log of tgen processes in the tor minimal test, fairly near the beginning we get something like:

00:00:01.000000000 [tid 1000] sched_getaffinity(718573, 8, <pointer>) = -3 (ESRCH)

where 718573 is the native pid of the tgen process.

We can get a little better idea of what's going on by running strace over tgen natively:

$ strace -k tgen
...
set_tid_address(0x71db81b80f10)         = 715841
 > /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2(_dl_deallocate_tls+0x6ce) [0x1513e]
 > /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2(_dl_catch_error+0x3a38) [0x20e08]
 > /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2(_dl_catch_error+0x727c) [0x2464c]
 > /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2(_dl_catch_error+0x246c) [0x1f83c]
 > /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2(_dl_catch_error+0x41c8) [0x21598]
 > /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2(_dl_catch_error+0x2ec8) [0x20298]
...
sched_getaffinity(715841, 8, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]) = 8
 > /usr/lib/x86_64-linux-gnu/libc.so.6(pthread_getaffinity_np+0x20) [0x95cf0]
 > /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0(omp_test_nest_lock+0x296) [0x200e6]
 > /usr/lib/x86_64-linux-gnu/libgomp.so.1.0.0() [0xbfe2]
 > /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2(__nptl_change_stack_perm+0x12ce) [0x647e]
 > /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2(__nptl_change_stack_perm+0x13b8) [0x6568]
 > /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2(_dl_catch_error+0x2efa) [0x202ca]

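It appears that shadow doesn't have control yet when the dynamic loader makes the set_tid_address call, so that call returns the native thread id (which for the main thread equals the native pid). Shadow does have control for the later sched_getaffinity call, which passes that cached native id; shadow doesn't recognize it as one of its own, hence the ESRCH.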
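
To make the mismatch concrete, here is a minimal sketch of the lookup failure. It is a toy model, not shadow's code: `simulated_sched_getaffinity` and `known_tids` are hypothetical names, and the return value on success is simplified to `cpusetsize`. The ids are the ones from the shadow strace line above.

```python
import errno

NATIVE_TID = 718573   # leaked native id, from the shadow strace line above
VIRTUAL_TID = 1000    # the "[tid 1000]" shown in the same line


def simulated_sched_getaffinity(known_tids, tid, cpusetsize):
    """Hypothetical model: resolve `tid` against simulated threads only."""
    # tid 0 means "the calling thread", so it always resolves.
    if tid != 0 and tid not in known_tids:
        # The simulator has no thread with this id, so it reports ESRCH.
        return -errno.ESRCH
    # Simplification: the raw syscall returns the number of bytes copied.
    return cpusetsize


known = {VIRTUAL_TID}
leaked = simulated_sched_getaffinity(known, NATIVE_TID, 8)
virtual = simulated_sched_getaffinity(known, VIRTUAL_TID, 8)
print(leaked == -errno.ESRCH, virtual)
```

The id cached before the simulator took over fails the lookup, while the simulated id (or 0) would have succeeded.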

As far as I can tell the ESRCH that shadow ends up returning here doesn't have any ill effect, and this doesn't appear to cause the execution of tgen to otherwise diverge, but this difference shows up if we try diff'ing tgen strace logs.
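
If the logs being compared are two shadow runs that differ only in the leaked native pid (an assumption, not something confirmed above), one possible workaround is to mask that argument before diffing. A rough sketch follows. `mask_native_pid` is a hypothetical helper, and `run_b` is a made-up second-run line that reuses the pid from the native strace output purely for illustration.

```python
import re

# Matches the first (pid) argument of a sched_getaffinity line.
PID_ARG = re.compile(r"(sched_getaffinity\()\d+")


def mask_native_pid(line: str) -> str:
    """Replace the pid argument with a fixed placeholder."""
    return PID_ARG.sub(r"\1<pid>", line)


# run_a is the line quoted above; run_b is a hypothetical second run.
run_a = "00:00:01.000000000 [tid 1000] sched_getaffinity(718573, 8, <pointer>) = -3 (ESRCH)"
run_b = "00:00:01.000000000 [tid 1000] sched_getaffinity(715841, 8, <pointer>) = -3 (ESRCH)"

print(mask_native_pid(run_a))
print(mask_native_pid(run_a) == mask_native_pid(run_b))
```

This only hides the symptom in the diff; the native id still reaches the managed process.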

Labels: Type: Bug (error or flaw producing unexpected results)
