Skip to content

golang runtime spins, causing deadlock #1968

@sporksmith

Description

@sporksmith

Once the VDSO time functions are intercepted (pending #1951), the snowflake simulation (#1549) deadlocks in the client process, in both ptrace and preload modes.

We can see from the shim log that clock_gettime is getting called repeatedly. We can also see it being called at the point where I stopped the program below, in thread 2.5.

When I had stopped it at another point, the code was in https://go.dev/src/runtime/proc.go, which describes having a worker thread spin for a while, while it waits for work to arrive. (I failed to save the stack trace that time...)

I didn't 100% step/trace through that code, but it seems likely we have little recourse but to resolve #1792 in some way and move time forward to un-stick ourselves here.

(gdb) thread apply all bt

Thread 2.5 (LWP 210272 "client"):
#0  0x00007ffff72bd1d9 in shim_native_syscallv (n=1, args=0x7fffc68eff10) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:94
#1  0x00007ffff72bdbd6 in shim_syscallv (n=1, args=0x7fffc68eff10) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:370
#2  0x00007ffff72bb56d in shim_api_syscallv (n=1, args=0x7fffc68eff10) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_api_syscall.c:24
#3  0x00007ffff7fbaca5 in syscall (n=1) at /home/jnewsome/projects/shadow/dev/src/lib/libc_preload/libc_impls.c:36
#4  0x00007ffff7fbaaf8 in write (a=5, b=140736524648432, c=163, d=163, e=0, f=140736524648192) at /home/jnewsome/projects/shadow/dev/src/lib/libc_preload/syscall_wrappers.c:1048
#5  0x00007ffff72bbac1 in shimlogger_log (base=0xcb7c80, level=LOGLEVEL_TRACE, fileName=0x7ffff72c63bc "/home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c", functionName=0x7ffff72c63fb "shim_syscallv", lineNumber=357, format=0x7ffff72c6409 "Handled syscall %ld from the shim; we avoided inter-process overhead.", vargs=0x7fffc68f0980) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_logger.c:83
#6  0x00007ffff72c47f5 in logger_log (logger=0xcb7c80, level=LOGLEVEL_TRACE, fileName=0x7ffff72c63bc "/home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c", functionName=0x7ffff72c63fb "shim_syscallv", lineNumber=357, format=0x7ffff72c6409 "Handled syscall %ld from the shim; we avoided inter-process overhead.") at /home/jnewsome/projects/shadow/dev/src/lib/logger/logger.c:171
#7  0x00007ffff72bdb06 in shim_syscallv (n=228, args=0x7fffc68f0bb0) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:357
#8  0x00007ffff72bb56d in shim_api_syscallv (n=228, args=0x7fffc68f0bb0) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_api_syscall.c:24
#9  0x00007ffff7fbaca5 in syscall (n=228) at /home/jnewsome/projects/shadow/dev/src/lib/libc_preload/libc_impls.c:36
#10 0x00007ffff72b82a4 in _replacement_clock_gettime (arg1=0x0, arg2=0x7fffc68f0c00) at /home/jnewsome/projects/shadow/dev/src/lib/shim/patch_vdso.c:153
#11 0x000000000046ec1c in runtime.walltime1 () at /home/jnewsome/opt/go/src/runtime/sys_linux_amd64.s:250
#12 0x0000000000000000 in ?? ()

Thread 2.4 (LWP 210271 "client"):
#0  0x00007ffff72bd1d9 in shim_native_syscallv (n=202, args=0x7ffff7339440 <_tlss+441552>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:94
#1  0x00007ffff72bdbd6 in shim_syscallv (n=202, args=0x7ffff7339440 <_tlss+441552>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:370
#2  0x00007ffff72bb56d in shim_api_syscallv (n=202, args=0x7ffff7339440 <_tlss+441552>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_api_syscall.c:24
#3  0x00007ffff7fbaca5 in syscall (n=202) at /home/jnewsome/projects/shadow/dev/src/lib/libc_preload/libc_impls.c:36
#4  0x00007ffff72bf05f in _futex_wait_abs (s=0x7fffeedd2268, timeout=0x0) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:30
#5  0x00007ffff72bef38 in _shadow_sem_timedwait (sem=0x7fffeedd2268, abs_timeout=0x0) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:130
#6  0x00007ffff72bf009 in shadow_sem_wait (sem=0x7fffeedd2268) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:155
#7  0x00007ffff72c0dbb in BinarySpinningSem::wait (this=0x7fffeedd2268, spin=true) at /home/jnewsome/projects/shadow/dev/src/lib/shim/binary_spinning_sem.cc:34
#8  0x00007ffff72be84f in shimevent_recvEventFromShadow (data=0x7fffeedd2008, e=0x7ffff7339878 <_tlss+442632>, spin=true) at /home/jnewsome/projects/shadow/dev/src/lib/shim/ipc.cc:68
--Type <RET> for more, q to quit, c to continue without paging--
#9  0x00007ffff72bd5c5 in _shim_emulated_syscall_event (syscall_event=0x7ffff7339a18 <_tlss+443048>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:239
#10 0x00007ffff72bd481 in shim_emulated_syscallv (n=202, args=0x7ffff7339d10 <_tlss+443808>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:334
#11 0x00007ffff72bdb87 in shim_syscallv (n=202, args=0x7ffff7339d10 <_tlss+443808>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:364
#12 0x00007ffff72bdd85 in shim_syscall (n=202) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:379
#13 0x00007ffff72bc873 in _shim_seccomp_handle_sigsys (sig=31, info=0x7ffff7339ef0 <_tlss+444288>, voidUcontext=0x7ffff7339dc0 <_tlss+443984>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_seccomp.c:63
#14 <signal handler called>
#15 runtime.futex () at /home/jnewsome/opt/go/src/runtime/sys_linux_amd64.s:580
#16 0x0000000000433846 in runtime.futexsleep (addr=0xc9ef98 <runtime.newmHandoff+24>, val=0, ns=-1) at /home/jnewsome/opt/go/src/runtime/os_linux.go:44
#17 0x000000000040dabf in runtime.notesleep (n=0xc9ef98 <runtime.newmHandoff+24>) at /home/jnewsome/opt/go/src/runtime/lock_futex.go:159
#18 0x000000000043e2ba in runtime.templateThread () at /home/jnewsome/opt/go/src/runtime/proc.go:2278
#19 0x000000000043cb88 in runtime.mstart1 () at /home/jnewsome/opt/go/src/runtime/proc.go:1306
#20 0x000000000043ca8e in runtime.mstart () at /home/jnewsome/opt/go/src/runtime/proc.go:1272
#21 0x0000000000401813 in runtime/cgo(.text) ()
#22 0x00007fffc7131f80 in ?? ()
#23 0x00007fffffffe7b0 in ?? ()
#24 0x00007ffff72b8e50 in ?? () at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim.c:566 from /home/jnewsome/.local/lib/libshadow-shim.so
#25 0x00007ffff7e92155 in clone () from /lib/x86_64-linux-gnu/libc.so.6
#26 0x000000c000001380 in ?? ()
#27 0x000000000043ca20 in ?? ()
#28 0x00000000004013e3 in threadentry ()
#29 0x0000000000000000 in ?? ()

Thread 2.3 (LWP 210270 "client"):
#0  0x00007ffff72bd1d9 in shim_native_syscallv (n=202, args=0x7ffff731e040 <_tlss+329936>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:94
#1  0x00007ffff72bdbd6 in shim_syscallv (n=202, args=0x7ffff731e040 <_tlss+329936>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:370
#2  0x00007ffff72bb56d in shim_api_syscallv (n=202, args=0x7ffff731e040 <_tlss+329936>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_api_syscall.c:24
#3  0x00007ffff7fbaca5 in syscall (n=202) at /home/jnewsome/projects/shadow/dev/src/lib/libc_preload/libc_impls.c:36
#4  0x00007ffff72bf05f in _futex_wait_abs (s=0x7fffeedcde68, timeout=0x0) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:30
#5  0x00007ffff72bef38 in _shadow_sem_timedwait (sem=0x7fffeedcde68, abs_timeout=0x0) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:130
#6  0x00007ffff72bf009 in shadow_sem_wait (sem=0x7fffeedcde68) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:155
#7  0x00007ffff72c0dbb in BinarySpinningSem::wait (this=0x7fffeedcde68, spin=true) at /home/jnewsome/projects/shadow/dev/src/lib/shim/binary_spinning_sem.cc:34
#8  0x00007ffff72be84f in shimevent_recvEventFromShadow (data=0x7fffeedcdc08, e=0x7ffff731e478 <_tlss+331016>, spin=true) at /home/jnewsome/projects/shadow/dev/src/lib/shim/ipc.cc:68
#9  0x00007ffff72bd5c5 in _shim_emulated_syscall_event (syscall_event=0x7ffff731e618 <_tlss+331432>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:239
#10 0x00007ffff72bd481 in shim_emulated_syscallv (n=202, args=0x7ffff731e910 <_tlss+332192>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:334
#11 0x00007ffff72bdb87 in shim_syscallv (n=202, args=0x7ffff731e910 <_tlss+332192>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:364
#12 0x00007ffff72bdd85 in shim_syscall (n=202) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:379
#13 0x00007ffff72bc873 in _shim_seccomp_handle_sigsys (sig=31, info=0x7ffff731eaf0 <_tlss+332672>, voidUcontext=0x7ffff731e9c0 <_tlss+332368>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_seccomp.c:63
#14 <signal handler called>
#15 runtime.futex () at /home/jnewsome/opt/go/src/runtime/sys_linux_amd64.s:580
#16 0x0000000000433846 in runtime.futexsleep (addr=0xc9f040 <runtime.sig>, val=0, ns=-1) at /home/jnewsome/opt/go/src/runtime/os_linux.go:44
#17 0x000000000040dba6 in runtime.notetsleep_internal (n=0xc9f040 <runtime.sig>, ns=-1, ~r2=<optimized out>) at /home/jnewsome/opt/go/src/runtime/lock_futex.go:182
#18 0x000000000040ddcc in runtime.notetsleepg (n=0xc9f040 <runtime.sig>, ns=-1, ~r2=<optimized out>) at /home/jnewsome/opt/go/src/runtime/lock_futex.go:236
#19 0x000000000046a0c5 in os/signal.signal_recv (~r0=<optimized out>) at /home/jnewsome/opt/go/src/runtime/sigqueue.go:168
#20 0x0000000000520a45 in os/signal.loop () at /home/jnewsome/opt/go/src/os/signal/signal_unix.go:23
#21 0x000000000046d2e1 in runtime.goexit () at /home/jnewsome/opt/go/src/runtime/asm_amd64.s:1371
#22 0x0000000000000000 in ?? ()

Thread 2.2 (LWP 210269 "client"):
#0  0x00007ffff72bd1d9 in shim_native_syscallv (n=202, args=0x7ffff7302c40 <_tlss+218320>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:94
#1  0x00007ffff72bdbd6 in shim_syscallv (n=202, args=0x7ffff7302c40 <_tlss+218320>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:370
#2  0x00007ffff72bb56d in shim_api_syscallv (n=202, args=0x7ffff7302c40 <_tlss+218320>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_api_syscall.c:24
#3  0x00007ffff7fbaca5 in syscall (n=202) at /home/jnewsome/projects/shadow/dev/src/lib/libc_preload/libc_impls.c:36
#4  0x00007ffff72bf05f in _futex_wait_abs (s=0x7fffeedcda68, timeout=0x0) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:30
#5  0x00007ffff72bef38 in _shadow_sem_timedwait (sem=0x7fffeedcda68, abs_timeout=0x0) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:130
#6  0x00007ffff72bf009 in shadow_sem_wait (sem=0x7fffeedcda68) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:155
#7  0x00007ffff72c0dbb in BinarySpinningSem::wait (this=0x7fffeedcda68, spin=true) at /home/jnewsome/projects/shadow/dev/src/lib/shim/binary_spinning_sem.cc:34
#8  0x00007ffff72be84f in shimevent_recvEventFromShadow (data=0x7fffeedcd808, e=0x7ffff7303078 <_tlss+219400>, spin=true) at /home/jnewsome/projects/shadow/dev/src/lib/shim/ipc.cc:68
#9  0x00007ffff72bd5c5 in _shim_emulated_syscall_event (syscall_event=0x7ffff7303218 <_tlss+219816>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:239
#10 0x00007ffff72bd481 in shim_emulated_syscallv (n=35, args=0x7ffff7303510 <_tlss+220576>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:334
#11 0x00007ffff72bdb87 in shim_syscallv (n=35, args=0x7ffff7303510 <_tlss+220576>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:364
#12 0x00007ffff72bdd85 in shim_syscall (n=35) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:379
#13 0x00007ffff72bc873 in _shim_seccomp_handle_sigsys (sig=31, info=0x7ffff73036f0 <_tlss+221056>, voidUcontext=0x7ffff73035c0 <_tlss+220752>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_seccomp.c:63
#14 <signal handler called>
#15 runtime.usleep () at /home/jnewsome/opt/go/src/runtime/sys_linux_amd64.s:145
#16 0x0000000000445759 in runtime.sysmon () at /home/jnewsome/opt/go/src/runtime/proc.go:5166
#17 0x000000000043cb88 in runtime.mstart1 () at /home/jnewsome/opt/go/src/runtime/proc.go:1306
#18 0x000000000043ca8e in runtime.mstart () at /home/jnewsome/opt/go/src/runtime/proc.go:1272
#19 0x0000000000401813 in runtime/cgo(.text) ()
#20 0x00007fffc8133f80 in ?? ()
#21 0x00007fffffffe700 in ?? ()
#22 0x00007ffff72b8e50 in ?? () at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim.c:566 from /home/jnewsome/.local/lib/libshadow-shim.so
--Type <RET> for more, q to quit, c to continue without paging--
#23 0x00007ffff7e92155 in clone () from /lib/x86_64-linux-gnu/libc.so.6
#24 0x000000c000000900 in ?? ()
#25 0x000000000043ca20 in ?? ()
#26 0x00000000004013e3 in threadentry ()
#27 0x0000000000000000 in ?? ()

Thread 2.1 (LWP 210264 "client"):
#0  0x00007ffff72bd1d9 in shim_native_syscallv (n=202, args=0x7ffff72e7840 <_tlss+106704>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:94
#1  0x00007ffff72bdbd6 in shim_syscallv (n=202, args=0x7ffff72e7840 <_tlss+106704>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:370
#2  0x00007ffff72bb56d in shim_api_syscallv (n=202, args=0x7ffff72e7840 <_tlss+106704>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_api_syscall.c:24
#3  0x00007ffff7fbaca5 in syscall (n=202) at /home/jnewsome/projects/shadow/dev/src/lib/libc_preload/libc_impls.c:36
#4  0x00007ffff72bf05f in _futex_wait_abs (s=0x7fffeedcd668, timeout=0x0) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:30
#5  0x00007ffff72bef38 in _shadow_sem_timedwait (sem=0x7fffeedcd668, abs_timeout=0x0) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:130
#6  0x00007ffff72bf009 in shadow_sem_wait (sem=0x7fffeedcd668) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shadow_sem.c:155
#7  0x00007ffff72c0dbb in BinarySpinningSem::wait (this=0x7fffeedcd668, spin=true) at /home/jnewsome/projects/shadow/dev/src/lib/shim/binary_spinning_sem.cc:34
#8  0x00007ffff72be84f in shimevent_recvEventFromShadow (data=0x7fffeedcd408, e=0x7ffff72e7c78 <_tlss+107784>, spin=true) at /home/jnewsome/projects/shadow/dev/src/lib/shim/ipc.cc:68
#9  0x00007ffff72bd5c5 in _shim_emulated_syscall_event (syscall_event=0x7ffff72e7e18 <_tlss+108200>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:239
#10 0x00007ffff72bd481 in shim_emulated_syscallv (n=202, args=0x7ffff72e8110 <_tlss+108960>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:334
#11 0x00007ffff72bdb87 in shim_syscallv (n=202, args=0x7ffff72e8110 <_tlss+108960>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:364
#12 0x00007ffff72bdd85 in shim_syscall (n=202) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_syscall.c:379
#13 0x00007ffff72bc873 in _shim_seccomp_handle_sigsys (sig=31, info=0x7ffff72e82f0 <_tlss+109440>, voidUcontext=0x7ffff72e81c0 <_tlss+109136>) at /home/jnewsome/projects/shadow/dev/src/lib/shim/shim_seccomp.c:63
#14 <signal handler called>
#15 runtime.futex () at /home/jnewsome/opt/go/src/runtime/sys_linux_amd64.s:580
#16 0x0000000000433846 in runtime.futexsleep (addr=0xc70b30 <runtime.m0+336>, val=0, ns=-1) at /home/jnewsome/opt/go/src/runtime/os_linux.go:44
#17 0x000000000040dabf in runtime.notesleep (n=0xc70b30 <runtime.m0+336>) at /home/jnewsome/opt/go/src/runtime/lock_futex.go:159
#18 0x000000000043cc79 in runtime.mPark () at /home/jnewsome/opt/go/src/runtime/proc.go:1340
#19 0x000000000043eb8e in runtime.stoplockedm () at /home/jnewsome/opt/go/src/runtime/proc.go:2495
#20 0x0000000000440dac in runtime.schedule () at /home/jnewsome/opt/go/src/runtime/proc.go:3103
#21 0x000000000044117d in runtime.park_m (gp=0xc000001b00) at /home/jnewsome/opt/go/src/runtime/proc.go:3318
#22 0x000000000046b47b in runtime.mcall () at /home/jnewsome/opt/go/src/runtime/asm_amd64.s:327
#23 0x000000000046b374 in runtime.rt0_go () at /home/jnewsome/opt/go/src/runtime/asm_amd64.s:226
#24 0x0000000000000000 in ?? ()

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type: BugError or flaw producing unexpected results

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions