Skip to content

posix_spawn fails starting in glibc-2.38 #3266

@robgjansen

Description

@robgjansen

The fork-shadow test is failing starting in Fedora 39, which uses glibc-2.38 wherein posix_spawn starts using the CLONE_CLEAR_SIGHAND flag of the clone3() syscall. CLONE_CLEAR_SIGHAND is currently unsupported by Shadow, causing the posix_spawn call to fail with ENOTSUP (95).

The Linux kernel added support for the CLONE_CLEAR_SIGHAND flag to clone3() starting in Linux v5.5 according to the man page and this kernel patch set merged into the kernel here. Later, glibc-2.38 starting using the clone3 CLONE_CLEAR_SIGHAND flag as an optimization to posix_spawn starting in this commit.

Fedora 39 is one of the first to adopt glibc-2.38, so the fork-shadow test started failing there first, but more distros are soon to follow so it would be good to have a plan for addressing this. In particular, this is currently preventing us from adding Fedora 39 to our CI and supported list of distros.


The fork-shadow test fails here:

# cat build/src/test/clone/fork-shadow.data/hosts/testnode/test_fork.1000.stderr 
panic: PanicInfo { payload: Any { .. }, message: Some(assertion `left == right` failed
  left: 95
 right: 0), location: Location { file: "test/clone/test_fork.rs", line: 1732, col: 17 }, can_unwind: true, force_no_backtrace: false }

The shadow log:

Start testing: Jan 01 23:37 UTC
----------------------------------------------------------
15/193 Testing: fork-shadow
15/193 Test: fork-shadow
Command: "/usr/bin/sh" "-c" "      rm -rf fork-shadow.data       &&  /root/shadow/build/src/main/shadow       --data-directory=fork-shadow.data       --log-level=info       --use-memory-manager=false --use-cpu-pinning false --strace-logging-mode standard --parallelism 1       /root/shadow/src/test/clone/fork.yaml      
 && (true)       "
Directory: /root/shadow/build/src/test/clone
"fork-shadow" start time: Jan 01 23:37 UTC
Output:
----------------------------------------------------------
00:00:00.000003 [15352:shadow] n/a [WARN] [n/a] [main.rs:133] [shadow_rs::core::main] Shadow is running as root. Shadow does not emulate Linux permissions, and some
            applications may behave differently when running as root. It is recommended to run
            Shadow as a non-root user.
** Starting Shadow 3.1.0
00:00:00.000209 [15352:shadow] n/a [INFO] [n/a] [main.rs:191] [shadow_rs::core::main] Shadow 3.1.0
00:00:00.000216 [15352:shadow] n/a [INFO] [n/a] [main.rs:191] [shadow_rs::core::main] GLib 2.78.3
00:00:00.000217 [15352:shadow] n/a [INFO] [n/a] [main.rs:191] [shadow_rs::core::main] Built on 2024-01-01--23:01:26
00:00:00.000217 [15352:shadow] n/a [INFO] [n/a] [main.rs:191] [shadow_rs::core::main] Built from git branch <unknown>
00:00:00.000218 [15352:shadow] n/a [INFO] [n/a] [main.rs:191] [shadow_rs::core::main] Shadow was built with PROFILE=release, OPT_LEVEL=3, DEBUG=false, RUSTFLAGS="-C force-frame-pointers=y", CFLAGS="-std=gnu11 -O3 -ggdb -fno-omit-frame-pointer -Wreturn-type -Wswitch -Werror -DNDEBUG"
00:00:00.000219 [15352:shadow] n/a [INFO] [n/a] [main.rs:191] [shadow_rs::core::main] For more information, visit https://shadow.github.io or https://github.com/shadow
00:00:00.000220 [15352:shadow] n/a [INFO] [n/a] [main.rs:193] [shadow_rs::core::main] Logging current startup arguments and environment
00:00:00.000221 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: /root/shadow/build/src/main/shadow
00:00:00.000222 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: --data-directory=fork-shadow.data
00:00:00.000222 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: --log-level=info
00:00:00.000223 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: --use-memory-manager=false
00:00:00.000224 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: --use-cpu-pinning
00:00:00.000224 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: false
00:00:00.000225 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: --strace-logging-mode
00:00:00.000225 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: standard
00:00:00.000226 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: --parallelism
00:00:00.000227 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: 1
00:00:00.000227 [15352:shadow] n/a [INFO] [n/a] [main.rs:441] [shadow_rs::core::main] arg: /root/shadow/src/test/clone/fork.yaml
00:00:00.000237 [15352:shadow] n/a [INFO] [n/a] [main.rs:449] [shadow_rs::core::main] env: "G_DEBUG"="fatal-criticals"
00:00:00.000910 [15352:shadow] n/a [INFO] [n/a] [sim_config.rs:374] [shadow_rs::core::sim_config] Resolved binary path "../../target/debug/test_fork" to "/root/shadow/build/src/target/debug/test_fork"
00:00:00.005625 [15352:shadow] n/a [INFO] [n/a] [main.rs:217] [shadow_rs::core::main] Log message buffering is enabled for efficiency
00:00:00.005745 [15352:shadow] n/a [WARN] [n/a] [tsc.c:192] [TscC_nativeCyclesPerSecond] Couldn't get CPU TSC frequency
00:00:00.005745 [15352:shadow] n/a [WARN] [n/a] [manager.rs:79] [shadow_rs::core::manager] Couldn't find native TSC frequency. Emulated rdtsc may use a rate different than managed code expects
00:00:00.005897 [15352:shadow] n/a [INFO] [n/a] [manager.rs:127] [shadow_rs::core::manager] Preloading the openssl crypto library is disabled
00:00:00.006264 [15352:shadow] n/a [INFO] [n/a] [controller.rs:65] [shadow_rs::core::controller] Running simulation
00:00:00.006600 [15421:shadow-worker] 00:00:00.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:448] [_tracker_logNode] [shadow-heartbeat] [node-header] interval-seconds,recv-bytes,send-bytes,cpu-percent,delayed-count,avgdelay-milliseconds;inbound-localhost-counters;outbound-localhost-counters;inbound-remote-counters;outbound-remote-counters where counters are: packets-total,bytes-total,packets-control,bytes-control-header,packets-control-retrans,bytes-control-header-retrans,packets-data,bytes-data-header,bytes-data-payload,packets-data-retrans,bytes-data-header-retrans,bytes-data-payload-retrans
00:00:00.006623 [15421:shadow-worker] 00:00:00.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.011269 [15421:shadow-worker] 00:00:01.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.013168 [15421:shadow-worker] 00:00:01.000000000 [WARN] [testnode:11.0.0.1] [futex.rs:44] [shadow_rs::host::syscall::handler::futex] (LOG_ONCE) set_robust_list was called but we don't yet support it
00:00:00.028280 [15352:shadow] n/a [INFO] [n/a] [manager.rs:683] [shadow_rs::core::manager] Process resource usage at simtime 1100000000 reported by getrusage(): ru_maxrss=0.154 GiB, ru_utime=0.000 minutes, ru_stime=0.000 minutes, ru_nvcsw=558, ru_nivcsw=1
00:00:00.028285 [15352:shadow] n/a [INFO] [n/a] [manager.rs:701] [shadow_rs::core::manager] System memory usage in bytes at simtime 1100000000 ns reported by /proc/meminfo: {"mem_total":270302900224,"mem_free":5121146880,"swap_total":1023406080,"swap_free":1018425344,"buffers":3242778624,"cached":246234632192,"s_reclaimable":8019468288,"shmem":16125952}
00:00:00.061384 [15421:shadow-worker] 00:00:01.330000000 [WARN] [testnode:11.0.0.1] [process.rs:503] [shadow_rs::host::process] thread ThreadId(1056) has no syscall_condition. How?
00:00:00.063514 [15421:shadow-worker] 00:00:01.330000000 [WARN] [testnode:11.0.0.1] [process.rs:503] [shadow_rs::host::process] thread ThreadId(1058) has no syscall_condition. How?
00:00:00.194659 [15421:shadow-worker] 00:00:01.330000000 [WARN] [testnode:11.0.0.1] [clone.rs:133] [shadow_rs::host::syscall::handler::clone] (LOG_ONCE) Ignoring CLONE_VFORK (and CLONE_VM if set). In *typical* usage this won't result in incorrect behavior.
00:00:00.282906 [15421:shadow-worker] 00:00:01.330000000 [WARN] [testnode:11.0.0.1] [clone.rs:183] [shadow_rs::host::syscall::handler::clone] Unhandled clone flags: CloneFlags(CLONE_CLEAR_SIGHAND)
00:00:00.282908 [15421:shadow-worker] 00:00:01.330000000 [ERROR] [testnode:11.0.0.1] [rc.rs:131] [shadow_shim_helper_rs::rootedcell::rc] Dropped without calling `safely_drop`
00:00:00.287839 [15421:shadow-worker] 00:00:02.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.287870 [15421:shadow-worker] 00:00:03.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.287928 [15352:shadow] n/a [INFO] [n/a] [manager.rs:683] [shadow_rs::core::manager] Process resource usage at simtime 3000000000 reported by getrusage(): ru_maxrss=0.155 GiB, ru_utime=0.000 minutes, ru_stime=0.000 minutes, ru_nvcsw=3020, ru_nivcsw=5
00:00:00.287935 [15352:shadow] n/a [INFO] [n/a] [manager.rs:701] [shadow_rs::core::manager] System memory usage in bytes at simtime 3000000000 ns reported by /proc/meminfo: {"mem_total":270302900224,"mem_free":5116030976,"swap_total":1023406080,"swap_free":1018425344,"buffers":3242778624,"cached":246235070464,"s_reclaimable":8019468288,"shmem":16187392}
00:00:00.287951 [15421:shadow-worker] 00:00:04.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.287975 [15421:shadow-worker] 00:00:05.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.288011 [15352:shadow] n/a [INFO] [n/a] [manager.rs:683] [shadow_rs::core::manager] Process resource usage at simtime 5000000000 reported by getrusage(): ru_maxrss=0.155 GiB, ru_utime=0.000 minutes, ru_stime=0.000 minutes, ru_nvcsw=3021, ru_nivcsw=5
00:00:00.288013 [15352:shadow] n/a [INFO] [n/a] [manager.rs:701] [shadow_rs::core::manager] System memory usage in bytes at simtime 5000000000 ns reported by /proc/meminfo: {"mem_total":270302900224,"mem_free":5116030976,"swap_total":1023406080,"swap_free":1018425344,"buffers":3242778624,"cached":246235070464,"s_reclaimable":8019468288,"shmem":16187392}
00:00:00.288028 [15421:shadow-worker] 00:00:06.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.288051 [15421:shadow-worker] 00:00:07.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.288080 [15352:shadow] n/a [INFO] [n/a] [manager.rs:683] [shadow_rs::core::manager] Process resource usage at simtime 7000000000 reported by getrusage(): ru_maxrss=0.155 GiB, ru_utime=0.000 minutes, ru_stime=0.000 minutes, ru_nvcsw=3022, ru_nivcsw=5
00:00:00.288083 [15352:shadow] n/a [INFO] [n/a] [manager.rs:701] [shadow_rs::core::manager] System memory usage in bytes at simtime 7000000000 ns reported by /proc/meminfo: {"mem_total":270302900224,"mem_free":5116030976,"swap_total":1023406080,"swap_free":1018425344,"buffers":3242778624,"cached":246235070464,"s_reclaimable":8019468288,"shmem":16187392}
00:00:00.288098 [15421:shadow-worker] 00:00:08.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.288121 [15421:shadow-worker] 00:00:09.000000000 [INFO] [testnode:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.288150 [15352:shadow] n/a [INFO] [n/a] [manager.rs:683] [shadow_rs::core::manager] Process resource usage at simtime 9000000000 reported by getrusage(): ru_maxrss=0.155 GiB, ru_utime=0.000 minutes, ru_stime=0.000 minutes, ru_nvcsw=3023, ru_nivcsw=5
00:00:00.288152 [15352:shadow] n/a [INFO] [n/a] [manager.rs:701] [shadow_rs::core::manager] System memory usage in bytes at simtime 9000000000 ns reported by /proc/meminfo: {"mem_total":270302900224,"mem_free":5116030976,"swap_total":1023406080,"swap_free":1018425344,"buffers":3242778624,"cached":246235070464,"s_reclaimable":8019468288,"shmem":16187392}
00:00:00.288595 [15421:shadow-worker] 00:00:10.000000000 [ERROR] [testnode:11.0.0.1] [process.rs:1525] [shadow_rs::host::process] process 'testnode.test_fork.1000' exited with status StoppedByShadow; expected end state was exited: 0 but was running
00:00:00.289029 [15352:shadow] n/a [INFO] [n/a] [manager.rs:514] [shadow_rs::core::manager] Global syscall counts: {newfstatat:301, rt_sigaction:269, write:190, close:168, read:135, exit_group:109, fork:93, openat:81, lseek:64, getdents64:56, fstat:52, pipe:48, ioctl:44, wait4:40, waitid:33, mmap:30, brk:28, clone:24, open:20, fcntl:16, setpgid:13, getpid:12, gettid:10, kill:10, nanosleep:10, getsid:8, getuid:8, readlink:8, getpgrp:6, rt_sigprocmask:6, tgkill:6, getrandom:5, unknown-syscall:5, execve:4, futex:4, getegid:4, geteuid:4, getgid:4, getpgid:4, getppid:4, sysinfo:4, sigaltstack:2, vfork:2, clone3:1, mprotect:1, munmap:1, poll:1, prlimit64:1, set_robust_list:1, setsid:1}
00:00:00.289037 [15352:shadow] n/a [INFO] [n/a] [manager.rs:522] [shadow_rs::core::manager] Global allocated object counts: {Descriptor:792, OpenFile:486, TaskRef:362, DescriptorTable:245, Event:236, LegacyDescriptor:228, Timer:134, Epoll:124, Thread:120, RegularFile:104, OpenFileInner:96, StatusListener:93, SysCallCondition:93, Relay:3, NetworkInterface:2, FutexTable:1, Router:1}
00:00:00.289044 [15352:shadow] n/a [INFO] [n/a] [manager.rs:523] [shadow_rs::core::manager] Global deallocated object counts: {Descriptor:787, OpenFile:484, TaskRef:362, DescriptorTable:244, Event:236, LegacyDescriptor:225, Timer:134, Epoll:124, Thread:120, RegularFile:101, OpenFileInner:94, StatusListener:93, SysCallCondition:93, Relay:3, NetworkInterface:2, FutexTable:1, Router:1}
00:00:00.289045 [15352:shadow] n/a [WARN] [n/a] [manager.rs:529] [shadow_rs::core::manager] Memory leak detected
00:00:00.291358 [15352:shadow] n/a [INFO] [n/a] [controller.rs:67] [shadow_rs::core::controller] Finished simulation
00:00:00.304789 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export] Failed to run the simulation
00:00:00.304853 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export] 
00:00:00.304878 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export] Caused by:
00:00:00.304901 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export]     1 managed processes in unexpected final state
00:00:00.304923 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export] 
00:00:00.304946 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export] Stack backtrace:
00:00:00.304969 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export]    0: anyhow::error::<impl anyhow::Error>::msg
00:00:00.304991 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export]    1: anyhow::__private::format_err
00:00:00.305012 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export]    2: shadow_rs::core::controller::Controller::run
00:00:00.305031 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export]    3: shadow_rs::core::main::run_shadow
00:00:00.305050 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export]    4: main_runShadow
00:00:00.305068 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export]    5: __libc_start_call_main
00:00:00.305087 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export]    6: __libc_start_main_impl
00:00:00.305105 [15352:shadow] n/a [ERROR] [n/a] [main.rs:471] [shadow_rs::core::main::export]    7: _start
** Shadow did not complete successfully: Failed to run the simulation
**   1 managed processes in unexpected final state
** See the log for details
<end of output>
Test time =   0.33 sec
----------------------------------------------------------
Test Fail Reason:
Error regular expression found in output. Regex=[Memory leak detected]
"fork-shadow" end time: Jan 01 23:37 UTC
"fork-shadow" time elapsed: 00:00:00
----------------------------------------------------------

End testing: Jan 01 23:37 UTC

shadow =   0.33 sec*proc

Metadata

Metadata

Assignees

Labels

Priority: HighPrioritized ahead of most other issuesType: BugError or flaw producing unexpected results

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions