-
Notifications
You must be signed in to change notification settings - Fork 268
crash: rootedcell "Dropped without calling safely_drop" #3290
Description
Describe the issue
I can consistently crash shadow by running any tokio program with enough threads. I've seen this crash a few times on actual experiments with more modest thread counts, but it was infrequent enough that I hadn't been able to make a stable reproducer and so couldn't tell if it wasn't just a bad memory module or some such. Once I realized I could cause it with enough tokio threads, reproducing is easy, on multiple machines.
To Reproduce
Run any shadow experiment with the following Rust program:
#[tokio::main(worker_threads = 20000)]
async fn main() {
println!("Hello, world!");
}where 20000 is just an arbitrarily huge number of threads. I've made a tiny reproducer branch, but it's basically just the above with the simplest shadow config:
https://github.com/jtracey/actions-sandbox/tree/shadow-bug
Outside Shadow, it takes a moment to run, but does run. Inside Shadow, it consistently crashes. Should be able to reproduce with cargo install --path . && shadow shadow.config.yaml.
Operating System (please complete the following information):
- OS and version: post the output of
lsb_release -d
Debian GNU/Linux 12 (bookworm)(Qubes),Ubuntu 22.04.2 LTS(ripple's clonk) - Kernel version: post the output of
uname -a
6.1.57-1.qubes.fc37.x86_64
Shadow (please complete the following information):
- Version and build information: post the output of
shadow --show-build-info
Shadow 3.1.0 — v3.1.0-99-gde06d5240 2024-01-26--14:49:37
GLib 2.74.6
Built on 2024-01-30--01:16:24
Built from git branch main
Shadow was built with PROFILE=debug, OPT_LEVEL=1, DEBUG=true, RUSTFLAGS="-C force-frame-pointers=y", CFLAGS="-std=gnu11 -O0 -ggdb -fno-omit-frame-pointer -Wreturn-type -Wswitch -DDEBUG"
- Which processes you are trying to run inside the Shadow simulation:
above tokio toy ("shadow-bug")
Additional context
RUST_BACKTRACE=1 shadow.log
** Starting Shadow 3.1.0
00:00:00.000003 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] Shadow 3.1.0 — v3.1.0-99-gde06d5240 2024-01-26--14:49:37
00:00:00.000035 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] GLib 2.74.6
00:00:00.000036 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] Built on 2024-01-30--01:16:24
00:00:00.000037 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] Built from git branch main
00:00:00.000038 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] Shadow was built with PROFILE=debug, OPT_LEVEL=1, DEBUG=true, RUSTFLAGS="-C force-frame-pointers=y", CFLAGS="-std=gnu11 -O0 -ggdb -fno-omit-frame-pointer -Wreturn-type -Wswitch -DDEBUG"
00:00:00.000039 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] For more information, visit https://shadow.github.io or https://github.com/shadow
00:00:00.000040 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:198] [shadow_rs::shadow] Logging current startup arguments and environment
00:00:00.000044 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:450] [shadow_rs::shadow] arg: shadow
00:00:00.000045 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:450] [shadow_rs::shadow] arg: shadow.config.yaml
00:00:00.000940 [123247:shadow] n/a [INFO] [n/a] [sim_config.rs:375] [shadow_rs::core::sim_config] Resolved binary path "shadow-bug" to "/home/user/.cargo/bin/shadow-bug"
00:00:00.002472 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:222] [shadow_rs::shadow] Log message buffering is enabled for efficiency
00:00:00.002673 [123247:shadow] n/a [WARN] [n/a] [tsc.c:192] [TscC_nativeCyclesPerSecond] Couldn't get CPU TSC frequency
00:00:00.002674 [123247:shadow] n/a [WARN] [n/a] [manager.rs:82] [shadow_rs::core::manager] Couldn't find native TSC frequency. Emulated rdtsc may use a rate different than managed code expects
00:00:00.002868 [123247:shadow] n/a [INFO] [n/a] [manager.rs:130] [shadow_rs::core::manager] Preloading the openssl crypto library is disabled
00:00:00.003298 [123247:shadow] n/a [INFO] [n/a] [controller.rs:65] [shadow_rs::core::controller] Running simulation
00:00:00.003445 [123247:shadow] n/a [INFO] [n/a] [manager.rs:251] [shadow_rs::core::manager] The parallelism option was 0, so using parallelism=8
00:00:00.004092 [123262:shadow-worker] 00:00:00.000000000 [INFO] [host0:11.0.0.1] [tracker.c:448] [_tracker_logNode] [shadow-heartbeat] [node-header] interval-seconds,recv-bytes,send-bytes,cpu-percent,delayed-count,avgdelay-milliseconds;inbound-localhost-counters;outbound-localhost-counters;inbound-remote-counters;outbound-remote-counters where counters are: packets-total,bytes-total,packets-control,bytes-control-header,packets-control-retrans,bytes-control-header-retrans,packets-data,bytes-data-header,bytes-data-payload,packets-data-retrans,bytes-data-header-retrans,bytes-data-payload-retrans
00:00:00.004124 [123262:shadow-worker] 00:00:00.000000000 [INFO] [host0:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.004188 [123262:shadow-worker] 00:00:01.000000000 [INFO] [host0:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.004265 [123247:shadow] n/a [INFO] [n/a] [manager.rs:694] [shadow_rs::core::manager] Process resource usage at simtime 2000000000 reported by getrusage(): ru_maxrss=0.154 GiB, ru_utime=0.000 minutes, ru_stime=0.000 minutes, ru_nvcsw=61, ru_nivcsw=26
00:00:00.004271 [123247:shadow] n/a [INFO] [n/a] [manager.rs:712] [shadow_rs::core::manager] System memory usage in bytes at simtime 2000000000 ns reported by /proc/meminfo: {"mem_total":32351444992,"mem_free":11180363776,"swap_total":1073737728,"swap_free":1073737728,"buffers":323063808,"cached":13112467456,"s_reclaimable":534978560,"shmem":199069696}
00:00:00.004290 [123262:shadow-worker] 00:00:02.000000000 [INFO] [host0:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.004313 [123262:shadow-worker] 00:00:03.000000000 [INFO] [host0:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.004351 [123247:shadow] n/a [INFO] [n/a] [manager.rs:694] [shadow_rs::core::manager] Process resource usage at simtime 4000000000 reported by getrusage(): ru_maxrss=0.154 GiB, ru_utime=0.000 minutes, ru_stime=0.000 minutes, ru_nvcsw=63, ru_nivcsw=28
00:00:00.004354 [123247:shadow] n/a [INFO] [n/a] [manager.rs:712] [shadow_rs::core::manager] System memory usage in bytes at simtime 4000000000 ns reported by /proc/meminfo: {"mem_total":32351444992,"mem_free":11180363776,"swap_total":1073737728,"swap_free":1073737728,"buffers":323063808,"cached":13112467456,"s_reclaimable":534978560,"shmem":199069696}
00:00:00.004369 [123262:shadow-worker] 00:00:04.000000000 [INFO] [host0:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:04.692359 [123262:shadow-worker] 00:00:05.000000000 [ERROR] [host0:11.0.0.1] [rc.rs:131] [shadow_shim_helper_rs::rootedcell::rc] Dropped without calling `safely_drop`
thread 'shadow-worker' panicked at /home/user/Code/archive/shadow/src/lib/shadow-shim-helper-rs/src/rootedcell/rc.rs:145:17:
Dropped without calling `safely_drop`
stack backtrace:
0: rust_begin_unwind
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645:5
1: core::panicking::panic_fmt
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
2: <shadow_shim_helper_rs::rootedcell::rc::RootedRcCommon<T> as core::ops::drop::Drop>::drop
at /home/user/Code/archive/shadow/src/lib/shadow-shim-helper-rs/src/rootedcell/rc.rs:145:17
3: core::ptr::drop_in_place<shadow_shim_helper_rs::rootedcell::rc::RootedRcCommon<shadow_shim_helper_rs::rootedcell::refcell::RootedRefCell<shadow_rs::host::descriptor::descriptor_table::DescriptorTable>>>
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498:1
4: core::ptr::drop_in_place<shadow_shim_helper_rs::rootedcell::rc::RootedRc<shadow_shim_helper_rs::rootedcell::refcell::RootedRefCell<shadow_rs::host::descriptor::descriptor_table::DescriptorTable>>>
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498:1
5: shadow_rs::host::syscall::handler::clone::<impl shadow_rs::host::syscall::handler::SyscallHandler>::clone_internal
at /home/user/Code/archive/shadow/src/main/host/syscall/handler/clone.rs:295:5
6: shadow_rs::host::syscall::handler::clone::<impl shadow_rs::host::syscall::handler::SyscallHandler>::clone3_original
at /home/user/Code/archive/shadow/src/main/host/syscall/handler/clone.rs:364:9
7: shadow_rs::host::syscall::handler::clone::<impl shadow_rs::host::syscall::handler::SyscallHandler>::clone3
at /home/user/Code/archive/shadow/src/main/host/syscall/handler/clone.rs:335:5
8: core::ops::function::Fn::call
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:79:5
9: <F as shadow_rs::host::syscall::handler::SyscallHandlerFn<(T1,T2)>>::call
at /home/user/Code/archive/shadow/src/main/host/syscall/handler/mod.rs:768:9
10: shadow_rs::host::syscall::handler::SyscallHandler::run_handler
at /home/user/Code/archive/shadow/src/main/host/syscall/handler/mod.rs:364:38
11: shadow_rs::host::syscall::handler::SyscallHandler::syscall
at /home/user/Code/archive/shadow/src/main/host/syscall/handler/mod.rs:172:22
12: shadow_rs::host::managed_thread::ManagedThread::resume
at /home/user/Code/archive/shadow/src/main/host/managed_thread.rs:274:31
13: shadow_rs::host::thread::Thread::resume
at /home/user/Code/archive/shadow/src/main/host/thread.rs:469:19
14: shadow_rs::host::process::Process::resume
at /home/user/Code/archive/shadow/src/main/host/process.rs:1201:19
15: shadow_rs::host::host::Host::resume
at /home/user/Code/archive/shadow/src/main/host/host.rs:487:13
16: shadow_rs::host::host::Host::add_application::{{closure}}
at /home/user/Code/archive/shadow/src/main/host/host.rs:452:13
17: shadow_rs::core::work::task::TaskRef::execute
at /home/user/Code/archive/shadow/src/main/core/work/task.rs:32:9
18: shadow_rs::host::host::Host::execute
at /home/user/Code/archive/shadow/src/main/host/host.rs:859:43
19: shadow_rs::core::manager::Manager::run::{{closure}}::{{closure}}::{{closure}}
at /home/user/Code/archive/shadow/src/main/core/manager.rs:419:37
20: shadow_rs::core::manager::for_each_host::{{closure}}::{{closure}}
at /home/user/Code/archive/shadow/src/main/core/manager.rs:818:13
21: shadow_rs::core::worker::Worker::with_active_host::{{closure}}::{{closure}}
at /home/user/Code/archive/shadow/src/main/core/worker.rs:114:32
22: core::option::Option<T>::map
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/option.rs:1072:29
23: shadow_rs::core::worker::Worker::with_active_host::{{closure}}
at /home/user/Code/archive/shadow/src/main/core/worker.rs:114:13
24: shadow_rs::core::worker::Worker::with::{{closure}}::{{closure}}
at /home/user/Code/archive/shadow/src/main/core/worker.rs:419:43
25: core::option::Option<T>::map
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/option.rs:1072:29
26: shadow_rs::core::worker::Worker::with::{{closure}}
at /home/user/Code/archive/shadow/src/main/core/worker.rs:419:27
27: std::thread::local::LocalKey<T>::try_with
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/thread/local.rs:270:16
28: shadow_rs::core::worker::Worker::with
at /home/user/Code/archive/shadow/src/main/core/worker.rs:418:9
29: shadow_rs::core::worker::Worker::with_active_host
at /home/user/Code/archive/shadow/src/main/core/worker.rs:112:9
30: shadow_rs::core::manager::for_each_host::{{closure}}
at /home/user/Code/archive/shadow/src/main/core/manager.rs:817:9
31: scheduler::thread_per_core::HostIter<HostType>::for_each
at /home/user/Code/archive/shadow/src/lib/scheduler/src/thread_per_core.rs:208:43
32: scheduler::HostIter<HostType>::for_each
at /home/user/Code/archive/shadow/src/lib/scheduler/src/lib.rs:246:39
33: shadow_rs::core::manager::for_each_host
at /home/user/Code/archive/shadow/src/main/core/manager.rs:815:5
34: shadow_rs::core::manager::Manager::run::{{closure}}::{{closure}}
at /home/user/Code/archive/shadow/src/main/core/manager.rs:416:29
35: scheduler::SchedulerScope<HostType>::run_with_data::{{closure}}
at /home/user/Code/archive/shadow/src/lib/scheduler/src/lib.rs:226:17
36: scheduler::thread_per_core::SchedulerScope<HostType>::run_with_data::{{closure}}
at /home/user/Code/archive/shadow/src/lib/scheduler/src/thread_per_core.rs:174:13
37: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/boxed.rs:2021:9
38: scheduler::pools::unbounded::work_loop
at /home/user/Code/archive/shadow/src/lib/scheduler/src/pools/unbounded.rs:269:31
39: scheduler::pools::unbounded::UnboundedThreadPool::new::{{closure}}
at /home/user/Code/archive/shadow/src/lib/scheduler/src/pools/unbounded.rs:71:21
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
00:00:05.036183 [123262:shadow-worker] 00:00:05.000000000 [ERROR] [host0:11.0.0.1] [rc.rs:131] [shadow_shim_helper_rs::rootedcell::rc] Dropped without calling `safely_drop`
00:00:05.036405 [123262:shadow-worker] 00:00:05.000000000 [ERROR] [host0:11.0.0.1] [rc.rs:131] [shadow_shim_helper_rs::rootedcell::rc] Dropped without calling `safely_drop`
thread '<unnamed>' panicked at lib/scheduler/src/pools/unbounded.rs:193:17:
A work thread panicked
stack backtrace:
0: rust_begin_unwind
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645:5
1: core::panicking::panic_fmt
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
2: <scheduler::pools::unbounded::WorkerScope as core::ops::drop::Drop>::drop
at /home/user/Code/archive/shadow/src/lib/scheduler/src/pools/unbounded.rs:193:17
3: core::ptr::drop_in_place<scheduler::pools::unbounded::WorkerScope>
at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498:1
4: scheduler::pools::unbounded::UnboundedThreadPool::scope
at /home/user/Code/archive/shadow/src/lib/scheduler/src/pools/unbounded.rs:159:5
5: scheduler::thread_per_core::ThreadPerCoreSched<HostType>::scope
at /home/user/Code/archive/shadow/src/lib/scheduler/src/thread_per_core.rs:104:9
6: scheduler::Scheduler<HostType>::scope
at /home/user/Code/archive/shadow/src/lib/scheduler/src/lib.rs:150:43
7: shadow_rs::core::manager::Manager::run
at /home/user/Code/archive/shadow/src/main/core/manager.rs:404:17
8: shadow_rs::core::controller::Controller::run
at /home/user/Code/archive/shadow/src/main/core/controller.rs:66:33
9: shadow_rs::shadow::run_shadow
at /home/user/Code/archive/shadow/src/main/shadow.rs:226:5
10: main_runShadow
at /home/user/Code/archive/shadow/src/main/shadow.rs:473:22
11: main
at /home/user/Code/archive/shadow/src/main/main.c:17:12
12: __libc_start_call_main
at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
13: __libc_start_main_impl
at ./csu/../csu/libc-start.c:360:3
14: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
fatal runtime error: failed to initiate panic, error 5
Aborted