Skip to content

crash: rootedcell "Dropped without calling safely_drop" #3290

@jtracey

Description

@jtracey

Describe the issue
I can consistently crash shadow by running any tokio program with enough threads. I've seen this crash a few times on actual experiments with more modest thread counts, but it was infrequent enough that I hadn't been able to make a stable reproducer and so couldn't tell if it wasn't just a bad memory module or some such. Once I realized I could cause it with enough tokio threads, reproducing is easy, on multiple machines.

To Reproduce

Run any shadow experiment with the following Rust program:

#[tokio::main(worker_threads = 20000)]
async fn main() {
    println!("Hello, world!");
}

where 20000 is just an arbitrarily huge number of threads. I've made a tiny reproducer branch, but it's basically just the above with the simplest shadow config:
https://github.com/jtracey/actions-sandbox/tree/shadow-bug

Outside Shadow, it takes a moment to run, but does run. Inside Shadow, it consistently crashes. Should be able to reproduce with cargo install --path . && shadow shadow.config.yaml.

Operating System (please complete the following information):

  • OS and version: post the output of lsb_release -d
    Debian GNU/Linux 12 (bookworm) (Qubes), Ubuntu 22.04.2 LTS (ripple's clonk)
  • Kernel version: post the output of uname -a
    6.1.57-1.qubes.fc37.x86_64

Shadow (please complete the following information):

  • Version and build information: post the output of shadow --show-build-info
Shadow 3.1.0 — v3.1.0-99-gde06d5240 2024-01-26--14:49:37
GLib 2.74.6
Built on 2024-01-30--01:16:24
Built from git branch main
Shadow was built with PROFILE=debug, OPT_LEVEL=1, DEBUG=true, RUSTFLAGS="-C force-frame-pointers=y", CFLAGS="-std=gnu11 -O0 -ggdb -fno-omit-frame-pointer -Wreturn-type -Wswitch -DDEBUG"
  • Which processes you are trying to run inside the Shadow simulation:
    above tokio toy ("shadow-bug")

Additional context

RUST_BACKTRACE=1 shadow.log
** Starting Shadow 3.1.0
00:00:00.000003 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] Shadow 3.1.0 — v3.1.0-99-gde06d5240 2024-01-26--14:49:37
00:00:00.000035 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] GLib 2.74.6
00:00:00.000036 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] Built on 2024-01-30--01:16:24
00:00:00.000037 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] Built from git branch main
00:00:00.000038 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] Shadow was built with PROFILE=debug, OPT_LEVEL=1, DEBUG=true, RUSTFLAGS="-C force-frame-pointers=y", CFLAGS="-std=gnu11 -O0 -ggdb -fno-omit-frame-pointer -Wreturn-type -Wswitch -DDEBUG"
00:00:00.000039 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:196] [shadow_rs::shadow] For more information, visit https://shadow.github.io or https://github.com/shadow
00:00:00.000040 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:198] [shadow_rs::shadow] Logging current startup arguments and environment
00:00:00.000044 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:450] [shadow_rs::shadow] arg: shadow
00:00:00.000045 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:450] [shadow_rs::shadow] arg: shadow.config.yaml
00:00:00.000940 [123247:shadow] n/a [INFO] [n/a] [sim_config.rs:375] [shadow_rs::core::sim_config] Resolved binary path "shadow-bug" to "/home/user/.cargo/bin/shadow-bug"
00:00:00.002472 [123247:shadow] n/a [INFO] [n/a] [shadow.rs:222] [shadow_rs::shadow] Log message buffering is enabled for efficiency
00:00:00.002673 [123247:shadow] n/a [WARN] [n/a] [tsc.c:192] [TscC_nativeCyclesPerSecond] Couldn't get CPU TSC frequency
00:00:00.002674 [123247:shadow] n/a [WARN] [n/a] [manager.rs:82] [shadow_rs::core::manager] Couldn't find native TSC frequency. Emulated rdtsc may use a rate different than managed code expects
00:00:00.002868 [123247:shadow] n/a [INFO] [n/a] [manager.rs:130] [shadow_rs::core::manager] Preloading the openssl crypto library is disabled
00:00:00.003298 [123247:shadow] n/a [INFO] [n/a] [controller.rs:65] [shadow_rs::core::controller] Running simulation
00:00:00.003445 [123247:shadow] n/a [INFO] [n/a] [manager.rs:251] [shadow_rs::core::manager] The parallelism option was 0, so using parallelism=8
00:00:00.004092 [123262:shadow-worker] 00:00:00.000000000 [INFO] [host0:11.0.0.1] [tracker.c:448] [_tracker_logNode] [shadow-heartbeat] [node-header] interval-seconds,recv-bytes,send-bytes,cpu-percent,delayed-count,avgdelay-milliseconds;inbound-localhost-counters;outbound-localhost-counters;inbound-remote-counters;outbound-remote-counters where counters are: packets-total,bytes-total,packets-control,bytes-control-header,packets-control-retrans,bytes-control-header-retrans,packets-data,bytes-data-header,bytes-data-payload,packets-data-retrans,bytes-data-header-retrans,bytes-data-payload-retrans
00:00:00.004124 [123262:shadow-worker] 00:00:00.000000000 [INFO] [host0:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.004188 [123262:shadow-worker] 00:00:01.000000000 [INFO] [host0:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.004265 [123247:shadow] n/a [INFO] [n/a] [manager.rs:694] [shadow_rs::core::manager] Process resource usage at simtime 2000000000 reported by getrusage(): ru_maxrss=0.154 GiB, ru_utime=0.000 minutes, ru_stime=0.000 minutes, ru_nvcsw=61, ru_nivcsw=26
00:00:00.004271 [123247:shadow] n/a [INFO] [n/a] [manager.rs:712] [shadow_rs::core::manager] System memory usage in bytes at simtime 2000000000 ns reported by /proc/meminfo: {"mem_total":32351444992,"mem_free":11180363776,"swap_total":1073737728,"swap_free":1073737728,"buffers":323063808,"cached":13112467456,"s_reclaimable":534978560,"shmem":199069696}
00:00:00.004290 [123262:shadow-worker] 00:00:02.000000000 [INFO] [host0:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.004313 [123262:shadow-worker] 00:00:03.000000000 [INFO] [host0:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:00.004351 [123247:shadow] n/a [INFO] [n/a] [manager.rs:694] [shadow_rs::core::manager] Process resource usage at simtime 4000000000 reported by getrusage(): ru_maxrss=0.154 GiB, ru_utime=0.000 minutes, ru_stime=0.000 minutes, ru_nvcsw=63, ru_nivcsw=28
00:00:00.004354 [123247:shadow] n/a [INFO] [n/a] [manager.rs:712] [shadow_rs::core::manager] System memory usage in bytes at simtime 4000000000 ns reported by /proc/meminfo: {"mem_total":32351444992,"mem_free":11180363776,"swap_total":1073737728,"swap_free":1073737728,"buffers":323063808,"cached":13112467456,"s_reclaimable":534978560,"shmem":199069696}
00:00:00.004369 [123262:shadow-worker] 00:00:04.000000000 [INFO] [host0:11.0.0.1] [tracker.c:474] [_tracker_logNode] [shadow-heartbeat] [node] 1,0,0,0.000000,0,0.000000;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0;0,0,0,0,0,0,0,0,0,0,0,0
00:00:04.692359 [123262:shadow-worker] 00:00:05.000000000 [ERROR] [host0:11.0.0.1] [rc.rs:131] [shadow_shim_helper_rs::rootedcell::rc] Dropped without calling `safely_drop`
thread 'shadow-worker' panicked at /home/user/Code/archive/shadow/src/lib/shadow-shim-helper-rs/src/rootedcell/rc.rs:145:17:
Dropped without calling `safely_drop`
stack backtrace:
   0: rust_begin_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
   2: <shadow_shim_helper_rs::rootedcell::rc::RootedRcCommon<T> as core::ops::drop::Drop>::drop
             at /home/user/Code/archive/shadow/src/lib/shadow-shim-helper-rs/src/rootedcell/rc.rs:145:17
   3: core::ptr::drop_in_place<shadow_shim_helper_rs::rootedcell::rc::RootedRcCommon<shadow_shim_helper_rs::rootedcell::refcell::RootedRefCell<shadow_rs::host::descriptor::descriptor_table::DescriptorTable>>>
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498:1
   4: core::ptr::drop_in_place<shadow_shim_helper_rs::rootedcell::rc::RootedRc<shadow_shim_helper_rs::rootedcell::refcell::RootedRefCell<shadow_rs::host::descriptor::descriptor_table::DescriptorTable>>>
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498:1
   5: shadow_rs::host::syscall::handler::clone::<impl shadow_rs::host::syscall::handler::SyscallHandler>::clone_internal
             at /home/user/Code/archive/shadow/src/main/host/syscall/handler/clone.rs:295:5
   6: shadow_rs::host::syscall::handler::clone::<impl shadow_rs::host::syscall::handler::SyscallHandler>::clone3_original
             at /home/user/Code/archive/shadow/src/main/host/syscall/handler/clone.rs:364:9
   7: shadow_rs::host::syscall::handler::clone::<impl shadow_rs::host::syscall::handler::SyscallHandler>::clone3
             at /home/user/Code/archive/shadow/src/main/host/syscall/handler/clone.rs:335:5
   8: core::ops::function::Fn::call
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ops/function.rs:79:5
   9: <F as shadow_rs::host::syscall::handler::SyscallHandlerFn<(T1,T2)>>::call
             at /home/user/Code/archive/shadow/src/main/host/syscall/handler/mod.rs:768:9
  10: shadow_rs::host::syscall::handler::SyscallHandler::run_handler
             at /home/user/Code/archive/shadow/src/main/host/syscall/handler/mod.rs:364:38
  11: shadow_rs::host::syscall::handler::SyscallHandler::syscall
             at /home/user/Code/archive/shadow/src/main/host/syscall/handler/mod.rs:172:22
  12: shadow_rs::host::managed_thread::ManagedThread::resume
             at /home/user/Code/archive/shadow/src/main/host/managed_thread.rs:274:31
  13: shadow_rs::host::thread::Thread::resume
             at /home/user/Code/archive/shadow/src/main/host/thread.rs:469:19
  14: shadow_rs::host::process::Process::resume
             at /home/user/Code/archive/shadow/src/main/host/process.rs:1201:19
  15: shadow_rs::host::host::Host::resume
             at /home/user/Code/archive/shadow/src/main/host/host.rs:487:13
  16: shadow_rs::host::host::Host::add_application::{{closure}}
             at /home/user/Code/archive/shadow/src/main/host/host.rs:452:13
  17: shadow_rs::core::work::task::TaskRef::execute
             at /home/user/Code/archive/shadow/src/main/core/work/task.rs:32:9
  18: shadow_rs::host::host::Host::execute
             at /home/user/Code/archive/shadow/src/main/host/host.rs:859:43
  19: shadow_rs::core::manager::Manager::run::{{closure}}::{{closure}}::{{closure}}
             at /home/user/Code/archive/shadow/src/main/core/manager.rs:419:37
  20: shadow_rs::core::manager::for_each_host::{{closure}}::{{closure}}
             at /home/user/Code/archive/shadow/src/main/core/manager.rs:818:13
  21: shadow_rs::core::worker::Worker::with_active_host::{{closure}}::{{closure}}
             at /home/user/Code/archive/shadow/src/main/core/worker.rs:114:32
  22: core::option::Option<T>::map
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/option.rs:1072:29
  23: shadow_rs::core::worker::Worker::with_active_host::{{closure}}
             at /home/user/Code/archive/shadow/src/main/core/worker.rs:114:13
  24: shadow_rs::core::worker::Worker::with::{{closure}}::{{closure}}
             at /home/user/Code/archive/shadow/src/main/core/worker.rs:419:43
  25: core::option::Option<T>::map
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/option.rs:1072:29
  26: shadow_rs::core::worker::Worker::with::{{closure}}
             at /home/user/Code/archive/shadow/src/main/core/worker.rs:419:27
  27: std::thread::local::LocalKey<T>::try_with
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/thread/local.rs:270:16
  28: shadow_rs::core::worker::Worker::with
             at /home/user/Code/archive/shadow/src/main/core/worker.rs:418:9
  29: shadow_rs::core::worker::Worker::with_active_host
             at /home/user/Code/archive/shadow/src/main/core/worker.rs:112:9
  30: shadow_rs::core::manager::for_each_host::{{closure}}
             at /home/user/Code/archive/shadow/src/main/core/manager.rs:817:9
  31: scheduler::thread_per_core::HostIter<HostType>::for_each
             at /home/user/Code/archive/shadow/src/lib/scheduler/src/thread_per_core.rs:208:43
  32: scheduler::HostIter<HostType>::for_each
             at /home/user/Code/archive/shadow/src/lib/scheduler/src/lib.rs:246:39
  33: shadow_rs::core::manager::for_each_host
             at /home/user/Code/archive/shadow/src/main/core/manager.rs:815:5
  34: shadow_rs::core::manager::Manager::run::{{closure}}::{{closure}}
             at /home/user/Code/archive/shadow/src/main/core/manager.rs:416:29
  35: scheduler::SchedulerScope<HostType>::run_with_data::{{closure}}
             at /home/user/Code/archive/shadow/src/lib/scheduler/src/lib.rs:226:17
  36: scheduler::thread_per_core::SchedulerScope<HostType>::run_with_data::{{closure}}
             at /home/user/Code/archive/shadow/src/lib/scheduler/src/thread_per_core.rs:174:13
  37: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/boxed.rs:2021:9
  38: scheduler::pools::unbounded::work_loop
             at /home/user/Code/archive/shadow/src/lib/scheduler/src/pools/unbounded.rs:269:31
  39: scheduler::pools::unbounded::UnboundedThreadPool::new::{{closure}}
             at /home/user/Code/archive/shadow/src/lib/scheduler/src/pools/unbounded.rs:71:21
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
00:00:05.036183 [123262:shadow-worker] 00:00:05.000000000 [ERROR] [host0:11.0.0.1] [rc.rs:131] [shadow_shim_helper_rs::rootedcell::rc] Dropped without calling `safely_drop`
00:00:05.036405 [123262:shadow-worker] 00:00:05.000000000 [ERROR] [host0:11.0.0.1] [rc.rs:131] [shadow_shim_helper_rs::rootedcell::rc] Dropped without calling `safely_drop`
thread '<unnamed>' panicked at lib/scheduler/src/pools/unbounded.rs:193:17:
A work thread panicked
stack backtrace:
   0: rust_begin_unwind
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/panicking.rs:72:14
   2: <scheduler::pools::unbounded::WorkerScope as core::ops::drop::Drop>::drop
             at /home/user/Code/archive/shadow/src/lib/scheduler/src/pools/unbounded.rs:193:17
   3: core::ptr::drop_in_place<scheduler::pools::unbounded::WorkerScope>
             at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498:1
   4: scheduler::pools::unbounded::UnboundedThreadPool::scope
             at /home/user/Code/archive/shadow/src/lib/scheduler/src/pools/unbounded.rs:159:5
   5: scheduler::thread_per_core::ThreadPerCoreSched<HostType>::scope
             at /home/user/Code/archive/shadow/src/lib/scheduler/src/thread_per_core.rs:104:9
   6: scheduler::Scheduler<HostType>::scope
             at /home/user/Code/archive/shadow/src/lib/scheduler/src/lib.rs:150:43
   7: shadow_rs::core::manager::Manager::run
             at /home/user/Code/archive/shadow/src/main/core/manager.rs:404:17
   8: shadow_rs::core::controller::Controller::run
             at /home/user/Code/archive/shadow/src/main/core/controller.rs:66:33
   9: shadow_rs::shadow::run_shadow
             at /home/user/Code/archive/shadow/src/main/shadow.rs:226:5
  10: main_runShadow
             at /home/user/Code/archive/shadow/src/main/shadow.rs:473:22
  11: main
             at /home/user/Code/archive/shadow/src/main/main.c:17:12
  12: __libc_start_call_main
             at ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
  13: __libc_start_main_impl
             at ./csu/../csu/libc-start.c:360:3
  14: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
fatal runtime error: failed to initiate panic, error 5
Aborted

Metadata

Metadata

Assignees

Labels

Type: BugError or flaw producing unexpected results

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions