Implement basic signal emulation#1881
Sorry this PR grew into a bit of a monster. I didn't want to merge the new data structures without the code that uses them, nor that code without tests to exercise it. I might be able to split this PR into some smaller individual commits along those lines if that'd make things easier for you, though I'm not sure they'd build or make a lot of sense on their own.
Fixes #1455
Cool! I learned a lot about signals reading it :)
Also, I was testing the PR and I ran into the error "Lock is already held. This is probably a deadlock".
Here's the contrived simulation config if you want to look at it:
general:
  stop_time: 10s
network:
  graph:
    type: 1_gbit_switch
hosts:
  server:
    network_node_id: 0
    processes:
    - path: /bin/python3
      args: -m http.server 80
      start_time: 3s
    - path: /bin/kill
      args: '1000'
      start_time: 4s
    - path: /bin/true
      start_time: 5s

Edit: I also get a different error with the following config:
general:
  stop_time: 10s
network:
  graph:
    type: 1_gbit_switch
hosts:
  server:
    network_node_id: 0
    processes:
    - path: /bin/sleep
      args: '5'
      start_time: 3s
    - path: /bin/kill
      args: '1000'
      start_time: 4s
    - path: /bin/true
      start_time: 5s

00:00:00.043300 [581144:shadow-worker] 00:00:04.000000000 [INFO] [server:11.0.0.1] [process.c:373] [_process_getAndLogReturnCode] main success code '0' for process 'server.kill.1001'
**ERROR ENCOUNTERED**
At process: 581136 (parent 565393)
At file: src/main/host/syscall/time.c
At line: 68
At function: _syscallhandler_nanosleep_helper
Message: nanosleep unblocked but a timer is still pending.
**BEGIN BACKTRACE**
Obtained 16 stack frames:
build/src/main/shadow(+0x128efd5) [0x5634e1752fd5]
build/src/main/shadow(utility_handleError+0xeb) [0x5634e17531af]
build/src/main/shadow(+0x12b57ab) [0x5634e17797ab]
build/src/main/shadow(syscallhandler_nanosleep+0x67) [0x5634e177987e]
build/src/main/shadow(syscallhandler_make_syscall+0x204a) [0x5634e176cced]
build/src/main/shadow(threadpreload_resume+0x25e) [0x5634e1743634]
build/src/main/shadow(thread_resume+0xcb) [0x5634e173e346]
build/src/main/shadow(process_continue+0x14c) [0x5634e1736638]
build/src/main/shadow(+0x127e241) [0x5634e1742241]
build/src/main/shadow(task_execute+0x78) [0x5634e17598fb]
build/src/main/shadow(event_execute+0x16b) [0x5634e1759384]
build/src/main/shadow(worker_runEvent+0x30) [0x5634e172f94f]
build/src/main/shadow(+0x129018d) [0x5634e175418d]
build/src/main/shadow(+0x126b7fb) [0x5634e172f7fb]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f078dab0609]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7f078d51c293]
**END BACKTRACE**
**ABORTING**
Ah nice. I failed to release the hostlock before raising a fatal signal natively in the shim-side handler. Fixed: ee7dc2b
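For anyone skimming, the general shape of this kind of bug can be sketched in a few lines. This is a toy model, not Shadow's code: `HostLock`, `deliver_fatal_signal`, and the `raise_natively` callback are hypothetical names, and it assumes that whatever runs as a result of the native raise needs to take the lock again.

```python
import threading


class HostLock:
    """Hypothetical non-reentrant lock that reports re-acquisition instead of hanging."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self):
        # A second acquire while the lock is held would block forever, so fail loudly.
        if not self._lock.acquire(blocking=False):
            raise RuntimeError("Lock is already held. This is probably a deadlock")

    def release(self):
        self._lock.release()


def deliver_fatal_signal(lock, raise_natively):
    """Do the bookkeeping under the lock, then raise with the lock released."""
    lock.acquire()
    try:
        pass  # placeholder: inspect or update emulated signal state here
    finally:
        lock.release()
    # Assumption: code that runs because of the native raise may take the lock
    # again, so the lock must not be held at this point.
    raise_natively()
```

Calling `raise_natively()` inside the `try` block instead would reproduce the "Lock is already held" failure in this model.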
I actually do have some fixes for how nanosleep handles getting interrupted. They were originally in this PR, but I broke them out to slim this one down. I'll resurrect that change after getting this one merged. We'll probably need to fix up some other syscalls as well; e.g. I think there are others with timeouts that assume the timeout expired if
Opened #1889
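To make the nanosleep case concrete, here is a rough sketch of the decision a handler has to make when a blocked sleep is resumed. It is not the code from the follow-up PR; `SleepResult` and `finish_sleep` are hypothetical names. The only real-world behavior it leans on is that nanosleep returns -1 with errno set to EINTR and reports the remaining time when a signal interrupts it.

```python
import errno
from dataclasses import dataclass


@dataclass
class SleepResult:
    retval: int        # 0 on a completed sleep, -1 on interruption
    err: int           # 0 or errno.EINTR
    remaining_ns: int  # time left to sleep when interrupted


def finish_sleep(now_ns, deadline_ns, signal_pending):
    """Decide why a blocked sleep was resumed instead of assuming the timer fired."""
    if now_ns >= deadline_ns:
        # The timeout really did expire.
        return SleepResult(0, 0, 0)
    if signal_pending:
        # Interrupted early: report EINTR and how much time was left.
        return SleepResult(-1, errno.EINTR, deadline_ns - now_ns)
    # Neither condition holds; treat it as a bug rather than guessing.
    raise RuntimeError("sleep unblocked with no expired timer and no pending signal")
```

With the second config above, and assuming the `sleep 5` process starts at 3s and is signaled at 4s, `finish_sleep(4_000_000_000, 8_000_000_000, True)` reports an interruption with 4 seconds remaining.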
Btw, while debugging your first example, I learned that the python simplehttp server doesn't install a handler to gracefully shut down on SIGTERM. It does for SIGINT. With this config the process and shadow exit gracefully:
It looks like nginx also shuts down on SIGINT (in addition to SIGTERM): http://nginx.org/en/docs/control.html
Maybe SIGINT will be a better default "graceful shutdown" signal, though we'll probably want a way to override it as well.
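For comparison, this is roughly what a Python program would have to do to treat SIGTERM the way it already treats SIGINT. It is only a sketch: `sigterm_as_keyboard_interrupt` and `install_graceful_sigterm` are hypothetical helper names, and nothing like this is installed by `python3 -m http.server` itself.

```python
import signal


def sigterm_as_keyboard_interrupt(signum, frame):
    """Turn SIGTERM into the same exception that Ctrl-C (SIGINT) produces."""
    raise KeyboardInterrupt


def install_graceful_sigterm():
    """Install the handler; must be called from the main thread.

    Returns the previously installed SIGTERM handler.
    """
    return signal.signal(signal.SIGTERM, sigterm_as_keyboard_interrupt)
```

A program that already cleans up on `KeyboardInterrupt` would then take the same exit path when it receives SIGTERM.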
Implements most of MS2 from #1851.
Notable pieces still missing:
sigaltstack.
Neither of these is very difficult, but they seemed worth leaving out of this already-large PR, and maybe deferring to MS3.