Implement vfork syscall

Part of https://github.com/shadow/shadow/issues/1987

The `vfork` syscall (or `clone` or `clone3` with the `CLONE_VFORK` flag) is a way of saving some overhead when spawning a new process. Unlike `fork`, the child process shares memory with the parent (hence saving the overhead of copying page tables to make memory copy-on-write in the child). The parent process is suspended until the child process exits or `exec`s.
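The difference in memory sharing can be seen in a small Linux-only experiment. This is a sketch rather than anything from Shadow: `value_seen_by_parent` is a hypothetical helper, and the child's write is exactly the sort of thing POSIX leaves undefined for `vfork`, so it only illustrates the current Linux behavior.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child with fork() or vfork(). The child
 * overwrites a local variable and exits immediately. Returns the value the
 * parent sees afterwards, or -1 on error. */
int value_seen_by_parent(int use_vfork) {
    /* volatile forces the parent to reload the value from memory. */
    volatile int marker = 0;

    pid_t pid = use_vfork ? vfork() : fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child. With fork() this touches a private copy-on-write page.
         * With vfork() on Linux it touches the parent's own stack. */
        marker = 42;
        _exit(0);
    }

    /* Parent. After vfork() we only get here once the child has exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return marker;
}
```

On a native Linux kernel, `value_seen_by_parent(0)` is expected to return 0 and `value_seen_by_parent(1)` to return 42. An environment that emulates `vfork` as `fork` would return 0 in both cases.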

## Importance

Use-cases verified to *not* use vfork:

* `arti` uses `std::process::Command` to spawn pluggable transport processes, which currently [uses `fork`](https://github.com/rust-lang/rust/blob/712d962cef99a61432f65cb764b70768c6a520a5/library/std/src/sys/unix/process/process_unix.rs#L227).

* More generally, `vfork` will probably not be used much in Rust until https://github.com/rust-lang/libc/issues/1596 is fixed.

* `tor` also uses `fork`, not `vfork`, to spawn processes.

* Using `strace` on a simple `bash` script on my machine shows it using `fork`-like `clone` invocations (not `vfork`).

Use-cases that do use vfork:

* The `posix_spawn` libc function is documented as using `vfork`-style process creation. https://www.man7.org/linux/man-pages/man3/posix_spawn.3.html
* python3's `subprocess` module
* `dash` (which is what `/bin/sh` resolves to on many systems)
* Rust's `std::process::Command::spawn`. It's also unusual in that it uses `clone3` with `CLONE_VFORK` and a new stack; e.g. from strace: `clone3({flags=CLONE_VM|CLONE_VFORK, exit_signal=SIGCHLD, stack=0x7efc570b3000, stack_size=0x9000}, 88)`
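To check which process-creation syscall a given libc actually issues, a small `posix_spawn` caller can be traced with `strace -f`. The following is a minimal sketch; `run_sh` is a hypothetical helper, and it assumes `/bin/sh` exists on the test machine.

```c
#include <stddef.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run `/bin/sh -c <script>` via posix_spawn and return
 * the shell's exit code, or -1 if spawning or waiting fails. */
int run_sh(const char *script) {
    char *const argv[] = {"sh", "-c", (char *)script, NULL};
    pid_t pid;

    /* posix_spawn returns an error number directly instead of setting errno. */
    if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0) {
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `run_sh("exit 3")` from a `main` and running the binary under `strace -f` should show whether the local libc uses `vfork`, `clone`, or `clone3` for the spawn.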

## Feasibility

Implementing the shim-side code for `vfork` is tricky. Unlike spawning a new thread, the child process continues running on the same stack. Unlike `fork`, modifications to that stack are seen in the parent as well. Therefore we can't return from our syscall handling functions in the child, since this would corrupt the stack in the parent. We also can't long-jump to the point where the syscall was made (as we do when spawning a new thread) and have the parent return normally, since this would also corrupt the stack in the parent.

We *might* be able to return normally in the child process, and later long-jump in the parent process when it gets to run again. This seems pretty tricky, though.

One possibility is to just treat `vfork` exactly like `fork` (and treat the `CLONE_VFORK` flag as a no-op). In principle this would break code that relies on implementation details of `vfork` under Linux, e.g. by intentionally writing to parent memory from the child, but relying on such implementation details is already fragile and non-portable. For example, the [vfork man page](https://www.man7.org/linux/man-pages/man2/vfork.2.html) states that POSIX.1 specifies that

> behavior is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), or returns from the function in which vfork() was called, or calls any other function before successfully calling [_exit(2)](https://www.man7.org/linux/man-pages/man2/_exit.2.html) or one of the [exec(3)](https://www.man7.org/linux/man-pages/man3/exec.3.html) family of functions.
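Treating `CLONE_VFORK` as a no-op could amount to masking the bit out of the requested flags before handling the call as an ordinary `clone`. A minimal sketch follows; `strip_vfork_flag` is a hypothetical name and not part of Shadow, and the flag values are repeated from the Linux UAPI headers so the snippet stands alone.

```c
#include <stdint.h>

/* Values as defined in the Linux UAPI headers, repeated here (guarded) so
 * the sketch does not depend on which feature-test macros are set. */
#ifndef CLONE_VM
#define CLONE_VM 0x00000100
#endif
#ifndef CLONE_VFORK
#define CLONE_VFORK 0x00004000
#endif

/* Hypothetical helper: drop the "suspend the parent" request and leave
 * every other bit of the flags untouched. */
uint64_t strip_vfork_flag(uint64_t flags) {
    return flags & ~(uint64_t)CLONE_VFORK;
}
```

Note that this sketch leaves `CLONE_VM` alone. A caller that passes `CLONE_VM|CLONE_VFORK`, as in the `clone3` trace above, would still be asking for shared memory, so how to handle that combination is a separate decision that this snippet does not make.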
