Shim leak on recovery: `is_same_process` rejects live shim, leaves orphan holding host ports

## Summary

Every `boxlite` invocation after `boxlite run -d` triggers recovery, which misdetects the live shim as dead, deletes its PID file, and marks the box `Stopped` — **without killing the actual shim process**. Subsequent `boxlite exec` then spawns a fresh shim, leaving the original as an orphan that still holds host TCP ports (from image `EXPOSE`), vsock channels, file handles, and the box's full `--memory` allocation.

For most workloads this is a silent leak: RAM accumulates, and `top` eventually shows N orphaned `boxlite-shim` processes. For images with `EXPOSE` directives (`docker:dind`, anything with TCP ports), the leaked shim blocks the next box's gvproxy from binding the same host port. The bind fails with `EADDRINUSE`, gvproxy fails to create the virtual network, and the entire box has no outbound network (ARP probes go nowhere). This is what blocks `test:integration:dind` from passing on a host that has previously run any dind box.

## Reproduction

```bash
boxlite run -d --name a docker:dind sleep infinity  # spawns shim, binds host :2375 :2376
boxlite ls                                          # recovery fires; pid file removed, box marked Stopped
                                                    # shim still alive, still binding :2375 :2376
boxlite rm -f a                                     # state record removed; shim STILL alive
ss -tlnp | grep 2375                                # → libkrun VM pid=XXXXX, orphan

# Now any dind box fails:
make test:integration:dind  # → DNS timeout, ARP INCOMPLETE, build fails
```

Workaround until fixed: `pkill -9 boxlite-shim` between runs.

## Root cause

Two pieces of code with an implicit contract that doesn't hold:

**`vmm/controller/spawn.rs:91`** — shim is deliberately spawned **without CLI args** so secret config (sent via stdin pipe) never lands in world-readable `/proc/<pid>/cmdline`:

```rust
// 4. Build isolated command — no CLI args, config sent via stdin pipe
let no_args: &[String] = &[];
let mut cmd = jail.command(self.binary_path, no_args);
```

**`util/process.rs:283`** — `is_same_process_linux` validates ownership by checking cmdline contains `box_id`:

```rust
args.iter().any(|arg| arg.contains("boxlite-shim")) && cmdline.contains(box_id)
//                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^
//                                                    always false (cmdline has no args)
```

`cmdline.contains(box_id)` is always `false` because spawn never put `box_id` there → `is_same_process` always returns `false` for live shims → recovery at `rt_impl.rs:1196` hits the `else` branch that deletes the pid file and marks the box `Stopped`. The shim itself is not signalled, so it keeps running, keeps holding its resources, and is no longer tracked by the runtime.

`Box process dead, cleaned up stale PID file` is the visible signature in the logs.
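To make the mismatch concrete, here is a minimal sketch of the check run against two cmdline buffers. `cmdline_matches` is a hypothetical stand-in, not the real `is_same_process_linux`; it only assumes that `/proc/<pid>/cmdline` is a sequence of NUL-terminated arguments and reuses the boolean expression quoted above. The binary path and `abc123` id are illustrative.

```rust
// Hypothetical stand-in for the ownership check; the real function lives in
// util/process.rs and may parse the file differently.
fn cmdline_matches(raw_cmdline: &[u8], box_id: &str) -> bool {
    // /proc/<pid>/cmdline holds NUL-terminated arguments back to back.
    let args: Vec<String> = raw_cmdline
        .split(|b| *b == 0)
        .filter(|s| !s.is_empty())
        .map(|s| String::from_utf8_lossy(s).into_owned())
        .collect();
    let cmdline = args.join(" ");
    // Same expression as the excerpt above.
    args.iter().any(|arg| arg.contains("boxlite-shim")) && cmdline.contains(box_id)
}

fn main() {
    let box_id = "abc123"; // illustrative id, not a real box
    // Shim spawned with no CLI args: only argv[0] is present.
    let no_args = b"/usr/bin/boxlite-shim\0";
    // Shim spawned with box_id as its sole argument.
    let with_box_id = b"/usr/bin/boxlite-shim\0abc123\0";
    println!("no args:     {}", cmdline_matches(no_args, box_id));     // false
    println!("with box_id: {}", cmdline_matches(with_box_id, box_id)); // true
}
```

With only `argv[0]` in the buffer the second half of the conjunction can never be true, which is why every live shim is treated as dead.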

## Impact

- **`--memory`-sized RAM leak per `run -d` followed by `ls`/`exec`**: 4 dind boxes with `--memory 2048` leak 8 GB even if the user believes they were all removed.
- **`exec` after `run -d` silently spawns a fresh shim**: the in-memory state of the original box is lost; the user thinks they are talking to the same box but is not (especially bad for a dockerd-style stateful PID 1).
- **Visible failure for images with `EXPOSE`**: the next box of the same image (or with the same port mapping) gets `EADDRINUSE`; gvproxy fails silently; the entire VM has no outbound network. The symptom looks like "boxlite doesn't support docker:dind" but is actually a leak-driven port collision.
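
The port collision in the last bullet can be reproduced in isolation. The sketch below is not boxlite or gvproxy code: two plain `std::net::TcpListener`s stand in for the orphaned shim and the next box's gvproxy, and it uses an OS-assigned loopback port instead of 2375 so it does not disturb a real host.

```rust
use std::io::ErrorKind;
use std::net::TcpListener;

/// Bind a port, keep the listener alive, then try to bind the same address
/// again. Returns the error kind of the second bind, if it failed.
fn second_bind_error() -> std::io::Result<Option<ErrorKind>> {
    // Plays the orphaned shim: still alive, still holding the port.
    // Port 0 asks the OS for any free port.
    let orphan = TcpListener::bind("127.0.0.1:0")?;
    let addr = orphan.local_addr()?;

    // Plays the next box's gvproxy requesting the same host port.
    let second = TcpListener::bind(addr);

    // `orphan` is still in scope here, so the port is held throughout.
    Ok(second.err().map(|e| e.kind()))
}

fn main() -> std::io::Result<()> {
    // On Linux the second bind is expected to fail with AddrInUse (EADDRINUSE).
    println!("second bind: {:?}", second_bind_error()?);
    Ok(())
}
```

The second bind keeps failing for as long as the first listener exists, which matches the report that only killing the leaked shim clears the condition.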

## Suggested fix

Pass `box_id` as the sole argument to the shim. `box_id` is a short random identifier and is not sensitive, unlike the config, which stays on the stdin transport:

```rust
// vmm/controller/spawn.rs
let args = [self.box_id.to_string()];
let mut cmd = jail.command(self.binary_path, &args);
```

The shim's `main` doesn't need to read the argument; it is there purely so that `/proc/<pid>/cmdline` carries it for `is_same_process` to validate. The change is about 3 lines of code.

After this fix:
- Recovery correctly identifies live shims → no false `Stopped` → no orphan creation on `exec`
- `boxlite stop`/`rm` paths already work (they read the pid file, send SIGTERM, and shut down gracefully)
- `test:integration:dind` passes deterministically (no manual `pkill` needed between runs)

Verified locally: shim cmdline becomes `/path/to/boxlite-shim <box_id>`, recovery passes, repeated `run -d`/`exec`/`rm` cycles leave no orphans, `make test:integration:dind` passes on fresh host state without `pkill` between runs.

Shim leak on recovery: `is_same_process` rejects live shim, leaves orphan holding host ports #565
