This repository was archived by the owner on Mar 16, 2026. It is now read-only.

improve startup/failure reporting of Worker instances #25

Merged
il-steffen merged 1 commit into IntelLabs:master from il-steffen:fix/detect_abort
Nov 4, 2022
il-steffen (Contributor) commented Oct 28, 2022

This prevents the fuzzer status line from being printed while the manager is still waiting for Workers/Qemu to boot.
It also detects when Qemu instances abort or exit. In particular, the first Worker may abort, in which case the others never become READY because the snapshot image was never created. Previously, the manager would just keep waiting for the other Workers to come online.

In detail:

  • on an ABORT hypercall and/or QemuIOException, let Workers send an ABORT msg to the manager thread
  • also send ABORT on Qemu startup, to catch an exit due to a malformed Qemu cmdline etc.
  • update the manager to keep track of READY vs. ABORTed Workers
    • do not print the status line before the first Worker has become READY
      (this leaves the previously printed status messages in place, "waiting for snapshot" or "waiting for VM to start")
    • once at least one Worker has become READY (successful boot + snapshot created), check that the set of READY Workers is larger than the set of ABORTed Workers, and exit otherwise
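
The bookkeeping described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `WorkerTracker`, `run_worker`, the string message types, and the stand-in `QemuIOException` class are all hypothetical names chosen for the example. It also assumes that a Worker which reported READY stays in the READY set even if it later aborts; the description does not say how this case is handled.

```python
from dataclasses import dataclass, field


class QemuIOException(Exception):
    """Stand-in for the exception named in the description (hypothetical here)."""


@dataclass
class WorkerTracker:
    """Hypothetical manager-side record of Workers that reported READY or ABORT."""
    ready: set = field(default_factory=set)
    aborted: set = field(default_factory=set)

    def handle_msg(self, worker_id: int, msg_type: str) -> None:
        # Assumption: a Worker is never removed from a set once added.
        if msg_type == "READY":
            self.ready.add(worker_id)
        elif msg_type == "ABORT":
            self.aborted.add(worker_id)

    def should_print_status(self) -> bool:
        # No status line before the first Worker has become READY.
        return len(self.ready) > 0

    def should_exit(self) -> bool:
        # The rule only applies once at least one Worker has been READY.
        if not self.ready:
            return False
        # Exit unless READY Workers outnumber ABORTed Workers.
        return len(self.ready) <= len(self.aborted)


def run_worker(worker_id: int, start_qemu, send) -> None:
    """Hypothetical Worker wrapper: report ABORT if Qemu startup fails."""
    try:
        start_qemu()
    except QemuIOException:
        send(worker_id, "ABORT")
        return
    send(worker_id, "READY")
```

With two Workers that both become READY, a single ABORT leaves `should_exit()` false (2 > 1), while a second ABORT makes it true. Startup aborts that happen before any Worker is READY are outside the stated rule and are not modeled here.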

To test:

  • launch the fuzzer with invalid Qemu options, e.g. "qemu_extra: -foobar"
  • launch a harness that errors out, e.g. one that triggers the ABORT hypercall before or after the snapshot

il-steffen requested a review from Wenzel on October 29, 2022 01:05
il-steffen marked this pull request as ready for review on October 29, 2022 01:05
il-steffen force-pushed the fix/detect_abort branch 3 times, most recently from ab05c69 to 74105c4, on November 2, 2022 18:35
