tests/inst: Add destructive test framework #2127
openshift-merge-robot merged 1 commit into ostreedev:master
As I was working on extending some of ostree's destructive test suite to do reboots: ostreedev/ostree#2127

I realized that the Debian autopkgtest API for rebooting is better, because it allows *saving state external to the host*. Rather than having the test count boots as ostree is doing today, the "mark" allows us to dispatch more reliably. Further, because we don't rely on writing anything to disk on the target, we can add clean support for "forced reboots" that might kill the OS before we write to persistent storage there. The "between reboot" state lives in the test runner's memory instead.

We retain support for the previous (two!) reboot APIs here for now.

I tested this with basically the example script from the Debian autopkgtest specification:

```
set -xeuo pipefail
case "${AUTOPKGTEST_REBOOT_MARK:-}" in
  "") echo "test beginning"; /tmp/autopkgtest-reboot mark1 ;;
  mark1) echo "test in mark1"; /tmp/autopkgtest-reboot mark2 ;;
  mark2) echo "test in mark2" ;;
  *) echo "unexpected mark: ${AUTOPKGTEST_REBOOT_MARK}"; exit 1;;
esac
echo "ok autopkgtest rebooting"
```

I think it will actually make sense to implement more of the autopkgtest API. Debian has a nontrivial number of tests using this, and I think there's even work upstream in e.g. systemd to bridge its tests to autopkgtest, which would mean we gain "run systemd's tests in kola" for free.
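For readers following the Rust side of this work, the mark-based dispatch can be sketched as below. This is only an illustration under assumptions: the `Step` enum and the `dispatch` and `current_mark` helpers are hypothetical names, not part of ostree or kola; only the `AUTOPKGTEST_REBOOT_MARK` variable and the `mark1`/`mark2` values come from the script above.

```rust
use std::env;

/// What the test should do in the current boot (hypothetical type).
#[derive(Debug, PartialEq)]
enum Step {
    /// Ask the harness to reboot and re-run the test with this mark.
    RebootWith(&'static str),
    /// All phases have run; the test is finished.
    Done,
}

/// Decide the next step from the mark the harness handed back.
/// `None` or an empty string means this is the first boot.
fn dispatch(mark: Option<&str>) -> Result<Step, String> {
    match mark {
        None | Some("") => Ok(Step::RebootWith("mark1")),
        Some("mark1") => Ok(Step::RebootWith("mark2")),
        Some("mark2") => Ok(Step::Done),
        Some(other) => Err(format!("unexpected mark: {}", other)),
    }
}

/// Read the mark from the environment, as the shell example does.
#[allow(dead_code)]
fn current_mark() -> Option<String> {
    env::var("AUTOPKGTEST_REBOOT_MARK").ok()
}
```

A test binary would call `dispatch(current_mark().as_deref())` on each boot. Because the mark is passed in by the runner, nothing has to be persisted on the target between boots.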
OK a lot more work here; we're testing that we reliably survive forced poweroffs. Still TODO:
OK lifting WIP on this - fault injection would be another level, and I need to solve some other problems before doing that, like being able to pull containers/binaries/packages from the "host" cosa container at least in qemu so we don't rely on internet access (as I really want to be able to run this test in a loop and not have it randomly flake for internet reasons).
jlebon left a comment
Some comments, but LGTM overall. Cool stuff! It took a while honestly to grok how everything fits together. I think the cognitive load of thinking across reboots made it harder.
tests/inst/src/destructive.rs
    let res = res.context("Failed during upgrade")?;
    if res {
        println!(
            "Failed to interrupt upgrade, attempt {}/{}",
We could lower the timeout by e.g. 10% in this case to increase the odds for the next time (maybe after we get to 5 retries or something).
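The suggestion could look roughly like the following sketch. The function name `next_timeout` and its `grace` and `floor` parameters are hypothetical, chosen only to illustrate "shrink by 10% per attempt once a retry threshold is passed"; nothing like this is claimed to exist in the PR.

```rust
use std::time::Duration;

/// Hypothetical policy: keep `base` for the first `grace` failed attempts,
/// then shrink the timeout by 10% for each further failed attempt,
/// never going below `floor`.
fn next_timeout(base: Duration, failed_attempts: u32, grace: u32, floor: Duration) -> Duration {
    let extra = failed_attempts.saturating_sub(grace);
    let mut timeout = base;
    for _ in 0..extra {
        // Integer arithmetic keeps the result exact (no float rounding).
        timeout = timeout * 9 / 10;
        if timeout <= floor {
            return floor;
        }
    }
    timeout.max(floor)
}
```

With `grace` set to 5, the first five failures leave the timeout unchanged and the sixth reduces it to 90% of the base.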
I thought about doing things like this - an issue I've seen is that when I start up 4 instances of this test in parallel, there's heavy CPU usage in the VMs where they're all doing the initial setup, and that inflates the timing for the test upgrade.
The better fix I think would be something like an env OSTREE_PAUSE_POINT=pre-deploy,post-deploy,pre-cleanup where the harness can control when the process continues. (Or maybe implement this with scripting gdb or so)
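As a rough sketch of how a harness or the tool itself might parse such a variable: `OSTREE_PAUSE_POINT` is only proposed in the comment above and is not an existing ostree feature, and the `PausePoint` type and `parse_pause_points` function are hypothetical names for illustration.

```rust
use std::collections::BTreeSet;
use std::str::FromStr;

/// The three pause points named in the proposal (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum PausePoint {
    PreDeploy,
    PostDeploy,
    PreCleanup,
}

impl FromStr for PausePoint {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "pre-deploy" => Ok(PausePoint::PreDeploy),
            "post-deploy" => Ok(PausePoint::PostDeploy),
            "pre-cleanup" => Ok(PausePoint::PreCleanup),
            other => Err(format!("unknown pause point: {}", other)),
        }
    }
}

/// Parse a comma-separated value such as "pre-deploy,post-deploy".
/// Empty entries are skipped; any unknown name is an error.
fn parse_pause_points(value: &str) -> Result<BTreeSet<PausePoint>, String> {
    value
        .split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(|s| s.parse::<PausePoint>())
        .collect()
}
```

The process under test would then block at each enabled point until the harness signals it to continue, which removes the dependence on wall-clock timing that the sleep-based approach has.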
OK ended up doing more fixes and tweaks here; I noticed that the results weren't including the "no-interrupt" case, and fixing/handling that required another tricky special case. I also noticed there were fewer "completed" results for live interrupts than I expected, and that turned out to be me forgetting it needs to be |
This adds infrastructure to the Rust test suite for destructive tests, and adds a new `transactionality` test which runs rpm-ostree in a loop (along with `ostree-finalize-staged`) and repeatedly interrupts it using one of `kill -9`, `reboot`, or `reboot -ff`.

The main goal here is to flush out any "logic errors".

So far I've validated that this passes a lot of cycles using

```
$ kola run --qemu-image=fastbuild-fedora-coreos-ostree-qemu.qcow2 ext.ostree.destructive-rs.transactionality --debug --multiply 8 --parallel 4
```

a number of times.
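To make the shape of such a loop concrete, here is a minimal sketch of cycling through the three interruption strategies and tallying how often each was used. The `Interrupt` enum, the round-robin `pick`, and `tally` are assumptions made for illustration; the actual test in `tests/inst/src/destructive.rs` may select strategies differently (for example, randomly).

```rust
use std::collections::BTreeMap;

/// The three ways of interrupting an upgrade named in the description.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Interrupt {
    Kill9,
    Reboot,
    ForceReboot,
}

impl Interrupt {
    const ALL: [Interrupt; 3] = [Interrupt::Kill9, Interrupt::Reboot, Interrupt::ForceReboot];

    /// Assumed policy: rotate through the strategies by cycle number.
    fn pick(cycle: usize) -> Interrupt {
        Self::ALL[cycle % Self::ALL.len()]
    }

    /// The shell command corresponding to each strategy.
    fn command(self) -> &'static str {
        match self {
            Interrupt::Kill9 => "kill -9",
            Interrupt::Reboot => "reboot",
            Interrupt::ForceReboot => "reboot -ff",
        }
    }
}

/// Count how many times each strategy would be used over `cycles` cycles.
fn tally(cycles: usize) -> BTreeMap<Interrupt, usize> {
    let mut counts = BTreeMap::new();
    for cycle in 0..cycles {
        *counts.entry(Interrupt::pick(cycle)).or_insert(0) += 1;
    }
    counts
}
```

Keeping a per-strategy tally is one way to notice when a whole category of results is missing from the final report.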
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, jlebon
We want to test upgrades that actually change files as a general rule; in some cases we want to test "large" upgrades to validate performance. This code generates a "synthetic" upgrade that adds an ELF note to a percentage of ELF files (randomly selected). By doing it this way we are only actually testing one version of the code. Migrated from coreos/coreos-assembler#1635 using the Rust code from ostreedev/ostree#2127
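The selection step described here can be sketched as follows. This is not the migrated code: `is_elf`, `XorShift64`, and `select_for_mutation` are hypothetical names, the inline xorshift generator merely stands in for a real random source so the sketch needs no external crates, and the step that actually adds the ELF note is left out.

```rust
/// The four magic bytes at the start of every ELF file.
const ELF_MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F'];

/// Return true if the leading bytes of a file look like an ELF header.
fn is_elf(header: &[u8]) -> bool {
    header.starts_with(&ELF_MAGIC)
}

/// Tiny xorshift64 generator used only to keep the sketch dependency-free.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Pick roughly `percent` percent of the ELF files from `(path, header)` pairs.
/// Non-ELF files are never selected.
fn select_for_mutation<'a>(files: &[(&'a str, &[u8])], percent: u64, seed: u64) -> Vec<&'a str> {
    // xorshift must not start from zero, so clamp the seed to at least 1.
    let mut rng = XorShift64(seed.max(1));
    let mut selected = Vec::new();
    for (name, header) in files {
        if is_elf(header) && rng.next_u64() % 100 < percent {
            selected.push(*name);
        }
    }
    selected
}
```

Each selected path would then be rewritten with an extra note section, so file contents change while the code being tested stays the same version.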
This adds infrastructure to the Rust test suite for destructive
tests, and adds a new `transactionality` test which runs
rpm-ostree in a loop (along with `ostree-finalize-staged`) and
repeatedly kills them.
The main goal here is to flush out any "logic errors". I plan
to further extend this to reboots and then force poweroffs.