Bug fixes and improvements for the udev test suite#8819
Bug fixes and improvements for the udev test suite#8819mwilck wants to merge 24 commits intosystemd:masterfrom
Conversation
/dev/null is required by udev for output redirection, otherwise it emits irritating warnings for several test cases. Create this device in the test environment. This requires dropping "nodev" from the mountpoint for /dev in the test environment.
Allow testing cases where multiple devices are added and removed. This implies a change of the data structure: every test allows for multiple devices to be added, and "exp_name" etc. are now properties of the device, not of the test.
It's not necessary to write the rules for every udev run, as we now may have many (rather than just 2) per test.
Allow testing cases where multiple devices are added and removed simultaneously. Tests are started as synchronously as possible using a semaphore, in order to test possible race conditions. If this isn't desired, the test parameter "sleep_us" can be set to the number of microseconds to wait between udev invocations.
More often than not, the created devnode is the basename of the sysfs entry. The "devnode" device may be used to override the auto-detected node name. Permissions and major/minor number are now verified on the devnode itself, not on symlinks. This allows separate testing for devnodes and symlinks an a follow-up patch.
Expected device node names are now tested automatically. Use exp_name only for symlinks in the test definitions, to avoid confusion and duplicated tests.
Test if symlinks are created correctly by comparing the symlink targets to the devnode path. This implies (for the symlink) that major/minor numbers and permissions are correct, as we have tested that on the devnode already.
Instead of testing the existence or non-exisitence of just a single symlink, allow testing of several links per device.
Change the test definitions to work with the previous patch.
udev hasn't supported renaming device nodes for some time.
the "last_rule" option hasn't been supported for some time. Therefore this test fails if a "not_exp_links" attribute is added, as it should be. Mark it appropriately.
Add some rules that make it a bit harder to pass, mainly the non-existence checks.
These rules have survived from an ancient version of the code and save no purpose any more.
As we can check multiple links in a single test now, these 3 tests can be merged into one.
As we can test multiple devices and multiple links per device in one test now, these two tests can be merged into one.
This is helpful to catch possible regressions in the test. Also, don't count wait() errors, they are likely not udev errors.
Add 4 new tests using multiple devices. Number 2-4 use many devices claiming the same symlink, where only one device has a higher priority thatn the others. They fail sporadically with the current code, if a race condition causes the symlink to point to the wrong device. Test 4 is like test 2 with sleeps in between, it's much less likely to fail.
for easier reproduction of sporadic test failures.
Manually listing all devices in the test definition becomes cumbersome with lots of devices. Add a function that scans on all block devices in the test sysfs and generates a list of devices to test.
Script for generating LOTS of SCSI disk and partition devices in the fake sysfs we use for udev testing. This script is supposed to be run after sys-script.py. It uses code from sys-script.py as template to generate additional SCSI disk data structures. Together with the "generator" code in udev-test.pl added in the previous patch, it allows to run udev tests with almost arbitrarily many devices, and thus to do performance scaling tests.
keszybz
left a comment
There was a problem hiding this comment.
Can't locate IPC/SysV.pm in @INC (you may need to install the IPC::SysV module) (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at /builddir/build/BUILD/systemd-239/test/udev-test.pl line 23.
It seems we need to tweak our CI, otherwise we can't even check this properly.
| { | ||
| devpath => "/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda1", | ||
| exp_name => "sda1" , | ||
| exp_rem_error => "yes", |
There was a problem hiding this comment.
So this one is actually the single (?) case where devices has two items. It'd be great to make this a separate commit, so it is not buried in all the other trivial cases where devices has one item.
Oops - IPC::SysV is a "core" Perl module, I didn't expect that to cause problems. That's not the only module I needed to add; there's also POSIX, IPC:Semaphore, Time:HiRes, and Cwd. They are all core modules. |
test/udev-test.pl
Outdated
|
|
||
| $sema->remove; | ||
| print "$error errors occurred\n\n"; | ||
| print "$error errors occurred. $good good results.\n\n"; |
There was a problem hiding this comment.
I think it would make sense to go one step further, and verify that good is above some threshold (and that threshold would have to be adjusted when tests are added or removed).
There was a problem hiding this comment.
I added an additional patch for the "expected good" count.
|
I added IPC::SysV to Fedora CI in https://src.fedoraproject.org/rpms/systemd/c/1eb8bc72df8f4dccff1e094b11eefce8adda896f?branch=rpm-master. The other ones might be pulled in by something else. Let's add them if it turns out that they are missing. If you push again, Fedora CI will run. The failures on Ubuntu seem to be caused by something different. |
Right. That's because these tests show a deficiency of the current udev code. The failure is sporadic, therefore adding the repeat count increases the failure probability. Merging #8667 should fix this. I kept the two pull requests separate on purpose for now.
That happens all the time. AFAIK it's not caused by my patches, but I'll double-check.
OK, it's /bin/mknod on Ubuntu. I'll fix that. |
umount emits an error message "no mount point specified" if the tmpfs isn't mounted yet, which is the normal case. Suppress that by redirecting stderr.
On some distros, mknod isn't under /usr/bin. Rely on PATH instead. Fixes: e4c1f9a "test/udev-test.pl: create /dev/null"
|
I just pushed fixes for the mknod and umount issues you mentioned. Wrt the "multiple-devices" test, I need to understand the policy. If all tests are required to pass do to CI requirements, I can add " |
Since 'test/udev-test.pl: count "good" results', we know how many checks succeeded. Add an "expected good" count to make that number more meaningful.
|
@keszybz, anything else I can do to get this merged? |
, so in practice, any use of any module might cause the test to fail immediately. I think it would make sense to check if all the modules used in
Would it be better to move the test cases related to #8667 there so that the addition of the expected errors could be avoided? |
That's dangerous for a regression test. The current behavior - fail the test, and require one-time developer action in the CI setup - seems to be safer to me. For the IPC::SysV module, that has happened already according to @keszybz' comment above, so AFAICS there's no further action needed, right?
I could do that. I can also disable the tests, or modify the test to temporarily ignore certain failures and spit out a warning. I was wondering whether systemd might have an established policy for cases like this. What exactly is the recommendation? |
I think it depends on the perception of the test. If it's expected that it has to fail no matter what, then checks like b1ffacb should be removed as well.
As far as I can see,
In my opinion this PR is just a part of #8667, so I think it would make sense to move all the commits there and continue the discussion there too. It would automatically resolve all the issues related to whether the test should be disabled due to the expected failures or not. |
The only case like this I was able to find is #5520, where the test has been waiting for the bug to be fixed. Apparently, it will be merged as soon as it doesn't fail. |
I don't think I can help with that.
OK, I'll work on it ASAP, currently busy with other stuff. |
|
Superseded by #9551. |
This patch set fixes a few minor problems and adds new features to the udev test program, udev-test.pl, and the python code that is used to set up the fake sysfs for these tests.
The new features are:
The motivation for doing this was to reproduce the problem addressed in #8667 in the test suite, and to test for performance regressions (lock contention) caused by my proposed fix for the same issue.
I verified that all tests are still correct and pass. I actually made the test suite a bit harder to pass by adding additional consistence checks. The ability to test several symlinks in a single test allowed me to merge a few tests.
Two of the tests added in the "add multiple device test" commit do FAIL sporadically with the current code because of the mentioned race condition.