Bug fixes and improvements for the udev test suite by mwilck · Pull Request #8819 · systemd/systemd

mwilck · 2018-04-25T20:23:12Z

This patch set fixes a few minor problems and adds new features to the udev test program, udev-test.pl, and the python code that is used to set up the fake sysfs for these tests.
The new features are:

allow addition and removal of several devices in a single test
allow testing for the (non-)existence of several symlinks per device in a single test
allow auto-generation of test device lists from the fake sysfs
generate (almost) arbitrarily many SCSI device entries and test on them

The motivation for doing this was to reproduce the problem addressed in #8667 in the test suite, and to test for performance regressions (lock contention) caused by my proposed fix for the same issue.
I verified that all tests are still correct and pass. I actually made the test suite a bit harder to pass by adding additional consistence checks. The ability to test several symlinks in a single test allowed me to merge a few tests.
Two of the tests added in the "add multiple device test" commit do FAIL sporadically with the current code because of the mentioned race condition.

/dev/null is required by udev for output redirection, otherwise it emits irritating warnings for several test cases. Create this device in the test environment. This requires dropping "nodev" from the mountpoint for /dev in the test environment.

Allow testing cases where multiple devices are added and removed. This implies a change of the data structure: every test allows for multiple devices to be added, and "exp_name" etc. are now properties of the device, not of the test.

It's not necessary to write the rules for every udev run, as we now may have many (rather than just 2) per test.

Allow testing cases where multiple devices are added and removed simultaneously. Tests are started as synchronously as possible using a semaphore, in order to test possible race conditions. If this isn't desired, the test parameter "sleep_us" can be set to the number of microseconds to wait between udev invocations.

More often than not, the created devnode is the basename of the sysfs entry. The "devnode" device may be used to override the auto-detected node name. Permissions and major/minor number are now verified on the devnode itself, not on symlinks. This allows separate testing for devnodes and symlinks an a follow-up patch.

Expected device node names are now tested automatically. Use exp_name only for symlinks in the test definitions, to avoid confusion and duplicated tests.

Test if symlinks are created correctly by comparing the symlink targets to the devnode path. This implies (for the symlink) that major/minor numbers and permissions are correct, as we have tested that on the devnode already.

Instead of testing the existence or non-exisitence of just a single symlink, allow testing of several links per device.

Change the test definitions to work with the previous patch.

udev hasn't supported renaming device nodes for some time.

the "last_rule" option hasn't been supported for some time. Therefore this test fails if a "not_exp_links" attribute is added, as it should be. Mark it appropriately.

Add some rules that make it a bit harder to pass, mainly the non-existence checks.

These rules have survived from an ancient version of the code and save no purpose any more.

As we can check multiple links in a single test now, these 3 tests can be merged into one.

As we can test multiple devices and multiple links per device in one test now, these two tests can be merged into one.

This is helpful to catch possible regressions in the test. Also, don't count wait() errors, they are likely not udev errors.

Add 4 new tests using multiple devices. Number 2-4 use many devices claiming the same symlink, where only one device has a higher priority thatn the others. They fail sporadically with the current code, if a race condition causes the symlink to point to the wrong device. Test 4 is like test 2 with sleeps in between, it's much less likely to fail.

for easier reproduction of sporadic test failures.

Manually listing all devices in the test definition becomes cumbersome with lots of devices. Add a function that scans on all block devices in the test sysfs and generates a list of devices to test.

Script for generating LOTS of SCSI disk and partition devices in the fake sysfs we use for udev testing. This script is supposed to be run after sys-script.py. It uses code from sys-script.py as template to generate additional SCSI disk data structures. Together with the "generator" code in udev-test.pl added in the previous patch, it allows to run udev tests with almost arbitrarily many devices, and thus to do performance scaling tests.

keszybz

Can't locate IPC/SysV.pm in @INC (you may need to install the IPC::SysV module) (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at /builddir/build/BUILD/systemd-239/test/udev-test.pl line 23.

It seems we need to tweak our CI, otherwise we can't even check this properly.

keszybz · 2018-04-26T09:22:52Z

test/udev-test.pl

+                        {
+                                devpath         => "/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/sda1",
+                                exp_name        => "sda1" ,
+                                exp_rem_error   => "yes",


So this one is actually the single (?) case where devices has two items. It'd be great to make this a separate commit, so it is not buried in all the other trivial cases where devices has one item.

mwilck · 2018-04-26T09:55:56Z

It seems we need to tweak our CI, otherwise we can't even check this properly.

Oops - IPC::SysV is a "core" Perl module, I didn't expect that to cause problems. That's not the only module I needed to add; there's also POSIX, IPC:Semaphore, Time:HiRes, and Cwd. They are all core modules.

keszybz

The tests fail for me after 2d27a67 'add multiple device test' (sometimes), and then the next commit is d299247 'add repeat count' after which the tests fail reliably.

keszybz · 2018-04-26T09:44:18Z

test/udev-test.pl


 $sema->remove;
-print "$error errors occurred\n\n";
+print "$error errors occurred. $good good results.\n\n";


I think it would make sense to go one step further, and verify that good is above some threshold (and that threshold would have to be adjusted when tests are added or removed).

I added an additional patch for the "expected good" count.

keszybz · 2018-04-26T10:26:31Z

I added IPC::SysV to Fedora CI in https://src.fedoraproject.org/rpms/systemd/c/1eb8bc72df8f4dccff1e094b11eefce8adda896f?branch=rpm-master. The other ones might be pulled in by something else. Let's add them if it turns out that they are missing. If you push again, Fedora CI will run.

The failures on Ubuntu seem to be caused by something different.

autopkgtest [21:19:55]: test udev: [-----------------------
umount: test/tmpfs: no mount point specified.
Can't exec "/usr/bin/mknod": No such file or directory at ./udev-test.pl line 2254.
...
91 errors occurred. 6539 good results.

mwilck · 2018-04-26T11:10:38Z

The tests fail for me after 2d27a67 'add multiple device test' (sometimes), and then the next commit is d299247 'add repeat count' after which the tests fail reliably.

Right. That's because these tests show a deficiency of the current udev code. The failure is sporadic, therefore adding the repeat count increases the failure probability. Merging #8667 should fix this. I kept the two pull requests separate on purpose for now.

umount: test/tmpfs: no mount point specified.

That happens all the time. AFAIK it's not caused by my patches, but I'll double-check.

Can't exec "/usr/bin/mknod": No such file or directory at ./udev-test.pl line 2254

OK, it's /bin/mknod on Ubuntu. I'll fix that.

umount emits an error message "no mount point specified" if the tmpfs isn't mounted yet, which is the normal case. Suppress that by redirecting stderr.

On some distros, mknod isn't under /usr/bin. Rely on PATH instead. Fixes: e4c1f9a "test/udev-test.pl: create /dev/null"

mwilck · 2018-04-26T11:37:38Z

I just pushed fixes for the mknod and umount issues you mentioned.

Wrt the "multiple-devices" test, I need to understand the policy. If all tests are required to pass do to CI requirements, I can add "exp_err_" flags to these tests. We'd need to mark clearly that these errors are only "expected" because we know the current udev code is buggy.

Since 'test/udev-test.pl: count "good" results', we know how many checks succeeded. Add an "expected good" count to make that number more meaningful.

mwilck · 2018-04-30T20:05:48Z

@keszybz, anything else I can do to get this merged?

evverx · 2018-05-06T01:50:28Z

Oops - IPC::SysV is a "core" Perl module, I didn't expect that to cause problems.

Fedora, CentOS and RH are unusual in the sense that an additional package has to be installed there to get the core Perl modules:

, so in practice, any use of any module might cause the test to fail immediately. I think it would make sense to check if all the modules used in test-udev.pl are installed and skip the test if they aren't.

Two of the tests added in the "add multiple device test" commit do FAIL sporadically with the current code because of the mentioned race condition.

Would it be better to move the test cases related to #8667 there so that the addition of the expected errors could be avoided?

mwilck · 2018-05-16T10:34:29Z

I think it would make sense to check if all the modules used in test-udev.pl are installed and skip the test if they aren't.

That's dangerous for a regression test. The current behavior - fail the test, and require one-time developer action in the CI setup - seems to be safer to me. For the IPC::SysV module, that has happened already according to @keszybz' comment above, so AFAICS there's no further action needed, right?

Would it be better to move the test cases related to #8667 there so that the addition of the expected errors could be avoided?

I could do that. I can also disable the tests, or modify the test to temporarily ignore certain failures and spit out a warning. I was wondering whether systemd might have an established policy for cases like this.

What exactly is the recommendation?

evverx · 2018-05-16T12:54:22Z

That's dangerous for a regression test.

I think it depends on the perception of the test. If it's expected that it has to fail no matter what, then checks like b1ffacb should be removed as well.

For the IPC::SysV module, that has happened already according to @keszybz' comment above, so AFAICS there's no further action needed, right?

As far as I can see, perl(IPC::SysV) was added as a build dependency. It is a great shortcut, which made the test work on Fedora Rawhide CI x86_64. To make it more robust, though, perl should be installed as described in https://fedoraproject.org/wiki/Changes/perl_Package_to_Install_Core_Modules.

What exactly is the recommendation?

In my opinion this PR is just a part of #8667, so I think it would make sense to move all the commits there and continue the discussion there too. It would automatically resolve all the issues related to whether the test should be disabled due to the expected failures or not.

evverx · 2018-05-16T13:48:29Z

I was wondering whether systemd might have an established policy for cases like this.

The only case like this I was able to find is #5520, where the test has been waiting for the bug to be fixed. Apparently, it will be merged as soon as it doesn't fail.

mwilck · 2018-05-16T15:02:41Z

@evverx,

To make it more robust, though, perl should be installed as described in https://fedoraproject.org/wiki/Changes/perl_Package_to_Install_Core_Modules.

I don't think I can help with that.

Apparently, it will be merged as soon as it doesn't fail.

OK, I'll work on it ASAP, currently busy with other stuff.

mwilck · 2018-07-09T16:29:47Z

Superseded by #9551.

mwilck added 21 commits April 25, 2018 22:12

test/udev-test.pl: create /dev/null

e4c1f9a

/dev/null is required by udev for output redirection, otherwise it emits irritating warnings for several test cases. Create this device in the test environment. This requires dropping "nodev" from the mountpoint for /dev in the test environment.

test/udev-test.pl: allow multiple devices per test

7739131

Allow testing cases where multiple devices are added and removed. This implies a change of the data structure: every test allows for multiple devices to be added, and "exp_name" etc. are now properties of the device, not of the test.

test/udev-test.pl: create rules only once

d2d8c31

It's not necessary to write the rules for every udev run, as we now may have many (rather than just 2) per test.

test/udev-test.pl: remove exp_name for device nodes

5e31715

Expected device node names are now tested automatically. Use exp_name only for symlinks in the test definitions, to avoid confusion and duplicated tests.

test/udev-test.pl: test correctness of symlink targets

0eb8228

Test if symlinks are created correctly by comparing the symlink targets to the devnode path. This implies (for the symlink) that major/minor numbers and permissions are correct, as we have tested that on the devnode already.

test/udev-test.pl: allow checking multiple symlinks

2e12cd9

Instead of testing the existence or non-exisitence of just a single symlink, allow testing of several links per device.

test/udev-test.pl: adapt test definitions to multi-symlink

e00758c

Change the test definitions to work with the previous patch.

test/udev-test.pl: fix wrong test descriptions

485c299

udev hasn't supported renaming device nodes for some time.

test/udev-test.pl: last_rule is unsupported

271d80c

the "last_rule" option hasn't been supported for some time. Therefore this test fails if a "not_exp_links" attribute is added, as it should be. Mark it appropriately.

test/udev-test.pl: Make some tests a little harder

3b163cb

Add some rules that make it a bit harder to pass, mainly the non-existence checks.

test/udev-test.pl: remove bogus rules from magic subsys test

943e7ab

These rules have survived from an ancient version of the code and save no purpose any more.

test/udev-test.pl: merge "space and var with space" tests

20b2a3d

As we can check multiple links in a single test now, these 3 tests can be merged into one.

test/udev-test.pl: merge import parent tests into one

b90eb19

As we can test multiple devices and multiple links per device in one test now, these two tests can be merged into one.

test/udev-test.pl: count "good" results

890a322

This is helpful to catch possible regressions in the test. Also, don't count wait() errors, they are likely not udev errors.

test/sys-script.py: add missing DEVNAME entries to uevents

d299247

test/udev-test.pl: add repeat count

4896c62

for easier reproduction of sporadic test failures.

test/udev-test.pl: generator for large list of block devices

501f0c0

Manually listing all devices in the test definition becomes cumbersome with lots of devices. Add a function that scans on all block devices in the test sysfs and generates a list of devices to test.

mwilck mentioned this pull request Apr 25, 2018

udev: lock symlink directories in link_update() #8667

Closed

yuwata added the tests label Apr 26, 2018

keszybz reviewed Apr 26, 2018

View reviewed changes

mwilck added 2 commits April 26, 2018 13:25

test/udev-test.pl: suppress umount error message at startup

a113721

umount emits an error message "no mount point specified" if the tmpfs isn't mounted yet, which is the normal case. Suppress that by redirecting stderr.

test/udev-test.pl: don't hardcode path of mknod

1aed5cf

On some distros, mknod isn't under /usr/bin. Rely on PATH instead. Fixes: e4c1f9a "test/udev-test.pl: create /dev/null"

test/udev_test.pl: add "expected good" count

3b09ee7

Since 'test/udev-test.pl: count "good" results', we know how many checks succeeded. Add an "expected good" count to make that number more meaningful.

mwilck mentioned this pull request Jul 9, 2018

udev: fix race condition in symlink generation code #9551

Closed

mwilck closed this Jul 9, 2018

PizzaLovingNerd pushed a commit to risiIndustries/systemd-pkg that referenced this pull request Mar 14, 2022

Add deps for systemd/systemd#8819

1eb8bc7

Uh oh!

Conversation

mwilck commented Apr 25, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

keszybz left a comment

Choose a reason for hiding this comment

Uh oh!

keszybz Apr 26, 2018

Choose a reason for hiding this comment

Uh oh!

mwilck commented Apr 26, 2018

Uh oh!

keszybz left a comment

Choose a reason for hiding this comment

Uh oh!

keszybz Apr 26, 2018

Choose a reason for hiding this comment

Uh oh!

mwilck Apr 26, 2018

Choose a reason for hiding this comment

Uh oh!

keszybz commented Apr 26, 2018

Uh oh!

mwilck commented Apr 26, 2018

Uh oh!

mwilck commented Apr 26, 2018

Uh oh!

mwilck commented Apr 30, 2018

Uh oh!

evverx commented May 6, 2018

Uh oh!

mwilck commented May 16, 2018

Uh oh!

evverx commented May 16, 2018

Uh oh!

evverx commented May 16, 2018

Uh oh!

mwilck commented May 16, 2018

Uh oh!

mwilck commented Jul 9, 2018

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

4 participants

mwilck commented Apr 25, 2018 •

edited

Loading