Commit 3a054014e8 replaced our modprobe with busybox's modprobe.
Unfortunately, busybox's modprobe appears to be unable to properly load
modules with more than one level of dependencies.
The zfs and zpool commands will invoke modprobe if /dev/zvol is missing,
which concealed this problem. However, some of those invocations would
fail and, under certain circumstances, init would be killed, causing a
kernel panic. This issue was made clear by commit
c812c35100771bb527f6b03853fa6d8ef66a48fe, which ensured that the zpool
and zfs commands were not run until the ZFS module was loaded.

busybox modprobe's failure to load module dependencies correctly appears
to occur because it does not wait until a module has finished loading
before loading a module that depends on it, which is a race. It would be
best to correct this race by waiting until the module has properly
loaded, but it is not clear that the race is the only thing going wrong
and developer time is at a premium.

We implement a workaround by modifying the busy loop added in the
previous commit to explicitly call `modprobe zfs` on each iteration.
While the first few calls fail due to bugs in busybox modprobe, a later
call eventually succeeds, after which each call is a no-op. This lets us
keep looping until either the loop exit condition (that /dev/zvol
exists) is met or the 5 second timeout is reached.
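
For illustration only, the retry pattern described above could be
sketched roughly as follows. This is not the code from this commit: the
function name `retry_until_exists`, its argument order, and the choice
of five one-second tries are assumptions made for the example.

```shell
# Sketch only: the real loop lives in the initramfs scripts and may differ.
# Usage: retry_until_exists PATH TRIES DELAY CMD [ARGS...]
# Re-runs CMD on every iteration until PATH exists or TRIES run out.
retry_until_exists() {
    path=$1 tries=$2 delay=$3
    shift 3
    while [ ! -e "$path" ] && [ "$tries" -gt 0 ]; do
        # Early calls may fail; ignore the failure and try again.
        "$@" >/dev/null 2>&1 || true
        [ -e "$path" ] && break
        sleep "$delay"
        tries=$((tries - 1))
    done
    # Succeed only if PATH finally exists.
    [ -e "$path" ]
}

# Assumed invocation for this case (five tries, one second apart):
# retry_until_exists /dev/zvol 5 1 modprobe zfs
```

Because the command is re-run unconditionally, a call made after the
module is already loaded simply does nothing, and the loop exits as soon
as the path appears.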

Once the busybox modprobe issue is fixed, this workaround should be safe
to revert.

Signed-off-by: Richard Yao <ryao@gentoo.org>

udev may still be processing rules at this point, and continuing before
it finishes can cause serious failures. For instance, modules_load may
have loaded a USB host controller driver, and we must wait for the udev
rules to finish running. The wait may lead to other race conditions, but
we have observed that adding scandelay=n where n >= 5 actually fixes
booting off USB under certain scenarios.