linux/drivers
Daniel Kobras c06aad854f [PATCH] dm: Fix deadlock under high i/o load in raid1 setup.
On an nForce4-equipped machine with two SATA disks in a raid1 setup using
dmraid, we experienced frequent deadlocks of the system under high i/o load.
'cat /dev/zero > ~/zero' was the most reliable way to reproduce them: randomly
after a few GB, 'cp' would be left in 'D' state along with kjournald and
kmirrord.  The functions in which cp and kjournald were blocked varied, but
kmirrord's wchan always pointed to 'mempool_alloc()'.  We've seen this pattern
on 2.6.15 and 2.6.17 kernels.  http://lkml.org/lkml/2005/4/20/142 indicates
that this problem was already around before then.

So much for the facts, here's my interpretation: mempool_alloc() first tries
to atomically allocate the requested memory, or falls back to handing out
preallocated chunks from the mempool.  If both fail, it puts the calling
process (kmirrord in this case) on a private waitqueue until somebody refills
the pool.  But the only 'somebody' is kmirrord itself, so we have a
deadlock.

I worked around this problem by falling back to a (blocking) kmalloc in the
case where kmirrord would previously have ended up on the waitqueue.  This
defeats part of the benefit of using the mempool, but at least keeps the
system running.  And it could be done with a two-line change.  Note that
mempool_alloc() clears the GFP_NOIO flag internally, and only uses it to
decide whether to wait or return an error if immediate allocation fails, so
the attached patch doesn't change behaviour in the non-deadlocking case.
Patch is against current git (2.6.18-rc4), but should apply to earlier
versions as well.  I've tested on 2.6.15, where this patch makes the
difference between random lockup and a stable system.
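
For illustration only, the fallback described above can be sketched in
user-space C.  This is not the actual patch and not kernel code: struct
toy_pool, toy_under_pressure and every toy_* function, as well as
alloc_region(), are hypothetical names made up for this sketch, and malloc()
merely stands in for both the atomic attempt and the blocking kmalloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_RESERVE 4

/* Toy stand-in for a mempool: a small reserve of preallocated elements. */
struct toy_pool {
	void *reserve[TOY_RESERVE];
	size_t nr_free;		/* reserve elements still available */
	size_t elem_size;
};

/* Hypothetical switch that makes the "atomic" attempt fail. */
static bool toy_under_pressure;

static int toy_pool_init(struct toy_pool *p, size_t elem_size)
{
	p->elem_size = elem_size;
	for (p->nr_free = 0; p->nr_free < TOY_RESERVE; p->nr_free++) {
		p->reserve[p->nr_free] = malloc(elem_size);
		if (!p->reserve[p->nr_free])
			return -1;
	}
	return 0;
}

/* Stands in for an allocation attempt that must not sleep. */
static void *toy_try_alloc(size_t size)
{
	return toy_under_pressure ? NULL : malloc(size);
}

/* Stands in for a blocking allocation that may wait for memory. */
static void *toy_blocking_alloc(size_t size)
{
	return malloc(size);
}

/*
 * Non-waiting pool allocation: try a direct allocation first, then the
 * reserve.  If both fail, return NULL.  A waiting variant would sleep at
 * this point until an element is returned to the pool, which is where
 * the deadlock described above arises.
 */
static void *toy_pool_alloc_nowait(struct toy_pool *p)
{
	void *elem;

	assert(p != NULL);
	elem = toy_try_alloc(p->elem_size);
	if (elem)
		return elem;
	if (p->nr_free > 0)
		return p->reserve[--p->nr_free];
	return NULL;
}

/* The workaround: fall back to a blocking allocation instead of waiting. */
static void *alloc_region(struct toy_pool *p)
{
	void *elem = toy_pool_alloc_nowait(p);

	if (!elem)
		elem = toy_blocking_alloc(p->elem_size);
	return elem;
}
```

With toy_under_pressure set and the reserve drained, toy_pool_alloc_nowait()
returns NULL while alloc_region() still hands out memory, which mirrors the
trade-off noted above: the pool's guarantee is partly given up in exchange
for never sleeping on the pool's waitqueue.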

Signed-off-by: Daniel Kobras <kobras@linux.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-08-27 11:01:28 -07:00
..
acorn
acpi Merge trivial low-risk suspend hotkey bugzilla-5918 into release 2006-08-20 21:49:29 -04:00
amba
atm
base [PATCH] cpu hotplug: use hotplug version of registration in late inits 2006-07-31 13:28:39 -07:00
block [PATCH] nbd: Abort request on data reception failure 2006-07-31 13:28:39 -07:00
bluetooth [Bluetooth] Enable SCO support for Broadcom HID proxy dongle 2006-07-24 12:44:34 -07:00
cdrom
char [WATCHDOG] Kconfig typos fix. 2006-08-15 11:17:22 +02:00
clocksource
connector [PATCH] Process Events: Fix biarch compatibility issue. use __u64 timestamp 2006-07-31 13:28:36 -07:00
cpufreq [CPUFREQ] [2/2] demand load governor modules. 2006-07-31 18:37:06 -04:00
crypto [CRYPTO] padlock: Fix alignment after aes_ctx rearrange 2006-07-15 11:08:50 +10:00
dio
dma [I/OAT]: Remove pci_module_init() from Intel I/OAT DMA engine 2006-07-21 14:50:13 -07:00
edac [PATCH] drivers/edac/edac_mc.h must #include <linux/platform_device.h> 2006-08-06 08:57:46 -07:00
eisa
fc4 [SCSI] More buffer->request_buffer changes 2006-07-14 09:41:13 -05:00
firmware
hwmon [PATCH] hwmon: abituguru timeout fixes 2006-08-26 13:05:19 -07:00
i2c [PATCH] i2c: tps65010 build fixes 2006-08-26 13:05:12 -07:00
ide [PATCH] PATCH: 2.6.18 oops on boot fix for IDE 2006-08-09 15:43:27 -07:00
ieee1394 [PATCH] ieee1394: sbp2: enable auto spin-up for Maxtor disks 2006-08-06 08:57:48 -07:00
infiniband Merge gregkh@master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 2006-08-26 13:04:23 -07:00
input Input: psmouse - fix Intellimouse 4.0 initialization 2006-08-23 00:48:03 -04:00
isdn [PATCH] eicon: fix define conflict with ptrace 2006-08-06 08:57:48 -07:00
leds [PATCH] net48xx LED cleanups 2006-07-14 21:53:54 -07:00
macintosh Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc 2006-07-31 13:39:52 -07:00
mca
md [PATCH] dm: Fix deadlock under high i/o load in raid1 setup. 2006-08-27 11:01:28 -07:00
media V4L/DVB (4431): Add several error checks to dst 2006-08-08 15:53:04 -03:00
message [SCSI] mptfc: correct out of order event processing 2006-08-06 15:48:31 -05:00
mfd
misc
mmc [MMC] Another stray 'io' reference 2006-08-07 14:47:54 +01:00
mtd
net Merge branch 'upstream-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into upstream-fixes 2006-08-24 00:41:25 -04:00
nubus
oprofile
parisc
parport
pci [PATCH] PCI: kerneldoc correction in pci-driver 2006-08-26 13:05:59 -07:00
pcmcia [PATCH] pcmcia: fix ioctl GET_CONFIGURATION_INFO for pcmcia_cards 2006-07-31 13:28:41 -07:00
pnp [PATCH] pnpacpi: reject ACPI_PRODUCER resources 2006-08-06 08:57:49 -07:00
rapidio
rtc [PATCH] drivers/rtc: fix rtc-s3c.c 2006-08-27 11:01:28 -07:00
s390 Merge gregkh@master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 2006-08-26 13:04:23 -07:00
sbus [SPARC] sbus: Make sure sbus nodes are named uniquely. 2006-07-21 14:18:06 -07:00
scsi Merge gregkh@master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6 2006-08-26 13:04:23 -07:00
serial [SERIAL] sunzilog: Mirror the sunsab serial setup bug fix. 2006-08-23 15:53:39 -07:00
sh
sn
spi
tc
telephony
usb [PATCH] unusual_devs update for UCR-61S2B 2006-08-26 13:06:24 -07:00
video [PATCH] add imacfb documentation and detection 2006-08-14 12:54:28 -07:00
w1
zorro
Kconfig
Makefile