1
linux/drivers/md
Kiyoshi Ueda a77e28c7e1 dm multipath: fix oops when request based io fails when no paths
The patch posted at http://marc.info/?l=dm-devel&m=124539787228784&w=2
which was merged into cec47e3d4a ("dm:
prepare for request based option") introduced a regression in
request-based dm.

If map_request() calls dm_kill_unmapped_request() to complete a cloned
bio without dispatching it, clone->bio is still set when
dm_end_request() is called and the BUG_ON(clone->bio) is incorrect.

The patch fixes this bug by freeing bio in dm_end_request() if the clone
has bio.  I've redone my tests to cover all I/O paths and confirmed
there's no other regression.

Here is the oops I hit in request-based dm when I do I/O to a multipath
device which doesn't have any active path nor queue_if_no_path setting:

------------[ cut here ]------------
kernel BUG at /root/2.6.31-rc4.rqdm/drivers/md/dm.c:828!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 1
Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq dm_mirror dm_region_hash dm_log dm_service_time dm_multipath scsi_dh dm_mod video output sbs sbshc battery ac sg sr_mod e1000e button cdrom serio_raw rtc_cmos rtc_core rtc_lib piix lpfc scsi_transport_fc ata_piix libata megaraid_sas sd_mod scsi_mod crc_t10dif ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode]
Pid: 7, comm: ksoftirqd/1 Not tainted 2.6.31-rc4.rqdm #1 Express5800/120Lj [N8100-1417]
RIP: 0010:[<ffffffffa023629d>]  [<ffffffffa023629d>] dm_softirq_done+0xbd/0x100 [dm_mod]
RSP: 0018:ffff8800280a1f08  EFLAGS: 00010282
RAX: ffffffffa02544e0 RBX: ffff8802aa1111d0 RCX: ffff8802aa1111e0
RDX: ffff8802ab913e70 RSI: 0000000000000000 RDI: ffff8802ab913e70
RBP: ffff8800280a1f28 R08: ffffc90005457040 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 00000000fffffffb
R13: ffff8802ab913e88 R14: ffff8802ab9c1438 R15: 0000000000000100
FS:  0000000000000000(0000) GS:ffff88002809e000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000003d54a98640 CR3: 000000029f0a1000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ksoftirqd/1 (pid: 7, threadinfo ffff8802ae50e000, task ffff8802ae4f8040)
Stack:
 ffff8800280a1f38 0000000000000020 ffffffff814f30a0 0000000000000004
<0> ffff8800280a1f58 ffffffff8116b245 ffff8800280a1f38 ffff8800280a1f38
<0> ffff8800280a1f58 0000000000000001 ffff8800280a1fa8 ffffffff810477bc
Call Trace:
 <IRQ>
 [<ffffffff8116b245>] blk_done_softirq+0x75/0x90
 [<ffffffff810477bc>] __do_softirq+0xcc/0x210
 [<ffffffff81047170>] ? ksoftirqd+0x0/0x110
 [<ffffffff8100ce7c>] call_softirq+0x1c/0x50
 <EOI>
 [<ffffffff8100e785>] do_softirq+0x65/0xa0
 [<ffffffff81047170>] ? ksoftirqd+0x0/0x110
 [<ffffffff810471e0>] ksoftirqd+0x70/0x110
 [<ffffffff81059559>] kthread+0x99/0xb0
 [<ffffffff8100cd7a>] child_rip+0xa/0x20
 [<ffffffff8100c73c>] ? restore_args+0x0/0x30
 [<ffffffff810594c0>] ? kthread+0x0/0xb0
 [<ffffffff8100cd70>] ? child_rip+0x0/0x20
Code: 44 89 e6 48 89 df e8 23 fb f2 e0 be 01 00 00 00 4c 89 f7 e8 f6 fd ff ff 5b 41 5c 41 5d 41 5e c9 c3 4c 89 ef e8 85 fe ff ff eb ed <0f> 0b eb fe 41 8b 85 dc 00 00 00 48 83 bb 10 01 00 00 00 89 83
RIP  [<ffffffffa023629d>] dm_softirq_done+0xbd/0x100 [dm_mod]
 RSP <ffff8800280a1f08>
---[ end trace 16af0a1d8542da55 ]---

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2009-09-04 20:40:16 +01:00
..
raid6test md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
.gitignore
bitmap.c Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
bitmap.h md: move headers out of include/linux/raid/ 2009-03-31 14:27:03 +11:00
dm-bio-record.h dm: preserve bi_io_vec when resubmitting bios 2009-04-02 19:55:23 +01:00
dm-crypt.c dm table: pass correct dev area size to device_area_is_valid 2009-07-23 20:30:42 +01:00
dm-delay.c dm table: pass correct dev area size to device_area_is_valid 2009-07-23 20:30:42 +01:00
dm-exception-store.c dm exception store: really fix type lookup 2009-06-30 15:18:14 +01:00
dm-exception-store.h dm: use i_size_read 2009-06-22 10:12:14 +01:00
dm-io.c dm io: retry after barrier error 2009-06-22 10:12:26 +01:00
dm-ioctl.c dm: enable request based option 2009-06-22 10:12:36 +01:00
dm-kcopyd.c dm kcopyd: fix callback race 2009-04-09 00:27:17 +01:00
dm-linear.c dm table: pass correct dev area size to device_area_is_valid 2009-07-23 20:30:42 +01:00
dm-log-userspace-base.c dm raid1: add userspace log 2009-06-22 10:12:35 +01:00
dm-log-userspace-transfer.c dm-log-userspace: fix printk format warning 2009-08-16 08:35:58 -07:00
dm-log-userspace-transfer.h dm raid1: add userspace log 2009-06-22 10:12:35 +01:00
dm-log.c dm log: fix create_log_context to use logical_block_size of log device 2009-06-22 10:12:33 +01:00
dm-mpath.c dm table: pass correct dev area size to device_area_is_valid 2009-07-23 20:30:42 +01:00
dm-mpath.h
dm-path-selector.c dm: path selector use module refcount directly 2009-04-02 19:55:27 +01:00
dm-path-selector.h dm mpath: add start_io and nr_bytes to path selectors 2009-06-22 10:12:27 +01:00
dm-queue-length.c dm mpath: add queue length load balancer 2009-06-22 10:12:27 +01:00
dm-raid1.c dm table: pass correct dev area size to device_area_is_valid 2009-07-23 20:30:42 +01:00
dm-region-hash.c dm raid1: keep retrying alloc if mempool_alloc failed 2009-06-22 10:12:13 +01:00
dm-round-robin.c dm mpath: add start_io and nr_bytes to path selectors 2009-06-22 10:12:27 +01:00
dm-service-time.c dm mpath: add service time load balancer 2009-06-22 10:12:28 +01:00
dm-snap-persistent.c dm snapshot: use barrier when writing exception store 2009-06-22 10:12:26 +01:00
dm-snap-transient.c dm snapshot: move status to exception store 2009-04-02 19:55:35 +01:00
dm-snap.c dm snapshot: support barriers 2009-06-22 10:12:25 +01:00
dm-stripe.c dm table: pass correct dev area size to device_area_is_valid 2009-07-23 20:30:42 +01:00
dm-sysfs.c dm: sysfs add suspended attribute 2009-06-22 10:12:29 +01:00
dm-table.c dm table: pass correct dev area size to device_area_is_valid 2009-07-23 20:30:42 +01:00
dm-target.c dm target: remove struct tt_internal 2009-04-02 19:55:28 +01:00
dm-uevent.c
dm-uevent.h
dm-zero.c dm: consolidate target deregistration error handling 2009-01-06 03:04:58 +00:00
dm.c dm multipath: fix oops when request based io fails when no paths 2009-09-04 20:40:16 +01:00
dm.h dm: remove queue next_ordered workaround for barriers 2009-07-23 20:30:40 +01:00
faulty.c md: Move check for bitmap presence to personality code. 2009-06-18 08:49:23 +10:00
Kconfig dm raid1: add userspace log 2009-06-22 10:12:35 +01:00
linear.c md: Use revalidate_disk to effect changes in size of device. 2009-08-03 10:59:58 +10:00
linear.h md/linear: use call_rcu to free obsolete 'conf' structures. 2009-06-18 08:49:42 +10:00
Makefile dm raid1: add userspace log 2009-06-22 10:12:35 +01:00
md.c Fix new incorrect error return from do_md_stop. 2009-08-18 10:35:26 +10:00
md.h Remove deadlock potential in md_open 2009-08-10 12:50:52 +10:00
mktables.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
multipath.c md: Push down data integrity code to personalities. 2009-08-03 10:59:47 +10:00
multipath.h md: remove mddev_to_conf "helper" macro 2009-06-16 16:54:21 +10:00
raid0.c md: Push down data integrity code to personalities. 2009-08-03 10:59:47 +10:00
raid0.h md: remove mddev_to_conf "helper" macro 2009-06-16 16:54:21 +10:00
raid1.c md: Use revalidate_disk to effect changes in size of device. 2009-08-03 10:59:58 +10:00
raid1.h md: remove mddev_to_conf "helper" macro 2009-06-16 16:54:21 +10:00
raid5.c md/raid5: Properly remove excess drives after shrinking a raid5/6 2009-08-13 10:41:49 +10:00
raid5.h md: convert conf->chunk_size and conf->prev_chunk to sectors. 2009-06-18 08:45:55 +10:00
raid6algos.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6altivec.uc md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6int.uc md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6mmx.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6recov.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6sse1.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6sse2.c md/raid6: move raid6 data processing to raid6_pq.ko 2009-03-31 15:09:39 +11:00
raid6x86.h md: fix typo in FSF address 2009-03-31 14:57:37 +11:00
raid10.c md: Push down data integrity code to personalities. 2009-08-03 10:59:47 +10:00
raid10.h md: remove mddev_to_conf "helper" macro 2009-06-16 16:54:21 +10:00
unroll.pl