1
Commit Graph

161371 Commits

Author SHA1 Message Date
Ryusuke Konishi
6d28f7ea43 nilfs2: remove unused btree argument from btree functions
Even though many btree functions take a btree object as their first
argument, most of them are not used in their functions.

This sticky use of the btree argument is hurting code readability and
giving the possibility of inefficient code generation.

So, this removes the unnecessary btree arguments.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:15 +09:00
Ryusuke Konishi
9ead986373 nilfs2: remove nilfs_dat_abort_start and nilfs_dat_abort_free
These functions are not called from any functions.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:15 +09:00
Jiro SEKIBA
1cf58fa840 nilfs2: shorten freeze period due to GC in write operation v3
This is a re-revised patch to shorten freeze period.
This version include a fix of the bug Konishi-san mentioned last time.

When GC is runnning, GC moves live block to difference segments.
Copying live blocks into memory is done in a transaction,
however it is not necessarily to be in the transaction.
This patch will get the nilfs_ioctl_move_blocks() out from
transaction lock and put it before the transaction.

I ran sysbench fileio test against nilfs partition.
I copied some DVD/CD images and created snapshot to create live blocks
before starting the benchmark.

Followings are summary of rc8 and rc8 w/ the patch of per-request
statistics, which is min/max and avg.  I ran each test three times and
bellow is average of those numers.

According to this benchmark result, average time is slightly degrated.
However, worstcase (max) result is significantly improved.
This can address a few seconds write freeze.

- random write per-request performance of rc8
 min   0.843ms
 max 680.406ms
 avg   3.050ms
- random write per-request performance of rc8 w/ this patch
 min   0.843ms -> 100.00%
 max 380.490ms ->  55.90%
 avg   3.233ms -> 106.00%

- sequential write per-request performance of rc8
 min   0.736ms
 max 774.343ms
 avg   2.883ms
- sequential write per-request performance of rc8 w/ this patch
 min   0.720ms ->  97.80%
 max  644.280ms->  83.20%
 avg   3.130ms -> 108.50%

-----8<-----8<-----nilfs_cleanerd.conf-----8<-----8<-----
protection_period       150
selection_policy        timestamp       # timestamp in ascend order
nsegments_per_clean     2
cleaning_interval       2
retry_interval          60
use_mmap
log_priority            info
-----8<-----8<-----nilfs_cleanerd.conf-----8<-----8<-----

Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:15 +09:00
Zhu Yanhai
43be0ec038 nilfs2: add more check routines in mount process
nilfs2: Add more safeguard routines and protections in mount process,
which also makes nilfs2 report consistency error messages when
checkpoint number is invalid.

Signed-off-by: Zhu Yanhai <zhu.yanhai@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:14 +09:00
Zhang Qiang
a4f0b9c5b4 nilfs2: An unassigned variable is assigned to a never used structure member
nilfs2: In procedure 'nilfs_get_sb()', when a nilfs filesysttem is
mounted for the first time, local variable 'nilfs->ns_last_cno' is
used before loading the latest checkpoint number from disk (in
'nilfs_fill_super'). 'nilfs->ns_last_cno' is assigned to 'sd.cno', but
'sd.cno' has never been used in the procedure.

Signed-off-by: Zhang Qiang <zhangqiang.buaa@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:14 +09:00
Ryusuke Konishi
c1b353f04a nilfs2: use GFP_NOIO for bio_alloc instead of GFP_NOWAIT
Alberto Bertogli advised me about bio_alloc() use in nilfs:
On Sat, 13 Jun 2009 22:52:40 -0300, Alberto Bertogli wrote:
> By the way, those bio_alloc()s are using GFP_NOWAIT but it looks
> like they could use at least GFP_NOIO or GFP_NOFS, since the caller
> can (and sometimes do) sleep. The only caller is nilfs_submit_bh(),
> which calls nilfs_submit_seg_bio() which can sleep calling
> wait_for_completion().

This takes in the comment and replaces the use of GFP_NOWAIT flag with
GFP_NOIO.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:14 +09:00
Jiro SEKIBA
1dfa27105a nilfs2: stop using periodic write_super callback
This removes nilfs_write_super and commit super block in nilfs
internal thread, instead of periodic write_super callback.

VFS layer calls ->write_super callback periodically.  However,
it looks like that calling back is ommited when disk I/O is busy.
And when cleanerd (nilfs GC) is runnig, disk I/O tend to be busy thus
nilfs superblock is not synchronized as nilfs designed.

To avoid it, syncing superblock by nilfs thread instead of pdflush.

Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:14 +09:00
Jiro SEKIBA
79efdd9411 nilfs2: clean up nilfs_write_super
Separate conditions that check if syncing super block and alternative
super block are required as inline functions to reuse the conditions.

Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:14 +09:00
Jiro SEKIBA
6233caa9d5 nilfs2: fix disorder of nilfs_write_super in nilfs_sync_fs
This fixes disorder of nilfs_write_super in nilfs_sync_fs.  Commiting
super block must be the end of the function so that every changes are
reflected.

->sync_fs() is not called frequently so this makes nilfs_sync_fs call
nilfs_commit_super instead of nilfs_write_super.

Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:14 +09:00
Jiro SEKIBA
ec5d66abdb nilfs2: remove redundant super block commit
This removes redundant super block commit.

nilfs_write_super will call nilfs_commit_super to store super block
into block device.  However, nilfs_put_super will call
nilfs_commit_super right after calling nilfs_write_super.  So calling
nilfs_write_super in nilfs_put_super would be redundant.

Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:13 +09:00
Jiro SEKIBA
b58a285ba4 nilfs2: implement nilfs_show_options to display mount options in /proc/mounts
This is a patch to display mount options in procfs.
Mount options will show up in the /proc/mounts as other fs does.

...
/dev/sda6 /mnt nilfs2 ro,relatime,barrier=off,cp=3,order=strict 0 0
...

Signed-off-by: Jiro SEKIBA <jir@unicus.jp>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:13 +09:00
Ryusuke Konishi
1435110467 nilfs2: always lookup disk block address before reading metadata block
The current metadata file code skips disk address lookup for its data
block if the buffer has a mapped flag.

This has a potential risk to cause read request to be performed
against the stale block address that GC moved, and it may lead to meta
data corruption.  The mapped flag is safe if the buffer has an
uptodate flag, otherwise it may prevent necessary update of disk
address in the next read.

This will avoid the potential problem by ensuring disk address lookup
before reading metadata block even for buffers with the mapped flag.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:13 +09:00
Ryusuke Konishi
027d6404eb nilfs2: use semaphore to protect pointer to a writable FS-instance
will get rid of nilfs_get_writer() and nilfs_put_writer() pair used to
retain a writable FS-instance for a period.

The pair functions were making up some kind of recursive lock with a
mutex, but they became overkill since the commit
201913ed74.  Furthermore, they caused
the following lockdep warning because the mutex can be released by a
task which didn't lock it:

 =====================================
 [ BUG: bad unlock balance detected! ]
 -------------------------------------
 kswapd0/422 is trying to release lock (&nilfs->ns_writer_mutex) at:
 [<c1359ff5>] mutex_unlock+0x8/0xa
 but there are no more locks to release!

 other info that might help us debug this:
 no locks held by kswapd0/422.

 stack backtrace:
 Pid: 422, comm: kswapd0 Not tainted 2.6.31-rc4-nilfs #51
 Call Trace:
  [<c1358f97>] ? printk+0xf/0x18
  [<c104fea7>] print_unlock_inbalance_bug+0xcc/0xd7
  [<c11578de>] ? prop_put_global+0x3/0x35
  [<c1050195>] lock_release+0xed/0x1dc
  [<c1359ff5>] ? mutex_unlock+0x8/0xa
  [<c1359f83>] __mutex_unlock_slowpath+0xaf/0x119
  [<c1359ff5>] mutex_unlock+0x8/0xa
  [<d1284add>] nilfs_mdt_write_page+0xd8/0xe1 [nilfs2]
  [<c1092653>] shrink_page_list+0x379/0x68d
  [<c109171b>] ? isolate_pages_global+0xb4/0x18c
  [<c1092bd2>] shrink_list+0x26b/0x54b
  [<c10930be>] shrink_zone+0x20c/0x2a2
  [<c10936b7>] kswapd+0x407/0x591
  [<c1091667>] ? isolate_pages_global+0x0/0x18c
  [<c1040603>] ? autoremove_wake_function+0x0/0x33
  [<c10932b0>] ? kswapd+0x0/0x591
  [<c104033b>] kthread+0x69/0x6e
  [<c10402d2>] ? kthread+0x0/0x6e
  [<c1003e33>] kernel_thread_helper+0x7/0x1a

This patch uses a reader/writer semaphore instead of the own lock and
kills this warning.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:13 +09:00
Heiko Carstens
b5696e5e0d nilfs2: fix format string compile warning (ino_t)
Unlike on most other architectures ino_t is an unsigned int on s390.
So add an explicit cast to avoid this compile warning:

fs/nilfs2/recovery.c: In function 'recover_dsync_blocks':
fs/nilfs2/recovery.c:555: warning: format '%lu' expects type 'long unsigned int', but argument 3 has type 'ino_t'

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:13 +09:00
Ryusuke Konishi
1b2f5a641b nilfs2: fix ignored error code in __nilfs_read_inode()
The __nilfs_read_inode function is ignoring the error code returned
from nilfs_read_inode_common(), and wrongly delivers a success code
(zero) when it escapes from the function in erroneous cases.

This adds the missing error handling.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
2009-09-14 18:27:12 +09:00
Steven Whitehouse
86d0063656 GFS2: Whitespace fixes
Reported-by: Daniel Walker <dwalker@fifo99.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2009-09-14 09:50:57 +01:00
Christoph Hellwig
746cd1e7e4 block: use blkdev_issue_discard in blk_ioctl_discard
blk_ioctl_discard duplicates large amounts of code from blkdev_issue_discard,
the only difference between the two is that blkdev_issue_discard needs to
send a barrier discard request and blk_ioctl_discard a non-barrier one,
and blk_ioctl_discard needs to wait on the request.  To facilitates this
add a flags argument to blkdev_issue_discard to control both aspects of the
behaviour.  This will be very useful later on for using the waiting
funcitonality for other callers.

Based on an earlier patch from Matthew Wilcox <matthew@wil.cx>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14 08:24:53 +02:00
David Woodhouse
3d2257f157 Make DISCARD_BARRIER and DISCARD_NOBARRIER writes instead of reads
The commands are conceptually writes, and in the case of IDE and SCSI
commands actually are writes.  They were only reads because we thought
that would interact better with the elevators.  Now the elevators know
about discard requests, that advantage no longer exists.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14 08:24:53 +02:00
Jens Axboe
b8a9ae779f block: don't assume device has a request list backing in nr_requests store
Stacked devices do not. For now, just error out with -EINVAL. Later
we could make the limit apply on stacked devices too, for throttling
reasons.

This fixes

5a54cd13353bb3b88887604e2c980aa01e314309

and should go into 2.6.31 stable as well.

Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14 08:24:53 +02:00
Martin K. Petersen
3c5820c743 block: Optimal I/O limit wrapper
Implement blk_limits_io_opt() and make blk_queue_io_opt() a wrapper
around it. DM needs this to avoid poking at the queue_limits directly.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14 08:24:52 +02:00
Jeff Moyer
06d2188644 cfq: choose a new next_req when a request is dispatched
This patch addresses http://bugzilla.kernel.org/show_bug.cgi?id=13401, a
regression introduced in 2.6.30.

From the bug report:

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14 08:24:52 +02:00
Nikanth Karthikesan
a9327cac44 Seperate read and write statistics of in_flight requests
Currently, there is a single in_flight counter measuring the number of
requests in the request_queue. But some monitoring tools would like to
know how many read requests and write requests are in progress. Split the
current in_flight counter into two seperate counters for read and write.

This information is exported as a sysfs attribute, as changing the
currently available stat files would break the existing tools.

Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14 08:24:52 +02:00
Ed Cashin
18d8217bc4 aoe: end barrier bios with EOPNOTSUPP
BugLink: http://bugzilla.kernel.org/show_bug.cgi?id=13942

Bruno Premont noticed that aoe throws a BUG during umount of an XFS in
2.6.31:

[ 5259.349897] aoe: bi_io_vec is NULL
[ 5259.349940] ------------[ cut here ]------------
[ 5259.349958] kernel BUG at /usr/src/linux-2.6/drivers/block/aoe/aoeblk.c:177!
[ 5259.349990] invalid opcode: 0000 [#1]

The bio in question is a barrier.  Jens Axboe suggested that such bios
need to be recognized and ended with -EOPNOTSUPP by any driver that
provides its own ->make_request_fn handler and does not handle
barriers.

In testing the changes below eliminate the BUG.

(Better would be real barrier support, something that Ed says he'll add
for later in the .32 cycle. For now, this at least gets rid of a bug
with crashing on an empty barrier. Jens)

Signed-off-by: Ed L. Cashin <ecashin@coraid.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-14 08:24:52 +02:00
Eric Dumazet
8a3d271deb slub: fix slab_pad_check()
When SLAB_POISON is used and slab_pad_check() finds an overwrite of the
slab padding, we call restore_bytes() on the whole slab, not only
on the padding.

Acked-by: Christoph Lameer <cl@linux-foundation.org>
Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-09-14 07:57:55 +03:00
Dmitry Torokhov
fc8e1ead93 Merge branch 'next' into for-linus 2009-09-13 21:16:56 -07:00
Eric Paris
4e6d0bffd3 SELinux: flush the avc before disabling SELinux
Before SELinux is disabled at boot it can create AVC entries.  This patch
will flush those entries before disabling SELinux.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14 12:34:11 +10:00
Eric Paris
008574b111 SELinux: seperate avc_cache flushing
Move the avc_cache flushing into it's own function so it can be reused when
disabling SELinux.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14 12:34:09 +10:00
Eric Paris
ed868a5698 Creds: creds->security can be NULL is selinux is disabled
__validate_process_creds should check if selinux is actually enabled before
running tests on the selinux portion of the credentials struct.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2009-09-14 12:34:07 +10:00
Jiri Kosina
8123e8f7c8 Merge branches 'upstream', 'upstream-fixes' and 'debugfs' into for-linus 2009-09-13 20:09:41 +02:00
Henrik Rydberg
9de48cc300 Input: bcm5974 - silence uninitialized variables warnings
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-09-13 09:13:38 -07:00
Dmitry Torokhov
85927b0d52 Input: wistron_btns - add keymap for AOpen 1557
This one does not have anything useful in DMI either so again
we need to use:

    force=1 keymap=aopen1557

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2009-09-13 09:12:21 -07:00
James Bottomley
ea038f63ac [SCSI] fix oops during scsi scanning
Chris Webb reported:
  p0# uname -a
  Linux f7ea8425-d45b-490f-a738-d181d0df6963.host.elastichosts.com 2.6.30.4-elastic-lon-p #2 SMP PREEMPT Thu Aug 20 14:30:50 BST 2009 x86_64 Intel(R) Xeon(R) CPU E5420 @ 2.50GHz GenuineIntel GNU/Linux
  p0# zgrep SCAN_ASYNC /proc/config.gz
  # CONFIG_SCSI_SCAN_ASYNC is not set

  p0# cat /var/log/kern/2009-08-20
  [...]
  15:27:10.485 kernel: scsi9 : iSCSI Initiator over TCP/IP
  15:27:11.493 kernel: scsi 9:0:0:0: RAID              IET      Controller       0001 PQ: 0 ANSI: 5
  15:27:11.493 kernel: scsi 9:0:0:0: Attached scsi generic sg6 type 12
  15:27:11.495 kernel: scsi 9:0:0:1: Direct-Access     IET      VIRTUAL-DISK     0001 PQ: 0 ANSI: 5
  15:27:11.495 kernel: sd 9:0:0:1: Attached scsi generic sg7 type 0
  15:27:11.495 kernel: sd 9:0:0:1: [sdg] 4194304 512-byte hardware sectors: (2.14 GB/2.00 GiB)
  15:27:11.495 kernel: sd 9:0:0:1: [sdg] Write Protect is off
  15:27:11.495 kernel: sd 9:0:0:1: [sdg] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
  15:27:13.012 kernel: sdg:<6>scsi 9:0:0:1: [sdg] Unhandled error code
  15:27:13.012 kernel: scsi 9:0:0:1: [sdg] Result: hostbyte=0x07 driverbyte=0x00
  15:27:13.012 kernel: end_request: I/O error, dev sdg, sector 0
  15:27:13.012 kernel: Buffer I/O error on device sdg, logical block 0
  15:27:13.012 kernel: ldm_validate_partition_table(): Disk read failed.
  15:27:13.012 kernel: unable to read partition table
  15:27:13.014 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
  15:27:13.014 kernel: IP: [<ffffffff803f0d77>] disk_part_iter_next+0x74/0xfd
  15:27:13.014 kernel: PGD 82ad0b067 PUD 82cd7e067 PMD 0 
  15:27:13.014 kernel: Oops: 0000 [#1] PREEMPT SMP 
  15:27:13.014 kernel: last sysfs file: /sys/devices/platform/host9/session4/iscsi_session/session4/ifacename
  15:27:13.014 kernel: CPU 5 
  15:27:13.014 kernel: Modules linked in:
  15:27:13.014 kernel: Pid: 13999, comm: async/0 Not tainted 2.6.30.4-elastic-lon-p #2 X7DBN
  15:27:13.014 kernel: RIP: 0010:[<ffffffff803f0d77>]  [<ffffffff803f0d77>] disk_part_iter_next+0x74/0xfd
  15:27:13.014 kernel: RSP: 0018:ffff88066afa3dd0  EFLAGS: 00010246
  15:27:13.014 kernel: RAX: ffff88082b58a000 RBX: ffff88066afa3e00 RCX: 0000000000000000
  15:27:13.014 kernel: RDX: 0000000000000000 RSI: ffff88082b58a000 RDI: 0000000000000000
  15:27:13.014 kernel: RBP: ffff88066afa3df0 R08: ffff88066afa2000 R09: ffff8806a204f000
  15:27:13.014 kernel: R10: 000000fb12c7d274 R11: ffff8806c2bf0628 R12: ffff88066afa3e00
  15:27:13.014 kernel: R13: ffff88082c829a00 R14: 0000000000000000 R15: ffff8806bc50c920
  15:27:13.014 kernel: FS:  0000000000000000(0000) GS:ffff88002818a000(0000) knlGS:0000000000000000
  15:27:13.014 kernel: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
  15:27:13.014 kernel: CR2: 0000000000000010 CR3: 000000082ade3000 CR4: 00000000000426e0
  15:27:13.014 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  15:27:13.014 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  15:27:13.014 kernel: Process async/0 (pid: 13999, threadinfo ffff88066afa2000, task ffff8806c2bf05e0)
  15:27:13.014 kernel: Stack:
  15:27:13.014 kernel: 0000000000000000 ffff88066afa3e00 ffff88066afa3e00 ffff88082c829a00
  15:27:13.014 kernel: ffff88066afa3e40 ffffffff80306feb ffff88082b58a000 0000000000000000
  15:27:13.014 kernel: 0000000000000001 ffff8806bc50c920 ffff88066afa3e40 ffff88082b58a000
  15:27:13.014 kernel: Call Trace:
  15:27:13.014 kernel: [<ffffffff80306feb>] register_disk+0x122/0x13a
  15:27:13.014 kernel: [<ffffffff803f0b0f>] add_disk+0xaa/0x106
  15:27:13.014 kernel: [<ffffffff80493609>] sd_probe_async+0x198/0x25b
  15:27:13.014 kernel: [<ffffffff80270482>] async_thread+0x10c/0x20d
  15:27:13.014 kernel: [<ffffffff802545ff>] ? default_wake_function+0x0/0xf
  15:27:13.014 kernel: [<ffffffff80270376>] ? async_thread+0x0/0x20d
  15:27:13.014 kernel: [<ffffffff8026ad89>] kthread+0x55/0x80
  15:27:13.014 kernel: [<ffffffff8022be6a>] child_rip+0xa/0x20
  15:27:13.014 kernel: [<ffffffff8026ad34>] ? kthread+0x0/0x80
  15:27:13.014 kernel: [<ffffffff8022be60>] ? child_rip+0x0/0x20
  15:27:13.014 kernel: Code: c8 ff 80 e1 0c b9 00 00 00 00 0f 44 c1 41 83 cd ff 48 8d 7a 20 48 be ff ff ff ff 08 00 00 00 48 b9 00 00 00 00 08 00 00 00 eb 50 <8b> 42 10 41 bd 01 00 00 00 eb db 4c 63 c2 4e 8d 04 c7 4d 8b 20 
  15:27:13.015 kernel: RIP  [<ffffffff803f0d77>] disk_part_iter_next+0x74/0xfd
  15:27:13.015 kernel: RSP <ffff88066afa3dd0>
  15:27:13.015 kernel: CR2: 0000000000000010
  15:27:13.015 kernel: ---[ end trace 6104b56ef5590e25 ]---

The problem is caused because the async scanning split in sd.c doesn't hold
any reference to the device when it kicks off the async piece.  What's
happening is that an iSCSI disconnect is destorying the device again *before*
the async sd scanning thread even starts.  Fix this by taking a reference
before starting the thread and dropping it again when the thread completes.

Reported-by: Chris Webb <chris@arachsys.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:34 -05:00
Bart Van Assche
afffd3dabe [SCSI] libsrp: fix memory leak in srp_ring_free()
This patch fixes a memory leak in the libsrp function srp_ring_free().
It is not documented whether or not this function should free the ring
pointer itself. But the source code of the callers of this function
(srp_target_alloc() and srp_target_free()) makes it clear that
srp_ring_free() should deallocate the ring pointer itself. Furthermore,
the patch below makes srp_ring_free() deallocate all memory allocated by
srp_ring_alloc().

This patch affects the ibmvstgt driver, which is the only in-tree driver
that calls the srp_ring_free() function (indirectly).

Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com>
Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:34 -05:00
Mike Christie
661134ad37 [SCSI] libiscsi, bnx2i: make bound ep check common
bnx2i currently has a check for if a ep is properly bound, so if
iscsi_queuecommand/xmit_task is called while there is no ep
we will not queue IO.

be2iscsi sends IO from queuecommand/xmit_task like how bnx2i does
and needs a similar test. This patch has us just use the suspend_bit
test for this.

When ep_poll has succeeed iscsid will call conn_bind, the LLD will
then call iscsi_conn_bind which will clear the suspend bit.
When ep_disconnect is called (or if there is a conn error) we set
the suspend bit. For the ep_disconnect case I am adding a helper
in this patch that will take the session lock to make sure
iscsi_queuecommand/xmit_task is not running and it will set
the suspend bit.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:33 -05:00
Mike Christie
4c0ba5d259 [SCSI] libiscsi: add completion function for drivers that do not need pdu processing
beiscsi does not need the iscsi scsi cmd processing. It does not
even get this info on the completion path. This adds a function
to just update the sequencing numbers and complete a task.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Jayamohan Kallickal <jayamohank@serverengines.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:33 -05:00
Moger, Babu
dd784edcfc [SCSI] scsi_dh_rdac: changes for rdac debug logging
Patch to add debugging stuff for rdac device handler.
- Added a bit mask "module parameter" rdac_logging with 2 bits for each type
  of logging.
- currently defined only two types of logging(failover and sense logging). Can
  be enhanced later if required.
- By default only failover logging is enabled which is equivalent of current
  logging.

Signed-off-by: Babu Moger <babu.moger@lsi.com>
Reviewed-by: Vijay Chauhan <vijay.chauhan@lsi.com>
Reviewed-by: Bob Stankey <Robert.stankey@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:31 -05:00
Moger, Babu
1527666e6a [SCSI] scsi_dh_rdac: changes to collect the rdac debug information during the initialization
Adding the code to read the debug information during initialization. This
patch collects the information about storage and controllers during
rdac_activate.

Signed-off-by: Babu Moger <babu.moger@lsi.com>
Reviewed-by: Vijay Chauhan <vijay.chauhan@lsi.com>
Reviewed-by: Bob Stankey <Robert.stankey@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:31 -05:00
Moger, Babu
87b79a5327 [SCSI] scsi_dh_rdac: move the init code from rdac_activate to rdac_bus_attach
Moving the initialization code from rdac_activate to rdac_bus_attach which is
more efficient.  We don't have to collect all the information during every
activate.

Signed-off-by: Babu Moger <babu.moger@lsi.com>
Reviewed-by: Vijay Chauhan <vijay.chauhan@lsi.com>
Reviewed-by: Bob Stankey <Robert.stankey@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:30 -05:00
Michal Schmidt
e71044ee2e [SCSI] sg: fix oops in the error path in sg_build_indirect()
When the allocation fails in sg_build_indirect(), an oops happens in
the error path. It's caused by an obvious typo.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reported-by: Bob Tracy <rct@gherkin.frus.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:30 -05:00
Kashyap, Desai
b437b95620 [SCSI] mptsas : Bump version to 3.04.12
Bump version to 3.04.12

Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:29 -05:00
Kashyap, Desai
9766096d33 [SCSI] mptsas : FW event thread and scsi mid layer deadlock in SYNCHRONIZE CACHE command
Normally In HBA reset path MPT driver will flush existing work in current work
queue (mpt/0) . This is just a dummy activity for MPT driver point of
view, since HBA reset will turn off Work queue events.

It means we will simply returns from work queue without doing anything.
But for the case where Work is already done (half the way), we have to have
that work to be done.

Considering above condition we stuck forever since Deadlock in scsi midlayer
and MPT driver. sd_sync_cache() will wait forever since HBA is not in
Running state, and it will never come into Running state since
sd_sync_cache() is called from HBA reset context.
Now new code will not wait for half cooked work to be finished
before returning from HBA reset.

Once we are out of HBA reset, EH thread will change host state to running from
recovery and work waiting for running state of HBA will be finished.
New code is turning ON firmware event from another special work called
Rescan toplogy.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:28 -05:00
Kashyap, Desai
fea984034b [SCSI] mptsas : Send DID_NO_CONNECT for pending IOs of removed device
Driver is modified to return DID_NO_CONNECT for all pending I/O
requests for bus type SAS, if it founds the target is removed at
the firmware level.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:28 -05:00
Kashyap, Desai
c55b89fba9 [SCSI] mptsas : PAE Kernel more than 4 GB kernel panic
This patch is solving problem for PAE kernel DMA operation.
On PAE system dma_addr and unsigned long will have different
values.
Now dma_addr is not type casted using unsigned long.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:27 -05:00
Kashyap, Desai
f44fd18198 [SCSI] mptsas : NULL pointer on big endian systems causing Expander not to tear off
On Big endian system kernel will crash due to address translation
is not handle properly.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:26 -05:00
Kashyap, Desai
9e39089b95 [SCSI] mptsas : Sanity check for phyinfo is added
Check for phyinfo->phy before calling sas_port_delete_phy.

Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:25 -05:00
Charlie Brady
5bab08858c [SCSI] scsi_dh_rdac: Add support for Sun StorageTek ST2500, ST2510 and ST2530
These storage arrays can use RDAC when configured with 'linux' as the
initiator.

http://www.sun.com/storage/disk_systems/workgroup/2500/
http://www.sun.com/storage/disk_systems/workgroup/2510/
http://www.sun.com/storage/disk_systems/workgroup/2530/

Signed-off-by: Charlie Brady <charlieb@budge.apana.org.au>
Reviewed-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:25 -05:00
Anil Ravindranath
89a3681041 [SCSI] pmcraid: PMC-Sierra MaxRAID driver to support 6Gb/s SAS RAID controller
Signed-off-by: Anil Ravindranath <anil_ravindranath@pmc-sierra.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:24 -05:00
Giridhar Malavali
073ed91e24 [SCSI] qla2xxx: Update version number to 8.03.01-k6.
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:23 -05:00
Andrew Vasquez
67becc0041 [SCSI] qla2xxx: Properly delete rports attached to a vport.
Original code would inadvertently skip the deferred
fc_remote_port_delete() call for rports hanging off any vport.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:23 -05:00
Andrew Vasquez
0d6e61bc6a [SCSI] qla2xxx: Correct various NPIV issues.
* Consolidate vport-count processing.
* Correct vp_idx restrictions during RSCN processing.
* Push topology verification check to qla2x00_do_dpc_all_vps().
* Don't skip vport full-login-lip/lip-reset mailbox handling.

Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-09-12 09:35:22 -05:00