1
Commit Graph

1665 Commits

Author SHA1 Message Date
Tejun Heo
f686bcb807 [PATCH] libata-eh-fw: implement new EH scheduling via error completion
There are several ways a qc can get schedule for EH in new EH.  This
patch implements one of them - completing a qc with ATA_QCFLAG_FAILED
set or with non-zero qc->err_mask.  ALL such qc's are examined by EH.

New EH schedules a qc for EH from completion iff ->error_handler is
implemented, qc is marked as failed or qc->err_mask is non-zero and
the command is not an internal command (internal cmd is handled via
->post_internal_cmd).  The EH scheduling itself is performed by asking
SCSI midlayer to schedule EH for the specified scmd.

For drivers implementing old-EH, nothing changes.  As this change
makes ata_qc_complete() rather large, it's not inlined anymore and
__ata_qc_complete() is exported to other parts of libata for later
use.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:05 +09:00
Tejun Heo
f69499f42c [PATCH] libata-eh-fw: update ata_qc_from_tag() to enforce normal/EH qc ownership
New EH framework has clear distinction about who owns a qc.  Every qc
starts owned by normal execution path - PIO, interrupt or whatever.
When an exception condition occurs which affects the qc, the qc gets
scheduled for EH.  Note that some events (say, link lost and regained,
command timeout) may schedule qc's which are not directly related but
could have been affected for EH too.  Scheduling for EH is atomic
w.r.t. ap->host_set->lock and once schedule for EH, normal execution
path is not allowed to access the qc in whatever way.  (PIO
synchronization acts a bit different and will be dealt with later)

This patch make ata_qc_from_tag() check whether a qc is active and
owned by normal path before returning it.  If conditions don't match,
NULL is returned and thus access to the qc is denied.
__ata_qc_from_tag() is the original ata_qc_from_tag() and is used by
libata core/EH layers to access inactive/failed qc's.

This change is applied only if the associated LLDD implements new EH
as indicated by non-NULL ->error_handler

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:03 +09:00
Tejun Heo
2ab7db1ff1 [PATCH] libata-eh-fw: use special reserved tag and qc for internal commands
New EH may issue internal commands to recover from error while failed
qc's are still hanging around.  To allow such usage, reserve tag
ATA_MAX_QUEUE-1 for internal command.  This also makes it easy to tell
whether a qc is for internal command or not.  ata_tag_internal() test
implements this test.

To avoid breaking existing drivers, ata_exec_internal() uses
ATA_TAG_INTERNAL only for drivers which implement ->error_handler.
For drivers using old EH, tag 0 is used.  Note that this makes
ata_tag_internal() test valid only when ->error_handler is
implemented.  This is okay as drivers on old EH should not and does
not have any reason to use ata_tag_internal().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:02 +09:00
Tejun Heo
dc2b351586 [PATCH] libata-eh-fw: clear SError in ata_std_postreset()
Clear SError in ata_std_postreset().  This is to clear SError bits
which get set during reset.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:00 +09:00
Tejun Heo
f15a1dafed [PATCH] libata: use ATA printk helpers
Use ATA printk helpers.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:56 +09:00
Tejun Heo
3373efd89d [PATCH] libata: use dev->ap
Use dev->ap where possible and eliminate superflous @ap from functions
and structures.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:53 +09:00
Tejun Heo
38d87234d6 [PATCH] libata: add dev->ap
Add dev->ap which points back to the port the device belongs to.  This
makes it unnecessary to pass @ap for silly reasons (e.g. printks).
Also, this change is necessary to accomodate later PM support which
will introduce ATA link inbetween port and device.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:51 +09:00
Tejun Heo
81952c5497 [PATCH] libata: use new SCR and on/offline functions
Use new SCR and on/offline functions.  Note that for LLDD which know
it implements SCR callbacks, SCR functions are guaranteed to succeed
and ata_port_online() == !ata_port_offline().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:47 +09:00
Tejun Heo
34bf21704c [PATCH] libata: implement new SCR handling and port on/offline functions
Implement ata_scr_{valid|read|write|write_flush}() and
ata_port_{online|offline}().  These functions replace
scr_{read|write}() and sata_dev_present().

Major difference between between the new SCR functions and the old
ones is that the new ones have a way to signal error to the caller.
This makes handling SCR-available and SCR-unavailable cases in the
same path easier.  Also, it eases later PM implementation where SCR
access can fail due to various reasons.

ata_port_{online|offline}() functions return 1 only when they are
affirmitive of the condition.  e.g.  if SCR is unaccessible or
presence cannot be determined for other reasons, these functions
return 0.  So, ata_port_online() != !ata_port_offline().  This
distinction is useful in many exception handling cases.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:46 +09:00
Tejun Heo
838df6284c [PATCH] libata: init ap->cbl to ATA_CBL_SATA early
Init ap->cbl to ATA_CBL_SATA in ata_host_init().  This is necessary
for soon-to-follow SCR handling function changes.  LLDDs are free to
change ap->cbl during probing.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:44 +09:00
Tejun Heo
ce5f7f3d0c [PATCH] sata_sil24: update TF image only when necessary
Update TF image (pp->tf) only when necessary.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:42 +09:00
Tejun Heo
e61e067227 [PATCH] libata: implement qc->result_tf
Add qc->result_tf and ATA_QCFLAG_RESULT_TF.  This moves the
responsibility of loading result TF from post-compltion path to qc
execution path.  qc->result_tf is loaded if explicitly requested or
the qc failsa.  This allows more efficient completion implementation
and correct handling of result TF for controllers which don't have
global TF representation such as sil3124/32.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:40 +09:00
Tejun Heo
96bd39ec29 [PATCH] libata: remove postreset handling from ata_do_reset()
Make ata_do_reset() deal only with reset.  postreset is now the
responsibility of the caller.  This is simpler and eases later
prereset addition.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:38 +09:00
Tejun Heo
3adcebb2b5 [PATCH] libata: move ->set_mode() handling into ata_set_mode()
Move ->set_mode() handlng into ata_set_mode().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:37 +09:00
Tejun Heo
fe635c7e91 [PATCH] libata: use preallocated buffers
It's not a very good idea to allocate memory during EH.  Use
statically allocated buffer for dev->id[] and add 512byte buffer
ap->sector_buf.  This buffer is owned by EH (or probing) and to be
used as temporary buffer for various purposes (IDENTIFY, NCQ log page
10h, PM GSCR block).

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:35 +09:00
Tejun Heo
158693031d [PATCH] libata: hold host_set lock while finishing internal qc
Hold host_set lock while finishing internal qc.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:33 +09:00
Tejun Heo
7401abf2f4 [PATCH] libata: clear ap->active_tag atomically w.r.t. command completion
ap->active_tag was cleared in ata_qc_free().  This left ap->active_tag
dangling after ata_qc_complete().  Spurious interrupts inbetween could
incorrectly access the qc.  Clear active_tag in ata_qc_complete().
This change is necessary for later EH changes.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:32 +09:00
Tejun Heo
f8c2c4202d [PATCH] libata: fix ->phy_reset class code handling in ata_bus_probe()
ata_bus_probe() doesn't clear dev->class after ->phy_reset().  This
can result in falsely enabled devices if probing fails.  Clear
dev->class to ATA_DEV_UNKNOWN after fetching it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:30 +09:00
Tejun Heo
e23befe901 [PATCH] libata: unexport ata_scsi_error()
While moving ata_scsi_error() from LLDD sht to libata transportt,
EXPORT_SYMBOL_GPL() entry was left out.  Kill it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:27 +09:00
Tejun Heo
e4fac92ae7 [PATCH] ahci: hardreset classification fix
AHCI calls ata_dev_classify() even when no device is attached which
results in false class code.  Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:25 +09:00
Tejun Heo
3c567b7d11 [PATCH] libata: rename ata_down_sata_spd_limit() and friends
Rename ata_down_sata_spd_limit() and friends to sata_down_spd_limit()
and likewise for simplicity & consistency.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:23 +09:00
Tejun Heo
c44078c03f [PATCH] libata: silly fix in ata_scsi_start_stop_xlat()
Don't directly access &qc->tf when tf == &qc->tf.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:21 +09:00
Tejun Heo
ee7863bc68 [PATCH] SCSI: implement shost->host_eh_scheduled
libata needs to invoke EH without scmd.  This patch adds
shost->host_eh_scheduled to implement such behavior.

Currently the only user of this feature is libata and no general
interface is defined.  This patch simply adds handling for
host_eh_scheduled where needed and exports scsi_eh_wakeup() to
modules.  The rest is upto libata.  This is the result of the
following discussion.

http://thread.gmane.org/gmane.linux.scsi/23853/focus=9760

In short, SCSI host is not supposed to know about exceptions unrelated
to specific device or command.  Such exceptions should be handled by
transport layer proper.  However, the distinction is not essential to
ATA and libata is planning to depart from SCSI, so, for the time
being, libata will be using SCSI EH to handle such exceptions.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:20 +09:00
Luben Tuikov
89f48c4d67 [PATCH] SCSI: Introduce scsi_req_abort_cmd (REPOST)
Introduce scsi_req_abort_cmd(struct scsi_cmnd *).
This function requests that SCSI Core start recovery for the
command by deleting the timer and adding the command to the eh
queue.  It can be called by either LLDDs or SCSI Core.  LLDDs who
implement their own error recovery MAY ignore the timeout event if
they generated scsi_req_abort_cmd.

First post:
http://marc.theaimsgroup.com/?l=linux-scsi&m=113833937421677&w=2

Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:18 +09:00
Bastiaan Jacques
bf2af2a202 [PATCH] ahci: add support for VIA VT8251
Adds AHCI support for the VIA VT8251.

Includes a workaround for a hardware bug which requires a Command List
Override before softreset.

Signed-off-by: Bastiaan Jacques <b.jacques@planet.nl>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-04-20 18:37:08 -04:00
Jeff Garzik
4741c336d2 Merge branch 'master' 2006-04-18 04:54:00 -04:00
James Bottomley
7676f83aeb [SCSI] scsi_transport_sas: don't scan a non-existent end device
Any end device that can't support any of the scanning protocols
shouldn't be scanned, so set its id to -1 to prevent
scsi_scan_target() being called for it.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-14 15:33:41 -05:00
adam radford
1e08dcb39c [SCSI] 3ware 9000 disable local irqs during kmap_atomic
Equivalent of the same patch for the 3w-xxxx driver.

Signed-off-by: Adam Radford <linuxraid@amcc.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 13:27:38 -05:00
Tejun Heo
e36e0c8013 [SCSI] SCSI: fix scsi_kill_request() busy count handling
scsi_kill_request() completes requests via normal SCSI completion path
which decrements busy counts; however, requests which get passed to
scsi_kill_request() aren't holding busy counts and scsi_kill_request()
don't increment them before invoking completion path resulting in
incorrect busy counts.  Bump up busy counts before invoking completion
path.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 13:25:49 -05:00
James Smart
aedf349773 [SCSI] FC transport: fixes for workq deadlocks
As previously reported via Michael Reed, the FC transport took a hit
in 2.6.15 (perhaps a little earlier) when we solved a recursion error.
There are 2 deadlocks occurring:
- With scan and the delete items sharing the same workq, flushing the
  workq for the delete code was getting it stalled behind a very long
  running scan code path.
- There's a deadlock where scsi_remove_target() has to sit behind
  scsi_scan_target() due to contention over the scan_lock().

This patch resolves the 1st deadlock and significantly reduces the
odds of the second. So far, we have only replicated the 2nd deadlock
on a highly-parallel SMP system. More on the 2nd deadlock in a following
email.

This patch reworks the transport to:
- Only use the scsi host workq for scanning
- Use 2 other workq's internally. One for deletions, the other for
  scheduled deletions. Originally, we tried this with a single workq,
  but the occassional flushes of the scheduled queues was hitting the
  second deadlock with a slightly higher frequency. In the future, we'll
  look at the LLDD's and the transport to see if we can get rid of this
  extra overhead.
- When moving to the other workq's we tightened up some object states
  and some lock handling.
- Properly syncs adds/deletes
- minor code cleanups
  - directly reference fc_host_attrs, rather than through attribute
    macros
  - flush the right workq on delayed work cancel failures.

Large kudos to Michael Reed who has been working this issue for the last
month.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 13:25:16 -05:00
Hannes Reinecke
b0d2364887 [SCSI] aic79xx: target hotplug fixes
When a target is added aic79xx tries to be overly clever: it changes
the command on the fly to TEST UNIT READY and tries to requeue the
original command. Sadly this breaks SCSI compability and of course
the midlayer is getting a bit confused by it.

So we're just removing that bit of code and let the midlayer deal with
it. It's clever enough by now. And the driver code is getting simpler.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 13:23:11 -05:00
FUJITA Tomonori
441f987ca4 [SCSI] ibmvscsi: remove drivers/scsi/ibmvscsi/srp.h
It's no longer needed after the convrsion to use the linux srp.h file.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 13:21:51 -05:00
Hannes Reinecke
f41b5cec9b [SCSI] aic79xx bus reset update
As James B. correctly noted, ahd_reset_channel() in
ahd_linux_bus_reset() should be protected by ahd_lock().  However, the
main reason for not doing so was a deadlock with the interesting
polling mechanism to detect the end a bus reset.

This patch replaces the polling mechanism with a saner signalling via
flags; it also gives us the benefit of detecting any multiple calls to
ahd_reset_channel().

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 12:56:15 -05:00
James Bottomley
4d7db04a7a [SCSI] add SCSI_UNKNOWN and LUN transfer limit restrictions
Original From: Ingo Flaschberger <if@xip.at>

To support the RA4100 array from Compaq.

This patch now correctly handles SCSI_UNKNOWN types with regard to
BLIST_REPORTLUNS2 (allow it) and cdb[1] LUN inclusion (don't).

It also allows a BLIST_MAX_512 flag to restrict the maximum transfer
length to 512 blocks (apparently this is an RA4100 problem).

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:31 -05:00
Matthew Wilcox
d637c4543f [SCSI] sym2: Fix build when spinlock debugging is enabled
When spinlock debugging is turned on, a struct completion grows beyond the
size allowed for the scsi_pointer.  So move the struct completion back onto
the stack.  The additional memory barriers are to keep us from completing
a random piece of kernel stack if the command happens to complete after
the error handling has finished.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:30 -05:00
Brian King
80286d478c [SCSI] ipr: Bump version
Bump ipr driver version to 2.1.3

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:30 -05:00
Brian King
c651309671 [SCSI] ipr: Reset device cleanup
Encapsulate some more of the device reset processing in
preparation for SATA support.

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:29 -05:00
Brian King
fb3ed3cb4b [SCSI] ipr: printk macro cleanup/removal
Remove some unused printk macros, make some more robust, and
convert some to use standard printk macros when possible.

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:29 -05:00
Brian King
fe964d0a4b [SCSI] ipr: Simplify status area dumping
Simplify the dumping of the command status area by
removing some device specific information that has proven
to not be worthwhile.

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:28 -05:00
Brian King
e4fbf44ed0 [SCSI] ipr: Fixup device type check
Fixup a check used by the ipr driver to determine if a given
device is a SCSI disk. Due to the addition of support for
attaching SATA devices, this check needs to be more robust.

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:28 -05:00
Brian King
1121b794a3 [SCSI] ipr: Disk remove path cleanup
Instead of NULLing the resource entry pointer when a disk
goes away to prevent any new commands being sent to it,
set the adapter resource handle to an invalid value so
new ops getting sent to it will fail with a selection timeout
response. This patch is needed for future SATA patches.

Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:28 -05:00
Douglas Gilbert
c06bb7f514 [SCSI] sg: fix leak when dio setup fails
when the sg driver is unable to setup direct IO, free that scatter
gather list prior to falling back to indirect IO

Further to this thread started by Bryan Holty:
http://marc.theaimsgroup.com/?l=linux-scsi&m=114306885116728&w=2

Here is the reworked patch again. This time it has been
tested with a program provided by Bryan.

Signed-off-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:27 -05:00
James Bottomley
d6159c17c2 [SCSI] expose sas internal class for the domain transport
necessary to make the domain class use the internal structures

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:27 -05:00
KAMEZAWA Hiroyuki
530bba6fa8 [SCSI] for_each_possible_cpu: scsi
for_each_cpu() actually iterates across all possible CPUs.  We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs.  This is inefficient and
possibly buggy.

We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.

This patch replaces for_each_cpu with for_each_possible_cpu.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:26 -05:00
Matthew Wilcox
ac05165179 [SCSI] Version 2.2.3
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:26 -05:00
Matthew Wilcox
14ac8bf58d [SCSI] Enable clustering and large transfers
This patch enables clustering and sets max_sectors to 0xffff to enable
reading and writing of large blocks with tapes (and large transfers with
sg). This change is needed after the sg and st drivers started using
chained bios through scsi_request_async() in 2.6.16.

Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:26 -05:00
Matthew Wilcox
b4e93a739e [SCSI] Simplify error handling
Use wait_for_completion_timeout() instead of using a timer (as
Christoph Hellwig did for aic7xxx).

That lets me eliminate the sym_eh_wait structure; the struct completion,
the old_done pointer and the to_do flag can be folded into the sym_ucmd
(which overrides the scsi_pointer in scsi_cmnd).

The sym_eh_done() function becomes much simpler as the timeout handling
is done in sym_eh_handler() directly.

The host_lock can be unlocked earlier, and I cache the host in
a local variable to make accesses to it quicker.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:25 -05:00
Matthew Wilcox
c2349df918 [SCSI] Allow nvram settings to determine bus mode
The PDC code can set the bus mode, but we were ignoring that setting.
Also move the code that determines bus mode into its own function.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:25 -05:00
Matthew Wilcox
92d578b94c [SCSI] Use SPI messages where possible
Now sym2 is using spi_print_msg, we don't need to have our own messages
for IGNORE WIDE RESIDUE and MODIFY DATA POINTER, so provide the option
of passing NULL for the label.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:24 -05:00
Matthew Wilcox
3bea15a76e [SCSI] Disable sym2 driver queueing
Undef SYM_OPT_HANDLE_DEVICE_QUEUEING.
Call sym_put_start_queue instead of sym_start_next_ccbs.
Turn asserts into checks that we can send the command to the adapter,
and return busy from queuecommand if we can't.

Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2006-04-13 10:13:24 -05:00