It is possible that a ctask could be completing and getting
cleaned up at the same time, we are finishing up the last
data transfer. This could then result in the data transfer
code using stale or invalid values. This patch adds a refcount
to the ctask. When the count goes to zero then we know the
transmit thread and recv thread or softirq are not touching
it and we can safely release it.
The eh should not need to grab a reference because it only cleans
up a task if it has both the xmit mutex and recv lock (or recv
side suspended).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
iSCSI RFC states that the first burst length must be smaller than the
max burst length. We currently assume targets will be good, but that may
not be the case, so this patch adds a check.
This patch also moves the unsol data out offset to the lib so the LLDs
do not have to track it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Sanitize the Vendor, Product, and Revision strings contained in an
INQUIRY result by setting all non-graphic or non-ASCII characters to ' '.
Since the standard disallows such characters, this will affect
only non-compliant devices.
To help maintain backward compatibility, NUL characters are treated
specially. They are taken as string terminators; they and all the
following characters are set to ' '. If some valid characters get
erased as a result... well, we weren't seeing them before so we haven't
lost anything.
The primary purpose of this change is to allow blacklist entries to
match devices with illegal Vendor or Product strings.
In addition, the patch updates a couple of function prototypes, giving
inq_result its correct type (unsigned char *).
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The fix isn't actually in sd: it's in scsi_device_get(). I modified it
to allow devices to be returned in SDEV_CANCEL, but not SDEV_DEL. This
means that the device_remove_driver, which occurs in device_del() in
scsi_remove_device() after the device has gone into SDEV_CANCEL is now
effective at flushing the cache.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds support for sharing tag maps at the host level
(i.e. either every queue [LUN] has its own tag map or there's a single
one for the entire host). This formulation is primarily intended to
help single issue queue hardware, like the aic7xxx
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch sets can_queue in the aic94xx driver's scsi_host to better
performing values than what's there currently. It seems that
asd_ha->seq.can_queue reflects the number of requests that can be
queued per controller; so long as there's one scsi_host per
controller, it seems logical that the scsi_host ought to have the same
can_queue value. To the best of my (still limited) knowledge, this
method provides the correct value.
The effect of leaving this value set to 1 is terrible performance in
the case of either (a) certain Maxtor SAS drives flying solo or (b)
flooding several disks with I/O simultaneously (md-raid). There may be
more scenarios where we see similar problems that I haven't uncovered.
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is the end point of the separate aic94xx driver based on the
original driver and transport class from Luben Tuikov
<ltuikov@yahoo.com>
The log of the separate development is:
Alexis Bruemmer:
o aic94xx: fix hotplug/unplug for expanderless systems
o aic94xx: disable split completion timer/setting by default
o aic94xx: wide port off expander support
o aic94xx: remove various inline functions
o aic94xx: use bitops
o aic94xx: remove queue comment
o aic94xx: remove sas_common.c
o aic94xx: sas remove depot's
o aic94xx: use available list_for_each_entry_safe_reverse()
o aic94xx: sas header file merge
James Bottomley:
o aic94xx: fix TF_TMF_NO_CTX processing
o aic94xx: convert to request_firmware interface
o aic94xx: fix hotplug/unplug
o aic94xx: add link error counts to the expander phys
o aic94xx: add transport class phy reset capability
o aic94xx: remove local_attached flag
o Remove README
o Fixup Makefile variable for libsas rename
o Rename sas->libsas
o aic94xx: correct return code for sas_discover_event
o aic94xx: use parent backlink port
o aic94xx: remove channel abstraction
o aic94xx: fix routing algorithms
o aic94xx: add backlink port
o aic94xx: fix cascaded expander properties
o aic94xx: fix sleep under lock
o aic94xx: fix panic on module removal in complex topology
o aic94xx: make use of the new sas_port
o rename sas_port to asd_sas_port
o Fix for eh_strategy_handler move
o aic94xx: move entirely over to correct transport class formulation
o remove last vestages of sas_rphy_alloc()
o update for eh_timed_out move
o Preliminary expander support for aic94xx
o sas: remove event thread
o minor warning cleanups
o remove last vestiges of id mapping arrays
o Further updates
o Convert aic94xx over entirely to the transport class end device and
o update aic94xx/sas to use the new sas transport class end device
o [PATCH] aic94xx: attaching to the sas transport class
o Add missing completion removal from prior patch
o [PATCH] aic94xx: attaching to the sas transport class
o Build fixes from akpm
Jeff Garzik:
o [scsi aic94xx] Remove ->owner from PCI info table
Luben Tuikov:
o initial aic94xx driver
Mike Anderson:
o aic94xx: fix panic on module insertion
o aic94xx: stub out SATA_DEV case
o aic94xx: compile warning cleanups
o aic94xx: sas_alloc_task
o aic94xx: ref count update
o aic94xx nexus loss time value
o [PATCH] aic94xx: driver assertion in non-x86 BIOS env
Randy Dunlap:
o libsas: externs not needed
Robert Tarte:
o aic94xx: sequence patch - fixes SATA support
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This flag denotes local attachment of the phy. There are two problems
with it:
1) It's actually redundant ... you can get the same information simply
by seeing whether a host is the phys parent
2) we condition a lot of phy parameters on it on the false assumption
that we can only control local phys. I'm wiring up phy resets in the
aic94xx now, and it will be able to reset non-local phys as well.
I fixed 2) by moving the local check into the reset and stats function
of the mptsas, since that seems to be the only HBA that can't
(currently) control non-local phys.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
idescsi_pc_intr() uses local_irq_enable() in IRQ context: annotate it.
(this has no effect on kernels with lockdep disabled. On kernels with lockdep
enabled this means that we wont actually disable interrupts, and the warning
message will go away as well.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The callers of scsi_send_eh_cmnd are setting the cmnd buffer,
and then scsi_send_eh_cmnd is copying that updated buffer to
the old_cmnd variable. Then after the command runs, we end up
copying that old_cmnd var which has the new cmnd to the scsi
command buffer. When this command gets recent, all types of fun
things happen like getting TUR or START_STOP commands with
data and scatterlists.
This patch made against scsi-rc-fixes, has the callers of
scsi_send_eh_cmnd pass in the command so scsi_send_eh_cmnd
can do the right thing. This should go into 2.6.18 since this
fixes a regression added when we removed some of the scsi_cmnd
fields and replaced them with local variables.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Software must explicitely re-enable extended firmware tracing
after any ISP abort condition.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Original code attempts to retry PLOGIs to fcports that are
FCP_TARGETs only. If the driver never performed a successful
PLOGI/PRLI, the port-type would never be assigned, and the
relogin logic would silently drop the request (and thus the port
would not be recognized and registered).
The fix is relatively straightforward, drop the FCP_TARGET-only
check.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
There's a problem where sg is executing a ->nopage operation on a
compound page, it actually calls get_page() on the first page in the
compound rather than the page which is being mapped. The fix is to
select the correct page by indexing into the compound.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
vt6420 has super-fragile SCR registers which can hang the whole
machine if accessed with the wrong timings. This patch makes sata_via
use SCR registers only during probing and with the same timings as
before (pre new EH), which is proven to work.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This patch implements force_pcs module parameter for ata_piix. If 1,
PCS is ignored, 2 honored. As there seem to be quite a few ICHs w/
impaired PCS, this option will be useful for cases where the default
setting doesn't work.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
There have been a number of reports regarding some ICH5s failing to
detect devices since the PCS handling update. Analysis shows that
these problems are caused by bogus PCS values from those controllers.
Before the PCS update, the driver didn't honor PCS regs exactly and
probed them in many cases PCS reports no device. Now that PCS is
honored exactly, these hardware problems are visible.
This patch makes ICH5 ignore PCS.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Move out PCS handling from piix_sata_prereset() into
piix_sata_present_mask() and use it from newly implemented
piix_sata_softreset(). Class codes for devices which are indicated to
be absent by PCS are cleared to ATA_DEV_NONE. This fixes ghost device
problem reported on ICH6 and 7.
This patch moves PCS handling from prereset to softreset, which makes
two behavior changes.
* perform softreset even when PCS indicates no device
* PCS handling is repeated before retrying softresets due to reset
failures.
Both behavior changes are intended and more consistent with how other
drivers behave.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
- Reworked all the very long lines in that block (this drivers full of
them though)
- Returns an error in three places that it didn't before.
- Properly clean up after a scsi_add_host() failure.
Signed-off-by: Daniel Walker <dwalker@mvista.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Stall error handler if attempting resets/aborts while an rport is blocked.
This avoids device offline scenarios due to errors in the error handler.
Background:
Although the transport is using the scsi_timed_out functionality to
restart the timeout if the rport is blocked, if the timeout has already
fired before the block occurs, the eh handler still runs and can take
the device offline. Ultimately, this window cannot be resolved without
significant work in the error handler thread. Christoph noted the first
level of these issues when he noted the poor error response handling
by the error thread.
We found, under heavy load and error testing, that time window from when
the scsi_times_out() adds the io to the queue to when the scsi_error_handler
gets around to servicing it, can be in the several seconds range. In most
cases, these test conditions are highly unusual, but possible.
As a result, we're stalling the error handler in this race window so that
we can avoid the device_offline transitions.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Misc Bug Fixes:
- Cap MBX_DOWN_LINK command timeout to 60 seconds
- Fix double free of ndlp object
- Don't free mbox structures on error. The completion handlers expect to do so.
- Clear host attention work items when going offline
- Fixed discovery issues in multi-initiator environments.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch updates the fc transport for the following:
- Addition of a new attribute "system_hostname" which can be
used to set the fully qualified hostname that the fc_host
is attached to. The fc_host can then register this string
as the FDMI-based host name attribute.
Note: for NPIV, a fc_host could be associated with a system which
is not the local system.
- Add the inline function u64_to_wwn(), which is the inverse of the
existing wwn_to_u64() function.
- Slight reorg, just to keep dynamic attributes with each other, etc
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Convert the pci_device_id-table of the megaraid_sas-driver to
the PCI_DEVICE-macro, to safe some lines.
Signed-off-by: Henrik Kretzschmar <henne@nachtwindheim.de>
Acked-by: "Patro, Sumant" <Sumant.Patro@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Modify beginning string to be more readable. Remove one trailing newline.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
kbuild includes this automatically these days.
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Some targets may return slight variations of PQ and PDT to indicate
no LUN mapped. USB UFI setting PDT=0x1f but having reserved bits for
PQ is one example, and NetApp targets returning PQ=1 and PDT=0x1f is
another. Both instances seem like reasonable responses according to
SPC-3 and UFI specs.
The current scsi_probe_and_add_lun() code adds a scsi_device
for targets that return PQ=1 and PDT=0x1f. This causes LUNs of type
"UNKNOWN" to show up in /proc/scsi/scsi when no LUNs are mapped.
In addition, subsequent rescans fail to recognize LUNs that may be
added on the target, unless preceded by a write to the delete attribute
of the "UNKNOWN" LUN.
This patch addresses this problem by skipping over the scsi_add_lun()
when PQ=1,PDT=0x1f is encountered, and just returns
SCSI_SCAN_TARGET_PRESENT.
Signed-off-by: Dave Wysochanski <davidw@netapp.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
If the adapter is in blinkled (Firmware Assert) when error recovery
timeout actions have been triggered, perform an adapter warm reset and
restart the initialization.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
The enclosed patch cleans up some code fragments, adds some paranoia
(unproven causes of potential driver failures).
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
If the adapter should be in a blinkled (Firmware Assert) state when the
driver loads, we will perform a warm restart of the Adapter Firmware to
see if we can rescue the adapter. Possible causes of a blinkled can
occur on some early release motherboard BIOSes, transitory PCI bus
problems on embedded systems or non-x86 based architectures, transitory
startup failures of early release drives or transitory hardware
failures; some of which can bite the adapter later at runtime. Future
enhancements will include recovery during runtime.
Fixed extra whitespace space issue.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
This patch allows the FSACTL_SEND_LARGE_FIB, FSACTL_SENDFIB and
FSACTL_SEND_RAW_SRB ioctl calls into the aacraid driver to be
interruptible. Only necessary if the adapter and/or the management
software has gone into some sort of misbehavior and the system is being
rebooted, thus permitting the user management software applications to
be killed relatively cleanly. The FIB queue resource is held out of the
free queue until the adapter finally, if ever, completes the command.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Attached is a patch that should limit a possible recursion that can
lead to a stack overflow like follows:
Kernel stack overflow.
CPU: 3 Not tainted
Process zfcperp0.0.d819
(pid: 13897, task: 000000003e0d8cc8, ksp: 000000003499dbb8)
Krnl PSW : 0404000180000000 000000000030f8b2 (get_device+0x12/0x48)
Krnl GPRS: 00000000135a1980 000000000030f758 000000003ed6c1e8 0000000000000005
0000000000000000 000000000044a780 000000003dbf7000 0000000034e15800
000000003621c048 070000003499c108 000000003499c1a0 000000003ed6c000
0000000040895000 00000000408ab630 000000003499c0a0 000000003499c0a0
Krnl Code: a7 fb ff e8 a7 19 00 00 b9 02 00 22 e3 e0 f0 98 00 24 a7 84
Call Trace:
([<000000004089edc2>] scsi_request_fn+0x13e/0x650 [scsi_mod])
[<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
[<000000004089ff8c>] scsi_queue_insert+0x22c/0x2a4 [scsi_mod]
[<000000004089779a>] scsi_dispatch_cmd+0x8a/0x3d0 [scsi_mod]
[<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
...
[<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
[<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
[<000000004089ff8c>] scsi_queue_insert+0x22c/0x2a4 [scsi_mod]
[<000000004089779a>] scsi_dispatch_cmd+0x8a/0x3d0 [scsi_mod]
[<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
[<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
[<000000004089fa9e>] scsi_run_host_queues+0x196/0x230 [scsi_mod]
[<00000000409eba28>] zfcp_erp_thread+0x2638/0x3080 [zfcp]
[<0000000000107462>] kernel_thread_starter+0x6/0xc
[<000000000010745c>] kernel_thread_starter+0x0/0xc
<0>Kernel panic - not syncing: Corrupt kernel stack, can't continue.
This stack overflow occurred during tests on s390 using zfcp.
Recursion depth for this panic was 19.
Usually recursion between blk_run_queue and a request_fn is avoided
using QUEUE_FLAG_REENTER. But this does not help if the scsi stack
tries to flush the starved_list of a scsi_host.
Limit recursion depth when flushing the starved_list
of a scsi_host.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Hi,
Reading the Intel VSC and AHCI it seems like writing 0x302 is incorrect.
The only valid values are 4, 1 and 0. Writing 4 disables the
PHY.
Signed-off-by: Martin Hicks <mort@bork.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
pdc_adma was overlooked and broken by the irq-pio patch:
Only HSM_ST_LAST interrupts should be delivered to this LLDD.
Adding ATA_FLAG_PIO_POLLING to pdc_adma fixes the problem (temporarily),
before we convert the irq handler of pdc_adma to handle all interrupts.
Signed-off-by: Albert Lee <albertcc@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement dummy port which can be requested by setting appropriate bit
in probe_ent->dummy_port_mask. The dummy port is used as placeholder
for stolen legacy port. This allows libata to guarantee that
index_of(ap) == ap->port_no == actual_device_port_no, and thus to
remove error-prone ap->hard_port_no.
As it's used only when one port of a legacy controller is reserved by
some other entity (e.g. IDE), the focus is on keeping the added *code*
complexity at minimum, so dummy port allocates all libata core
resources and acts as a normal port. It just has all dummy port_ops.
This patch only implements dummy port. The following patch will make
libata use it for stolen legacy ports.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Kill host_set->next
Fix simplex support
Allow per platform setting of IDE legacy bases
Some of this can be tidied further later on, in particular all the
legacy port gunge belongs as a PCI quirk/PCI header decode to understand
the special legacy IDE rules in the PCI spec.
Longer term Jeff also wants to move the request_irq/free_irq out of core
which will make this even cleaner.
tj: folded in three followup patches - ata_piix-fix, broken-arch-fix
and fix-new-legacy-handling, and separated per-dev xfermask into
separate patch preceding this one. Folded in fixes are...
* ata_piix-fix: fix build failure due to host_set->next removal
* broken-arch-fix: add missing include/asm-*/libata-portmap.h
* fix-new-legacy-handling:
* In ata_pci_init_legacy_port(), probe_num was incorrectly
incremented during initialization of the secondary port and
probe_ent->n_ports was incorrectly fixed to 1.
* Both legacy ports ended up having the same hard_port_no.
* When printing port information, both legacy ports printed
the first irq.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Implement per-dev xfermask. libata used to determine xfermask
per-port - the fastest mode of the slowest device on the port. This
patch enables per-dev xfermask.
Original patch is from Alan Cox <alan@redhat.com>. The following
changes are made by me.
* simplex warning message is added
* remove disabled device handling code which is never invoked
(originally for choosing port-wide lowest PIO mode)
Cc: Alan Cox <alan@redhat.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
s/ata_host_add/ata_port_add/
s/ata_host_init/ata_port_init/
libata naming got stuck in the middle of a Great Renaming:
ata_host -> ata_port
ata_host_set -> ata_host
To eliminate confusion, let's just give up for now, and simply ensure
that things are internally consistent.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Update ata_host_init() such that it only initializes SCSI host related
stuff and doesn't call into ata_port_init(), and rename it to
ata_port_init_shost().
Signed-off-by: Tejun Heo <htejun@gmail.com>
SCSI EH locks door if sdev->locked is set. Sometimes door lock
command fails continuously (e.g. when medium is not present) and as
libata uses EH to acquire sense data, this easily creates a loop where
a failed lock door invokes EH and EH issues lock door on completion.
This patch clears sdev->locked on door lock failure to break this
loop. This problem has been spotted and diagnosed by Unicorn Chang
<uchang@tw.ibm.com>.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Fix a sata debug print statement that still uses an old variable name.
Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Initial IRQ mask clearing is done by libata-core by freezing all ports
prior to requesting IRQ. Remove redundant IRQ clearing from
init_controller().
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The following patch enhances libata to allow SAS device drivers
to utilize libata to talk to SATA devices. It introduces some
new APIs which allow libata to be used without allocating a
virtual scsi host.
New APIs:
ata_sas_port_alloc - Allocate an ata_port
ata_sas_port_init - Initialize an ata_port (probe device, etc)
ata_sas_port_destroy - Free an ata_port allocated by ata_sas_port_alloc
ata_sas_slave_configure - configure scsi device
ata_sas_queuecmd - queue a scsi command, similar to ata_scsi_queuecomand
These new APIs can be used either directly by a SAS LLDD or could be used
by the SAS transport class.
Possible usage for a SAS LLDD would be:
scsi_scan_host
target_alloc
ata_sas_port_alloc
slave_alloc
ata_sas_port_init
slave_configure
ata_sas_slave_configure
Commands received by the LLDD for SATA devices would call ata_sas_queuecmd.
Device teardown would occur with:
slave_destroy
port_disable
target_destroy
ata_sas_port_destroy
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Move ata_probe_ent_alloc to libata-core. It will also be used by
future SAS/SATA integration patches.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Separate out the ata_port initialization from ata_host_init
so that it can be used in future SAS patches.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
A recent drivers base commit:
3e95637a48
Caused the bus to be added to dev_printk, so now our SCSI inquiry short
messages print like this:
scsiscsi 2:0:0:0: Direct access IBM-ESXS ST973401SS B519 PQ: 0 ANSI: 5
Just remove the "scsi" from the sdev_printk to compensate.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Replace scsi_device_types array API with scsi_device_type function API.
Gets rid of a lot of common code, as well as being easier to use.
- Add the new device types in SPC4 r05a, and rename some of the older ones.
- Reformat the printing of inquiry data; now fits on one line and
includes PQ.
I think I've addressed all the feedback from the previous versions. My
current test box prints:
scsi 2:0:1:0: Direct access HP 18.2G ATLAS10K3_18_SCA HP05 PQ: 0 ANSI: 2
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix up a logic error in the checking for valid sense data.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The ipr driver currently translates adapter recovered errors
to DID_ERROR. This patch fixes this to translate these
errors to success instead.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add definitions for some SAS error codes that can be
logged by ipr SAS adapters.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add some hardware defined types for SATA. This is required
by future patches to add SATA support to ipr.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
SCSI folk forgot to fix up all the uses of 'buffer' before deleting
this struct member. Do it for them to rescue the resulting build
failures.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The hptiop just got merged with a horrible amount of really bad ioctl
code that is against the standards for new scsi drivers. This patch
backs it out (and fixes a small bug where scsi_add_host is called to
early). We can re-add proper APIs once we agree on them.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Conflicts:
arch/ia64/hp/sim/simscsi.c
Stylistic differences in two separate fixes for buffer->request_buffer
problem.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If you examine the queue_work() routine you'll see that it returns
1 on success, 0 if the work is already queued.
This patch corrects the source code documentation for the
scsi_queue_work function.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Currently it is impossible to reset provided by Qlogic QLA2xxx driver
SCSI devices externally using corresponding sg devices, particularly via
sg_reset utility, because qla2xxx driver in qla2xxx_eh_device_reset()
function checks if the input scsi_cmnd has its private data (CMD_SP())
attached. Then the found pointer isn't used anywhere inside of
qla2xxx_eh_device_reset(). If the RESET request comes from sg device, it
doesn't have such private data.
The attached patch removes check for non-NULL CMD_SP() from
qla2xxx_eh_device_reset(), hence allows to reset QLA2xxx's devices using
corresponding sg devices.
AV: change applies to bus/host reset handlers as well.
Signed-off-by: Vladislav Bolkhovitin <vst@vlnb.net>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
ID String and Message fixes
- Fix switch symbolic name registration to match cross-OS values
- Replace printk's with more standard lpfc_printf_log calls
- Make all lpfc_printf_log message numbers unique
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Short bug fixes:
- Fix iocbq list corruption due to missing list_del's in ct handling
- Missing unlock in lpfc_sli_next_iotag()
- Fix initialization of can_queue value
- Differentiate sysfs mailbox errors with different codes.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix race condition between lpfc_sli_issue_mbox and lpfc_online
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix failing firmware download due to mailbox delays needing to be longer.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In the error path, ata_device_add()
* dereferences null host_set->ports[] element.
* calls scsi_remove_host() on not-yet-added shost.
This patch fixes both bugs. The first problem was spotted and initial
patch submitted by Dave Jones <davej@redhat.com>. The second problem
was mentioned and fixed by Jeff Garzik <jgarzik@pobox.com> in a larger
cleanup patch.
Cc: Dave Jones <davej@redhat.com>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
sata_sil24 doesn't make use of probe_ent->mmio_base and setting this
field causes the area to be released twice on detach. Don't set
probe_ent->mmio_base.
Signed-off-by: Tejun Heo <htejun@gmail.com>
To get host_set->private_data initialized reliably, all pinfos need to
point to the same hpriv. Restore pinfo->private_data after pata pinfo
assignment.
Signed-off-by: Tejun Heo <htejun@gmail.com>
ata_prot_detach() did nothing for old EH ports and thus SCSI hosts
associated with those ports are left dangling after they are detached
leaving stale devices and causing oops eventually. Make
ata_port_detach() remove SCSI hosts for old EH ports.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Remove a lot of duplicate #defines from the advansys driver,
and make them look like PCI IDs as defined elsewhere in the kernel.
Also add a module table so that it automatically gets picked up
by tools relying on modinfo output (like say, distro installers).
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Skip protocol test altogether in spurious interrupt code. If PIOS is received
when it shouldn't, ahci will raise protocol violation.
Signed-off-by: Unicorn Chang <uchang@tw.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The sysfs files in arcmsr are non-standard in that they aren't simple
filename value pairs, the values actually contain preceeding text which
would have to be parsed. The idea of sysfs files is that the file name
is the description and the contents is a simple value.
Fix up arcmsr to conform to this standard.
Acked-By: Erich Chen <erich@areca.com.tw>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Remove sysfs_remove_bin_file() return-value checking from the areca driver.
There's nothing a driver can do if sysfs file removal fails, so we'll soon be
changing sysfs_remove_bin_file() to internally print a diagnostic and to
return void.
Cc: Erich Chen <erich@areca.com.tw>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When we introduced -rR then aic7xxx no loger could pick up definition
of YACC&LEX from make - so do it explicit now.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Now that get_property() returns a void *, there's no need to cast its
return value. Also, treat the return value as const, so we can
constify get_property later.
sata_svw changes
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Now that get_property() returns a void *, there's no need to cast its
return value. Also, treat the return value as const, so we can
constify get_property later.
powerpc-specific scsi driver changes.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Implement power management support.
Original implementation is from Zhao, Forrest <forrest.zhao@intel.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Zhao, Forrest <forrest.zhao@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Separate out ahci_reset_controller() and ahci_init_controller() from
ata_host_init(). These will be used by PM callbacks. This patch
doesn't introduce any behavior change.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Zhao, Forrest <forrest.zhao@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement ahci_[de]init_port() and use it during initialization and
de-initialization. ahci_[de]init_port() are supersets of what used to
be done during driver [de-]initialization. This patch makes the
following behavior changes.
* Per-port IRQ mask is cleared on driver load as done in other
drivers. The mask will be configured properly during probe.
* During init_one(), HOST_IRQ_STAT is cleared after masking port IRQs
such that there is no race window.
* CMD_SPIN_UP is cleared during init_one() instead of being set. It
is set in port_start(). This is more consistent with overall
structure of initialization. Note that CMD_SPIN_UP simply controls
PHY activation.
* Slumber and staggered spin-up are handled properly.
* All init/deinit operations are done in step-by-step manner as
described in the spec instead of issued as single merged command.
Original implementation is from Zhao, Forrest <forrest.zhao@intel.com>
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Zhao, Forrest <forrest.zhao@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Simplify ahci_start_engine() by killing prerequisite condition checks.
Rationales are..
* No user checks error return from ahci_start_engine()
* Code flow guarantees the prerequisite conditions unless the
controller is malfunctioning. In such cases, the driver had chances
to learn about the problem _before_ calling this function.
* Closely related to the above two, driver calls into this function
even when prerequisites fail hoping for the best.
Basically, ahci_start_engine() should only do the operation itself.
It isn't the right place to check for prerequisites.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Zhao, Forrest <forrest.zhao@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* fascist-format comments according to comment style used in libata
core layer.
* if() -> if ()
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Zhao, Forrest <forrest.zhao@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* move ahci_port_start/stop() below EH functions. This makes ahci
more consistent with other drivers and makes prototypes for
ahci_start/stop_engine() unnecessary.
* swap positions between ahci_start_engine() and ahci_stop_engine()
for readability.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Zhao, Forrest <forrest.zhao@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
arcmsr is a driver for the Areca Raid controller, a host based RAID
subsystem that speaks SCSI at the firmware level.
This patch is quite a clean up over the initial submission with
contributions from:
Randy Dunlap <rdunlap@xenotime.net>
Christoph Hellwig <hch@lst.de>
Matthew Wilcox <matthew@wil.cx>
Adrian Bunk <bunk@stusta.de>
Signed-off-by: Erich Chen <erich@areca.com.tw>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adds support for change_queue_depth so that device
queue depth can be changed at runtime through sysfs.
Signed-off-by: <brking@charter.net>
Acked-by: Seokmann Ju <seokmann.ju@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
There was an issue in the data structure defined by megaraid driver
casuing "kernel unaligned access.." messages to be displayed during
IOCTL on IA64 platform.
The issue has been reported/fixed by Sakurai Hiroomi
[sakurai_hiro@soft.fujitsu.com].
Signed-Off By: Seokmann Ju <seokmann.ju@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
With this patch, driver will protect data corruption created by
INQUIRY with EVPD request to megaraid controllers. As specified in
the changelog, megaraid F/W already has fixed the issue and being
under process of release. Meanwhile, driver will protect the system
with this patch.
Signed-Off By: Seokmann Ju <seokmann.ju@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch contains
- a fix for 64-bit DMA capability check in megaraid_{mm,mbox} driver.
- includes changes (going back to 32-bit DMA mask if 64-bit DMA mask
failes) suggested by James with previous patch.
- addition of SATA 150-4/6 as commented by Vasily Averin.
With patch, the driver access PCIconfiguration space with dedicated
offset to read a signature. If the signature read, it means that the
controller has capability to handle 64-bit DMA.
Without this patch, the driver used to blindly claim 64-bit DMA
capability.
The issue has been reported by Vasily Averin [vvs@sw.ru].
Thank you Vasily for the reporting.
Signed-Off By: Seokmann Ju <seokmann.ju@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The version info is useful for iscsi tcp, iser and qla4xxx so move to
transport class.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We were leaking some strings. This patch just frees them.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Must pass ISCSI_ERR values from the recv path and propogate them
upwards.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We currently try to allocate a max_recv_data_segment_length
which can be very large (default is 64K), and common uses
are up to 1MB. It is very very difficult to allocte this
much contiguous memory and it turns out we never even use it.
We really only need a couple of pages, so this patch has us
allocates just what we know what we need today.
Later if vendors start adding vendor specific data and
we need to handle large buffers we can do this, but for
the last 4 years we have not seen anyone do this or request
it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
iscsi_tcp can send error events from soft irq context so we
cannot use GFP_KERNEL.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We are touching the cls_session after we have freed
it. This causes a oops.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When we enter recovery and flush the running commands
we cannot freee the connection before flushing the commands.
Some commands may have a reference to the connection
that needs to be released before. iscsi_stop was forcing
the term and suspend too early and was causing a oops
in iser, so this patch removes those callbacks all together
and allows the LLD to handle that detail.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Abort handler fixes.
If a connection is dropped and reconnected while an abort is
running then we should assume the recovery code will clean up
the abort. Not doing so causes a oops.
And if a command completes then we get the status for the abort, we do not
need to call into the LLD to cleanup the resources. Doing this causes
and oops in iser because it ends up freeing some resources twice.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
if iscsi_data_rsp fails we must bail out. Since the pdu values like
data length are invalid we cannot continue to process the data since
it could over run buffers.
This fixes a bug with cisco 5428s where that target is sending
too much data.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The iscsi tcp code can pluck multiple rt2s from the tasks's r2tqueue
in the xmit code. This can result in the task being queued on the xmit queue
but gettting completed at the same time.
This patch fixes the above bug by making the fifo a list so
we always remove the entry on the list del.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In the xmit patch we are sending a -EXXX value to iscsi_conn_failure
which is causing userspace to get confused.
We should be sending a ISCSI_ERR_* value that userspace understands.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
IOP reset message should be posted to inbound message register
instead of outbound message register.
Signed-off-by: HighPoint Linux Team <linux@highpoint-tech.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The follow patch fixes a problem for Matt Taggart.
The Compaq system he had (dl380?) has a SmartArray device that exposes
the 53c1510 device in both RAID and "normal" modes. The difference
is in RAID mode, the smart array driver (IIRC) should claim the
device instead of sym2 driver. Patch below prevents sym2 from
claiming the device when the RAID "daughter board" is attached.
Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
HAL and friends have a tendency to trigger this one all the time.
It's not really interesting, so kill it. The vendor kernels all do
anyways.
Signed-off-by: Jens Axboe <axboe@suse.de>
This fixes three drivers to compile again after my patch that removes
the data_cmnd member from struct scsi_cmnd.
The fas216 change is trivial, it should have been using ->cmnd all the
time.
NCR53C9 (which seem to be mostly duplicate driver with esp.c!) is doing
something odd, it should only have looked at ->cmnd before not the saved
copy that is kept for the error handlers sake. Note that it really
should deal with the sync setting themselves but use the generic domain
validation code that get this right - but that's for later let's push
this simple compile fix for now.
And sorry for the late fix for this, I have been busy with OLS and
associated activities last week.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The data_cmd[] member got deleted, so do not use it any more. Scsi
commands do not have their ->cmd[] overwritten temporary to probe for
status after an error before retrying.
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (38 commits)
[SCSI] More buffer->request_buffer changes
[SCSI] mptfusion: bump version to 3.04.01
[SCSI] mptfusion: misc fix's
[SCSI] mptfusion: firmware download boot fix's
[SCSI] mptfusion: task abort fix's
[SCSI] mptfusion: sas nexus loss support
[SCSI] mptfusion: sas loginfo update
[SCSI] mptfusion: mptctl panic when loading
[SCSI] mptfusion: sas enclosures with smart drive
[SCSI] NCR_D700: misc fixes (section and argument ordering)
[SCSI] scsi_debug: must_check fixes
[SCSI] scsi_transport_sas: kill the use of channel
[SCSI] scsi_transport_sas: add expander backlink
[SCSI] hide EH backup data outside the scsi_cmnd
[SCSI] ibmvscsi: handle inactive SCSI target during probe
[SCSI] ibmvscsi: allocate lpevents for ibmvscsi on iseries
[SCSI] aic7[9x]xx: Remove last vestiges of reverse_scan
[SCSI] aha152x: stop poking at saved scsi_cmnd members
[SCSI] st.c: Improve sense output
[SCSI] lpfc 8.1.7: Change version number to 8.1.7
...
- Make ahci_start_engine() and ahci_stop_engine() more consistent with
AHCI spec 1.1
- Change their input parameter from ap to port_mmio
- Update the existing users of ahci_start_engine() and ahci_stop_engine()
Signed-off-by: Forrest Zhao <forrest.zhao@intel.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Update ata_eh_about_to_do() and ata_eh_done() to improve EH action and
EHI flag handling.
* There are two types of EHI flags - one which expires on successful
EH and the other which expires on a successful reset. Make this
distinction clear.
* Unlike other EH actions, reset actions are represented by two EH
action masks and a EHI modifier. Implement correct about_to_do/done
semantics for resets. That is, prior to reset, related EH info is
sucked in from ehi and cleared, and after reset is complete, related
EH info in ehc is cleared.
These changes improve consistency and remove unnecessary EH actions
caused by stale EH action masks and EHI flags.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* (ata_dev_absent() || ata_dev_ready()) test doesn't indicate
SUSPENDED state properly. Fix it.
* Link resuming resets shouldn't be skipped. Don't skip recovery on
EHI_RESUME_LINK. This doesn't matter for host ports as EHI_RESUME
always coincides with EHI_HOTPLUGGED which makes attached disabled
devices vacant. However, PMP reset causes non-hotplug link-resuming
resets which shouldn't be skipped.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Commit 0662c58b32 updated
ata_eh_autopsy() to OR determined action to ehc->i.action to preserve
action mask set directly into ehc->i.action by nested functions. This
broke action mask clearing on SENSE_VALID case causing revalidation
and EH complete message on successful ATAPI CC.
This patch removes two local variables - action and failed_dev - which
cache ehc->i.action and ehc->i.dev respectively, and make the function
directly modify ehc->i.* fields to remove aliasing issues.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
m15w blacklist overgrew by attributing unrelated problems to m15w
including R_ERR on DMA activate FIS errata. This patch shrinks
sata_sil m15w blacklist such that it's as reported by Silicon Image.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Carlos Pardo <Carlos.Pardo@siliconimage.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Prior to this patch, the driver would do this for each port:
read 8-bit PCS
write 8-bit PCS
read 8-bit PCS
write 8-bit PCS
In the field, flaky behavior has been observed related to this register.
In particular, these overzealous register writes can cause misdetection
problems.
Update to do the following once (not once per port) at boot:
read 16-bit PCS
if needs changing,
write 16-bit PCS
And thereafter, we only perform a 'read 16-bit PCS' per port.
This should eliminate all PCS writes in many cases, and be more friendly
in the cases where we do need to enable ports.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Add host_set private structure piix_host_priv. Currently the only
field is ->map which used to be stored directly at
host_set->private_data. This change allows more host_set private
fields to be added.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Seem like quite a few splipped through the cracks. Here's a patch to
update all references I could find:
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Apparently the D700 has had an argument ordering issue for quite a while
which can cause it to get the wrong scsi_id (I just got an unbootable
voyager system because of this). Hopefully this patch also fixes up all
the sectional mismatches within the driver.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Check all __must_check warnings in scsi_debug.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Using the port_id for the channel is completely unnecessary since the
host_id/target_id are constructed to be globally unique. Also move
the mptsas driver on to virtual channel 1 for its raid devices.
Acked-by: "Moore, Eric" <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds the ability to add a backlink to a particular port. The
idea is to represent properly ports on expanders that are used
specifically for linking to the parent device in the topology.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Move the roundup() macro from binfmt_elf.c into linux/kernel.h as it's
generally useful.
[akpm@osdl.org: nuke all the other implementations]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently struct scsi_cmnd has various fields that are used to backup
original data after the corresponding fields have been overridden for
EH commands. This means drivers can easily get at it and misuse it.
Due to the old_ naming this doesn't happen for most of them, but two
that have different names have been used wrong a lot (see previous
patch). Another downside is that they unessecarily bloat the scsi_cmnd
size.
This patch moves them onstack in scsi_send_eh_cmnd to fix those two
issues aswell as allowing future EH fixes like moving the EH command
submissions to use SG lists like everything else.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Without this patch we register an interrupt with request_irq,
but then return a bad return code from the module probe.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Allocate the correct number of lp events when running
ibmvscsi on legacy iseries
Signed-off-by: Dave Boutcher <sleddog@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Remove last vestiges of the reverse_scan paramater from aic7xxx and aic79xx.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Stop poking into the old_ & co scsi_cmnd fields that should only be used
in the EH code. Untested, but this is required to move ahead with the
EH fixes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Convert this:
st0: Error with sense data: <6>st: Current: sense key: Illegal Request
Additional sense: Invalid field in cdb
To this:
st0: Current: sense key: Illegal Request
Additional sense: Invalid field in cdb
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Acked-by: Kai Makisara <kai.makisara@kolumbus.fi>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Misc Fixes:
- Fix some sparse warnings - casts of address space
- Fix handling of the adapter registration string. Each invocation
was byteswapping, so every other adapter init attempt failed.
- Correct comments and default value for the lpfc_max_luns parameter
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add lpfc_sli_flush_mbox_queue() function and use it in lpfc_offline() call
to avoid deadlock on thread block.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Correct the wait in attachment that delays for topology discovery
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Remove depricated sysfs attribute board_online, as it's replaced by the new
issue_reset attribute
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adding new issue_reset sysfs attribute
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix panic in lpfc_sli_validate_fcp_iocb due to access of scsi_cmnd after
returning it to the midlayer
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Consolidate dma buf cleanup into a separate function
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Correct bogus nodev_tmo message on NPort that changes its NPort Id
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix txcmplq related panics on heavy IO while downloading firmware
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Issue DOWN_LINK prior to INIT_LINK to work around link failure issue
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fixed infinite retry of REG_LOGIN mailbox failed due to MBXERR_RPI_FULL
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix memory leak and cleanup code related to per ring lookup array.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Standardize the driver on a single define for the maximum supported targets.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Use mod_timer instead of add_timer in lpfc_els_timeout_handler
This patch was formerly posted by Mark Haverkamp.
http://marc.theaimsgroup.com/?l=linux-scsi&m=114246089015681&w=2
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Following on from my post titled: "additional sense codes
need update" see the attachment against lk 2.6.17 .
ChangeLog:
- update additional sense codes table to agree with
SPC-4 revision 5a (14 June 2006)
- adjust some of the opcode names
Signed-off-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Some SAS HBAs don't want to go to the trouble of tracking port numbers,
so they'd simply like to say "add this port and give it a number".
This is especially beneficial from the hotplug point of view, since
tracking ports and the available number space can be a real pain.
The current implementation uses an incrementing number per expander to
add the port on. However, since there can never be more ports than
there are phys, a later implementation will try to be more intelligent
about this.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch makes a needlessly global function static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When we force the chip into dual fn mode so we get PATA and AHCI we must
be sure we don't then do anything dumb like try and grab both with the AHCI
driver.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This one looks better, IMHO.
This restores the default libata configuration messages printed during booting.
Signed-off-by: <petkov@math.uni-muenster.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Separate out controller initialization from sil24_init_one() into
sil24_init_controller(). This will be used by resume.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Separate out controller initialization from sil_init_one() into
sil_init_controller(). This will be used by resume.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Reimplement controller-wide PM. ata_host_set_suspend/resume() are
defined to suspend and resume a host_set. While suspended, EHs for
all ports in the host_set are pegged using ATA_FLAG_SUSPENDED and
frozen.
Because SCSI device hotplug is done asynchronously against the rest of
libata EH and the same mutex is used when adding new device, suspend
cannot wait for hotplug to complete. So, if SCSI device hotplug is in
progress, suspend fails with -EBUSY.
In most cases, host_set resume is followed by device resume. As each
resume operation requires a reset, a single host_set-wide resume
operation may result in multiple resets. To avoid this, resume waits
upto 1 second giving PM to request resume for devices.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Reimplement per-dev PM. The original implementation directly put the
device into suspended mode and didn't synchronize w/ EH operations
including hotplug. This patch reimplements ata_scsi_device_suspend()
and ata_scsi_device_resume() such that they request EH to perform the
respective operations. Both functions synchronize with hotplug such
that it doesn't operate on detached devices.
Suspend waits for completion but resume just issues request and
returns. This allows parallel wake up of devices and thus speeds up
system resume.
Due to sdev detach synchronization, it's not feasible to separate out
EH requesting from sdev handling; thus, ata_device_suspend/resume()
are removed and everything is implemented in the respective
libata-scsi functions.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement two PM per-dev EH actions - ATA_EH_SUSPEND and
ATA_EH_RESUME. Each action puts the target device into suspended mode
and resumes from it respectively.
Once a device is put to suspended mode, no EH operations other than
RESUME is allowed on the device. The device will stay suspended till
it gets resumed and thus reset and revalidated. To implement this, a
new device state helper - ata_dev_ready() - is implemented and used in
EH action implementations to make them operate only on attached &
running devices.
If all possible devices on a port are suspended, reset is skipped too.
This prevents spurious events including hotplug events from disrupting
suspended devices.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement ATA_EHI_NO_AUTOPSY and QUIET. These used to be implied by
ATA_PFLAG_LOADING, but new power management and PMP support need to
use these separately. e.g. Suspend/resume operations shouldn't print
full EH messages and resume shouldn't be recorded as an error.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
The names of predefined debounce timing parameters didn't exactly
match their usages. Rename to more generic names and implement param
selection helper sata_ehc_deb_timing() which uses EHI_HOTPLUGGED to
select params.
Combined with the previous EHI_RESUME_LINK differentiation, this makes
parameter selection accurate. e.g. user scan resumes link but normal
deb param is used instead of hotplug param.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement ATA_EHI_RESUME_LINK, which indicates that the link needs to
be resumed. This used to be implied by ATA_EHI_HOTPLUGGED. However,
hotplug isn't the only event which requires link resume and separating
this out allows other places to request link resume. This
differentiation also allows better debounce timing selection.
This patch converts user scan to use ATA_EHI_RESUME_LINK.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ap_lock was used because &ap->host_set->lock was too long and used a
lot. Now that &ap->host_set->lock is replaced with ap->lock, there's
no reason to keep ap_lock.
[ed. note: that's not entirely true. ap_lock is a local variable,
caching the results of a de-ref. In theory, if the compiler is smart
enough, this patch is cosmetic. However, since this is not a fast
path (it is the error path), this patch is nonetheless acceptable,
even though it _may_ introduce a performance regression.]
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ata_eh_autopsy() used to directly assign determined action mask to
ehc->i.action thus overriding actions set by some of nested analyze
functions. This patch makes ata_eh_autopsy() add action masks just as
it's done in other places.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ap->flags is way too clamped. Separate out core dynamic flags to
ap->pflags. ATA_FLAG_DISABLED is a dynamic flag but left alone as
it's referenced by a lot of LLDs and it's gonna be removed once all
LLDs are converted to new EH.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
In preparation for SAS attached SATA devices, which will
not have a libata scsi_host, only setup host->max_cmd_len
if ap->host exists.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Hi,
sata_vsc is an MMIO device, and should use the correct data_xfer
function. This problem was introduced by:
commit a6b2c5d475
Author: Alan Cox <alan@lxorguk.ukuu.org.uk>
Date: Mon May 22 16:59:59 2006 +0100
[PATCH] PATCH: libata. Add ->data_xfer method
Signed-off-by: Martin Hicks <mort@bork.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
lockdep needs to have the waitqueue lock initialized for on-stack waitqueues
implicitly initialized by DECLARE_COMPLETION(). Annotate on-stack completions
accordingly.
Has no effect on non-lockdep kernels.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Conflicts:
drivers/scsi/nsp32.c
drivers/scsi/pcmcia/nsp_cs.c
Removal of randomness flag conflicts with SA_ -> IRQF_ global
replacement.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
There was a logic fault in scsi_io_completion() where zero transfer
commands that complete successfully were sent to the block layer as
not up to date. This patch removes the if (good_bytes > 0) gate
around the successful completion, since zero transfer commands do have
good_bytes == 0.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix section mismatch warnings:
WARNING: drivers/scsi/qla1280.o - Section mismatch: reference to
.init.data: from .text between 'qla1280_get_token' (at offset 0x2a16)
and 'qla1280_probe_one'
WARNING: drivers/scsi/qla1280.o - Section mismatch: reference to
.init.data: from .text between 'qla1280_get_token' (at offset 0x2a3c)
and 'qla1280_probe_one'
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Make some needlessly global functions static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add a few spaces to MODULE_PARM_DESC() text for qla2xxx. Without these
spaces text runs together when modinfo prints the text.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Acked-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I got a NULL derefrence in cdev_del+1 when called from sg_remove. By looking at
the code of sg_add, sg_alloc and sg_remove (all in drivers/scsi/sg.c) I found
out that sg_add is calling sg_alloc but if it fails afterwards it does not
deallocate the space that was allocated in sg_alloc and the redundant entry has
NULL in cdev. When sg_remove is being called, it tries to perform cdev_del to
this NULL cdev and fails.
Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Acked-by: Douglas Gilbert <dougg@torque.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6:
[PATCH] i386: export memory more than 4G through /proc/iomem
[PATCH] 64bit Resource: finally enable 64bit resource sizes
[PATCH] 64bit Resource: convert a few remaining drivers to use resource_size_t where needed
[PATCH] 64bit resource: change pnp core to use resource_size_t
[PATCH] 64bit resource: change pci core and arch code to use resource_size_t
[PATCH] 64bit resource: change resource core to use resource_size_t
[PATCH] 64bit resource: introduce resource_size_t for the start and end of struct resource
[PATCH] 64bit resource: fix up printks for resources in misc drivers
[PATCH] 64bit resource: fix up printks for resources in arch and core code
[PATCH] 64bit resource: fix up printks for resources in pcmcia drivers
[PATCH] 64bit resource: fix up printks for resources in video drivers
[PATCH] 64bit resource: fix up printks for resources in ide drivers
[PATCH] 64bit resource: fix up printks for resources in mtd drivers
[PATCH] 64bit resource: fix up printks for resources in pci core and hotplug drivers
[PATCH] 64bit resource: fix up printks for resources in networks drivers
[PATCH] 64bit resource: fix up printks for resources in sound drivers
[PATCH] 64bit resource: C99 changes for struct resource declarations
Fixed up trivial conflict in drivers/ide/pci/cmd64x.c (the printk that
was changed by the 64-bit resources had been deleted in the meantime ;)
A bit of a brown paper bag issue. The previous patch to remove the soon
to be ripped out fields that were used in autosense actually broke the
driver. This patch fixes it and has been tested (honestly).
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch adds or modifies the transport class functions
used to notify userspace of session state events.
We modify the session addition up event and add a destruction event
to notify userspace of session creation, relogin and destruction.
And we modify the conn error event to be sent by broadcast
since multiple listeners may want to listen for it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
So the drivers do not use the channel numbers, but some do
use the target numbers. We were just adding some goofy
variable that just increases for the target nr. This is useless
for software iscsi because it is always zero. And for qla4xxx
the target nr is actually the index of the target/session
in its FW or FLASH tables. We needed to expose this to userspace
so apps could access those numbers so this patch just adds the
target nr to the iscsi session creation functions. This way
when qla4xxx's Hw thinks a session is at target nr 4
in its hw, it is exposed as that number in sysfs.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
qla4xxx is initialized in two steps like other HW drivers.
It allocates the host, sets up the HW, then adds the host.
For iscsi part of HW setup is setting up persistent iscsi
sessions. At that time, the interupts are off and the driver
is not completely set up so we just want to allocate them.
We do not want to add them to sysfs and expose them to userspace
because userspace could try to do lots of fun things with them
like scanning and at that time the driver is not ready.
So this patch breakes up the session creation like other
functions that use the driver model in two the alloc
and add parts. When the driver is ready, it can then add
the sessions and userspace can begin using them.
This also fixes a bug in the addition error patch where
we forgot to do a get on the session.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I do not remember what I was thinking when we added the channel
as a argument to the session create function. It was probably
due to too much cut and paste work from the FC transport class.
The channel is meaningless for iscsi drivers so this patch drops
its usage everywhere in the iscsi related code.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
iscsi_tcp and iser cannot be rmmod from the kernel when sessions
are running because session removal is driven from userspace. For
those modules we get a module reference when a session is
created then drop it when the session is removed.
For qla4xxx, they can jsut remove the sessions from the pci remove
function like normal HW drivers, so this patch moves the module
reference from the transport class functions shared by all
drivers to the libiscsi functions only used be software iscsi
modules.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Convert iscsi_tcp to new lib functions.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Reduce duplication in the software iscsi_transport modules by
adding a libiscsi function to handle the common grunt work.
This also has the drivers return specifc -EXXX values for different
errors so userspace can finally handle them in a sane way.
Also just pass the sysfs buffers to the drivers so HW iscsi can
get/set its string values, like targetname, and initiatorname.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Patch from david.somayajulu@qlogic.com:
Add target discovery event. We may have a setup where the iscsi traffic
is on a different netowrk than the other network traffic. In this case
we will want to do discovery though the iscsi card. This patch adds
a event to the transport class that can be used by hw iscsi cards that
support this.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I noticed that in_use in st_buffer is not used. The patch below
against 2.6.17-rc3 removes it, assuming there is no future use for it.
It was tested in a sparc SS20 with a DLT4000.
Signed-off-by: Martin Habets <errandir_news@mph.eclipse.co.uk>
Acked-by: Kai Mkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Conflicts:
drivers/scsi/aacraid/comminit.c
Fixed up by removing the now renamed CONFIG_IOMMU option from
aacraid
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The atp870u driver is the largest stack eater reported by checkstack
(on x86_864, allmodconfig). This converts the offending function
to kmalloc+kfree struct atp_unit instead of allocating it on the stack.
Was:
0x0000164c atp870u_probe [atp870u]: 3176
Now:
0x0000164c atp870u_probe [atp870u]: 408
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
this patch introduces a port object, separates out ports and phys,
with ports becoming the primary objects of the tree.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If a device gets offlined as a result of the Inquiry sent
during scanning, the following oops can occur. After the
disk gets put into the SDEV_OFFLINE state, the error handler
sends back the failed inquiry, which wakes the thread doing
the scan. This starts a race between the scanning thread
freeing the scsi device and the error handler calling
scsi_run_host_queues to restart the host. Since the disk
is in the SDEV_OFFLINE state, scsi_device_get will still
work, which results in __scsi_iterate_devices getting
a reference to the scsi disk when it shouldn't.
The following execution thread causes the oops:
CPU 0 (scan) CPU 1 (eh)
---------------------------------------------------------
scsi_probe_and_add_lun
....
scsi_eh_offline_sdevs
scsi_eh_flush_done_q
scsi_destroy_sdev
scsi_device_dev_release
scsi_restart_operations
scsi_run_host_queues
__scsi_iterate_devices
get_device
scsi_device_dev_release_usercontext
scsi_run_queue
<---OOPS--->
The patch fixes this by changing the state of the sdev to SDEV_DEL
before doing the final put_device, which should prevent the race
from occurring.
Original oops follows:
Badness in kref_get at lib/kref.c:32
Call Trace:
[C00000002F4476D0] [C00000000000EE20] .show_stack+0x68/0x1b0 (unreliable)
[C00000002F447770] [C00000000037515C] .program_check_exception+0x1cc/0x5a8
[C00000002F447840] [C00000000000446C] program_check_common+0xec/0x100
Exception: 700 at .kref_get+0x10/0x28
LR = .kobject_get+0x20/0x3c
[C00000002F447B30] [C00000002F447BC0] 0xc00000002f447bc0 (unreliable)
[C00000002F447BB0] [C000000000254BDC] .get_device+0x20/0x3c
[C00000002F447C30] [D000000000063188] .scsi_device_get+0x34/0xdc [scsi_mod]
[C00000002F447CC0] [D0000000000633EC] .__scsi_iterate_devices+0x50/0xbc [scsi_mod]
[C00000002F447D60] [D00000000006A910] .scsi_run_host_queues+0x34/0x5c [scsi_mod]
[C00000002F447DF0] [D000000000069054] .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[C00000002F447EE0] [C00000000007B4E0] .kthread+0x128/0x178
[C00000002F447F90] [C000000000025E84] .kernel_thread+0x4c/0x68
Unable to handle kernel paging request for <7>PCI: Enabling device: (0002:41:01.1), cmd 143
data at address 0x000001b8
Faulting instruction address: 0xd0000000000698e4
sym1: <1010-66> rev 0x1 at pci 0002:41:01.1 irq 216
sym1: No NVRAM, ID 7, Fast-80, LVD, parity checking
sym1: SCSI BUS has been reset.
scsi2 : sym-2.2.2
cpu 0x0: Vector: 300 (Data Access) at [c00000002f447a30]
pc: d0000000000698e4: .scsi_run_queue+0x2c/0x218 [scsi_mod]
lr: d00000000006a904: .scsi_run_host_queues+0x28/0x5c [scsi_mod]
sp: c00000002f447cb0
msr: 9000000000009032
dar: 1b8
dsisr: 40000000
current = 0xc0000000045fecd0
paca = 0xc00000000048ee80
pid = 1123, comm = scsi_eh_1
enter ? for help
[c00000002f447d60] d00000000006a904 .scsi_run_host_queues+0x28/0x5c [scsi_mod]
[c00000002f447df0] d000000000069054 .scsi_error_handler+0xdb4/0xe44 [scsi_mod]
[c00000002f447ee0] c00000000007b4e0 .kthread+0x128/0x178
[c00000002f447f90] c000000000025e84 .kernel_thread+0x4c/0x68
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This is a resend of a patch I generated in response to an email sent
by Ruben Faelens <parasietje@gmail.com>. His original email to
linux-scsi requested a method in which he could spin down a scsi disk
when not in use and have the kernel automatically spin it back up when
an I/O was generated to the disk. The infrastructure to automatically
spin a disk up has been in the scsi error handler for some time now,
but it is not enabled by default. This patch adds an sd sysfs attribute
which allows userspace to enable this behavior.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
locking init cleanups:
- convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
- convert rwlocks in a similar manner
this patch was generated automatically.
Motivation:
- cleanliness
- lockdep needs control of lock initialization, which the open-coded
variants do not give
- it's also useful for -rt and for lock debugging in general
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is needed if we wish to change the size of the resource structures.
Based on an original patch from Vivek Goyal <vgoyal@in.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Original post was incorrect as it didn't realize that we already had
a self-referenc due to device_initialize(), and we were really only
missing the put on our own reference. This was hidden by the other bug
which had the midlayer reusing stargets after they were already free,
which was doing too many puts on our rport.
Updating FC transport for:
- Add put in fc_rport_final_delete(), to release the rport.
Prior, we were leaving the rport with a reference, thus the shost
with references, etc. If the driver was unloaded, shosts and rports
remained, along with work threads, etc
- Fix fc_rport_create failure path - too many put's on parent
- Add commenting to easily track ref taking.
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Updated patch to address comments from Pat Mansfield and Michael Reed:
Bumped max to 600 (10mins). Set default dev_loss_tmo to a value other
than the max (30s).
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In a prior posting to linux-scsi on the fc transport and workq
deadlocks, we noted a second error that did not have a patch:
http://marc.theaimsgroup.com/?l=linux-scsi&m=114467847711383&w=2
- There's a deadlock where scsi_remove_target() has to sit behind
scsi_scan_target() due to contention over the scan_lock().
Subsequently we posted a request for comments about the deadlock:
http://marc.theaimsgroup.com/?l=linux-scsi&m=114469358829500&w=2
This posting resolves the second error. Here's what we now understand,
and are implementing:
If the lldd deletes the rport while a scan is active, the sdev's queue
is blocked which stops the issuing of commands associated with the scan.
At this point, the scan stalls, and does so with the shost->scan_mutex held.
If, at this point, if any scan or delete request is made on the host, it
will stall waiting for the scan_mutex.
For the FC transport, we queue all delete work to a single workq.
So, things worked fine when competing with the scan, as long as the
target blocking the scan was the same target at the top of our delete
workq, as the delete workq routine always unblocked just prior to
requesting the delete. Unfortunately, if the top of our delete workq
was for a different target, we deadlock. Additionally, if the target
blocking scan returned, we were unblocking it in the scan workq routine,
which really won't execute until the existing stalled scan workq
completes (e.g. we're re-scheduling it while it is in the midst of its
execution).
This patch moves the unblock out of the workq routines and moves it to
the context that is scheduling the work. This ensures that at some point,
we will unblock the target that is blocking scan. Please note, however,
that the deadlock condition may still occur while it waits for the
transport to timeout an unblock on a target. Worst case, this is bounded
by the transport dev_loss_tmo (default: 30 seconds).
Finally, Michael Reed deserves the credit for the bulk of this patch,
analysis, and it's testing. Thank you for your help.
Note: The request for comments statements about the gross-ness of the
scan_mutex still stand.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This removes the duplicate functionality which had been added to
the lpfc driver.
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The scsi midlayer portion of the patch
Signed-off-by: James Smart <James.Smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Ata_piix's Kconfig entry still refers only to ICH5, while it supports ICH6
through 8. This creates confusion with people who are looking to see
if their newer SATA enabled motherboards are supported. The
following patch makes this clear.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Some SATA controllers embedded in ATI IXPs seem to have broken
SATA_IRQ bit in their bmdma2 registers which is always stuck at 1.
This makes the driver believe that there has been a hotplug event and
freeze the port whenever there's an interrupt thus failing all
commands.
This patch disables SATA_IRQ for those controllers.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Cosmetic updates in libata-core.c.
* trim trailing whitespaces
* break lines which are over 80 column
* kill unnecessary braces
* make indentation consistent
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
From: Andrew Morton <akpm@osdl.org>
Provide a module parameter to override the default 30-second-per-device SATA
probing timeout.
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Make ata_do_simple_cmd() and ata_flush_cache() global. These will be
used from libata-eh.c.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
* the function has always returned AC_ERR_* masks not -errno but its
return type was int. Make return type unsigned int.
* don't print error message automatically. it's the caller's
responsibility.
* add header comment
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Clear related EH action on device detach such that new device doesn't
receive EH actions scheduled for the old one.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Implement and use ata_eh_dev_action() which returns EH action mask for
a device.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Partially revert 74d0a988d3:
[PATCH] PCI: Move various PCI IDs to header file
libata policy is to avoid use of named PCI device ID constants.
These are often single-use constants, which have little value over
direct numeric constants save for constant include/linux/pci_ids.h
patching/merging headaches.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This may seem like a DILLIGAF, but after chatting with the F/W folks,
there is no harm in dropping the page calculation as denoted in the
enclosed patch for these older adapters in this new age of 4GB+ memory
sticks. Any resource optimization within the old-old-old adapters for
systems with less than 4G of memory is of little consequence. The
existing AAC_QUIRK_31BIT flag in linit.c should look after the rest of
the legacy hardware DMA limitations.
Signed-off-by: Mark Salyzyn <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The scsi layer is already calling add_disk_randomness in scsi_end_request.
Signed-off-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>