Received from Mark Salyzyn
If the adapter is in blinkled (Firmware Assert) when error recovery
timeout actions have been triggered, perform an adapter warm reset and
restart the initialization.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
The enclosed patch cleans up some code fragments, adds some paranoia
(unproven causes of potential driver failures).
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
If the adapter should be in a blinkled (Firmware Assert) state when the
driver loads, we will perform a warm restart of the Adapter Firmware to
see if we can rescue the adapter. Possible causes of a blinkled can
occur on some early release motherboard BIOSes, transitory PCI bus
problems on embedded systems or non-x86 based architectures, transitory
startup failures of early release drives or transitory hardware
failures; some of which can bite the adapter later at runtime. Future
enhancements will include recovery during runtime.
Fixed extra whitespace space issue.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Received from Mark Salyzyn
This patch allows the FSACTL_SEND_LARGE_FIB, FSACTL_SENDFIB and
FSACTL_SEND_RAW_SRB ioctl calls into the aacraid driver to be
interruptible. Only necessary if the adapter and/or the management
software has gone into some sort of misbehavior and the system is being
rebooted, thus permitting the user management software applications to
be killed relatively cleanly. The FIB queue resource is held out of the
free queue until the adapter finally, if ever, completes the command.
Signed-off-by: Mark Haverkamp <markh@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Attached is a patch that should limit a possible recursion that can
lead to a stack overflow like follows:
Kernel stack overflow.
CPU: 3 Not tainted
Process zfcperp0.0.d819
(pid: 13897, task: 000000003e0d8cc8, ksp: 000000003499dbb8)
Krnl PSW : 0404000180000000 000000000030f8b2 (get_device+0x12/0x48)
Krnl GPRS: 00000000135a1980 000000000030f758 000000003ed6c1e8 0000000000000005
0000000000000000 000000000044a780 000000003dbf7000 0000000034e15800
000000003621c048 070000003499c108 000000003499c1a0 000000003ed6c000
0000000040895000 00000000408ab630 000000003499c0a0 000000003499c0a0
Krnl Code: a7 fb ff e8 a7 19 00 00 b9 02 00 22 e3 e0 f0 98 00 24 a7 84
Call Trace:
([<000000004089edc2>] scsi_request_fn+0x13e/0x650 [scsi_mod])
[<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
[<000000004089ff8c>] scsi_queue_insert+0x22c/0x2a4 [scsi_mod]
[<000000004089779a>] scsi_dispatch_cmd+0x8a/0x3d0 [scsi_mod]
[<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
...
[<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
[<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
[<000000004089ff8c>] scsi_queue_insert+0x22c/0x2a4 [scsi_mod]
[<000000004089779a>] scsi_dispatch_cmd+0x8a/0x3d0 [scsi_mod]
[<000000004089f1ec>] scsi_request_fn+0x568/0x650 [scsi_mod]
[<00000000002c5ff4>] blk_run_queue+0xd4/0x1a4
[<000000004089fa9e>] scsi_run_host_queues+0x196/0x230 [scsi_mod]
[<00000000409eba28>] zfcp_erp_thread+0x2638/0x3080 [zfcp]
[<0000000000107462>] kernel_thread_starter+0x6/0xc
[<000000000010745c>] kernel_thread_starter+0x0/0xc
<0>Kernel panic - not syncing: Corrupt kernel stack, can't continue.
This stack overflow occurred during tests on s390 using zfcp.
Recursion depth for this panic was 19.
Usually recursion between blk_run_queue and a request_fn is avoided
using QUEUE_FLAG_REENTER. But this does not help if the scsi stack
tries to flush the starved_list of a scsi_host.
Limit recursion depth when flushing the starved_list
of a scsi_host.
Signed-off-by: Andreas Herrmann <aherrman@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
A recent drivers base commit:
3e95637a48
Caused the bus to be added to dev_printk, so now our SCSI inquiry short
messages print like this:
scsiscsi 2:0:0:0: Direct access IBM-ESXS ST973401SS B519 PQ: 0 ANSI: 5
Just remove the "scsi" from the sdev_printk to compensate.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Replace scsi_device_types array API with scsi_device_type function API.
Gets rid of a lot of common code, as well as being easier to use.
- Add the new device types in SPC4 r05a, and rename some of the older ones.
- Reformat the printing of inquiry data; now fits on one line and
includes PQ.
I think I've addressed all the feedback from the previous versions. My
current test box prints:
scsi 2:0:1:0: Direct access HP 18.2G ATLAS10K3_18_SCA HP05 PQ: 0 ANSI: 2
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Fix up a logic error in the checking for valid sense data.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The ipr driver currently translates adapter recovered errors
to DID_ERROR. This patch fixes this to translate these
errors to success instead.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add definitions for some SAS error codes that can be
logged by ipr SAS adapters.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add some hardware defined types for SATA. This is required
by future patches to add SATA support to ipr.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Add host_supported_speeds, host_maxframe_size, host_speed, host_fabric_name,
host_port_type, host_port_state, and host_symbolic_name transport attributes
to fusion fibre channel.
Signed-off-by: Michael Reed <mdr@sgi.com>
Acked-by: Moore, Eric <Eric.Moore@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If you examine the queue_work() routine you'll see that it returns
1 on success, 0 if the work is already queued.
This patch corrects the source code documentation for the
scsi_queue_work function.
Signed-off-by: Michael Reed <mdr@sgi.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
These aren't used anymore since the field in scsi_cmnd where it was
stored has been removed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The PCI ID table in the DAC960 driver conflicts with some devices
that use the ipr driver. All ipr adapters that use this chip
have an IBM subvendor ID and all DAC960 adapters that use this
chip have a Mylex subvendor id.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Remove a lot of duplicate #defines from the advansys driver,
and make them look like PCI IDs as defined elsewhere in the kernel.
Also add a module table so that it automatically gets picked up
by tools relying on modinfo output (like say, distro installers).
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The sysfs files in arcmsr are non-standard in that they aren't simple
filename value pairs, the values actually contain preceeding text which
would have to be parsed. The idea of sysfs files is that the file name
is the description and the contents is a simple value.
Fix up arcmsr to conform to this standard.
Acked-By: Erich Chen <erich@areca.com.tw>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Remove sysfs_remove_bin_file() return-value checking from the areca driver.
There's nothing a driver can do if sysfs file removal fails, so we'll soon be
changing sysfs_remove_bin_file() to internally print a diagnostic and to
return void.
Cc: Erich Chen <erich@areca.com.tw>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
arcmsr is a driver for the Areca Raid controller, a host based RAID
subsystem that speaks SCSI at the firmware level.
This patch is quite a clean up over the initial submission with
contributions from:
Randy Dunlap <rdunlap@xenotime.net>
Christoph Hellwig <hch@lst.de>
Matthew Wilcox <matthew@wil.cx>
Adrian Bunk <bunk@stusta.de>
Signed-off-by: Erich Chen <erich@areca.com.tw>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This takes advantage of the sas class backlink function to show which
port on an expander is used to communicate with the parent.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Adds support for change_queue_depth so that device
queue depth can be changed at runtime through sysfs.
Signed-off-by: <brking@charter.net>
Acked-by: Seokmann Ju <seokmann.ju@lsil.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This fixes three drivers to compile again after my patch that removes
the data_cmnd member from struct scsi_cmnd.
The fas216 change is trivial, it should have been using ->cmnd all the
time.
NCR53C9 (which seem to be mostly duplicate driver with esp.c!) is doing
something odd, it should only have looked at ->cmnd before not the saved
copy that is kept for the error handlers sake. Note that it really
should deal with the sync setting themselves but use the generic domain
validation code that get this right - but that's for later let's push
this simple compile fix for now.
And sorry for the late fix for this, I have been busy with OLS and
associated activities last week.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SCSI] esp: Fix build.
[SPARC]: Fix SA_STATIC_ALLOC value.
[SPARC64]: Explicitly print return PC when the kernel fault PC is bogus.
The patch below moves the cpu hotplugging higher up in the cpufreq
layering; this is needed to avoid recursive taking of the cpu hotplug
lock and to otherwise detangle the mess.
The new rules are:
1. you must do lock_cpu_hotplug() around the following functions:
__cpufreq_driver_target
__cpufreq_governor (for CPUFREQ_GOV_LIMITS operation only)
__cpufreq_set_policy
2. governer methods (.governer) must NOT take the lock_cpu_hotplug()
lock in any way; they are called with the lock taken already
3. if your governer spawns a thread that does things, like calling
__cpufreq_driver_target, your thread must honor rule #1.
4. the policy lock and other cpufreq internal locks nest within
the lock_cpu_hotplug() lock.
I'm not entirely happy about how the __cpufreq_governor rule ended up
(conditional locking rule depending on the argument) but basically all
callers pass this as a constant so it's not too horrible.
The patch also removes the cpufreq_governor() function since during the
locking audit it turned out to be entirely unused (so no need to fix it)
The patch works on my testbox, but it could use more testing
(otoh... it can't be much worse than the current code)
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Tetsuo Handa from-linux-kernel@i-love.sakura.ne.jp
The recvmsg() for raw socket seems to return random u16 value
from the kernel stack memory since port field is not initialized.
But I'm not sure this patch is correct.
Does raw socket return any information stored in port field?
[ BSD defines RAW IP recvmsg to return a sin_port value of zero.
This is described in Steven's TCP/IP Illustrated Volume 2 on
page 1055, which is discussing the BSD rip_input() implementation. ]
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
IP multicast route code was reusing an skb which causes use after free
and double free.
From: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Note, it is real skb_clone(), not alloc_skb(). Equeued skb contains
the whole half-prepared netlink message plus room for the rest.
It could be also skb_copy(), if we want to be puristic about mangling
cloned data, but original copy is really not going to be used.
Acked-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle dev_alloc_skb() failures when initializing the RX rings.
Without proper handling, the driver will crash when using a partial
ring.
Thanks to Stephane Doyon <sdoyon@max-t.com> for reporting the bug and
providing the initial patch.
Howie Xu <howie@vmware.com> also reported the same issue.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add tg3_restart_hw() to handle failures when re-initializing the
device.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CIC_SEEKY() test really wants to use the minimum of either:
- 2 msecs (not jiffies)
- or, the pending slice time
So code it like that.
Signed-off-by: Jens Axboe <axboe@suse.de>
We need to postpone the queue startup until after the softirq
handler has actually finished some requests, otherwise we could
be racing with cciss_softirq_done() and not actually restart
the queue handling.
Signed-off-by: Jens Axboe <axboe@suse.de>
Clear the accumulated junk in IP6CB when starting to handle an IPV6
packet.
Signed-off-by: Guillaume Chazarain <guichaz@yahoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
After the recent problems with all the SCTP stuff it seems reasonable
to mark this as experimental.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add bridge netfilter deferred output hooks to feature-removal-schedule
and disable them by default. Until their removal they will be
activated by the physdev match when needed.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Locally generated broadcast and multicast packets have pkttype set to
PACKET_LOOPBACK instead of PACKET_BROADCAST or PACKET_MULTICAST. This
causes the pkttype match to fail to match packets of either type.
The below patch remedies this by using the daddr as a hint as to
broadcast|multicast. While not pretty, this seems like the only way
to solve the problem short of just noting this as a limitation of the
match.
This resolves netfilter bugzilla #484
Signed-off-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of an unknown verdict or NF_STOP the packet leaks. Unknown verdicts
can happen when userspace is buggy. Reinject the packet in case of NF_STOP,
drop on unknown verdicts.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
An RCF message containing a timeout results in a NULL-ptr dereference if
no RRQ has been seen before.
Noticed by the "SATURN tool", reported by Thomas Dillig <tdillig@stanford.edu>
and Isil Dillig <isil@stanford.edu>.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The data_cmd[] member got deleted, so do not use it any more. Scsi
commands do not have their ->cmd[] overwritten temporary to probe for
status after an error before retrying.
Signed-off-by: David S. Miller <davem@davemloft.net>
dev_alloc_skb is designated for RX descriptors, not TX. (Some drivers
use it for the latter anyway, but that's a different story)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
skbuff.h has an #ifndef CONFIG_HAVE_ARCH_DEV_ALLOC_SKB to allow
architectures to reimplement __dev_alloc_skb. It's not set on any
architecture and now that we have an architecture-overrideable
NET_SKB_PAD there is not point at all to have one either.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the queue of the underlying device is stopped at initialization time
or the device is marked "not present", the state will be propagated to the
vlan device and never change. Based on an analysis by Patrick McHardy.
Signed-off-by: Stefan Rompf <stefan@loplof.de>
ACKed-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
It doesn't compile, and it's dubious in several regards:
1) is enabled by non-Kconfig controlled CONFIG_* value
(noted by Randy Dunlap)
2) XFRM6_TUNNEL_SPI_MAGIC is defined after it's first use
3) the debugging messages print object pointer addresses
which have no meaning without context
So let's just get rid of it.
Signed-off-by: David S. Miller <davem@davemloft.net>