1
Commit Graph

4311 Commits

Author SHA1 Message Date
Roland Dreier
3d155f8cd0 [IB] mthca: first pass at catastrophic error reporting
Add some initial support for detecting and reporting catastrophic
errors reported by Mellanox HCAs.  We start a periodic timer which
polls the catastrophic error reporting buffer in device memory.  If an
error is detected, we dump the contents of the buffer for port-mortem
debugging, and report a fatal asynchronous error to higher levels.

In the future we can try to recover from these errors by resetting the
device, but this will require some work in higher-level code as well.
Let's get this in now, so that we at least get catastrophic errors
reported in logs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-27 11:03:38 -07:00
Roland Dreier
7cc656efb5 [IB] simplify mad_rmpp.c:alloc_response_msg()
Change alloc_response_msg() in mad_rmpp.c to return the struct
it allocates directly (or an error code a la ERR_PTR), rather than
returning a status and passing the struct back in a pointer param.
This simplifies the code and gets rid of warnings like

    drivers/infiniband/core/mad_rmpp.c: In function nack_recv:
    drivers/infiniband/core/mad_rmpp.c:192: warning: msg may be used uninitialized in this function

with newer versions of gcc.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-25 15:13:54 -07:00
Roland Dreier
547e309073 [IB] mthca: correct modify QP attribute masks for UC
The UC transport does not support RDMA reads or atomic operations, so
we shouldn't require or even allow the consumer to set attributes
relating to these operations for UC QPs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-25 10:57:32 -07:00
Sean Hefty
34816ad98e [IB] Fix MAD layer DMA mappings to avoid touching data buffer once mapped
The MAD layer was violating the DMA API by touching data buffers used
for sends after the DMA mapping was done.  This causes problems on
non-cache-coherent architectures, because the device doing DMA won't
see updates to the payload buffers that exist only in the CPU cache.

Fix this by having all MAD consumers use ib_create_send_mad() to
allocate their send buffers, and moving the DMA mapping into the MAD
layer so it can be done just before calling send (and after any
modifications of the send buffer by the MAD layer).

Tested on a non-cache-coherent PowerPC 440SPe system.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-25 10:51:39 -07:00
Sean Hefty
ae7971a770 [IB] CM: Fix initialization of QP attributes for UC QPs.
Fix cm_init_qp_init_attr(), cm_init_qp_rtr_attr() and cm_init_qp_rts_attr()
so that they correctly handle the differences between UC and RC QPs.  This
fixes problems with setting up UC QPs through the CM.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-24 12:33:56 -07:00
Roland Dreier
ec329a1359 Manual merge of for-linus to upstream (fix conflicts in drivers/infiniband/core/ucm.c) 2005-10-24 10:55:29 -07:00
Roland Dreier
5d7edb3c1a [IB] Add idr_destroy() calls on module unload
Add idr_destroy() calls to the module_exit() functions of the four IB
driver modules that use idrs, so we don't leak idr_layer_cache objects
when these modules are unloaded.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-24 10:53:25 -07:00
Linus Torvalds
ba9e358fd0 Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-for-linus-2.6 2005-10-23 17:13:14 -07:00
Roland Dreier
75eeec2f3f [PATCH] ib: mthca: Always re-arm EQs in mthca_tavor_interrupt()
We should always re-arm an event queue's interrupt in
mthca_tavor_interrupt() if the corresponding bit is set in the event cause
register (ECR), even if we didn't find any entries in the EQ.  If we don't,
then there's a window where we miss an EQ entry and then get stuck because
we don't get another EQ event.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-23 16:38:39 -07:00
Mike Krufky
c0fef676bb [PATCH] Kconfig: saa7134-dvb should not select cx22702
On 2005-05-01, Gerd Knorr sent in a patch to add cx22702 to cx88-dvb:

 [PATCH] dvb: cx22702 frontend driver update
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9990d744bea7d28e83c420e2c9d524c7a8a2d136

...but as we can see, the Kconfig portion of his patch was incorrectly
applied to saa7134-dvb instead of cx88-dvb.

On 2005-06-24, Adrian bunk fixed cx88-dvb:

 [PATCH] VIDEO_CX88_DVB must select DVB_CX22702
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=d6988588e13616587aa879c2e0bd7cd811705e5d

...but we never removed the original patch from Gerd.

This patch sets things straight:

saa7134-dvb should not select cx22702

Signed-off-by: Michael Krufky <mkrufky@m1k.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-23 16:38:39 -07:00
Linus Torvalds
4196c3af25 cardbus: limit IO windows to 256 bytes
That's what we've always historically done, and bigger windows seem to
confuse some cardbus bridges. Or something.

Alan reports that this makes the ThinkPad 600x series work properly
again: the 4kB IO window for some reason made IDE DMA not work, which
makes IDE painfully slow even if it works after DMA timeouts.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-23 16:31:16 -07:00
Dave Airlie
e29971f9a4 [PATCH] drm: another mga bug
The wrong state emission routines were being called for G550, and
consistent maps weren't correctly mapped...

Signed-off-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-21 12:18:09 -07:00
Eric Moore
024358eeaf [PATCH] mptsas: fix phy identifiers
This fixes handling of the phy identifiers in mptsas.

Signed-off-by: Eric Moore <Eric.Moore@lsil.com>
[ split it a pre-2.6.14 portion from Eric's bigger patch ]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-21 12:17:43 -07:00
Roland Dreier
bbf2078609 [IB] user_mad: Use class_device.devt
Use devt member of struct class_device so that we don't have to create
our own "dev" file in sysfs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-20 12:54:01 -07:00
Roland Dreier
2e0c512aff [IB] user_mad: trivial coding style fixes
Add spaces after "sizeof" operator to match the rest of file.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-20 12:30:16 -07:00
Roland Dreier
3910f44d79 [IB] cm: Add missing break in switch
Add missing "break" in switch statement.  Without the break, the
CM ended up always falling through and setting every connection
request to use RC transport, which meant that UC connections
didn't work.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-20 12:29:36 -07:00
Steven Rostedt
461a0ffbec [PATCH] scsi_error thread exits in TASK_INTERRUPTIBLE state.
Found in the -rt patch set.  The scsi_error thread likely will be in the
TASK_INTERRUPTIBLE state upon exit.  This patch fixes this bug.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-19 23:16:21 -07:00
Peter Chubb
51b190b304 [PATCH] `unaligned access' in acpi get_root_bridge_busnr()
In drivers/acpi/glue.c the address of an integer is cast to the address of
an unsigned long.  This breaks on systems where a long is larger than an
int --- for a start the int can be misaligned; for a second the assignment
through the pointer will overwrite part of the next variable.

Signed-off-by: Peter Chubb <peterc@gelato.unsw.edu.au>
Acked-by: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-19 23:04:31 -07:00
Dave Airlie
11909d6438 [PATCH] fix MGA DRM regression before 2.6.14
I've gotten a report on lkml, of a possible regression in the MGA DRM in
2.6.14-rc4 (since -rc1), I haven't been able to reproduce it here, but I've
figured out some possible issues in the mga code that were definitely
wrong, some of these are from DRM CVS, the main fix is the agp enable bit
on the old code path still used by everyone.....

Signed-off-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-19 23:04:31 -07:00
NeilBrown
6985c43f39 [PATCH] Three one-liners in md.c
The main problem fixes is that in certain situations stopping md arrays may
take longer than you expect, or may require multiple attempts.  This would
only happen when resync/recovery is happening.

This patch fixes three vaguely related bugs.

1/ The recent change to use kthreads got the setting of the
   process name wrong.  This fixes it.
2/ The recent change to use kthreads lost the ability for
   md threads to be signalled with SIG_KILL.  This restores that.
3/ There is a long standing bug in that if:
    - An array needs recovery (onto a hot-spare) and
    - The recovery is being blocked because some other array being
       recovered shares a physical device and
    - The recovery thread is killed with SIG_KILL
   Then the recovery will appear to have completed with no IO being
   done, which can cause data corruption.
   This patch makes sure that incomplete recovery will be treated as
   incomplete.

Note that any kernel affected by bug 2 will not suffer the problem of bug
3, as the signal can never be delivered.  Thus the current 2.6.14-rc
kernels are not susceptible to data corruption.  Note also that if arrays
are shutdown (with "mdadm -S" or "raidstop") then the problem doesn't
occur.  It only happens if a SIGKILL is independently delivered as done by
'init' when shutting down.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-19 23:04:30 -07:00
Andy Wingo
4a9949d7ac [PATCH] raw1394: fix locking in the presence of SMP and interrupts
Changes all spinlocks that can be held during an irq handler to disable
interrupts while the lock is held.  Changes spin_[un]lock_irq to use the
irqsave/irqrestore variants for robustness and readability.

In raw1394.c:handle_iso_listen(), don't grab host_info_lock at all -- we're
not accessing host_info_list or host_count, and holding this lock while
trying to tasklet_kill the iso tasklet this can cause an ABBA deadlock if
ohci:dma_rcv_tasklet is running and tries to grab host_info_lock in
raw1394.c:receive_iso.  Test program attached reliably deadlocks all SMP
machines I have been able to test without this patch.

Signed-off-by: Andy Wingo <wingo@pobox.com>
Acked-by: Ben Collins <bcollins@ubuntu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-19 23:04:30 -07:00
Andrew Morton
c367c21c93 [PATCH] orinoco: limit message rate
Brice Goglin <Brice.Goglin@ens-lyon.org> reports a printk storm from this
driver.  Fix.

Acked-by: David Gibson <hermes@gibson.dropbear.id.au>
Cc: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-19 23:04:30 -07:00
Steven Rostedt
055787e447 [SCSI] scsi_error thread exits in TASK_INTERRUPTIBLE state.
Found in the -rt patch set.  The scsi_error thread likely will be in the
TASK_INTERRUPTIBLE state upon exit.  This patch fixes this bug.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
2005-10-19 09:53:59 -04:00
Jack Morgenstein
7150bf8a98 [IB] mthca: Don't enter QP into MCG more than once.
Avoid entering a QP as member of a multicast group multiple times.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-18 14:46:38 -07:00
Roland Dreier
ba8e931024 [IB] Fail sysfs queries after device is unregistered
We keep IB device structures around until the last sysfs reference is
gone, but we shouldn't ask the low-level driver to do anything after
the LLD unregisters the device.  To handle this, check the reg_state
field and just fail sysfs show() requests if the device has already
been unregistered.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-18 14:14:56 -07:00
Roland Dreier
d476306f1c [IB] mthca: Add struct pci_driver.owner field
Set mthca_driver.owner to THIS_MODULE.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-18 14:07:07 -07:00
Roland Dreier
c6f5cb7be0 [IB] mthca: Use enum in mthca_alloc_db() prototype
Make the type parameter of mthca_alloc_db() be an enum mthca_db_type
instead of an int.  This doesn't have any practical effect but
documents the functions a little better.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-18 13:22:16 -07:00
Roland Dreier
4b2d319b53 [IPoIB] Improve ipoib_timeout() output
Use jiffies_to_msecs() so we print a human-readable time so
we don't have to worry about what HZ is configured to, and
print out a few values to make post-mortem analysis easier.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-18 12:20:06 -07:00
Antonino A. Daplas
bb7e257ef8 [PATCH] vesafb: Fix display corruption on display blank
Reported by: Bob Tracy <rct@gherkin.frus.com>

 "...I've got a Toshiba notebook (730XCDT -- Pentium 150MMX) for which
  I'm using the Vesa FB driver.  When the machine has been idle for some
  time and the driver attempts to powerdown the display, rather than the
  display going blank, it goes gray with several strange lines.  When I
  hit the "shift" key or other-wise wake up the display, the old video
  state is not fully restored..."

vesafb recently added a blank method which has only 2 states, powerup and
powerdown.  The powerdown state is used for all blanking levels, but in his
case, powerdown does not work correctly for higher levels of display
powersaving. Thus, for intermediate power levels, use software blanking,
and use only hardware blanking for an explicit powerdown.

Signed-off-by: Antonino Daplas <adaplas@pol.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-18 08:43:29 -07:00
Linus Torvalds
ace7c76937 Merge master.kernel.org:/home/rmk/linux-2.6-serial 2005-10-18 08:40:46 -07:00
Linus Torvalds
1e65174a33 Add some basic .gitignore files
This still leaves driver and architecture-specific subdirectories alone,
but gets rid of the bulk of the "generic" generated files that we should
ignore.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-18 08:26:15 -07:00
Dmitry Torokhov
e7507ed91e [PATCH] uniput - fix crash on SMP
Only signal completion after marking request slot as free, otherwise other
processor can free request structure before we finish using it.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17 17:03:57 -07:00
Pavel Machek
5cc9eeef9a [PATCH] Fix /proc/acpi/events around suspend
Fix -EIO on /proc/acpi/events after suspends.  This actually breaks
suspending by power button in many setups.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17 17:03:57 -07:00
Stephan Brodkorb
9ac0b9c192 [PATCH] n_r3964 mod_timer() fix
Since Revision 1.10 was released the n_r3964 module wasn't able to receive any
data.  The reason for that behavior is because there were some wrong calls of
mod_timer(...) in the function receive_char (...).  This patch should fix this
problem and was successfully tested with talking to some kuka industrial
robots.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-17 17:03:57 -07:00
Roland Dreier
762a03e21e [IB] ucm: quiet sparse warnings
Make ctx_id_mutex and ctx_id_table static to quiet sparse warnings.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:38:50 -07:00
Sean Hefty
07d357d0cb [IB] CM: bind IDs to a specific device
Bind communication identifiers to a device to support device removal.
Export per HCA CM devices to userspace.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
2005-10-17 15:37:43 -07:00
Sean Hefty
595e726a1f [IB] merge ucm.h into ucm.c
Eliminate ucm.h.  Replace ucm_dbg with direct call to printk KERN_ERR.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
2005-10-17 15:33:47 -07:00
Roland Dreier
67cdb40ca4 [IB] uverbs: Implement more commands
Add kernel support for userspace calling poll CQ, request CQ
notification, post send, post receive, post SRQ receive, create AH and
destroy AH commands.  These commands allow us to support userspace
verbs for devices that can't perform these operations directly from
userspace (eg the PathScale HCA).

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:31 -07:00
Roland Dreier
883a99c702 [IB] uverbs: Add a mask of device methods allowed for userspace
Give each device a uverbs_cmd_mask, so that a low-level driver can
control which methods may be called on behalf of userspace.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:30 -07:00
Roland Dreier
56c202d6e4 [IB] fail SA queries if device initialization failed
If the SA query module's initialization fails for a device, then that
device won't have a struct ib_sa_device associated.  We should fail SA
queries in that case, rather than blindly dereferencing the NULL
pointer we get back from ib_get_client_data().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:29 -07:00
Roland Dreier
305a7e8705 [IB] uverbs: unlock correctly in error paths
A couple of functions were missing spin_unlock calls in error paths.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:29 -07:00
Roland Dreier
5b6810e048 [IPoIB] Rename ipoib_create_qp() -> ipoib_init_qp() and fix error cleanup
ipoib_create_qp() no longer creates IPoIB's QP, so it shouldn't
destroy the QP on failure -- that unwinding happens elsewhere, so the
current code can cause a double free.  While we're at it, the
function's name should match what it actually does, so rename it to
ipoib_init_qp().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:29 -07:00
Jack Morgenstein
efaae8f71f [IB] mthca: Better limit checking and reporting
Check the sizes of CQs, QPs and SRQs when creating objects, and fail
instead of creating too-big queues.  Also return real limits instead
of just plausible-sounding values from mthca_query_device().

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:29 -07:00
Roland Dreier
4ab6fb7e5b [IB] Fix leak on MAD initialization failure
There is a bug in ib_mad_init_device(): if ib_agent_port_open() fails
for a given port, then the current code doesn't call ib_mad_port_close()
for that port.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:28 -07:00
Roland Dreier
e23d6d2b09 [IB] mthca: detect SRQ overflow
The hardware relies on us keeping one extra work request that never
gets used in SRQs.  Add checks to the SRQ work request posting
functions so that they fail when someone is about to use up that extra
work request, rather than when someone uses the very last work request.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:28 -07:00
Roland Dreier
90f104da22 [IB] mthca: SRQ limit reached events
Our hardware supports generating an event when the number of receives
posted to a shared receive queue (SRQ) falls below a user-specified
limit.  Implement mthca_modify_srq() to arm the limit, and add code to
handle dispatching SRQ events when they occur.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:28 -07:00
Roland Dreier
116c0074ec [IB] Check port number in ib_query_port()/ib_modify_port()
Check port number before passing query_port or modify_port operations
on to device driver.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:27 -07:00
Roland Dreier
f575394f1d [IB] uverbs: reject invalid memory registration permission flags
Reject userspace memory registrations with invalid permission flags:
"local write" is required if "remote write" or "remote atomic" is also
requested.

Pointed out by Jack Morgenstein <jackm@mellanox.co.il>

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:27 -07:00
Jack Morgenstein
9825051e8c [IB] mthca: Fill in more fields in query_port method
Add code to fill in the bad_pkey_cntr, max_mtu, active_mtu and
subnet_timeout fields in mthca_query_port().

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:26 -07:00
Roland Dreier
274c089163 [IB] uverbs: Add device-specific ABI version attribute
Add abi_version attribute to uverbs class devices to allow for
ABI versioning of device-specific interfaces.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-10-17 15:20:26 -07:00