Commit 518b1646 ("IPoIB/cm: Fix SRQ WR leak") introduced a severe
performance regression on Mellanox cards, because keeping a QP in the
error state for extended periods of time moves hardware to the slow
path (until the QP is destroyed). For example, MPI latency goes from
~3 usecs to ~7 usecs.
Fix this by posting a send WR on one of the QPs that are being
flushed, instead of using a separate drain QP that is kept in the
error state.
This fixes bug <https://bugs.openfabrics.org/show_bug.cgi?id=636>,
reported and bisected by Scott Weitzenkamp at Cisco and debugged by
Sasha Mikheev at Voltaire.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
mthca_free_err_wqe() currently treats both send and receive CQEs
identically if a QP is using an SRQ. But for Tavor hardware, send
CQEs with error can be chained together even if the RQ is part of SRQ,
so we may miss some CQEs.
Fix by following the WQE chain for all send CQEs, even for QPs that use an SRQ.
This fixes crashes in IPoIB CM:
<https://bugs.openfabrics.org//show_bug.cgi?id=604>
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC64]: Fill holes in hypervisor APIs and fix KTSB registry.
[SPARC64]: Fix two bugs wrt. kernel 4MB TSB.
[SPARC]: Mark as emulating cmpxchg, add appropriate depends for DRM.
[SPARC]: Emulate cmpxchg like parisc
[SPARC64]: Fix _PAGE_EXEC_4U check in sun4u I-TLB miss handler.
[SPARC]: Linux always started with 9600 8N1
[SPARC64]: arch/sparc64/time.c doesn't compile on Ultra 1 (no PCI)
[SPARC64]: Eliminate NR_CPUS limitations.
[SPARC64]: Use machine description and OBP properly for cpu probing.
[SPARC64]: Negotiate hypervisor API for PCI services.
[SPARC64]: Report proper system soft state to the hypervisor.
[SPARC64]: Fix typo in sun4v_hvapi_register error handling.
[SCSI] ESP: Kill SCSI_ESP_CORE and link directly just like jazz_esp
[SCSI] jazz_esp: Converted to use esp_core.
[SPARC64]: PCI device scan is way too verbose by default.
[SERIAL] sunzilog: section mismatch fix
[SPARC32]: Removes mismatch section warnings in sparc time.c file
[SPARC64]: Don't be picky about virtual-dma values on sun4v.
[SPARC64]: Kill unused DIE_PAGE_FAULT enum value.
[SCSI] pluto: Use wait_for_completion_timeout.
* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
hwmon/applesmc: Handle name file creation error and deletion
hwmon/applesmc: Simplify dependencies
hwmon-vid: Don't spam the logs when VRM version is missing
hwmon/w83627hf: Be quiet when no chip is found
hwmon/coretemp: Add more safety checks
hwmon/ds1621: Fix swapped temperature limits
* master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] megaraid_sas: intercept cmd timeout and throttle io
[SCSI] fusion: Fix |/|| confusion
[SCSI] aic94xx: asd_clear_nexus should fail if the cleared task does not complete
[SCSI] aic7xxx: fix aicasm build failure with gcc-3.4.6
[SCSI] aacraid: apply commit config for reset_devices flag
[SCSI] sd: fix refcounting regression in suspend/resume routines
[SCSI] aacraid: fix panic on short Inquiry
[SCSI] aacraid: Correct sa platform support. (Was: [Bug 8469] Bad EIP value on pentium3 SMP kernel-2.6.21.1)
[SCSI] NCR53C9x: correct spelling mistake in deprecation notice
[SCSI] tgt: fix a rdma indirect transfer error bug
[SCSI] MegaRAID: Update MAINTAINERS email-id
[SCSI] stex: minor cleanup and version update
[SCSI] stex: fix reset recovery for console device
[SCSI] stex: extend hard reset wait time
[SCSI] stex: fix id mapping issue
[SCSI] ipr: Proper return codes for eh_dev_reset for SATA devices
[SCSI] zfcp: IO stall after deleting and path checker changes after reenabling zfcp devices
[SCSI] zfcp: avoid clutter in erp_dbf
This patch (as912) replaces a couple of calls to flush_workqueue()
with cancel_sync_work() and cancel_rearming_delayed_work(). Using a
more directed approach allows us to avoid some nasty deadlocks. The
prime example occurs when a first-level device (the parent is a root
hub) is removed while at the same time the root hub gets a remote
wakeup request. khubd would try to flush the autosuspend workqueue
while holding the root-hub's lock, and the remote-wakeup workqueue
routine would be waiting to lock the root hub.
The patch also reorganizes the power management portion of
usb_disconnect(), separating it out into its own routine. The
autosuspend workqueue entry is cancelled immediately instead of
waiting for the device's release routine. In addition,
synchronization with the autosuspend thread is carried out even for
root hubs (an oversight in the original code).
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg KH <gregkh@suse.de>
Cc: Mark Lord <lkml@rtr.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Several interfaces were missing and others misnumbered or
improperly documented.
Also, make sure to check the return value when registering
the kernel TSBs with the hypervisor. This helped to find
the 4MB kernel TSB alignment bug fixed in a previous changeset.
Signed-off-by: David S. Miller <davem@davemloft.net>
1) The TSB lookup was not using the correct hash mask.
2) It was not aligned on a boundary equal to its size,
which is required by the sun4v Hypervisor.
Furthermore, the hypervisor call that registers the kernel TSBs
wasn't having its return value checked, and that bug will be fixed up
as well in a subsequent changeset.
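As a rough illustration of the two requirements above, the following
userspace sketch shows a natural-alignment check and a hash mask derived
from the table size. The helper names and the sizes used in any call are
assumptions made for this example; they are not the kernel's TSB code.

```c
#include <assert.h>
#include <stdint.h>

/* True if addr sits on a boundary equal to size (size: power of two). */
static int is_naturally_aligned(uint64_t addr, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/* Round addr up to the next multiple of size (size: power of two). */
static uint64_t align_up(uint64_t addr, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr + size - 1) & ~(size - 1);
}

/* Index mask for a power-of-two table holding nentries slots. */
static uint64_t hash_mask(uint64_t nentries)
{
    assert(nentries != 0 && (nentries & (nentries - 1)) == 0);
    return nentries - 1;
}
```

A lookup that masks with the wrong table size either aliases entries or
indexes past the table, which is why the mask has to come from the actual
number of entries.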
Signed-off-by: David S. Miller <davem@davemloft.net>
The DRM code depends on an atomic version of cmpxchg(), which is not
available on sparc32. Since other platforms besides sparc32 have this
issue, a Kconfig option is added for it.
Signed-off-by: Martin Habets <errandir_news@mph.eclipse.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was using an immediate _PAGE_EXEC_4U value in an 'and'
instruction to perform the test. This doesn't work because
the immediate field is signed 13-bit; thus the mask being
tested against the PTE was 0x1000 sign-extended to 32-bits
instead of just plain 0x1000.
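The effect can be reproduced in plain C. This is only a sketch of the
arithmetic: the helper names are made up for the example, and the result
is widened to 64 bits (the low 32 bits are the 0xfffff000 pattern
described above).

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend a 13-bit immediate field to a 64-bit value. */
static int64_t sign_extend_simm13(uint32_t field)
{
    assert(field <= 0x1fff);    /* only 13 bits can be encoded */
    /* Flip bit 12, then subtract it: bit 12 becomes the sign. */
    return (int64_t)(field ^ 0x1000) - 0x1000;
}

/* Mask that an 'and' with this immediate really applies. */
static uint64_t effective_mask(uint32_t field)
{
    return (uint64_t)sign_extend_simm13(field);
}
```

With a field of 0x1000 the effective mask has every bit from bit 12
upwards set, so the test matches PTEs that have any of those bits set,
not just the intended one.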
Signed-off-by: David S. Miller <davem@davemloft.net>
The Linux kernel ignored the PROM's serial settings (115200,n,8,1 in
my case). This was because mode_prop remained "ttyX-mode" (expected:
"ttya-mode") due to the constness of string literals when used with
"char *". Since there is no "ttyX-mode" property in the PROM, Linux
always used the default 9600.
[ Investigation of the suncore.s assembler reveals that gcc optimized
away the stores, yet did not emit a warning, which is a pretty
anti-social thing to do and is the only reason this bug lived for
so long -DaveM ]
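For illustration only, one way to avoid writing through a pointer to a
string literal (which is undefined behaviour in C) is to copy the
template into a writable buffer first. The function name and buffer
handling below are assumptions for this sketch, not the code from the
actual fix.

```c
#include <assert.h>
#include <string.h>

/*
 * Build a PROM property name such as "ttya-mode" or "ttyb-mode".
 * The template is copied into a caller-supplied writable buffer,
 * so the literal itself is never modified.
 */
static void build_mode_prop(char *buf, size_t len, char line)
{
    static const char tmpl[] = "ttyX-mode";

    assert(len >= sizeof(tmpl));
    memcpy(buf, tmpl, sizeof(tmpl));    /* includes the trailing NUL */
    buf[3] = line;                      /* replace the 'X' placeholder */
}
```

Declaring the variable as an array (char mode_prop[] = "ttyX-mode";)
achieves the same thing, since the array is a writable copy of the
literal.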
Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is bug 8540 on bugzilla.kernel.org
arch/sparc64/time.c contains references to assorted bq4802 stuff if
CONFIG_PCI is not set, and compile fails. I #ifdef'ed out everything
that looks PCI-ish in that file.
Signed-off-by: David S. Miller <davem@davemloft.net>
Cheetah systems can have cpuids as large as 1023, although physical
systems don't have that many cpus.
Only three limitations existed in the kernel preventing arbitrary
NR_CPUS values:
1) dcache dirty cpu state stored in page->flags on
D-cache aliasing platforms. With some build time
calculations and some build-time BUG checks on
page->flags layout, this one was easily solved.
2) The cheetah XCALL delivery code could only handle
a cpumask with up to 32 cpus set. Some simple looping
logic clears that up too.
3) thread_info->cpu was a u8, easily changed to a u16.
There are a few spots in the kernel that still put NR_CPUS
sized arrays on the kernel stack, but that's not a sparc64
specific problem.
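The looping idea in item 2 can be sketched in userspace C as follows.
This only counts how many delivery rounds a mask would need when each
round carries at most 32 cpus; BATCH_MAX and the function name are
assumptions for the example, and the real XCALL code is not shown here.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 32    /* assumed per-round limit, as in the text above */

/*
 * Walk a bitmap of nbits cpus and count the delivery rounds needed
 * when one round can carry at most BATCH_MAX set cpus.
 */
static unsigned int count_dispatch_rounds(const unsigned long *mask,
                                          size_t nbits)
{
    const size_t bits_per_word = sizeof(unsigned long) * 8;
    unsigned int rounds = 0, in_batch = 0;

    assert(mask != NULL);
    for (size_t cpu = 0; cpu < nbits; cpu++) {
        if (!(mask[cpu / bits_per_word] & (1UL << (cpu % bits_per_word))))
            continue;
        if (++in_batch == BATCH_MAX) {  /* batch full: deliver it */
            rounds++;
            in_batch = 0;
        }
    }
    if (in_batch)                       /* deliver the last partial batch */
        rounds++;
    return rounds;
}
```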
Signed-off-by: David S. Miller <davem@davemloft.net>
Use new esp_scsi for JAZZ SCSI host adapter driver
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
These messages were very useful when bringing up the
OBP based PCI device scan code, but it's just a lot
of noise every bootup now especially on big machines.
The messages can be re-enabled via 'ofpci_debug=1' on
the kernel command line.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes section mismatch warnings in the sunzilog driver.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes mismatch section warnings in the
sparc/kernel/time.c file.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle arbitrary base and length values as long as they
are multiples of IO_PAGE_SIZE.
Bug found by Arun Kumar Rao.
Signed-off-by: David S. Miller <davem@davemloft.net>
sparc64 got rid of the pagefault notifiers, so the enum value for them
can go away as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix this warning on x86-64
drivers/firewire/fw-cdev.c:798: warning: initialization from incompatible pointer type
by making the return type of ioctl_send_request() the same as all the
other ioctl_xxx() return types.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Of course everybody immediately associates "fw-" with FireWire, not
firmware or firewall or whatever. But "firewire-" has a nice ring to
it too.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Kristian Hoegsberg <krh@bitplanet.net>
While playing with libiec61883 I've noticed that async_send is broken
because it was doing copy_from_user(...., packet->data_size) before
packet->data_size was set to any useful value. It got broken when
packet->allocated_data_size got introduced, as hpsb_alloc_packet does
not set packet->data_size anymore. (Regression in 2.6.22-rc1)
Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This adds a real parent device to eth1394's ethX device like in Linux
2.6.20 and older. However, due to unfinished conversion of the ieee1394
away from class_device, we now refer to the FireWire controller's PCI
device as the parent, not to the ieee1394 driver's fw-host device.
Having a real parent device instead of a virtual one allows udev scripts
to distinguish eth1394 interfaces from networking bridges, bondings and
the like.
Fixes a regression since 2.6.21:
https://bugs.gentoo.org/show_bug.cgi?id=177199
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
When eth1394 was unable to acquire a transaction label, it just dropped
outgoing packets without attempting to resend them later.
The transmit queue is now halted if no tlabel is available to
->hard_start_xmit(). A workqueue job is then scheduled to catch the
moment when ieee1394 recycled the next lot of tlabels.
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=8402
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
If we cannot guess which VRM version the CPU uses, we set it to 0 and
log it. So we shouldn't spam the log each time vid_from_reg() is
later called with vrm 0.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Rudolf Marek <r.marek@assembler.cz>
Add detection of the AE18 erratum of Core processors and warn
users that the absolute readings might be wrong for Core2 processors.
Signed-off-by: Rudolf Marek <r.marek@assembler.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
The low temperature limit and the high temperature limit registers
have been accidentally swapped, causing alarms to trigger
when they shouldn't.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
It's that time of the year again. Summer starts in the US, and people
want to sit at the beach with a new -rc candidate.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Why is it that since the 2f1a2ccb9c console
UTF-8 fixes went into 2.6.22-rc1, the PowerMac G5 shows only inverse video
question marks for the text on tty2-6, whereas tty1 is fine, and so is x86?
No fault of that patch: by removing the old fallback behaviour, it reveals
that 32-bit setfont running on 64-bit kernels has only really worked on
the current console, the rest getting faked by that inadequate fallback.
Bring the compat do_unimap_ioctl into line with the main one: PIO_UNIMAP
and GIO_UNIMAP apply to the specified tty, not redirected to fg_console.
Use the same checks, and most particularly, remember to check access_ok:
con_set_unimap and con_get_unimap are using __get_user and __put_user.
And the compat vt_check should ask for the same capability as the main
one, CAP_SYS_TTY_CONFIG rather than CAP_SYS_ADMIN. Added in vt_ioctl's
vc_cons_allocated check for safety, though failure may well be impossible.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
IPoIB/cm: Drain cq in ipoib_cm_dev_stop()
IPoIB/cm: Fix timeout check in ipoib_cm_dev_stop()
IB/ehca: Fix number of send WRs reported for new QP
IB/mlx4: Initialize send queue entry ownership bits
IB/mlx4: Don't allocate RQ doorbell if using SRQ
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
pata_hpt37x: Further improvements based on the IDE updates and vendor drivers
pata: Trivia
[libata] sata_via, pata_via: Add PCI IDs.
[libata] Fix decoding of 6-byte commands
libata: sata_sis fixes
Fix build failure for drivers/ata/pata_scc.c
[libata] sata_mv: add TODO list
[libata] sata_promise: fix flags typo
We weren't cleaning up our inode reference on error in
ocfs2_reserve_local_alloc_bits(). Add a check for error return and iput() if
need be. Move the code to set the alloc context inode info to the end of the
function so we don't have any possibility of passing back a bad pointer.
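The shape of the fix can be sketched with stand-in types. Everything in
this example (struct names, fake_iput(), the step_status parameter) is
hypothetical and only models the pattern: drop the reference on the
error path and publish the inode in the context as the very last step.

```c
#include <assert.h>
#include <stddef.h>

struct fake_inode { int refcount; };             /* stand-in for an inode */
struct fake_ctx   { struct fake_inode *inode; }; /* stand-in for the context */

static void fake_iput(struct fake_inode *inode)
{
    inode->refcount--;
}

/*
 * Take a reference, run a step that may fail, and hand the inode to
 * the context only on success. On failure the reference is dropped
 * and the context is left untouched.
 */
static int reserve_bits(struct fake_inode *inode, struct fake_ctx *ctx,
                        int step_status)
{
    int status;

    assert(inode != NULL && ctx != NULL);
    inode->refcount++;          /* reference held on behalf of ctx */

    status = step_status;       /* result of the fallible step */
    if (status < 0)
        goto bail;

    ctx->inode = inode;         /* last step: publish the pointer */
    return 0;

bail:
    fake_iput(inode);           /* undo the reference on error */
    return status;
}
```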
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>