linux

Author	SHA1	Message	Date
Roland Dreier	ed23a72778	IB: Return "maybe missed event" hint from ib_req_notify_cq() The semantics defined by the InfiniBand specification say that completion events are only generated when a completions is added to a completion queue (CQ) after completion notification is requested. In other words, this means that the following race is possible: while (CQ is not empty) ib_poll_cq(CQ); // new completion is added after while loop is exited ib_req_notify_cq(CQ); // no event is generated for the existing completion To close this race, the IB spec recommends doing another poll of the CQ after requesting notification. However, it is not always possible to arrange code this way (for example, we have found that NAPI for IPoIB cannot poll after requesting notification). Also, some hardware (eg Mellanox HCAs) actually will generate an event for completions added before the call to ib_req_notify_cq() -- which is allowed by the spec, since there's no way for any upper-layer consumer to know exactly when a completion was really added -- so the extra poll of the CQ is just a waste. Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for ib_req_notify_cq() so that it can return a hint about whether the a completion may have been added before the request for notification. The return value of ib_req_notify_cq() is extended so: < 0 means an error occurred while requesting notification == 0 means notification was requested successfully, and if IB_CQ_REPORT_MISSED_EVENTS was passed in, then no events were missed and it is safe to wait for another event. > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was passed in. It means that the consumer must poll the CQ again to make sure it is empty to avoid the race described above. We add a flag to enable this behavior rather than turning it on unconditionally, because checking for missed events may incur significant overhead for some low-level drivers, and consumers that don't care about the results of this test shouldn't be forced to pay for the test. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-06 21:18:11 -07:00
Michael S. Tsirkin	b2875d4c39	IB/mthca: Always fill MTTs from CPU Speed up memory registration by filling in MTTs directly when the CPU can write directly to the whole table (all mem-free cards, and to Tavor mode on 64-bit systems with the patch I posted earlier). This reduces the number of FW commands needed to register an MR by at least a factor of 2 and speeds up memory registration significantly. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-12 16:16:29 -08:00
Jack Morgenstein	b3b30f5e8a	IB/mthca: Recover from catastrophic errors Trigger device remove and then add when a catastrophic error is detected in hardware. This, in turn, will cause a device reset, which we hope will recover from the catastrophic condition. Since this might interefere with debugging the root cause, add a module option to suppress this behaviour. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:54 -07:00
Ralph Campbell	9bc57e2d19	IB/uverbs: Pass userspace data to modify_srq and modify_qp methods Pass a struct ib_udata to the low-level driver's ->modify_srq() and ->modify_qp() methods, so that it can get to the device-specific data passed in by the userspace driver. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:25 -07:00
Roland Dreier	a3285aa4ee	IB/mthca: Fix race in reference counting Fix races in in destroying various objects. If a destroy routine waits for an object to become free by doing wait_event(&obj->wait, !atomic_read(&obj->refcount)); /* now clean up and destroy the object */ and another place drops a reference to the object by doing if (atomic_dec_and_test(&obj->refcount)) wake_up(&obj->wait); then this is susceptible to a race where the wait_event() and final freeing of the object occur between the atomic_dec_and_test() and the wake_up(). And this is a use-after-free, since wake_up() will be called on part of the already-freed object. Fix this in mthca by replacing the atomic_t refcounts with plain old integers protected by a spinlock. This makes it possible to do the decrement of the reference count and the wake_up() so that it appears as a single atomic operation to the code waiting on the wait queue. While touching this code, also simplify mthca_cq_clean(): the CQ being cleaned cannot go away, because it still has a QP attached to it. So there's no reason to be paranoid and look up the CQ by number; it's perfectly safe to use the pointer that the callers already have. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-05-09 10:50:29 -07:00
Jack Morgenstein	59fef3b1e9	IB/mthca: Fix max_srq_sge returned by ib_query_device for Tavor devices The driver allocates SRQ WQEs size with a power of 2 size both for Tavor and for memfree. For Tavor, however, the hardware only requires the WQE size to be a multiple of 16, not a power of 2, and the max number of scatter-gather allowed is reported accordingly by the firmware (and this is the value currently returned by ib_query_device() and ibv_query_device()). If the max number of scatter/gather entries reported by the FW is used when creating an SRQ, the creation will fail for Tavor, since the required WQE size will be increased to the next power of 2, which turns out to be larger than the device permitted max WQE size (which is not a power of 2). This patch reduces the reported SRQ max wqe size so that it can be used successfully in creating an SRQ on Tavor HCAs. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-04-12 11:42:30 -07:00
Jack Morgenstein	bf6a9e31cf	IB: simplify static rate encoding Push translation of static rate to HCA format into low-level drivers, where it belongs. For static rate encoding, use encoding of rate field from IB standard PathRecord, with addition of value 0, for backwards compatibility with current usage. The changes are: - Add enum ib_rate to midlayer includes. - Get rid of static rate translation in IPoIB; just use static rate directly from Path and MulticastGroup records. - Update mthca driver to translate absolute static rate into the format used by hardware. This also fixes mthca's static rate handling for HCAs that are capable of 4X DDR. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-04-10 09:43:47 -07:00
Roland Dreier	227c939b00	IB/mthca: Always build debugging code unless CONFIG_EMBEDDED=y Change the mthca debugging trace output code so that it can enabled and disabled at runtime with the debug_level module parameter in sysfs. Also, don't allow CONFIG_INFINIBAND_MTHCA_DEBUG to be disabled unless CONFIG_EMBEDDED is selected. We want users (and especially distros) to have this turned on unless they really need to save space, because by the time we want debugging output, it's usually too late to rebuild a kernel. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-04-02 14:39:20 -07:00
Eli Cohen	651eaac928	IB/mthca: Optimize large messages on Sinai HCAs Sinai (one-port PCI Express) HCAs get improved throughput for messages bigger than 80 KB in DDR mode if memory keys are formatted in a specific way. The enhancement only works if the memory key table is smaller than 2^24 entries. For larger tables, the enhancement is off and a warning is printed (to avoid silent performance loss). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Michael Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:22 -08:00
Jack Morgenstein	1d89b1ae6c	IB/mthca: Implement query_ah method Implement query_ah (except for AVs which are in HCA memory). This is needed to implement RMPP duplicate session detection on sending side (extraction of DGID/DLID and GRH flag from address handle). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:17 -08:00
Eli Cohen	14abdffcc0	IB/mthca: Write FW commands through doorbell page This patch is checks whether the HCA supports posting FW commands through a doorbell page (user access region 0, or "UAR0"). If this is supported, the driver maps UAR0 and uses it for FW commands. This can be controlled by the value of a writable module parameter fw_cmd_doorbell. When the parameter is 0, the commands are posted through HCR using the old method; otherwise if HCA is capable commands go through UAR0. This use of UAR0 to post commands eliminates the need for polling the "go" bit prior to posting a new command. Since reading from a PCI device is much more expensive then issuing a posted write, it is expected that issuing FW commands this way will provide better CPU utilization. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:17 -08:00
Roland Dreier	00df1b2c8b	IB/mthca: Bump driver version and release date Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:15 -08:00
Eli Cohen	8ebe5077e3	IB/mthca: Support for query QP and SRQ Implement the query_qp and query_srq methods in mthca. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:15 -08:00
Roland Dreier	4885bf64bc	IB/mthca: Add device-specific support for resizing CQs Add low-level driver support for resizing CQs (both kernel and userspace) to mthca. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:08 -08:00
Roland Dreier	d9b98b0f11	IB/mthca: Make functions that never fail return void The function mthca_free_err_wqe() can never fail, so get rid of its return value. That means handle_error_cqe() doesn't have to check what mthca_free_err_wqe() returns, which means it can't fail either and doesn't have to return anything either. All this results in simpler source code and a slight object code improvement: add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-10 (-10) function old new delta mthca_free_err_wqe 83 81 -2 mthca_poll_cq 1758 1750 -8 Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:07 -08:00
Roland Dreier	7d2babc487	IB/mthca: bump driver version and release date Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-02-13 12:19:44 -08:00
Roland Dreier	fd9cfdd11b	IB/mthca: Semaphore to mutex conversions Convert semaphores to mutexes in mthca. Leave firmware command interface poll_sem and event_sem as semaphores. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-01-30 16:45:11 -08:00
Michael S. Tsirkin	9eacee2ac6	IB/mthca: Initialize grh_present before using it build_mlx_header() was using sqp->ud_header.grh_present before it was initialized by mthca_read_ah(). Furthermore, header->grh_present is set by ib_ud_header_init, so there's no need to set it again in mthca_read_ah(). Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-01-12 15:55:41 -08:00
Tim Schmielau	de25968cc8	[PATCH] fix more missing includes Include fixes for 2.6.14-git11. Should allow to remove sched.h from module.h on i386, x86_64, arm, ia64, ppc, ppc64, and s390. Probably more to come since I haven't yet checked the other archs. Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:45 -08:00
Jack Morgenstein	77369ed31d	[IB] uverbs: have kernel return QP capabilities Move the computation of QP capabilities (max scatter/gather entries, max inline data, etc) into the kernel, and have the uverbs module return the values as part of the create QP response. This keeps precise knowledge of device limits in the low-level kernel driver. This requires an ABI bump, so while we're making changes, get rid of the max_sge parameter for the modify SRQ command -- it's not used and shouldn't be there. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-11-10 10:22:50 -08:00
Jack Morgenstein	0f69ce1e44	[IB] mthca: report page size capability Report the device's real page size capability in mthca_query_device(). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-11-04 21:28:21 -08:00
Michael S. Tsirkin	affcd50546	[IB] mthca: report asynchronous CQ events Implement reporting asynchronous CQ events in Mellanox HCA driver. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-10-29 07:39:42 -07:00
Roland Dreier	3d155f8cd0	[IB] mthca: first pass at catastrophic error reporting Add some initial support for detecting and reporting catastrophic errors reported by Mellanox HCAs. We start a periodic timer which polls the catastrophic error reporting buffer in device memory. If an error is detected, we dump the contents of the buffer for port-mortem debugging, and report a fatal asynchronous error to higher levels. In the future we can try to recover from these errors by resetting the device, but this will require some work in higher-level code as well. Let's get this in now, so that we at least get catastrophic errors reported in logs. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-10-27 11:03:38 -07:00
Jack Morgenstein	efaae8f71f	[IB] mthca: Better limit checking and reporting Check the sizes of CQs, QPs and SRQs when creating objects, and fail instead of creating too-big queues. Also return real limits instead of just plausible-sounding values from mthca_query_device(). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-10-17 15:20:29 -07:00
Roland Dreier	90f104da22	[IB] mthca: SRQ limit reached events Our hardware supports generating an event when the number of receives posted to a shared receive queue (SRQ) falls below a user-specified limit. Implement mthca_modify_srq() to arm the limit, and add code to handle dispatching SRQ events when they occur. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-10-17 15:20:28 -07:00
Jack Morgenstein	33033b7972	[IB] mthca: Report correct atomic capability Return correct atomic capability flag from mthca query function. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-10-17 15:20:24 -07:00
Roland Dreier	ec34a922d2	[PATCH] IB/mthca: Add SRQ implementation Add mthca support for shared receive queues (SRQs), including userspace SRQs. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-08-26 20:37:37 -07:00
Roland Dreier	87b816706b	[PATCH] IB/mthca: Factor out common queue alloc code Clean up the allocation of memory for queues by factoring out the common code into mthca_buf_alloc() and mthca_buf_free(). Now CQs and QPs share the same queue allocation code, which we'll also use for SRQs. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-08-26 20:37:37 -07:00
Roland Dreier	da6561c285	[PATCH] IB/mthca: Use correct port width capability value When we call the INIT_IB firmware command to bring up a port, use the actual port width capability returned by the QUERY_DEV_LIM command instead of always trying to enable both 1X and 4X. This fixes breakage seen when the firmware is build to allow 4X only. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-08-26 20:37:36 -07:00
Michael S. Tsirkin	2e8b981c5d	[PATCH] IB/mthca: add HCA board ID to sysfs info Add support for reporting HCA board ID returned from QUERY_ADAPTER firmware command through sysfs. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-08-26 20:37:35 -07:00
Sean Hefty	97f52eb438	[PATCH] IB: sparse endianness cleanup Fix sparse warnings. Use __be* where appropriate. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-08-26 20:37:35 -07:00
Roland Dreier	2a1d9b7f09	[PATCH] IB: Add copyright notices Make some lawyers happy and add copyright notices for people who forgot to include them when they actually touched the code. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2005-08-26 20:37:35 -07:00
Roland Dreier	80c8ec2c04	[PATCH] IB uverbs: add mthca user QP support Add support for userspace queue pairs (QPs) to mthca. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-07 18:23:50 -07:00
Roland Dreier	74c2174e7b	[PATCH] IB uverbs: add mthca user CQ support Add support for userspace completion queues (CQs) to mthca. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-07 18:23:50 -07:00
Roland Dreier	99264c1ee2	[PATCH] IB uverbs: add mthca user PD support Add support for userspace protection domains (PDs) to mthca. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-07-07 18:23:49 -07:00
Roland Dreier	cae54bdf6f	[PATCH] IB/mthca: Bump version It's about time for a version bump. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-27 15:11:47 -07:00
Roland Dreier	ed878458ee	[PATCH] IB/mthca: Align FW command mailboxes to 4K Future versions of Mellanox HCA firmware will require command mailboxes to be aligned to 4K. Support this by using a pci_pool to allocate all mailboxes. This has the added benefit of shrinking the source and text of mthca. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-27 15:11:46 -07:00
Roland Dreier	d56d6f9502	[PATCH] IB/mthca: Split off MTT allocation Split allocation of MTT range from creation of MR. This will be useful for implementing shared memory regions and userspace verbs. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-27 15:11:46 -07:00
Tom Duffy	cd4e8fb49d	[PATCH] IB/mthca: Add Sun copyright notice Add Sun copyright to files modified by Tom Duffy. Signed-off-by: Tom Duffy <tduffy@sun.com> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-06-27 15:11:44 -07:00
Roland Dreier	68a3c21203	[PATCH] IB/mthca: add support for new MT25204 HCA Decouple table of HCA features from exact HCA device type. Add a current FW version field so we can warn when someone is using old FW. Add support for new MT25204 HCA. Remove the warning about mem-free support, since it should be pretty solid at this point. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:26:34 -07:00
Roland Dreier	08aeb14e5f	[PATCH] IB/mthca: map context for RDMA responder in mem-free mode Fix RDMA in mem-free mode: we need to make sure that the RDMA context memory is mapped for the HCA. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:26:34 -07:00
Roland Dreier	d10ddbf6d7	[PATCH] IB/mthca: encapsulate mem-free check into mthca_is_memfree() Clean up mem-free mode support by introducing mthca_is_memfree() function, which encapsulates the logic of deciding if a device is mem-free. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:26:32 -07:00
Michael S. Tsirkin	e0f5fdca1c	[PATCH] IB/mthca: add fast memory region implementation Implement fast memory regions (FMRs), where the driver writes directly into the HCA's translation tables rather than requiring a firmware command. For Tavor, MTTs for FMR are separate from regular MTTs, and are reserved at driver initialization. This is done to limit the amount of virtual memory needed to map the MTTs. For Arbel, there's no such limitation, and all MTTs and MPTs may be used for FMR or for regular MR. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:26:30 -07:00
Michael S. Tsirkin	9095e208d8	[PATCH] IB/mthca: encapsulate MTT buddy allocator Encapsulate the buddy allocator used for MTT segments. This cleans up the code and also gets us ready to add FMR support. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:26:26 -07:00
Michael S. Tsirkin	2a4443a699	[PATCH] IB/mthca: fill in opcode field for send completions Fill in missing fields in send completions. Signed-off-by: Itamar Rabenstein <itamar@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:26:25 -07:00
Roland Dreier	44ea66879d	[PATCH] IB/mthca: fix MTT allocation in mem-free mode Fix bug in MTT allocation in mem-free mode. I misunderstood the MTT size value returned by the firmware -- it is really the size of a single MTT entry, since mem-free mode does not segment the MTT as the original firmware did. This meant that our MTT addresses ended up being off by a factor of 8. This meant that our MTT allocations might overlap, and so we could overwrite and corrupt earlier memory regions when writing new MTT entries. We fix this by always using our 64-byte MTT segment size. This allows some simplification of the code as well, since there's no reason to put the MTT segment size in a variable -- we can always use our enum value directly. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-04-16 15:26:24 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

47 Commits