linux

Author	SHA1	Message	Date
Anton Blanchard	1f79448302	IB/ehca: Make output clearer by removing some debug messages ehca spits out a lot of debugging information. I had to look closely to see the "Port 1 is not active" message within all the debug: eHCA Infiniband Device Driver (Rel.: SVNEHCA_0022) eHCA scaling code enabled ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_define_sqp Port 1 is not active. ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_create_qp ehca_define_sqp() failed rc=ffffffffffffffff ib_mad: Couldn't create ib_mad QP1 ib_mad: Couldn't open ehca0 port 1 ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_alloc_fmr unsupported fmr_attr->page_shift=9 ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_alloc_fmr rc=ffffffffffffffea pd=c000000b4b5b2420 mr_access_flags=7 fmr_attr=c0000005afd37394 fmr_create failed for FMR 0 Remove a few debug statements so that things are clearer: eHCA Infiniband Device Driver (Rel.: SVNEHCA_0022) eHCA scaling code enabled ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_define_sqp Port 1 is not active. ib_mad: Couldn't create ib_mad QP1 ib_mad: Couldn't open ehca0 port 1 ehca D.001.DQDXYCB-P1-C9: PU0006 EHCA_ERR:ehca_alloc_fmr unsupported fmr_attr->page_shift=9 fmr_create failed for FMR 0 Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:06 -07:00
Roland Dreier	1fea391039	IB/ehca: Include <linux/mutex.h> from ehca_classes.h ehca_classes.h uses struct mutex, so while <linux/mutex.h> seems to be pulled in indirectly by one of the headers it includes, the right thing is to include <linux/mutex.h> directly. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Acked-by: Stefan Roscher <stefan.roscher@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:05 -07:00
Joachim Fenkes	5ff70cac3e	IB/ehca: SRQ fixes to enable IPoIB CM Fix ehca SRQ support so that IPoIB connected mode works: - Report max_srq > 0 if SRQ is supported - Report "last wqe reached" asynchronous event when base QP dies; this is required by the IB spec and IPoIB CM relies on receiving it when cleaning up. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-08-31 13:58:04 -07:00
Stefan Roscher	fecea0ab34	IB/ehca: Fix Small QP regressions The new Small QP code had a few bugs that would also make it trigger for non-Small QPs. Fix them. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-08-31 13:56:42 -07:00
Eddy L O Jansson	0e6ff1580f	in-string typos of "error" One patch for two trivial typos of 'error' with three R's, appearing in message strings. There's a bunch more of the same in comments, not dealt with here. Signed-off-by: Eddy L O Jansson <eddy@klopper.net> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-31 15:39:40 -07:00
Hoang-Nam Nguyen	1655fc2e12	IB/ehca: Move extern declarations from .c files to .h files Make sure declarations stay in sync with definitions by keeping all extern declarations in common .h files. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-28 21:47:53 -07:00
Hoang-Nam Nguyen	05d989f948	IB/ehca: Fix include order to better match kernel style Include <rdma/...> headers after <asm/...> headers. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-28 08:36:32 -07:00
Stefan Roscher	e2f81daf23	IB/ehca: Support small QP queues eHCA2 supports QP queues that can be as small as 512 bytes. This greatly reduces memory overhead for consumers that use lots of QPs with small queues (e.g. RDMA-only QPs). Apart from dealing with firmware, this code needs to manage bite-sized chunks of kernel pages, making sure that no kernel page is shared between different protection domains. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>	2007-07-20 21:19:47 -07:00
Joachim Fenkes	0c10f7b79b	IB/ehca: Make internal_create/destroy_qp() static They're only used in ehca_qp.c, so make them static to that file. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-20 21:19:44 -07:00
Hoang-Nam Nguyen	51d2bfbddb	IB/ehca: Move ehca2ib_return_code() out of line ehca2ib_return_code() is not used in any fast path, and making it non-inline saves ~1.5K of code. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-20 21:19:44 -07:00
Hoang-Nam Nguyen	633a5aedae	IB/ehca: Generate async event when SRQ limit reached Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-20 21:19:44 -07:00
Hoang-Nam Nguyen	5bb7d9290c	IB/ehca: Support large page MRs Add support for MR pages larger than 4K on eHCA2. This reduces firmware memory consumption. If enabled via the mr_largepage module parameter, the MR page size will be determined based on the MR length and the hardware capabilities -- if the MR is >= 16M, 16M pages are used, for example. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-20 21:19:43 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Hoang-Nam Nguyen	2b94397adc	IB/ehca: Fix warnings issued by checkpatch.pl Run the existing ehca code through checkpatch.pl and clean up the worst of the coding style violations. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:40 -07:00
Hoang-Nam Nguyen	187c72e31f	IB/ehca: Restructure ehca_set_pagebuf() Split ehca_set_pagebuf() into three functions depending on MR type (phys/user/fast) and remove superfluous ehca_set_pagebuf_1(). Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:40 -07:00
Hoang-Nam Nguyen	df17bfd4a0	IB/ehca: MR/MW structure refactoring - Rename struct ehca_mr fields to clearly distinguish between kernel and HW page size. - Sort struct ehca_mr_pginfo into a common part and a union containing specific fields for physical, user and fast MR Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:40 -07:00
Hoang-Nam Nguyen	2492398e61	IB/ehca: Use macro to calculate number of chunks in a mem block Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:39 -07:00
Hoang-Nam Nguyen	4e4e74cae7	IB/ehca: Use #define for "pages per register_rpage" instead of hardcoded value Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:39 -07:00
Hoang-Nam Nguyen	a1a6ff1100	IB/ehca: Use common error code mapping instead of specific ones Instead of one error mapping function for each potential error source in ehca_mrmw.c, use a centralized function that handles all cases, saving a three-figure line count. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:39 -07:00
Hoang-Nam Nguyen	3df78f81e0	IB/ehca: Fix memory leak in error path of ehca_get_dma_mr() Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:39 -07:00
Joachim Fenkes	fbb9318be4	IB/ehca: Fix HW level autodetection Autodetection was missing a few HW revisions, causing certain eHCA1 revisions to be treated like eHCA2. Fixed. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:39 -07:00
Hoang-Nam Nguyen	f72d2f0814	IB/ehca: Improve latency by unlocking after triggering the hardware Kick the hardware before unlocking the send/receive queue to overlap processing a little more. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Joachim Fenkes	8705ce5b90	IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive events When firmware reports a nondisruptive port configuration change event, previous versions of the eHCA driver didn't forward the event to consumers like IPoIB. Add code that determines the type of configuration change by comparing old and new port attributes and reports it. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Joachim Fenkes	b1cfe43d4b	IB/ehca: Return QP pointer in poll_cq() Also add two unlikely() statements. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Joachim Fenkes	26ed687fdd	IB/ehca: Change idr spinlocks into rwlocks This eliminates lock contention among IRQs as well as the need to disable IRQs around idr_find, because there are no IRQ writers. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Joachim Fenkes	28db6beb42	IB/ehca: Refactor sync between completions and destroy_cq using atomic_t - ehca_cq.nr_events is made an atomic_t, eliminating a lot of locking. - The CQ is removed from the CQ idr first now to make sure no more completions are scheduled on that CQ. The "wait for all completions to end" code becomes much simpler this way. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Joachim Fenkes	9844b71baa	IB/ehca: Lock renaming, static initializers - Rename all spinlock flags to "flags", matching the vast majority of kernel code. - Move hcall_lock into the only file it's used in. - Replaced spin_lock_init() and friends with static initializers for global variables. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Hoang-Nam Nguyen	15f001ec47	IB/ehca: Report RDMA atomic attributes in query_qp() Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Stefan Roscher	85f003172f	IB/ehca: Set SEND_GRH flag for all non-LL UD QPs on eHCA2 Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Stefan Roscher	472803dab8	IB/ehca: Support UD low-latency QPs Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Joachim Fenkes	a6a12947fb	IB/ehca: add Shared Receive Queue support Support SRQs on eHCA2. Since an SRQ is a QP for eHCA2, a lot of code (structures, create, destroy, post_recv) can be shared between QP and SRQ. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Joachim Fenkes	9a79fc0a1b	IB/ehca: QP code restructuring in preparation for SRQ - Replace init_qp_queues() by a shorter init_qp_queue(), eliminating duplicate code. - hipz_h_alloc_resource_qp() doesn't need a pointer to struct ehca_qp any longer. All input and output data is transferred through the parms parameter. - Change the interface to also support SRQ. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Joachim Fenkes	91f13aa3fc	IB/ehca: HW level, HW caps and MTU autodetection In preparation for support of new eHCA2 features, change adapter probing: - Hardware level is changed to encode major and minor chip version - Hardware capabilities are queried from the firmware - The maximum MTU is queried from the firmware instead of assuming a fixed value Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:27 -07:00
Hoang-Nam Nguyen	b8a3ba5513	IB/ehca: Change scaling_code parameter description to match default value Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:26 -07:00
Jan Engelhardt	06cc85086e	IB: Use menuconfig for InfiniBand menu Change Kconfig objects from "menu, config" into "menuconfig" so that the user can disable the whole feature without having to enter the menu first. Signed-off-by: Jan Engelhardt <jengelh@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 20:12:26 -07:00
Joachim Fenkes	fffba373ef	IB/ehca: Refactor "maybe missed event" code Refactor the ehca changes from commit `ed23a727` ("IB: Return "maybe missed event" hint from ib_req_notify_cq()") so the queue arithmetic is done in slightly fewer lines. Also, move the spinlock flags into the block they're used in. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 16:17:32 -07:00
Stefan Roscher	65a2c841d6	IB/ehca: Fix number of send WRs reported for new QP Due to a typo, the driver was reporting the wrong number of "actual send WRs" after ehca_create_qp(). Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-24 14:02:39 -07:00
Hoang-Nam Nguyen	bd5a6ccc0e	IB/ehca: Return proper error code if register_mr fails Set the return code of ehca_register_mr() to ENOMEM if the corresponding firmware call fails due to out of resources. Some other error codes were explicitly mapped to EINVAL -- just remove those cases so they get mapped to the default case, which already returns EINVAL anyway. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-19 08:51:54 -07:00
Joachim Fenkes	4e430dcb7b	IB/ehca: Disable scaling code by default, bump version number - Scaling code is still considered experimental, so disable it by default - Increase version to SVNEHCA_0023 Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 13:41:40 -07:00
Joachim Fenkes	bba9b6013e	IB/ehca: Beautify sysfs attribute code and fix compiler warnings eHCA's sysfs attributes are now being created via sysfs_create_group(), making the process neatly table-driven. The return value is checked, thus fixing a few compiler warnings. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 13:40:45 -07:00
Joachim Fenkes	c7a14939e7	IB/ehca: Remove _irqsave, move #ifdef - In ehca_process_eq(), we're IRQ safe throughout the whole function, so we don't need another _irqsave in the middle of flight. - take_over_work() is only called by comp_pool_callback(), so it can move into the same #ifdef block. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 13:40:05 -07:00
Hoang-Nam Nguyen	c55a0ddd8e	IB/ehca: Fix AQP0/1 QP number AQP0/1 should report qp_num={0\|1} and the actual QP# should be stored in struct ehca_qp, not the other way round. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 13:39:31 -07:00
Joachim Fenkes	92761cdaf2	IB/ehca: Correctly set GRH mask bit in ehca_modify_qp() The driver needs to always supply the "GRH present" flag to the hypervisor, whether it's true or false. Not supplying it (i.e. not setting the corresponding mask bit) amounts to a "perhaps", which we don't want. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 13:38:57 -07:00
Stefan Roscher	5d88278e3b	IB/ehca: Serialize hypervisor calls in ehca_register_mr() Some pSeries hypervisor versions show a race condition in the allocate MR hCall. Serialize this call per adapter to circumvent this problem. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 13:38:11 -07:00
Linus Torvalds	de5603748a	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters IB: Put rlimit accounting struct in struct ib_umem IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules	2007-05-09 19:40:09 -07:00
Rafael J. Wysocki	8bb7844286	Add suspend-related notifications for CPU hotplug Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Roland Dreier	f7c6a7b5d5	IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules Export ib_umem_get()/ib_umem_release() and put low-level drivers in control of when to call ib_umem_get() to pin and DMA map userspace, rather than always calling it in ib_uverbs_reg_mr() before calling the low-level driver's reg_user_mr method. Also move these functions to be in the ib_core module instead of ib_uverbs, so that driver modules using them do not depend on ib_uverbs. This has a number of advantages: - It is better design from the standpoint of making generic code a library that can be used or overridden by device-specific code as the details of specific devices dictate. - Drivers that do not need to pin userspace memory regions do not need to take the performance hit of calling ib_mem_get(). For example, although I have not tried to implement it in this patch, the ipath driver should be able to avoid pinning memory and just use copy_{to,from}_user() to access userspace memory regions. - Buffers that need special mapping treatment can be identified by the low-level driver. For example, it may be possible to solve some Altix-specific memory ordering issues with mthca CQs in userspace by mapping CQ buffers with extra flags. - Drivers that need to pin and DMA map userspace memory for things other than memory regions can use ib_umem_get() directly, instead of hacks using extra parameters to their reg_phys_mr method. For example, the mlx4 driver that is pending being merged needs to pin and DMA map QP and CQ buffers, but it does not need to create a memory key for these buffers. So the cleanest solution is for mlx4 to call ib_umem_get() in the create_qp and create_cq methods. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-08 18:00:37 -07:00
Paul Mackerras	02bbc0f09c	Merge branch 'linux-2.6'	2007-05-08 13:37:51 +10:00
Linus Torvalds	972d45fb43	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IPoIB: Convert to NAPI IB: Return "maybe missed event" hint from ib_req_notify_cq() IB: Add CQ comp_vector support IB/ipath: Fix a race condition when generating ACKs IB/ipath: Fix two more spin lock problems IB/fmr_pool: Add prefix to all printks IB/srp: Set proc_name IB/srp: Add orig_dgid sysfs attribute to scsi_host IPoIB/cm: Don't crash if remote side uses one QP for both directions RDMA/cxgb3: Support for new abort logic RDMA/cxgb3: Initialize cpu_idx field in cpl_close_listserv_req message RDMA/cxgb3: Fail qp creation if the requested max_inline is too large RDMA/cxgb3: Fix TERM codes IPoIB/cm: Fix error handling in ipoib_cm_dev_open() IB/ipath: Don't corrupt pending mmap list when unmapped objects are freed IB/mthca: Work around kernel QP starvation IB/ipath: Don't put QP in timeout queue if waiting to send IB/ipath: Don't call spin_lock_irq() from interrupt context	2007-05-07 12:18:21 -07:00
Roland Dreier	ed23a72778	IB: Return "maybe missed event" hint from ib_req_notify_cq() The semantics defined by the InfiniBand specification say that completion events are only generated when a completions is added to a completion queue (CQ) after completion notification is requested. In other words, this means that the following race is possible: while (CQ is not empty) ib_poll_cq(CQ); // new completion is added after while loop is exited ib_req_notify_cq(CQ); // no event is generated for the existing completion To close this race, the IB spec recommends doing another poll of the CQ after requesting notification. However, it is not always possible to arrange code this way (for example, we have found that NAPI for IPoIB cannot poll after requesting notification). Also, some hardware (eg Mellanox HCAs) actually will generate an event for completions added before the call to ib_req_notify_cq() -- which is allowed by the spec, since there's no way for any upper-layer consumer to know exactly when a completion was really added -- so the extra poll of the CQ is just a waste. Motivated by this, we add a new flag "IB_CQ_REPORT_MISSED_EVENTS" for ib_req_notify_cq() so that it can return a hint about whether the a completion may have been added before the request for notification. The return value of ib_req_notify_cq() is extended so: < 0 means an error occurred while requesting notification == 0 means notification was requested successfully, and if IB_CQ_REPORT_MISSED_EVENTS was passed in, then no events were missed and it is safe to wait for another event. > 0 is only returned if IB_CQ_REPORT_MISSED_EVENTS was passed in. It means that the consumer must poll the CQ again to make sure it is empty to avoid the race described above. We add a flag to enable this behavior rather than turning it on unconditionally, because checking for missed events may incur significant overhead for some low-level drivers, and consumers that don't care about the results of this test shouldn't be forced to pay for the test. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-06 21:18:11 -07:00

1 2

85 Commits