Several devices have multiple independant RX queues per net
device, and some have a single interrupt doorbell for several
queues.
In either case, it's easier to support layouts like that if the
structure representing the poll is independant from the net
device itself.
The signature of the ->poll() call back goes from:
int foo_poll(struct net_device *dev, int *budget)
to
int foo_poll(struct napi_struct *napi, int budget)
The caller is returned the number of RX packets processed (or
the number of "NAPI credits" consumed if you want to get
abstract). The callee no longer messes around bumping
dev->quota, *budget, etc. because that is all handled in the
caller upon return.
The napi_struct is to be embedded in the device driver private data
structures.
Furthermore, it is the driver's responsibility to disable all NAPI
instances in it's ->stop() device close handler. Since the
napi_struct is privatized into the driver's private data structures,
only the driver knows how to get at all of the napi_struct instances
it may have per-device.
With lots of help and suggestions from Rusty Russell, Roland Dreier,
Michael Chan, Jeff Garzik, and Jamal Hadi Salim.
Bug fixes from Thomas Graf, Roland Dreier, Peter Zijlstra,
Joseph Fannin, Scott Wood, Hans J. Koch, and Michael Chan.
[ Ported to current tree and all drivers converted. Integrated
Stephen's follow-on kerneldoc additions, and restored poll_list
handling to the old style to fix mutual exclusion issues. -DaveM ]
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
All drivers implement ethtool get_perm_addr the same way -- by calling
the generic function. So we can inline the generic function into the
caller and avoid going through the drivers.
Signed-off-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of all drivers reading pci config space to get the revision
ID, they can now use the pci_device->revision member.
This exposes some issues where drivers where reading a word or a dword
for the revision number, and adding useless error-handling around the
read. Some drivers even just read it for no purpose of all.
In devices where the revision ID is being copied over and used in what
appears to be the equivalent of hotpath, I have left the copy code
and the cached copy as not to influence the driver's performance.
Compile tested with make all{yes,mod}config on x86_64 and i386.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Acked-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The 82550 & 51 parts have an extended configuration block that
includes a bit "GMRC", required to enable the expected TCO behavior,
in config byte offset 22d. The config block sent by the failing driver
does include the extension area, but this bit is not initialised,
and the downlaod only specifies 0x16 bytes to be sent to the NIC
(thaht's bytes 00..21d). By initializing the GMRC bit, and extending
the download size for D102+ MACs, the problem is resolved.
Signed-off-by: David Graham <david.graham@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
This reverts commit d52df4a35a.
This patch attempted to fix e100 for non-cache coherent memory
architectures by using the cb style code that eepro100 had and using
the EL and s bits from the RFD list. Unfortunately the hardware
doesn't work exactly like this and therefore this patch actually
breaks e100. Reverting the change brings it back to the previously
known good state for 2.6.22. The pending rewrite in progress to this
code can then be safely merged later.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
It appears that some systems still like e100 better if it uses
I/O access mode. Setting the new parameter use_io=1 will cause
all driver instances to use io mapping to access the register
space on the e100 device.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Seved Torstendahl <seved.torstendahl@netinsight.net> suggested to
let the module parameter for invalid eeprom checksum control the valid
mac address test.
If this bypass happens we should print a different message,
or at least one that is correct, maybe something like below
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
I was going to say that eepro100's speedo_rx_link() does the same DMA
abuse as e100, but then I noticed one little detail: eepro100 sets both
EL (end of list) and S (suspend) bits in the RFD as it chains it to the
RFD list. e100 was only setting the EL bit. Hmmm, that's interesting.
That means that if HW reads a RFD with the S-bit set, it'll process
that RFD and then suspend the receive unit. The receive unit will
resume when SW clears the S-bit. There is no need for SW to restart
the receive unit. Which means a lot of the receive unit state tracking
code in the driver goes away.
So here's a patch against 2.6.14. (Sorry for inlining it; the mailer
I'm using now will mess with the word wrap). I can't test this on
XScale (unless someone has an e100 module for Gumstix :) . It should
be doing exactly what eepro100 does with RFDs. I don't believe this
change will introduce a performance hit because the S-bit and EL-bit go
hand-in-hand meaning if we're going to suspend because of the S- bit,
we're on the last resource anyway, so we'll have to wait for SW to
replenish.
(cherry picked from 29e79da9495261119e3b2e4e7c72507348e75976 commit)
To clearly state the intent of copying to linear sk_buffs, _offset being a
overly long variant but interesting for the sake of saving some bytes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
e100: fix napi ifdefs removing needed code
From: Auke Kok <auke-jan.h.kok@intel.com>
The e100 driver is NAPI mode only. We need to netif_poll_disable
during suspend and shutdown. The non-NAPI driver code was removed
and is only avaiable in the out-of-tree e100 kernel driver.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
e100: fix irq leak on suspend/resume
From: Frederik Deweerdt <frederik.deweerdt@gmail.com>
The e100_resume() function should be calling netif_device_detach and
free_irq. This fixes multiple irq's being allocated after resume.
Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Conflicts:
drivers/infiniband/core/iwcm.c
drivers/net/chelsio/cxgb2.c
drivers/net/wireless/bcm43xx/bcm43xx_main.c
drivers/net/wireless/prism54/islpci_eth.c
drivers/usb/core/hub.h
drivers/usb/input/hid-core.c
net/core/netpoll.c
Fix up merge failures with Linus's head and fix new compilation failures.
Signed-Off-By: David Howells <dhowells@redhat.com>
Fix various .c/.h typos in comments (no code changes).
Signed-off-by: Matt LaPlante <kernel1@cyberdogtech.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Account for the interface being closed before disabling polling
on a device, to fix shutdown on some systems that explcitly close
the netdevice before calling shutdown.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
When rebooting with netconsole over e100, the driver shutdown code would
deadlock with netpoll. Reduce shutdown code to a bare minimum while retaining
WoL and suspend functionality.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
of passing regs around manually through all ~1800 interrupt handlers in the
Linux kernel.
The regs pointer is used in few places, but it potentially costs both stack
space and code to pass it around. On the FRV arch, removing the regs parameter
from all the genirq function results in a 20% speed up of the IRQ exit path
(ie: from leaving timer_interrupt() to leaving do_IRQ()).
Where appropriate, an arch may override the generic storage facility and do
something different with the variable. On FRV, for instance, the address is
maintained in GR28 at all times inside the kernel as part of general exception
handling.
Having looked over the code, it appears that the parameter may be handed down
through up to twenty or so layers of functions. Consider a USB character
device attached to a USB hub, attached to a USB controller that posts its
interrupts through a cascaded auxiliary interrupt controller. A character
device driver may want to pass regs to the sysrq handler through the input
layer which adds another few layers of parameter passing.
I've build this code with allyesconfig for x86_64 and i386. I've runtested the
main part of the code on FRV and i386, though I can't test most of the drivers.
I've also done partial conversion for powerpc and MIPS - these at least compile
with minimal configurations.
This will affect all archs. Mostly the changes should be relatively easy.
Take do_IRQ(), store the regs pointer at the beginning, saving the old one:
struct pt_regs *old_regs = set_irq_regs(regs);
And put the old one back at the end:
set_irq_regs(old_regs);
Don't pass regs through to generic_handle_irq() or __do_IRQ().
In timer_interrupt(), this sort of change will be necessary:
- update_process_times(user_mode(regs));
- profile_tick(CPU_PROFILING, regs);
+ update_process_times(user_mode(get_irq_regs()));
+ profile_tick(CPU_PROFILING);
I'd like to move update_process_times()'s use of get_irq_regs() into itself,
except that i386, alone of the archs, uses something other than user_mode().
Some notes on the interrupt handling in the drivers:
(*) input_dev() is now gone entirely. The regs pointer is no longer stored in
the input_dev struct.
(*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
something different depending on whether it's been supplied with a regs
pointer or not.
(*) Various IRQ handler function pointers have been moved to type
irq_handler_t.
Signed-Off-By: David Howells <dhowells@redhat.com>
(cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)
Unify our shutdown/suspend/resume code and make it similar to e1000:
e1000_shutdown now calls suspend which does the exact same thing on
shutdown except saving PCI config state on suspend. WoL setup code
is now also more simple and works even when CONFIG_PM is not set, which
was previously broken.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
We keep getting requests from people that think that this might be
an exploitable hole where we would overwrite 4 bytes in the netdev
struct if the pci name would exceed 15 characters. In reality this
will never happen but we fix it anyway.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
This update to the copyright header adds the mailinglist, and aligns it
with the kernel licensing as well as remove the offending 'all rights
reserved'.
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
A recent patch in -mm3 titled
"gregkh-pci-pci-don-t-enable-device-if-already-enabled.patch" causes
pci_enable_device() to be a no-op if the kernel thinks that the device is
already enabled. This change breaks the PCI error recovery mechanism in
the e100 device driver, since, after PCI slot reset, the card is no longer
enabled. This is a trivial fix for this problem. Tested.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Several people run into the situation where the E100
EEPROM contents are fine, but the checksum hasn't been
set properly. This renders the device useless for
them even though it would function correctly.
The default is off, which retains the current behavior.
Signed-off-by: David S. Miller <davem@davemloft.net>
From: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
MDIO/MDIO-X was broken due to a wrong errata. Removing the workaround
code fixes for affected NICs.
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
This is needed if we wish to change the size of the resource structures.
Based on an original patch from Vivek Goyal <vgoyal@in.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Various PCI bus errors can be signaled by newer PCI controllers. This
patch adds the PCI error recovery callbacks to the intel ethernet e100
device driver. The patch has been tested, and appears to work well.
Signed-off-by: Linas Vepstas <linas@linas.org>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Auke Kok <auke-jan.h.kok@intel.com>
Olaf Hering reported a problem on pseries with e100 where ethtool -t would
cause a bus error, and the e100 driver would stop working. Due to the new
load ucode command the cb list must be allocated before calling
e100_init_hw, so remove the call and just let e100_up take care of it.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Checking e100.c code against Documentation/io_ordering.txt I found the
following problem:
spin_lock_irq...
write
spin-unlock
e100_write_flush
The attached patch fix the code like this:
spin_lock_irq...
write
e100_write_flush
spin-unlock
Signed-off-by: Catalin BOIE <catab@umbrella.ro>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
e100 seems to have had a long standing bug where e100_init_hw was being
called when it should not have been. This caused a panic due to recent
changes that rely on correct set up in the driver, and more robust error
paths.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
e100: e100 whitespace fixes
These are whitespace only fixes.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
e100: Handle the return values from pci_* functions
This is to resolve warnings during compile time.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
e100: Fix TX hang and RMCP Ping issue (due to a microcode loading issue)
Set the end of list bit to cause the hardware's transmit state machine to
work correctly and not prevent management (BMC) traffic.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Remove the "inline" keyword from a bunch of big functions in the kernel with
the goal of shrinking it by 30kb to 40kb
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We have identified two related bugs in the e100 driver.
Both bugs are related to manipulation of the MDI control register.
The first problem is that the Ready bit is being ignored when writing to
the Control register; we noticed this because the Linux bonding driver
would occasionally come to the spurious conclusion that the link was down
when querying Link State. It turned out that by failing to wait for a
previous command to complete it was selecting what was essentially a random
register in the MDI register set. When we added code that waits for the
Ready bit (as shown in the patch file below) all such problems ceased.
The second problem is that, although access to the MDI registers involves
multiple steps which must not be intermixed, nothing was defending against
two or more threads attempting simultaneous access. The most obvious
situation where such interference could occur involves the watchdog versus
ioctl paths, but there are probably others, so we recommend the locking
shown in our patch file.
Signed-off-by: Michael O'Donnell <Michael.ODonnell@stratus.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jeff Garzik <jgarzik@pobox.com>
Cc: John Ronciak <john.ronciak@intel.com>
Cc: Ganesh Venkatesan <ganesh.venkatesan@intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
For the four versions of hardware that we (currently) support microcode
download on, the default configuration of our receive interrupt mitigation
microcode was too aggressive, and caused unnecessary delays when pinging,
and low(er) throughput on single connection latency sensitive performance
tests.
This code adds microcode support, and sets the defaults to more reasonable
settings. It also explains the functionality in the code in more detail.
Compile and load tested, shows expected behavior for slight delay of ping
packets (1-2ms) when ucode is loaded, and decent interrupt moderation for
small packets, while maintaining good throughput.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
The patch below fixes the following sparse warnings:
drivers/net/e100.c:1481:13: warning: Using plain integer as NULL pointer
drivers/net/e100.c:1767:27: warning: Using plain integer as NULL pointer
drivers/net/e100.c:1847:27: warning: Using plain integer as NULL pointer
Signed-off-by: Luiz Capitulino <lcapitulino@mandriva.com.br>
Signed-off-by: John W. Linville <linville@tuxdriver.com>