Introduce uc_set_promisc flag to fix enabling of promisc mode
when exceeding the number of supported RAR entries.
Issue discovered by Ben Greear when using mac-vlans.
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver was calling the set Rx mode function for every multicast
filter set by the VF. When starting many VMs where each might have
multiple VLAN interfaces this would result in the function being
called hundreds or even thousands of times. This is unnecessary
for the case of the imperfect filters used in the MTA and has been
streamlined to be more efficient.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver is unnecessarily writing values to VLAN control registers.
These writes already done elsewhere and are superfluous here.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for the "ip link set" and "ip link show" commands that allow
configuration of the virtual functions' MAC and port VLAN via user space
command line.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a boolean parameter to ixgbe-set_vmolr so that the caller can
specify whether the pool should accept untagged packets. Required
for a follow on patch to enable administrative configuration of port
VLAN for virtual functions.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to an errata in 82598 parts MSI-X needs to be disabled
in certain ixgbe devices designed to transfer peer-to-peer
traffic on the PCIe bus. This patch sets the default
interrupt type to MSI rather than MSI-X for specific Cisco
ixgbe adapters.
Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com>
Acked-by: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds registers (,tx/rx rings' status and so on) printout
code just before resetting adapters. This will be helpful for detecting
the root cause of adapters reset.
Signed-off-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Nicholas Nunley <nicholasx.d.nunley@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Description: When using Intel smartspeed, the patch displays a
warning when the link down shifts to 1 Gig.
Signed-off-by: Anjali Singhai <anjali.singhai@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The way we were setting autoneg via ethtool was inconstant with that
of our other drivers. It will change the following:
If autoneg is off:
>ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: off
RX: off
TX: off
Before:
>ethtool -A eth0 autoneg on
>ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: off
RX: off
TX: off
Now:
>ethtool -A eth0 autoneg on
>ethtool -a eth0
Pause parameters for eth0:
Autonegotiate: on
RX: on
TX: on
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a small race between when the tx queues are stopped
and when netif_carrier_off() is called in ixgbe_down. If the
dev_watchdog() timer fires during this time it is possible for
a false tx timeout to occur.
This patch moves the netif_carrier_off() so that it is called before
the tx queues are stopped preventing the dev_watchdog timer from
detecting false tx timeouts. The race is seen occosionally when
FCoE or DCB settings are being configured or changed.
Testing note, running ifconfig up/down will not reproduce this
issue because dev_open/dev_close call dev_deactivate() and then
dev_activate().
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
writebacks can be held indefinitely by hardware if EITR=0, when
combined with TXDCTL.WTHRESH=8. When EITR=0, WTHRESH should be
set back to zero.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
82598/82599 can support EITR == 0, which allows for the
absolutely lowest latency setting in the hardware. This disables
writeback batching and anything else that relies upon a delayed
interrupt. This patch enables the feature of "override" when a
user sets rx-usecs to zero, the driver will respect that setting
over using RSC, and automatically disable RSC. If rx-usecs is
used to set the EITR value to 0, then the driver should disable
LRO (aka RSC) internally until EITR is set to non-zero again.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PHY laser is still on during driver init. It's allowing
garbage to hit our FIFO, which eventually can cause the entire
device to die. Power down the laser while setting up the device,
and re-enable the laser before getting link.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ixgbe driver was setting up 82598 hardware correctly, so that
when promiscuous mode was enabled hardware stripping was turned
off. But on 82599 the logic to disable/enable hardware stripping
is different, and the code was not updated correctly when the
hardware vlan stripping was enabled as default.
This change comprises the creation of two new helper functions
and calling them from the right locations to disable and enable
hardware stripping of vlan tags at appropriate times.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Converts the list and the core manipulating with it to be the same as uc_list.
+uses two functions for adding/removing mc address (normal and "global"
variant) instead of a function parameter.
+removes dev_mcast.c completely.
+exposes netdev_hw_addr_list_* macros along with __hw_addr_* functions for
manipulation with lists on a sandbox (used in bonding and 80211 drivers)
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
When running the offline diagnostic tests check to see if any VFs are
online. If so then only run the link test. This is necessary because
the VFs running in guest VMs aren't aware of when the PF is taken
offline for a diagnostic test. Also put a message to the system log
telling the system administrator to take the VFs offline manually if
(s)he wants to run a full diagnostic. Return 1 on each of the tests
not run to alert the user of the condition.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During FCF solicitation, the switch is supposed to pad the
solicited advertisement out to the endpoints specified
maximum FCoE frame size. That means that we need to receive
FIP frames that are larger than the standard MTU. To make
sure the receive queue is configured correctly, we should be
filtering FIP traffic into the FCoE queues.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently FIP (FCoE Initialization Protocol) frames
are going untagged. This causes various problems
with FCFs (switches) that have negotiated a priority
over dcbx. This patch tags FIP frames with the same
priority as the FCoE frames.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If the user buffer count was 256 the shift would place a 1
in the offset region leading to errors. It also overwrites
the uers buffer list. This patch makes sure that at most
256 user buffers are allowed for DDP and the buffer count
is masked properly such that it doesn't overwrite the offset
when shifting the bits.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Frank Zhang <frank_1.zhang@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the last patch I missed an unecessary min_t comparison.
This patch removes it, the path allocates at most
72 tx queues for 82599 and 24 for 82598 there is no need
for this check.
Additionally this sets MAX_[TX|RX]_QUEUES to 72. Which is
used as the size for the tx/rx_ring arrays. There is no
reason to have more tx_rings/rx_rings then num_tx_queues.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The clear_to_send flag is being cleared before the call to ping all
the VFs. It should be called after pinging all the VFs.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VFs running in guest VMs do not respond in as timely a manner to
PF indication it is going down as they do when running in the host
domain. If the adapter is in SR-IOV mode insert a two second delay
to guarantee that all VFs have had time to respond to the PF reset.
In any case resetting the PF while VFs are active should be
discouraged but if it must be done then there will be a two
second delay to help synchronize resets among the PF and all the
VFs.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Includes one minor indentation fix to placate checkpatch.
Signed-off-by: Frans Pop <elendil@planet.nl>
Cc: e1000-devel@lists.sourceforge.net
Signed-off-by: David S. Miller <davem@davemloft.net>
As per Simon Horman's feedback set IXGBE_RSC_CB(skb)->dma to zero
after unmapping HWRSC DMA address to avoid double freeing.
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently netdev_features_change is called before fcoe tx queues
setup is done, so this patch moves calling of netdev_features_change
after tx queues setup is done in ixgbe_init_interrupt_scheme, so
that real_num_tx_queues is updated correctly on each fcoe enable
or disable.
This allows additional fcoe queues updated correctly in vlan driver
for their correct queue selection.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Advanced Power Management is disabled for 82599 KX4 connections by
clearing GRC.APME bit, causing it to not wake the system from an
improper system shutdown. By default GRC.APME is enabled and software
is not supposed to clear these settings during adapter probe.
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix 82599 link issues during driver load and unload test using multi-speed
10G & 1G fiber modules. When connected back to back sometime 82599 multispeed
fiber modules would link at 1G speed instead of 10G highest speed, due to a
race condition in autotry process involving Tx laser flapping. Move autotry
autoneg-37 tx laser flapping process from multispeed module init setup
to driver unload. This will alert the link partner to restart its
autotry process when it tries to establish the link with the link partner
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename for_each_bit to for_each_set_bit in the kernel source tree. To
permit for_each_clear_bit(), should that ever be added.
The patch includes a macro to map the old for_each_bit() onto the new
for_each_set_bit(). This is a (very) temporary thing to ease the migration.
[akpm@linux-foundation.org: add temporary for_each_bit()]
Suggested-by: Alexey Dobriyan <adobriyan@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Artem Bityutskiy <dedekind@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move TC_PRIO_CONTROL check and queue remapping into
ixgbe_select_queue(). Remapping queues after the qdisc
can result in the wrong qdisc queue being stopped with
netif_stop_subqueue(). Even if this is resolved and the
correct queue is stopped it can result in a queue being
blocked by TC_PRIO_CONTROL frames uneccesarily. Moving
this into the select_queue routine maintains alignment
between tx_rings and qdisc queues.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of allocating 128 struct netdev_queue per device, use the
minimum value between 128 and the number of possible txq's, to
reduce ram usage and "tc -s -d class shod dev .." output.
This patch fixes Eric Dumazet's patch to set the TX queues to
the correct minimum.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Disabling TSO can cause the dev_watchdog timer to be triggered because
when TSO is disabled netif_tx_stop_all_queues is called. If the watchdog
timer fires while the queues are stopped and traffic has not recently been
sent on a paticular queue this is falsly identified as a hang and
ndo_tx_timeout() is called. This is ocossionally seen during testing.
This removes the netif_tx_stop_all_queues() it is not needed. The scheduler
submits skb's with dev_hard_start_xmit(), this checks if netif_needs_gso and
if so it calls dev_gso_segment. Disabling TSO will cause dev_hard_start_xmit()
to do the gso processing. However ixgbe does not use the features flags to
determine if it needs to use tso or not instead it uses skb->gso_size so
ixgbe will process these frames correctly regardless of the netdev features
flag.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Work around 82599 HW issue when HWRSC is enabled on IOMMU enabled
kernels. 82599 HW is updating the header information after setting the
descriptor to done, resulting DMA mapping/unmapping issues on IOMMU
enabled systems. To work around the issue delay unmapping of first packet
that carries the header information until end of packet is reached.
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PowerPC architecture does not require loads to independent bytes to be
ordered without adding an explicit barrier.
In ixgbe_clean_rx_irq we load the status bit then load the packet data.
With packet split disabled if these loads go out of order we get a
stale packet, but we will notice the bad sequence numbers and drop it.
The problem occurs with packet split enabled where the TCP/IP header and data
are in different descriptors. If the reads go out of order we may have data
that doesn't match the TCP/IP header. Since we use hardware checksumming this
bad data is never verified and it makes it all the way to the application.
This bug was found during stress testing and adding this barrier has been shown
to fix it.
Signed-off-by: Milton Miller <miltonm@bga.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to have the WUS register set to all 1's in order for the hardware
to be capable of ever waking up. Set it here in the ixgbe_probe().
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 82598 has an erratum that receipt of pause frames at 1G
could lead to a Tx Hang. To avoid this this patch disables
Rx FC while at 1G speed for all 82598 parts.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The recent n-tuple patches added some comments to the headers
of the Flow Director functions that aren't accurate. This
cleans them up, and is a purely cosmetic patch.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch replaces dev->mc_count in all drivers (hopefully I didn't miss
anything). Used spatch and did small tweaks and conding style changes when
it was suitable.
Jirka
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Driver has gone under significant changes, the version should
reflect that.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds n-tuple filter programming to 82599.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allocates the ring structures themselves on each
NUMA node along with the buffer_info structures. This way we
don't allocate the entire ring memory on a single node in one
big block, thus reducing NUMA node memory crosstalk.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The default policy for the current driver is to do all its memory
allocation on whatever processor is running insmod/modprobe. This
is less than optimal.
This driver's default mode of operation will be to use each node for each
subsequent transmit/receive queue. The most efficient allocation will be
to then have the interrupts bound in such a way as to match up the
interrupt of the queue to the cpu where its memory was allocated.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Processing multiple ixgbe_watchdog_task calls may cause
the link_up variable and IXGBE_FLAG_NEED_LINK_UPDATE flag
to be set incorrectly. In the worse case this is causing
the netif_carrier_off to be called inappropriately which
results in an interface that can't be brought up.
Although schedule_work() will only schedule the task if
it is not already on the work queue the WORK_STRUCT_PENDING
bits are cleared just before calling the work function.
This allows WORK_STRUCT_PENDING to be cleared, the work
function to start and meanwhile schedule another task.
This patch adds a mutex to the watchdog task. This bug is
actualized by changing DCB settings or doing extended
cable pull or reset tests.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a developer had complained of getting lots of warnings:
"eth16 selects TX queue 98, but real number of TX queues is 64"
http://www.mail-archive.com/e1000-devel@lists.sourceforge.net/msg02200.html
As there was no follow up on that bug, I am submitting this
patch assuming that the other return points will not return
invalid txq's, and also that this fixes the bug (not tested).
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit e5a43549f7 (ixgbe: remove
skb_dma_map/unmap calls from driver) looks to have introduced a bug in
ixgbe_tx_map. If we get an error from a PCI DMA call, we loop backwards
through count until it becomes -1 and return that.
The caller of ixgbe_tx_map expects 0 on error, so return that instead.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call ixgbe_copy_dcb_cfg() earlier in the ixgbe_dcbnl_set_all() so that
we can learn if this is going to fail as early as possible. Previously,
ixgbe_down or ixgbe_close were being called before this check and the
IXGBE_RESETTING bit was being set and cleared. Worse if this failed
the corresponding ixgbe_up/ndo_open would not called.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: PJ Waskiewicz <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set the correct bit BIT_PG_TX when tx PG settings are set.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Acked-by: PJ Waskiewicz <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces three macros to work with uc list from net drivers.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Found this problem when testing IPv6 from a KVM guest to a remote
host via e1000e device on the host.
The following patch fixes the check for IPv6 GSO packet in Intel
ethernet drivers to use skb_is_gso_v6(). SKB_GSO_DODGY is also set
when packets are forwarded from a guest.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Inadequate coordination between the PF driver and the VF driver results
in tx hangs in the VF driver when you perform certain actions that will
lead to a re-init of the PF. Add feature to notify active VFs when the PF
is about to re-initialize so that the VFs can take appropriate action.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PF Reset Done bit should not be set in the extended control register
until the PF has actually completed the bring up process. It is a mis-
interpretation of the purpose of this bit to assume it should be set
when the physical reset of the device is done. Instead it should be used
to indicate to the VFs when the PF is ready to provide them with required
services. This is not until after the PF is finished coming up and ready
to process mailbox events.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This data storage for SW emulated MAC addresses is unlikely to ever be used
so pull it.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When VFs are allocated (as indicated by adapter->num_vfs is non-zero) then
the PF pool is no longer zero. Instead it will be the same as the number
of VFs allocated. When setting the VLVF entry for the PF we need to use
the correct pool otherwise the PF will get VLAN packets from the wire
because the packet will pass VFTA filtering and the PF has the default
pool, but it will not get VLAN packets from the VFs because it has
not set the correct pool bit in the VLVF entry.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The variable count and i are unsigned so the (<|>=)0 tests do not work.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch resolves issues seen when running netconsole and rebooting via
reboot -f. The issue was due to the fact that we were attempting to
perform interrupt actions when the q_vectors and rings had already been
freed via the ixgbe_shutdown routines.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Skip MAC loopback test when the adapter is set to a VT mode such as SR-IOV
or VMDq
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds SR-IOV features supported by the 82599 controller to the main driver
module. If the CONFIG_PCI_IOV kernel option is selected then the SR-IOV
features are enabled. Use the max_vfs module option to allocate up to 63
virtual functions per physical port.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds code to the core 82599 module to support SR-IOV features of the 82599
network controller
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the mailbox and SR-IOV feature modules to the ixgbe driver Makefile.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This module and header file add functions to support SR-IOV features of the
82599 10Gbe controller.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds register definitions, bit definitions and structures used by
the driver to support SR-IOV features of the 82599 controller.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 82599 virtual function device and the master 82599 physical function
device implement a mailbox utility for communication between the devices
using some SRAM scratch memory and a doorbell/answering mechanism enabled
via interrupt and/or polling. This C module and accompanying header
file implement the base functions for use of this feature.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use DEFINE_PCI_DEVICE_TABLE() so we get place PCI ids table into correct section
in every case.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tc is still throwing a warning that is could be used
uninitialized. This fixes it, and properly formats the device ID
checks for the use of this variable.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the %pM kernel extension to display the MAC address.
The only difference in the output is that the MAC address is
shown in the usual colon-separated hex notation.
Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a pci_save_state() call in ixgbe_resume() after
pci_restore_state(). A similar change was made in ixgbe_io_slot_reset()
that accommodates pci_restore_state() new behavior. This change makes
pci_restore_state() clear the saved_state flag This is necessary due
to a resent kernel change to pci_restore_state() so that it now clears
the saved_state flag of the device right after the device.s standard
configuration registers have been poplulated with the previously saved
values.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the 82598 is fed 802.1q packets, it chokes with
an error of the form:
ixgbe: eth0: ixgbe_tx_csum: partial checksum but proto=81!
As the logic there was not smart enough to look into
the vlan header to pick out the encapsulated protocol.
There are times when we'd like to send these packets
out without having to configure a vlan on the interface.
Here we check for the vlan tag and allow the packet to
go out with the correct hardware checksum.
This patch is a clone of a previously submitted patch by
Arthur Jones <ajones@riverbed.com> for igb (Commit -
fa4a7ef36e).
Signed-off-by: Gurucharan Shetty <gshetty@riverbed.com>
Signed-off-by: Arthur Jones <ajones@riverbed.com>
Acked-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modified patch with Dave's comments to replace mdelay with proper msleep.
Fix 82598 copper link issue, where the phy prematurely indicates link
before it is ready to process packets. The new function looks for phy
link and indicates that, when it is available. If phy is not ready
within few seconds of MAC indicating link, the function will return
failure which translates to link down indication.
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the restart_queue and non_eop_desc counters from being
double-counted. They are cumulative in each ring, so we don't want to
add them to the cumulative result in the adapter's master counter.
Otherwise, the stats will be inaccurate
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is an updated version, because ixgbe_get_ethtool_stats()
needs to call dev_get_stats() or "ethtool -S" wont give
correct tx_bytes/tx_packets values.
Several cpus can update netdev->stats.tx_bytes & netdev->stats.tx_packets
in parallel. In this case, TX stats are under estimated and false sharing
takes place.
After a pktgen session sending exactly 200000000 packets :
# ifconfig fiber0 | grep TX
TX packets:198501982 errors:0 dropped:0 overruns:0 carrier:0
Multi queue devices should instead use txq->tx_bytes & txq->tx_packets
in their xmit() method (appropriate txq lock already held by caller, no
cache line miss), or use appropriate locking.
After patch, same pktgen session gives :
# ifconfig fiber0 | grep TX
TX packets:200000000 errors:0 dropped:0 overruns:0 carrier:0
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
when disabling interrupts, driver was writing with IO, this is no
necessary because on ixgbe parts the hardware can "oneshot"
disable and clear the interrupt. So on 82598/82599 use of EIAM
should avoid one posted write per interrupt when in MSI-X mode.
This should improve performance and seems to in my limited
testing, reduce CPU utilization VERY slightly.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drop variables that had cache lines modified in simultaneous hot paths.
keep some variables modified on hot paths but make their storage per queue.
cache align DMA data buffer start addresses.
cache align (padding) some structures that end within a cacheline.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
decrease the memory utilization of the tx / rx queue allocation
by changing the default ring size to 512 (from 1024). At
1024 rx entries of 2KB each (from 4kB slab) with 16 queues
ixgbe was using 64 MB of memory per port, which is not
necessary.
Users can still change queue lengths with ethtool -k.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes the Tx queue selection for FCoE traffic from ixgbe_xmit_frame()
and does it in the ndo_select_queue() call, moving all Tx queue selection
into a single routine.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Store the user priority for FCoE and use it directly for outgoing
FCoE traffic when DCB is enabled.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes skb_dma_map/unmap calls from the ixgbe driver due to the
fact that the calls don't work with HW IOMMU enabled systems. The problem
is that multiple mappings will give different results when HW IOMMU is
enabled and the skb_dma_map/unmap calls only have one location to store
mappings.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch extends the ethtool interface to display what PHY
is currently connected to a NIC. The results can be viewed in
ethtool ethX output.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes an issue when clearing out the RAR entries. If RAR[0]
is the only address in use, don't clear the others.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
82598 shouldn't try and access LINKS2 while configuring
link and flow control. This is an 82599-only register.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Flow Control autoneg should be disabled for certain adapters
that don't support autonegotiation of Flow Control at 10 gigabit.
These interfaces are the 10GBASE-T devices, CX4, and SFP+, all
running at 10 gigabit only. 1 gigabit is fine.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver was doing a divide by zero when adjusting tx-usecs.
This patch removes the divide by zero code and changes the logic slightly
to ignore tx-usecs in the case of shared TxRx vectors.
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There appears to be a stray setting of the VFE bit when registering vlans.
This should not be done as vlan filtering should be enabled any time the
interface is not in promiscous mode
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While configuring rscctl use rx buffer length from rx ring structure
instead of passing rx_buf_len to ixgbe_configure_rscctl
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Divide 82599 HWRSC counters into aggregated and flushed to count number of
packets getting coalesced per TCP connection.
Signed-off-by: Mallikarjuna R Chilakala <mallikarjuna.chilakala@intel.com>
Acked-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tc is required by CONFIG_IXGBE_DCB.
This also fixes compilation warning:
drivers/net/ixgbe/ixgbe_main.c: In function ‘ixgbe_tx_is_paused’:
drivers/net/ixgbe/ixgbe_main.c:245: warning: unused variable ‘tc’
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Not as fancy as coccinelle. Checkpatch errors ignored.
Compile tested allyesconfig x86, not all files compiled.
grep -rPl --include=*.[ch] "\brequest_irq\s*\([^,\)]+,\s*\&" drivers/net | while read file ; do \
perl -i -e 'local $/; while (<>) { s@(\brequest_irq\s*\([^,\)]+,\s*)\&@\1@g ; print ; }' $file ;\
done
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>