1
Commit Graph

25658 Commits

Author SHA1 Message Date
Nicolas Dichtel
a12c9a8581 sit/rtnl: add missing parameters on dump
IFLA_IPTUN_FLAGS and IFLA_IPTUN_PMTUDISC were missing.
There is only one possible flag in i_flag: SIT_ISATAP.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 22:02:38 -05:00
Nicolas Dichtel
f9cd5a5536 sit: always notify change when params are updated
netdev_state_change() was called only when end points or link was updated. Now
that all parameters are advertised via netlink, we must advertise any change.

This patch also prepares the support of sit tunnels management via rtnl. The
code which update tunnels will be put in a new function.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 22:02:38 -05:00
Nicolas Dichtel
be42da0e10 ipip: add support of link creation via rtnl
This patch add the support of 'ip link .. type ipip'.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 22:02:38 -05:00
Nicolas Dichtel
befe2aa1b2 ipip/rtnl: add IFLA_IPTUN_PMTUDISC on dump
This parameter was missing in the dump.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 22:02:38 -05:00
Nicolas Dichtel
c38cc4b599 ipip: always notify change when params are updated
netdev_state_change() was called only when end points or link was updated. Now
that all parameters are advertised via netlink, we must advertise any change.

This patch also prepares the support of ipip tunnels management via rtnl. The
code which update tunnels will be put in a new function.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 22:02:38 -05:00
Nicolas Dichtel
0b11245722 ip6tnl: add support of link creation via rtnl
This patch add the support of 'ip link .. type ip6tnl'.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 22:02:37 -05:00
Nicolas Dichtel
b58d731acc ip6tnl: rename rtnl functions for consistency
Functions in this file start with ip6_tnl_.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 22:02:37 -05:00
Nicolas Dichtel
cfa323b6b9 ip6tnl/rtnl: add IFLA_IPTUN_PROTO on dump
IPv6 tunnels can have three mode: 4in6, 6in6 and xin6.
This information was missing in the netlink message.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 22:02:37 -05:00
Saurabh Mohan
b2942004fb ipv4/ip_vti.c: VTI fix post-decryption forwarding
With the latest kernel there are two things that must be done post decryption
 so that the packet are forwarded.
 1. Remove the mark from the packet. This will cause the packet to not match
 the ipsec-policy again. However doing this causes the post-decryption check to
 fail also and the packet will get dropped. (cat /proc/net/xfrm_stat).
 2. Remove the sp association in the skbuff so that no policy check is done on
 the packet for VTI tunnels.

Due to #2 above we must now do a security-policy check in the vti rcv path
prior to resetting the mark in the skbuff.

Signed-off-by: Saurabh Mohan <saurabh.mohan@vyatta.com>
Reported-by: Ruben Herold <ruben@puettmann.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 21:40:21 -05:00
stephen hemminger
1007dd1aa5 bridge: add root port blocking
This is Linux bridge implementation of root port guard.
If BPDU is received from a leaf (edge) port, it should not
be elected as root port.

Why would you want to do this?
If using STP on a bridge and the downstream bridges are not fully
trusted; this prevents a hostile guest for rerouting traffic.

Why not just use netfilter?
Netfilter does not track of follow spanning tree decisions.
It would be difficult and error prone to try and mirror STP
resolution in netfilter module.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 20:20:44 -05:00
stephen hemminger
a2e01a65cd bridge: implement BPDU blocking
This is Linux bridge implementation of STP protection
(Cisco BPDU guard/Juniper BPDU block). BPDU block disables
the bridge port if a STP BPDU packet is received.

Why would you want to do this?
If running Spanning Tree on bridge, hostile devices on the network
may send BPDU and cause network failure. Enabling bpdu block
will detect and stop this.

How to recover the port?
The port will be restarted if link is brought down, or
removed and reattached.  For example:
 # ip li set dev eth0 down; ip li set dev eth0 up

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 20:20:44 -05:00
stephen hemminger
cd7537326e bridge: add template for bridge port flags
Provide macro to build sysfs data structures and functions
for accessing flag bits.  If flag bits change do netlink
notification.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 20:20:44 -05:00
stephen hemminger
25c71c75ac bridge: bridge port parameters over netlink
Expose bridge port parameter over netlink. By switching to a nested
message, this can be used for other bridge parameters.

This changes IFLA_PROTINFO attribute from one byte to a full nested
set of attributes. This is safe for application interface because the
old message used IFLA_PROTINFO and new one uses
 IFLA_PROTINFO | NLA_F_NESTED.

The code adapts to old format requests, and therefore stays
compatible with user mode RSTP daemon. Since the type field
for nested and unnested attributes are different, and the old
code in libnetlink doesn't do the mask, it is also safe to use
with old versions of bridge monitor command.

Note: although mode is only a boolean, treating it as a
full byte since in the future someone will probably want to add more
values (like macvlan has).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 20:20:44 -05:00
Li RongQing
c75ea26040 ipv6: remove obsolete comments in route.c
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 18:51:45 -05:00
Amerigo Wang
e086cadc08 net: unify for_each_ip_tunnel_rcu()
The defitions of for_each_ip_tunnel_rcu() are same,
so unify it. Also, don't hide the parameter 't'.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 18:49:50 -05:00
Amerigo Wang
aa0010f880 net: convert __IPTUNNEL_XMIT() to an inline function
__IPTUNNEL_XMIT() is an ugly macro, convert it to a static
inline function, so make it more readable.

IPTUNNEL_XMIT() is unused, just remove it.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-14 18:49:50 -05:00
Sven Eckelmann
170173bf37 batman-adv: Remove instant overwritten variable initialization
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:37 +01:00
Antonio Quartulli
068ee6e204 batman-adv: roaming handling mechanism redesign
This patch allows clients to roam multiple times within the same
originator-interval.

To enable this new feature two key aspects that have been introduced:
1) packets are always directed to the node that was originally
serving the roamed client which will then re-route the data
to the correct destination at any point in time;
2) the client flags handling mechanism has been properly modified
in order to allow multiple roamings withinin the same orig-int.
Therefore flags are now set properly even in this scenario.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:37 +01:00
Antonio Quartulli
be73b488d1 batman-adv: refactor tt_global_del_struct()
batadv_tt_global_del_struct() function is not properly named.
Having a more meaningful name which reflects the current behavior helps other
developers to easily understand what it does.

A parameter has also been renamed in order to let the function header better fit
the 80-chars line-width

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:36 +01:00
Antonio Quartulli
47c94655c3 batman-adv: refactor code to simplify long lines
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:36 +01:00
Antonio Quartulli
7c1fd91da5 batman-adv: substitute tt_poss_change with a per-tt_entry flag
tt_poss_change is a node-wide flag which tells whether the node is in a roaming
state (a client recently moved to/away from it) in order to let it apply special
re-routing rules. However this flag does not give a clear idea of the current
state because it is not possible to understand *which client* is actually
involved in the roaming. For this reason a better approach has been chosen:
instead of using a node-wide variable, the roaming state is now given by a
per-tt_entry ROAM flag which, in case of packet coming through the node, tells
the node whether the real destination is in roaming state or not.

With this flag change, batadv_check_unicast_ttvn() has also been rearranged in
order to better fit the new re-routing logic and to be much more readable.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:35 +01:00
Simon Wunderlich
28709878b6 batman-adv: wait multiple periods before activating bla
For some reasons (bridge forward delay, network device setup order, etc)
the initial bridge loop avoidance announcement packets may be lost. This
may lead to problems in finding other backbone gws, and therfore create
loops in the startup time.

Fix this by extending the waiting periods to 3 (define can be changed)
before allowing broadcast traffic.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:35 +01:00
Simon Wunderlich
d807f27287 batman-adv: allow bla traffic only after first worker period
When adding a backbone gateway for the first time, it might not yet
be known in the backbone, and therefore we should not forward
broadcasts yet. This behaviour is the same as when sending a request
to another backbone gw because of a CRC mismatch. The backbone gw
will operate normal after the next periodic bla work.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:34 +01:00
Simon Wunderlich
52aebd6a9f batman-adv: send announcement when backbone gw is registered
To avoid loops in the startup phase until the first announcement is
sent, send an announcement immediately as soon as a backbone gw is
added.

This may happen due to various reasons, e.g. a packet passes the rx
or tx path.

Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:33 +01:00
Antonio Quartulli
b7eddd0b39 batman-adv: prevent using any virtual device created on batman-adv as hard-interface
Any virtual device created on top of a batman-adv mesh interface must be
prevented to be used to create a new mesh network (this would lead to an
unwanted batman-over-batman configuration)

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:33 +01:00
Antonio Quartulli
a7528f8ddd batman-adv: fix wrong spinlock inline comment
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:32 +01:00
Simon Wunderlich
07568d0369 batman-adv: don't rely on positions in struct for hashing
The hash functions in the bridge loop avoidance code expects the
VLAN vid to be right after the mac address, but this is not guaranteed.

Fix this by explicitly hashing over the right fields of the struct.

Reported-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-14 21:00:32 +01:00
John W. Linville
485f2b7f5f Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2012-11-14 14:17:58 -05:00
John W. Linville
bd2a813074 Merge branch 'for-john' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 2012-11-14 14:15:43 -05:00
John W. Linville
5bdf502dd9 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2012-11-14 13:33:43 -05:00
Hannes Frederic Sowa
d4596bad2a ipv6: setsockopt(IPIPPROTO_IPV6, IPV6_MINHOPCOUNT) forgot to set return value
Cc: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-13 14:38:47 -05:00
Eric Dumazet
bd090dfc63 tcp: tcp_replace_ts_recent() should not be called from tcp_validate_incoming()
We added support for RFC 5961 in latest kernels but TCP fails
to perform exhaustive check of ACK sequence.

We can update our view of peer tsval from a frame that is
later discarded by tcp_ack()

This makes timestamps enabled sessions vulnerable to injection of
a high tsval : peers start an ACK storm, since the victim
sends a dupack each time it receives an ACK from the other peer.

As tcp_validate_incoming() is called before tcp_ack(), we should
not peform tcp_replace_ts_recent() from it, and let callers do it
at the right time.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: Romain Francoise <romain@orebokech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-13 14:35:17 -05:00
Hannes Frederic Sowa
5cb04436ee ipv6: add knob to send unsolicited ND on link-layer address change
This patch introduces a new knob ndisc_notify. If enabled, the kernel
will transmit an unsolicited neighbour advertisement on link-layer address
change to update the neighbour tables of the corresponding hosts more quickly.

This is the equivalent to arp_notify in ipv4 world.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-13 14:27:45 -05:00
Florian Westphal
d3976a53ce netfilter: ipv6: only provide sk_bound_dev_if for link-local addr
yoshfuji points out that sk_bound_dev_if should only be provided
for link-local addresses.

IPv6 getpeer/sockname also has this test, i.e. we will now
only set sin6_scope_id if the original(!) destination
was a link-local address.

Reported-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-11-13 13:42:29 +01:00
YOSHIFUJI Hideaki / 吉藤英明
9fafd65ad4 ipv6 ndisc: Use pre-defined in6addr_linklocal_allnodes.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-12 15:23:21 -05:00
Johannes Berg
43c771a196 wireless: allow 40 MHz on world roaming channels 12/13
When in world roaming mode, allow 40 MHz to be used
on channels 12 and 13 so that an AP that is, e.g.,
using HT40+ on channel 9 (in the UK) can be used.

Cc: stable@vger.kernel.org
Reported-by: Eddie Chapman <eddie@ehuk.net>
Tested-by: Eddie Chapman <eddie@ehuk.net>
Acked-by: Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-12 16:26:06 +01:00
Xi Wang
0c9f79be29 ipv4: avoid undefined behavior in do_ip_setsockopt()
(1<<optname) is undefined behavior in C with a negative optname or
optname larger than 31.  In those cases the result of the shift is
not necessarily zero (e.g., on x86).

This patch simplifies the code with a switch statement on optname.
It also allows the compiler to generate better code (e.g., using a
64-bit mask).

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-11 17:53:13 -05:00
David S. Miller
d4185bbf62 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c

Minor conflict between the BCM_CNIC define removal in net-next
and a bug fix added to net.  Based upon a conflict resolution
patch posted by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-10 18:32:51 -05:00
Linus Torvalds
b251f0f399 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Bug fixes galore, mostly in drivers as is often the case:

  1) USB gadget and cdc_eem drivers need adjustments to their frame size
     lengths in order to handle VLANs correctly.  From Ian Coolidge.

  2) TIPC and several network drivers erroneously call tasklet_disable
     before tasklet_kill, fix from Xiaotian Feng.

  3) r8169 driver needs to apply the WOL suspend quirk to more chipsets,
     fix from Cyril Brulebois.

  4) Fix multicast filters on RTL_GIGA_MAC_VER_35 r8169 chips, from
     Nathan Walp.

  5) FDB netlink dumps should use RTM_NEWNEIGH as the message type, not
     zero.  From John Fastabend.

  6) Fix smsc95xx tx checksum offload on big-endian, from Steve
     Glendinning.

  7) __inet_diag_dump() needs to repsect and report the error value
     returned from inet_diag_lock_handler() rather than ignore it.
     Otherwise if an inet diag handler is not available for a particular
     protocol, we essentially report success instead of giving an error
     indication.  Fix from Cyrill Gorcunov.

  8) When the QFQ packet scheduler sees TSO/GSO packets it does not
     handle things properly, and in fact ends up corrupting it's
     datastructures as well as mis-schedule packets.  Fix from Paolo
     Valente.

  9) Fix oopser in skb_loop_sk(), from Eric Leblond.

  10) CXGB4 passes partially uninitialized datastructures in to FW
      commands, fix from Vipul Pandya.

  11) When we send unsolicited ipv6 neighbour advertisements, we should
      send them to the link-local allnodes multicast address, as per
      RFC4861.  Fix from Hannes Frederic Sowa.

  12) There is some kind of bug in the usbnet's kevent deferral
      mechanism, but more immediately when it triggers an uncontrolled
      stream of kernel messages spam the log.  Rate limit the error log
      message triggered when this problem occurs, as sending thousands
      of error messages into the kernel log doesn't help matters at all,
      and in fact makes further diagnosis more difficult.

      From Steve Glendinning.

  13) Fix gianfar restore from hibernation, from Wang Dongsheng.

  14) The netlink message attribute sizes are wrong in the ipv6 GRE
      driver, it was using the size of ipv4 addresses instead of ipv6
      ones :-) Fix from Nicolas Dichtel."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  gre6: fix rtnl dump messages
  gianfar: ethernet vanishes after restoring from hibernation
  usbnet: ratelimit kevent may have been dropped warnings
  ipv6: send unsolicited neighbour advertisements to all-nodes
  net: usb: cdc_eem: Fix rx skb allocation for 802.1Q VLANs
  usb: gadget: g_ether: fix frame size check for 802.1Q
  cxgb4: Fix initialization of SGE_CONTROL register
  isdn: Make CONFIG_ISDN depend on CONFIG_NETDEVICES
  cxgb4: Initialize data structures before using.
  af-packet: fix oops when socket is not present
  pkt_sched: enable QFQ to support TSO/GSO
  net: inet_diag -- Return error code if protocol handler is missed
  net: bnx2x: Fix typo in bnx2x driver
  smsc95xx: fix tx checksum offload for big endian
  rtnetlink: Use nlmsg type RTM_NEWNEIGH from dflt fdb dump
  ptp: update adjfreq callback description
  r8169: allow multicast packets on sub-8168f chipset.
  r8169: Fix WoL on RTL8168d/8111d.
  drivers/net: use tasklet_kill in device remove/close process
  tipc: do not use tasklet_disable before tasklet_kill
2012-11-10 22:03:49 +01:00
Felix Fietkau
1f98ab7fef mac80211: call skb_dequeue/ieee80211_free_txskb instead of __skb_queue_purge
Fixes more wifi status skb leaks, leading to hostapd/wpa_supplicant hangs.

Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-10 21:26:28 +01:00
Nicolas Dichtel
c075b13098 ip6tnl: advertise tunnel param via rtnl
It is usefull for daemons that monitor link event to have the full parameters of
these interfaces when a rtnl message is sent.
It allows also to dump them via rtnetlink.

It is based on what is done for GRE tunnels.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 19:36:20 -05:00
Nicolas Dichtel
ba3e3f50a0 sit: advertise tunnel param via rtnl
It is usefull for daemons that monitor link event to have the full parameters of
these interfaces when a rtnl message is sent.
It allows also to dump them via rtnetlink.

It is based on what is done for GRE tunnels.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 19:36:20 -05:00
Nicolas Dichtel
0974658da4 ipip: advertise tunnel param via rtnl
It is usefull for daemons that monitor link event to have the full parameters of
these interfaces when a rtnl message is sent.
It allows also to dump them via rtnetlink.

It is based on what is done for GRE tunnels.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 19:36:20 -05:00
Nicolas Dichtel
a375413311 gre6: fix rtnl dump messages
Spotted after a code review.
Introduced by c12b395a46 (gre: Support GRE over
IPv6).

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 17:11:17 -05:00
Eric Dumazet
c3b89fbba3 ipip: add GSO support
In commit 6b78f16e4b (gre: add GSO support) we added GSO support to GRE
tunnels.

This patch does the same for IPIP tunnels.

Performance of single TCP flow over an IPIP tunnel is increased by 40%

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 17:01:05 -05:00
Hannes Frederic Sowa
60713a0ca7 ipv6: send unsolicited neighbour advertisements to all-nodes
As documented in RFC4861 (Neighbor Discovery for IP version 6) 7.2.6.,
unsolicited neighbour advertisements should be sent to the all-nodes
multicast address.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-09 16:18:52 -05:00
Johannes Berg
20f544eea0 mac80211: don't send null data packet when not associated
On resume or firmware recovery, mac80211 sends a null
data packet to see if the AP is still around and hasn't
disconnected us. However, it always does this even if
it wasn't even connected before, leading to a warning
in the new channel context code. Fix this by checking
that it's associated.

Cc: stable@vger.kernel.org
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-09 17:31:47 +01:00
Johan Hedberg
482049f757 Bluetooth: Fix memory leak when removing a UUID
When removing a UUID from the list in the remove_uuid() function we must
also kfree the entry in addition to removing it from the list.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2012-11-09 16:45:37 +01:00
Marcel Holtmann
fbe96d6ff9 Bluetooth: Notify about device registration before power on
It is important that the monitor interface gets notified about
a new device before its power on procedure has been started.

For some reason that is no longer working as expected and the power
on procedure runs first. It is safe to just notify about device
registration and trigger the power on procedure afterwards.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2012-11-09 16:45:37 +01:00
Paulo Sérgio
896ea28ea8 Bluetooth: Fix error status when pairing fails
When pairing fails due to wrong confirm value, the management layer
doesn't report a proper error status. It sends
MGMT_STATUS_CONNECT_FAILED instead of MGMT_STATUS_AUTH_FAILED.

Most of management functions that receive a status as a parameter
expects for it to be encoded as a HCI status. But when a SMP pairing
fails, the SMP layer sends the SMP reason as the error status to the
management layer.

This commit maps all SMP reasons to HCI_ERROR_AUTH_FAILURE, which will
be converted to MGMT_STATUS_AUTH_FAILED in the management layer.

Reported-by: Claudio Takahasi <claudio.takahasi@openbossa.org>
Reviewed-by: João Paulo Rechi Vita <jprvita@openbossa.org>
Signed-off-by: Paulo Sérgio <paulo.sergio@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2012-11-09 16:45:37 +01:00
Johan Hedberg
476e44cb19 Bluetooth: Fix having bogus entries in mgmt_read_index_list reply
The mgmt_read_index_list uses one loop to calculate the max needed size
of its response with the help of an upper-bound of the controller count.
The second loop is more strict as it checks for HCI_SETUP (which might
have gotten set after the first loop) and could result in some indexes
being skipped. Because of this the function needs to readjust the event
length and index count after filling in the response array.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: stable@vger.kernel.org
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
2012-11-09 16:45:37 +01:00
Johannes Berg
cfff2f999d mac80211: fix memory leak in device registration error path
If the cipher suites need to be allocated, but this
allocation fails, this leaks the internal scan request.
Fix that by going to the correct error handling label.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-09 09:48:43 +01:00
Doug Goldstein
e949b09b71 vlan: set sysfs device_type to 'vlan'
Sets the sysfs device_type to 'vlan' for udev. This makes it easier for
applications that query network information via udev to identify vlans
instead of using strrchr().

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-08 22:02:23 -05:00
Li RongQing
a4477c4ddb ipv6: remove rt6i_peer_genid from rt6_info and its handler
6431cbc25f(Create a mechanism for upward inetpeer propagation into routes)
introduces these codes, but this mechanism is never enabled since
rt6i_peer_genid always is zero whether it is not assigned or assigned by
rt6_peer_genid(). After 5943634fc5 (ipv4: Maintain redirect and PMTU info
in struct rtable again), the ipv4 related codes of this mechanism has been
removed, I think we maybe able to remove them now.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-08 21:16:08 -05:00
David S. Miller
f1e0b5b4f1 Included changes:
- minimal fixes to the packet layout to avoid the __packed attribute when not
   needed
 - new packet type called UNICAST_4ADDR: in this packet it is possible to find
   both source and destination node (in the classic UNICAST header only the
   destination field exists).
 - a new feature: Distributed ARP Table (D.A.T.). It aims to reduce ARP lookups
   latency by means of a simil-DHT approach.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iEYEABECAAYFAlCasB8ACgkQpGgxIkP9cwd2kACeOhRvj9EA2hWo1vKQUN6OvTi0
 nOcAn08Rvyf+gUPAFQcA9HEUAr8kAebT
 =c3ey
 -----END PGP SIGNATURE-----

Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- minimal fixes to the packet layout to avoid the __packed attribute when not
  needed
- new packet type called UNICAST_4ADDR: in this packet it is possible to find
  both source and destination node (in the classic UNICAST header only the
  destination field exists).
- a new feature: Distributed ARP Table (D.A.T.). It aims to reduce ARP lookups
  latency by means of a simil-DHT approach.
2012-11-07 19:08:42 -05:00
Nicolas Dichtel
b20b6d9726 ndisc: fix a typo in a comment in ndisc_recv_na()
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-07 19:03:16 -05:00
Paul Chavent
5920cd3a41 packet: tx_ring: allow the user to choose tx data offset
The tx data offset of packet mmap tx ring used to be :
(TPACKET2_HDRLEN - sizeof(struct sockaddr_ll))

The problem is that, with SOCK_RAW socket, the payload (14 bytes after
the beginning of the user data) is misaligned.

This patch allows to let the user gives an offset for it's tx data if
he desires.

Set sock option PACKET_TX_HAS_OFF to 1, then specify in each frame of
your tx ring tp_net for SOCK_DGRAM, or tp_mac for SOCK_RAW.

Signed-off-by: Paul Chavent <paul.chavent@onera.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-07 18:54:30 -05:00
Eric Leblond
a3d744e995 af-packet: fix oops when socket is not present
Due to a NULL dereference, the following patch is causing oops
in normal trafic condition:

commit c0de08d042
Author: Eric Leblond <eric@regit.org>
Date:   Thu Aug 16 22:02:58 2012 +0000

    af_packet: don't emit packet on orig fanout group

This buggy patch was a feature fix and has reached most stable
branches.

When skb->sk is NULL and when packet fanout is used, there is a
crash in match_fanout_group where skb->sk is accessed.
This patch fixes the issue by returning false as soon as the
socket is NULL: this correspond to the wanted behavior because
the kernel as to resend the skb to all the listening socket in
this case.

Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-07 15:40:14 -05:00
Paolo Valente
3015f3d2a3 pkt_sched: enable QFQ to support TSO/GSO
If the max packet size for some class (configured through tc) is
violated by the actual size of the packets of that class, then QFQ
would not schedule classes correctly, and the data structures
implementing the bucket lists may get corrupted. This problem occurs
with TSO/GSO even if the max packet size is set to the MTU, and is,
e.g., the cause of the failure reported in [1]. Two patches have been
proposed to solve this problem in [2], one of them is a preliminary
version of this patch.

This patch addresses the above issues by: 1) setting QFQ parameters to
proper values for supporting TSO/GSO (in particular, setting the
maximum possible packet size to 64KB), 2) automatically increasing the
max packet size for a class, lmax, when a packet with a larger size
than the current value of lmax arrives.

The drawback of the first point is that the maximum weight for a class
is now limited to 4096, which is equal to 1/16 of the maximum weight
sum.

Finally, this patch also forcibly caps the timestamps of a class if
they are too high to be stored in the bucket list. This capping, taken
from QFQ+ [3], handles the unfrequent case described in the comment to
the function slot_insert.

[1] http://marc.info/?l=linux-netdev&m=134968777902077&w=2
[2] http://marc.info/?l=linux-netdev&m=135096573507936&w=2
[3] http://marc.info/?l=linux-netdev&m=134902691421670&w=2

Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Tested-by: Cong Wang <amwang@redhat.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-07 15:37:04 -05:00
Antonio Quartulli
9affec6be8 batman-adv: enable fast client detection using unicast_4addr packets
The "early client detection mechanism" can be extended to find new clients by
means of unicast_4addr packets.

The unicast_4addr packet contains as well as the broadcast packet (which is
currently used in this mechanism) the address of the originating node and can
therefore be used to install new entries in the Global Translation Table

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
2012-11-07 20:00:24 +01:00
Martin Hundebøll
4046b24afa batman-adv: Add get_ethtool_stats() support for DAT
Added additional counters for D.A.T.

Signed-off-by: Martin Hundebøll <martin@hundeboll.net>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:23 +01:00
Antonio Quartulli
33af49ad8a batman-adv: Distributed ARP Table - add runtime switch
This patch adds a runtime switch that enables the user to turn the DAT feature
on or off at runtime

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:22 +01:00
Antonio Quartulli
1722447482 batman-adv: Distributed ARP Table - add compile option
This patch makes it possible to decide whether to include DAT within the
batman-adv binary or not.
It is extremely useful when the user wants to reduce the size of the resulting
module by cutting off any not needed feature.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:22 +01:00
Antonio Quartulli
c384ea3ec9 batman-adv: Distributed ARP Table - add snooping functions for ARP messages
In case of an ARP message going in or out the soft_iface, it is intercepted and
a special action is performed. In particular the DHT helper functions previously
implemented are used to store all the ARP entries belonging to the network in
order to provide a fast and unicast lookup instead of the classic broadcast
flooding mechanism.
Each node stores the entries it is responsible for (following the DHT rules) in
its soft_iface ARP table. This makes it possible to reuse the kernel data
structures and functions for ARP management.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:21 +01:00
Antonio Quartulli
5c3a0e5535 batman-adv: Distributed ARP Table - add ARP parsing functions
ARP messages are now parsed to make it possible to trigger special actions
depending on their types (snooping).

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:20 +01:00
Antonio Quartulli
2f1dfbe185 batman-adv: Distributed ARP Table - implement local storage
Since batman-adv cannot inter-operate with the host ARP table, this patch
introduces a batman-adv private storage for ARP entries exchanged within DAT.
This storage will represent the node local cache in the DAT protocol.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:20 +01:00
Antonio Quartulli
785ea11441 batman-adv: Distributed ARP Table - create DHT helper functions
Add all the relevant functions in order to manage a Distributed Hash Table over
the B.A.T.M.A.N.-adv network. It will later be used to store several ARP entries
and implement DAT (Distributed ARP Table)

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:19 +01:00
Antonio Quartulli
0e861a3c4f batman-adv: Distributed ARP Table - add a new debug log level
A new log level has been added to concentrate messages regarding DAT: ARP
snooping, requests, response and DHT related messages.
The new log level is named BATADV_DBG_DAT

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:18 +01:00
Antonio Quartulli
7cdcf6dddc batman-adv: add UNICAST_4ADDR packet type
The current unicast packet type does not contain the orig source address. This
patches add a new unicast packet (called UNICAST_4ADDR) which provides two new
fields: the originator source address and the subtype (the type of the data
contained in the packet payload). The former is useful to identify the node
which injected the packet into the network and the latter is useful to avoid
creating new unicast packet types in the future: a macro defining a new subtype
will be enough.

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:18 +01:00
Sven Eckelmann
f6c57a4609 batman-adv: Mark correctly aligned headers not as __packed
Headers which are already perfectly aligned and create a 4 byte boundary
non-ethernet header payload can have the __packed attribute removed. The
__packed attribute doesn't change the appeareance of the packet for these
headers because no extra padding is necessary to align the data members. The
compiler will also create slightly faster code for loads of multi-byte members.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:17 +01:00
Sven Eckelmann
5b24657443 batman-adv: Reserve extra bytes in skb for better alignment
The ethernet header is 14 bytes long. Therefore, the data after it is not 4
byte aligned and may cause problems on systems without unaligned data access.
Reserving NET_IP_ALIGN more byes can fix the misalignment of the ethernet
header.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
2012-11-07 20:00:16 +01:00
Eric Dumazet
196d97f6b1 htb: fix two bugs
Commit 56b765b79e (htb: improved accuracy at high rates)
introduced two bugs :

1) one bstats_update() was inadvertently removed from
   htb_dequeue_tree(), breaking statistics/rate estimation.

2) Missing qdisc_put_rtab() calls in htb_change_class(),
   leaking kernel memory, now struct htb_class no longer
   retains pointers to qdisc_rate_table structs.

   Since only rate is used, dont use qdisc_get_rtab() calls
   copying data we ignore anyway.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vimalkumar <j.vimal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-06 19:06:29 -05:00
Arik Nemtsov
987c285c2a mac80211: sync acccess to tx_filtered/ps_tx_buf queues
These are accessed without a lock when ending STA PSM. If the
sta_cleanup timer accesses these lists at the same time, we might crash.

This may fix some mysterious crashes we had during
ieee80211_sta_ps_deliver_wakeup.

Cc: stable@vger.kernel.org
Signed-off-by: Arik Nemtsov <arik@wizery.com>
Signed-off-by: Ido Yariv <ido@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2012-11-05 14:19:55 +01:00
Lee Jones
0cb2bbbea0 bridge: Avoid 'statement with no effect' compiler warnings
Instead of issuing (0) statements when !CONFIG_SYSFS which will cause
'warning: ', we'll use inline statements instead. This will effectively
do the same thing, but suppress any unnecessary warnings.

Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-04 01:59:29 -04:00
Cyrill Gorcunov
cacb6ba0f3 net: inet_diag -- Return error code if protocol handler is missed
We've observed that in case if UDP diag module is not
supported in kernel the netlink returns NLMSG_DONE without
notifying a caller that handler is missed.

This patch makes __inet_diag_dump to return error code instead.

So as example it become possible to detect such situation
and handle it gracefully on userspace level.

Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
CC: David Miller <davem@davemloft.net>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-04 01:56:49 -04:00
Linus Torvalds
d4164973a0 NFS bugfixes for Linux 3.7
- Fix a bunch of deadlock situations:
   * State recovery can deadlock if we fail to release sequence ids before
     scheduling the recovery thread.
   * Calling deactivate_super() from an RPC workqueue thread can deadlock
     because of the call to rpc_shutdown_client.
 - Display the device name correctly in /proc/*/mounts
 - Fix a number of incorrect error return values:
   * When NFSv3 mounts fail due to a timeout.
   * On NFSv4.1 backchannel setup failure
   * On NFSv4 open access checks
 - pnfs_find_alloc_layout() must check the layout pointer for NULL
 - Fix a regression in the legacy DNS resolved
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQlXRyAAoJEGcL54qWCgDy1G4QALUjERF6VBaDUVRVDe3cyrNr
 1y5niKU19ysaXuNKtsGquVr3y72c0xyzCFLan8ZqJl1kXpYTDIyoDc0NBT3reffo
 T4rbw8SXVFVk+iVcJQgnFo5tblHVsVk+fvZPezBknaBb/uI/kxIbicZHHxvASyzM
 G0Ge/MC709x07tu3wJSyOtvD55j6bfpwbon6yjOKdN8/S2nOEsNOBjAunOWrKtsa
 2v3uWC9APQqIF5z6dfhAVCEMRrQMame6W6b5LuY/MSQpLAwUUYeCGzjc1TMeWvnS
 n66iCqQ0TCVgJH+7fYnrzPm5uNk4Am2XhM3QY1B0j8zl73ZWIs4PfjdoBytLnqF+
 SykNVUl72WqdJnWPCtk6LvoctyOsSMK5pJj4jadvNqDKgFkuRZkWb28bqux2OlHg
 ZCmnEZiew5tV+2anF6sgEaEhSyKNR/1h1ilXOy2TaV93qX+ec8tg+TE6C6YQR/R+
 IoOoYD8Hbe/R/D76WK0xOOQbrXIQvKb1zvlIRx7W6d5gWpg9IceNLcBfCiF1GjPf
 0PKxTbBSbSg44IRolPIprsZx6YzCI6wj2UzdPC7riozfZdSZ+T6Hh7z8rhhTjsHS
 D5emunX93kMp5QSSrw6EtdTkRJ++DB+cpjK5SP7UydHsWSGnKbd8eIv7viZu5J5O
 FLcHHWKIan9wQCslGQkO
 =NjEu
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-3.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - Fix a bunch of deadlock situations:
   * State recovery can deadlock if we fail to release sequence ids
     before scheduling the recovery thread.
   * Calling deactivate_super() from an RPC workqueue thread can
     deadlock because of the call to rpc_shutdown_client.

 - Display the device name correctly in /proc/*/mounts

 - Fix a number of incorrect error return values:
   * When NFSv3 mounts fail due to a timeout.
   * On NFSv4.1 backchannel setup failure
   * On NFSv4 open access checks

 - pnfs_find_alloc_layout() must check the layout pointer for NULL

 - Fix a regression in the legacy DNS resolved

* tag 'nfs-for-3.7-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS4: nfs4_opendata_access should return errno
  NFSv4: Initialise the NFSv4.1 slot table highest_used_slotid correctly
  SUNRPC: return proper errno from backchannel_rqst
  NFS: add nfs_sb_deactive_async to avoid deadlock
  nfs: Show original device name verbatim in /proc/*/mount{s,info}
  nfsv3: Make v3 mounts fail with ETIMEDOUTs instead EIO on mountd timeouts
  nfs: Check whether a layout pointer is NULL before free it
  NFS: fix bug in legacy DNS resolver.
  NFSv4: nfs4_locku_done must release the sequence id
  NFSv4.1: We must release the sequence id when we fail to get a session slot
  NFS: Wait for session recovery to finish before returning
2012-11-03 15:27:21 -07:00
John Fastabend
c38e01b8b9 net: fix bridge notify hook to manage flags correctly
The bridge notify hook rtnl_bridge_notify() was not handling the
case where the master flags was set or with both flags set. First
flags are not being passed correctly and second the logic to parse
them is broken.

This patch passes the original flags value and fixes the
logic.

Reported-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 15:37:36 -04:00
John Fastabend
a7a558fe42 rtnetlink: Use nlmsg type RTM_NEWNEIGH from dflt fdb dump
Change the dflt fdb dump handler to use RTM_NEWNEIGH to
be compatible with bridge dump routines.

The dump reply from the network driver handlers should
match the reply from bridge handler. The fact they were
not in the ixgbe case was effectively a bug. This patch
resolves it.

Applications that were not checking the nlmsg type will
continue to work. And now applications that do check
the type will work as expected.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 15:27:08 -04:00
Vimalkumar
56b765b79e htb: improved accuracy at high rates
Current HTB (and TBF) uses rate table computed by the "tc"
userspace program, which has the following issue:

The rate table has 256 entries to map packet lengths
to token (time units).  With TSO sized packets, the
256 entry granularity leads to loss/gain of rate,
making the token bucket inaccurate.

Thus, instead of relying on rate table, this patch
explicitly computes the time and accounts for packet
transmission times with nanosecond granularity.

This greatly improves accuracy of HTB with a wide
range of packet sizes.

Example:

tc qdisc add dev $dev root handle 1: \
        htb default 1

tc class add dev $dev classid 1:1 parent 1: \
        rate 5Gbit mtu 64k

Here is an example of inaccuracy:

$ iperf -c host -t 10 -i 1

With old htb:
eth4:   34.76 Mb/s In  5827.98 Mb/s Out -  65836.0 p/s In  481273.0 p/s Out
[SUM]  9.0-10.0 sec   669 MBytes  5.61 Gbits/sec
[SUM]  0.0-10.0 sec  6.50 GBytes  5.58 Gbits/sec

With new htb:
eth4:   28.36 Mb/s In  5208.06 Mb/s Out -  53704.0 p/s In  430076.0 p/s Out
[SUM]  9.0-10.0 sec   594 MBytes  4.98 Gbits/sec
[SUM]  0.0-10.0 sec  5.80 GBytes  4.98 Gbits/sec

The bits per second on the wire is still 5200Mb/s with new HTB
because qdisc accounts for packet length using skb->len, which
is smaller than total bytes on the wire if GSO is used.  But
that is for another patch regardless of how time is accounted.

Many thanks to Eric Dumazet for review and feedback.

Signed-off-by: Vimalkumar <j.vimal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 15:24:01 -04:00
Xiaotian Feng
d145f7ec23 tipc: do not use tasklet_disable before tasklet_kill
If tasklet_disable() is called before related tasklet handled,
tasklet_kill will never be finished. tasklet_kill is enough.

Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Cc: Jon Maloy <jon.maloy@ericsson.com>
Cc: Allan Stephens <allan.stephens@windriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: tipc-discussion@lists.sourceforge.net
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 15:10:14 -04:00
Amerigo Wang
94e187c015 ipv6: introduce ip6_rt_put()
As suggested by Eric, we could introduce a helper function
for ipv6 too, to avoid checking if rt is NULL before
dst_release().

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 14:59:05 -04:00
Neil Horman
b26ddd8130 sctp: Clean up type-punning in sctp_cmd_t union
Lots of points in the sctp_cmd_interpreter function treat the sctp_cmd_t arg as
a void pointer, even though they are written as various other types.  Theres no
need for this as doing so just leads to possible type-punning issues that could
cause crashes, and if we remain type-consistent we can actually just remove the
void * member of the union entirely.

Change Notes:

v2)
	* Dropped chunk that modified SCTP_NULL to create a marker pattern
	 should anyone try to use a SCTP_NULL() assigned sctp_arg_t, Assigning
	 to .zero provides the same effect and should be faster, per Vlad Y.

v3)
	* Reverted part of V2, opting to use memset instead of .zero, so that
	 the entire union is initalized thus avoiding the i164 speculative load
	 problems previously encountered, per Dave M..  Also rewrote
	 SCTP_[NO]FORCE so as to use common infrastructure a little more

Signed-off-by: Neil Horman <nhorman@tuxdriver.com
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 14:54:55 -04:00
Amerigo Wang
1a9408355e ipv6: remove a useless NULL check
In dev_forward_change(), it is useless to check if idev->dev
is NULL, it is always non-NULL here.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 14:50:15 -04:00
Daniel Borkmann
398f382c0a pktgen: clean up ktime_t helpers
Some years ago, the ktime_t helper functions ktime_now() and ktime_lt()
have been introduced. Instead of defining them inside pktgen.c, they
should either use ktime_t library functions or, if not available, they
should be defined in ktime.h, so that also others can benefit from them.
ktime_compare() is introduced with a similar notion as in timespec_compare().

Signed-off-by: Daniel Borkmann <daniel.borkmann@tik.ee.ethz.ch>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 14:50:15 -04:00
Eric Dumazet
e6c022a4fa tcp: better retrans tracking for defer-accept
For passive TCP connections using TCP_DEFER_ACCEPT facility,
we incorrectly increment req->retrans each time timeout triggers
while no SYNACK is sent.

SYNACK are not sent for TCP_DEFER_ACCEPT that were established (for
which we received the ACK from client). Only the last SYNACK is sent
so that we can receive again an ACK from client, to move the req into
accept queue. We plan to change this later to avoid the useless
retransmit (and potential problem as this SYNACK could be lost)

TCP_INFO later gives wrong information to user, claiming imaginary
retransmits.

Decouple req->retrans field into two independent fields :

num_retrans : number of retransmit
num_timeout : number of timeouts

num_timeout is the counter that is incremented at each timeout,
regardless of actual SYNACK being sent or not, and used to
compute the exponential timeout.

Introduce inet_rtx_syn_ack() helper to increment num_retrans
only if ->rtx_syn_ack() succeeded.

Use inet_rtx_syn_ack() from tcp_check_req() to increment num_retrans
when we re-send a SYNACK in answer to a (retransmitted) SYN.
Prior to this patch, we were not counting these retransmits.

Change tcp_v[46]_rtx_synack() to increment TCP_MIB_RETRANSSEGS
only if a synack packet was successfully queued.

Reported-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Vijay Subramanian <subramanian.vijay@gmail.com>
Cc: Elliott Hughes <enh@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 14:45:00 -04:00
Linus Torvalds
0f89a5733a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "First post-Sandy pull request"

 1) Fix antenna gain handling and initialization of chan->max_reg_power
    in wireless, from Felix Fietkau.

 2) Fix nexthop handling in H.232 conntrack helper, from Julian
    Anastasov.

 3) Only process 80211 mesh config header in certain kinds of frames,
    from Javier Cardona.

 4) 80211 management frame header length needs to be validated, from
    Johannes Berg.

 5) Don't access free'd SKBs in ath9k driver, from Felix Fietkay.

 6) Test for permanent state correctly in VXLAN driver, from Stephen
    Hemminger.

 7) BNX2X bug fixes from Yaniv Rosner and Dmitry Kravkov.

 8) Fix off by one errors in bonding, from Nikolay ALeksandrov.

 9) Fix divide by zero in TCP-Illinois congestion control.  From Jesper
    Dangaard Brouer.

10) TCP metrics code says "Yo dawg, I heard you like sizeof, so I did a
    sizeof of a sizeof, so you can size your size" Fix from Julian
    Anastasov.

11) Several drivers do mdiobus_free without first doing an
    mdiobus_unregister leading to stray pointer references.  Fix from
    Peter Senna Tschudin.

12) Fix OOPS in l2tp_eth_create() error path, it's another danling
    pointer kinda situation.  Fix from Tom Parkin.

13) Hardware driven by the vmxnet driver can't handle larger than 16K
    fragments, so split them up when necessary.  From Eric Dumazet.

14) Handle zero length data length in tcp_send_rcvq() properly.  Fix
    from Pavel Emelyanov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (38 commits)
  tcp-repair: Handle zero-length data put in rcv queue
  vmxnet3: must split too big fragments
  l2tp: fix oops in l2tp_eth_create() error path
  cxgb4: Fix unable to get UP event from the LLD
  drivers/net/phy/mdio-bitbang.c: Call mdiobus_unregister before mdiobus_free
  drivers/net/ethernet/nxp/lpc_eth.c: Call mdiobus_unregister before mdiobus_free
  bnx2x: fix HW initialization using fw 7.8.x
  tcp: Fix double sizeof in new tcp_metrics code
  net: fix divide by zero in tcp algorithm illinois
  net: sctp: Fix typo in net/sctp
  bonding: fix second off-by-one error
  bonding: fix off-by-one error
  bnx2x: Disable FCoE for 57840 since not yet supported by FW
  bnx2x: Fix no link on 577xx 10G-baseT
  bnx2x: Fix unrecognized SFP+ module after driver is loaded
  bnx2x: Fix potential incorrect link speed provision
  bnx2x: Restore global registers back to default.
  bnx2x: Fix link down in 57712 following LFA
  bnx2x: Fix 57810 1G-KR link against certain switches.
  ixgbe: PTP get_ts_info missing software support
  ...
2012-11-02 20:48:41 -07:00
Pavel Emelyanov
c454e6111d tcp-repair: Handle zero-length data put in rcv queue
When sending data into a tcp socket in repair state we should check
for the amount of data being 0 explicitly. Otherwise we'll have an skb
with seq == end_seq in rcv queue, but tcp doesn't expect this to happen
(in particular a warn_on in tcp_recvmsg shoots).

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Reported-by: Giorgos Mavrikas <gmavrikas@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-02 22:01:45 -04:00
Tom Parkin
789336360e l2tp: fix oops in l2tp_eth_create() error path
When creating an L2TPv3 Ethernet session, if register_netdev() should fail for
any reason (for example, automatic naming for "l2tpeth%d" interfaces hits the
32k-interface limit), the netdev is freed in the error path.  However, the
l2tp_eth_sess structure's dev pointer is left uncleared, and this results in
l2tp_eth_delete() then attempting to unregister the same netdev later in the
session teardown.  This results in an oops.

To avoid this, clear the session dev pointer in the error path.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-02 21:56:35 -04:00
Ben Hutchings
25b1e67921 net: Fix continued iteration in rtnl_bridge_getlink()
Commit e5a55a8987 ('net: create generic
bridge ops') broke the handling of a non-zero starting index in
rtnl_bridge_getlink() (based on the old br_dump_ifinfo()).

When the starting index is non-zero, we need to increment the current
index for each entry that we are skipping.  Also, we need to check the
index before both cases, since we may previously have stopped
iteration between getting information about a device from its master
and from itself.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Tested-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-02 21:53:35 -04:00
Nicolas Dichtel
1a72418bd7 ipv6/multipath: remove flag NLM_F_EXCL after the first nexthop
fib6_add_rt2node() will reject the nexthop if this flag is set, so
we perform the check only for the first nexthop.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-02 21:38:19 -04:00
Ben Hutchings
2bc80059fe eth: Rename and properly align br_reserved_address array
Since this array is no longer part of the bridge driver, it should
have an 'eth' prefix not 'br'.

We also assume that either it's 16-bit-aligned or the architecture has
efficient unaligned access.  Ensure the first of these is true by
explicitly aligning it.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-02 21:34:05 -04:00
Ben Hutchings
46acc460c0 eth: Make is_link_local() consistent with other address tests
Function name should include '_ether_addr'.
Return type should be bool.
Parameter name should be 'addr' not 'dest' (also matching kernel-doc).

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-02 21:34:05 -04:00
Ben Hutchings
4197f24b5b bridge: Use is_link_local() in store_group_addr()
Parse the string into an array of bytes rather than ints, so we can
use is_link_local() rather than reimplementing it.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-02 21:34:05 -04:00
Michael S. Tsirkin
25121173f7 skb: api to report errors for zero copy skbs
Orphaning frags for zero copy skbs needs to allocate data in atomic
context so is has a chance to fail. If it does we currently discard
the skb which is safe, but we don't report anything to the caller,
so it can not recover by e.g. disabling zero copy.

Add an API to free skb reporting such errors: this is used
by tun in case orphaning frags fails.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-02 21:29:57 -04:00
Michael S. Tsirkin
e19d6763cc skb: report completion status for zero copy skbs
Even if skb is marked for zero copy, net core might still decide
to copy it later which is somewhat slower than a copy in user context:
besides copying the data we need to pin/unpin the pages.

Add a parameter reporting such cases through zero copy callback:
if this happens a lot, device can take this into account
and switch to copying in user context.

This patch updates all users but ignores the passed value for now:
it will be used by follow-up patches.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-02 21:29:57 -04:00
Florian Westphal
121d1e0941 netfilter: ipv6: add getsockopt to retrieve origdst
userspace can query the original ipv4 destination address of a REDIRECTed
connection via
getsockopt(m_sock, SOL_IP, SO_ORIGINAL_DST, &m_server_addr, &addrsize)

but for ipv6 no such option existed.

This adds getsockopt(..., IPPROTO_IPV6, IP6T_SO_ORIGINAL_DST, ...).

Without this, userspace needs to parse /proc or use ctnetlink, which
appears to be overkill.

This uses option number 80 for IP6T_SO_ORIGINAL_DST, which is spare,
to use the same number we use in the IPv4 socket option SO_ORIGINAL_DST.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2012-11-02 12:26:32 +01:00
Amerigo Wang
f4d5392e5f vlan: use IS_ENABLED()
#if defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE)

can be replaced by

#if IS_ENABLED(CONFIG_FOO)

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-01 12:41:35 -04:00
Amerigo Wang
07a936260a ipv6: use IS_ENABLED()
#if defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE)

can be replaced by

#if IS_ENABLED(CONFIG_FOO)

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-01 12:41:35 -04:00
Nicolas Dichtel
cc535dfb6a rtnl/ipv4: use netconf msg to advertise rp_filter status
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-01 12:41:34 -04:00
Julian Anastasov
2c42a3fb30 tcp: Fix double sizeof in new tcp_metrics code
Fix double sizeof when parsing IPv6 address from
user space because it breaks get/del by specific IPv6 address.

	Problem noticed by David Binderman:

https://bugzilla.kernel.org/show_bug.cgi?id=49171

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-01 11:59:08 -04:00