1
Commit Graph

5313 Commits

Author SHA1 Message Date
Dmitry Butskoy
f13ec93fba [IPV6]: MSG_ERRQUEUE messages do not pass to connected raw sockets
From: Dmitry Butskoy <dmitry@butskoy.name>

Taken from http://bugzilla.kernel.org/show_bug.cgi?id=8747

Problem Description:

It is related to the possibility to obtain MSG_ERRQUEUE messages from the udp
and raw sockets, both connected and unconnected.

There is a little typo in net/ipv6/icmp.c code, which prevents such messages
to be delivered to the errqueue of the correspond raw socket, when the socket
is CONNECTED.  The typo is due to swap of local/remote addresses.

Consider __raw_v6_lookup() function from net/ipv6/raw.c. When a raw socket is
looked up usual way, it is something like:

sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif);

where "daddr" is a destination address of the incoming packet (IOW our local
address), "saddr" is a source address of the incoming packet (the remote end).

But when the raw socket is looked up for some icmp error report, in
net/ipv6/icmp.c:icmpv6_notify() , daddr/saddr are obtained from the echoed
fragment of the "bad" packet, i.e.  "daddr" is the original destination
address of that packet, "saddr" is our local address.  Hence, for
icmpv6_notify() must use "saddr, daddr" in its arguments, not "daddr, saddr"
...

Steps to reproduce:

Create some raw socket, connect it to an address, and cause some error
situation: f.e. set ttl=1 where the remote address is more than 1 hop to reach.
Set IPV6_RECVERR .
Then send something and wait for the error (f.e. poll() with POLLERR|POLLIN).
You should receive "time exceeded" icmp message (because of "ttl=1"), but the
socket do not receive it.

If you do not connect your raw socket, you will receive MSG_ERRQUEUE
successfully.  (The reason is that for unconnected socket there are no actual
checks for local/remote addresses).

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 23:53:08 -07:00
Jean Delvare
1b1ac759d7 [IPV4]: Cleanup call to __neigh_lookup()
Back in the times of Linux 2.2, negative values for the creat parameter
of __neigh_lookup() had a particular meaning, but no longer, so we
should pass 1 instead.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:51:44 -07:00
Patrick McHardy
0621ed2e4e [NET_SCHED]: Revert "avoid transmit softirq on watchdog wakeup" optimization
As noticed by Ranko Zivojnovic <ranko@spidernet.net>, calling qdisc_run
from the timer handler can result in deadlock:

> CPU#0
>
> qdisc_watchdog() fires and gets dev->queue_lock
> qdisc_run()...qdisc_restart()...
> -> releases dev->queue_lock and enters dev_hard_start_xmit()
>
> CPU#1
>
> tc del qdisc dev ...
> qdisc_graft()...dev_graft_qdisc()...dev_deactivate()...
> -> grabs dev->queue_lock ...
>
> qdisc_reset()...{cbq,hfsc,htb,netem,tbf}_reset()...qdisc_watchdog_cancel()...
> -> hrtimer_cancel() - waiting for the qdisc_watchdog() to exit, while still
>		        holding dev->queue_lock
>
> CPU#0
>
> dev_hard_start_xmit() returns ...
> -> wants to get dev->queue_lock(!)
>
> DEADLOCK!

The entire optimization is a bit questionable IMO, it moves potentially
large parts of NET_TX_SOFTIRQ work to TIMER_SOFTIRQ/HRTIMER_SOFTIRQ,
which kind of defeats the separation of them.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Ranko Zivojnovic <ranko@spidernet.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:49:26 -07:00
Patrick McHardy
59eecdfb16 [NETFILTER]: nf_conntrack: UDPLITE support
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:48:44 -07:00
Patrick McHardy
61075af51f [NETFILTER]: nf_conntrack: mark protocols __read_mostly
Also remove two unnecessary EXPORT_SYMBOLs and move the
nf_conntrack_l3proto_ipv4 declaration to the correct file.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:48:19 -07:00
Jan Engelhardt
370786f9cf [NETFILTER]: x_tables: add connlimit match
ipt_connlimit has been sitting in POM-NG for a long time.
Here is a new shiny xt_connlimit with:

 * xtables'ified
 * will request the layer3 module
   (previously it hotdropped every packet when it was not loaded)
 * fixed: there was a deadlock in case of an OOM condition
 * support for any layer4 protocol (e.g. UDP/SCTP)
 * using jhash, as suggested by Eric Dumazet
 * ipv6 support

Signed-off-by: Jan Engelhardt <jengelh@gmx.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:47:26 -07:00
Patrick McHardy
a887c1c148 [NETFILTER]: Lower *tables printk severity
Lower ip6tables, arptables and ebtables printk severity similar to
Dan Aloni's patch for iptables.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:46:15 -07:00
Yasuyuki Kozakai
130e7a83d7 [NETFILTER]: nf_conntrack: Don't track locally generated special ICMP error
The conntrack assigned to locally generated ICMP error is usually the one
assigned to the original packet which has caused the error. But if
the original packet is handled as invalid by nf_conntrack, no conntrack
is assigned to the original packet. Then nf_ct_attach() cannot assign
any conntrack to the ICMP error packet. In that case the current
nf_conntrack_icmp assigns appropriate conntrack to it. But the current
code mistakes the direction of the packet. As a result, NAT code mistakes
the address to be mangled.

To fix the bug, this changes nf_conntrack_icmp not to assign conntrack
to such ICMP error. Actually no address is necessary to be mangled
in this case.

Spotted by Jordan Russell.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:45:41 -07:00
Yasuyuki Kozakai
e2a3123fbe [NETFILTER]: nf_conntrack: Introduces nf_ct_get_tuplepr and uses it
nf_ct_get_tuple() requires the offset to transport header and that bothers
callers such as icmp[v6] l4proto modules. This introduces new function
to simplify them.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:45:14 -07:00
Yasuyuki Kozakai
ffc3069048 [NETFILTER]: nf_conntrack: make l3proto->prepare() generic and renames it
The icmp[v6] l4proto modules parse headers in ICMP[v6] error to get tuple.
But they have to find the offset to transport protocol header before that.
Their processings are almost same as prepare() of l3proto modules.
This makes prepare() more generic to simplify icmp[v6] l4proto module
later.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:44:50 -07:00
Yasuyuki Kozakai
d87d8469e2 [NETFILTER]: nf_conntrack: Increment error count on parsing IPv4 header
Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 20:44:23 -07:00
Michael Chan
6460d948f3 [NET]: Add ethtool support for NETIF_F_IPV6_CSUM devices.
Add ethtool utility function to set or clear IPV6_CSUM feature flag.
Modify tg3.c and bnx2.c to use this function when doing ethtool -K
to change tx checksum.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:07:52 -07:00
Ursula Braun
febca281f6 [AF_IUCV]: Add lock when updating accept_q
The accept_queue of an af_iucv socket will be corrupted, if
adding and deleting of entries in this queue occurs at the
same time (connect request from one client, while accept call
is processed for another client).
Solution: add locking when updating accept_q

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Acked-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:04:25 -07:00
Ursula Braun
13fdc9a74d [AF_IUCV]: Avoid deadlock between iucv_path_connect and tasklet.
An iucv deadlock may occur, where one CPU is spinning on the
iucv_table_lock for iucv_tasklet_fn(), while another CPU is holding
the iucv_table_lock for an iucv_path_connect() and is waiting for
the first CPU in an smp_call_function.
Solution: replace spin_lock in iucv_tasklet_fn by spin_trylock and
reschedule tasklet in case of non-granted lock.

Signed-off-by: Ursula Braun <braunu@de.ibm.com>
Acked-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:03:41 -07:00
Jennifer Hunt
da7de31cc5 [AF_IUCV]: Improve description of IUCV and AFIUCV configuration options.
Signed-off-by: Jennifer Hunt <jenhunt@us.ibm.com>
Signed-off-by: Ursula Braun >braunu@de.ibm.com>
Acked-by: Frank Pavlic <fpavlic@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:03:00 -07:00
Adrian Bunk
acd159b6b5 [INET_SOCK]: make net/ipv4/inet_timewait_sock.c:__inet_twsk_kill() static
This patch makes the needlessly global __inet_twsk_kill() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 19:00:59 -07:00
David S. Miller
cf3842ec50 Merge branch 'upstream-davem' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2007-07-14 18:58:49 -07:00
Stephen Hemminger
b3b0b681b1 [TCP]: tcp probe add back ssthresh field
Sangtae noticed the ssthresh got missed.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:57:19 -07:00
Patrick McHardy
a7ecfc8665 [VLAN]: Fix memset length
Fix sizeof(ETH_ALEN) Introduced by my rtnl_link patches.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:56:30 -07:00
Patrick McHardy
b863ceb7dd [NET]: Add macvlan driver
Add macvlan driver, which allows to create virtual ethernet devices
based on MAC address.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:55:06 -07:00
Patrick McHardy
56addd6eee [VLAN]: Use multicast list synchronization helpers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:53:28 -07:00
Patrick McHardy
6c78dcbd47 [VLAN]: Fix promiscous/allmulti synchronization races
The set_multicast_list function may be called without holding the rtnl
mutex, resulting in races when changing the underlying device's promiscous
and allmulti state. Use the change_rx_mode hook, which is always invoked
under the rtnl.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:52:56 -07:00
Patrick McHardy
a0a400d79e [NET]: dev_mcast: add multicast list synchronization helpers
The method drivers currently use to synchronize multicast lists is not
very pretty:

- walk the multicast list
- search each entry on a copy of the previous list
- if new add to lower device
- walk the copy of the previous list
- search each entry on the current list
- if removed delete from lower device
- copy entire list

This patch adds a new field to struct dev_addr_list to store the
synchronization state and adds two helper functions for synchronization
and cleanup.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:52:02 -07:00
Patrick McHardy
24023451c8 [NET]: Add net_device change_rx_mode callback
Currently the set_multicast_list (and set_rx_mode) callbacks are
responsible for configuring the device according to the IFF_PROMISC,
IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in
case of set_rx_mode).

These callbacks can be invoked from BH context without the rtnl_mutex
by dev_mc_add/dev_mc_delete, which makes reading the device flags and
promiscous/allmulti count racy. For real hardware drivers that just
commit all changes to the hardware this is not a real problem since
the stack guarantees to call them for every change, so at least the
final call will not race and commit the correct configuration to the
hardware.

For software devices that want to synchronize promiscous and multicast
state to an underlying device however this can cause corruption of the
underlying device's flags or promisc/allmulti counts.

When the software device is concurrently put in promiscous or allmulti
mode while set_multicast_list is invoked from bottem half context, the
device might synchronize the change to the underlying device without
holding the rtnl_mutex, which races with concurrent changes to the
underlying device.

Add a dev->change_rx_flags hook that is invoked when any of the flags
that affect rx filtering change (under the rtnl_mutex), which allows
drivers to perform synchronization immediately and only synchronize
the address lists in set_multicast_list/set_rx_mode.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:51:31 -07:00
Ingo Molnar
e6c9116d1d [RFKILL]: fix net/rfkill/rfkill-input.c bug on 64-bit systems
Subject: [patch] net/input: fix net/rfkill/rfkill-input.c bug on 64-bit systems

this recent commit:

 commit cf4328cd94
 Author: Ivo van Doorn <IvDoorn@gmail.com>
 Date:   Mon May 7 00:34:20 2007 -0700

     [NET]: rfkill: add support for input key to control wireless radio

added this 64-bit bug:

        ....
	unsigned int flags;
 
 	spin_lock_irqsave(&task->lock, flags);
        ....

irq 'flags' must be unsigned long, not unsigned int. The -rt tree has 
strict checks about this on 64-bit so this triggered a build failure. 

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-14 18:50:15 -07:00
Eric Van Hensbergen
0af8887ebf 9p: fix a race condition bug in umount which caused a segfault
umounting partitions after heavy activity would sometimes trigger a
segmentation violation.  This fix appears to remove that problem.
Fix originally provided by Latchesar Ionkov.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-14 15:14:19 -05:00
Latchesar Ionkov
1d6b560238 net/9p: set error to EREMOTEIO if trans->write returns zero
If trans->write returns 0, p9_write_work goes through the error path, but
sets the error code to zero.

This patch sets the error code to EREMOTEIO if trans->write returns zero
value.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
2007-07-14 15:14:01 -05:00
Latchesar Ionkov
e46662be7f net/9p: change net/9p module name to 9pnet
Change module name of net/9p module from 9p.ko to 9pnet.ko. fs/9p module
already uses 9p.ko name.

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
2007-07-14 15:13:50 -05:00
Latchesar Ionkov
bd238fb431 9p: Reorganization of 9p file system code
This patchset moves non-filesystem interfaces of v9fs from fs/9p to net/9p.
It moves the transport, packet marshalling and connection layers to net/9p
leaving only the VFS related files in fs/9p.  This work is being done in
preparation for in-kernel 9p servers as well as alternate 9p clients (other
than VFS).

Signed-off-by: Latchesar Ionkov <lucho@ionkov.net>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
2007-07-14 15:13:40 -05:00
Linus Torvalds
16cefa8c38 Merge git://git.linux-nfs.org/pub/linux/nfs-2.6
* git://git.linux-nfs.org/pub/linux/nfs-2.6: (122 commits)
  sunrpc: drop BKL around wrap and unwrap
  NFSv4: Make sure unlock is really an unlock when cancelling a lock
  NLM: fix source address of callback to client
  SUNRPC client: add interface for binding to a local address
  SUNRPC server: record the destination address of a request
  SUNRPC: cleanup transport creation argument passing
  NFSv4: Make the NFS state model work with the nosharedcache mount option
  NFS: Error when mounting the same filesystem with different options
  NFS: Add the mount option "nosharecache"
  NFS: Add support for mounting NFSv4 file systems with string options
  NFS: Add final pieces to support in-kernel mount option parsing
  NFS: Introduce generic mount client API
  NFS: Add enums and match tables for mount option parsing
  NFS: Improve debugging output in NFS in-kernel mount client
  NFS: Clean up in-kernel NFS mount
  NFS: Remake nfsroot_mount as a permanent part of NFS client
  SUNRPC: Add a convenient default for the hostname when calling rpc_create()
  SUNRPC: Rename rpcb_getport to be consistent with new rpcb_getport_sync name
  SUNRPC: Rename rpcb_getport_external routine
  SUNRPC: Allow rpcbind requests to be interrupted by a signal.
  ...
2007-07-13 16:46:18 -07:00
Linus Torvalds
e030dbf91a Merge branch 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop
* 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop: (28 commits)
  ioatdma: add the unisys "i/oat" pci vendor/device id
  ARM: Add drivers/dma to arch/arm/Kconfig
  iop3xx: surface the iop3xx DMA and AAU units to the iop-adma driver
  iop13xx: surface the iop13xx adma units to the iop-adma driver
  dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines
  md: remove raid5 compute_block and compute_parity5
  md: handle_stripe5 - request io processing in raid5_run_ops
  md: handle_stripe5 - add request/completion logic for async expand ops
  md: handle_stripe5 - add request/completion logic for async read ops
  md: handle_stripe5 - add request/completion logic for async check ops
  md: handle_stripe5 - add request/completion logic for async compute ops
  md: handle_stripe5 - add request/completion logic for async write ops
  md: common infrastructure for running operations with raid5_run_ops
  md: raid5_run_ops - run stripe operations outside sh->lock
  raid5: replace custom debug PRINTKs with standard pr_debug
  raid5: refactor handle_stripe5 and handle_stripe6 (v3)
  async_tx: add the async_tx api
  xor: make 'xor_blocks' a library routine for use with async_tx
  dmaengine: make clients responsible for managing channels
  dmaengine: refactor dmaengine around dma_async_tx_descriptor
  ...
2007-07-13 10:52:27 -07:00
Dan Williams
d379b01e90 dmaengine: make clients responsible for managing channels
The current implementation assumes that a channel will only be used by one
client at a time.  In order to enable channel sharing the dmaengine core is
changed to a model where clients subscribe to channel-available-events.
Instead of tracking how many channels a client wants and how many it has
received the core just broadcasts the available channels and lets the
clients optionally take a reference.  The core learns about the clients'
needs at dma_event_callback time.

In support of multiple operation types, clients can specify a capability
mask to only be notified of channels that satisfy a certain set of
capabilities.

Changelog:
* removed DMA_TX_ARRAY_INIT, no longer needed
* dma_client_chan_free -> dma_chan_release: switch to global reference
  counting only at device unregistration time, before it was also happening
  at client unregistration time
* clients now return dma_state_client to dmaengine (ack, dup, nak)
* checkpatch.pl fixes
* fixup merge with git-ioat

Cc: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
2007-07-13 08:06:13 -07:00
Linus Torvalds
dc690d8ef8 Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (61 commits)
  sysfs: add parameter "struct bin_attribute *" in .read/.write methods for sysfs binary attributes
  sysfs: make directory dentries and inodes reclaimable
  sysfs: implement sysfs_get_dentry()
  sysfs: move sysfs_drop_dentry() to dir.c and make it static
  sysfs: restructure add/remove paths and fix inode update
  sysfs: use sysfs_mutex to protect the sysfs_dirent tree
  sysfs: consolidate sysfs spinlocks
  sysfs: make kobj point to sysfs_dirent instead of dentry
  sysfs: implement sysfs_find_dirent() and sysfs_get_dirent()
  sysfs: implement SYSFS_FLAG_REMOVED flag
  sysfs: rename sysfs_dirent->s_type to s_flags and make room for flags
  sysfs: make sysfs_drop_dentry() access inodes using ilookup()
  sysfs: Fix oops in sysfs_drop_dentry on x86_64
  sysfs: use singly-linked list for sysfs_dirent tree
  sysfs: slim down sysfs_dirent->s_active
  sysfs: move s_active functions to fs/sysfs/dir.c
  sysfs: fix root sysfs_dirent -> root dentry association
  sysfs: use iget_locked() instead of new_inode()
  sysfs: reorganize sysfs_new_indoe() and sysfs_create()
  sysfs: fix parent refcounting during rename and move
  ...
2007-07-12 13:40:20 -07:00
Daniel Drake
63fc33ceb0 [PATCH] mac80211: improved 802.11g CTS protection
Currently, CTS protection is partially implemented twice:
 1. via prism2 ioctls, only used by hostapd
 2. via STA beacon parsing, recorded in sta.use_protection but never used
    (other than printed in debugfs)

Protection control should be implemented on a per-subif basis. For example,
a single physical device may be running a soft AP on one channel, and a STA
on another. The AP interface should use protection based on what hostapd told
it, and the STA interface should use protection based on beacon parsing.
These should operate independantly: one subif using protection should not
influence the other.

To implement this, I moved the use_protection flag into ieee80211_sub_if_data
and removed the device-global cts_protect_erp_frames flag.

I also made the PRISM2_PARAM_CTS_PROTECT_ERP_FRAMES write operation only
available for AP interfaces, to avoid any possibility of the user messing with
the behaviour of a STA.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:26 -04:00
Daniel Drake
5628221caf [PATCH] mac80211: ERP IE handling improvements
The "protection needed" flag is currently parsed out of the ERP IE in
beacons. This patch allows the ERP IE to be available at assocation time
and causes the appropriate actions to be performed earlier.

It is slightly complicated by the fact that most APs don't include the
ERP IE in association responses. To work around this, we store ERP
values in the ieee80211_sta_bss structure.

Also added some WLAN_ERP defines for use by upcoming patches.

Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:26 -04:00
Larry Finger
1fd5e589d8 [PATCH] mac80211: Implementation of SIOCSIWRATE
The WEXT ioctl SIOCSIWRATE is not implemented in mac80211. This patch
adds the missing routine. It supports the 'auto' keyword, fixed rates,
and the combination of 'auto' and a fixed rate to select an upper bound.

Based on the patch from Mohamed Abbas <mabbas@linux.intel.com>.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:26 -04:00
Johannes Berg
4480f15ca6 [PATCH] mac80211: clarify some mac80211 things
The semantics of not having an add_interface callback are not well
defined, this callback is required because otherwise you cannot obtain
the requested MAC address of the device. Change the documentation to
reflect this, add a note about having no MAC address at all, add a
warning that mac_addr in struct ieee80211_if_init_conf can be NULL and
finally verify that a few callbacks are assigned by way of BUG_ON()

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:26 -04:00
Johannes Berg
5558235c6b [PATCH] mac80211: conserve stack space due to padding
This patch reorders some fields in struct ieee802_11_elems to save 17*7
or 17*3 bytes (on 64/32-bit machines respectively) stack space in a few
functions.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:25 -04:00
Johannes Berg
191b92666e [PATCH] mac80211: kill PRISM2_PARAM_CLEAR_KEYS
Not used anywhere, hence dead code. wpa_supplicant has its
own clear_keys routine.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:25 -04:00
Johannes Berg
fda6cc7ac4 [PATCH] mac80211: remove PRISM2_PARAM_DROP_UNENCRYPTED ioctl
Interestingly, wpa_supplicant doesn't use it, but uses the
currently unsupported IW_AUTH_DROP_UNENCRYPTED. So I guess
it doesn't matter anyway.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:25 -04:00
Johannes Berg
9771f740c6 [PATCH] mac80211: kill antenna select ioctls
Not used anywhere.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:25 -04:00
Johannes Berg
3ef8bed469 [PATCH] mac80211: kill rate control ioctls
These aren't used anywhere (hostapd, wpa_supplicant) and until we
have a proper interface to the rate control algorithms they don't
make much sense either since e.g. rc80211_lowest won't honour them.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:25 -04:00
Johannes Berg
40f7cac9f8 [PATCH] mac80211: separate monitor/subif_start_xmit
This patch separates the monitor interface start_xmit from the
subif start xmit (those other devices have 802.3 framing, monitor
interfaces have radiotap framing)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:25 -04:00
Hong Liu
333af2f071 [PATCH] mac80211: add support for iwlist channel
Add supported channels info in SIOCGIWRANGE implementation.

Signed-off-by: Hong Liu <hong.liu@intel.com>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:25 -04:00
Johannes Berg
c59304b5e0 [PATCH] mac80211: remove ieee80211_set_aid_for_sta
Remove ieee80211_set_aid_for_sta and associated code.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:25 -04:00
Johannes Berg
7f8c059828 [PATCH] mac80211: remove ieee80211_msg_passive_scan
This constant is unused.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:24 -04:00
Johannes Berg
b306f45300 [PATCH] mac80211: show transmitted frames on monitor interfaces
This patch makes mac80211 show transmitted frames on monitor interfaces,
including radiotap headers that indicate some transmission parameters.
The shown parameters will need to be expanded, but this should work as
a basis to work from.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:24 -04:00
Andy Green
e4c967c6d8 [PATCH] mac80211: Monitor mode radiotap-based packet injection
Signed-off-by: Andy Green <andy@warmcat.com>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:24 -04:00
Andy Green
179f831bc3 [PATCH] cfg80211: Radiotap parser
Generic code to walk through the fields in a radiotap header, accounting
for nasties like extended "field present" bitfields and alignment rules

Signed-off-by: Andy Green <andy@warmcat.com>
Signed-off-by: Jiri Benc <jbenc@suse.cz>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2007-07-12 16:07:24 -04:00
Patrick McHardy
db3d99c090 [NET_SCHED]: ematch: module autoloading
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 19:46:26 -07:00
Stephen Hemminger
662ad4f8ef [TCP]: tcp probe wraparound handling and other changes
Switch from formatting messages in probe routine and copying with
kfifo, to using a small circular queue of information and formatting
on read.  This avoids wraparound issues with kfifo, and saves one
copy.

Also make sure to state correct license, rather than copying off some
other driver I started with.

Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 19:45:39 -07:00
Patrick McHardy
0e06877c6f [RTNETLINK]: rtnl_link: allow specifying initial device address
Drivers need to validate the initial addresses in their netlink attribute
validation function or manually reject them if they can't support this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 19:45:36 -07:00
Patrick McHardy
2d85cba2b2 [RTNETLINK]: rtnl_link API simplification
All drivers need to unregister their devices in the module unload function.
While doing so they must hold the rtnl and atomically unregister the
rtnl_link ops as well. This makes the rtnl_link_unregister function that
takes the rtnl itself completely useless.

Provide default newlink/dellink functions, make __rtnl_link_unregister and
rtnl_link_unregister unregister all devices with matching rtnl_link_ops and
change the existing users to take advantage of that.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 19:45:33 -07:00
Patrick McHardy
8c979c26a0 [VLAN]: Fix MAC address handling
The VLAN MAC address handling is broken in multiple ways. When the address
differs when setting it, the real device is put in promiscous mode twice,
but never taken out again. Additionally it doesn't resync when the real
device's address is changed and needlessly puts it in promiscous mode when
the vlan device is still down.

Fix by moving address handling to vlan_dev_open/vlan_dev_stop and properly
deal with address changes in the device notifier. Also switch to
dev_unicast_add (which needs the exact same handling).

Since the set_mac_address handler is identical to the generic ethernet one
with these changes, kill it and use ether_setup().

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 19:45:24 -07:00
Patrick McHardy
71bffe556c [ETH]: Validate address in eth_mac_addr
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 19:41:18 -07:00
David S. Miller
50b65cc6fa Merge master.kernel.org:/pub/scm/linux/kernel/git/holtmann/bluetooth-2.6 2007-07-11 19:37:40 -07:00
Olaf Kirch
29578624e3 [NET]: Fix races in net_rx_action vs netpoll.
Keep netpoll/poll_napi from messing with the poll_list.
Only net_rx_action is allowed to manipulate the list.

Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 19:32:02 -07:00
Andrew Morton
e00c5d8b4d I/OAT: warning fix
net/ipv4/tcp.c: In function 'tcp_recvmsg':
net/ipv4/tcp.c:1111: warning: unused variable 'available'

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Chris Leech <christopher.leech@intel.com>
2007-07-11 16:10:53 -07:00
Chris Leech
2b1244a43b I/OAT: Only offload copies for TCP when there will be a context switch
The performance wins come with having the DMA copy engine doing the copies
in parallel with the context switch.  If there is enough data ready on the
socket at recv time just use a regular copy.

Signed-off-by: Chris Leech <christopher.leech@intel.com>
2007-07-11 16:10:53 -07:00
Zhang Rui
91a6902958 sysfs: add parameter "struct bin_attribute *" in .read/.write methods for sysfs binary attributes
Well, first of all, I don't want to change so many files either.

What I do:
Adding a new parameter "struct bin_attribute *" in the
.read/.write methods for the sysfs binary attributes.

In fact, only the four lines change in fs/sysfs/bin.c and
include/linux/sysfs.h do the real work.
But I have to update all the files that use binary attributes
to make them compatible with the new .read and .write methods.
I'm not sure if I missed any. :(

Why I do this:
For a sysfs attribute, we can get a pointer pointing to the
struct attribute in the .show/.store method,
while we can't do this for the binary attributes.
I don't know why this is different, but this does make it not
so handy to use the binary attributes as the regular ones.
So I think this patch is reasonable. :)

Who benefits from it:
The patch that exposes ACPI tables in sysfs
requires such an improvement.
All the table binary attributes share the same .read method.
Parameter "struct bin_attribute *" is used to get
the table signature and instance number which are used to
distinguish different ACPI table binary attributes.

Without this parameter, we need to offer different .read methods
for different ACPI table binary attributes.
This is impossible as there are various ACPI tables on different
platforms, and we don't know what they are until they are loaded.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:09 -07:00
Tejun Heo
7b595756ec sysfs: kill unnecessary attribute->owner
sysfs is now completely out of driver/module lifetime game.  After
deletion, a sysfs node doesn't access anything outside sysfs proper,
so there's no reason to hold onto the attribute owners.  Note that
often the wrong modules were accounted for as owners leading to
accessing removed modules.

This patch kills now unnecessary attribute->owner.  Note that with
this change, userland holding a sysfs node does not prevent the
backing module from being unloaded.

For more info regarding lifetime rule cleanup, please read the
following message.

  http://article.gmane.org/gmane.linux.kernel/510293

(tweaked by Greg to not delete the field just yet, to make it easier to
merge things properly.)

Signed-off-by: Tejun Heo <htejun@gmail.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-07-11 16:09:06 -07:00
Miklos Szeredi
1fd05ba5a2 [AF_UNIX]: Rewrite garbage collector, fixes race.
Throw out the old mark & sweep garbage collector and put in a
refcounting cycle detecting one.

The old one had a race with recvmsg, that resulted in false positives
and hence data loss.  The old algorithm operated on all unix sockets
in the system, so any additional locking would have meant performance
problems for all users of these.

The new algorithm instead only operates on "in flight" sockets, which
are very rare, and the additional locking for these doesn't negatively
impact the vast majority of users.

In fact it's probable, that there weren't *any* heavy senders of
sockets over sockets, otherwise the above race would have been
discovered long ago.

The patch works OK with the app that exposed the race with the old
code.  The garbage collection has also been verified to work in a few
simple cases.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-11 14:22:39 -07:00
Patrick McHardy
99d24edeb6 [NETFILTER]: {ip, nf}_conntrack_sctp: fix remotely triggerable NULL ptr dereference (CVE-2007-2876)
When creating a new connection by sending an unknown chunk type, we
don't transition to a valid state, causing a NULL pointer dereference
in sctp_packet when accessing sctp_timeouts[SCTP_CONNTRACK_NONE].

Fix by don't creating new conntrack entry if initial state is invalid.

Noticed by Vilmos Nebehaj <vilmos.nebehaj@ramsys.hu>

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 23:24:52 -07:00
Philippe De Muyter
56b3d975bb [NET]: Make all initialized struct seq_operations const.
Make all initialized struct seq_operations in net/ const

Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 23:07:31 -07:00
Patrick McHardy
3be550f34b [UDP]: Fix length check.
Rémi Denis-Courmont wrote:
> Right. By the way, shouldn't "len" rather be signed in there?
> 
> 		unsigned int len;
> 
> 		/* if we're overly short, let UDP handle it */
> 		len = skb->len - sizeof(struct udphdr);
> 		if (len <= 0)
> 			goto udp;

It should, but the < 0 case can't happen since __udp4_lib_rcv
already makes sure that we have at least a complete UDP header.

Anyways, this patch fixes it.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 23:06:43 -07:00
Micah Gruber
dffe4f048b [IPV6]: Remove unneeded pointer idev from addrconf_cleanup().
This trivial patch removes the unneeded pointer idev returned from
__in6_dev_get(), which is never used. The check for NULL can be simply
done by if (__in6_dev_get(dev) == NULL).

Signed-off-by: Micah Gruber <micah.gruber@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 23:04:19 -07:00
YOSHIFUJI Hideaki
4c752098f5 [IPV6]: Make IPV6_{RECV,2292}RTHDR boolean options.
Because reversing RH0 is no longer supported by deprecation
of RH0, let's make IPV6_{RECV,2292}RTHDR boolean options.
Boolean are more appropriate from standard POV.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:56:31 -07:00
YOSHIFUJI Hideaki
bb4dbf9e61 [IPV6]: Do not send RH0 anymore.
Based on <draft-ietf-ipv6-deprecate-rh0-00.txt>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:55:49 -07:00
YOSHIFUJI Hideaki
c382bb9d32 [IPV6]: Restore semantics of Routing Header processing.
The "fix" for emerging security threat was overkill and it broke
basic semantic of IPv6 routing header processing.  We should assume
RT0 (or even RT2, depends on configuration) as "unknown" RH type so
that we
- silently ignore the routing header if segleft == 0
- send ICMPv6 Parameter Problem message back to the sender,
  otherwise.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:47:58 -07:00
Ranjit Manomohan
c9726d6890 [NET_SCHED]: Make HTB scheduler work with TSO.
Currently the HTB scheduler does not correctly account for TSO packets
which causes large inaccuracies in the bandwidth control when using TSO.
This patch allows the HTB scheduler to work with TSO enabled devices.

Signed-off-by: Ranjit Manomohan <ranjitm@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:43:16 -07:00
Marcel Holtmann
5b7f990927 [Bluetooth] Add basics to better support and handle eSCO links
To better support and handle eSCO links in the future a bunch of
constants needs to be added and some basic routines need to be
updated. This is the initial step.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2007-07-11 07:35:32 +02:00
Patrick McHardy
cfbba49d80 [NET]: Avoid copying writable clones in tunnel drivers
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:19:05 -07:00
Philippe De Muyter
4839c52b01 [IPV4]: Make ip_tos2prio const.
Signed-off-by: Philippe De Muyter <phdm@macqel.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:19:04 -07:00
Patrick McHardy
6b25d30bf1 [NET]: Fix gen_estimator timer removal race
As noticed by Jarek Poplawski <jarkao2@o2.pl>, the timer removal in
gen_kill_estimator races with the timer function rearming the timer.

Check whether the timer list is empty before rearming the timer
in the timer function to fix this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Jarek Poplawski <jarkao2@o2.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:19:03 -07:00
Satyam Sharma
1498b3f195 [NETPOLL]: Fix a leak-n-bug in netpoll_cleanup()
93ec2c723e applied excessive duct tape to
the netpoll beast's netpoll_cleanup(), thus substituting one leak with
another, and opening up a little buglet :-)

net_device->npinfo (netpoll_info) is a shared and refcounted object and
cannot simply be set NULL the first time netpoll_cleanup() is called.
Otherwise, further netpoll_cleanup()'s see np->dev->npinfo == NULL and
become no-ops, thus leaking. And it's a bug too: the first call to
netpoll_cleanup() would thus (annoyingly) "disable" other (still alive)
netpolls too. Maybe nobody noticed this because netconsole (only user
of netpoll) never supported multiple netpoll objects earlier.

This is a trivial and obvious one-line fixlet.

Signed-off-by: Satyam Sharma <ssatyam@cse.iitk.ac.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:19:02 -07:00
Robert P. J. Day
5f1de3ec66 [RXRPC]: Remove Makefile reference to obsolete RXRPC config variable
Since there is no Kconfig variable RXRPC anywhere in the tree, and the
variable AF_RXRPC performs exactly the same function, remove the
reference to CONFIG_RXRPC from net/Makefile.

Signed-off-by: Robert P. J. Day <rpjday@mindspring.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:19:01 -07:00
Dan Aloni
0236e667e1 [NETFILTER] net/ipv4/netfilter/ip_tables.c: lower printk severity
Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:53 -07:00
Adrian Bunk
4fda25a2cd [DCCP]: Make struct dccp_li_cachep static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:52 -07:00
Andrew Morton
6f11df8355 [NET]: "wrong timeout value in sk_wait_data()": cleanups
- save 4 bytes

- it's read-mostly.

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:50 -07:00
Pavel Emelianov
60f0438a87 [NET]: Make some network-related proc files use seq_list_xxx helpers
This includes /proc/net/protocols, /proc/net/rxrpc_calls and
/proc/net/rxrpc_connections files.

All three need seq_list_start_head to show some header.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:49 -07:00
Pavel Emelianov
9af97186fc [ATM] br2684: Use seq_list_xxx helpers
The .show callback receives the list_head pointer now, not the struct
br2684_dev one.

Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:48 -07:00
Balazs Scheidler
5faf415352 [NETFILTER]: x_tables: add more detail to error message about match/target mask mismatch
Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:29 -07:00
Yasuyuki Kozakai
585426fdc5 [NETFILTER]: nf_queue: Use RCU and mutex for queue handlers
Queue handlers are registered/unregistered in only process context.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:28 -07:00
Yasuyuki Kozakai
ce7663d84a [NETFILTER]: nfnetlink_queue: don't unregister handler of other subsystem
The queue handlers registered by ip[6]_queue.ko at initialization should
not be unregistered according to requests from userland program
using nfnetlink_queue. If we allow that, there is no way to register
the handlers of built-in ip[6]_queue again.

Signed-off-by: Yasuyuki Kozakai <yasuyuki.kozakai@toshiba.co.jp>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:21 -07:00
Patrick McHardy
0d53778e81 [NETFILTER]: Convert DEBUGP to pr_debug
Convert DEBUGP to pr_debug and fix lots of non-compiling debug statements.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:20 -07:00
Patrick McHardy
342b7e3c8a [NETFILTER]: xt_helper: use RCU
The ->helper pointer is protected by RCU, no need to take
nf_conntrack_lock. Also remove excessive debugging.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:19 -07:00
Patrick McHardy
91e8db8006 [NETFILTER]: nf_conntrack_h323: turn some printks into DEBUGPs
Don't spam the ringbuffer with decoding errors. The only printks remaining
are for dropped packets when we're certain they are H.323.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:18 -07:00
Patrick McHardy
d3c3f4243e [NETFILTER]: ipt_CLUSTERIP: add compat code
Adjust structure size and don't expect pointers passed in from
userspace to be valid. Also replace an enum in an ABI structure
by a fixed size type.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:17 -07:00
Patrick McHardy
3569b621ce [NETFILTER]: ipt_SAME: add to feature-removal-schedule
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:16 -07:00
Patrick McHardy
7ae7730fd6 [NETFILTER]: nf_conntrack: early_drop improvement
When the maximum number of conntrack entries is reached and a new
one needs to be allocated, conntrack tries to drop an unassured
connection from the same hash bucket the new conntrack would hash
to. Since with a properly sized hash the average number of entries
per bucket is 1, the chances of actually finding one are not very
good. This patch makes it walk the hash until a minimum number of
8 entries are checked.

Based on patch by Vasily Averin <vvs@sw.ru>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:15 -07:00
Patrick McHardy
ec59a1110a [NETFILTER]: nf_conntrack: mark helpers __read_mostly
Most are __read_mostly already, this changes the remaining ones.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:14 -07:00
Patrick McHardy
b8a7fe6c10 [NETFILTER]: nf_conntrack_helper: use hashtable for conntrack helpers
Eliminate the last global list searched for every new connection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:13 -07:00
Patrick McHardy
f264a7df08 [NETFILTER]: nf_conntrack_expect: introduce nf_conntrack_expect_max sysct
As a last step of preventing DoS by creating lots of expectations, this
patch introduces a global maximum and a sysctl to control it. The default
is initialized to 4 * the expectation hash table size, which results in
1/64 of the default maxmimum of conntracks.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:12 -07:00
Patrick McHardy
b560580a13 [NETFILTER]: nf_conntrack_expect: maintain per conntrack expectation list
This patch brings back the per-conntrack expectation list that was
removed around 2.6.10 to avoid walking all expectations on expectation
eviction and conntrack destruction.

As these were the last users of the global expectation list, this patch
also kills that.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:02 -07:00
Patrick McHardy
31f15875c5 [NETFILTER]: nf_conntrack_helper/nf_conntrack_netlink: convert to expectation hash
Convert from the global expectation list to the hash table.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:01 -07:00
Patrick McHardy
5d08ad440f [NETFILTER]: nf_conntrack_expect: convert proc functions to hash
Convert from the global expectation list to the hash table.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:18:00 -07:00
Patrick McHardy
a71c085562 [NETFILTER]: nf_conntrack: use hashtable for expectations
Currently all expectations are kept on a global list that

- needs to be searched for every new conncetion
- needs to be walked for evicting expectations when a master connection
  has reached its limit
- needs to be walked on connection destruction for connections that
  have open expectations

This is obviously not good, especially when considering helpers like
H.323 that register *lots* of expectations and can set up permanent
expectations, but it also allows for an easy DoS against firewalls
using connection tracking helpers.

Use a hashtable for expectations to avoid incurring the search overhead
for every new connection. The default hash size is 1/256 of the conntrack
hash table size, this can be overriden using a module parameter.

This patch only introduces the hash table for expectation lookups and
keeps other users to reduce the noise, the following patches will get
rid of it completely.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:17:59 -07:00
Patrick McHardy
e9c1b084e1 [NETFILTER]: nf_conntrack: move expectaton related init code to nf_conntrack_expect.c
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:17:58 -07:00
Patrick McHardy
cf6994c2b9 [NETFILTER]: nf_conntrack_netlink: sync expectation dumping with conntrack table dumping
Resync expectation table dumping code with conntrack dumping: don't
rely on the unique ID anymore since that requires to walk the list
backwards, which doesn't work with the upcoming conversion to hlists.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:17:57 -07:00
Patrick McHardy
4e1d4e6c5a [NETFILTER]: nf_conntrack_expect: avoid useless list walking
Don't walk the list when unexpecting an expectation, we already
have a reference and the timer check is enough to guarantee
that it still is on the list.

This comment suggests that it was copied there by mistake from
expectation eviction:

/* choose the oldest expectation to evict */

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:17:56 -07:00