linux

Author	SHA1	Message	Date
Eric Dumazet	941297f443	netfilter: nf_conntrack: nf_conntrack_alloc() fixes When a slab cache uses SLAB_DESTROY_BY_RCU, we must be careful when allocating objects, since slab allocator could give a freed object still used by lockless readers. In particular, nf_conntrack RCU lookups rely on ct->tuplehash[xxx].hnnode.next being always valid (ie containing a valid 'nulls' value, or a valid pointer to next object in hash chain.) kmem_cache_zalloc() setups object with NULL values, but a NULL value is not valid for ct->tuplehash[xxx].hnnode.next. Fix is to call kmem_cache_alloc() and do the zeroing ourself. As spotted by Patrick, we also need to make sure lookup keys are committed to memory before setting refcount to 1, or a lockless reader could get a reference on the old version of the object. Its key re-check could then pass the barrier. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-07-16 14:03:40 +02:00
Patrick McHardy	aa6a03eb0a	netfilter: xt_osf: fix nf_log_packet() arguments The first argument is the address family, the second one the hook number. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-07-16 14:01:54 +02:00
Jan Engelhardt	d6d3f08b0f	netfilter: xtables: conntrack match revision 2 As reported by Philip, the UNTRACKED state bit does not fit within the 8-bit state_mask member. Enlarge state_mask and give status_mask a few more bits too. Reported-by: Philip Craig <philipc@snapgear.com> References: http://markmail.org/thread/b7eg6aovfh4agyz7 Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-29 14:31:46 +02:00
Patrick McHardy	a3a9f79e36	netfilter: tcp conntrack: fix unacknowledged data detection with NAT When NAT helpers change the TCP packet size, the highest seen sequence number needs to be corrected. This is currently only done upwards, when the packet size is reduced the sequence number is unchanged. This causes TCP conntrack to falsely detect unacknowledged data and decrease the timeout. Fix by updating the highest seen sequence number in both directions after packet mangling. Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-29 14:07:56 +02:00
Jesper Dangaard Brouer	308ff823eb	nf_conntrack: Use rcu_barrier() RCU barriers, rcu_barrier(), is inserted two places. In nf_conntrack_expect.c nf_conntrack_expect_fini() before the kmem_cache_destroy(). Firstly to make sure the callback to the nf_ct_expect_free_rcu() code is still around. Secondly because I'm unsure about the consequence of having in flight nf_ct_expect_free_rcu/kmem_cache_free() calls while doing a kmem_cache_destroy() slab destroy. And in nf_conntrack_extend.c nf_ct_extend_unregister(), inorder to wait for completion of callbacks to __nf_ct_ext_free_rcu(), which is invoked by __nf_ct_ext_add(). It might be more efficient to call rcu_barrier() in nf_conntrack_core.c nf_conntrack_cleanup_net(), but thats make it more difficult to read the code (as the callback code in located in nf_conntrack_extend.c). Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-25 16:32:52 +02:00
Patrick McHardy	4d900f9df5	netfilter: xt_rateest: fix comparison with self As noticed by T�r�k Edwin <edwintorok@gmail.com>: Compiling the kernel with clang has shown this warning: net/netfilter/xt_rateest.c:69:16: warning: self-comparison always results in a constant value ret &= pps2 == pps2; ^ Looking at the code: if (info->flags & XT_RATEEST_MATCH_BPS) ret &= bps1 == bps2; if (info->flags & XT_RATEEST_MATCH_PPS) ret &= pps2 == pps2; Judging from the MATCH_BPS case it seems to be a typo, with the intention of comparing pps1 with pps2. http://bugzilla.kernel.org/show_bug.cgi?id=13535 Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:17:12 +02:00
Jan Engelhardt	6d62182fea	netfilter: xt_quota: fix incomplete initialization Commit v2.6.29-rc5-872-gacc738f ("xtables: avoid pointer to self") forgot to copy the initial quota value supplied by iptables into the private structure, thus counting from whatever was in the memory kmalloc returned. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:16:45 +02:00
Patrick McHardy	2495561928	netfilter: nf_log: fix direct userspace memory access in proc handler Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:15:30 +02:00
Patrick McHardy	f9ffc31251	netfilter: fix some sparse endianess warnings net/netfilter/xt_NFQUEUE.c:46:9: warning: incorrect type in assignment (different base types) net/netfilter/xt_NFQUEUE.c:46:9: expected unsigned int [unsigned] [usertype] ipaddr net/netfilter/xt_NFQUEUE.c:46:9: got restricted unsigned int net/netfilter/xt_NFQUEUE.c:68:10: warning: incorrect type in assignment (different base types) net/netfilter/xt_NFQUEUE.c:68:10: expected unsigned int [unsigned] <noident> net/netfilter/xt_NFQUEUE.c:68:10: got restricted unsigned int net/netfilter/xt_NFQUEUE.c:69:10: warning: incorrect type in assignment (different base types) net/netfilter/xt_NFQUEUE.c:69:10: expected unsigned int [unsigned] <noident> net/netfilter/xt_NFQUEUE.c:69:10: got restricted unsigned int net/netfilter/xt_NFQUEUE.c:70:10: warning: incorrect type in assignment (different base types) net/netfilter/xt_NFQUEUE.c:70:10: expected unsigned int [unsigned] <noident> net/netfilter/xt_NFQUEUE.c:70:10: got restricted unsigned int net/netfilter/xt_NFQUEUE.c:71:10: warning: incorrect type in assignment (different base types) net/netfilter/xt_NFQUEUE.c:71:10: expected unsigned int [unsigned] <noident> net/netfilter/xt_NFQUEUE.c:71:10: got restricted unsigned int net/netfilter/xt_cluster.c:20:55: warning: incorrect type in return expression (different base types) net/netfilter/xt_cluster.c:20:55: expected unsigned int net/netfilter/xt_cluster.c:20:55: got restricted unsigned int const [usertype] ip net/netfilter/xt_cluster.c:20:55: warning: incorrect type in return expression (different base types) net/netfilter/xt_cluster.c:20:55: expected unsigned int net/netfilter/xt_cluster.c:20:55: got restricted unsigned int const [usertype] ip Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:15:02 +02:00
Patrick McHardy	8d8890b775	netfilter: nf_conntrack: fix conntrack lookup race The RCU protected conntrack hash lookup only checks whether the entry has a refcount of zero to decide whether it is stale. This is not sufficient, entries are explicitly removed while there is at least one reference left, possibly more. Explicitly check whether the entry has been marked as dying to fix this. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:14:41 +02:00
Patrick McHardy	5c8ec910e7	netfilter: nf_conntrack: fix confirmation race condition New connection tracking entries are inserted into the hash before they are fully set up, namely the CONFIRMED bit is not set and the timer not started yet. This can theoretically lead to a race with timer, which would set the timeout value to a relative value, most likely already in the past. Perform hash insertion as the final step to fix this. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:14:16 +02:00
Eric Dumazet	8cc20198cf	netfilter: nf_conntrack: death_by_timeout() fix death_by_timeout() might delete a conntrack from hash list and insert it in dying list. nf_ct_delete_from_lists(ct); nf_ct_insert_dying_list(ct); I believe a (lockless) reader could catch ct while doing a lookup and miss the end of its chain. (nulls lookup algo must check the null value at the end of lookup and should restart if the null value is not the expected one. cf Documentation/RCU/rculist_nulls.txt for details) We need to change nf_conntrack_init_net() and use a different "null" value, guaranteed not being used in regular lists. Choose very large values, since hash table uses [0..size-1] null values. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-22 14:13:55 +02:00
David S. Miller	9cbc1cb8cd	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/scsi/fcoe/fcoe.c net/core/drop_monitor.c net/core/net-traces.c	2009-06-15 03:02:23 -07:00
Joe Perches	3dd5d7e3ba	x_tables: Convert printk to pr_err Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:32:39 +02:00
Pablo Neira Ayuso	dd7669a92c	netfilter: conntrack: optional reliable conntrack event delivery This patch improves ctnetlink event reliability if one broadcast listener has set the NETLINK_BROADCAST_ERROR socket option. The logic is the following: if an event delivery fails, we keep the undelivered events in the missed event cache. Once the next packet arrives, we add the new events (if any) to the missed events in the cache and we try a new delivery, and so on. Thus, if ctnetlink fails to deliver an event, we try to deliver them once we see a new packet. Therefore, we may lose state transitions but the userspace process gets in sync at some point. At worst case, if no events were delivered to userspace, we make sure that destroy events are successfully delivered. Basically, if ctnetlink fails to deliver the destroy event, we remove the conntrack entry from the hashes and we insert them in the dying list, which contains inactive entries. Then, the conntrack timer is added with an extra grace timeout of random32() % 15 seconds to trigger the event again (this grace timeout is tunable via /proc). The use of a limited random timeout value allows distributing the "destroy" resends, thus, avoiding accumulating lots "destroy" events at the same time. Event delivery may re-order but we can identify them by means of the tuple plus the conntrack ID. The maximum number of conntrack entries (active or inactive) is still handled by nf_conntrack_max. Thus, we may start dropping packets at some point if we accumulate a lot of inactive conntrack entries that did not successfully report the destroy event to userspace. During my stress tests consisting of setting a very small buffer of 2048 bytes for conntrackd and the NETLINK_BROADCAST_ERROR socket flag, and generating lots of very small connections, I noticed very few destroy entries on the fly waiting to be resend. A simple way to test this patch consist of creating a lot of entries, set a very small Netlink buffer in conntrackd (+ a patch which is not in the git tree to set the BROADCAST_ERROR flag) and invoke `conntrack -F'. For expectations, no changes are introduced in this patch. Currently, event delivery is only done for new expectations (no events from expectation expiration, removal and confirmation). In that case, they need a per-expectation event cache to implement the same idea that is exposed in this patch. This patch can be useful to provide reliable flow-accouting. We still have to add a new conntrack extension to store the creation and destroy time. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:30:52 +02:00
Pablo Neira Ayuso	9858a3ae1d	netfilter: conntrack: move helper destruction to nf_ct_helper_destroy() This patch moves the helper destruction to a function that lives in nf_conntrack_helper.c. This new function is used in the patch to add ctnetlink reliable event delivery. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:28:22 +02:00
Pablo Neira Ayuso	a0891aa6a6	netfilter: conntrack: move event caching to conntrack extension infrastructure This patch reworks the per-cpu event caching to use the conntrack extension infrastructure. The main drawback is that we consume more memory per conntrack if event delivery is enabled. This patch is required by the reliable event delivery that follows to this patch. BTW, this patch allows you to enable/disable event delivery via /proc/sys/net/netfilter/nf_conntrack_events in runtime, although you can still disable event caching as compilation option. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:26:29 +02:00
Patrick McHardy	65cb9fda32	netfilter: nf_conntrack: use mod_timer_pending() for conntrack refresh Use mod_timer_pending() instead of atomic sequence of del_timer()/ add_timer(). mod_timer_pending() does not rearm an inactive timer, so we don't need the conntrack lock anymore to make sure we don't accidentally rearm a timer of a conntrack which is in the process of being destroyed. With this change, we don't need to take the global lock anymore at all, counter updates can be performed under the per-conntrack lock. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:21:49 +02:00
Patrick McHardy	266d07cb1c	netfilter: nf_log: fix sleeping function called from invalid context Fix regression introduced by `17625274` "netfilter: sysctl support of logger choice": BUG: sleeping function called from invalid context at /mnt/s390test/linux-2.6-tip/arch/s390/include/asm/uaccess.h:234 in_atomic(): 1, irqs_disabled(): 0, pid: 3245, name: sysctl CPU: 1 Not tainted 2.6.30-rc8-tipjun10-02053-g39ae214 #1 Process sysctl (pid: 3245, task: 000000007f675da0, ksp: 000000007eb17cf0) 0000000000000000 000000007eb17be8 0000000000000002 0000000000000000 000000007eb17c88 000000007eb17c00 000000007eb17c00 0000000000048156 00000000003e2de8 000000007f676118 000000007eb17f10 0000000000000000 0000000000000000 000000007eb17be8 000000000000000d 000000007eb17c58 00000000003e2050 000000000001635c 000000007eb17be8 000000007eb17c30 Call Trace: (�<00000000000162e6>� show_trace+0x13a/0x148) �<00000000000349ea>� __might_sleep+0x13a/0x164 �<0000000000050300>� proc_dostring+0x134/0x22c �<0000000000312b70>� nf_log_proc_dostring+0xfc/0x188 �<0000000000136f5e>� proc_sys_call_handler+0xf6/0x118 �<0000000000136fda>� proc_sys_read+0x26/0x34 �<00000000000d6e9c>� vfs_read+0xac/0x158 �<00000000000d703e>� SyS_read+0x56/0x88 �<0000000000027f42>� sysc_noemu+0x10/0x16 Use the nf_log_mutex instead of RCU to fix this. Reported-and-tested-by: Maran Pakkirisamy <maranpsamy@in.ibm.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-13 12:21:10 +02:00
Pavel Machek	4737f0978d	trivial: Kconfig: .ko is normally not included in module names .ko is normally not included in Kconfig help, make it consistent. Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:50 +02:00
Martin Olsson	6d60f9dfc8	trivial: Fix paramater/parameter typo in dmesg and source comments Signed-off-by: Martin Olsson <martin@minimum.se> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-06-12 18:01:46 +02:00
Patrick McHardy	334a47f634	netfilter: nf_ct_tcp: fix up build after merge Replace the last occurence of tcp_lock by the per-conntrack lock. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-11 16:16:09 +02:00
Patrick McHardy	36432dae73	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6	2009-06-11 16:00:49 +02:00
Patrick McHardy	440f0d5885	netfilter: nf_conntrack: use per-conntrack locks for protocol data Introduce per-conntrack locks and use them instead of the global protocol locks to avoid contention. Especially tcp_lock shows up very high in profiles on larger machines. This will also allow to simplify the upcoming reliable event delivery patches. Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-10 14:32:47 +02:00
Jesper Dangaard Brouer	67137f3cc7	nfnetlink_queue: Use rcu_barrier() on module unload. This module uses rcu_call() thus it should use rcu_barrier() on module unload. Also fixed a trivial typo 'nfetlink' -> 'nfnetlink' in comment. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-10 01:11:23 -07:00
Laszlo Attila Toth	a31e1ffd22	netfilter: xt_socket: added new revision of the 'socket' match supporting flags If the XT_SOCKET_TRANSPARENT flag is set, enabled 'transparent' socket option is required for the socket to be matched. Signed-off-by: Laszlo Attila Toth <panther@balabit.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-09 15:16:34 +02:00
Evgeniy Polyakov	11eeef41d5	netfilter: passive OS fingerprint xtables match Passive OS fingerprinting netfilter module allows to passively detect remote OS and perform various netfilter actions based on that knowledge. This module compares some data (WS, MSS, options and it's order, ttl, df and others) from packets with SYN bit set with dynamically loaded OS fingerprints. Fingerprint matching rules can be downloaded from OpenBSD source tree or found in archive and loaded via netfilter netlink subsystem into the kernel via special util found in archive. Archive contains library file (also attached), which was shipped with iptables extensions some time ago (at least when ipt_osf existed in patch-o-matic). Following changes were made in this release: * added NLM_F_CREATE/NLM_F_EXCL checks * dropped _rcu list traversing helpers in the protected add/remove calls * dropped unneded structures, debug prints, obscure comment and check Fingerprints can be downloaded from http://www.openbsd.org/cgi-bin/cvsweb/src/etc/pf.os or can be found in archive Example usage: -d switch removes fingerprints Please consider for inclusion. Thank you. Passive OS fingerprint homepage (archives, examples): http://www.ioremap.net/projects/osf Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-08 17:01:51 +02:00
Florian Westphal	10662aa308	netfilter: xt_NFQUEUE: queue balancing support Adds support for specifying a range of queues instead of a single queue id. Flows will be distributed across the given range. This is useful for multicore systems: Instead of having a single application read packets from a queue, start multiple instances on queues x, x+1, .. x+n. Each instance can process flows independently. Packets for the same connection are put into the same queue. Signed-off-by: Holger Eitzenberger <heitzenberger@astaro.com> Signed-off-by: Florian Westphal <fwestphal@astaro.com> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-05 13:24:24 +02:00
Florian Westphal	61f5abcab1	netfilter: xt_NFQUEUE: use NFPROTO_UNSPEC We can use wildcard matching here, just like `ab4f21e6fb` ("xtables: use NFPROTO_UNSPEC in more extensions"). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-05 13:18:07 +02:00
Eric Dumazet	adf30907d6	net: skb->dst accessors Define three accessors to get/set dst attached to a skb struct dst_entry skb_dst(const struct sk_buff skb) void skb_dst_set(struct sk_buff skb, struct dst_entry dst) void skb_dst_drop(struct sk_buff *skb) This one should replace occurrences of : dst_release(skb->dst) skb->dst = NULL; Delete skb->dst field Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-03 02:51:04 -07:00
Eric Dumazet	511c3f92ad	net: skb->rtable accessor Define skb_rtable(const struct sk_buff *skb) accessor to get rtable from skb Delete skb->rtable field Setting rtable is not allowed, just set dst instead as rtable is an alias. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-06-03 02:51:02 -07:00
David S. Miller	b2f8f7525c	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/forcedeth.c	2009-06-03 02:43:41 -07:00
Pablo Neira Ayuso	e34d5c1a4f	netfilter: conntrack: replace notify chain by function pointer This patch removes the notify chain infrastructure and replace it by a simple function pointer. This issue has been mentioned in the mailing list several times: the use of the notify chain adds too much overhead for something that is only used by ctnetlink. This patch also changes nfnetlink_send(). It seems that gfp_any() returns GFP_KERNEL for user-context request, like those via ctnetlink, inside the RCU read-side section which is not valid. Using GFP_KERNEL is also evil since netlink may schedule(), this leads to "scheduling while atomic" bug reports. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-03 10:32:06 +02:00
Pablo Neira Ayuso	17e6e4eac0	netfilter: conntrack: simplify event caching system This patch simplifies the conntrack event caching system by removing several events: * IPCT_[]_VOLATILE, IPCT_HELPINFO and IPCT_NATINFO has been deleted since the have no clients. IPCT_COUNTER_FILLING which is a leftover of the 32-bits counter days. * IPCT_REFRESH which is not of any use since we always include the timeout in the messages. After this patch, the existing events are: * IPCT_NEW, IPCT_RELATED and IPCT_DESTROY, that are used to identify addition and deletion of entries. * IPCT_STATUS, that notes that the status bits have changes, eg. IPS_SEEN_REPLY and IPS_ASSURED. * IPCT_PROTOINFO, that reports that internal protocol information has changed, eg. the TCP, DCCP and SCTP protocol state. * IPCT_HELPER, that a helper has been assigned or unassigned to this entry. * IPCT_MARK and IPCT_SECMARK, that reports that the mark has changed, this covers the case when a mark is set to zero. * IPCT_NATSEQADJ, to report that there's updates in the NAT sequence adjustment. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:08:46 +02:00
Pablo Neira Ayuso	274d383b9c	netfilter: conntrack: don't report events on module removal During the module removal there are no possible event listeners since ctnetlink must be removed before to allow removing nf_conntrack. This patch removes the event reporting for the module removal case which is not of any use in the existing code. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:08:38 +02:00
Pablo Neira Ayuso	03b64f518a	netfilter: ctnetlink: cleanup message-size calculation This patch cleans up the message calculation to make it similar to rtnetlink, moreover, it removes unneeded verbose information. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:08:27 +02:00
Pablo Neira Ayuso	96bcf938dc	netfilter: ctnetlink: use nlmsg_* helper function to build messages Replaces the old macros to build Netlink messages with the new nlmsg_*() helper functions. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:07:39 +02:00
Pablo Neira Ayuso	f2f3e38c63	netfilter: ctnetlink: rename tuple() by nf_ct_tuple() macro definition This patch move the internal tuple() macro definition to the header file as nf_ct_tuple(). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:03:35 +02:00
Pablo Neira Ayuso	8b0a231d4d	netfilter: ctnetlink: remove nowait parameter from fill_info() This patch is a cleanup, it removes the `nowait' parameter from all fill_info() function since it is always set to one. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:03:34 +02:00
Pablo Neira Ayuso	f49c857ff2	netfilter: nfnetlink: cleanup for nfnetlink_rcv_msg() function This patch cleans up the message handling path in two aspects: * it uses NLMSG_LENGTH() instead of NLMSG_SPACE() like rtnetlink does in this case to check if there is enough room for the Netlink/nfnetlink headers. No need to check for the padding room. * it removes a redundant header size checking that has been already do at the beginning of the function. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2009-06-02 20:03:33 +02:00
Jozsef Kadlecsik	874ab9233e	netfilter: nf_ct_tcp: TCP simultaneous open support The patch below adds supporting TCP simultaneous open to conntrack. The unused LISTEN state is replaced by a new state (SYN_SENT2) denoting the second SYN sent from the reply direction in the new case. The state table is updated and the function tcp_in_window is modified to handle simultaneous open. The functionality can fairly easily be tested by socat. A sample tcpdump recording 23:21:34.244733 IP (tos 0x0, ttl 64, id 49224, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xe75f (correct), 3383710133:3383710133(0) win 5840 <mss 1460,sackOK,timestamp 173445629 0,nop,wscale 7> 23:21:34.244783 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40) 192.168.0.1.2020 > 192.168.0.254.2020: R, cksum 0x0253 (correct), 0:0(0) ack 3383710134 win 0 23:21:36.038680 IP (tos 0x0, ttl 64, id 28092, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.1.2020 > 192.168.0.254.2020: S, cksum 0x704b (correct), 2634546729:2634546729(0) win 5840 <mss 1460,sackOK,timestamp 824213 0,nop,wscale 1> 23:21:36.038777 IP (tos 0x0, ttl 64, id 49225, offset 0, flags [DF], proto TCP (6), length 60) 192.168.0.254.2020 > 192.168.0.1.2020: S, cksum 0xb179 (correct), 3383710133:3383710133(0) ack 2634546730 win 5840 <mss 1460,sackOK,timestamp 173447423 824213,nop,wscale 7> 23:21:36.038847 IP (tos 0x0, ttl 64, id 28093, offset 0, flags [DF], proto TCP (6), length 52) 192.168.0.1.2020 > 192.168.0.254.2020: ., cksum 0xebad (correct), ack 3383710134 win 2920 <nop,nop,timestamp 824213 173447423> and the corresponding netlink events: [NEW] tcp 6 120 SYN_SENT src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 [UNREPLIED] src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [UPDATE] tcp 6 120 LISTEN src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [UPDATE] tcp 6 60 SYN_RECV src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [UPDATE] tcp 6 432000 ESTABLISHED src=192.168.0.254 dst=192.168.0.1 sport=2020 dport=2020 src=192.168.0.1 dst=192.168.0.254 sport=2020 dport=2020 [ASSURED] The RST packet was dropped in the raw table, thus it did not reach conntrack. nfnetlink_conntrack is unpatched so it shows the new SYN_SENT2 state as the old unused LISTEN. With TCP simultaneous open support we satisfy REQ-2 in RFC 5382 ;-) . Additional minor correction in this patch is that in order to catch uninitialized reply directions, "td_maxwin == 0" is used instead of "td_end == 0" because the former can't be true except in uninitialized state while td_end may accidentally be equal to zero in the mid of a connection. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-06-02 13:58:56 +02:00
Patrick McHardy	8cc848fa34	Merge branch 'master' of git://dev.medozas.de/linux	2009-06-02 13:44:56 +02:00
David S. Miller	4d3383d0ad	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6	2009-05-27 15:51:25 -07:00
Pablo Neira Ayuso	a17c859849	netfilter: conntrack: add support for DCCP handshake sequence to ctnetlink This patch adds CTA_PROTOINFO_DCCP_HANDSHAKE_SEQ that exposes the u64 handshake sequence number to user-space. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-27 17:50:35 +02:00
Pablo Neira Ayuso	eeff9beec3	netfilter: nfnetlink_log: fix wrong skbuff size calculation This problem was introduced in `72961ecf84` since no space was reserved for the new attributes NFULA_HWTYPE, NFULA_HWLEN and NFULA_HWHEADER. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-27 15:49:11 +02:00
Jesper Dangaard Brouer	683a04cebc	netfilter: xt_hashlimit does a wrong SEQ_SKIP The function dl_seq_show() returns 1 (equal to SEQ_SKIP) in case a seq_printf() call return -1. It should return -1. This SEQ_SKIP behavior brakes processing the proc file e.g. via a pipe or just through less. Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-27 15:45:34 +02:00
Pablo Neira Ayuso	b38b1f6168	netfilter: nf_ct_dccp: add missing DCCP protocol changes in event cache This patch adds the missing protocol state-change event reporting for DCCP. $ sudo conntrack -E [NEW] dccp 33 240 src=192.168.0.2 dst=192.168.1.2 sport=57040 dport=5001 [UNREPLIED] src=192.168.1.2 dst=192.168.1.100 sport=5001 dport=57040 With this patch: $ sudo conntrack -E [NEW] dccp 33 240 REQUEST src=192.168.0.2 dst=192.168.1.2 sport=57040 dport=5001 [UNREPLIED] src=192.168.1.2 dst=192.168.1.100 sport=5001 dport=57040 Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-25 17:29:43 +02:00
Jozsef Kadlecsik	bfcaa50270	netfilter: nf_ct_tcp: fix accepting invalid RST segments Robert L Mathews discovered that some clients send evil TCP RST segments, which are accepted by netfilter conntrack but discarded by the destination. Thus the conntrack entry is destroyed but the destination retransmits data until timeout. The same technique, i.e. sending properly crafted RST segments, can easily be used to bypass connlimit/connbytes based restrictions (the sample script written by Robert can be found in the netfilter mailing list archives). The patch below adds a new flag and new field to struct ip_ct_tcp_state so that checking RST segments can be made more strict and thus TCP conntrack can catch the invalid ones: the RST segment is accepted only if its sequence number higher than or equal to the highest ack we seen from the other direction. (The last_ack field cannot be reused because it is used to catch resent packets.) Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>	2009-05-25 17:23:15 +02:00
Michał Mirosław	8f698d5453	ipvs: Use genl_register_family_with_ops() Use genl_register_family_with_ops() instead of a copy. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-21 16:50:24 -07:00
Simon Horman	be8be9eccb	ipvs: Fix IPv4 FWMARK virtual services This fixes the use of fwmarks to denote IPv4 virtual services which was unfortunately broken as a result of the integration of IPv6 support into IPVS, which was included in 2.6.28. The problem arises because fwmarks are stored in the 4th octet of a union nf_inet_addr .all, however in the case of IPv4 only the first octet, corresponding to .ip, is assigned and compared. In other words, using .all = { 0, 0, 0, htonl(svc->fwmark) always results in a value of 0 (32bits) being stored for IPv4. This means that one fwmark can be used, as it ends up being mapped to 0, but things break down when multiple fwmarks are used, as they all end up being mapped to 0. As fwmarks are 32bits a reasonable fix seems to be to just store the fwmark in .ip, and comparing and storing .ip when fwmarks are used. This patch makes the assumption that in calls to ip_vs_ct_in_get() and ip_vs_sched_persist() if the proto parameter is IPPROTO_IP then we are dealing with an fwmark. I believe this is valid as ip_vs_in() does fairly strict filtering on the protocol and IPPROTO_IP should not be used in these calls unless explicitly passed when making these calls for fwmarks in ip_vs_sched_persist(). Tested-by: Fabien Duchêne <fabien.duchene@student.uclouvain.be> Cc: Joseph Mack NA3T <jmack@wm7d.net> Cc: Julius Volz <julius.volz@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2009-05-08 14:54:47 -07:00

1 2 3 4 5 ...

1032 Commits