linux

Author	SHA1	Message	Date
Eric Dumazet	ba89966c19	[NET]: use __read_mostly on kmem_cache_t , DEFINE_SNMP_STAT pointers This patch puts mostly read only data in the right section (read_mostly), to help sharing of these data between CPUS without memory ping pongs. On one of my production machine, tcp_statistics was sitting in a heavily modified cache line, so every SNMP update had to force a reload. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:11:18 -07:00
Patrick McHardy	066286071d	[NETLINK]: Add "groups" argument to netlink_kernel_create Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:01:11 -07:00
Patrick McHardy	ac6d439d20	[NETLINK]: Convert netlink users to use group numbers instead of bitmasks Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:00:54 -07:00
Patrick McHardy	43e943c32b	[NETLINK]: Fix missing dst_groups initializations in netlink_broadcast users netlink_broadcast users must initialize NETLINK_CB(skb).dst_groups to the destination group mask for netlink_recvmsg. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 16:00:34 -07:00
Harald Welte	4fdb3bb723	[NETLINK]: Add properly module refcounting for kernel netlink sockets. - Remove bogus code for compiling netlink as module - Add module refcounting support for modules implementing a netlink protocol - Add support for autoloading modules that implement a netlink protocol as soon as someone opens a socket for that protocol Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-08-29 15:35:08 -07:00
Herbert Xu	a4f1bac625	[XFRM]: Fix possible overflow of sock->sk_policy Spotted by, and original patch by, Balazs Scheidler. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-26 15:43:17 -07:00
Sam Ravnborg	6a2e9b738c	[NET]: move config options out to individual protocols Move the protocol specific config options out to the specific protocols. With this change net/Kconfig now starts to become readable and serve as a good basis for further re-structuring. The menu structure is left almost intact, except that indention is fixed in most cases. Most visible are the INET changes where several "depends on INET" are replaced with a single ifdef INET / endif pair. Several new files were created to accomplish this change - they are small but serve the purpose that config options are now distributed out where they belongs. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-07-11 21:13:56 -07:00
Herbert Xu	d094cd83c0	[IPSEC]: Add xfrm_state_afinfo->init_flags This patch adds the xfrm_state_afinfo->init_flags hook which allows each address family to perform any common initialisation that does not require a corresponding destructor call. It will be used subsequently to set the XFRM_STATE_NOPMTUDISC flag in IPv4. It also fixes up the error codes returned by xfrm_init_state. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: James Morris <jmorris@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-20 13:19:41 -07:00
Herbert Xu	72cb6962a9	[IPSEC]: Add xfrm_init_state This patch adds xfrm_init_state which is simply a wrapper that calls xfrm_get_type and subsequently x->type->init_state. It also gets rid of the unused args argument. Abstracting it out allows us to add common initialisation code, e.g., to set family-specific flags. The add_time setting in xfrm_user.c was deleted because it's already set by xfrm_state_alloc. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: James Morris <jmorris@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-20 13:18:08 -07:00
Herbert Xu	0603eac0d6	[IPSEC]: Add XFRMA_SA/XFRMA_POLICY for delete notification This patch changes the format of the XFRM_MSG_DELSA and XFRM_MSG_DELPOLICY notification so that the main message sent is of the same format as that received by the kernel if the original message was via netlink. This also means that we won't lose the byid information carried in km_event. Since this user interface is introduced by Jamal's patch we can still afford to change it. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:54:36 -07:00
Jamal Hadi Salim	ee57eef99b	[IPSEC] Use NLMSG_LENGTH in xfrm_exp_state_notify Small fixup to use netlink macros instead of hardcoding. Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:45:56 -07:00
Patrick McHardy	7d6dfe1f5b	[IPSEC] Fix xfrm_state leaks in error path Herbert Xu wrote: > @@ -1254,6 +1326,7 @@ static int pfkey_add(struct sock *sk, st > if (IS_ERR(x)) > return PTR_ERR(x); > > + xfrm_state_hold(x); This introduces a leak when xfrm_state_add()/xfrm_state_update() fail. We hold two references (one from xfrm_state_alloc(), one from xfrm_state_hold()), but only drop one. We need to take the reference because the reference from xfrm_state_alloc() can be dropped by __xfrm_state_delete(), so the fix is to drop both references on error. Same problem in xfrm_user.c. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-06-18 22:45:31 -07:00
Herbert Xu	f60f6b8f70	[IPSEC] Use XFRM_MSG_* instead of XFRM_SAP_* This patch removes XFRM_SAP_* and converts them over to XFRM_MSG_*. The netlink interface is meant to map directly onto the underlying xfrm subsystem. Therefore rather than using a new independent representation for the events we can simply use the existing ones from xfrm_user. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:44:37 -07:00
Herbert Xu	e7443892f6	[IPSEC] Set byid for km_event in xfrm_get_policy This patch fixes policy deletion in xfrm_user so that it sets km_event.data.byid. This puts xfrm_user on par with what af_key does in this case. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:44:18 -07:00
Herbert Xu	bf08867f91	[IPSEC] Turn km_event.data into a union This patch turns km_event.data into a union. This makes code that uses it clearer. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:44:00 -07:00
Herbert Xu	4666faab09	[IPSEC] Kill spurious hard expire messages This patch ensures that the hard state/policy expire notifications are only sent when the state/policy is successfully removed from their respective tables. As it is, it's possible for a state/policy to both expire through reaching a hard limit, as well as being deleted by the user. Note that this behaviour isn't actually forbidden by RFC 2367. However, it is a quality of implementation issue. As an added bonus, the restructuring in this patch will help eventually in moving the expire notifications from softirq context into process context, thus improving their reliability. One important side-effect from this change is that SAs reaching their hard byte/packet limits are now deleted immediately, just like SAs that have reached their hard time limits. Previously they were announced immediately but only deleted after 30 seconds. This is bad because it prevents the system from issuing an ACQUIRE command until the existing state was deleted by the user or expires after the time is up. In the scenario where the expire notification was lost this introduces a 30 second delay into the system for no good reason. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:43:22 -07:00
Jamal Hadi Salim	26b15dad9f	[IPSEC] Add complete xfrm event notification Heres the final patch. What this patch provides - netlink xfrm events - ability to have events generated by netlink propagated to pfkey and vice versa. - fixes the acquire lets-be-happy-with-one-success issue Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2005-06-18 22:42:13 -07:00
Hideaki YOSHIFUJI	92d63decc0	From: Kazunori Miyazawa <kazunori@miyazawa.org> [XFRM] Call dst_check() with appropriate cookie This fixes infinite loop issue with IPv6 tunnel mode. Signed-off-by: Kazunori Miyazawa <kazunori@miyazawa.org> Signed-off-by: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-26 12:58:04 -07:00
Herbert Xu	31c26852cb	[IPSEC]: Verify key payload in verify_one_algo We need to verify that the payload contains enough data so that attach_one_algo can copy alg_key_len bits from the payload. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-19 12:39:49 -07:00
Herbert Xu	b9e9dead05	[IPSEC]: Fixed alg_key_len usage in attach_one_algo The variable alg_key_len is in bits and not bytes. The function attach_one_algo is currently using it as if it were in bytes. This causes it to read memory which may not be there. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-19 12:39:04 -07:00
Evgeniy Polyakov	d48102007d	[XFRM]: skb_cow_data() does not set proper owner for new skbs. It looks like skb_cow_data() does not set proper owner for newly created skb. If we have several fragments for skb and some of them are shared(?) or cloned (like in async IPsec) there might be a situation when we require recreating skb and thus using skb_copy() for it. Newly created skb has neither a destructor nor a socket assotiated with it, which must be copied from the old skb. As far as I can see, current code sets destructor and socket for the first one skb only and uses truesize of the first skb only to increment sk_wmem_alloc value. If above "analysis" is correct then attached patch fixes that. Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-18 22:51:45 -07:00
Herbert Xu	aabc9761b6	[IPSEC]: Store idev entries I found a bug that stopped IPsec/IPv6 from working. About a month ago IPv6 started using rt6i_idev->dev on the cached socket dst entries. If the cached socket dst entry is IPsec, then rt6i_idev will be NULL. Since we want to look at the rt6i_idev of the original route in this case, the easiest fix is to store rt6i_idev in the IPsec dst entry just as we do for a number of other IPv6 route attributes. Unfortunately this means that we need some new code to handle the references to rt6i_idev. That's why this patch is bigger than it would otherwise be. I've also done the same thing for IPv4 since it is conceivable that once these idev attributes start getting used for accounting, we probably need to dereference them for IPv4 IPsec entries too. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:27:10 -07:00
David S. Miller	0f4821e7b9	[XFRM/RTNETLINK]: Decrement qlen properly in {xfrm_,rt}netlink_rcv(). If we free up a partially processed packet because it's skb->len dropped to zero, we need to decrement qlen because we are dropping out of the top-level loop so it will do the decrement for us. Spotted by Herbert Xu. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 16:15:59 -07:00
David S. Miller	09e1430598	[NETLINK]: Fix infinite loops in synchronous netlink changes. The qlen should continue to decrement, even if we pop partially processed SKBs back onto the receive queue. Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 15:30:05 -07:00
Herbert Xu	2a0a6ebee1	[NETLINK]: Synchronous message processing. Let's recap the problem. The current asynchronous netlink kernel message processing is vulnerable to these attacks: 1) Hit and run: Attacker sends one or more messages and then exits before they're processed. This may confuse/disable the next netlink user that gets the netlink address of the attacker since it may receive the responses to the attacker's messages. Proposed solutions: a) Synchronous processing. b) Stream mode socket. c) Restrict/prohibit binding. 2) Starvation: Because various netlink rcv functions were written to not return until all messages have been processed on a socket, it is possible for these functions to execute for an arbitrarily long period of time. If this is successfully exploited it could also be used to hold rtnl forever. Proposed solutions: a) Synchronous processing. b) Stream mode socket. Firstly let's cross off solution c). It only solves the first problem and it has user-visible impacts. In particular, it'll break user space applications that expect to bind or communicate with specific netlink addresses (pid's). So we're left with a choice of synchronous processing versus SOCK_STREAM for netlink. For the moment I'm sticking with the synchronous approach as suggested by Alexey since it's simpler and I'd rather spend my time working on other things. However, it does have a number of deficiencies compared to the stream mode solution: 1) User-space to user-space netlink communication is still vulnerable. 2) Inefficient use of resources. This is especially true for rtnetlink since the lock is shared with other users such as networking drivers. The latter could hold the rtnl while communicating with hardware which causes the rtnetlink user to wait when it could be doing other things. 3) It is still possible to DoS all netlink users by flooding the kernel netlink receive queue. The attacker simply fills the receive socket with a single netlink message that fills up the entire queue. The attacker then continues to call sendmsg with the same message in a loop. Point 3) can be countered by retransmissions in user-space code, however it is pretty messy. In light of these problems (in particular, point 3), we should implement stream mode netlink at some point. In the mean time, here is a patch that implements synchronous processing. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:55:09 -07:00
Thomas Graf	492b558b31	[XFRM]: Cleanup xfrm_msg_min and xfrm_dispatch Converts xfrm_msg_min and xfrm_dispatch to use c99 designated initializers to make greping a little bit easier. Also replaces two hardcoded message type with meaningful names. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-05-03 14:26:40 -07:00
Patrick McHardy	5c5d281a93	[XFRM]: Fix existence lookup in xfrm_state_find Use 'daddr' instead of &tmpl->id.daddr, since the latter might be zero. Also, only perform the lookup when tmpl->id.spi is non-zero. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2005-04-21 20:12:32 -07:00
Linus Torvalds	1da177e4c3	Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!	2005-04-16 15:20:36 -07:00

28 Commits