linux

David S. Miller 5e9965c15b Merge branch 'kill_rtcache'	2012-07-22 17:04:15 -07:00

The ipv4 routing cache is non-deterministic, performance wise, and is subject to reasonably easy to launch denial of service attacks.

The routing cache works great for well behaved traffic, and the world was a much friendlier place when the tradeoffs that led to the routing cache's design were considered.

What it boils down to is that the performance of the routing cache is a product of the traffic patterns seen by a system rather than being a product of the contents of the routing tables. The former of which is controllable by external entities.

Even for "well behaved" legitimate traffic, high volume sites can see hit rates in the routing cache of only ~10%.

The general flow of this patch series is that first the routing cache is removed. We build a completely new rtable entry every lookup request.

Next we make some simplifications due to the fact that removing the routing cache causes several members of struct rtable to become no longer necessary.

Then we need to make some amends such that we can legally cache pre-constructed routes in the FIB nexthops. Firstly, we need to invalidate routes which are hit with nexthop exceptions. Secondly we have to change the semantics of rt->rt_gateway such that zero means that the destination is on-link and non-zero otherwise.

Now that the preparations are ready, we start caching precomputed routes in the FIB nexthops. Output and input routes need different kinds of care when determining if we can legally do such caching or not. The details are in the commit log messages for those changes.

The patch series then winds down with some more struct rtable simplifications and other tidy ups that remove unnecessary overhead.

On a SPARC-T3 output route lookups are ~876 cycles. Input route lookups are ~1169 cycles with rpfilter disabled, and about ~1468 cycles with rpfilter enabled.

These measurements were taken with the kbench_mod test module in the net_test_tools GIT tree:

    git://git.kernel.org/pub/scm/linux/kernel/git/davem/net_test_tools.git

That GIT tree also includes a udpflood tester tool and stresses route lookups on packet output.

For example, on the same SPARC-T3 system we can run:

    time ./udpflood -l 10000000 10.2.2.11

with routing cache:

    real    1m21.955s
    user    0m6.530s
    sys     1m15.390s

without routing cache:

    real    1m31.678s
    user    0m6.520s
    sys     1m25.140s

Performance undoubtedly can easily be improved further.

For example fib_table_lookup() performs a lot of excessive computations with all the masking and shifting, some of it conditionalized to deal with edge cases.

Also, Eric's no-ref optimization for input route lookups can be re-instated for the FIB nexthop caching code path. I would be really pleased if someone would work on that.

In fact anyone suitably motivated can just fire up perf on the loading of the test net_test_tools benchmark kernel module. I spend much of my time going:

    bash# perf record insmod ./kbench_mod.ko dst=172.30.42.22 src=74.128.0.1 iif=2
    bash# perf report

Thanks to helpful feedback from Joe Perches, Eric Dumazet, Ben Hutchings, and others.

Signed-off-by: David S. Miller <davem@davemloft.net>
..
associola.c	sctp: Implement quick failover draft from tsvwg	2012-07-22 12:13:46 -07:00
auth.c	sctp: better integer overflow check in sctp_auth_create_key()	2011-11-29 15:51:03 -05:00
bind_addr.c	net: Remove casts of void *	2011-06-16 23:19:27 -04:00
chunk.c
command.c
debug.c	sctp: remove completely unsed EMPTY state	2011-04-20 01:51:03 -07:00
endpointola.c	treewide: Fix typos in various parts of the kernel, and fix some comments.	2011-12-02 14:57:31 +01:00
input.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2012-07-19 11:17:30 -07:00
inqueue.c
ipv6.c	ipv6: Add redirect support to all protocol icmp error handlers.	2012-07-12 00:25:15 -07:00
Kconfig
Makefile
objcnt.c
output.c	sctp: Adjust PMTU updates to accomodate route invalidation.	2012-07-16 03:57:14 -07:00
outqueue.c	sctp: Implement quick failover draft from tsvwg	2012-07-22 12:13:46 -07:00
primitive.c
probe.c
proc.c	net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules	2011-10-31 19:30:30 -04:00
protocol.c	sctp: fix warning when compiling without IPv6	2012-06-19 00:26:26 -07:00
sm_make_chunk.c	sctp: fix sparse warning for sctp_init_cause_fixed	2012-07-16 23:23:52 -07:00
sm_sideeffect.c	sctp: Implement quick failover draft from tsvwg	2012-07-22 12:13:46 -07:00
sm_statefuns.c	net: Convert net_ratelimit uses to net_<level>_ratelimited	2012-05-15 13:45:03 -04:00
sm_statetable.c	sctp: Enforce retransmission limit during shutdown	2011-07-07 14:08:44 -07:00
socket.c	sctp: Implement quick failover draft from tsvwg	2012-07-22 12:13:46 -07:00
ssnmap.c
sysctl.c	sctp: Implement quick failover draft from tsvwg	2012-07-22 12:13:46 -07:00
transport.c	Merge branch 'kill_rtcache'	2012-07-22 17:04:15 -07:00
tsnmap.c	sctp: be more restrictive in transport selection on bundled sacks	2012-06-30 22:44:35 -07:00
ulpevent.c	sctp: be more restrictive in transport selection on bundled sacks	2012-06-30 22:44:35 -07:00
ulpqueue.c	sctp: be more restrictive in transport selection on bundled sacks	2012-06-30 22:44:35 -07:00