linux/include
Zhang Yanmin f1dd9c379c [NET]: Fix tbench regression in 2.6.25-rc1
Compared with kernel 2.6.24, the tbench result regresses with
2.6.25-rc1.

1) On 2 quad-core processor stoakley: 4%.
2) On 4 quad-core processor tigerton: more than 30%.

Bisecting located the patch below.

b4ce92775c is first bad commit
commit b4ce92775c
Author: Herbert Xu <herbert@gondor.apana.org.au>
Date:   Tue Nov 13 21:33:32 2007 -0800

    [IPV6]: Move nfheader_len into rt6_info

    The dst member nfheader_len is only used by IPv6.  It's also currently
    creating a rather ugly alignment hole in struct dst.  Therefore this patch
    moves it from there into struct rt6_info.

The above patch changes the cache line alignment, especially of member
__refcnt. I tested by adding 2 unsigned long padding fields before
lastuse, so that the 3 members lastuse/__refcnt/__use move to the next
cache line. The performance recovered.
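
The padding experiment described above can be sketched in isolation.
This is a minimal illustration, not the real struct dst_entry: the
struct names, the field count, and the 64-byte cache line are all
assumptions, and uint64_t stands in for unsigned long on a 64-bit
build. It only shows how two padding words push the three hot members
onto the next cache line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64	/* assumed cache line size in bytes */

/* Hypothetical layout: hot counters share line 0 with read-mostly data. */
struct demo_unpadded {
	uint64_t read_mostly[6];	/* bytes 0..47 */
	uint64_t lastuse;		/* byte 48 */
	int32_t  refcnt;		/* byte 56 */
	int32_t  use;			/* byte 60 */
};

/* Same layout with two padding words inserted before lastuse. */
struct demo_padded {
	uint64_t read_mostly[6];	/* bytes 0..47 */
	uint64_t pad[2];		/* bytes 48..63, fills up line 0 */
	uint64_t lastuse;		/* byte 64 */
	int32_t  refcnt;		/* byte 72 */
	int32_t  use;			/* byte 76 */
};

/* Index of the cache line that contains a given byte offset. */
static size_t demo_cache_line_of(size_t offset)
{
	return offset / DEMO_CACHE_LINE;
}
```

Calling demo_cache_line_of(offsetof(struct demo_padded, refcnt)) gives
the line index of the counter in the padded variant, which can be
compared against the unpadded one.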

I created a patch to rearrange the members in struct dst_entry.

Following suggestions from Eric and Valdis Kletnieks, I made a finer
arrangement.

1) Move tclassid under ops when CONFIG_NET_CLS_ROUTE=y, so that
   sizeof(dst_entry)=200 whether CONFIG_NET_CLS_ROUTE is y or n. I
   tested many patches on my 16-core tigerton, moving tclassid to
   different places. It looks like the position of tclassid can also
   affect performance. If tclassid is moved before metrics, or not
   moved at all, the performance isn't good. So I moved it behind
   metrics.

2) Add comments before __refcnt.
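
One way to keep a struct's size and the offsets of later members
independent of a config option is to put a same-sized placeholder where
the optional member would be. The sketch below is an assumption for
illustration only: the struct names and fields are hypothetical and do
not reproduce the real struct dst_entry. The two variants are written
out side by side, instead of under an #ifdef, so that both can be
compared in a single build.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant built with the optional member present. */
struct demo_with_classid {
	uint64_t ops;		/* stand-in for a pointer-sized member */
	uint32_t tclassid;	/* optional member */
	uint32_t refcnt;	/* hot member whose offset must not move */
};

/* Hypothetical variant built without it: a pad holds the slot. */
struct demo_without_classid {
	uint64_t ops;
	uint32_t pad;		/* placeholder, same size as tclassid */
	uint32_t refcnt;
};

/* Returns 1 when both variants have the same size and refcnt offset. */
static int demo_layouts_match(void)
{
	return sizeof(struct demo_with_classid) ==
	       sizeof(struct demo_without_classid) &&
	       offsetof(struct demo_with_classid, refcnt) ==
	       offsetof(struct demo_without_classid, refcnt);
}
```

With a placeholder like this, the cache line that holds refcnt does not
depend on whether the option is enabled.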

On 16-core tigerton:

If CONFIG_NET_CLS_ROUTE=y, the result with the patch below is about 18%
better than the result without it;

If CONFIG_NET_CLS_ROUTE=n, the result with the patch below is about 30%
better than the result without it.

With 32-bit 2.6.25-rc1 on 8-core stoakley, the new patch doesn't
introduce a regression.

Thanks to Eric, Valdis, and David!

Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com>
Acked-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-03-12 22:52:37 -07:00
..
acpi ACPI, cpuidle: Clarify C-state description in sysfs 2008-02-14 00:09:55 -05:00
asm-alpha CONFIG_HIGHPTE vs. sub-page page tables. 2008-02-08 09:22:42 -08:00
asm-arm spi: pxa2xx_spi clock polarity fix 2008-02-23 17:12:14 -08:00
asm-avr32 CONFIG_HIGHPTE vs. sub-page page tables. 2008-02-08 09:22:42 -08:00
asm-blackfin Add pgtable_t to remaining nommu architectures 2008-02-09 11:08:33 -08:00
asm-cris Merge branch 'cris' of git://www.jni.nu/cris 2008-02-08 10:01:28 -08:00
asm-frv FRV: Change the timerfd syscalls to be the same as i386 2008-02-20 19:58:16 -08:00
asm-generic percpu: fix DEBUG_PREEMPT per_cpu checking 2008-02-23 12:09:28 -08:00
asm-h8300 h8300: IRQ handling update 2008-02-23 17:12:16 -08:00
asm-ia64 [IA64] Fix build for sim_defconfig 2008-02-11 13:23:46 -08:00
asm-m32r CONFIG_HIGHPTE vs. sub-page page tables. 2008-02-08 09:22:42 -08:00
asm-m68k CONFIG_HIGHPTE vs. sub-page page tables. 2008-02-08 09:22:42 -08:00
asm-m68knommu m68knommu: use tabs not spaces in cacheflush.h 2008-02-14 20:58:05 -08:00
asm-mips Alchemy: compile fix 2008-02-24 20:03:42 +01:00
asm-mn10300 MN10300: define SO_MARK 2008-02-23 17:12:13 -08:00
asm-parisc CONFIG_HIGHPTE vs. sub-page page tables. 2008-02-08 09:22:42 -08:00
asm-powerpc percpu: fix DEBUG_PREEMPT per_cpu checking 2008-02-23 12:09:28 -08:00
asm-ppc [POWERPC] Fix arch/ppc compilation - add typedef for pgtable_t 2008-02-14 22:11:02 +11:00
asm-s390 [S390] find bit corner case. 2008-02-19 15:29:33 +01:00
asm-sh sh: fix ioreadN_rep and iowriteN_rep 2008-02-14 14:25:37 +09:00
asm-sparc [SPARC]: Merge asm-sparc{,64}/a.out.h 2008-02-09 22:25:50 -08:00
asm-sparc64 [SPARC64]: More sparse warning fixes in process.c 2008-02-19 21:25:50 -08:00
asm-um uml: x86_64 should copy %fs during fork 2008-02-08 09:22:43 -08:00
asm-v850 Add pgtable_t to remaining nommu architectures 2008-02-09 11:08:33 -08:00
asm-x86 Remove empty file remnants that were left in the tree by mistake 2008-02-20 19:56:01 -08:00
asm-xtensa [XTENSA] Allow debugger to modify the WINDOWBASE register. 2008-02-13 17:45:36 -08:00
crypto
keys
linux [NETFILTER]: nfnetlink: fix ifdef in nfnetlink_compat.h 2008-03-10 16:41:06 -07:00
math-emu
media V4L/DVB (7192): Adds support for Genius TVGo A11MCE 2008-02-18 11:15:19 -03:00
mtd
net [NET]: Fix tbench regression in 2.6.25-rc1 2008-03-12 22:52:37 -07:00
pcmcia
rdma IB/core: Remove unused struct ib_device.flags member 2008-02-08 14:47:26 -08:00
rxrpc
scsi [SCSI] update SG_ALL to avoid causing chaining 2008-02-11 13:40:13 -06:00
sound [ALSA] opl3 - Fix compilation without sequencer support 2008-02-22 14:20:08 -08:00
video
xen
Kbuild