1
linux/mm
Wu Fengguang 2fad6f5dee readahead: enforce full readahead size on async mmap readahead
We need this in one particular case and two more general ones.

Now we do async readahead for sequential mmap reads, and do it with the
help of PG_readahead.  For normal reads, PG_readahead is the sufficient
condition to do a sequential readahead.  But unfortunately, for mmap
reads, there is a tiny nuisance:

[11736.998347] readahead-init0(process: sh/23926, file: sda1/w3m, offset=0:4503599627370495, ra=0+4-3) = 4
[11737.014985] readahead-around(process: w3m/23926, file: sda1/w3m, offset=0:0, ra=290+32-0) = 17
[11737.019488] readahead-around(process: w3m/23926, file: sda1/w3m, offset=0:0, ra=118+32-0) = 32
[11737.024921] readahead-interleaved(process: w3m/23926, file: sda1/w3m, offset=0:2, ra=4+6-6) = 6
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                                 ~~~~~~~~~~~~~

An unfavorably small readahead.  The original dumb read-around size could
be more efficient.

That happened because ld-linux.so does a read(832) in L1 before mmap(),
which triggers a 4-page readahead, with the second page tagged
PG_readahead.

L0: open("/lib/libc.so.6", O_RDONLY)        = 3
L1: read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\342"..., 832) = 832
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
L2: fstat(3, {st_mode=S_IFREG|0755, st_size=1420624, ...}) = 0
L3: mmap(NULL, 3527256, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fac6e51d000
L4: mprotect(0x7fac6e671000, 2097152, PROT_NONE) = 0
L5: mmap(0x7fac6e871000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x154000) = 0x7fac6e871000
L6: mmap(0x7fac6e876000, 16984, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fac6e876000
L7: close(3)                                = 0

In general, the PG_readahead flag will also be hit in cases

- sequential reads

- clustered random reads

A full readahead size is desirable in both cases.

Cc: Nick Piggin <npiggin@suse.de>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-16 19:47:29 -07:00
..
allocpercpu.c percpu: __percpu_depopulate_mask can take a const mask 2009-04-06 13:44:15 -07:00
backing-dev.c block: change the request allocation/congestion logic to be sync/async based 2009-04-06 08:04:53 -07:00
bootmem.c bootmem: fix slab fallback on numa 2009-06-11 19:15:54 +03:00
bounce.c Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block 2009-06-11 11:10:35 -07:00
debug-pagealloc.c generic debug pagealloc 2009-04-01 08:59:13 -07:00
dmapool.c
fadvise.c readahead: move max_sane_readahead() calls into force_page_cache_readahead() 2009-06-16 19:47:28 -07:00
failslab.c kmemtrace, mm: fix slab.h dependency problem in mm/failslab.c 2009-04-03 12:23:01 +02:00
filemap_xip.c mm: do_xip_mapping_read: fix length calculation 2009-04-02 19:04:49 -07:00
filemap.c readahead: enforce full readahead size on async mmap readahead 2009-06-16 19:47:29 -07:00
fremap.c Do not account for the address space used by hugetlbfs using VM_ACCOUNT 2009-02-10 10:48:42 -08:00
highmem.c mm: introduce debug_kmap_atomic 2009-04-01 08:59:14 -07:00
hugetlb.c mm: account for MAP_SHARED mappings using VM_MAYSHARE and not VM_SHARED in hugetlbfs 2009-05-29 08:40:03 -07:00
init-mm.c mm: consolidate init_mm definition 2009-06-16 19:47:28 -07:00
internal.h nommu: there is no mlock() for NOMMU, so don't provide the bits 2009-04-01 08:59:14 -07:00
Kconfig security: use mmap_min_addr indepedently of security models 2009-06-04 12:07:48 +10:00
Kconfig.debug generic debug pagealloc: build fix 2009-04-02 19:04:48 -07:00
kmemleak-test.c kmemleak: Simple testing module for kmemleak 2009-06-11 17:04:19 +01:00
kmemleak.c kmemleak: Add the base support 2009-06-11 17:03:28 +01:00
maccess.c [S390] maccess: add weak attribute to probe_kernel_write 2009-06-12 10:27:37 +02:00
madvise.c readahead: move max_sane_readahead() calls into force_page_cache_readahead() 2009-06-16 19:47:28 -07:00
Makefile mm: consolidate init_mm definition 2009-06-16 19:47:28 -07:00
memcontrol.c memcg: fix build warning and avoid checking for mem != null again and again 2009-05-29 08:40:03 -07:00
memory_hotplug.c mm: remove GFP_HIGHUSER_PAGECACHE 2009-01-06 15:59:01 -08:00
memory.c mm: close page_mkwrite races 2009-05-02 15:36:09 -07:00
mempolicy.c [CVE-2009-0029] System call wrappers part 28 2009-01-14 14:15:30 +01:00
mempool.c
migrate.c FS-Cache: Recruit a page flags for cache management 2009-04-03 16:42:36 +01:00
mincore.c [CVE-2009-0029] System call wrappers part 14 2009-01-14 14:15:24 +01:00
mlock.c x86, bts, mm: clean up buffer allocation 2009-04-24 10:18:52 +02:00
mm_init.c
mmap.c Merge branch 'perfcounters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-06-11 14:01:07 -07:00
mmu_notifier.c
mmzone.c [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 2009-05-18 11:22:24 +01:00
mprotect.c perf_counter: Add mmap event hooks to mprotect() 2009-06-08 23:10:43 +02:00
mremap.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
msync.c [CVE-2009-0029] System call wrappers part 13 2009-01-14 14:15:23 +01:00
nommu.c nommu: Provide mmap_min_addr definition. 2009-06-10 09:24:09 +10:00
oom_kill.c oom: fix possible oom_dump_tasks NULL pointer 2009-05-29 08:40:01 -07:00
page_alloc.c kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash 2009-06-11 17:03:30 +01:00
page_cgroup.c memcg: fix page_cgroup fatal error in FLATMEM 2009-06-12 11:00:54 +03:00
page_io.c block: fix bad definition of BIO_RW_SYNC 2009-02-18 10:32:00 +01:00
page_isolation.c
page-writeback.c page-writeback: fix the calculation of the oldest_jif in wb_kupdate() 2009-05-17 16:36:11 -07:00
pagewalk.c
pdflush.c Revert "mm: add /proc controls for pdflush threads" 2009-05-15 11:32:24 +02:00
percpu.c percpu: remove rbtree and use page->index instead 2009-04-08 18:31:31 +02:00
prio_tree.c
quicklist.c cpumask: replace node_to_cpumask with cpumask_of_node. 2009-03-13 14:49:46 +10:30
readahead.c readahead: remove sync/async readahead call dependency 2009-06-16 19:47:29 -07:00
rmap.c hugh: update email address 2009-05-21 13:14:32 -07:00
shmem_acl.c
shmem.c integrity: move ima_counts_get 2009-05-22 09:45:33 +10:00
slab.c slab: setup cpu caches later on when interrupts are enabled 2009-06-12 18:53:58 +03:00
slob.c kmemleak: Add the slob memory allocation/freeing hooks 2009-06-11 17:03:30 +01:00
slub.c slab,slub: don't enable interrupts during early boot 2009-06-12 18:53:33 +03:00
sparse-vmemmap.c
sparse.c mm: mminit_validate_memmodel_limits(): remove redundant test 2009-04-01 08:59:11 -07:00
swap_state.c memcg: fix deadlock between lock_page_cgroup and mapping tree_lock 2009-05-29 08:40:02 -07:00
swap.c mm: fix Committed_AS underflow on large NR_CPUS environment 2009-05-02 15:36:10 -07:00
swapfile.c PM/hibernate: fix "swap breaks after hibernation failures" 2009-02-21 14:17:17 -08:00
thrash.c
truncate.c memcg: fix deadlock between lock_page_cgroup and mapping tree_lock 2009-05-29 08:40:02 -07:00
util.c Merge branch 'linus' into tracing/core 2009-05-07 11:17:34 +02:00
vmalloc.c Merge branch 'for-linus' of git://linux-arm.org/linux-2.6 2009-06-11 14:15:57 -07:00
vmscan.c PM/Suspend: Do not shrink memory before suspend 2009-06-12 21:32:32 +02:00
vmstat.c [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 2009-05-18 11:22:24 +01:00