Requested by Ingo, Thomas and HPA.
The old bootmem code is no longer necessary, and the transition is
complete. Remove it.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
memblock_memory_size() will return memory size in memblock.memory.region.
memblock_free_memory_size() will return free memory size in memblock.memory.region.
So We can get exact reseved size in specified range.
Set the size right after initmem_init(), because later bootmem API will
get area above 16M. (except some fallback).
Later after we remove the bootmem, We could call that just before paging_init().
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
1.include linux/memblock.h directly. so later could reduce e820.h reference.
2 this patch is done by sed scripts mainly
-v2: use MEMBLOCK_ERROR instead of -1ULL or -1UL
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
1. replace find_e820_area with memblock_find_in_range
2. replace reserve_early with memblock_x86_reserve_range
3. replace free_early with memblock_x86_free_range.
4. NO_BOOTMEM will switch to use memblock too.
5. use _e820, _early wrap in the patch, in following patch, will
replace them all
6. because memblock_x86_free_range support partial free, we can remove some special care
7. Need to make sure that memblock_find_in_range() is called after memblock_x86_fill()
so adjust some calling later in setup.c::setup_arch()
-- corruption_check and mptable_update
-v2: Move reserve_brk() early
Before fill_memblock_area, to avoid overlap between brk and memblock_find_in_range()
that could happen We have more then 128 RAM entry in E820 tables, and
memblock_x86_fill() could use memblock_find_in_range() to find a new place for
memblock.memory.region array.
and We don't need to use extend_brk() after fill_memblock_area()
So move reserve_brk() early before fill_memblock_area().
-v3: Move find_smp_config early
To make sure memblock_find_in_range not find wrong place, if BIOS doesn't put mptable
in right place.
-v4: Treat RESERVED_KERN as RAM in memblock.memory. and they are already in
memblock.reserved already..
use __NOT_KEEP_MEMBLOCK to make sure memblock related code could be freed later.
-v5: Generic version __memblock_find_in_range() is going from high to low, and for 32bit
active_region for 32bit does include high pages
need to replace the limit with memblock.default_alloc_limit, aka get_max_mapped()
-v6: Use current_limit instead
-v7: check with MEMBLOCK_ERROR instead of -1ULL or -1L
-v8: Set memblock_can_resize early to handle EFI with more RAM entries
-v9: update after kmemleak changes in mainline
Suggested-by: David S. Miller <davem@davemloft.net>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Also let memblock_x86_reserve_range/memblock_x86_free_range could print out name if memblock=debug is
specified
will also print ther name when reserve_memblock_area/free_memblock_area are called.
-v2: according to Ingo, put " if (memblock_debug) " in one place
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It will return memory size in specified range according to memblock.memory.region
Try to share some code with memblock_x86_free_memory_in_range() by passing get_free to
__memblock_x86_memory_in_range().
-v2: Ben want _in_range in the name instead of size
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It will return free memory size in specified range.
We can not use memory_size - reserved_size here, because some reserved area
may not be in the scope of memblock.memory.region.
Use memblock.memory.region subtracting memblock.reserved.region to get free range array.
then count size of all free ranges.
-v2: Ben insist on using _in_range
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It can be used to find NODE_DATA for numa.
Need to make sure early_node_map[] is filled before it is called, otherwise
it will fallback to memblock_find_in_range(), with node range.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
memblock_x86_register_active_regions() will be used to fill early_node_map,
the result will be memblock.memory.region AND numa data
memblock_x86_hole_size will be used to find hole size on memblock.memory.region
with specified range.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
get_free_all_memory_range is for CONFIG_NO_BOOTMEM=y, and will be called by
free_all_memory_core_early().
It will use early_node_map aka active ranges subtract memblock.reserved to
get all free range, and those ranges will convert to slab pages.
-v4: increase range size
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
They are wrappers for core versions, which take start/end/name instead
of base/size. This will make x86 conversion eaasier.
could add more debug print out
-v2: change get_max_mapped() to memblock.default_alloc_limit according to Michael
Ellerman and Ben
change to memblock_x86_reserve_range and memblock_x86_free_range according to Michael Ellerman
-v3: call check_and_double after reserve/free, so could avoid to use
find_memblock_area. Suggested by Michael Ellerman
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
memblock_x86_to_bootmem() will reserve memblock.reserved.region in
bootmem after bootmem is set up.
We can use it to with all arches that support memblock later.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It will be used memblock_x86_to_bootmem converting
It is an wrapper for reserve_bootmem, and x86 64bit is using special one.
Also clean up that version for x86_64. We don't need to take care of numa
path for that, bootmem can handle it how
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
size is returned according free range.
Will be used to find free ranges for early_memtest and memory corruption check
Do not mess it up with lib/memblock.c yet.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It's wrong for several reasons, but the most direct one is that the
fault may be for the stack accesses to set up a previous SIGBUS. When
we have a kernel exception, the kernel exception handler does all the
fixups, not some user-level signal handler.
Even apart from the nested SIGBUS issue, it's also wrong to give out
kernel fault addresses in the signal handler info block, or to send a
SIGBUS when a system call already returns EFAULT.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
kunmap_atomic() is currently at level -4 on Rusty's "Hard To Misuse"
list[1] ("Follow common convention and you'll get it wrong"), except in
some architectures when CONFIG_DEBUG_HIGHMEM is set[2][3].
kunmap() takes a pointer to a struct page; kunmap_atomic(), however, takes
takes a pointer to within the page itself. This seems to once in a while
trip people up (the convention they are following is the one from
kunmap()).
Make it much harder to misuse, by moving it to level 9 on Rusty's list[4]
("The compiler/linker won't let you get it wrong"). This is done by
refusing to build if the type of its first argument is a pointer to a
struct page.
The real kunmap_atomic() is renamed to kunmap_atomic_notypecheck()
(which is what you would call in case for some strange reason calling it
with a pointer to a struct page is not incorrect in your code).
The previous version of this patch was compile tested on x86-64.
[1] http://ozlabs.org/~rusty/index.cgi/tech/2008-04-01.html
[2] In these cases, it is at level 5, "Do it right or it will always
break at runtime."
[3] At least mips and powerpc look very similar, and sparc also seems to
share a common ancestor with both; there seems to be quite some
degree of copy-and-paste coding here. The include/asm/highmem.h file
for these three archs mention x86 CPUs at its top.
[4] http://ozlabs.org/~rusty/index.cgi/tech/2008-03-30.html
[5] As an aside, could someone tell me why mn10300 uses unsigned long as
the first parameter of kunmap_atomic() instead of void *?
Signed-off-by: Cesar Eduardo Barros <cesarb@cesarb.net>
Cc: Russell King <linux@arm.linux.org.uk> (arch/arm)
Cc: Ralf Baechle <ralf@linux-mips.org> (arch/mips)
Cc: David Howells <dhowells@redhat.com> (arch/frv, arch/mn10300)
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> (arch/mn10300)
Cc: Kyle McMartin <kyle@mcmartin.ca> (arch/parisc)
Cc: Helge Deller <deller@gmx.de> (arch/parisc)
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> (arch/parisc)
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> (arch/powerpc)
Cc: Paul Mackerras <paulus@samba.org> (arch/powerpc)
Cc: "David S. Miller" <davem@davemloft.net> (arch/sparc)
Cc: Thomas Gleixner <tglx@linutronix.de> (arch/x86)
Cc: Ingo Molnar <mingo@redhat.com> (arch/x86)
Cc: "H. Peter Anvin" <hpa@zytor.com> (arch/x86)
Cc: Arnd Bergmann <arnd@arndb.de> (include/asm-generic)
Cc: Rusty Russell <rusty@rustcorp.com.au> ("Hard To Misuse" list)
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Ioremap: fix wrong physical address handling in PAT code
x86, tlb: Clean up and correct used type
x86, iomap: Fix wrong page aligned size calculation in ioremapping code
x86, mm: Create symbolic index into address_markers array
x86, ioremap: Fix normal ram range check
x86, ioremap: Fix incorrect physical address handling in PAE mode
x86-64, mm: Initialize VDSO earlier on 64 bits
x86, kmmio/mmiotrace: Fix double free of kmmio_fault_pages
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
tracing/kprobes: unregister_trace_probe needs to be called under mutex
perf: expose event__process function
perf events: Fix mmap offset determination
perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
perf, powerpc: Convert the FSL driver to use local64_t
perf tools: Don't keep unreferenced maps when unmaps are detected
perf session: Invalidate last_match when removing threads from rb_tree
perf session: Free the ref_reloc_sym memory at the right place
x86,mmiotrace: Add support for tracing STOS instruction
perf, sched migration: Librarize task states and event headers helpers
perf, sched migration: Librarize the GUI class
perf, sched migration: Make the GUI class client agnostic
perf, sched migration: Make it vertically scrollable
perf, sched migration: Parameterize cpu height and spacing
perf, sched migration: Fix key bindings
perf, sched migration: Ignore unhandled task states
perf, sched migration: Handle ignored migrate out events
perf: New migration tool overview
tracing: Drop cpparg() macro
perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
...
Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
Add support for stos access tracing with mmiotrace.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Acked-by: Pekka Paalanen <pq@iki.fi>
Cc: Nouveau <nouveau@lists.freedesktop.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20100731205101.GA5860@joi.lan>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
The following two commits fixed a problem that x86 ioremap() doesn't handle
physical address higher than 32-bit properly in X86_32 PAE mode.
ffa71f33a8 (x86, ioremap: Fix incorrect
physical address handling in PAE mode)
35be1b716a (x86, ioremap: Fix normal
ram range check)
But these fixes are not enough, since pat_pagerange_is_ram() in PAT code
also has a same problem. This patch fixes it.
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <4C47DDCF.80300@jp.fujitsu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
smp_processor_id() returns an int and not an unsigned long.
Also, since the function is small enough, there's no need for a
local variable caching its value.
No functionality change, just cleanup.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
LKML-Reference: <20100721124705.GA674@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86 early_iounmap(): fix off-by-one error in page alignment of allocation
size for sizes where size%PAGE_SIZE==1.
Signed-off-by: Florian Zumbiehl <florz@florz.de>
LKML-Reference: <201007202219.o6KMJlES021058@imap1.linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Without this, adding entries into the address_markers array means adding
more and more of an #ifdef maze in pt_dump_init(). By using indices, we
can keep it a bit saner.
Signed-off-by: Andres Salomon <dilinger@queued.net>
LKML-Reference: <201007202219.o6KMJkUs021052@imap1.linux-foundation.org>
Cc: Jordan Crouse <jordan.crouse@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
pavel@suse.cz no longer works, replace it with working address.
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Check for normal RAM in x86 ioremap() code seems to not work for the
last page frame in the specified physical address range.
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <4C1AE6CD.1080704@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Current x86 ioremap() doesn't handle physical address higher than
32-bit properly in X86_32 PAE mode. When physical address higher than
32-bit is passed to ioremap(), higher 32-bits in physical address is
cleared wrongly. Due to this bug, ioremap() can map wrong address to
linear address space.
In my case, 64-bit MMIO region was assigned to a PCI device (ioat
device) on my system. Because of the ioremap()'s bug, wrong physical
address (instead of MMIO region) was mapped to linear address space.
Because of this, loading ioatdma driver caused unexpected behavior
(kernel panic, kernel hangup, ...).
Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <4C1AE680.7090408@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Reimplement augmented RB-trees without sprinkling extra branches
all over the RB-tree code (which lives in the scheduler hot path).
This approach is 'borrowed' from Fabio's BFQ implementation and
relies on traversing the rebalance path after the RB-tree-op to
correct the heap property for insertion/removal and make up for
the damage done by the tree rotations.
For insertion the rebalance path is trivially that from the new
node upwards to the root, for removal it is that from the deepest
node in the path from the to be removed node that will still
be around after the removal.
[ This patch also fixes a video driver regression reported by
Ali Gholami Rudi - the memtype->subtree_max_end was updated
incorrectly. ]
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Ali Gholami Rudi <ali@rudi.ir>
Cc: Fabio Checconi <fabio@gandalf.sssup.it>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1275414172.27810.27961.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
After every iounmap mmiotrace has to free kmmio_fault_pages, but
it can't do it directly, so it defers freeing by RCU.
It usually works, but when mmiotraced code calls ioremap-iounmap
multiple times without sleeping between (so RCU won't kick in
and start freeing) it can be given the same virtual address, so
at every iounmap mmiotrace will schedule the same pages for
release. Obviously it will explode on second free.
Fix it by marking kmmio_fault_pages which are scheduled for
release and not adding them second time.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Tested-by: Marcin Kocielnicki <koriakin@0x04.net>
Tested-by: Shinpei KATO <shinpei@il.is.s.u-tokyo.ac.jp>
Acked-by: Pekka Paalanen <pq@iki.fi>
Cc: Stuart Bennett <stuart@freedesktop.org>
Cc: Marcin Kocielnicki <koriakin@0x04.net>
Cc: nouveau@lists.freedesktop.org
Cc: <stable@kernel.org>
LKML-Reference: <20100613215654.GA3829@joi.lan>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
subtree_max_end that was recently added to struct memtype was not getting
properly initialized resulting in
WARNING: kmemcheck: Caught 64-bit read from uninitialized memory
in memtype_rb_augment_cb()
reported here
https://bugzilla.kernel.org/show_bug.cgi?id=16092
This change fixes the problem.
Reported-by: Christian Casteyde <casteyde.christian@free.fr>
Tested-by: Christian Casteyde <casteyde.christian@free.fr>
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
LKML-Reference: <1276217101-11515-1-git-send-email-venki@google.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, cpufeature: Unbreak compile with gcc 3.x
x86, pat: Fix memory leak in free_memtype
x86, k8: Fix section mismatch for powernowk8_exit()
lib/atomic64_test: fix missing include of linux/kernel.h
x86: remove last traces of quicklist usage
x86, setup: Phoenix BIOS fixup is needed on Dell Inspiron Mini 1012
x86: "nosmp" command line option should force the system into UP mode
arch/x86/pci: use kasprintf
x86, apic: ack all pending irqs when crashed/on kexec
This reverts commit 0ac0c0d0f8, which
caused cross-architecture build problems for all the wrong reasons.
IA64 already added its own version of __node_random(), but the fact is,
there is nothing architectural about the function, and the original
commit was just badly done. Revert it, since no fix is forthcoming.
Requested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (61 commits)
tracing: Add __used annotation to event variable
perf, trace: Fix !x86 build bug
perf report: Support multiple events on the TUI
perf annotate: Fix up usage of the build id cache
x86/mmiotrace: Remove redundant instruction prefix checks
perf annotate: Add TUI interface
perf tui: Remove annotate from popup menu after failure
perf report: Don't start the TUI if -D is used
perf: Fix getline undeclared
perf: Optimize perf_tp_event_match()
perf: Remove more code from the fastpath
perf: Optimize the !vmalloc backed buffer
perf: Optimize perf_output_copy()
perf: Fix wakeup storm for RO mmap()s
perf-record: Share per-cpu buffers
perf-record: Remove -M
perf: Ensure that IOC_OUTPUT isn't used to create multi-writer buffers
perf, trace: Optimize tracepoints by using per-tracepoint-per-cpu hlist to track events
perf, trace: Optimize tracepoints by removing IRQ-disable from perf/tracepoint interaction
perf tui: Allow disabling the TUI on a per command basis in ~/.perfconfig
...
x86 arch specific changes to use generic numa_node_id() based on generic
percpu variable infrastructure. Back out x86's custom version of
numa_node_id()
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Whitney <eric.whitney@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some workloads that create a large number of small files tend to assign
too many pages to node 0 (multi-node systems). Part of the reason is that
the rotor (in cpuset_mem_spread_node()) used to assign nodes starts at
node 0 for newly created tasks.
This patch changes the rotor to be initialized to a random node number of
the cpuset.
[akpm@linux-foundation.org: fix layout]
[Lee.Schermerhorn@hp.com: Define stub numa_random() for !NUMA configuration]
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Paul Menage <menage@google.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reserve_memtype will allocate memory for new memtype, but
in free_memtype, after the memtype erased from rbtree, the
memory is not freed.
Changes since V1:
make rbt_memtype_erase return erased memtype so that
it can be freed in free_memtype.
[ hpa: not for -stable: 2.6.34 and earlier not affected ]
Signed-off-by: Xiaotian Feng <dfeng@redhat.com>
LKML-Reference: <1274838670-8731-1-git-send-email-dfeng@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Jack Steiner <steiner@sgi.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
We still have a stray quicklist header included even though we axed
quicklist usage quite a while back.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <201005241913.o4OJDJe9010881@imap1.linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Get rid of the duplicated entries in prefix_codes[]
to eliminate redundant checks by skip_prefix().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Pekka Paalanen <pq@iki.fi>
LKML-Reference: <1274140110-5841-1-git-send-email-akinobu.mita@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'drm-for-2.6.35' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (207 commits)
drm/radeon/kms/pm/r600: select the mid clock mode for single head low profile
drm/radeon: fix power supply kconfig interaction.
drm/radeon/kms: record object that have been list reserved
drm/radeon: AGP memory is only I/O if the aperture can be mapped by the CPU.
drm/radeon/kms: don't default display priority to high on rs4xx
drm/edid: fix typo in 1600x1200@75 mode
drm/nouveau: fix i2c-related init table handlers
drm/nouveau: support init table i2c device identifier 0x81
drm/nouveau: ensure we've parsed i2c table entry for INIT_*I2C* handlers
drm/nouveau: display error message for any failed init table opcode
drm/nouveau: fix init table handlers to return proper error codes
drm/nv50: support fractional feedback divider on newer chips
drm/nv50: fix monitor detection on certain chipsets
drm/nv50: store full dcb i2c entry from vbios
drm/nv50: fix suspend/resume with DP outputs
drm/nv50: output calculated crtc pll when debugging on
drm/nouveau: dump pll limits entries when debugging is on
drm/nouveau: bios parser fixes for eDP boards
drm/nouveau: fix a nouveau_bo dereference after it's been destroyed
drm/nv40: remove some completed ctxprog TODOs
...
* 'x86-pat-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, pat: Update the page flags for memtype atomically instead of using memtype_lock
x86, pat: In rbt_memtype_check_insert(), update new->type only if valid
x86, pat: Migrate to rbtree only backend for pat memtype management
x86, pat: Preparatory changes in pat.c for bigger rbtree change
rbtree: Add support for augmented rbtrees
Conflicts:
drivers/gpu/drm/i915/i915_dma.c
drivers/gpu/drm/i915/i915_drv.h
drivers/gpu/drm/radeon/r300.c
The BSD ringbuffer support that is landing in this branch
significantly conflicts with the Ironlake PIPE_CONTROL fix on master,
and requires it to be tested successfully anyway.
With NUMA emulation, it's possible for a single cpu to be bound
to multiple nodes since more than one may have affinity if
allocated on a physical node that is local to the cpu.
APIC ids must therefore be mapped to the lowest node ids to
maintain generic kernel use of functions such as cpu_to_node()
that determine device affinity. For example, if a device has
proximity to physical node 1, for instance, and a cpu happens to
be mapped to a higher emulated node id 8, the proximity may not
be correctly determined by comparison in generic code even
though the cpu may be truly local and allocated on physical node 1.
When this happens, the true topology of the machine isn't
accurately represented in the emulated environment; although
this isn't critical to the system's uptime, any generic code
that is NUMA aware benefits from the physical topology being
accurately represented.
This can affect any system that maps multiple APIC ids to a
single node and is booted with numa=fake=N where N is greater
than the number of physical nodes.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <alpine.DEB.2.00.1005060224140.19473@chino.kir.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Commit e67a807 ("x86: Fix 'reservetop=' functionality") added a
fixup_early_ioremap() call to parse_reservetop() and declared it
in io.h.
But asm/io.h was only included indirectly - and on some configs
not at all, causing a build failure on those configs.
Cc: Liang Li <liang.li@windriver.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Wang Chen <wangchen@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1272621711-8683-1-git-send-email-liang.li@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
When specifying the 'reservetop=0xbadc0de' kernel parameter,
the kernel will stop booting due to a early_ioremap bug that
relates to commit 8827247ff.
The root cause of boot failure problem is the value of
'slot_virt[i]' was initialized in setup_arch->early_ioremap_init().
But later in setup_arch, the function 'parse_early_param' will
modify 'FIXADDR_TOP' when 'reservetop=0xbadc0de' being specified.
The simplest fix might be use __fix_to_virt(idx0) to get updated
value of 'FIXADDR_TOP' in '__early_ioremap' instead of reference
old value from slot_virt[slot] directly.
Changelog since v0:
-v1: When reservetop being handled then FIXADDR_TOP get
adjusted, Hence check prev_map then re-initialize slot_virt and
PMD based on new FIXADDR_TOP.
-v2: place fixup_early_ioremap hence call early_ioremap_init in
reserve_top_address to re-initialize slot_virt and
corresponding PMD when parse_reservertop
-v3: move fixup_early_ioremap out of reserve_top_address to make
sure other clients of reserve_top_address like xen/lguest won't
broken
Signed-off-by: Liang Li <liang.li@windriver.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Wang Chen <wangchen@cn.fujitsu.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <1272621711-8683-1-git-send-email-liang.li@windriver.com>
[ fixed three small cleanliness details in fixup_early_ioremap() ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
... i.e. when the hole between two regions isn't occupied by memory on
another node. This reduces the memory->node table size, thus reducing
cache footprint of lookups, which got increased significantly some
time ago, and things go back to how they were before that change on
the systems I looked at.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4BCF3230020000780003B3CA@vpn.id2.novell.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
While testing an application using the xpmem (out of kernel) driver, we
noticed a significant page fault rate reduction of x86_64 with respect
to ia64. For one test running with 32 cpus, one thread per cpu, it
took 01:08 for each of the threads to vm_insert_pfn 2GB worth of pages.
For the same test running on 256 cpus, one thread per cpu, it took 14:48
to vm_insert_pfn 2 GB worth of pages.
The slowdown was tracked to lookup_memtype which acquires the
spinlock memtype_lock. This heavily contended lock was slowing down
vm_insert_pfn().
With the cmpxchg on page->flags method, both the 32 cpu and 256 cpu
cases take approx 00:01.3 seconds to complete.
Signed-off-by: Robin Holt <holt@sgi.com>
LKML-Reference: <20100423153627.751194346@gulag1.americas.sgi.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com>
Cc: Rafael Wysocki <rjw@novell.com>
Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* drm-ttm-pool:
drm/ttm: using kmalloc/kfree requires including slab.h
drm/ttm: include linux/seq_file.h for seq_printf
drm/ttm: Add sysfs interface to control pool allocator.
drm/ttm: Use set_pages_array_wc instead of set_memory_wc.
arch/x86: Add array variants for setting memory to wc caching.
drm/nouveau: Add ttm page pool debugfs file.
drm/radeon/kms: Add ttm page pool debugfs file.
drm/ttm: Add debugfs output entry to pool allocator.
drm/ttm: add pool wc/uc page allocator V3