1
Commit Graph

638 Commits

Author SHA1 Message Date
Yinghai Lu
49c980df55 x86: fix vmemmap printout check
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: "Nick Piggin" <npiggin@suse.de>
Cc: "Mark McLoughlin" <markmc@redhat.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: "Eduardo Habkost" <ehabkost@redhat.com>
Cc: "Vegard Nossum" <vegard.nossum@gmail.com>
Cc: "Stephen Tweedie" <sct@redhat.com>
Cc: "Jeremy Fitzhardinge" <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 10:43:24 +02:00
Yinghai Lu
b50efd2a55 x86: introduce page_size_mask for 64bit
prepare for overmapped patch

also printout last_map_addr together with end

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 09:37:45 +02:00
Andrew Morton
5ed4273af8 arch/x86/mm/pgtable_32.c: remove unused variable `fixmaps'
arch/x86/mm/pgtable_32.c:144: warning: 'fixmaps' defined but not used

Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 08:18:40 +02:00
Jack Steiner
3a9e189d69 x86: map UV chipset space - pagetable
Add boot-time function for creating additional 2MB page table entries for
mapping chipset specific cached/uncached ranges.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-09 07:43:23 +02:00
Yinghai Lu
6247943d8a x86: remove acpi_srat config v2
use ACPI_NUMA directly

and move srat_32.c to mm/

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 15:49:08 +02:00
Yinghai Lu
698839fe04 x86: remove have_arch_parse_srat -v2
we already have the same srat handling interface for 32bit.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 15:49:01 +02:00
Jeremy Fitzhardinge
5a654ba7a8 x86/cpa: use an undefined PTE bit for testing CPA
Rather than using _PAGE_GLOBAL - which not all CPUs support - to test
CPA, use one of the reserved-for-software-use PTE flags instead.  This
allows CPA testing to work on CPUs which don't support PGD.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:30 +02:00
Jeremy Fitzhardinge
ef5e94af16 x86_32: remove __PAGE_KERNEL(_EXEC)
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Older x86-32 processors do not support global mappings (PGD), so must
only use it if the processor supports it.

The _PAGE_KERNEL* flags always have _PAGE_KERNEL set, since logically
we always want it set.

This is OK even on processors which do not support PGD, since all
_PAGE flags are masked with __supported_pte_mask before being turned
into a real in-pagetable pte.  On 32-bit systems, __supported_pte_mask
is initialized to not contain _PAGE_GLOBAL, and it is then added if
the CPU is found to support it.

The x86-32 code used to use __PAGE_KERNEL/__PAGE_KERNEL_EXEC for this
purpose, but they're now redundant and can be removed.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:29 +02:00
Jeremy Fitzhardinge
574977a2ed x86_64/setup: unconditionally populate the pgd
When allocating a new pud, unconditionally populate the pgd (why did
we bother to create a new pud if we weren't going to populate it?).

This will only happen if the pgd slot was empty, since any existing
pud will be reused.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:27 +02:00
Yinghai Lu
cb95a13a8a x86: merge zones_sizes_init for numa and non numa on 32-bit
move out e820_register_active_regions from non numa zones_sizes_init()
and remove numa version zones_sizes_init().

and let 32 bit call remove_all_active_ranges() in setup_arch() directly
like 64-bit

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:22 +02:00
Yinghai Lu
996cf4438f x86: don't reallocate pgt for node0
kva ram already mapped right after away, so don't need to get that for low ram.
avoid wasting one copy of pgdat.

also add node id in early_res name in case we get it from find_e820_area.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:15 +02:00
Yinghai Lu
a04ad82d0b x86: fix init_memory_mapping over boundary, v4
use PMD_SHIFT to calculate boundary also adjust size for pre-allocated
table size

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:07 +02:00
Yinghai Lu
7482b0e962 x86: fix init_memory_mapping over boundary v3
some ram-end boundary only has page alignment, instead of 2M alignment.

v2: make init_memory_mapping more solid: start could be any value other than 0
v3: fix NON PAE by handling left over in kernel_physical_mapping

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:06 +02:00
Jeremy Fitzhardinge
22b45144f6 x86/paravirt: groundwork for 64-bit Xen support, fix #2
Ingo Molnar wrote:
> that fixed the build but now we've got a boot crash with this config:
>
>  time.c: Detected 2010.304 MHz processor.
>  spurious 8259A interrupt: IRQ7.
>  BUG: unable to handle kernel NULL pointer dereference at  0000000000000000
>  IP: [<0000000000000000>]
>  PGD 0
>  Thread overran stack, or stack corrupted
>  Oops: 0010 [1] SMP
>  CPU 0
>

I don't know if this will fix this bug, but it's definitely a bugfix.
It was trashing random pages by overwriting them with pagetables...

Don't trash a large pmd's data when mapping physical memory.
This is a bugfix for "x86_64: adjust mapping of physical pagetables
to work with Xen".

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:02 +02:00
Yinghai Lu
e7b3789524 x86: move fix mapping page table range early
do that in init_memory_mapping

also remove one init_ohci1394_dma_on_all_controllers

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:16:01 +02:00
Jeremy Fitzhardinge
8207c2570a x86: fix pte allocation in "x86: introduce init_memory_mapping for 32bit"
The patch "x86: introduce init_memory_mapping for 32bit" does not allocate
enough space for PTEs if the CPU does not implement PSE.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:15:58 +02:00
Eduardo Habkost
0814e0bace x86, 64-bit: split set_pte_vaddr()
We will need to set a pte on l3_user_pgt. Extract set_pte_vaddr_pud()
from set_pte_vaddr(), that will accept the l3 page table as parameter.

This change should be a no-op for existing code.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:11:09 +02:00
Jeremy Fitzhardinge
7c934d3990 x86, 64-bit: create small vmemmap mappings if PSE not available
If PSE is not available, then fall back to 4k page mappings for the
vmemmap area.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:11:08 +02:00
Jeremy Fitzhardinge
4f9c11dd49 x86, 64-bit: adjust mapping of physical pagetables to work with Xen
This makes a few of changes to the construction of the initial
pagetables to work better with paravirt_ops/Xen.  The main areas
are:

 1. Support non-PSE mapping of memory, since Xen doesn't currently
    allow 2M pages to be mapped in guests.

 2. Make sure that the ioremap alias of all pages are dropped before
    attaching the new page to the pagetable.  This avoids having
    writable aliases of pagetable pages.

 3. Preserve existing pagetable entries, rather than overwriting.  Its
    possible that a fair amount of pagetable has already been constructed,
    so reuse what's already in place rather than ignoring and overwriting it.

The algorithm relies on the invariant that any page which is part of
the kernel pagetable is itself mapped in the linear memory area.  This
way, it can avoid using ioremap on a pagetable page.

The invariant holds because it maps memory from low to high addresses,
and also allocates memory from low to high.  Each allocated page can
map at least 2M of address space, so the mapped area will always
progress much faster than the allocated area.  It relies on the early
boot code mapping enough pages to get started.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:11:07 +02:00
Jeremy Fitzhardinge
d8d5900ef8 x86: preallocate and prepopulate separately
Jan Beulich points out that vmalloc_sync_all() assumes that the
kernel's pmd is always expected to be present in the pgd.  The current
pgd construction code will add the pgd to the pgd_list before its pmds
have been pre-populated, thereby making it visible to
vmalloc_sync_all().

However, because pgd_prepopulate_pmd also does the allocation, it may
block and cannot be done under spinlock.

The solution is to preallocate the pmds out of the spinlock, then
populate them while holding the pgd_list lock.

This patch also pulls the pmd preallocation and mop-up functions out
to be common, assuming that the compiler will generate no code for
them when PREALLOCTED_PMDS is 0.  Also, there's no need for pgd_ctor
to clear the pgd again, since it's allocated as a zeroed page.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Jan Beulich <jbeulich@novell.com>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:11:02 +02:00
Jeremy Fitzhardinge
eba0045ff8 x86/paravirt: add a pgd_alloc/free hooks
Add hooks which are called at pgd_alloc/free time.  The pgd_alloc hook
may return an error code, which if non-zero, causes the pgd allocation
to be failed.  The hooks may be used to allocate/free auxillary
per-pgd information.

also fix:

> * Ingo Molnar <mingo@elte.hu> wrote:
>
>  include/asm/pgalloc.h: In function ‘paravirt_pgd_free':
>  include/asm/pgalloc.h:14: error: parameter name omitted
>  arch/x86/kernel/entry_64.S: In file included from
>  arch/x86/kernel/traps_64.c:51:include/asm/pgalloc.h: In function ‘paravirt_pgd_free':
>  include/asm/pgalloc.h:14: error: parameter name omitted

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:11:01 +02:00
Jeremy Fitzhardinge
67350a5c45 x86: simplify vmalloc_sync_all
vmalloc_sync_all() is only called from register_die_notifier and
alloc_vm_area.  Neither is on any performance-critical paths, so
vmalloc_sync_all() itself is not on any hot paths.

Given that the optimisations in vmalloc_sync_all add a fair amount of
code and complexity, and are fairly hard to evaluate for correctness,
it's better to just remove them to simplify the code rather than worry
about its absolute performance.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:10:59 +02:00
Yinghai Lu
c987d12f84 x86: remove end_pfn in 64bit
and use max_pfn directly.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:10:38 +02:00
Yinghai Lu
d86623a0d5 x86: add table_top check for alloc_low_page in 64 bit
that range is from find_e820_area, so don't try to use end_pfn to see
if out of boundary...use table_top instead to avoid possible strange
result while cross the boundary...

also change early_printk to printk, because init_memory_mapping is after
early param parsing, and console=uart8250 already working at that time.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:10:36 +02:00
Yinghai Lu
1a0db38e5f x86: get max_pfn_mapped in init_memory_mapping
so don't shift that in the loop

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:10:35 +02:00
Yinghai Lu
3a58a2a6c8 x86: introduce init_memory_mapping for 32bit #3
move kva related early backto initmem_init for numa32

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:10:33 +02:00
Yinghai Lu
cfb0e53b05 x86: introduce init_memory_mapping for 32bit #2
moving relocate_initrd early

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:10:32 +02:00
Yinghai Lu
4e29684c40 x86: introduce init_memory_mapping for 32bit #1
... so can we use mem below max_low_pfn earlier.

this allows us to move several functions more early instead of waiting
to after paging_init.

That includes moving relocate_initrd() earlier in the bootup, and kva
related early setup done in initmem_init. (in followup patches)

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:10:32 +02:00
Jeremy Fitzhardinge
4583ed514e x86, 64-bit: unify early_ioremap
The 32-bit early_ioremap will work equally well for 64-bit, so just use it.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:10:28 +02:00
Jeremy Fitzhardinge
bb23e403e5 x86, 64-bit: use p??_populate() to attach pages to pagetable
Use the _populate() functions to attach new pages to a pagetable, to
make sure the right paravirt_ops calls get called.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: xen-devel <xen-devel@lists.xensource.com>
Cc: Stephen Tweedie <sct@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 13:10:27 +02:00
Yinghai Lu
11cd0bc140 x86: move some func calling from setup_arch to paging_init
those function depend on paging setup pgtable, so they could access
the ram in bootmem region but just get mapped.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:50:24 +02:00
Yinghai Lu
c09434571d x86: numa32 pfn print out using hex instead
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:50:23 +02:00
Yinghai Lu
6a07a0edac x86: fix compile warning in init_64.c
len is long and ret is only for NUMA

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:50:23 +02:00
Yinghai Lu
346cafecde x86: clean up min_low_pfn
for 32bit
we already had early_res support, so don't need to track min_low_pfn.
keep it to 0 always.

also use init_bootmem_node instead of init_bootmem, so don't touch
min_low_pfn.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:50:21 +02:00
Yinghai Lu
2ec65f8b89 x86: clean up using max_low_pfn on 32-bit
so that max_low_pfn is not changed after it is set.
so we can move that early and out of initmem_init.

could call find_low_pfn_range just after max_pfn is set.

also could move reserve_initrd out of setup_bootmem_allocator

so 32bit is more like 64bit.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:50:20 +02:00
Yinghai Lu
bef1568d97 x86: move reservetop and vmalloc parsing to pgtable_32.c
also change reserve_top_address to __init attibute

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:50:19 +02:00
Yinghai Lu
90d967e0ef x86: move find_max_low_pfn to init_32.c
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:50:18 +02:00
Yinghai Lu
225c37d71b x86: introduce reserve_initrd
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:50:16 +02:00
Yinghai Lu
b2ac82a090 x86: introduce initmem_init for 32 bit
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:50:15 +02:00
Yinghai Lu
1f75d7e32e x86: introduce initmem_init for 64 bit
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:50:14 +02:00
Yinghai Lu
d52d53b8a5 RFC x86: try to remove arch_get_ram_range
want to remove arch_get_ram_range, and use early_node_map instead.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:48:27 +02:00
Jeremy Fitzhardinge
a7bf0bd5e6 build: add __page_aligned_data and __page_aligned_bss
Making a variable page-aligned by using
__attribute__((section(".data.page_aligned"))) is fragile because if
sizeof(variable) is not also a multiple of page size, it leaves
variables in the remainder of the section unaligned.

This patch introduces two new qualifiers, __page_aligned_data and
__page_aligned_bss to set the section *and* the alignment of
variables.  This makes page-aligned variables more robust because the
linker will make sure they're aligned properly.  Unfortunately it
requires *all* page-aligned data to use these macros...

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:48:13 +02:00
Ingo Molnar
6236af82d8 Merge branch 'x86/fixmap' into x86/devel
Conflicts:

	arch/x86/mm/init_64.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 12:24:29 +02:00
Ingo Molnar
2b4fa851b2 Merge branch 'x86/numa' into x86/devel
Conflicts:

	arch/x86/Kconfig
	arch/x86/kernel/e820.c
	arch/x86/kernel/efi_64.c
	arch/x86/kernel/mpparse.c
	arch/x86/kernel/setup.c
	arch/x86/kernel/setup_32.c
	arch/x86/mm/init_64.c
	include/asm-x86/proto.h

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 11:59:23 +02:00
Bernhard Walle
3fd052b1b4 x86: add flags parameter to reserve_bootmem_generic()
This patch adds a 'flags' parameter to reserve_bootmem_generic() like it
already has been added in reserve_bootmem() with commit
72a7fe3967.

It also changes all users to use BOOTMEM_DEFAULT, which doesn't effectively
change the behaviour. Since the change is x86-specific, I don't think it's
necessary to add a new API for migration. There are only 4 users of that
function.

The change is necessary for the next patch, using reserve_bootmem_generic()
for crashkernel reservation.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 11:49:49 +02:00
Thomas Gleixner
886533a3e3 x86: numa_64.c fix shadowed variable
sparse mutters:
arch/x86/mm/numa_64.c:195:27: warning: symbol 'end_pfn' shadows an earlier one

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 11:31:29 +02:00
Thomas Gleixner
864fc31ea5 x86: numa_64.c make local variables static
plat_node_bdata, cmdline, nodemap_addr, nodemap_size are local to
numa_64.c. Make them static

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 11:31:28 +02:00
Mike Travis
9f248bde9d x86: remove the static 256k node_to_cpumask_map
* Consolidate node_to_cpumask operations and remove the 256k
    byte node_to_cpumask_map.  This is done by allocating the
    node_to_cpumask_map array after the number of possible nodes
    (nr_node_ids) is known.

  * Debug printouts when CONFIG_DEBUG_PER_CPU_MAPS is active have
    been increased.  It now shows faults when calling node_to_cpumask()
    and node_to_cpumask_ptr().

For inclusion into sched-devel/latest tree.

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-08 11:31:24 +02:00
Mike Travis
23ca4bba3e x86: cleanup early per cpu variables/accesses v4
* Introduce a new PER_CPU macro called "EARLY_PER_CPU".  This is
    used by some per_cpu variables that are initialized and accessed
    before there are per_cpu areas allocated.

    ["Early" in respect to per_cpu variables is "earlier than the per_cpu
    areas have been setup".]

    This patchset adds these new macros:

	DEFINE_EARLY_PER_CPU(_type, _name, _initvalue)
	EXPORT_EARLY_PER_CPU_SYMBOL(_name)
	DECLARE_EARLY_PER_CPU(_type, _name)

	early_per_cpu_ptr(_name)
	early_per_cpu_map(_name, _idx)
	early_per_cpu(_name, _cpu)

    The DEFINE macro defines the per_cpu variable as well as the early
    map and pointer.  It also initializes the per_cpu variable and map
    elements to "_initvalue".  The early_* macros provide access to
    the initial map (usually setup during system init) and the early
    pointer.  This pointer is initialized to point to the early map
    but is then NULL'ed when the actual per_cpu areas are setup.  After
    that the per_cpu variable is the correct access to the variable.

    The early_per_cpu() macro is not very efficient but does show how to
    access the variable if you have a function that can be called both
    "early" and "late".  It tests the early ptr to be NULL, and if not
    then it's still valid.  Otherwise, the per_cpu variable is used
    instead:

	#define early_per_cpu(_name, _cpu) 			\
		(early_per_cpu_ptr(_name) ?			\
			early_per_cpu_ptr(_name)[_cpu] :	\
			per_cpu(_name, _cpu))

    A better method is to actually check the pointer manually.  In the
    case below, numa_set_node can be called both "early" and "late":

	void __cpuinit numa_set_node(int cpu, int node)
	{
	    int *cpu_to_node_map = early_per_cpu_ptr(x86_cpu_to_node_map);

	    if (cpu_to_node_map)
		    cpu_to_node_map[cpu] = node;
	    else
		    per_cpu(x86_cpu_to_node_map, cpu) = node;
	}

  * Add a flag "arch_provides_topology_pointers" that indicates pointers
    to topology cpumask_t maps are available.  Otherwise, use the function
    returning the cpumask_t value.  This is useful if cpumask_t set size
    is very large to avoid copying data on to/off of the stack.

  * The coverage of CONFIG_DEBUG_PER_CPU_MAPS has been increased while
    the non-debug case has been optimized a bit.

  * Remove an unreferenced compiler warning in drivers/base/topology.c

  * Clean up #ifdef in setup.c

For inclusion into sched-devel/latest tree.

Based on:
	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    +   sched-devel/latest  .../mingo/linux-2.6-sched-devel.git

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-08 11:31:20 +02:00
Ingo Molnar
3de352bbd8 Merge branch 'x86/mpparse' into x86/devel
Conflicts:

	arch/x86/Kconfig
	arch/x86/kernel/io_apic_32.c
	arch/x86/kernel/setup_64.c
	arch/x86/mm/init_32.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 11:14:58 +02:00
Yinghai Lu
a4caa18efe x86: fix compiling when CONFIG_X86_MPPARSE is not set
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 10:39:19 +02:00
Yinghai Lu
064d25f120 x86: merge setup_memory_map with e820
... and kill e820_32/64.c and e820_32/64.h

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 10:38:25 +02:00
Yinghai Lu
cc9f7a0ccf x86: kill bad_ppro
so don't punish all other cpus without that problem when init highmem

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 10:38:19 +02:00
Yinghai Lu
b5bc6c0e55 x86, mm: use add_highpages_with_active_regions() for high pages init v2
use early_node_map to init high pages, so we can remove page_is_ram() and
page_is_reserved_early() in the big loop with add_one_highpage

also remove page_is_reserved_early(), it is not needed anymore.

v2: fix the build of other platforms

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 10:37:25 +02:00
Yinghai Lu
cc1050bafe x86: replace shrink_active_range() with remove_active_range()
in case we have kva before ramdisk on a node, we still need to use
those ranges.

v2: reserve_early kva ram area, in case there are holes in highmem, to avoid
    those area could be treat as free high pages.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 10:36:29 +02:00
Yinghai Lu
d2dbf34332 x86: clean up reserve_bootmem_generic() and port it to 32-bit
1. add reserve_bootmem_generic for 32bit
2. change len to unsigned long
3. make early_res_to_bootmem to use it

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 10:36:17 +02:00
Bernhard Walle
8b2ef1d728 x86: add flags parameter to reserve_bootmem_generic()
This patch adds a 'flags' parameter to reserve_bootmem_generic() like it
already has been added in reserve_bootmem() with commit
72a7fe3967.

It also changes all users to use BOOTMEM_DEFAULT, which doesn't effectively
change the behaviour. Since the change is x86-specific, I don't think it's
necessary to add a new API for migration. There are only 4 users of that
function.

The change is necessary for the next patch, using reserve_bootmem_generic()
for crashkernel reservation.

Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 10:34:54 +02:00
Ingo Molnar
896395c290 Merge branch 'linus' into tmp.x86.mpparse.new 2008-07-08 10:32:56 +02:00
Ingo Molnar
58cf35228f Merge branches 'x86/mmio', 'x86/delay', 'x86/idle', 'x86/oprofile', 'x86/debug', 'x86/ptrace' and 'x86/amd-iommu' into x86/devel 2008-07-08 09:46:15 +02:00
Ingo Molnar
6924d1ab8b Merge branches 'x86/numa-fixes', 'x86/apic', 'x86/apm', 'x86/bitops', 'x86/build', 'x86/cleanups', 'x86/cpa', 'x86/cpu', 'x86/defconfig', 'x86/gart', 'x86/i8259', 'x86/intel', 'x86/irqstats', 'x86/kconfig', 'x86/ldt', 'x86/mce', 'x86/memtest', 'x86/pat', 'x86/ptemask', 'x86/resumetrace', 'x86/threadinfo', 'x86/timers', 'x86/vdso' and 'x86/xen' into x86/devel 2008-07-08 09:16:56 +02:00
Jiri Slaby
684eb0163a x86_64: use PAGE_OFFSET in dump_pagetables
Use PAGE_OFFSET macro instead of using 0xffff810000000000UL directly.

Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: hpa@zytor.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-07-08 08:12:06 +02:00
Thomas Gleixner
6e92a5a615 x86: add sparse annotations to ioremap
arch/x86/mm/ioremap.c:308:11: error: incompatible types in comparison expression (different address spaces)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 08:12:05 +02:00
Thomas Gleixner
65280e613f x86: janitor CPA statistics patch
1) Remove __meminit from update_pages_count. It is used inside
split_pages()

2) Make the code depend on PROC_FS. Doing statistics for nothing is
useless and not adding useless code is nice to the Linux tiny folks.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 08:12:05 +02:00
Andi Kleen
ce0c0e50f9 x86, generic: CPA add statistics about state of direct mapping v4
Add information about the mapping state of the direct mapping to
/proc/meminfo. I chose /proc/meminfo because that is where all the other
memory statistics are too and it is a generally useful metric even
outside debugging situations. A lot of split kernel pages means the
kernel will run slower.

This way we can see how many large pages are really used for it and how
many are split.

Useful for general insight into the kernel.

v2: Add hotplug locking to 64bit to plug a very obscure theoretical race.
    32bit doesn't need it because it doesn't support hotadd for lowmem.
    Fix some typos
v3: Rename dpages_cnt
    Add CONFIG ifdef for count update as requested by tglx
    Expand description
v4: Fix stupid bugs added in v3
    Move update_page_count to pageattr.c

Signed-off-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-08 08:11:45 +02:00
Ingo Molnar
4d51c7587b Merge branch 'tracing/mmiotrace-mergefixups' into tracing/mmiotrace 2008-07-07 08:08:19 +02:00
Ingo Molnar
d763d5edf9 Merge branch 'linus' into tracing/mmiotrace 2008-07-07 08:07:35 +02:00
Gustavo Fernando Padovan
95c60b08c6 x86: remove unnecessary #ifdef CONFIG_X86_32...#else
Remove the #ifdef conditional because this comparison is already done in
user_mode_vm().

Signed-off-by: Gustavo F. Padovan <gustavo@las.ic.unicamp.br>
Cc: akpm@osdl.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-03 15:09:10 +02:00
Andrew Morton
27df66a406 arch/x86/mm/init_64.c: early_memtest(): fix types
fix this warning:

arch/x86/mm/init_64.c: In function 'early_memtest':
arch/x86/mm/init_64.c:524: warning: passing argument 2 of 'find_e820_area_size' from incompatible pointer type

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-03 10:15:51 +02:00
Vegard Nossum
f294a8ce21 x86: small unifications of address printing
'man 3 printf' tells me that %p should be printed as if by %#x, but
this is not true for the kernel, which does not use the '0x' prefix
for the %p conversion specifier.

A small cast to (void *) is also prettier than #ifdef/#else/#endif.

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-01 16:21:41 +02:00
Daniel J Blueman
0b1faeef5f x86: section/warning fixes
WARNING: arch/x86/mm/built-in.o(.text+0x3a1): Section mismatch in
reference from the function set_pte_phys() to the function
.init.text:spp_getpage()
The function set_pte_phys() references
the function __init spp_getpage().
This is often because set_pte_phys lacks a __init
annotation or the annotation of spp_getpage is wrong.

arch/x86/mm/init_64.c: In function 'early_memtest':
arch/x86/mm/init_64.c:520: warning: passing argument 2 of
'find_e820_area_size' from incompatible pointer type

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-26 15:33:09 +02:00
Jens Axboe
15c8b6c1aa on_each_cpu(): kill unused 'retry' parameter
It's not even passed on to smp_call_function() anymore, since that
was removed. So kill it.

Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2008-06-26 11:24:38 +02:00
Andreas Herrmann
33af9039cb x86: pat.c final cleanup of loop body in reserve_memtype
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-24 13:05:59 +02:00
Andreas Herrmann
64fe44c38b x86: pat.c introduce function to check for conflicts with existing memtypes
... to strip down loop body in reserve_memtype.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-24 13:05:57 +02:00
Andreas Herrmann
f6887264de x86: pat.c consolidate list_add handling in reserve_memtype
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-24 13:05:55 +02:00
Andreas Herrmann
3e9c83b309 x86: pat.c consolidate error/debug messages in reserve_memtype
... and move last debug message out of locked section.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-24 13:05:51 +02:00
Andreas Herrmann
69e26be9b1 x86: pat.c more trivial changes
- add BUG statement to catch invalid start and end parameters
 - No need to track the actual type in both req_type and actual_type --
   keep req_type unchanged.
 - removed (IMHO) superfluous comments

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-24 13:05:50 +02:00
Andreas Herrmann
ac97991ec9 x86: pat.c choose more crisp variable names
- parse/ml => entry (within list_for_each and friends)
 - new_entry => new
 - ret_type => new_type (to avoid confusion with req_type)

(... to make it more readable...)

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-24 13:05:49 +02:00
Andreas Herrmann
bcc643dc28 x86: introduce macro to check whether an address range is in the ISA range
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-24 13:05:48 +02:00
Ingo Molnar
a1d5a8691f x86: unify __set_fixmap, fix
fix build failure:

 arch/x86/mm/pgtable.c:280: warning: ‘enum fixed_addresses’ declared inside parameter list
 arch/x86/mm/pgtable.c:280: warning: its scope is only this definition or declaration, which is probably not what you want
 arch/x86/mm/pgtable.c:280: error: parameter 1 (‘idx’) has incomplete type

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 15:36:57 +02:00
Jeremy Fitzhardinge
aeaaa59c7e x86/paravirt/xen: add set_fixmap pv_mmu_ops
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 15:09:56 +02:00
Jeremy Fitzhardinge
d494a96125 x86: implement set_pte_vaddr
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 15:09:54 +02:00
Jeremy Fitzhardinge
7c7e6e07e2 x86: unify __set_fixmap
In both cases, I went with the 32-bit behaviour.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-20 15:09:51 +02:00
Andreas Herrmann
dd0c7c4903 x86: shrink pat_x_mtrr_type to its essentials
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh B Siddha <suresh.b.siddha@intel.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-19 12:57:40 +02:00
Hugh Dickins
6cf514fce1 x86: PAT: make pat_x_mtrr_type() more readable
Clean up over-complications in pat_x_mtrr_type().
And if reserve_memtype() ignores stray req_type bits when
pat_enabled, it's better to mask them off when not also.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-18 11:51:08 +02:00
Ingo Molnar
faeca31d06 Merge branch 'linus' into x86/pat 2008-06-16 11:20:28 +02:00
Ingo Molnar
064a32d82c Merge branch 'linus' into x86/memtest 2008-06-16 11:19:53 +02:00
Ingo Molnar
1791a78c0b Merge branch 'linus' into x86/cleanups 2008-06-16 11:17:50 +02:00
Ingo Molnar
cb9aa97c21 Merge branch 'linus' into tracing/mmiotrace-mergefixups 2008-06-16 11:16:46 +02:00
Linus Torvalds
cbfa66b88d Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest
  x86, lockdep: fix "WARNING: at kernel/lockdep.c:2658 check_flags+0x4c/0x128()"
  x86: fix an incompatible pointer type warning on 64-bit compilations
  x86: fix lockdep warning during suspend-to-ram
  x86: fix unused variable 'loops' warning in arch/x86/boot/a20.c
  Revert "x86: fix ioapic bug again"
  x86: fix asm warning in head_32.S
  x86: fix endless page faults in mount_block_root for Linux 2.6
  geode: fix modular build
2008-06-12 12:55:32 -07:00
Kevin Winchester
f8a45704f5 x86: fix pointer type warning in arch/x86/mm/init_64.c:early_memtest
Changed the call to find_e820_area_size to pass u64 instead of unsigned long.

Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-12 21:36:23 +02:00
David Howells
eb53e9f3ea x86: fix an incompatible pointer type warning on 64-bit compilations
Fix an incompatible pointer type warning on x86_64 compilations.
early_memtest() is passing a u64* to find_e820_area_size() which is expecting
an unsigned long.  Change t_start and t_size to unsigned long as those are
also 64-bit types on x88_64.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-12 21:27:15 +02:00
Henry Nestler
b29c701dea x86: fix endless page faults in mount_block_root for Linux 2.6
Page faults in kernel address space between PAGE_OFFSET up to
VMALLOC_START should not try to map as vmalloc.

Fix rarely endless page faults inside mount_block_root for root
filesystem at boot time.

All 32bit kernels up to 2.6.25 can fail into this hole.
I can not present this under native linux kernel. I see, that the 64bit
has fixed the problem. I copied the same lines into 32bit part.

Recorded debugs are from coLinux kernel 2.6.22.18 (virtualisation):
http://www.henrynestler.com/colinux/testing/pfn-check-0.7.3/20080410-antinx/bug16-recursive-page-fault-endless.txt
The physicaly memory was trimmed down to 192MB to better catch the bug.
More memory gets the bug more rarely.

Details, how every x86 32bit system can fail:

Start from "mount_block_root",
http://lxr.linux.no/linux/init/do_mounts.c#L297
There the variable "fs_names" got one memory page with 4096 bytes.
Variable "p" walks through the existing file system types. The first
string is no problem.
But, with the second loop in mount_block_root the offset of "p" is not
at beginning of page, the offset is for example +9, if "reiserfs" is the
first in list.
Than calls do_mount_root, and lands in sys_mount.
Remember: Variable "type_page" contains now "fs_type+9" and not contains
a full page.
The sys_mount copies 4096 bytes with function "exact_copy_from_user()":
http://lxr.linux.no/linux/fs/namespace.c#L1540

Mostly exist pages after the buffer "fs_names+4096+9" and the page fault
handler was not called. No problem.

In the case, if the page after "fs_names+4096" is not mapped, the page
fault handler was called from http://lxr.linux.no/linux/fs/namespace.c#L1320

The do_page_fault gots an address 0xc03b4000.
It's kernel address, address >= TASK_SIZE, but not from vmalloc! It's
from "__getname()" alias "kmem_cache_alloc".
The "error_code" is 0. "vmalloc_fault" will be call:
http://lxr.linux.no/linux/arch/i386/mm/fault.c#L332

"vmalloc_fault" tryed to find the physical page for a non existing
virtual memory area. The macro "pte_present" in vmalloc_fault()
got a next page fault for 0xc0000ed0 at:
http://lxr.linux.no/linux/arch/i386/mm/fault.c#L282

No PTE exist for such virtual address. The page fault handler was trying
to sync the physical page for the PTE lockup.

This called vmalloc_fault() again for address 0xc000000, and that also
was not existing. The endless began...

In normal case the cpu would still loop with disabled interrrupts. Under
coLinux this was catched by a stack overflow inside printk debugs.

Signed-off-by: Henry Nestler <henry.nestler@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-12 21:26:07 +02:00
Andreas Herrmann
499f8f84b8 x86: rename pat_wc_enabled to pat_enabled
BTW, what does pat_wc_enabled stand for? Does it mean
"write-combining"?

Currently it is used to globally switch on or off PAT support.
Thus I renamed it to pat_enabled.
I think this increases readability (and hope that I didn't miss
something).

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-12 10:14:27 +02:00
Andreas Herrmann
cd7a4e936d x86: PAT: fixed checkpatch errors (and whitespaces)
x86: PAT: fixed checkpatch errors (and whitespaces)

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-12 10:14:24 +02:00
Andreas Herrmann
97cfab6ac4 x86: PAT: fix ambiguous paranoia check in pat_init()
Starting with commit 8d4a430085 (x86:
cleanup PAT cpu validation) the PAT CPU feature flag is not cleared
anymore. Now the error message

  "PAT enabled, but CPU feature cleared"

in pat_init() is misleading.

Furthermore the current code does not check for existence of the PAT
CPU feature flag if a CPU is whitelisted in validate_pat_support.

This patch clears pat_wc_enabled if boot CPU has no PAT feature flag
and adapts the paranoia check.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-12 10:14:22 +02:00
Venki Pallipadi
c26421d019 x86: fix Xorg crash with xf86MapVidMem error
Clarify the usage of mtrr_lookup() in PAT code, and to make PAT code
resilient to mtrr lookup problems.

Specifically, pat_x_mtrr_type() is restructured to highlight, under what
conditions we look for mtrr hint. pat_x_mtrr_type() uses a default type
when there are any errors in mtrr lookup (still maintaining the pat
consistency). And, reserve_memtype() highlights its usage ot mtrr_lookup
for request type of '-1' and also defaults in a sane way on any mtrr
lookup failure.

pat.c looks at mtrr type of a range to get a hint on what mapping type
to request when user/API: (1) hasn't specified any type (/dev/mem
mapping) and we do not want to take performance hit by always mapping
UC_MINUS. This will be the case for /dev/mem mappings used to map BIOS
area or ACPI region which are WB'able. In this case, as long as MTRR is
not WB, PAT will request UC_MINUS for such mappings.

(2) user/API requests WB mapping while in reality MTRR may have UC or
WC. In this case, PAT can map as WB (without checking MTRR) and still
effective type will be UC or WC. But, a subsequent request to map same
region as UC or WC may fail, as the region will get trackked as WB in
PAT list. Looking at MTRR hint helps us to track based on effective type
rather than what user requested. Again, here mtrr_lookup is only used as
hint and we fallback to WB mapping (as requested by user) as default.

In both cases, after using the mtrr hint, we still go through the
memtype list to make sure there are no inconsistencies among multiple
users.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Tested-by: Rufus & Azrael <rufus-azrael@numericable.fr>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-12 10:03:55 +02:00
Fenghua Yu
39b8931b5c ACPI: handle invalid ACPI SLIT table
This is a SLIT sanity checking patch.  It moves slit_valid() function to
generic ACPI code and does sanity checking for both x86 and ia64.  It sets up
node_distance with LOCAL_DISTANCE and REMOTE_DISTANCE when hitting invalid
SLIT table on ia64.  It also cleans up unused variable localities in
acpi_parse_slit() on x86.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-06-11 19:13:46 -04:00
Yinghai Lu
9043f00796 x86, numa, 32-bit: use find_e820_area() to find KVA RAM on node
don't assume we can use RAM near the end of every node.
Esp systems that have few memory and they could have
kva address and kva RAM all below max_low_pfn.

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-10 11:31:52 +02:00
Yinghai Lu
cc1a9d86ce mm, x86: shrink_active_range() should check all
Now we are using register_e820_active_regions() instead of
add_active_range() directly. So end_pfn could be different between the
value in early_node_map to node_end_pfn.

So we need to make shrink_active_range() smarter.

shrink_active_range() is a generic MM function in mm/page_alloc.c but
it is only used on 32-bit x86. Should we move it back to some file in
arch/x86?

Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-06-10 11:31:44 +02:00
Huang, Ying
d0ec2c6f2c x86: reserve highmem pages via reserve_early
This patch makes early reserved highmem pages become reserved
pages. This can be used for highmem pages allocated by bootloader such
as EFI memory map, linked list of setup_data, etc.

Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: andi@firstfloor.org
Cc: mingo@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-06-05 15:10:02 +02:00