We now have cpu_init() and secondary_cpu_init() doing nothing but calling
_cpu_init() with the same arguments. Rename _cpu_init() to cpu_init() and use
it as a replacement for secondary_cpu_init().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Now that we are no longer dynamically allocating the GDT, we don't need the
"cpu_gdt_table" at all: we can switch straight from "boot_gdt_table" to the
per-cpu GDT. This means initializing the cpu_gdt array in C.
The boot CPU uses the per-cpu var directly, then in smp_prepare_cpus() it
switches to the per-cpu copy just allocated. For secondary CPUs, the
early_gdt_descr is set to point directly to their per-cpu copy.
For UP the code is very simple: it keeps using the "per-cpu" GDT as per SMP,
but we never have to move.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Allocating PDA and GDT at boot is a pain. Using simple per-cpu variables adds
happiness (although we need the GDT page-aligned for Xen, which we do in a
followup patch).
[akpm@linux-foundation.org: build fix]
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Because the command line is increased to 2048 characters after 2.6.21, it's
not possible for boot loaders and userspace tools to determine the length
of the command line the kernel can understand. The benefit of knowing the
length is that users can be warned if the command line size is too long
which prevents surprise if things don't work after bootup.
This patch updates the boot protocol to contain a field called
"cmdline_size" that contains the length of the command line (excluding the
terminating zero).
The patch also adds missing fields (of protocol version 2.05) to the x86_64
setup code.
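As an illustration of how a boot loader or userspace tool might use the new field, here is a minimal Python sketch. The function name check_cmdline and the limit values are hypothetical and not taken from any real boot loader; the only thing assumed from the text above is that "cmdline_size" excludes the terminating zero.

```python
def check_cmdline(cmdline: str, cmdline_size: int) -> bool:
    """Return True if cmdline fits within the kernel-reported limit.

    cmdline_size is the value read from the boot protocol header. It
    excludes the terminating zero, so a command line of exactly
    cmdline_size bytes still fits.
    """
    return len(cmdline.encode("ascii", "replace")) <= cmdline_size


# Hypothetical limits, chosen only to show both outcomes.
ok = check_cmdline("root=/dev/sda1 ro quiet", 2048)
too_long = check_cmdline("x" * 300, 255)
```

A tool would warn the user when the check returns False instead of silently passing a command line the kernel will truncate.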
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Alon Bar-Lev <alon.barlev@gmail.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The lguest patches somehow managed to trigger this:
In file included from arch/i386/lguest/lguest.c:38:
include/asm/asm-offsets.h:67:1: warning: "VDSO_PRELINK" redefined
In file included from include/linux/elf.h:7,
from include/linux/module.h:15,
from include/linux/device.h:21,
from include/linux/interrupt.h:15,
from arch/i386/lguest/lguest.c:27:
include/asm/elf.h:140:1: warning: this is the location of the previous definition
I assume that using the same identifier twice was a bad idea.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Remove the reporting of the constant_tsc flag from the "power management"
field in /proc/cpuinfo. The NULL value there was replaced by "" because
the former would result in a printout of [8] if the flag is set.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Extends the numa=fake x86_64 command-line option to split the remaining system
memory into nodes of fixed size. Any leftover memory is allocated to a final
node unless the command-line ends with a comma.
For example:
numa=fake=2*512,*128 gives two 512M nodes and the remaining system
memory is split into nodes of 128M each.
This is beneficial for systems where the exact size of RAM is unknown or not
necessarily relevant, but the size of the remaining nodes to be allocated is
known based on their capacity for resource management.
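The fixed-size fill rule described above can be modelled roughly as follows. This is a Python sketch, not the kernel's parser: the function name fill_fixed and the trailing_comma flag are hypothetical, sizes are in megabytes, and the page-level rounding the kernel performs is ignored.

```python
from typing import List


def fill_fixed(remaining_mb: int, node_mb: int,
               trailing_comma: bool = False) -> List[int]:
    """Split remaining_mb into nodes of node_mb each.

    A leftover smaller than node_mb becomes a final node, unless the
    option string ended with a comma (trailing_comma=True).
    """
    if node_mb <= 0:
        raise ValueError("node size must be positive")
    count, leftover = divmod(remaining_mb, node_mb)
    nodes = [node_mb] * count
    if leftover and not trailing_comma:
        nodes.append(leftover)
    return nodes
```

For instance, with a hypothetical 3136M left over after the two 512M nodes, fill_fixed(3136, 128) yields twenty-four 128M nodes plus a final 64M node; with trailing_comma=True the 64M remainder is not turned into a node.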
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Paul Jackson <pj@sgi.com>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Extends the numa=fake x86_64 command-line option to split the remaining
system memory into equal-sized nodes.
For example:
numa=fake=2*512,4* gives two 512M nodes and the remaining system
memory is split into four approximately equal
chunks.
This is beneficial for systems where the exact size of RAM is unknown or not
necessarily relevant, but the granularity with which nodes shall be allocated
is known.
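A rough Python model of the equal split follows. The name split_equal is hypothetical, sizes are in megabytes, and the way the remainder is spread over the first nodes is an assumption of the sketch; the kernel works on pages and its rounding may differ.

```python
from typing import List


def split_equal(remaining_mb: int, count: int) -> List[int]:
    """Split remaining_mb into count approximately equal chunks."""
    if count <= 0:
        raise ValueError("count must be positive")
    base, extra = divmod(remaining_mb, count)
    # The first `extra` chunks take one additional MB so nothing is lost.
    return [base + 1 if i < extra else base for i in range(count)]
```

With a hypothetical 3072M remaining, split_equal(3072, 4) gives four 768M nodes.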
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Paul Jackson <pj@sgi.com>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Extends the numa=fake x86_64 command-line option to allow for configurable
node sizes. These nodes can be used in conjunction with cpusets for coarse
memory resource management.
The old command-line option is still supported:
numa=fake=32 gives 32 fake NUMA nodes, ignoring the NUMA setup of the
actual machine.
But now you may configure your system for the node sizes of your choice:
numa=fake=2*512,1024,2*256
gives two 512M nodes, one 1024M node, two 256M nodes, and
the rest of system memory to a sixth node.
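The expansion of such an option string can be sketched in Python as below. This is only a model: the function name explicit_nodes is hypothetical, sizes are in megabytes, and the old numa=fake=32 form (a node count, not a size) is deliberately not handled.

```python
from typing import List


def explicit_nodes(spec: str, total_mb: int) -> List[int]:
    """Expand e.g. '2*512,1024,2*256' into node sizes in MB.

    Whatever memory is left after the listed nodes goes to one final node.
    """
    nodes: List[int] = []
    for item in spec.split(","):
        if "*" in item:
            count, size = item.split("*")
            nodes.extend([int(size)] * int(count))
        else:
            nodes.append(int(item))
    rest = total_mb - sum(nodes)
    if rest < 0:
        raise ValueError("requested nodes exceed total memory")
    if rest > 0:
        nodes.append(rest)
    return nodes
```

On a hypothetical 4096M machine, explicit_nodes("2*512,1024,2*256", 4096) returns [512, 512, 1024, 256, 256, 1536], i.e. the five requested nodes and a sixth holding the rest.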
The existing hash function is maintained to support the various node sizes
that are possible with this implementation.
Each node of the same size receives roughly the same amount of available
pages, regardless of any reserved memory with its address range. The total
available pages on the system is calculated and divided by the number of equal
nodes to allocate. These nodes are then dynamically allocated and their
borders extended until such time as their number of available pages reaches
the required size.
Configurable node sizes are recommended when used in conjunction with cpusets
for memory control because it eliminates the overhead associated with scanning
the zonelists of many smaller full nodes on page_alloc().
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Paul Jackson <pj@sgi.com>
Cc: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Fix comments to represent the true number of quadwords in GDT.
Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This patch makes the needlessly global vmi_pmd_clear() static.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Change mark_tsc_unstable() so it takes a string argument, which holds the
reason the TSC was marked unstable.
This is then displayed the first time mark_tsc_unstable is called.
This should help us better debug why the TSC was marked unstable on certain
systems and allow us to make sure we're not being overly paranoid when
throwing out this troublesome clocksource.
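The "report the reason only on the first call" behaviour can be modelled with a few lines of Python. The state dictionary and the message wording are assumptions made for the sketch; the kernel function prints to the kernel log instead.

```python
# Hypothetical stand-in for the kernel's global state and log buffer.
state = {"unstable": False, "log": []}


def mark_tsc_unstable(reason: str) -> None:
    """Mark the TSC unstable; record the reason only the first time."""
    if not state["unstable"]:
        state["unstable"] = True
        state["log"].append("Marking TSC unstable due to: " + reason)
```

Later callers still leave the TSC marked unstable, but only the first reason is logged, which is the one that matters when debugging.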
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Work around a warning with -Wmissing-prototypes in
arch/i386/kernel/asm-offsets.c
The warning isn't gcc's fault - asm-offsets.c is simply a special file.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Clean up an unneeded type cast by properly declaring the data type.
Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
o Modpost generates warnings for i386 if compiled with CONFIG_RELOCATABLE=y
WARNING: vmlinux - Section mismatch: reference to .init.text:find_unisys_acpi_oem_table from .text between 'acpi_madt_oem_check' (at offset 0xc0101eda) and 'enable_apic_mode'
WARNING: vmlinux - Section mismatch: reference to .init.text:acpi_get_table_header_early from .text between 'acpi_madt_oem_check' (at offset 0xc0101ef0) and 'enable_apic_mode'
WARNING: vmlinux - Section mismatch: reference to .init.text:parse_unisys_oem from .text between 'acpi_madt_oem_check' (at offset 0xc0101f2e) and 'enable_apic_mode'
WARNING: vmlinux - Section mismatch: reference to .init.text:setup_unisys from .text between 'acpi_madt_oem_check' (at offset 0xc0101f37) and 'enable_apic_mode'
WARNING: vmlinux - Section mismatch: reference to .init.text:parse_unisys_oem from .text between 'mps_oem_check' (at offset 0xc0101ec7) and 'acpi_madt_oem_check'
WARNING: vmlinux - Section mismatch: reference to .init.text:es7000_sw_apic from .text between 'enable_apic_mode' (at offset 0xc0101f48) and 'check_apicid_present'
o Some functions which are inline (acpi_madt_oem_check) are not inlined by
compiler as these functions are accessed using function pointer. These
functions are put in .text section and they in-turn access __init type
functions hence modpost generates warnings.
o Do not inline acpi_madt_oem_check, instead make it __init.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When compiling with -Os (which is default) the compiler defaults to it
anyways. And with -O2 it probably generates somewhat better (although
also larger) code.
Signed-off-by: Andi Kleen <ak@suse.de>
o This patch moves the code to verify long mode and SSE to a common file.
This code is now shared by trampoline.S, wakeup.S, boot/setup.S and
boot/compressed/head.S
o So far we used to do very limited checks in trampoline.S, wakeup.S and
in the 32bit entry point. Now all the entry paths are forced to do the
exhaustive check, including SSE, because verify_cpu is shared.
o I am keeping this patch as last in the x86 relocatable series because
previous patches have got quite some amount of testing done and I don't want
to disturb that. So if there is a problem introduced by this patch, at
least it can be easily isolated.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
o Extend the bzImage protocol (same as i386) to allow bzImage loaders to
load the protected mode kernel at a non-1MB address. Now the protected mode
component is relocatable and can be loaded at non-1MB addresses.
o As of today kdump uses it to run a second kernel from a reserved memory
area.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
o X86_64 kernel should run from 2MB aligned address for two reasons.
- Performance.
- For relocatable kernels, page tables are updated based on difference
between compile time address and load time physical address.
This difference should be multiple of 2MB as kernel text and data
is mapped using 2MB pages and PMD should be pointing to a 2MB
aligned address. Life is simpler if both compile time and load time
kernel addresses are 2MB aligned.
o Flag the error at compile time if one is trying to build a kernel which
does not meet alignment restrictions.
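The alignment rule above amounts to a simple arithmetic check, sketched here in Python. The function names and the example addresses are hypothetical; the real check is done at build time on the configured addresses.

```python
ALIGN_2MB = 2 * 1024 * 1024  # kernel text and data are mapped with 2MB pages


def is_2mb_aligned(addr: int) -> bool:
    return addr % ALIGN_2MB == 0


def relocation_ok(compile_phys: int, load_phys: int) -> bool:
    """True if both addresses are 2MB aligned.

    When both are aligned, their difference is automatically a
    multiple of 2MB, which is what the page table fixup relies on.
    """
    return is_2mb_aligned(compile_phys) and is_2mb_aligned(load_phys)
```

For example, relocation_ok(0x200000, 0x1000000) holds, while a load address of 0x1100000 (17MB) fails the check.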
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This patch modifies the x86_64 kernel so that it can be loaded and run
at any 2M aligned address, below 512G. The technique used is to
compile the decompressor with -fPIC and modify it so the decompressor
is fully relocatable. For the main kernel the page tables are
modified so the kernel remains at the same virtual address. In
addition a variable phys_base is kept that holds the physical address
the kernel is loaded at. __pa_symbol is modified to add that when
we take the address of a kernel symbol.
When loaded with a normal bootloader the decompressor will decompress
the kernel to 2M and it will run there. This both ensures the
relocation code is always working, and makes it easier to use 2M
pages for the kernel and the cpu.
AK: changed to not make RELOCATABLE default in Kconfig
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Currently __pa_symbol is for use with symbols in the kernel address
map and __pa is for use with pointers into the physical memory map.
But the code is implemented so you can usually interchange the two.
__pa which is much more common can be implemented much more cheaply
if it doesn't have to worry about any other kernel address
spaces. This is especially true with a relocatable kernel as
__pa_symbol needs to perform an extra variable read to resolve
the address.
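The difference in cost between the two translations can be seen in a small Python model. It is a sketch under assumptions: the constants are the x86_64 layout values of that era as far as I know them, phys_base is modelled as the displacement between the compile-time and load-time physical addresses, and the function names merely mirror the macros.

```python
START_KERNEL_MAP = 0xFFFFFFFF80000000  # base of the kernel text/data mapping
PAGE_OFFSET = 0xFFFF810000000000       # base of the linear physical mapping (assumed)


def pa(vaddr: int) -> int:
    """Model of __pa: a single subtraction, linear mapping only."""
    return vaddr - PAGE_OFFSET


def pa_symbol(vaddr: int, phys_base: int) -> int:
    """Model of __pa_symbol: also needs the phys_base variable."""
    return vaddr - START_KERNEL_MAP + phys_base
```

The extra phys_base operand is the "extra variable read" mentioned above: pa() needs only a constant, whereas pa_symbol() depends on where the kernel was actually loaded.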
There is a third macro that is added for the vsyscall data
__pa_vsymbol for finding the physical addresses of vsyscall pages.
Most of this patch is simply sorting through the references to
__pa or __pa_symbol and using the proper one. A little of
it is continuing to use a physical address when we have it
instead of recalculating it several times.
swapper_pgd is now NULL. leave_mm now uses init_mm.pgd
and init_mm.pgd is initialized at boot (instead of compile time)
to the physmem virtual mapping of init_level4_pgd. The
physical address changed.
Except for the EMPTY_ZERO page all of the remaining references
to __pa_symbol appear to be during kernel initialization. So this
should reduce the cost of __pa in the common case, even on a relocated
kernel.
As this is technically a semantic change we need to be on the lookout
for anything I missed. But it works for me (tm).
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
o __pa() should be used only on kernel linearly mapped virtual addresses
and not on kernel text and data addresses.
o Hibernation code needs to determine the physical address associated
with kernel symbol to mark a section boundary which contains pages which
don't have to be saved and restored during hibernate/resume operation.
o Move this piece of code into an arch dependent section, so that
architectures which don't have kernel text/data mapped into the kernel
linearly mapped region can come up with their own ways of determining
physical addresses associated with kernel text.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
With the rewrite of the SMP trampoline and the early page
allocator there is nothing that needs identity mapped pages,
once we start executing C code.
So add zap_identity_mappings into head64.c and remove
zap_low_mappings() from much later in the code. The functions
are subtly different thus the name change.
This also kills boot_level4_pgt which was from an earlier
attempt to move the identity mappings as early as possible,
and is now no longer needed. Essentially I have replaced
boot_level4_pgt with trampoline_level4_pgt in trampoline.S
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
o Moved wakeup_level4_pgt into the wakeup routine so we can
run the kernel above 4G.
o Now we first go to 64bit mode and continue to run from trampoline and
then start accessing kernel symbols and restore processor context.
This enables us to resume even in relocatable kernel context when
kernel might not be loaded at physical addr it has been compiled for.
o Removed the need for modifying any existing kernel page table.
o Increased the size of the wakeup routine to 8K. This is required as
wake page tables are on trampoline itself and they have to be at a 4K
boundary, hence one page is not sufficient.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
o Various cleanups. One of the main purposes of the cleanups is to make
wakeup.S as close as possible to trampoline.S.
o Following are the changes
- Indentations for comments.
- Changed the gdt table to compact form and to resemble the
one in trampoline.S
- Take the jump to 32bit from real mode using ljmpl. Makes code
more readable.
- After enabling long mode, directly take a long jump for 64bit
mode. No need to take an extra jump to "reach_comaptibility_mode"
- Stack is not used after real mode. So don't load stack in
32 bit mode.
- No need to enable PGE here.
- No need to do extra EFER read, anyway we trash the read contents.
- No need to enable system call (EFER_SCE). Anyway it will be
enabled when original EFER is restored.
- No need to set MP, ET, NE, WP, AM bits in cr0. Very soon we will
reload the original cr0 while restoring the processor state.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
o Use appropriate names for 64bit registers.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
o Get rid of dead code in wakeup.S
o We never restore from saved_gdt, saved_idt, saved_ltd, saved_tss, saved_cr3,
saved_cr4, saved_cr0, real_save_gdt, saved_efer, saved_efer2. Get rid
of associated code.
o Get rid of bogus_magic, bogus_31_magic and bogus_magic2. No longer being
used.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
This modifies the SMP trampoline and all of the associated code so
it can jump to a 64bit kernel loaded at an arbitrary address.
The dependencies on having an identity mapped page in the kernel
page tables for SMP bootup have all been removed.
In addition the trampoline has been modified to verify
that long mode is supported. Asking if long mode is implemented is
downright silly but we have traditionally had some of these checks,
and they can't hurt anything. So when the totally ludicrous happens
we just might handle it correctly.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
EFER varies like %cr4 depending on the cpu capabilities, and which cpu
capabilities we want to make use of. So save/restore it to make certain
we have the same EFER value when we are done.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Move __KERNEL32_CS up into the unused gdt entry. __KERNEL32_CS is
used when entering the kernel so putting it first is useful when
trying to keep boot gdt sizes to a minimum.
Set the accessed bit on all gdt entries. We don't care
so there is no need for the cpu to burn the extra cycles,
and it potentially allows the pages to be immutable. Plus
it is confusing when debugging and your gdt entries mysteriously
change.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Use virtual addresses instead of physical addresses
in copy bootdata. In addition fix the implementation
of the old bootloader convention. Everything is
at real_mode_data always. It is just that sometimes
real_mode_data was relocated by setup.S to not sit at
0x90000.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
- Merge physmem_pgt and ident_pgt, removing physmem_pgt. The merge
is broken as soon as mm/init.c:init_memory_mapping is run.
- As physmem_pgt is gone don't export it in pgtable.h.
- Use defines from pgtable.h for page permissions.
- Fix the physical memory identity mapping so it is at the correct
address.
- Remove the physical memory mapping from wakeup_level4_pgt; it
is at the wrong address so we can't possibly be using it.
- Simplify NEXT_PAGE. The work to calculate the phys_ alias
of the labels was very cool. Unfortunately it was a brittle
special purpose hack that makes maintenance more difficult.
Instead just use label - __START_KERNEL_map like we do
everywhere else in assembly.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Early in the boot process we need the ability to set
up temporary mappings, before our normal mechanisms are
initialized. Currently this is used to map pages that
are part of the page tables we are building and pages
during the dmi scan.
The core problem is that we are using the user portion of
the page tables to implement this. Which means that while
this mechanism is active we cannot catch NULL pointer dereferences
and we deviate from the normal ways of handling things.
In this patch I modify early_ioremap to map pages into
the kernel portion of address space, roughly where
we will later put modules, and I make the discovery of
which addresses we can use dynamic which removes all
kinds of static limits and remove the dependencies
on implementation details between different parts of the code.
Now alloc_low_page() and unmap_low_page() use
early_ioremap() and early_iounmap() to allocate/map and
unmap a page.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
The dma_ops structure can be const since it never changes
after boot.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
smp_call_function and smp_call_function_single are almost complete
duplicates of the same logic. This patch combines them by
implementing them in terms of the more general
smp_call_function_mask().
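The shape of that refactoring can be illustrated with a Python model in which a CPU mask is a set of integers. All signatures here are hypothetical simplifications; the real functions send IPIs and take different arguments.

```python
from typing import Callable, Iterable, List, Set


def smp_call_function_mask(mask: Set[int], func: Callable, info,
                           this_cpu: int) -> List:
    """Run func(cpu, info) for every CPU in mask except the caller."""
    return [func(cpu, info) for cpu in sorted(mask) if cpu != this_cpu]


def smp_call_function(func: Callable, info, online: Iterable[int],
                      this_cpu: int) -> List:
    # All online CPUs: just the general routine with the full mask.
    return smp_call_function_mask(set(online), func, info, this_cpu)


def smp_call_function_single(cpu: int, func: Callable, info,
                             online: Iterable[int], this_cpu: int) -> List:
    # One CPU: the general routine with a single-bit mask.
    return smp_call_function_mask({cpu} & set(online), func, info, this_cpu)
```

Both special cases reduce to choosing a mask, so the duplicated logic lives in one place.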
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Stephane Eranian <eranian@hpl.hp.com>
Cc: Andrew Morton <akpm@osdl.org>
Cc: Andi Kleen <ak@suse.de>
Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Cc: Ingo Molnar <mingo@elte.hu>
Hi!
I sent this simple patch to lkml about two weeks ago and also cc'ed
to Linus, but seems that the patch got ignored. I decided to write to
you, because you have modified the relevant file most recently.
Below is a copy of the mail that is also available at
<http://lkml.org/lkml/2007/2/28/230>.
Signed-off-by: Andi Kleen <ak@suse.de>
The reboot_fixups stuff seems to be a bit of a mess, specifically the
header is in linux/ when it's a purely i386-specific piece of code. I'm
not sure why it has its config option; it's only currently needed for
"geode-gx1/cs5530a", so perhaps whatever config option controls that
hardware should enable this?
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
The kernel only supports gcc 3.2+ now so it doesn't make sense
anymore to explicitly check for options this compiler version
already has.
This actually fixes a bug. The -mpreferred-stack-boundary check
never worked because gcc rightly complains
CC arch/i386/kernel/asm-offsets.s
cc1: -mpreferred-stack-boundary=2 is not between 4 and 12
We just never saw the error because of cc-options.
I changed it to 4 to actually work.
Tested by compiling i386 and x86-64 defconfig with gcc 3.2.
Should speed up the build time a tiny bit and improve
stack usage on i386 slightly.
Signed-off-by: Andi Kleen <ak@suse.de>
Change sysenter_setup to __cpuinit.
Change __INIT & __INITDATA to be cpu hotplug aware.
Resolve MODPOST warnings similar to:
WARNING: vmlinux - Section mismatch: reference to .init.text:sysenter_setup from
.text between 'identify_cpu' (at offset 0xc040a380) and 'detect_ht'
and
WARNING: vmlinux - Section mismatch: reference to .init.data:vsyscall_int80_end
from .text between 'sysenter_setup' (at offset 0xc041a269) and 'enable_sep_cpu'
WARNING: vmlinux - Section mismatch: reference to
.init.data:vsyscall_int80_start from .text between 'sysenter_setup' (at offset
0xc041a26e) and 'enable_sep_cpu'
WARNING: vmlinux - Section mismatch: reference to
.init.data:vsyscall_sysenter_end from .text between 'sysenter_setup' (at offset
0xc041a275) and 'enable_sep_cpu'
WARNING: vmlinux - Section mismatch: reference to
.init.data:vsyscall_sysenter_start from .text between 'sysenter_setup' (at
offset 0xc041a27a) and 'enable_sep_cpu'
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Add __init to probe_bigsmp. All callers are __init and data being examined
is __initdata.
Resolves MODPOST warning similar to:
WARNING: vmlinux - Section mismatch: reference to .init.data: from .text between 'probe_bigsmp' (at offset 0xc0401e56) and 'init_apic_ldr'
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Hello,
This patch against 2.6.20-git14 makes the NMI watchdog use PERFSEL1/PERFCTR1
instead of PERFSEL0/PERFCTR0 on processors supporting Intel architectural
perfmon, such as Intel Core 2. Although all PMU events can work on
both counters, the Precise Event-Based Sampling (PEBS) requires that the
event be in PERFCTR0 to work correctly (see section 18.14.4.1 in the
IA32 SDM Vol 3b). This version has 3 chunks, compared to the previous one,
where we had missed one check.
Changelog:
- make the x86-64 NMI watchdog use PERFSEL1/PERFCTR1 instead of PERFSEL0/PERFCTR0
on processors supporting the Intel architectural perfmon (e.g. Core 2 Duo).
This allows PEBS to work when the NMI watchdog is active.
Signed-off-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Hello,
This patch against 2.6.20-git14 makes the NMI watchdog use PERFSEL1/PERFCTR1
instead of PERFSEL0/PERFCTR0 on processors supporting Intel architectural
perfmon, such as Intel Core 2. Although all PMU events can work on
both counters, the Precise Event-Based Sampling (PEBS) requires that the
event be in PERFCTR0 to work correctly (see section 18.14.4.1 in the
IA32 SDM Vol 3b).
A similar patch for x86-64 is to follow.
Changelog:
- make the i386 NMI watchdog use PERFSEL1/PERFCTR1 instead of PERFSEL0/PERFCTR0
on processors supporting the Intel architectural perfmon (e.g. Core 2 Duo).
This allows PEBS to work when the NMI watchdog is active.
Signed-off-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
a userspace fault or a kernelspace fault which will result in the
immediate death of the process. They should not be filled in as a
result of a kernelspace fault which can be fixed up.
Otherwise, if the process is handling SIGSEGV and examining the fault
information, this can result in the kernel space fault trashing the
previously stored fault information if it arrives between the
userspace fault happening and the SIGSEGV being delivered to the process.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
--
arch/i386/kernel/traps.c | 24 ++++++++++++++++++------
arch/x86_64/kernel/traps.c | 30 +++++++++++++++++++++++-------
2 files changed, 41 insertions(+), 13 deletions(-)
Remove the assumption that if the first page of a legacy ROM is mapped,
it'll all be mapped. This'll also stop people reading this code from
wondering if they're looking at a bug...
Signed-off-by: Rene Herman <rene.herman@gmail.com>
Signed-off-by: Martin Murray <murrayma@citi.umich.edu>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The VIA C7 is a 686 (with TSC) that supports MMX, SSE and SSE2, it also has
a cache line length of 64 according to
http://www.digit-life.com/articles2/cpu/rmma-via-c7.html. This patch sets
gcc to -march=686 and selects the correct cache shift.
Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
arch/i386/kernel/timers was eliminated in 2.6.18 in favour of clocksources.
pit_latch_buggy was referenced only from timers/timer_tsc.c, which has been
removed, so nothing refers to it any more.
Up to 2.6.17 the MediaGX's TSC worked correctly. Since 2.6.18 the kernel
warns "TSC appears to be running slowly. Marking it as unstable". So mark
the TSC unstable on CS55x0.
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Whether a region is below 1Mb is determined by its start rather than
its end.
This hunk got erroneously dropped from a previous patch.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
No need to use -traditional for processing asm in i386/kernel/
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Synchronize i386's smp_send_stop() with x86-64's in only try-locking
the call lock to prevent deadlocks when called from panic().
In both versions, disable interrupts before clearing the CPU off the
online map to eliminate races with IRQ handlers inspecting this map.
Also in both versions, save/restore interrupts rather than disabling/
enabling them.
On x86-64, eliminate one function used here by folding it into its
single caller, convert to static, and rename for consistency with i386
(lkcd may like this).
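The try-lock idea can be shown with a user-space analogy in Python. This is only a model: call_lock stands in for the kernel's call lock, stop_other_cpus is a hypothetical callback, and interrupt handling is not represented at all.

```python
import threading

call_lock = threading.Lock()  # stand-in for the kernel's call lock


def smp_send_stop(stop_other_cpus) -> bool:
    """Stop the other CPUs without ever blocking on call_lock.

    If the lock is already held (for example because we panicked while
    holding it), carry on regardless instead of deadlocking.
    Returns whether the lock was obtained.
    """
    got_lock = call_lock.acquire(blocking=False)
    try:
        stop_other_cpus()
    finally:
        if got_lock:
            call_lock.release()
    return got_lock
```

The stop request is issued in both cases; the only difference is whether the lock has to be released afterwards.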
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Avoid including asm/vsyscall32.h in virtually every source file.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Move inclusion of asm/fixmap.h to where it is really used rather than
where it may have been used long ago (requires a few other adjustments
to includes due to previous implicit dependencies).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
- make the page table contents printing PAE capable
- make sure the address stored in current->thread.cr2 is unmodified
from what was read from CR2
- don't call oops_may_print() multiple times, when one time suffices
- print pte even in highpte case, as long as the pte page isn't
actually in high memory (which is specifically the case for all page
tables covering kernel space)
(Changes to v3: Use sizeof()*2 rather than the suggested sizeof()*4 for
printing width, use fixed 16-nibble width for PAE, and also apply the
max_low_pfn range check to the middle level lookup on PAE.)
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Remove clustered APIC mode. There's little point in the use of clustered APIC
mode, broadcasting is limited to within the cluster only, and chipsets have
bugs in this area as well. So default to physical APIC mode when the CPU
count is large, and default to logical APIC mode when the CPU count is 8 or
smaller.
(this patch only removes the use of genapic_cluster and cleans up the
resulting genapic.c file - removal of all remaining traces of clustered
mode will be done by another patch.)
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Fix a couple of inconsistencies/problems I found while reviewing the x86_64
genapic code (when I was chasing mysterious eth0 timeouts that would only
trigger if CPU_HOTPLUG is enabled):
- AMD systems defaulted to the slower flat-physical mode instead
of the flat-logical mode. The only restriction on AMD systems
is that they should not use clustered APIC mode.
- removed the CPU hotplug hacks, switching the default for small
systems back from phys-flat to logical-flat. The switching to logical
flat mode on small systems fixed sporadic ethernet driver timeouts I
was getting on a dual-core Athlon64 system:
NETDEV WATCHDOG: eth0: transmit timed out
eth0: Transmit timeout, status 0c 0005 c07f media 80.
eth0: Tx queue start entry 32 dirty entry 28.
eth0: Tx descriptor 0 is 0008a04a. (queue head)
eth0: Tx descriptor 1 is 0008a04a.
eth0: Tx descriptor 2 is 0008a04a.
eth0: Tx descriptor 3 is 0008a04a.
eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1
- The use of '<= 8' was a bug by itself (the valid APIC ids
for logical flat mode go from 0 to 7, not 0 to 8). The new logic
is to use logical flat mode on both AMD and Intel systems, and
to only switch to physical mode when logical mode cannot be used.
If CPU hotplug is racy wrt. APIC shutdown then CPU hotplug needs
fixing, rather than the whole IRQ system being made inconsistent and
slowed down.
- minor cleanups: simplified some code constructs
build & booted on a couple of AMD and Intel SMP systems.
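
The selection rule described above (logical flat mode only while every
APIC id is in the range 0 to 7) can be sketched as follows. This is a
simplified illustration, not the genapic code: the enum values and
choose_apic_mode() are hypothetical names, and the real code reads the
ids from the platform tables rather than from an array.

```c
#include <assert.h>

/* Hypothetical names for the two remaining modes. */
enum apic_mode { APIC_FLAT_LOGICAL, APIC_FLAT_PHYSICAL };

/* Valid APIC ids for logical flat mode go from 0 to 7. */
#define FLAT_LOGICAL_ID_LIMIT 8

/*
 * Prefer logical flat mode; fall back to physical mode only when
 * some APIC id cannot be used in logical flat mode.
 */
static enum apic_mode choose_apic_mode(const unsigned int *apic_ids, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		/* '<' rather than '<=': id 8 is already out of range */
		if (apic_ids[i] >= FLAT_LOGICAL_ID_LIMIT)
			return APIC_FLAT_PHYSICAL;
	}
	return APIC_FLAT_LOGICAL;
}
```

With ids {0, 1, 7} the sketch picks logical flat mode; a single id of 8
is enough to force physical mode, which is the case the old '<= 8' test
got wrong.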
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Li, Shaohua" <shaohua.li@intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Almost all users of pm_ops only support mem sleep, but they don't check
for it in .valid and don't reject other states in .prepare, so users can be
confused if they check /sys/power/state, especially when new states are
added (these would then result in s-t-r although they're supposed to be
something different).
This patch implements a generic pm_valid_only_mem function that is then
exported for users and puts it to use in almost all existing pm_ops.
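
A minimal userspace sketch of the idea, assuming a .valid callback that
takes the requested state and returns non-zero if the platform supports
it. The typedef, the enum values and struct pm_ops_sketch are stand-ins
for the kernel definitions, and board_pm_ops is a hypothetical user.

```c
#include <assert.h>

/* Stand-ins for the kernel's sleep state type and constants. */
typedef int suspend_state_t;
enum { PM_SUSPEND_ON, PM_SUSPEND_STANDBY, PM_SUSPEND_MEM };

/* Generic .valid helper: accept suspend-to-ram and nothing else. */
static int pm_valid_only_mem(suspend_state_t state)
{
	return state == PM_SUSPEND_MEM;
}

/* Reduced stand-in for struct pm_ops, keeping only the .valid hook. */
struct pm_ops_sketch {
	int (*valid)(suspend_state_t state);
};

/* Hypothetical platform that only supports "mem". */
static const struct pm_ops_sketch board_pm_ops = {
	.valid = pm_valid_only_mem,
};
```

A platform wired up this way reports only "mem" as valid, so a state
added later is rejected instead of silently turning into s-t-r.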
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: David Brownell <david-b@pacbell.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: linux-pm@lists.linux-foundation.org
Cc: Len Brown <lenb@kernel.org>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Greg KH <greg@kroah.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch series cleans up some misconceptions about pm_ops. Some users of
the pm_ops structure attempt to use it to stop the user from entering suspend
to disk. This, however, is not possible, since the user can always use
"shutdown" in /sys/power/disk and then the pm_ops are never invoked. Also,
platforms that don't support suspend to disk simply should not allow
configuring SOFTWARE_SUSPEND (read the help text on it, it only selects
suspend to disk and nothing else, all the other stuff depends on PM).
The pm_ops structure is actually intended to provide a way to enter
platform-defined sleep states (currently supported states are "standby" and
"mem" (suspend to ram)) and additionally (if SOFTWARE_SUSPEND is configured)
allows a platform to support a platform specific way to enter low-power mode
once everything has been saved to disk. This is currently only used by ACPI
(S4).
This patch:
The pm_ops.pm_disk_mode is used in totally bogus ways since nobody really
seems to understand what it actually does.
This patch clarifies the pm_disk_mode description.
It also removes all the arm and sh users that think they can veto suspend to
disk via pm_ops; not so since the user can always do echo shutdown >
/sys/power/disk, they need to find a better way involving Kconfig or such.
ACPI is the only user left with a non-zero pm_disk_mode.
The patch also sets the default mode to shutdown again, but when a new pm_ops
is registered its pm_disk_mode is selected as default, that way the default
stays for ACPI where it is apparently required.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: David Brownell <david-b@pacbell.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <linux-pm@lists.linux-foundation.org>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Greg KH <greg@kroah.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
[IPV4] SNMP: Support OutMcastPkts and OutBcastPkts
[IPV4] SNMP: Support InMcastPkts and InBcastPkts
[IPV4] SNMP: Support InTruncatedPkts
[IPV4] SNMP: Support InNoRoutes
[SNMP]: Add definitions for {In,Out}BcastPkts
[TCP] FRTO: RFC4138 allows Nagle override when new data must be sent
[TCP] FRTO: Delay skb available check until it's mandatory
[XFRM]: Restrict upper layer information by bundle.
[TCP]: Catch skb with S+L bugs earlier
[PATCH] INET : IPV4 UDP lookups converted to a 2 pass algo
[L2TP]: Add the ability to autoload a pppox protocol module.
[SKB]: Introduce skb_queue_walk_safe()
[AF_IUCV/IUCV]: smp_call_function deadlock
[IPV6]: Fix slab corruption running ip6sic
[TCP]: Update references in two old comments
[XFRM]: Export SPD info
[IPV6]: Track device renames in snmp6.
[SCTP]: Fix sctp_getsockopt_local_addrs_old() to use local storage.
[NET]: Remove NETIF_F_INTERNAL_STATS, default to internal stats.
[NETPOLL]: Remove CONFIG_NETPOLL_RX
...
Call of_find_node_by_type with NULL instead of np
so the cpu node does not get put twice.
This was causing kref_put warnings.
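
The double put can be reproduced with a small userspace mock. It is not
the kernel implementation: struct node, node_get(), node_put() and
find_node_by_type() are hypothetical stand-ins. It only mimics the
contract this fix relies on, namely that the lookup takes a reference on
the node it returns and drops the reference on the node it was given as
a starting point.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock device node with a plain integer reference count. */
struct node {
	const char *type;
	int refcount;
	struct node *next;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/*
 * Search after 'from' (or from 'head' when 'from' is NULL), take a
 * reference on the match and drop the reference on 'from'.
 */
static struct node *find_node_by_type(struct node *head, struct node *from,
				      const char *type)
{
	struct node *n = from ? from->next : head;

	while (n && strcmp(n->type, type) != 0)
		n = n->next;
	node_get(n);
	node_put(from);	/* this is the put the caller must not repeat */
	return n;
}
```

If a caller passes its cpu node as 'from' and later calls node_put() on
that same node, the count drops twice for a single get. Passing NULL
leaves the cpu node's reference for the caller's own put.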
Signed-off-by: John Rigby <jrigby@freescale.com>
Acked-by: Sylvain Munaut <tnt@246tNt.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Use DEFINE_SPINLOCK instead of initializing spinlocks to
SPIN_LOCK_UNLOCKED, since DEFINE_SPINLOCK is better for lockdep.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For dma_alloc_*()
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
arch/powerpc/sysdev/timer.c:51: error: variable `timer_sysclass' has
initializer but incomplete type
arch/powerpc/sysdev/timer.c:52: error: unknown field `resume' specified in initializer
<etc>
Signed-off-by: Srinivasa Ds <srinivasa@in.ibm.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Just another pass through arch/powerpc for old usages.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Recently, someone fixed a syntax error in the HTDMSOUND driver
introduced 4 years ago.
Unfortunately not by trying to compile this driver for his hardware but
by code inspection - which seems to be a strong indication that there
are no users left for this OSS sound driver.
This patch therefore removes it.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Dan Malek <dan@embeddedalley.com>
Acked-by: Marcelo Tosatti <marcelo@kvack.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
dt_xlate_reg() looks up the 'reg' property in the specified node
to get the address and size to translate. Add dt_xlate_addr()
which is passed in the address and size to translate.
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
A usage of CONFIG_DEVICE_TREE got accidentally truncated; this
fix allows out-of-tree dts files to work.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Why create a platform specific board_info structure that is hacked
together, ugly, and dangerous, when we've got a perfectly fine common
board_info structure that is hacked-together, ugly and dangerous.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The arch/ppc/syslib/ppc_sys.c infrastructure does not work well for the
virtex ports. Move the ml300 and ml403 board ports over to use the new
virtex_devices infrastructure.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently virtex support in mainline makes use of the infrastructure in
arch/ppc/syslib/ppc_sys.c for registering common devices on virtex ppc405
platforms. The ppc_sys.c code is not well suited to the dynamic nature of
FPGA designs and makes adding new board ports more complex. This patch
adds a new listing of common devices which does not depend on the ppc_sys.c
infrastructure.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The header files for the ml403 and ml300 are virtually identical, merge
them into a single file.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Reverse dependency order for Xilinx Virtex parts. For these parts, it
makes more sense for boards/chips to specify which features they
provide instead of the features listing the parts they are implemented
in. I think it also makes adding new board ports simpler.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Shuffle Kconfig order, making the platform drivers menu depend on the global
option instead of each driver being dependent on it.
Also fix dependency of PPC_PMAC on the G5 one.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This reverts commit 9414715a7b,
at Olaf Hering's request:
> Paul, please discard this patch. The optional graphics card may have
> also device_type 'serial' if it is in VGA mode.
> I will send an updated patch later.
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (107 commits)
smc911x: fix compilation breakage wjen debug is on
[netdrvr] eexpress: minor corrections
add NAPI support to sb1250-mac.c
ixgb: ROUND_UP macro cleanup in drivers/net/ixgb
e1000: ROUND_UP macro cleanup in drivers/net/e1000
Generic HDLC sparse annotations
e100: Optionally use I/O mode only to access register space
e100: allow bad MAC address when running with invalid eeprom csum
ehea: fix for dlpar support
ehea: fix for sysfs entries
3C509: Remove unnecessary include of <linux/pm_legacy.h>
NetXen: Fix for vmalloc issues
NetXen: Fixes for Power PC architecture
NetXen: Port swap feature for multi port cards
NetXen: Removal of redundant macros
NetXen: Multi PCI support for Quad cards
NetXen: Removal of redundant argument passing
NetXen: Use multiple PCI functions
[netdrvr e100] experiment with doing RX in a similar manner to eepro100
[PATCH] ieee80211: add missing global needed by IEEE80211_DEBUG_XXXX
...