1
Commit Graph

3763 Commits

Author SHA1 Message Date
Ingo Molnar
fe402e1f2b x86, apic: clean up / remove TARGET_CPUS
Impact: cleanup

use apic->target_cpus() directly instead of the TARGET_CPUS wrapper.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:17 +01:00
Ingo Molnar
9b5bc8dc12 x86, apic: remove IRQ_DEST_MODE / IRQ_DELIVERY_MODE
Remove the wrapper macros IRQ_DEST_MODE and IRQ_DELIVERY_MODE.

The typical 32-bit and the 64-bit build all dereference via the genapic,
so it's pointless to hide that indirection via these ugly macros.

Furthermore, it also obscures subarchitecture details.

So replace it with apic->irq_dest_mode / etc. accesses.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:13 +01:00
Ingo Molnar
f8987a1093 x86, genapic: rename int_delivery_mode, et. al.
int_delivery_mode is supposed to mean 'interrupt delivery mode', but
it's quite a misnomer as 'int' we usually think of as an integer type ...

The standard naming for such attributes is 'irq' - so rename the following
fields and macros:

 int_delivery_mode => irq_delivery_mode
 INT_DELIVERY_MODE => IRQ_DELIVERY_MODE
 int_dest_mode     => irq_dest_mode
 INT_DEST_MODE     => IRQ_DEST_MODE

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:13 +01:00
Ingo Molnar
7ed248daa5 x86: clean up apic->apic_id_registered() methods
Impact: cleanup

x86 subarchitectures each defined a "apic_id_registered()" method,
which could be an inline function depending on which subarch we build
for, and which was also the name of a genapic field.

Untangle this namespace spaghetti by giving each of the instances
a separate name.

Also remove wrapper macro obfuscation.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:12 +01:00
Ingo Molnar
306db03b0d x86: clean up apic->acpi_madt_oem_check methods
Impact: refactor code

x86 subarchitectures each defined a "acpi_madt_oem_check()" method,
which could be an inline function, or an extern, or a static function,
and which was also the name of a genapic field.

Untangle this namespace spaghetti by setting ->acpi_madt_oem_check()
to NULL on those subarchitectures that have no detection quirks,
and rename the other ones (summit, es7000) that do.

Also change default_acpi_madt_oem_check() to handle NULL entries,
and clean its control flow up as well.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:12 +01:00
Ingo Molnar
504a3c3ad4 x86: clean up apic_x2apic_cluster
- reorder fields so that they appear in struct genapic field ordering

- add zero-initialized fields too so that it's apparent which functionality
  is default / missing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:09 +01:00
Ingo Molnar
05c155c235 x86: clean up apic_x2apic_phys
- reorder fields so that they appear in struct genapic field ordering

- add zero-initialized fields too so that it's apparent which functionality
  is default / missing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:08 +01:00
Ingo Molnar
c796732991 x86: clean up apic_x2apic_uv_x
- reorder fields so that they appear in struct genapic field ordering

- add zero-initialized fields too so that it's apparent which functionality
  is default / missing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:08 +01:00
Ingo Molnar
4c3e51e05a x86: clean up genapic_phys_flat
- reorder fields so that they appear in struct genapic field ordering

- add zero-initialized fields too so that it's apparent which functionality
  is default / missing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:07 +01:00
Ingo Molnar
f2f05ee8b8 x86: clean up genapic_flat
- reorder fields so that they appear in struct genapic field ordering

- add zero-initialized fields too so that it's apparent which functionality
  is default / missing.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:07 +01:00
Ingo Molnar
c8d46cf06d x86: rename 'genapic' to 'apic'
Rename genapic-> to apic-> references because in a future chagne we'll
open-code all the indirect calls (instead of obscuring them via macros),
so we want this reference to be as short as possible.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-28 23:20:06 +01:00
Ingo Molnar
74b6eb6b93 Merge branches 'x86/asm', 'x86/cleanups', 'x86/cpudetect', 'x86/debug', 'x86/doc', 'x86/header-fixes', 'x86/mm', 'x86/paravirt', 'x86/pat', 'x86/setup-v2', 'x86/subarch', 'x86/uaccess' and 'x86/urgent' into x86/core 2009-01-28 23:13:53 +01:00
Peter Zijlstra
8f6d86dc41 x86: cpu_init(): remove ugly #ifdef construct around debug register clear
Impact: Cleanup

While I was looking through the new and improved bootstrap code - great
work that, thanks! I found the below a slight improvement.

Remove unnecessary ugly #ifdef construct around debug register clear.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-27 14:54:44 -08:00
Ingo Molnar
4369f1fb7c Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu
Conflicts:
	arch/x86/kernel/setup_percpu.c

Semantic conflict:

	arch/x86/kernel/cpu/common.c

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-27 12:03:24 +01:00
Ingo Molnar
3ddeb51d9c Merge branch 'linus' into core/percpu
Conflicts:
	arch/x86/kernel/setup_percpu.c
2009-01-27 12:01:51 +01:00
Tejun Heo
cf3997f507 x86: clean up indentation in setup_per_cpu_areas()
Impact: cosmetic cleanup

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 14:25:05 +09:00
James Bottomley
22f25138c3 x86: fix build breakage on voyage
Impact: build fix

x86_cpu_to_apicid and x86_bios_cpu_apicid aren't defined for voyage.
Earlier patch forgot to conditionalize early percpu clearing.  Fix it.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 14:21:37 +09:00
Brian Gerst
2697fbd5fa x86: load new GDT after setting up boot cpu per-cpu area
Impact: sync 32 and 64-bit code

Merge load_gs_base() into switch_to_new_gdt().  Load the GDT and
per-cpu state for the boot cpu when its new area is set up.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
b2d2f4312b x86: initialize per-cpu GDT segment in per-cpu setup
Impact: cleanup

Rename init_gdt() to setup_percpu_segment(), and move it to
setup_percpu.c.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
89c9c4c58e x86: make Voyager use x86 per-cpu setup.
Impact: standardize all x86 platforms on same setup code

With the preceding changes, Voyager can use the same per-cpu setup
code as all the other x86 platforms.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
34019be1cd x86: don't assume boot cpu is #0
Impact: minor cleanup

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
1688401a0f x86: move this_cpu_offset
Impact: Small cleanup

Define BOOT_PERCPU_OFFSET and use it for this_cpu_offset and
__per_cpu_offset initializers.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:48 +09:00
Brian Gerst
996db817e3 x86: only compile setup_percpu.o on SMP
Impact: Minor build optimization

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Brian Gerst
ec70de8b04 x86: move apic variables to apic.c
Impact: Code movement

Move the variable definitions to apic.c.  Ifdef the copying of
the two early per-cpu variables, since Voyager doesn't use them.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Brian Gerst
74631a248d x86: always page-align per-cpu area start and size
Impact: cleanup

The way the code is written, align is always PAGE_SIZE.  Simplify
the code by removing the align variable.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Brian Gerst
2f2f52bad7 x86: move setup_cpu_local_masks()
Impact: Code movement, no functional change.

Move setup_cpu_local_masks() to kernel/cpu/common.c, where the
masks are defined.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Brian Gerst
6470aff619 x86: move 64-bit NUMA code
Impact: Code movement, no functional change.

Move the 64-bit NUMA code from setup_percpu.c to numa_64.c

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Brian Gerst
0d77e7f04d x86: merge setup_per_cpu_maps() into setup_per_cpu_areas()
Impact: minor optimization

Eliminates the need for two loops over possible cpus.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-27 12:56:47 +09:00
Linus Torvalds
3386c05bdb Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debugobjects: add and use INIT_WORK_ON_STACK
  rcu: remove duplicate CONFIG_RCU_CPU_STALL_DETECTOR
  relay: fix lock imbalance in relay_late_setup_files
  oprofile: fix uninitialized use of struct op_entry
  rcu: move Kconfig menu
  softlock: fix false panic which can occur if softlockup_thresh is reduced
  rcu: add __cpuinit to rcu_init_percpu_data()
2009-01-26 09:47:56 -08:00
Linus Torvalds
1e70c7f7a9 Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  hrtimers: fix inconsistent lock state on resume in hres_timers_resume
  time-sched.c: tick_nohz_update_jiffies should be static
  locking, hpet: annotate false positive warning
  kernel/fork.c: unused variable 'ret'
  itimers: remove the per-cpu-ish-ness
2009-01-26 09:47:43 -08:00
Linus Torvalds
810ee58de2 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (29 commits)
  xen: unitialised return value in xenbus_write_transaction
  x86: fix section mismatch warning
  x86: unmask CPUID levels on Intel CPUs, fix
  x86: work around PAGE_KERNEL_WC not getting WC in iomap_atomic_prot_pfn.
  x86: use standard PIT frequency
  xen: handle highmem pages correctly when shrinking a domain
  x86, mm: fix pte_free()
  xen: actually release memory when shrinking domain
  x86: unmask CPUID levels on Intel CPUs
  x86: add MSR_IA32_MISC_ENABLE bits to <asm/msr-index.h>
  x86: fix PTE corruption issue while mapping RAM using /dev/mem
  x86: mtrr fix debug boot parameter
  x86: fix page attribute corruption with cpa()
  Revert "x86: signal: change type of paramter for sys_rt_sigreturn()"
  x86: use early clobbers in usercopy*.c
  x86: remove kernel_physical_mapping_init() from init section
  fix: crash: IP: __bitmap_intersects+0x48/0x73
  cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
  work_on_cpu: Use our own workqueue.
  work_on_cpu: don't try to get_online_cpus() in work_on_cpu.
  ...
2009-01-26 09:47:28 -08:00
H. Peter Anvin
30a0fb947a x86: correct the CPUID pattern for MSR_IA32_MISC_ENABLE availability
Impact: re-enable CPUID unmasking on affected processors

As far as I am capable of discerning from the documentation,
MSR_IA32_MISC_ENABLE should be available for all family 0xf CPUs, as
well as family 6 for model >= 0xd (newer Pentium M).

The documentation on this isn't ideal, so we need to be on the lookout
for errors, still.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-01-26 09:40:58 -08:00
Rakib Mullick
659d2618b3 x86: fix section mismatch warning
Here function vmi_activate calls a init function activate_vmi , which
causes the following section mismatch warnings:

  LD      arch/x86/kernel/built-in.o
WARNING: arch/x86/kernel/built-in.o(.text+0x13ba9): Section mismatch
in reference from the function vmi_activate() to the function
.init.text:vmi_time_init()
The function vmi_activate() references
the function __init vmi_time_init().
This is often because vmi_activate lacks a __init
annotation or the annotation of vmi_time_init is wrong.

WARNING: arch/x86/kernel/built-in.o(.text+0x13bd1): Section mismatch
in reference from the function vmi_activate() to the function
.devinit.text:vmi_time_bsp_init()
The function vmi_activate() references
the function __devinit vmi_time_bsp_init().
This is often because vmi_activate lacks a __devinit
annotation or the annotation of vmi_time_bsp_init is wrong.

WARNING: arch/x86/kernel/built-in.o(.text+0x13bdb): Section mismatch
in reference from the function vmi_activate() to the function
.devinit.text:vmi_time_ap_init()
The function vmi_activate() references
the function __devinit vmi_time_ap_init().
This is often because vmi_activate lacks a __devinit
annotation or the annotation of vmi_time_ap_init is wrong.

Fix it by marking vmi_activate() as __init too.

Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26 14:27:18 +01:00
Ingo Molnar
d5e397cb49 x86: improve early fault/irq printout
Impact: add a stack dump to early IRQs/faults

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26 14:22:00 +01:00
Ingo Molnar
34707bcd04 x86, debug: remove early_printk() #ifdefs from head_32.S
Impact: cleanup

Remove such constructs:

 #ifdef CONFIG_EARLY_PRINTK
        call early_printk
 #else
        call printk
 #endif

Not only are they ugly, they are also pointless: a call to printk()
maps to early_printk during early bootup anyway, if CONFIG_EARLY_PRINTK
is enabled.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26 14:18:43 +01:00
Ingo Molnar
99fb4d349d x86: unmask CPUID levels on Intel CPUs, fix
Impact: fix boot hang on pre-model-15 Intel CPUs

rdmsrl_safe() does not work in very early bootup code yet, because we
dont have the pagefault handler installed yet so exception section
does not get parsed. rdmsr_safe() will just crash and hang the bootup.

So limit the MSR_IA32_MISC_ENABLE MSR read to those CPU types that
support it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-26 12:36:24 +01:00
H. Peter Anvin
b38b066590 x86: filter CPU features dependent on unavailable CPUID levels
Impact: Fixes potential crashes on misconfigured systems.

Some CPU features require specific CPUID levels to be available in
order to function, as they contain information about the operation of
a specific feature.  However, some BIOSes and virtualization software
provide the ability to mask CPUID levels in order to support legacy
operating systems.  We try to enable such CPUID levels when we know
how to do it, but for the remaining cases, filter out such CPU
features when there is no way for us to support them.

Do this in one place, in the CPUID code, with a table-driven approach.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-23 18:08:05 -08:00
H. Peter Anvin
75a048119e x86: handle PAT more like other CPU features
Impact: Cleanup

When PAT was originally introduced, it was handled specially for a few
reasons:

- PAT bugs are hard to track down, so we wanted to maintain a
  whitelist of CPUs.
- The i386 and x86-64 CPUID code was not yet unified.

Both of these are now obsolete, so handle PAT like any other features,
including ordinary feature blacklisting due to known bugs.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-23 18:07:45 -08:00
Hiroshi Shimamoto
98e3d45eda x86: signal: use {get|put}_user_try and catch
Impact: use new framework

Use {get|put}_user_try, catch, and _ex in arch/x86/kernel/signal.c.

Note: this patch contains "WARNING: line over 80 characters", because when
introducing new block I insert an indent to avoid mistakes by edit.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-23 17:17:38 -08:00
Brian Gerst
3819cd489e x86: remove include of apic.h from hardirq_64.h
Impact: cleanup

APIC definitions aren't needed here.  Remove the include and fix
up the fallout.

tj: added include to mce_intel_64.c.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-23 11:03:29 +09:00
Brian Gerst
03d2989df9 x86: remove idle_timestamp from 32bit irq_cpustat_t
Impact: bogus irq_cpustat field removed

idle_timestamp is left over from the removed irqbalance code.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-23 11:03:28 +09:00
Jeremy Fitzhardinge
ab897d2013 x86/pvops: remove pte_flags pvop
pte_flags() was introduced as a new pvop in order to extract just the
flags portion of a pte, which is a potentially cheaper operation than
extracting the page number as well.  It turns out this operation is
not needed, because simply using a mask to extract the flags from a
pte is sufficient for all current users.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-22 23:35:20 +01:00
Thomas Gleixner
336f6c322d debugobjects: add and use INIT_WORK_ON_STACK
Impact: Fix debugobjects warning

debugobject enabled kernels spit out a warning in hpet code due to a
workqueue which is initialized on stack.

Add INIT_WORK_ON_STACK() which calls init_timer_on_stack() and use it
in hpet.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-01-22 10:02:07 +01:00
H. Peter Anvin
066941bd4e x86: unmask CPUID levels on Intel CPUs
Impact: Fixes crashes with misconfigured BIOSes on XSAVE hardware

Avuton Olrich reported early boot crashes with v2.6.28 and
bisected it down to dc1e35c6e9
("x86, xsave: enable xsave/xrstor on cpus with xsave support").

If the CPUID limit bit in MSR_IA32_MISC_ENABLE is set, clear it to
make all CPUID information available.  This is required for some
features to work, in particular XSAVE.

Reported-and-bisected-by: Avuton Olrich <avuton@gmail.com>
Tested-by: Avuton Olrich <avuton@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2009-01-22 09:24:02 +01:00
Nick Piggin
03b486322e x86: make UV support configurable
Make X86 SGI Ultraviolet support configurable. Saves about 13K of text size
on my modest config.

   text    data     bss     dec     hex filename
6770537 1158680  694356 8623573  8395d5 vmlinux
6757492 1157664  694228 8609384  835e68 vmlinux.nouv

Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 13:00:42 +01:00
Thomas Renninger
731f1872f4 x86: mtrr fix debug boot parameter
while looking at:

  http://bugzilla.kernel.org/show_bug.cgi?id=11541

I realized that the mtrr.show param cannot work, because
the code is processed much too early.

This patch:
 - Declares mtrr.show as early_param
 - Stays consistent with the previous param (which I doubt
   that it ever worked), so mtrr.show=1 would still work
 - Declares mtrr_show as initdata

Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 12:26:42 +01:00
Ingo Molnar
198030782c Merge branch 'x86/mm' into core/percpu
Conflicts:
	arch/x86/mm/fault.c
2009-01-21 10:39:51 +01:00
Ingo Molnar
55f4949f57 x86, mm: move tlb.c to arch/x86/mm/
Impact: cleanup

Now that it's unified, move the (SMP) TLB flushing code from arch/x86/kernel/
to arch/x86/mm/, where it belongs logically.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 10:16:19 +01:00
Ingo Molnar
3eb3963fd1 Merge branch 'cpus4096' into core/percpu
Conflicts:
	arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c
	arch/x86/kernel/tlb_32.c

Merge it here because both the cpumask changes and the ongoing percpu
work is touching the TLB code. The percpu changes take precedence, as
they eliminate tlb_32.c altogether.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 10:14:17 +01:00
Ingo Molnar
552b8aa4d1 Revert "x86: signal: change type of paramter for sys_rt_sigreturn()"
This reverts commit 4217458daf.

Justin Madru bisected this commit, it was causing weird Firefox
crashes.

The reason is that GCC mis-optimizes (re-uses) the on-stack parameters of
the calling frame, which corrupts the syscall return pt_regs state and
thus corrupts user-space register state.

So we go back to the slightly less clean but more optimization-safe
method of getting to pt_regs. Also add a comment to explain this.

Resolves: http://bugzilla.kernel.org/show_bug.cgi?id=12505

Reported-and-bisected-by: Justin Madru <jdm64@gawab.com>
Tested-by: Justin Madru <jdm64@gawab.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-21 09:43:18 +01:00
Tejun Heo
16c2d3f895 x86: rename tlb_64.c to tlb.c
Impact: file rename

tlb_64.c is now the tlb code for both 32 and 64.  Rename it to tlb.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Tejun Heo
02cf94c370 x86: make x86_32 use tlb_64.c
Impact: less contention when issuing invalidate IPI, cleanup

Make x86_32 use the same tlb code as 64bit.  The 64bit code uses
multiple IPI vectors for tlb shootdown to reduce contention.  This
patch makes x86_32 allocate the same 8 IPIs as x86_64 and share the
code paths.

Note that the usage of asmlinkage is inconsistent for x86_32 and 64
and calls for further cleanup.  This has been noted with a FIXME
comment in tlb_64.c.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Tejun Heo
6dd01bedee x86: prepare for tlb merge
Impact: clean up, ipi vector number reordering for x86_32

Make the following changes to prepare for tlb merge.

* reorder x86_32 ip vectors

* adjust tlb_32.c and tlb_64.c such that their logics coincide exactly
	- on spurious invalidate ipi, tlb_32 acks the irq
	- tlb_64 now has proper memory barriers around clearing
          flush_cpumask (no change in generated code)

* unexport flush_tlb_page from tlb_32.c, there's no user

* use unsigned int for cpu id

* drop unnecessary includes from tlb_64.c

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Tejun Heo
bdbcdd4888 x86: uv cleanup
Impact: cleanup

Make the following uv related cleanups.

* collect visible uv related definitions and interfaces into uv/uv.h
  and use it.  this cleans up the messy situation where on 64bit, uv
  is defined properly, on 32bit generic it's dummy and on the rest
  undefined.  after this clean up, uv is defined on 64 and dummy on
  32.

* update uv_flush_tlb_others() such that it takes cpumask of
  to-be-flushed cpus as argument, instead of that minus self, and
  returns yet-to-be-flushed cpumask, instead of modifying the passed
  in parameter.  this interface change will ease dummy implementation
  of uv_flush_tlb_others() and makes uv tlb flush related stuff
  defined in tlb_uv proper.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Brian Gerst
d650a51485 x86: merge irq_regs.h
Impact: cleanup, better irq_regs code generation for x86_64

Make 64-bit use the same optimizations as 32-bit.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:06 +09:00
Brian Gerst
0dd76d736e x86: set %fs to __KERNEL_PERCPU unconditionally for x86_32
Impact: cleanup

%fs is currently set to __KERNEL_DS at boot, and conditionally
switched to __KERNEL_PERCPU for secondary cpus.  Instead, initialize
GDT_ENTRY_PERCPU to the same attributes as GDT_ENTRY_KERNEL_DS and
set %fs to __KERNEL_PERCPU unconditionally.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:05 +09:00
Brian Gerst
06deef892c x86: clean up gdt_page definition
Impact: cleanup && more compact percpu area layout with future changes

Move 64-bit GDT to page-aligned section and clean up comment
formatting.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-21 17:26:05 +09:00
Jiri Kosina
afb33f8c0d x86: remove byte locks
Impact: cleanup

Remove byte locks implementation, which was introduced by Jeremy in
8efcbab6 ("paravirt: introduce a "lock-byte" spinlock implementation"),
but turned out to be dead code that is not used by any in-kernel
virtualization guest (Xen uses its own variant of spinlocks implementation
and KVM is not planning to move to byte locks).

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20 17:14:28 +01:00
Ingo Molnar
0ce1c38368 Merge commit 'v2.6.29-rc2' into x86/mm 2009-01-20 09:23:28 +01:00
Ingo Molnar
5766b842b2 x86, cpumask: fix tlb flush race
Impact: fix bootup crash

The cpumask is now passed in as a reference to mm->cpu_vm_mask, not on
the stack - hence it is not constant anymore during the TLB flush.

That way it could race and some static sanity checks would trigger:

[  238.154287] ------------[ cut here ]------------
[  238.156039] kernel BUG at arch/x86/kernel/tlb_32.c:130!
[  238.156039] invalid opcode: 0000 [#1] SMP
[  238.156039] last sysfs file: /sys/class/net/eth2/address
[  238.156039] Modules linked in:
[  238.156039]
[  238.156039] Pid: 6493, comm: ifup-eth Not tainted (2.6.29-rc2-tip #1) P4DC6
[  238.156039] EIP: 0060:[<c0118f87>] EFLAGS: 00010202 CPU: 2
[  238.156039] EIP is at native_flush_tlb_others+0x35/0x158
[  238.156039] EAX: c0ef972c EBX: f6143301 ECX: 00000000 EDX: 00000000
[  238.156039] ESI: f61433a8 EDI: f6143200 EBP: f34f3e00 ESP: f34f3df0
[  238.156039]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[  238.156039] Process ifup-eth (pid: 6493, ti=f34f2000 task=f399ab00 task.ti=f34f2000)
[  238.156039] Stack:
[  238.156039]  ffffffff f61433a8 ffffffff f6143200 f34f3e18 c0118e9c 00000000 f6143200
[  238.156039]  f61433a8 f5bec738 f34f3e28 c0119435 c2b5b830 f6143200 f34f3e34 c01c2dc3
[  238.156039]  bffd9000 f34f3e60 c01c3051 00000000 ffffffff f34f3e4c 00000000 00000071
[  238.156039] Call Trace:
[  238.156039]  [<c0118e9c>] ? flush_tlb_others+0x52/0x5b
[  238.156039]  [<c0119435>] ? flush_tlb_mm+0x7f/0x8b
[  238.156039]  [<c01c2dc3>] ? tlb_finish_mmu+0x2d/0x55
[  238.156039]  [<c01c3051>] ? exit_mmap+0x124/0x170
[  238.156039]  [<c013e965>] ? mmput+0x40/0xf5
[  238.156039]  [<c01e4788>] ? flush_old_exec+0x640/0x94b
[  238.156039]  [<c01ddb4e>] ? fsnotify_access+0x37/0x39
[  238.156039]  [<c01e3435>] ? kernel_read+0x39/0x4b
[  238.156039]  [<c021bc8a>] ? load_elf_binary+0x4a1/0x11bb
[  238.156039]  [<c01c0af9>] ? might_fault+0x51/0x9c
[  238.156039]  [<c010a2cc>] ? paravirt_read_tsc+0x20/0x4f
[  238.156039]  [<c010a406>] ? native_sched_clock+0x5d/0x60
[  238.156039]  [<c01e2fda>] ? search_binary_handler+0xab/0x2c4
[  238.156039]  [<c021b7e9>] ? load_elf_binary+0x0/0x11bb
[  238.156039]  [<c04ae9a5>] ? _raw_read_unlock+0x21/0x46
[  238.156039]  [<c021b7e9>] ? load_elf_binary+0x0/0x11bb
[  238.156039]  [<c01e2fe1>] ? search_binary_handler+0xb2/0x2c4
[  238.156039]  [<c01e4076>] ? do_execve+0x21c/0x2ee
[  238.156039]  [<c01029b7>] ? sys_execve+0x51/0x8c
[  238.156039]  [<c0103eaf>] ? sysenter_do_call+0x12/0x43

Fix it by not assuming that the cpumask is constant.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20 09:13:15 +01:00
Ingo Molnar
8f5d36ed5b Merge branch 'tj-percpu' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into core/percpu 2009-01-20 08:23:45 +01:00
Brian Gerst
0d974d4592 x86: remove pda.h
Impact: cleanup

Signed-off-by: Brian Gerst <brgerst@gmail.com>
2009-01-20 12:29:20 +09:00
Brian Gerst
947e76cdc3 x86: move stack_canary into irq_stack
Impact: x86_64 percpu area layout change, irq_stack now at the beginning

Now that the PDA is empty except for the stack canary, it can be removed.
The irqstack is moved to the start of the per-cpu section.  If the stack
protector is enabled, the canary overlaps the bottom 48 bytes of the irqstack.

tj: * updated subject
    * dropped asm relocation of irq_stack_ptr
    * updated comments a bit
    * rebased on top of stack canary changes

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20 12:29:20 +09:00
Brian Gerst
8c7e58e690 x86: rework __per_cpu_load adjustments
Impact: cleanup

Use cpu_number to determine if the adjustment is necessary.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20 12:29:20 +09:00
Brian Gerst
8ce031972b x86: remove pda_init()
Impact: cleanup

Copy the code to cpu_init() to satisfy the requirement that the cpu
be reinitialized.  Remove all other calls, since the segments are
already initialized in head_64.S.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20 12:29:19 +09:00
Tejun Heo
c6e50f93db x86: cleanup stack protector
Impact: cleanup

Make the following cleanups.

* remove duplicate comment from boot_init_stack_canary() which fits
  better in the other place - cpu_idle().

* move stack_canary offset check from __switch_to() to
  boot_init_stack_canary().

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-20 12:29:19 +09:00
Ingo Molnar
bfa318ad52 fix: crash: IP: __bitmap_intersects+0x48/0x73
-tip testing found this crash:

> [   35.258515] calling  acpi_cpufreq_init+0x0/0x127 @ 1
> [   35.264127] BUG: unable to handle kernel NULL pointer dereference at (null)
> [   35.267554] IP: [<ffffffff80478092>] __bitmap_intersects+0x48/0x73
> [   35.267554] PGD 0
> [   35.267554] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC

arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c is still broken: there's no
allocation of the variable mask, so we pass in an uninitialized cmd.mask
field to drv_read(), which then passes it to the scheduler which then
crashes ...

Switch it over to the much simpler constant-cpumask-pointers approach.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-20 00:17:01 +01:00
Mike Travis
7285908185 cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
Impact: use new work_on_cpu function to reduce stack usage

Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
a work_on_cpu function for drv_read() and drv_write().

Basically converts do_drv_{read,write} into "work_on_cpu" functions that
are now called by drv_read and drv_write.

Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now
that the work_on_cpu() function is more stable.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Dieter Ries <clip2@gmx.de>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <cpufreq@vger.kernel.org>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-19 22:36:13 +01:00
Ingo Molnar
5cdc5e9e69 x86: fully honor "nolapic", fix
Impact: build fix

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-19 20:49:37 +01:00
Michael Ellerman
422e79a8b3 x86: Remove never-called arch_setup_msi_irq()
Since commit 75c46fa, "x64, x2apic/intr-remap: MSI and MSI-X
support for interrupt remapping infrastructure", x86 has had an
implementation of arch_setup_msi_irqs().

That implementation does not call arch_setup_msi_irq(), instead it calls
setup_irq(). No other x86 code calls arch_setup_msi_irq().

That leaves only arch_setup_msi_irqs() in drivers/pci/msi.c, but that
routine is overridden by the x86 version of arch_setup_msi_irqs().

So arch_setup_msi_irq() is dead code, remove it.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-01-18 22:16:46 -08:00
Leonardo Potenza
c7f8562a51 x86: fix section mismatch warnings in kernel/setup_percpu.c
The function setup_cpu_local_masks() has been marked __init, in
order to remove the following section mismatch messages:

WARNING: vmlinux.o(.text+0x3c2c7): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
The function setup_cpu_local_masks() references
the function __init alloc_bootmem_cpumask_var().
This is often because setup_cpu_local_masks lacks a __init
annotation or the annotation of alloc_bootmem_cpumask_var is wrong.

WARNING: vmlinux.o(.text+0x3c2d3): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
The function setup_cpu_local_masks() references
the function __init alloc_bootmem_cpumask_var().
This is often because setup_cpu_local_masks lacks a __init
annotation or the annotation of alloc_bootmem_cpumask_var is wrong.

WARNING: vmlinux.o(.text+0x3c2df): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
The function setup_cpu_local_masks() references
the function __init alloc_bootmem_cpumask_var().
This is often because setup_cpu_local_masks lacks a __init
annotation or the annotation of alloc_bootmem_cpumask_var is wrong.

WARNING: vmlinux.o(.text+0x3c2eb): Section mismatch in reference from the function setup_cpu_local_masks() to the function .init.text:alloc_bootmem_cpumask_var()
The function setup_cpu_local_masks() references
the function __init alloc_bootmem_cpumask_var().
This is often because setup_cpu_local_masks lacks a __init
annotation or the annotation of alloc_bootmem_cpumask_var is wrong.

Signed-off-by: Leonardo Potenza <lpotenza@inwind.it>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18 23:59:22 +01:00
Mike Travis
b2b815d80a x86: put trigger in to detect mismatched apic versions
Impact: add debug warning

Fire off one message if two apic's discovered with different
apic versions. (this code is only called during CPU init)

The goal of this is to pave the way of the removal of the apic_version[]
array. We dont expect any apic version incompatibilities in the x86
landscape of systems [if so we dont handle them very well and probably
never will handle deep apic version assymetries well], but it's prudent
to have a debug check for one kernel cycle nevertheless.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18 21:15:27 +01:00
Ingo Molnar
b2b062b816 Merge branch 'core/percpu' into stackprotector
Conflicts:
	arch/x86/include/asm/pda.h
	arch/x86/include/asm/system.h

Also, moved include/asm-x86/stackprotector.h to arch/x86/include/asm.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-18 18:37:14 +01:00
Brian Gerst
c2558e0eba x86-64: Move isidle from PDA to per-cpu.
tj: s/isidle/is_idle/

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:59 +09:00
Brian Gerst
e7a22c1ebc x86-64: Move nodenumber from PDA to per-cpu.
tj: * s/nodenumber/node_number/
    * removed now unused pda variable from pda_init()

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:59 +09:00
Brian Gerst
5689553076 x86-64: Move irqcount from PDA to per-cpu.
tj: s/irqcount/irq_count/

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
3d1e42a7cf x86-64: Move oldrsp from PDA to per-cpu.
tj: * in asm-offsets_64.c, pda.h inclusion shouldn't be removed as pda
      is still referenced in the file
    * s/oldrsp/old_rsp/

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
9af45651f1 x86-64: Move kernelstack from PDA to per-cpu.
Also clean up PER_CPU_VAR usage in xen-asm_64.S

tj: * remove now unused stack_thread_info()
    * s/kernelstack/kernel_stack/
    * added FIXME comment in xen-asm_64.S

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
c6f5e0acd5 x86-64: Move current task from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
ea9279066d x86-64: Move cpu number from PDA to per-cpu and consolidate with 32-bit.
tj: moved cpu_number definition out of CONFIG_HAVE_SETUP_PER_CPU_AREA
    for voyager.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
92d65b2371 x86-64: Convert exception stacks to per-cpu
Move the exception stacks to per-cpu, removing specific allocation code.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
26f80bd6a9 x86-64: Convert irqstacks to per-cpu
Move the irqstackptr variable from the PDA to per-cpu.  Make the
stacks themselves per-cpu, removing some specific allocation code.
Add a seperate flag (is_boot_cpu) to simplify the per-cpu boot
adjustments.

tj: * sprinkle some underbars around.

    * irq_stack_ptr is not used till traps_init(), no reason to
      initialize it early.  On SMP, just leaving it NULL till proper
      initialization in setup_per_cpu_areas() works.  Dropped
      is_boot_cpu and early irq_stack_ptr initialization.

    * do DECLARE/DEFINE_PER_CPU(char[IRQ_STACK_SIZE], irq_stack)
      instead of (char, irq_stack[IRQ_STACK_SIZE]).

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:58 +09:00
Brian Gerst
9eb912d1aa x86-64: Move TLB state from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:57 +09:00
Brian Gerst
1b437c8c73 x86-64: Move irq stats from PDA to per-cpu and consolidate with 32-bit.
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-19 00:38:57 +09:00
Mike Travis
cef30b3a84 x86: put trigger in to detect mismatched apic versions.
Fire off one message if two apic's discovered with different
apic versions.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-16 15:58:13 -08:00
Mike Travis
6eb714c63e cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write
Impact: use new work_on_cpu function to reduce stack usage

Replace the saving of current->cpus_allowed and set_cpus_allowed_ptr() with
a work_on_cpu function for drv_read() and drv_write().

Basically converts do_drv_{read,write} into "work_on_cpu" functions that
are now called by drv_read and drv_write.

Note: This patch basically reverts 50c668d6 which reverted 7503bfba, now
that the work_on_cpu() function is more stable.

Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Dieter Ries <clip2@gmx.de>
Tested-by: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: Dave Jones <davej@redhat.com>
Cc: <cpufreq@vger.kernel.org>
2009-01-16 15:31:15 -08:00
Len Brown
88d998c264 Merge branch 'misc' into release 2009-01-16 14:45:34 -05:00
Masami Hiramatsu
5a4ccaf37f kprobes: check CONFIG_FREEZER instead of CONFIG_PM
Check CONFIG_FREEZER instead of CONFIG_PM because kprobe booster
depends on freeze_processes() and thaw_processes() when CONFIG_PREEMPT=y.

This fixes a linkage error which occurs when CONFIG_PREEMPT=y, CONFIG_PM=y
and CONFIG_FREEZER=n.

Reported-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Len Brown <len.brown@intel.com>
2009-01-16 14:32:17 -05:00
Tejun Heo
cd3adf5230 x86_64: initialize this_cpu_off to __per_cpu_load
On x86_64, if get_per_cpu_var() is used before per cpu area is setup
(if lockdep is turned on, it happens), it needs this_cpu_off to point
to __per_cpu_load.  Initialize accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16 14:20:58 +01:00
Tejun Heo
a338af2c64 x86: fix build bug introduced during merge
EXPORT_PER_CPU_SYMBOL() got misplaced during merge leading to build
failure.  Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16 14:20:43 +01:00
Ingo Molnar
6dbde35308 percpu: add optimized generic percpu accessors
It is an optimization and a cleanup, and adds the following new
generic percpu methods:

  percpu_read()
  percpu_write()
  percpu_add()
  percpu_sub()
  percpu_and()
  percpu_or()
  percpu_xor()

and implements support for them on x86. (other architectures will fall
back to a default implementation)

The advantage is that for example to read a local percpu variable,
instead of this sequence:

 return __get_cpu_var(var);

 ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
 ffffffff8102ca32:	81
 ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
 ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax

We can get a single instruction by using the optimized variants:

 return percpu_read(var);

 ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax

I also cleaned up the x86-specific APIs and made the x86 code use
these new generic percpu primitives.

tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
    * added percpu_and() for completeness's sake
    * made generic percpu ops atomic against preemption

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16 14:20:31 +01:00
Tejun Heo
004aa322f8 x86: misc clean up after the percpu update
Do the following cleanups:

* kill x86_64_init_pda() which now is equivalent to pda_init()

* use per_cpu_offset() instead of cpu_pda() when initializing
  initial_gs

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:20:26 +01:00
Tejun Heo
49357d19e4 x86: convert pda ops to wrappers around x86 percpu accessors
pda is now a percpu variable and there's no reason it can't use plain
x86 percpu accessors.  Add x86_test_and_clear_bit_percpu() and replace
pda op implementations with wrappers around x86 percpu accessors.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:20:22 +01:00
Tejun Heo
b12d8db8fb x86: make pda a percpu variable
[ Based on original patch from Christoph Lameter and Mike Travis. ]

As pda is now allocated in percpu area, it can easily be made a proper
percpu variable.  Make it so by defining per cpu symbol from linker
script and declaring it in C code for SMP and simply defining it for
UP.  This change cleans up code and brings SMP and UP closer a bit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:20:03 +01:00
Tejun Heo
9939ddaff5 x86: merge 64 and 32 SMP percpu handling
Now that pda is allocated as part of percpu, percpu doesn't need to be
accessed through pda.  Unify x86_64 SMP percpu access with x86_32 SMP
one.  Other than the segment register, operand size and the base of
percpu symbols, they behave identical now.

This patch replaces now unnecessary pda->data_offset with a dummy
field which is necessary to keep stack_canary at its place.  This
patch also moves per_cpu_offset initialization out of init_gdt() into
setup_per_cpu_areas().  Note that this change also necessitates
explicit per_cpu_offset initializations in voyager_smp.c.

With this change, x86_OP_percpu()'s are as efficient on x86_64 as on
x86_32 and also x86_64 can use assembly PER_CPU macros.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:58 +01:00
Tejun Heo
1a51e3a0ae x86: fold pda into percpu area on SMP
[ Based on original patch from Christoph Lameter and Mike Travis. ]

Currently pdas and percpu areas are allocated separately.  %gs points
to local pda and percpu area can be reached using pda->data_offset.
This patch folds pda into percpu area.

Due to strange gcc requirement, pda needs to be at the beginning of
the percpu area so that pda->stack_canary is at %gs:40.  To achieve
this, a new percpu output section macro - PERCPU_VADDR_PREALLOC() - is
added and used to reserve pda sized chunk at the start of the percpu
area.

After this change, for boot cpu, %gs first points to pda in the
data.init area and later during setup_per_cpu_areas() gets updated to
point to the actual pda.  This means that setup_per_cpu_areas() need
to reload %gs for CPU0 while clearing pda area for other cpus as cpu0
already has modified it when control reaches setup_per_cpu_areas().

This patch also removes now unnecessary get_local_pda() and its call
sites.

A lot of this patch is taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:46 +01:00
Tejun Heo
c8f3329a0d x86: use static _cpu_pda array
_cpu_pda array first uses statically allocated storage in data.init
and then switches to allocated bootmem to conserve space.  However,
after folding pda area into percpu area, _cpu_pda array will be
removed completely.  Drop the reallocation part to simplify the code
for soon-to-follow changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:40 +01:00
Tejun Heo
f32ff5388d x86: load pointer to pda into %gs while brining up a CPU
[ Based on original patch from Christoph Lameter and Mike Travis. ]

CPU startup code in head_64.S loaded address of a zero page into %gs
for temporary use till pda is loaded but address to the actual pda is
available at the point.  Load the real address directly instead.

This will help unifying percpu and pda handling later on.

This patch is mostly taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16 14:19:26 +01:00
Tejun Heo
3e5d8f9784 x86: make percpu symbols zerobased on SMP
[ Based on original patch from Christoph Lameter and Mike Travis. ]

This patch makes percpu symbols zerobased on x86_64 SMP by adding
PERCPU_VADDR() to vmlinux.lds.h which helps setting explicit vaddr on
the percpu output section and using it in vmlinux_64.lds.S.  A new
PHDR is added as existing ones cannot contain sections near address
zero.  PERCPU_VADDR() also adds a new symbol __per_cpu_load which
always points to the vaddr of the loaded percpu data.init region.

The following adjustments have been made to accomodate the address
change.

* code to locate percpu gdt_page in head_64.S is updated to add the
  load address to the gdt_page offset.

* __per_cpu_load is used in places where access to the init data area
  is necessary.

* pda->data_offset is initialized soon after C code is entered as zero
  value doesn't work anymore.

This patch is mostly taken from Mike Travis' "x86_64: Base percpu
variables at zero" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:14 +01:00
Tejun Heo
a698c823e1 x86: make vmlinux_32.lds.S use PERCPU() macro
Make vmlinux_32.lds.S use the generic PERCPU() macro instead of open
coding it.  This will ease future changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:09 +01:00