linux

Author	SHA1	Message	Date
David P. Reed	4637a74cf2	[PATCH] x86-64: Avoid overflows during apic timer calibration - Use 64bit TSC calculations to avoid handling overflow - Use 32bit unsigned arithmetic for the APIC timer. This way overflows are handled correctly. - Fix exit check of loop to account for apic timer counting down Signed-off-by: dpreed@reed.com Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:20 +02:00
Andi Kleen	05cb007dac	[PATCH] x86-64: Use the 32bit wd_ops for 64bit too. This mainly removes a lot of code, replacing it with calls into the new 32bit perfctr-watchdog.c Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:20 +02:00
Suresh Siddha	e3f1caeef9	[PATCH] x86-64: set node_possible_map at runtime - try 2 Set the node_possible_map at runtime on x86_64. On a non NUMA system, num_possible_nodes() will now say '1'. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: David Rientjes <rientjes@google.com> Cc: Christoph Lameter <clameter@engr.sgi.com>	2007-05-02 19:27:20 +02:00
Tim Hockin	8a336b0a4b	[PATCH] x86-64: Dynamically adjust machine check interval Background: We've found that MCEs (specifically DRAM SBEs) tend to come in bunches, especially when we are trying really hard to stress the system out. The current MCE poller uses a static interval which does not care whether it has or has not found MCEs recently. Description: This patch makes the MCE poller adjust the polling interval dynamically. If we find an MCE, poll 2x faster (down to 10 ms). When we stop finding MCEs, poll 2x slower (up to check_interval seconds). The check_interval tunable becomes the max polling interval. The "Machine check events logged" printk() is rate limited to the check_interval, which should be identical behavior to the old functionality. Result: If you start to take a lot of correctable errors (not exceptions), you log them faster and more accurately (less chance of overflowing the MCA registers). If you don't take a lot of errors, you will see no change. Alternatives: I considered simply reducing the polling interval to 10 ms immediately and keeping it there as long as we continue to find errors. This felt a bit heavy handed, but does perform significantly better for the default check_interval of 5 minutes (we're using a few seconds when testing for DRAM errors). I could be convinced to go with this, if anyone felt it was not too aggressive. Testing: I used an error-injecting DIMM to create lots of correctable DRAM errors and verified that the polling interval accelerates. The printk() only happens once per check_interval seconds. Patch: This patch is against 2.6.21-rc7. Signed-Off-By: Tim Hockin <thockin@google.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:19 +02:00
Andrew Morton	425001fea7	[PATCH] x86-64: unexport cpu_llc_id WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.data:cpu_llc_id from __ksymtab between '__ksymtab_cpu_llc_id' (at offset 0x4a0) and '__ksymtab_smp_num_siblings' It is strange to export a __cpuinitdata symbols to modules, and no module appears to use it anyway. Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:19 +02:00
Andi Kleen	57a4f91ae5	[PATCH] x86-64: Auto compute __NR_syscall_max at compile time No need to maintain it anymore Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:18 +02:00
Eric Dumazet	141a892f57	[PATCH] x86-64: move __vgetcpu_mode & __jiffies to the vsyscall_2 zone We apparently hit the 1024 limit of vsyscall_0 zone when some debugging options are set, or if __vsyscall_gtod_data is 64 bytes larger. In order to save 128 bytes from the vsyscall_0 zone, we move __vgetcpu_mode & __jiffies to vsyscall_2 zone where they really belong, since they are used only from vgetcpu() (which is in this vsyscall_2 area). After patch is applied, new layout is : ffffffffff600000 T vgettimeofday ffffffffff60004e t vsysc2 ffffffffff600140 t vread_hpet ffffffffff600150 t vread_tsc ffffffffff600180 D __vsyscall_gtod_data ffffffffff600400 T vtime ffffffffff600413 t vsysc1 ffffffffff600800 T vgetcpu ffffffffff600870 D __vgetcpu_mode ffffffffff600880 D __jiffies ffffffffff600c00 T venosys_1 Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-05-02 19:27:18 +02:00
Fernando Luis [ ISO-8859-1 charset ] V�zquezCao	9062d888aa	[PATCH] x86-64: __send_IPI_dest_field - x86_64 Implement __send_IPI_dest_field which can be used to send IPIs when the "destination shorthand" field of the ICR is set to 00 (destination field). Use it whenever possible. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:18 +02:00
Fernando Luis VazquezCao	3144c332fa	[PATCH] x86-64: use safe_apic_wait_icr_idle in smpboot.c - x86_64 inquire_remote_apic is used for APIC debugging, so use safe_apic_wait_icr_idle instead of apic_wait_icr_idle to avoid possible lockups when APIC delivery fails. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:17 +02:00
Fernando Luis VazquezCao	ea8c733b98	[PATCH] x86-64: use safe_apic_wait_icr_idle in smpboot.c - x86_64 The functionality provided by the new safe_apic_wait_icr_idle is being open-coded all over "kernel/smpboot.c". Use safe_apic_wait_icr_idle instead to consolidate code and ease maintenance. Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:17 +02:00
Fernando Luis VazquezCao	8339e9fba3	[PATCH] x86-64: safe_apic_wait_icr_idle - x86_64 apic_wait_icr_idle looks like this: static __inline__ void apic_wait_icr_idle(void) { while (apic_read(APIC_ICR) & APIC_ICR_BUSY) cpu_relax(); } The busy loop in this function would not be problematic if the corresponding status bit in the ICR were always updated, but that does not seem to be the case under certain crash scenarios. Kdump uses an IPI to stop the other CPUs in the event of a crash, but when any of the other CPUs are locked-up inside the NMI handler the CPU that sends the IPI will end up looping forever in the ICR check, effectively hard-locking the whole system. Quoting from Intel's "MultiProcessor Specification" (Version 1.4), B-3: "A local APIC unit indicates successful dispatch of an IPI by resetting the Delivery Status bit in the Interrupt Command Register (ICR). The operating system polls the delivery status bit after sending an INIT or STARTUP IPI until the command has been dispatched. A period of 20 microseconds should be sufficient for IPI dispatch to complete under normal operating conditions. If the IPI is not successfully dispatched, the operating system can abort the command. Alternatively, the operating system can retry the IPI by writing the lower 32-bit double word of the ICR. This “time-out” mechanism can be implemented through an external interrupt, if interrupts are enabled on the processor, or through execution of an instruction or time-stamp counter spin loop." Intel's documentation suggests the implementation of a time-out mechanism, which, by the way, is already being open-coded in some parts of the kernel that tinker with ICR. Create a apic_wait_icr_idle replacement that implements the time-out mechanism and that can be used to solve the aforementioned problem. AK: moved both functions out of line AK: Added improved loop from Keith Owens Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:17 +02:00
Bernhard Kaindl	3ebad59056	[PATCH] x86: Save and restore the fixed-range MTRRs of the BSP when suspending Note: This patch didn'nt need an update since it's initial post. Some BIOSes may modify fixed-range MTRRs in SMM, e.g. when they transition the system into ACPI mode, which is entered thru an SMI, triggered by Linux in acpi_enable(). SMIs which cause that Linux is interrupted and BIOS code is executed (which may change e.g. fixed-range MTRRs) in SMM may be raised by an embedded system controller which is often found in notebooks also at other occasions. If we would not update our copy of the fixed-range MTRRs before suspending to RAM or to disk, restore_processor_state() would set the fixed-range MTRRs of the BSP using old backup values which may be outdated and this could cause the system to fail later during resume. This patch ensures that our copy of the fixed-range MTRRs is updated when saving the boot processor state on suspend to disk and suspend to RAM. In combination with other patches this allows to fix s2ram and s2disk on the Acer Ferrari 1000 notebook and at least s2disk on the Acer Ferrari 5000 notebook. Signed-off-by: Bernhard Kaindl <bk@suse.de> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <ak@suse.de> Cc: Dave Jones <davej@codemonkey.org.uk>	2007-05-02 19:27:17 +02:00
Bernhard Kaindl	2b1f6278d7	[PATCH] x86: Save the MTRRs of the BSP before booting an AP Applied fix by Andew Morton: http://lkml.org/lkml/2007/4/8/88 - Fix `make headers_check'. AMD and Intel x86 CPU manuals state that it is the responsibility of system software to initialize and maintain MTRR consistency across all processors in Multi-Processing Environments. Quote from page 188 of the AMD64 System Programming manual (Volume 2): 7.6.5 MTRRs in Multi-Processing Environments "In multi-processing environments, the MTRRs located in all processors must characterize memory in the same way. Generally, this means that identical values are written to the MTRRs used by the processors." (short omission here) "Failure to do so may result in coherency violations or loss of atomicity. Processor implementations do not check the MTRR settings in other processors to ensure consistency. It is the responsibility of system software to initialize and maintain MTRR consistency across all processors." Current Linux MTRR code already implements the above in the case that the BIOS does not properly initialize MTRRs on the secondary processors, but the case where the fixed-range MTRRs of the boot processor are changed after Linux started to boot, before the initialsation of a secondary processor, is not handled yet. In this case, secondary processors are currently initialized by Linux with MTRRs which the boot processor had very early, when mtrr_bp_init() did run, but not with the MTRRs which the boot processor uses at the time when that secondary processors is actually booted, causing differing MTRR contents on the secondary processors. Such situation happens on Acer Ferrari 1000 and 5000 notebooks where the BIOS enables and sets AMD-specific IORR bits in the fixed-range MTRRs of the boot processor when it transitions the system into ACPI mode. The SMI handler of the BIOS does this in SMM, entered while Linux ACPI code runs acpi_enable(). Other occasions where the SMI handler of the BIOS may change bits in the MTRRs could occur as well. To initialize newly booted secodary processors with the fixed-range MTRRs which the boot processor uses at that time, this patch saves the fixed-range MTRRs of the boot processor before new secondary processors are started. When the secondary processors run their Linux initialisation code, their fixed-range MTRRs will be updated with the saved fixed-range MTRRs. If CONFIG_MTRR is not set, we define mtrr_save_state as an empty statement because there is nothing to do. Possible TODOs: ) CPU-hotplugging outside of SMP suspend/resume is not yet tested with this patch. ) If, even in this case, an AP never runs i386/do_boot_cpu or x86_64/cpu_up, then the calls to mtrr_save_state() could be replaced by calls to mtrr_save_fixed_ranges(NULL) and mtrr_save_state() would not be needed. That would need either verification of the CPU-hotplug code or at least a test on a >2 CPU machine. *) The MTRRs of other running processors are not yet checked at this time but it might be interesting to syncronize the MTTRs of all processors before booting. That would be an incremental patch, but of rather low priority since there is no machine known so far which would require this. AK: moved prototypes on x86-64 around to fix warnings Signed-off-by: Bernhard Kaindl <bk@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Dave Jones <davej@codemonkey.org.uk>	2007-05-02 19:27:17 +02:00
Jeremy Fitzhardinge	57decbda6a	[PATCH] x86: update for i386 and x86-64 check_bugs Remove spurious comments, headers and keywords from x86-64 bugs.[ch]. Use identify_boot_cpu() AK: merged with other patch Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:16 +02:00
Jeremy Fitzhardinge	35c7422649	[PATCH] x86: deflate stack usage in lib/inflate.c inflate_fixed and huft_build together use around 2.7k of stack. When using 4k stacks, I saw stack overflows from interrupts arriving while unpacking the root initrd: do_IRQ: stack overflow: 384 [<c0106b64>] show_trace_log_lvl+0x1a/0x30 [<c01075e6>] show_trace+0x12/0x14 [<c010763f>] dump_stack+0x16/0x18 [<c0107ca4>] do_IRQ+0x6d/0xd9 [<c010202b>] xen_evtchn_do_upcall+0x6e/0xa2 [<c0106781>] xen_hypervisor_callback+0x25/0x2c [<c010116c>] xen_restore_fl+0x27/0x29 [<c0330f63>] _spin_unlock_irqrestore+0x4a/0x50 [<c0117aab>] change_page_attr+0x577/0x584 [<c0117b45>] kernel_map_pages+0x8d/0xb4 [<c016a314>] cache_alloc_refill+0x53f/0x632 [<c016a6c2>] __kmalloc+0xc1/0x10d [<c0463d34>] malloc+0x10/0x12 [<c04641c1>] huft_build+0x2a7/0x5fa [<c04645a5>] inflate_fixed+0x91/0x136 [<c04657e2>] unpack_to_rootfs+0x5f2/0x8c1 [<c0465acf>] populate_rootfs+0x1e/0xe4 (This was under Xen, but there's no reason it couldn't happen on bare hardware.) This patch mallocs the local variables, thereby reducing the stack usage to sane levels. Also, up the heap size for the kernel decompressor to deal with the extra allocation. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Tim Yamin <plasmaroo@gentoo.org> Cc: Andi Kleen <ak@suse.de> Cc: Matt Mackall <mpm@selenic.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com>	2007-05-02 19:27:15 +02:00
James Puthukattukaran	82d1bb725e	[PATCH] x86-64: x86-64 system crashes when no memory populating Node 0 I have a 4 socket AMD Operton system. The 2.6.18 kernel I have crashes when there is no memory in node0. AK: changed call to _nopanic Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:13 +02:00
Glauber de Oliveira Costa	1c3d99c11c	[PATCH] x86-64: Fix x86_64 compilation with DEBUG_SIG on Setting the DEBUG_SIG flag breaks compilation due to a wrong struct access. Aditionally, it raises two warnings. This is one patch to fix them all. Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:13 +02:00
Jeremy Fitzhardinge	b6e3590f81	[PATCH] x86: Allow percpu variables to be page-aligned Let's allow page-alignment in general for per-cpu data (wanted by Xen, and Ingo suggested KVM as well). Because larger alignments can use more room, we increase the max per-cpu memory to 64k rather than 32k: it's getting a little tight. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-05-02 19:27:12 +02:00
Andi Kleen	f039b75471	[PATCH] x86: Don't use MWAIT on AMD Family 10 It doesn't put the CPU into deeper sleep states, so it's better to use the standard idle loop to save power. But allow to reenable it anyways for benchmarking. I also removed the obsolete idle=halt on i386 Cc: andreas.herrmann@amd.com Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:12 +02:00
Jeremy Fitzhardinge	c169859d6d	[PATCH] x86-64: Clean up asm-x86_64/bugs.h Most of asm-x86_64/bugs.h is code which should be in a C file, so put it there. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-02 19:27:12 +02:00
Jan Beulich	b92e9fac40	[PATCH] x86: fix amd64-agp aperture validation Under CONFIG_DISCONTIGMEM, assuming that a !pfn_valid() implies all subsequent pfn-s are also invalid is wrong. Thus replace this by explicitly checking against the E820 map. AK: make e820 on x86-64 not initdata Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>	2007-05-02 19:27:11 +02:00
Bernhard Walle	141f9cfe0a	[PATCH] x86-64: Fix "Section mismatch" compile warning Fix "Section mismatch" warnings in arch/x86_64/kernel/time.c Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:11 +02:00
Jan Beulich	dd4ecfc2b1	[PATCH] x86-64: adjust EDID retrieval commit `5e518d7672` did the same change to i386's variant. With this change, i386's and x86-64's versions are identical, raising the question whether the x86-64 one should go (just like there's only one instance of edd.S). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:11 +02:00
Konrad Rzeszutek	ae32b1297a	[PATCH] x86-64: Inhibit machine from asserting an NMI when doing Alt-SysRq-M operation. This patch touches the NMI watchdog every MAX_ORDER_NR_PAGES to inhibit the machine from triggering an NMI while the CPUs are locked. This situation is happening on boxes with more than 64CPUs and 128GB of RAM when Alt-SysRq-m is performed. It has been succesfully tested for regression on uni, 2, 4, 8 32, and 64 CPU boxes with various memory configuration. Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:11 +02:00
Eric Dumazet	c8118c6c07	[PATCH] x86-64: vsyscall_gtod_data diet and vgettimeofday() fix Current vsyscall_gtod_data is large (3 or 4 cache lines dirtied at timer interrupt). We can shrink it to exactly 64 bytes (1 cache line on AMD64) Instead of copying a whole struct clocksource, we copy only needed fields. I deleted an unused field : offset_base This patch fixes one oddity in vgettimeofday(): It can returns a timeval with tv_usec = 1000000. Maybe not a bug, but why not doing the right thing ? Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:11 +02:00
Eric Dumazet	272a3713bb	[PATCH] x86-64: fix vtime() vsyscall There is a tiny probability that the return value from vtime(time_t t) is Signed-off-by: Andi Kleen <ak@suse.de> different than the value stored in t Using a temporary variable solves the problem and gives a faster code. 17: 48 85 ff test %rdi,%rdi 1a: 48 8b 05 00 00 00 00 mov 0(%rip),%rax # __vsyscall_gtod_data.wall_time_tv.tv_sec 21: 74 03 je 26 23: 48 89 07 mov %rax,(%rdi) 26: c9 leaveq 27: c3 retq Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>	2007-05-02 19:27:11 +02:00
Adrian Bunk	bd8559c38e	[PATCH] x86: remove UNEXPECTED_IO_APIC() Many years ago, UNEXPECTED_IO_APIC() contained printk()'s (but nothing more). Now that it's completely empty for years, we can as well remove it. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:11 +02:00
Adrian Bunk	ca906e4231	[PATCH] x86: sys_ioperm() prototype cleanup - there's no reason for duplicating the prototype from include/linux/syscalls.h in include/asm-x86_64/unistd.h - every file should #include the headers containing the prototypes for it's global functions Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:10 +02:00
Christoph Lameter	2bff73830c	[PATCH] x86-64: use lru instead of page->index and page->private for pgd lists management. x86_64 currently simulates a list using the index and private fields of the page struct. Seems that the code was inherited from i386. But x86_64 does not use the slab to allocate pgds and pmds etc. So the lru field is not used by the slab and therefore available. This patch uses standard list operations on page->lru to realize pgd tracking. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-05-02 19:27:10 +02:00
Andi Kleen	b8716890f3	[PATCH] x86-64: Remove unused stext symbol suggested by Jan Beulich Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:10 +02:00
Jan Beulich	6fb14755a6	[PATCH] x86: tighten kernel image page access rights On x86-64, kernel memory freed after init can be entirely unmapped instead of just getting 'poisoned' by overwriting with a debug pattern. On i386 and x86-64 (under CONFIG_DEBUG_RODATA), kernel text and bug table can also be write-protected. Compared to the first version, this one prevents re-creating deleted mappings in the kernel image range on x86-64, if those got removed previously. This, together with the original changes, prevents temporarily having inconsistent mappings when cacheability attributes are being changed on such pages (e.g. from AGP code). While on i386 such duplicate mappings don't exist, the same change is done there, too, both for consistency and because checking pte_present() before using various other pte_XXX functions is a requirement anyway. At once, i386 code gets adjusted to use pte_huge() instead of open coding this. AK: split out cpa() changes Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:10 +02:00
Jan Beulich	d01ad8dd56	[PATCH] x86: Improve handling of kernel mappings in change_page_attr Fix various broken corner cases in i386 and x86-64 change_page_attr. AK: split off from tighten kernel image access rights Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:10 +02:00
Bernhard Walle	8f9aeca7a0	[PATCH] x86: add command line length to boot protocol Because the command line is increased to 2048 characters after 2.6.21, it's not possible for boot loaders and userspace tools to determine the length of the command line the kernel can understand. The benefit of knowing the length is that users can be warned if the command line size is too long which prevents surprise if things don't work after bootup. This patch updates the boot protocol to contain a field called "cmdline_size" that contain the length of the command line (excluding the terminating zero). The patch also adds missing fields (of protocol version 2.05) to the x86_64 setup code. Signed-off-by: Bernhard Walle <bwalle@suse.de> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Alon Bar-Lev <alon.barlev@gmail.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-05-02 19:27:10 +02:00
Joerg Roedel	d824395c59	[PATCH] x86: remove constant_tsc reporting from /proc/cpuinfo' power flags remove the reporting of the constant_tsc flag from the "power management" field in /proc/cpuinfo. The NULL value there was replaced by "" because the former would result in a printout of [8] if the flag is set. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:09 +02:00
David Rientjes	382591d500	[PATCH] x86-64: fixed size remaining fake nodes Extends the numa=fake x86_64 command-line option to split the remaining system memory into nodes of fixed size. Any leftover memory is allocated to a final node unless the command-line ends with a comma. For example: numa=fake=2512,128 gives two 512M nodes and the remaining system memory is split into nodes of 128M each. This is beneficial for systems where the exact size of RAM is unknown or not necessarily relevant, but the size of the remaining nodes to be allocated is known based on their capacity for resource management. Cc: Andi Kleen <ak@suse.de> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Paul Jackson <pj@sgi.com> Cc: Christoph Lameter <clameter@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-05-02 19:27:09 +02:00
David Rientjes	14694d736b	[PATCH] x86-64: split remaining fake nodes equally Extends the numa=fake x86_64 command-line option to split the remaining system memory into equal-sized nodes. For example: numa=fake=2512,4 gives two 512M nodes and the remaining system memory is split into four approximately equal chunks. This is beneficial for systems where the exact size of RAM is unknown or not necessarily relevant, but the granularity with which nodes shall be allocated is known. Cc: Andi Kleen <ak@suse.de> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Paul Jackson <pj@sgi.com> Cc: Christoph Lameter <clameter@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-05-02 19:27:09 +02:00
David Rientjes	8b8ca80e19	[PATCH] x86-64: configurable fake numa node sizes Extends the numa=fake x86_64 command-line option to allow for configurable node sizes. These nodes can be used in conjunction with cpusets for coarse memory resource management. The old command-line option is still supported: numa=fake=32 gives 32 fake NUMA nodes, ignoring the NUMA setup of the actual machine. But now you may configure your system for the node sizes of your choice: numa=fake=2512,1024,2256 gives two 512M nodes, one 1024M node, two 256M nodes, and the rest of system memory to a sixth node. The existing hash function is maintained to support the various node sizes that are possible with this implementation. Each node of the same size receives roughly the same amount of available pages, regardless of any reserved memory with its address range. The total available pages on the system is calculated and divided by the number of equal nodes to allocate. These nodes are then dynamically allocated and their borders extended until such time as their number of available pages reaches the required size. Configurable node sizes are recommended when used in conjunction with cpusets for memory control because it eliminates the overhead associated with scanning the zonelists of many smaller full nodes on page_alloc(). Cc: Andi Kleen <ak@suse.de> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Paul Jackson <pj@sgi.com> Cc: Christoph Lameter <clameter@engr.sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-05-02 19:27:09 +02:00
Adrian Bunk	786142fab8	[PATCH] x86-64: make simnow_init() static Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:09 +02:00
Yinghai Lu	f0e13ae76a	[PATCH] x86-64: remove extra smp_processor_id calling Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andi Kleen <ak@muc.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:08 +02:00
Ralf Baechle	9f7290ed23	[PATCH] x86-64: fix ia32_binfmt.c build error Reorder code to avoid multiple inclusion of elf.h. #undef several symbols to avoid build errors over redefinitions. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-05-02 19:27:08 +02:00
john stultz	5a90cf205c	[PATCH] x86: Log reason why TSC was marked unstable Change mark_tsc_unstable() so it takes a string argument, which holds the reason the TSC was marked unstable. This is then displayed the first time mark_tsc_unstable is called. This should help us better debug why the TSC was marked unstable on certain systems and allow us to make sure we're not being overly paranoid when throwing out this troublesome clocksource. Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:08 +02:00
Andi Kleen	d039c688c6	[PATCH] x86-64: Minor white space cleanup in traps.c Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:08 +02:00
Andi Kleen	fb60b8392c	[PATCH] x86-64: Allow sys_uselib unconditionally Previously it wasn't enabled in the binfmt_aout is a module case. Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:08 +02:00
Andi Kleen	1652fcbf37	[PATCH] x86-64: Don't disable basic block reordering When compiling with -Os (which is default) the compiler defaults to it anyways. And with -O2 it probably generates somewhat better (although also larger) code. Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:08 +02:00
Vivek Goyal	a4831e08b7	[PATCH] x86-64: Move cpu verification code to common file o This patch moves the code to verify long mode and SSE to a common file. This code is now shared by trampoline.S, wakeup.S, boot/setup.S and boot/compressed/head.S o So far we used to do very limited check in trampoline.S, wakeup.S and in 32bit entry point. Now all the entry paths are forced to do the exhaustive check, including SSE because verify_cpu is shared. o I am keeping this patch as last in the x86 relocatable series because previous patches have got quite some amount of testing done and don't want to distrub that. So that if there is problem introduced by this patch, at least it can be easily isolated. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:08 +02:00
Vivek Goyal	8035d3ea78	[PATCH] x86-64: Extend bzImage protocol for relocatable bzImage o Extend the bzImage protocol (same as i386) to allow bzImage loaders to load the protected mode kernel at non-1MB address. Now protected mode component is relocatable and can be loaded at non-1MB addresses. o As of today kdump uses it to run a second kernel from a reserved memory area. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:08 +02:00
Vivek Goyal	6a50a664ca	[PATCH] x86-64: build-time checking o X86_64 kernel should run from 2MB aligned address for two reasons. - Performance. - For relocatable kernels, page tables are updated based on difference between compile time address and load time physical address. This difference should be multiple of 2MB as kernel text and data is mapped using 2MB pages and PMD should be pointing to a 2MB aligned address. Life is simpler if both compile time and load time kernel addresses are 2MB aligned. o Flag the error at compile time if one is trying to build a kernel which does not meet alignment restrictions. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2007-05-02 19:27:08 +02:00
Vivek Goyal	1ab60e0f72	[PATCH] x86-64: Relocatable Kernel Support This patch modifies the x86_64 kernel so that it can be loaded and run at any 2M aligned address, below 512G. The technique used is to compile the decompressor with -fPIC and modify it so the decompressor is fully relocatable. For the main kernel the page tables are modified so the kernel remains at the same virtual address. In addition a variable phys_base is kept that holds the physical address the kernel is loaded at. __pa_symbol is modified to add that when we take the address of a kernel symbol. When loaded with a normal bootloader the decompressor will decompress the kernel to 2M and it will run there. This both ensures the relocation code is always working, and makes it easier to use 2M pages for the kernel and the cpu. AK: changed to not make RELOCATABLE default in Kconfig Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	0dbf7028c0	[PATCH] x86: __pa and __pa_symbol address space separation Currently __pa_symbol is for use with symbols in the kernel address map and __pa is for use with pointers into the physical memory map. But the code is implemented so you can usually interchange the two. __pa which is much more common can be implemented much more cheaply if it is it doesn't have to worry about any other kernel address spaces. This is especially true with a relocatable kernel as __pa_symbol needs to peform an extra variable read to resolve the address. There is a third macro that is added for the vsyscall data __pa_vsymbol for finding the physical addesses of vsyscall pages. Most of this patch is simply sorting through the references to __pa or __pa_symbol and using the proper one. A little of it is continuing to use a physical address when we have it instead of recalculating it several times. swapper_pgd is now NULL. leave_mm now uses init_mm.pgd and init_mm.pgd is initialized at boot (instead of compile time) to the physmem virtual mapping of init_level4_pgd. The physical address changed. Except for the for EMPTY_ZERO page all of the remaining references to __pa_symbol appear to be during kernel initialization. So this should reduce the cost of __pa in the common case, even on a relocated kernel. As this is technically a semantic change we need to be on the lookout for anything I missed. But it works for me (tm). Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	49c3df6aaa	[PATCH] x86: Move swsusp __pa() dependent code to arch portion o __pa() should be used only on kernel linearly mapped virtual addresses and not on kernel text and data addresses. o Hibernation code needs to determine the physical address associated with kernel symbol to mark a section boundary which contains pages which don't have to be saved and restored during hibernate/resume operation. o Move this piece of code in arch dependent section. So that architectures which don't have kernel text/data mapped into kernel linearly mapped region can come up with their own ways of determining physical addresses associated with a kernel text. Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	cfd243d4af	[PATCH] x86-64: Remove the identity mapping as early as possible With the rewrite of the SMP trampoline and the early page allocator there is nothing that needs identity mapped pages, once we start executing C code. So add zap_identity_mappings into head64.c and remove zap_low_mappings() from much later in the code. The functions are subtly different thus the name change. This also kills boot_level4_pgt which was from an earlier attempt to move the identity mappings as early as possible, and is now no longer needed. Essentially I have replaced boot_level4_pgt with trampoline_level4_pgt in trampoline.S Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	bdb96a6614	[PATCH] x86-64: Modify discover_ebda to use virtual addresses Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	d8e1baf10d	[PATCH] x86-64: 64bit ACPI wakeup trampoline o Moved wakeup_level4_pgt into the wakeup routine so we can run the kernel above 4G. o Now we first go to 64bit mode and continue to run from trampoline and then then start accessing kernel symbols and restore processor context. This enables us to resume even in relocatable kernel context when kernel might not be loaded at physical addr it has been compiled for. o Removed the need for modifying any existing kernel page table. o Increased the size of the wakeup routine to 8K. This is required as wake page tables are on trampoline itself and they got to be at 4K boundary, hence one page is not sufficient. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	275f55170e	[PATCH] x86-64: wakeup.S misc cleanups o Various cleanups. One of the main purpose of cleanups is that make wakeup.S as close as possible to trampoline.S. o Following are the changes - Indentations for comments. - Changed the gdt table to compact form and to resemble the one in trampoline.S - Take the jump to 32bit from real mode using ljmpl. Makes code more readable. - After enabling long mode, directly take a long jump for 64bit mode. No need to take an extra jump to "reach_comaptibility_mode" - Stack is not used after real mode. So don't load stack in 32 bit mode. - No need to enable PGE here. - No need to do extra EFER read, anyway we trash the read contents. - No need to enable system call (EFER_SCE). Anyway it will be enabled when original EFER is restored. - No need to set MP, ET, NE, WP, AM bits in cr0. Very soon we will reload the original cr0 while restroing the processor state. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	7db681d7e4	[PATCH] x86-64: wakeup.S rename registers to reflect right names o Use appropriate names for 64bit regsiters. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	7c17e70613	[PATCH] x86-64: Get rid of dead code in suspend resume o Get rid of dead code in wakeup.S o We never restore from saved_gdt, saved_idt, saved_ltd, saved_tss, saved_cr3, saved_cr4, saved_cr0, real_save_gdt, saved_efer, saved_efer2. Get rid of of associated code. o Get rid of bogus_magic, bogus_31_magic and bogus_magic2. No longer being used. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	90b1c2085e	[PATCH] x86-64: 64bit PIC SMP trampoline This modifies the SMP trampoline and all of the associated code so it can jump to a 64bit kernel loaded at an arbitrary address. The dependencies on having an idenetity mapped page in the kernel page tables for SMP bootup have all been removed. In addition the trampoline has been modified to verify that long mode is supported. Asking if long mode is implemented is down right silly but we have traditionally had some of these checks, and they can't hurt anything. So when the totally ludicrous happens we just might handle it correctly. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	3c321bceb4	[PATCH] x86-64: Add EFER to the register set saved by save_processor_state EFER varies like %cr4 depending on the cpu capabilities, and which cpu capabilities we want to make use of. So save/restore it make certain we have the same EFER value when we are done. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	30f4728954	[PATCH] x86-64: cleanup segments Move __KERNEL32_CS up into the unused gdt entry. __KERNEL32_CS is used when entering the kernel so putting it first is useful when trying to keep boot gdt sizes to a minimum. Set the accessed bit on all gdt entries. We don't care so there is no need for the cpu to burn the extra cycles, and it potentially allows the pages to be immutable. Plus it is confusing when debugging and your gdt entries mysteriously change. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	278c0eb7f9	[PATCH] x86-64: modify copy_bootdata to use virtual addresses Use virtual addresses instead of physical addresses in copy bootdata. In addition fix the implementation of the old bootloader convention. Everything is at real_mode_data always. It is just that sometimes real_mode_data was relocated by setup.S to not sit at 0x90000. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:07 +02:00
Vivek Goyal	93fd755e47	[PATCH] x86-64: Fix early printk to use standard ISA mapping Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:06 +02:00
Vivek Goyal	67dcbb6bc6	[PATCH] x86-64: Clean up the early boot page table - Merge physmem_pgt and ident_pgt, removing physmem_pgt. The merge is broken as soon as mm/init.c:init_memory_mapping is run. - As physmem_pgt is gone don't export it in pgtable.h. - Use defines from pgtable.h for page permissions. - Fix the physical memory identity mapping so it is at the correct address. - Remove the physical memory mapping from wakeup_level4_pgt it is at the wrong address so we can't possibly be usinging it. - Simply NEXT_PAGE the work to calculate the phys_ alias of the labels was very cool. Unfortuantely it was a brittle special purpose hack that makes maitenance more difficult. Instead just use label - __START_KERNEL_map like we do everywhere else in assembly. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:06 +02:00
Vivek Goyal	dafe41ee3a	[PATCH] x86-64: Kill temp boot pmds Early in the boot process we need the ability to set up temporary mappings, before our normal mechanisms are initialized. Currently this is used to map pages that are part of the page tables we are building and pages during the dmi scan. The core problem is that we are using the user portion of the page tables to implement this. Which means that while this mechanism is active we cannot catch NULL pointer dereferences and we deviate from the normal ways of handling things. In this patch I modify early_ioremap to map pages into the kernel portion of address space, roughly where we will later put modules, and I make the discovery of which addresses we can use dynamic which removes all kinds of static limits and remove the dependencies on implementation details between different parts of the code. Now alloc_low_page() and unmap_low_page() use early_iomap() and early_iounmap() to allocate/map and unmap a page. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:06 +02:00
Stephen Hemminger	e658450455	[PATCH] x86-64: dma_ops as const The dma_ops structure can be const since it never changes after boot. Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:06 +02:00
Joerg Roedel	6b37f5a20c	[PATCH] x86-64: fix cpu MHz reporting on constant_tsc cpus This patch fixes the reporting of cpu_mhz in /proc/cpuinfo on CPUs with a constant TSC rate and a kernel with disabled cpufreq. Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Andi Kleen <ak@suse.de> arch/x86_64/kernel/apic.c \| 2 - arch/x86_64/kernel/time.c \| 58 +++++++++++++++++++++++++++++++++++++++--- arch/x86_64/kernel/tsc.c \| 12 +++++--- arch/x86_64/kernel/tsc_sync.c \| 2 - include/asm-x86_64/proto.h \| 1 5 files changed, 65 insertions(+), 10 deletions(-)	2007-05-02 19:27:06 +02:00
Andi Kleen	d9c93813ac	[PATCH] x86-64: Correct max number of CPUs in Kconfig Pointed out by Adrian Bunk Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:05 +02:00
Stephane Eranian	405e494d91	[PATCH] x86-64: x86_64 make NMI use PERFCTR1 for architectural perfmon (take 2) Hello, This patch against 2.6.20-git14 makes the NMI watchdog use PERFSEL1/PERFCTR1 instead of PERFSEL0/PERFCTR0 on processors supporting Intel architectural perfmon, such as Intel Core 2. Although all PMU events can work on both counters, the Precise Event-Based Sampling (PEBS) requires that the event be in PERFCTR0 to work correctly (see section 18.14.4.1 in the IA32 SDM Vol 3b). This versions has 3 chunks compared to previous where we had missed on check. Changelog: - make the x86-64 NMI watchdog use PERFSEL1/PERFCTR1 instead of PERFSEL0/PERFCTR0 on processors supporting the Intel architectural perfmon (e.g. Core 2 Duo). This allows PEBS to work when the NMI watchdog is active. signed-off-by: stephane eranian <eranian@hpl.hp.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:05 +02:00
Andi Kleen	803d80f650	[PATCH] x86-64: Some cleanup in time.c Move prototypes into header files Remove unneeded includes. Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:05 +02:00
Andi Kleen	d189518342	[PATCH] x86: Fix i386 and x86_64 fault information pollution a userspace fault or a kernelspace fault which will result in the immediate death of the process. They should not be filled in as a result of a kernelspace fault which can be fixed up. Otherwise, if the process is handling SIGSEGV and examining the fault information, this can result in the kernel space fault trashing the previously stored fault information if it arrives between the userspace fault happening and the SIGSEGV being delivered to the process. Signed-off-by: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: Jan Beulich <jbeulich@novell.com> -- arch/i386/kernel/traps.c \| 24 ++++++++++++++++++------ arch/x86_64/kernel/traps.c \| 30 +++++++++++++++++++++++------- 2 files changed, 41 insertions(+), 13 deletions(-)	2007-05-02 19:27:05 +02:00
Jan Beulich	3755090722	[PATCH] x86-64: a few missing entry.S annotations Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:05 +02:00
Jan Beulich	9964cf7d77	[PATCH] x86: consolidate smp_send_stop() Synchronize i386's smp_send_stop() with x86-64's in only try-locking the call lock to prevent deadlocks when called from panic(). In both version, disable interrupts before clearing the CPU off the online map to eliminate races with IRQ handlers inspecting this map. Also in both versions, save/restore interrupts rather than disabling/ enabling them. On x86-64, eliminate one function used here by folding it into its single caller, convert to static, and rename for consistency with i386 (lkcd may like this). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:05 +02:00
Jan Beulich	b0354795c9	[PATCH] x86-64: adjust inclusion of asm/vsyscall32.h Avoid including asm/vsyscall32.h in virtually every source file. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:04 +02:00
Jan Beulich	00f1ea6967	[PATCH] x86: adjust inclusion of asm/fixmap.h Move inclusion of asm/fixmap.h to where it is really used rather than where it may have been used long ago (requires a few other adjustments to includes due to previous implicit dependencies). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:04 +02:00
Ingo Molnar	3c43f03908	[PATCH] x86: default to physical mode on hotplug CPU kernels Default to physical mode on hotplug CPU kernels. Furher simplify and clean up the APIC initialization code. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: "Li, Shaohua" <shaohua.li@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org>	2007-05-02 19:27:04 +02:00
Ingo Molnar	424df39010	[PATCH] x86-64: remove clustered APIC mode Remove now unused clustered APIC mode code. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: "Li, Shaohua" <shaohua.li@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org>	2007-05-02 19:27:04 +02:00
Ingo Molnar	07c7c47444	[PATCH] x86-64: always use physical delivery mode on > 8 CPUs Remove clustered APIC mode. There's little point in the use of clustered APIC mode, broadcasting is limited to within the cluster only, and chipsets have bugs in this area as well. So default to physical APIC mode when the CPU count is large, and default to logical APIC mode when the CPU count is 8 or smaller. (this patch only removes the use of genapic_cluster and cleans up the resulting genapic.c file - removal of all remaining traces of clustered mode will be done by another patch.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: "Li, Shaohua" <shaohua.li@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org>	2007-05-02 19:27:04 +02:00
Ingo Molnar	f18d397e6a	[PATCH] x86-64: optimize & fix APIC mode setup Fix a couple of inconsistencies/problems I found while reviewing the x86_64 genapic code (when I was chasing mysterious eth0 timeouts that would only trigger if CPU_HOTPLUG is enabled): - AMD systems defaulted to the slower flat-physical mode instead of the flat-logical mode. The only restriction on AMD systems is that they should not use clustered APIC mode. - removed the CPU hotplug hacks, switching the default for small systems back from phys-flat to logical-flat. The switching to logical flat mode on small systems fixed sporadic ethernet driver timeouts i was getting on a dual-core Athlon64 system: NETDEV WATCHDOG: eth0: transmit timed out eth0: Transmit timeout, status 0c 0005 c07f media 80. eth0: Tx queue start entry 32 dirty entry 28. eth0: Tx descriptor 0 is 0008a04a. (queue head) eth0: Tx descriptor 1 is 0008a04a. eth0: Tx descriptor 2 is 0008a04a. eth0: Tx descriptor 3 is 0008a04a. eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1 - The use of '<= 8' was a bug by itself (the valid APIC ids for logical flat mode go from 0 to 7, not 0 to 8). The new logic is to use logical flat mode on both AMD and Intel systems, and to only switch to physical mode when logical mode cannot be used. If CPU hotplug is racy wrt. APIC shutdown then CPU hotplug needs fixing, not the whole IRQ system be made inconsistent and slowed down. - minor cleanups: simplified some code constructs build & booted on a couple of AMD and Intel SMP systems. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: "Li, Shaohua" <shaohua.li@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@osdl.org>	2007-05-02 19:27:04 +02:00
Andrew Morton	a86f34b49f	[PATCH] x86: revert x86_64-mm-fix-the-irqbalance-quirk-for-e7320-e7520-e7525 Obsoleted by Ingo's genapic stuff. Cc: Ingo Molnar <mingo@elte.hu> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: "Li, Shaohua" <shaohua.li@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:04 +02:00
Andrew Morton	3dc68d9b58	[PATCH] x86-64: revert x86_64-mm-add-genapic_force This is obsoleted by new Ingo genapic patches. Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: "Li, Shaohua" <shaohua.li@intel.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:04 +02:00
Andi Kleen	2ca8c1a126	[PATCH] x86-64: Update defconfig Signed-off-by: Andi Kleen <ak@suse.de>	2007-05-02 19:27:04 +02:00
Jeff Garzik	8cdfb29c0c	libata/IDE: remove combined mode quirk Both old-IDE and libata should be able handle all controllers and devices found using normal resource reservation methods. This eliminates the awful, low-performing split-driver configuration where old-IDE drove the PATA portion of a PCI device, in PIO-only mode, and libata drove the SATA portion of the /same/ PCI device, in DMA mode. Typically vendors would ship SATA hard drive / PATA optical configuration, which would lend itself to slow (PIO-only) CD-ROM performance. For Intel users running in combined mode, it is now wholly dependent on your driver choice (potentially link order, if you compile both drivers in) whether old-IDE or libata will drive your hardware. In either case, you will get full performance from both SATA and PATA ports now, without having to pass a kernel command line parameter. Signed-off-by: Jeff Garzik <jeff@garzik.org>	2007-04-28 14:15:59 -04:00
Mike Frysinger	9101be532a	[CPUFREQ] cleanup kconfig options Adds proper lines to help output of kconfig so people can find the module names. Also fixed some broken leading spaces versus tabs. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Dave Jones <davej@redhat.com>	2007-04-26 14:32:03 -04:00
Andi Kleen	90767bd13f	[PATCH] x86-64: Always flush all pages in change_page_attr change_page_attr on x86-64 only flushed the TLB for pages that got reverted. That's not correct: it has to be flushed in all cases. This bug was added in some earlier changes. Just flush all pages for now. This could be done more efficiently, but for this late in the release this seem to be the best fix. Pointed out by Jan Beulich Signed-off-by: Andi Kleen <ak@suse.de>	2007-04-24 13:05:37 +02:00
Joachim Deguara	cf6387daf8	[PATCH] x86-64: make GART PTEs uncacheable This patches fixes the silent data corruption problems being seen using the GART iommu where 4kB of data where incorrect (seen mostly on Nvidia CK804 systems). This fix, to mark the memory regin the GART PTEs reside on as uncacheable, also brings the code in line with the AGP specification. Signed-off-by: Joachim Deguara <joachim.deguara@amd.com> Signed-off-by: Andi Kleen <ak@suse.de>	2007-04-24 13:05:36 +02:00
Linus Torvalds	80d74d5123	Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: [BRIDGE]: Unaligned access when comparing ethernet addresses [SCTP]: Unmap v4mapped addresses during SCTP_BINDX_REM_ADDR operation. [SCTP]: Fix assertion (!atomic_read(&sk->sk_rmem_alloc)) failed message [NET]: Set a separate lockdep class for neighbour table's proxy_queue [NET]: Fix UDP checksum issue in net poll mode. [KEY]: Fix conversion between IPSEC_MODE_xxx and XFRM_MODE_xxx. [NET]: Get rid of alloc_skb_from_cache	2007-04-17 16:51:32 -07:00
Linus Torvalds	71bfa15142	Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6 * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: [PATCH] x86: Fix potential overflow in perfctr reservation [PATCH] x86: Fix gcc 4.2 _proxy_pda workaround	2007-04-17 16:44:05 -07:00
Herbert Xu	b4dfa0b1fb	[NET]: Get rid of alloc_skb_from_cache Since this was added originally for Xen, and Xen has recently (~2.6.18) stopped using this function, we can safely get rid of it. Good timing too since this function has started to bit rot. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2007-04-17 13:13:16 -07:00
Badari Pulavarty	6f29e35e2d	cache_k8_northbridges() overflows beyond allocation cache_k8_northbridges() is storing config values to incorrect locations (in flush_words) and also its overflowing beyond the allocation, causing slab verification failures. Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-04-16 18:09:18 -07:00
Andi Kleen	1714f9bfc9	[PATCH] x86: Fix potential overflow in perfctr reservation While reviewing this code again I found a potential overflow of the bitmap. The p4 oprofile can theoretically set bits beyond the reservation bitmap for specific configurations. Avoid that by sizing the bitmaps properly. Signed-off-by: Andi Kleen <ak@suse.de>	2007-04-16 10:30:27 +02:00
Andi Kleen	08269c6d38	[PATCH] x86: Fix gcc 4.2 _proxy_pda workaround Due to an over aggressive optimizer gcc 4.2 cannot optimize away _proxy_pda in all cases (counter intuitive, but true). This breaks loading of some modules. The earlier workaround to just export a dummy symbol didn't work unfortunately because the module code ignores exports with 0 value. Make it 1 instead. Signed-off-by: Andi Kleen <ak@suse.de>	2007-04-16 10:30:27 +02:00
Ravikiran G Thirumalai	c9c57929d2	failsafe mechanism to HPET clock calibration Provide a failsafe mechanism to avoid kernel spinning forever at read_hpet_tsc during early kernel bootup. This failsafe mechanism was originally introduced in commit `2f7a2a79c3`, but looks like the hpet split from time.c lost it again. This reintroduces the failsafe mechanism Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org> Signed-off-by: Shai Fultheim <shai@scalex86.org> Cc: Jack Steiner <steiner@sgi.com> Cc: john stultz <johnstul@us.ibm.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-04-14 21:44:03 -07:00
Andrew Morton	c993c7355d	[PATCH] x86_64 early quirks: fix early_qrk[] section tag WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.text:nvidia_bugs from .data between 'early_qrk' (at offset 0x8428) and 'enable_cpu_hotplug' WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.text:via_bugs from .data between 'early_qrk' (at offset 0x8438) and 'enable_cpu_hotplug' WARNING: arch/x86_64/kernel/built-in.o - Section mismatch: reference to .init.text:ati_bugs from .data between 'early_qrk' (at offset 0x8448) and 'enable_cpu_hotplug' The compiler is putting it into .data because the __initdata is in the wrong place. Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-04-08 19:47:55 -07:00
Zwane Mwaikambo	a369a7100d	[PATCH] x86: Don't probe for DDC on VBE1.2 VBE1.2 doesn't support function 15h (DDC) resulting in a 'hang' whilst uncompressing kernel with some video cards. Make sure we check VBE version before fiddling around with DDC. http://bugzilla.kernel.org/show_bug.cgi?id=1458 Opened: 2003-10-30 09:12 Last update: 2007-02-13 22:03 Much thanks to Tobias Hain for help in testing and investigating the bug. Tested on; i386, Chips & Technologies 65548 VESA VBE 1.2 CONFIG_VIDEO_SELECT=Y CONFIG_FIRMWARE_EDID=Y Untested on x86_64. Signed-off-by: Zwane Mwaikambo <zwane@infradead.org> Signed-off-by: Andi Kleen <ak@suse.de>	2007-04-02 12:14:12 +02:00
Andi Kleen	0fb2ebfcb5	[PATCH] x86-64: Increase NMI watchdog probing timeout A 4 core Opteron needs longer than 10 ticks for this. Signed-off-by: Andi Kleen <ak@suse.de>	2007-04-02 12:14:12 +02:00
Andi Kleen	89e07569e4	[PATCH] x86-64: Let oprofile reserve MSR on all CPUs The MSR reservation is per CPU and oprofile would only allocate them on the CPU it was initialized on. Change this to handle all CPUs. This also fixes a warning about unprotected use of smp_processor_id() in preemptible kernels. Signed-off-by: Andi Kleen <ak@suse.de>	2007-04-02 12:14:12 +02:00
Yinghai Lu	c97beb4710	[PATCH] x86_64 irq: Fix comments after changing IRQ0_VECTOR from 0x20 to 0x30 Signed-off-by: Yinghai Lu <yinghai.lu@amd.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-29 08:16:23 -07:00
Rafael J. Wysocki	436ce71638	[PATCH] Revert "swsusp: disable nonboot CPUs before entering platform suspend" This reverts commit `94985134b7` and insteads removes the WARN_ON() that caused that commit in the first place. The problem is that we call disable_nonboot_cpus() in swsusp before powering down the system in order to avoid triggering the WARN_ON() in arch/x86_64/kernel/acpi/sleep.c:init_low_mapping() and this doesn't work well on Thomas' system. So instead, remove the WARN_ON() in arch/x86_64/kernel/acpi/sleep.c: init_low_mapping(), which triggers every time during the suspend to disk in the platform mode, as the potential problem it is related to doesn't seem to occur in practice. [ I think we might want to disallow the case of multiple users of that mm, or something. Normally, playing with the current process page tables on the current CPU should be fine as long as we don't have other threads using those tables at the same time.. Anyway, not pretty, but better than the warning or the lockup - Linus ] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-27 09:20:03 -07:00
Thomas Gleixner	f33bc55c47	[PATCH] x86_64: avoid sending LOCAL_TIMER_VECTOR IPI to itself Ray Lee reported, that on an UP kernel with "noapic" command line option set, the box locks hard during boot. Adding some debug printks revealed, that the last action on the box before stalling was "Send IPI" - a debug printk which was put into smp_send_timer_broadcast_ipi(). It seems that send_IPI_mask(mask, LOCAL_TIMER_VECTOR) fails when "noapic" is set on the command line on an UP kernel. Aside of that it does not make much sense to trigger an interrupt instead of calling the function directly on the CPU which gets the PIT/HPET interrupt in case of broadcasting. Reported-by: Ray Lee <ray-lk@madrabbit.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ray Lee <ray-lk@madrabbit.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-23 20:21:45 -07:00
Linus Torvalds	2e7c28382b	x86-64: add "local_apic_timer_c2_ok" here too Needed for any architecture that claims ARCH_APICTIMER_STOPS_ON_C3, not just i386. I'm hoping Thomas will clean this up a bit later.. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-23 11:32:31 -07:00
Mathieu Desnoyers	303cd1535f	[PATCH] Fix atomicity of TIF update in flush_thread() for x86_64 Fix atomicity of TIF update in flush_thread() for x86_64 Race : parent process executing : sys_ptrace() (lock_kernel()) (ptrace_get_task_struct(pid)) arch_ptrace() ptrace_detach() ptrace_disable(child); clear_singlestep(child); clear_tsk_thread_flag(child, TIF_SINGLESTEP); (which clears the TIF_SINGLESTEP flag atomically from a different process) (put_task_struct(child)) (unlock_kernel()) And at the same time, in the child process : sys_execve() do_execve() search_binary_handler() load_elf_binary() flush_old_exec() flush_thread() doing a non-atomic thread flag update Signed-off-by: Rebecca Schultz <rschultz@google.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-03-18 11:35:08 -07:00

1 2 3 4 5 ...

1508 Commits