For NUMA optimization and some other algorithms it is useful to have a fast
to get the current CPU and node numbers in user space.
x86-64 added a fast way to do this in a vsyscall. This adds a generic
syscall for other architectures to make it a generic portable facility.
I expect some of them will also implement it as a faster vsyscall.
The cache is an optimization for the x86-64 vsyscall optimization. Since
what the syscall returns is an approximation anyways and user space
often wants very fast results it can be cached for some time. The norma
methods to get this information in user space are relatively slow
The vsyscall is in a better position to manage the cache because it has direct
access to a fast time stamp (jiffies). For the generic syscall optimization
it doesn't help much, but enforce a valid argument to keep programs
portable
I only added an i386 syscall entry for now. Other architectures can follow
as needed.
AK: Also added some cleanups from Andrew Morton
Signed-off-by: Andi Kleen <ak@suse.de>
This patch adds a vgetcpu vsyscall, which depending on the CPU RDTSCP
capability uses either the RDTSCP or CPUID to obtain a CPU and node
numbers and pass them to the program.
AK: Lots of changes over Vojtech's original code:
Better prototype for vgetcpu()
It's better to pass the cpu / node numbers as separate arguments
to avoid mistakes when going from SMP to NUMA.
Also add a fast time stamp based cache using a user supplied
argument to speed things more up.
Use fast method from Chuck Ebbert to retrieve node/cpu from
GDT limit instead of CPUID
Made sure RDTSCP init is always executed after node is known.
Drop printk
Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
This patch adds initalization of the RDTSCP auxilliary values to CPU numbers
to time.c. If RDTSCP is available, the MSRs are written with the respective
values. It can be later used to initalize per-cpu timekeeping variables.
AK: Some cleanups. Move externs into headers and fix CPU hotplug.
Signed-off-by: Vojtech Pavlik <vojtech@suse.cz>
Signed-off-by: Andi Kleen <ak@suse.de>
AK: This redoes the changes I temporarily reverted.
Intel now has support for Architectural Performance Monitoring Counters
( Refer to IA-32 Intel Architecture Software Developer's Manual
http://www.intel.com/design/pentium4/manuals/253669.htm ). This
feature is present starting from Intel Core Duo and Intel Core Solo processors.
What this means is, the performance monitoring counters and some performance
monitoring events are now defined in an architectural way (using cpuid).
And there will be no need to check for family/model etc for these architectural
events.
Below is the patch to use this performance counters in nmi watchdog driver.
Patch handles both i386 and x86-64 kernels.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
I've had good experiences with having this on by default on x86-64.
It turns nasty hangs into easier to debug oopses.
Enable the local APIC wdog by default for systems newer than 2004.
This comes from a strange compromise: according to arjan the reason
it was off by default was some old IBM systems that corrupted
registered when NMI happened in SMI. Can't remember more specific,
but >= 2004 should avoid these. It's probably overly broad
because most older systems should be ok (and the really old systems
won't be supported by the local apic watchdog anyways)
Signed-off-by: Andi Kleen <ak@suse.de>
After a crash we should wait for NMI IPI event and not for external NMI or
NMI watchdog tick.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
After a crash we should wait for NMI IPI event and not for external NMI or
NMI watchdog tick.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
This patch makes the following needlessly global functions static:
- nmi_int.c: profile_exceptions_notify()
- nmi_timer_int.c: profile_timer_exceptions_notify()
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andi Kleen <ak@suse.de>
When a unknown NMI happened the panic would claim a NMI watchdog timeout.
Also it would check the variable set by nmi_watchdog=panic and panic then.
Fix up the panic message to be generic
Unconditionally panic on unknown NMI when panic on unknown nmi is enabled.
Noticed by Jan Beulich
Cc: jbeulich@novell.com
Signed-off-by: Andi Kleen <ak@suse.de>
Making NMI suspend/resume work with SMP. We use CPU hotplug to offline
APs in SMP suspend/resume. Only BSP executes sysdev's .suspend/.resume
method. APs should follow CPU hotplug code path.
And:
+From: Don Zickus <dzickus@redhat.com>
Makes the start/stop paths of nmi watchdog more robust to handle the
suspend/resume cases more gracefully.
AK: I merged the two patches together
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Clean up some of the output messages on the nmi error paths to make more
sense when they are displayed. This is mainly a cosmetic fix and
shouldn't impact any normal code path.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
To quote Alan Cox:
The default Linux behaviour on an NMI of either memory or unknown is to
continue operation. For many environments such as scientific computing
it is preferable that the box is taken out and the error dealt with than
an uncorrected parity/ECC error get propogated.
A small number of systems do generate NMI's for bizarre random reasons
such as power management so the default is unchanged. In other respects
the new proc/sys entry works like the existing panic controls already in
that directory.
This is separate to the edac support - EDAC allows supported chipsets to
handle ECC errors well, this change allows unsupported cases to at least
panic rather than cause problems further down the line.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Adds a new /proc/sys/kernel/nmi_watchdog call that will enable/disable the
nmi watchdog.
By entering a non-zero value here, a user can enable the nmi watchdog to
monitor the online cpus in the system. By entering a zero value here, a
user can disable the nmi watchdog and free up a performance counter which
could then be utilized by the oprofile subsystem, otherwise oprofile may be
short a counter when in use.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Adds a new /proc/sys/kernel/nmi call that will enable/disable the nmi
watchdog.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Removes the un/set_nmi_callback and reserve/release_lapic_nmi functions as
they are no longer needed. The various subsystems are modified to register
with the die_notifier instead.
Also includes compile fixes by Andrew Morton.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
We need TIF_RESTORE_SIGMASK in order to support ppoll() and pselect()
system calls. This patch originally came from Andi, and was based
heavily on David Howells' implementation of same on i386. I fixed a typo
which was causing do_signal() to use the wrong signal mask.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
This patch cleans up the NMI interrupt path. Instead of being gated by if
the 'nmi callback' is set, the interrupt handler now calls everyone who is
registered on the die_chain and additionally checks the nmi watchdog,
reseting it if enabled. This allows more subsystems to hook into the NMI if
they need to (without being block by set_nmi_callback).
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
This patch includes the changes to make the nmi watchdog on i386 SMP aware.
A bunch of code was moved around to make it simpler to read. In addition,
it is now possible to determine if a particular NMI was the result of the
watchdog or not. This feature allows the kernel to filter out unknown NMIs
easier.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
This patch includes the changes to make the nmi watchdog on x86_64 SMP
aware. A bunch of code was moved around to make it simpler to read. In
addition, it is now possible to determine if a particular NMI was the result
of the watchdog or not. This feature allows the kernel to filter out
unknown NMIs easier.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Incorporates the new performance counter reservation system in oprofile.
Also cleans up a lot of the initialization code. The code original zero'd
out every register associated with performance counters regardless if those
registers were used or not. This causes issues with the nmi watchdog.
Now oprofile tries to reserve registers and gives up if it can't get them.
Cc: levon@movementarian.org
Cc: oprofile-list@lists.sf.net
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Adds basic infrastructure to allow subsystems to reserve performance
counters on the x86 chips. Only UP kernels are supported in this patch to
make reviewing easier. The SMP portion makes a lot more changes.
Think of this as a locking mechanism where each bit represents a different
counter. In addition, each subsystem should also reserve an appropriate
event selection register that will correspond to the performance counter it
will be using (this is mainly neccessary for the Pentium 4 chips as they
break the 1:1 relationship to performance counters).
This will help prevent subsystems like oprofile from interfering with the
nmi watchdog.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
There are some machines around (large xSeries or Unisys ES7000) that
need physical IO-APIC destination mode to access all of their IO
devices. This currently doesn't work in UP kernels as used in
distribution installers.
This patch allows to compile even UP kernels as GENERICARCH which
allows to use physical or clustered APIC mode.
Signed-off-by: Andi Kleen <ak@suse.de>
* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SOUND] sparc/amd7930: Use __devinit and __devinitdata as needed.
[SUNLANCE]: Mark sparc_lance_probe_one as __devinit.
[SPARC64]: Fix section-mismatch errors in solaris emul module.
If there is only 1 node in the system cpus should think they are apart of
some other node.
If cases where a real numa system boots the Flat numa option make sure the
cpus don't claim to be apart on a non-existent node.
Signed-off-by: Keith Mannthey <kmannth@us.ibm.com>
Cc: Andy Whitcroft <apw@shadowen.org>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Assume that a cpu is *physically* offlined at boot time...
Because smpboot.c::smp_boot_cpu_map() canoot find cpu's sapicid,
numa.c::build_cpu_to_node_map() cannot build cpu<->node map for
offlined cpu.
For such cpus, cpu_to_node map should be fixed at cpu-hot-add.
This mapping should be done before cpu onlining.
This patch also handles cpu hotremove case.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Problem description:
We have additional_cpus= option for allocating possible_cpus. But nid
for possible cpus are not fixed at boot time. cpus which is offlined at
boot or cpus which is not on SRAT is not tied to its node. This will
cause panic at cpu onlining.
Usually, pxm_to_nid() mapping is fixed at boot time by SRAT.
But, unfortunately, some system (my system!) do not include
full SRAT table for possible cpus. (Then, I use
additiona_cpus= option.)
For such possible cpus, pxm<->nid should be fixed at
hot-add. We now have acpi_map_pxm_to_node() which is also
used at boot. It's suitable here.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With CONFIG_PHYSICAL_START set to a non default values the i386
boot_ioremap code calculated its pte index wrong and users of boot_ioremap
have their areas incorrectly mapped (for me SRAT table not mapped during
early boot). This patch removes the addr < BOOT_PTE_PTRS constraint.
[ Keith says this is applicable to 2.6.16 and 2.6.17 as well ]
Signed-off-by: Keith Mannthey<kmannth@us.ibm.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: <stable@kernel.org>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
init_socksys() was marked __init but invoked from a
non-__init function.
Use the correct module_{init,exit}() faciltiies while we're
here and eliminate some seriously bogus ifdefs.
Signed-off-by: David S. Miller <davem@davemloft.net>
Unfortunately, sparc64 doesn't have an easy way to do a "64 X 64 -->
128" bit multiply like PowerPC and IA64 do. We were doing a
"64 X 64 --> 64" bit multiple which causes overflow very quickly with
a 30-bit quotient shift.
So use a quotientshift count of 10 instead of 30, just like x86 and
ARM do.
This also fixes the wrapping of printk timestamp values every ~17
seconds.
Signed-off-by: David S. Miller <davem@davemloft.net>
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] sw_any_bug_dmi_table can be used on resume, so it isn't initdata
[CPUFREQ] Fix some more CPU hotplug locking.
[CPUFREQ] Workaround for BIOS bug in software coordination of frequency
[CPUFREQ] Longhaul - Add voltage scaling to driver
[CPUFREQ] Fix sparse warning in ondemand
[CPUFREQ] make drivers/cpufreq/cpufreq_ondemand.c:powersave_bias_target() static
[CPUFREQ] Longhaul - Add ignore_latency option
[CPUFREQ] Longhaul - Disable arbiter
[CPUFREQ][2/2] ondemand: updated add powersave_bias tunable
[CPUFREQ][1/2] ondemand: updated tune for hardware coordination
[CPUFREQ] Fix typo.
iommu_init() and iounit_init() are never called for sun4, but that's not
enough - these calls should be ifdefed out since the functions in question
simply do not exist for CONFIG_SUN4 kernel.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Back when pci_dev had base_address[], loop of form
base = &...->base_address[0];
for (.....) {
...
*base++ = addr;
}
was fine, but when that array got spread in ->resource[...].start
replacing the initialization with
base = &...->resource[0].start;
was not a sufficient modification. IOW this code got broken for cases
when there had been more than one resource to fill. All way back in
2.3.41-pre3...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
sw_any_bug_dmi_table can be used on resume, so it isn't initdata.
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Dave Jones <davej@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (64 commits)
[BLOCK] dm-crypt: trivial comment improvements
[CRYPTO] api: Deprecate crypto_digest_* and crypto_alg_available
[CRYPTO] padlock: Convert padlock-sha to use crypto_hash
[CRYPTO] users: Use crypto_comp and crypto_has_*
[CRYPTO] api: Add crypto_comp and crypto_has_*
[CRYPTO] users: Use crypto_hash interface instead of crypto_digest
[SCSI] iscsi: Use crypto_hash interface instead of crypto_digest
[CRYPTO] digest: Remove old HMAC implementation
[CRYPTO] doc: Update documentation for hash and me
[SCTP]: Use HMAC template and hash interface
[IPSEC]: Use HMAC template and hash interface
[CRYPTO] tcrypt: Use HMAC template and hash interface
[CRYPTO] hmac: Add crypto template implementation
[CRYPTO] digest: Added user API for new hash type
[CRYPTO] api: Mark parts of cipher interface as deprecated
[PATCH] scatterlist: Add const to sg_set_buf/sg_init_one pointer argument
[CRYPTO] drivers: Remove obsolete block cipher operations
[CRYPTO] users: Use block ciphers where applicable
[SUNRPC] GSS: Use block ciphers where applicable
[IPSEC] ESP: Use block ciphers where applicable
...
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (44 commits)
[S390] hypfs crashes with invalid mount option.
[S390] cio: subchannel evaluation function operates without lock
[S390] cio: always query all paths on path verification.
[S390] cio: update path groups on logical CHPID changes.
[S390] cio: subchannels in no-path state.
[S390] Replace nopav-message on VM.
[S390] set modalias for ccw bus uevents.
[S390] Get rid of DBG macro.
[S390] Use alternative user-copy operations for new hardware.
[S390] Make user-copy operations run-time configurable.
[S390] Cleanup in signal handling code.
[S390] Cleanup in page table related code.
[S390] Linux API for writing z/VM APPLDATA Monitor records.
[S390] xpram off by one error.
[S390] Remove kexec experimental flag.
[S390] cleanup appldata.
[S390] fix typo in vmcp.
[S390] Kernel stack overflow handling.
[S390] qdio slsb processing state.
[S390] Missing initialization in common i/o layer.
...
On detection of an EEH error, some Power4 systems seem to occasionally
want to be reset twice before they report themselves as fully recovered.
This patch re-arranges the code to attempt additional resets if the first
one doesn't take.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch causes fsl_soc.h to import the definition of phys_addr_t
itself, rather than relying on its includer to do so.
Signed-off-by: Scott Wood <scott@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Noticed that the U3_*CFA macros have some typos:
#define U3_HT_CFA0(devfn, off) \
((((unsigned long)devfn) << 8) | offset)
(refers to offset rather than off)
#define U3_AGP_CFA0(devfn, off) \
((1 << (unsigned long)PCI_SLOT(dev_fn)) \
| (((unsigned long)PCI_FUNC(dev_fn)) << 8) \
(refers to dev_fn rather than devfn)
Things happen to work, but there doesn't seem to be any reason these
shouldn't be functions. Overall behavior should be unchanged.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
When there is a PCI-X mode 2 capable device behind the HT<->PCI-X
bridge, the pci core decides that the device has the extended 4K
config space, even though the bus is not operating in mode 2. This is
because the u3_ht pci ops silently accept offsets greater than 255 but
use only the 8 least significant bits, which means reading at offset
0x100 gets the data at offset 0x0, and causes confusion for lspci.
Reject accesses to configuration space offsets greater than 255.
Signed-off-by: Nathan Lynch <ntl@pobox.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch fixes the assignment of pending registers to IRQ numbers for
the IPIC; the code previously assigned all IRQs to the high pending word
regardless of which word the interrupt belonged to.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch changes the io operations so that they are out of line if
CONFIG_PPC_ISERIES is set and includes a firmware feature check in
that case.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Update to the PowerPC PCI error recovery code.
Add code to enable MMIO if a device driver reports that it is capable
of recovering on its own. One anticipated use of this having a device
driver enable MMIO so that it can take a register dump, which might
then be followed by the device driver requesting a full reset.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add wrapper around the rtas call to enable MMIO or DMA on a frozen pci
slot.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Clean up subroutine documentation; mostly formatting changes, with
some new content.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This corrects a pci_dev get/put imbalance that can occur only in
highly unlikely situations (kmalloc failures, pci devices with
overlapping resource addresses). No actual failures seen, this was
spotted during code review.
Signed-off-by: Linas Vepstas <linas@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The stack frame address was being printed incorrectly in the backtrace
option of XMON on PPC. This patch fixes it to print the actual stack
address instead of the address of the local variable that contains it.
Signed-off-by: Josh Boyer <jdub@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Jakub noticed the cputable.c entry for Xilinx Virtex-4 FX was missing
a .platform value, so the AT_PLATFORM value wouldn't be set correctly.
This adds it.
Signed-off-by: Peter Bergner <bergner@vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Cleanup for user headers, as noted:
asm-sh/page.h requires asm-generic/memory_model.h, which does not exist in exported headers
asm-sh/ptrace.h requires asm/ubc.h, which does not exist in exported headers
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
This patch removes obsolete block operations of the simple cipher type
from drivers. These were preserved so that existing users can make a
smooth transition. Now that the transition is complete, they are no
longer needed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds block cipher algorithms for S390. Once all users of the
old cipher type have been converted the existing CBC/ECB non-block cipher
operations will be removed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Accelerated versions of crypto algorithms must carry a distinct driver name
and priority in order to distinguish themselves from their generic counter-
part.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the tfm is passed directly to setkey instead of the ctx, we no
longer need to pass the &tfm->crt_flags pointer.
This patch also gets rid of a few unnecessary checks on the key length
for ciphers as the cipher layer guarantees that the key length is within
the bounds specified by the algorithm.
Rather than testing dia_setkey every time, this patch does it only once
during crypto_alloc_tfm. The redundant check from crypto_digest_setkey
is also removed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When an invalid mount option is specified, no root inode is created
for hypfs, hypfs_fill_super() returns with -EINVAL and then
hypfs_kill_super() is called. hypfs_kill_super() does not check if
the root inode has been initialized. This patch adds this check.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This introduces new user-copy operations which are optimized for
copying more than 256 Bytes on new hardware.
Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduces a struct uaccess_ops which allows setting user-copy
operations at run-time.
Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Changed and simplified some page table related #defines and code.
Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This patch delivers a new Linux API in the form of a misc char
device that is useable from user space and allows write access
to the z/VM APPLDATA Monitor Records collected by the *MONITOR
System Service of z/VM.
Signed-off-by: Melissa Howland <melissah@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduce asm header that contains the appldata data structures and
the diag inline assembly.
Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Substract the size of the initial stack frame from the correct
register. Otherwise we will end up in a program check loop.
Fix the offset into the save area as well.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Convert GET_IPL_DEVICE assembler macro to C function.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix compile warning with some configurations:
arch/s390/mm/cmm.c:58: warning: 'cmm_strtoul' defined but not used
Originally cmm_strtoul was introduced because simple_strtoul couldn't
handle strings with hexadecimal numbers that contained a capital 'X'.
Since this is no longer true cmm_strtoul can be removed.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
If do_signal() gets called several times before returning to user space
and no signal is pending (e.g. cancelled by a debugger) syscall restart
handling could be done several times. This would change the user space
PSW to an address prior to the syscall instruction.
Fix this by making sure that syscall restart handling is only done once.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
It is now possible to specify a ccw/fcp dump device which is used to
automatically create a system dump in case of a kernel panic. The dump
device can be configured under /sys/firmware/dump.
In addition it is now possible to specify a ccw/fcp device which is used
for the next reboot of Linux. The reipl device can be configured under
/sys/firmware/reipl.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Correct some comments in the hypervisor filesystem.
Signed-off-by: Michael Holzheu <holzheu@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Move initrd if the bitmap of the bootmem allocator would overwrite it.
In addition this patch sets the default size and address of the initrd to 0.
Therefore all boot loaders must set the initrd size and address correctly.
This is especially relevant for ftp boot via HMC/SE, where this change
requires a special patch file entry in the .ins file which sets these two
values contained at address 0x10408 and 0x10410.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Michael Grundy <grundym@us.ibm.com>
Signed-off-by: David Wilder <dwilder@us.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This abstracts the operations used in the bootwrapper, and defines
the operations needed for the bootwrapper to run on an OF platform.
The operations have been divided up into platform ops (platform_ops),
firmware ops (fw_ops), device tree ops (dt_ops), and console ops
(console_ops).
The proper operations will be hooked up at runtime to provide the
functionality that you need.
Signed-off-by: Mark A. Greer <mgreer@mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
There are various places where we want to extract an unsigned long
value from a device-tree property that can be 1 or 2 cells in length.
This replaces some open-coded calculations, and one place where we
assumed without checking that properties were the length we wanted,
with a little of_read_ulong() helper.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This produces essentially the same code and will make the iSeries i/o
consolidation easier.
The count parameter is changed to long since that will produce the same
(better) code on 32 and 64 bit builds.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
The io accessors insw_ns, outsw_ns, insl_ns and outsl_ns are unused
(except for one unnecessary use in drivers/net/3c509.c that is addressed
in a previous patch) and are only defined in powerpc/ppc, so remove them.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
This reverts commits 11012d419c and
40dd2d20f2, which allowed us to use the
MMIO accesses for PCI config cycles even without the area being marked
reserved in the e820 memory tables.
Those changes were needed for EFI-environment Intel macs, but broke some
newer Intel 965 boards, so for now it's better to revert to our old
2.6.17 behaviour and at least avoid introducing any new breakage.
Andi Kleen has a set of patches that work with both EFI and the broken
Intel 965 boards, which will be applied once they get wider testing.
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
(And reset it on new thread creation)
It turns out that eflags is important to save and restore not just
because of iopl, but due to the magic bits like the NT bit, which we
don't want leaking between different threads.
Tested-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 3815/1: headers_install support for ARM
[ARM] 3794/1: S3C24XX: do not defined set_irq_wake when no CONFIG_PM
[ARM] 3793/1: S3C2412: fix wrong serial info struct
[ARM] 3780/1: Fix iop321 cpuid
[ARM] 3786/1: pnx4008: update defconfig
[ARM] 3785/1: S3C2412: Fix idle code as default uses wrong clocks
[ARM] 3784/1: S3C2413: fix config for MACH_S3C2413/MACH_SMDK2413
This patch corrects the buffer length checking in the
sys_getdomainname() implementation for sparc/sparc64.
Signed-off-by: Andy Walker <andy@puszczka.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Ben Dooks
Do not define set_irq_wake as a real function if
the CONFIG_PM option is not set.
Fixes bug reported by Thomas Gleixner.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Dan Williams
commit a6a38a6622 changed the iop321 id to a value that does not work with all platforms. Change the mask to permit bit 11. Tested on an iq80321 600Mhz CRB.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Fix the idle code on the s3c2412 as the default
code is using bits in the CLKCON register that are
no-longer there.
Provide an override for the idle code, and ensure
that the power configuration is set to allow idle
instead of stop or sleep.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
These two machines are identical, and supported
by the SMDK2413 configuration. When MACH_SMDK2413
is selected, we must also select MACH_S3C2413
to allow machine_is_smdk2413() or machine_is_s3c2413()
to work.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This fixes a compile error that only surfaces on CONFIG_SMP=n builds;
<asm/hvcall.h> seems to get pulled in through another header file for
SMP builds. This problem was introduced by the hvcall stats patch.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
It seems that the occasional data corruption observed with the tg3
driver wasn't due to missing barriers after all, but rather seems to
be due to the DART (= IOMMU) in the U4 northbridge reading stale
IOMMU table entries from memory due to a race. This fixes it by
making the CPU read the entry back from memory before using it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This changes the writeX family of functions to have a sync instruction
before the MMIO store rather than after, because the generally expected
behaviour is that the device receiving the MMIO store can be guaranteed
to see the effects of any preceding writes to normal memory.
To preserve ordering between writeX and readX, and to preserve ordering
between preceding stores and the readX, the readX family of functions
have had an sync added before the load.
Although writeX followed by spin_unlock is not officially guaranteed
to keep the writeX inside the spin-locked region unless an mmiowb()
is used, there are currently drivers that depend on the previous
behaviour on powerpc, which was that the mmiowb wasn't actually required.
Therefore we have a per-cpu flag that is set by writeX, cleared by
__raw_spin_lock and mmiowb, and tested by __raw_spin_unlock. If it is
set, __raw_spin_unlock does a sync and clears it.
This changes both 32-bit and 64-bit readX/writeX. 32-bit already has a
sync in __raw_spin_unlock (since lwsync doesn't exist on 32-bit), and thus
doesn't need the per-cpu flag.
Tested on G5 (PPC970) and POWER5.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Export copy_4K_page() for use by modules via copy_page() (such as
CacheFiles).
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
xmon does not print a backtrace per default. This is bad on systems with
USB keyboard, the most needed info about the crash is lost.
print a backtrace during the very first xmon entry.
Booting with xmon=nobt disables the autobacktrace functionality.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Fix the following whitespace warnings when compiling with ARCH=ppc
arch/ppc/Kconfig:1207:warning: leading whitespace ignored
arch/ppc/Kconfig:1226:warning: leading whitespace ignored
arch/ppc/Kconfig:1231:warning: leading whitespace ignored
Also fix a typo ("Supprt").
Signed-off-by: Josh Boyer <jdub@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add instrumentation for hypervisor calls on pseries. Call statistics
include number of calls, wall time and cpu cycles (if available) and
are made available via debugfs. Instrumentation code is behind the
HCALL_STATS config option and has no impact if not enabled.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Base patch for PA6T and PA6T-1682M. This introduces the
arch/powerpc/platform/pasemi directory, together with basic
implementations for various setup.
Much of this was based on other platform code, i.e. Maple, etc.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Reduce default cacheline size on 64-bit powerpc from 128 bytes to 64.
This is the architected minimum. In most cases we'll still end up using
cache line information from the device tree, but defaults are used during
early boot and doing a few dcbst/icbi's too many there won't do any harm.
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The following patch allows XMON to run on the 4xx platform. Tested on
Walnut, Ebony, and Nova (440GX based) eval boards. 440EP, 440SP, and
440SPE boards should work as well. Patch is against 2.6.18-rc6.
Signed-off-by: Josh Boyer <jdub@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Allow the pm_power_off function variable in PPC to work as an override.
This makes the function consistent with the other architectures and it
allows generic poweroff operations (like those provided in IPMI
systems) to work properly on PPC.
Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: Joseph Barnett <jbarnett@motorola.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
In an attempt to make it easier for a power5 optimized app to run on a
power4 or a 970 or random earlier machine, this provides emulation of
the popcntb instruction.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
As part of the new irq code pseries_kexec_cpu_down() was split into a
xics and mpic version. The vpa unregister logic is now only done in the
xics routine, and although that's ok in practice (we don't have SPLPAR
machines with mpic), I'd rather have the two concepts stay separate.
So move the vpa unregister into pseries_kexec_cpu_down(), which gets called
by both the xics and mpic routines. This also gives us an obvious place to
put any new kexec-down logic needed in future.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Call chip->eoi(irq) to clear any pending interrupt in case of kdump
shutdown sequence. chip->end(irq) does not serve this purpose.
Signed-off-by: Mohan Kumar M <mohan@in.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Update PReP defconfig, disable some drivers for hardware that is not
used on those systems; enable SL82C105 IDE driver for Powerstack.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'audit.b29' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current:
[PATCH] sparc64 audit syscall classes hookup
[PATCH] syscall class hookup for all normal targets
... that should do it for all targets; the only remaining issues are
mips (currently treated as non-biarch) and handling of other OS
emulations (OSF/SunOS/Solaris/???). The latter would need to be
assigned new AUDIT_ARCH_... ABI numbers anyway...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Take default arch/*/kernel/audit.c to lib/, have those with special
needs (== biarch) define AUDIT_ARCH in their Kconfig.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
sh64 wasn't providing a sensible pm_power_off(), add one,
and just wrap it to machine_power_off, which already does
the right thing.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
While we've been sorting out the toolchain fiasco, some of
the code has suffered a bit of bitrot. Building with GCC4
also brings up some more build warnings. Trivial fixes for
both issues.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The original sh64 toolchains required that we tune the ISA
level accordingly to not have head.S/entry.S blow up. With
current toolchains, this is no longer the case, and the
syntax magically changed as well, causing all current
toolchains to die a horrible death.
Incidentally, code generation in other parts of the kernel
is now significantly complex enough that none of the older
toolchains make it very far these days, so there's not
even any point in preserving legacy compatability via
as-option.
This fixes a long-standing issue, as noted here:
http://lkml.org/lkml/2005/1/5/223
Though at the time the current toolchains were too broken
to make adjusting the tuning worthwhile.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 3778/1: S3C24XX: remove changelogs from include/asm-arm/arch-s3c2410 [simtec]
[ARM] 3783/1: S3C2412: fix IRQ_EINT0 to IRQ_EINT3 handling
[ARM] 3779/1: S3C24XX: remove changelogs from include/asm-arm/arch-s3c2410 [left]
[ARM] 3777/1: S3C24XX: remove changelogs from include/asm-arm/arch-s3c2410 [regs-*.h]
[ARM] 3776/1: S3C24XX: remove changelogs from include/asm-arm/arch-s3c2410
[ARM] 3775/1: S3C24XX: do not add same sysdev_driver to two classes
[ARM] 3774/1: S3C24XX: SMDK2413 has two machine IDs
[ARM] 3773/1: Add the HWCAP_VFP bit for the ARM926 CPUs
[ARM] 3772/1: Fix compilation error in mach-ixp4xx/nslu2*
[ARM] 3767/1: S3C24XX: remove changelog comments from arch/arm/mach-s3c2410
[ARM] 3766/1: Fix typo in ARM _raw_read_trylock
Patch from Ben Dooks
The IRQ_EINT0 through IRQ_EINT3 handling has changed
on the S3C2412 from the previous SoCs in the range,
and thus we need to add code to handle this.
The changes come about due to these IRQs being
displayed in two different registers, and needing to
be acked and masked in both.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
The s3c244x-irq.c code makes the mistake of adding
the same drive to two different sys-classes. This
causes the class lists to become corrupted and the
suspend code to OOPS.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The sn_cpu_init() is required for cpu initialization on SN platforms.
Change __init to __cpuinit so that the function is not freed with init code/data.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The SN PROM uses the register stack in the slave loop. The contents
must be preserved for the OS to return to the slave loop via offlining
a cpu or for kexec. A 'flushrs" is needed to force the stack to be written
to memory prior to changing bspstore.
Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
The syscalls set/get_robust_list must not be wired up until
futex_atomic_cmpxchg_inatomic is implemented. Otherwise the kernel will
hang in handle_futex_death.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Fix a bug in sys_perfmonctl() whereby it was not correctly
decrementing the file descriptor reference count.
Signed-off-by: stephane eranian <eranian@hpl.hp.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
This prevents cross-region mappings on IA64 and SPARC which could lead
to system crash. They were correctly trapped for normal mmap() calls,
but not for the kernel internal calls generated by executable loading.
This code just moves the architecture-specific cross-region checks into
an arch-specific "arch_mmap_check()" macro, and defines that for the
architectures that needed it (ia64, sparc and sparc64).
Architectures that don't have any special requirements can just ignore
the new cross-region check, since the mmap() code will just notice on
its own when the macro isn't defined.
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
[ Cleaned up to not affect architectures that don't need it ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Ben Dooks
It turns out we have both SMDK2413 and S3C2413 for
the same board.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Catalin Marinas
The ARM926EJ-S CPU has the VFP coprocessor and therefore it should be shown
in the /proc/cpuinfo if CONFIG_VFP is enabled.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Martin Michlmayr
Include linux/irq.h in the nslu2 code in order to avoid the following
compiler error:
CC arch/arm/mach-ixp4xx/nslu2-power.o
arch/arm/mach-ixp4xx/nslu2-power.c: In function 'nslu2_power_init':
arch/arm/mach-ixp4xx/nslu2-power.c:53: warning: implicit declaration of function 'set_irq_type'
arch/arm/mach-ixp4xx/nslu2-power.c:53: error: 'IRQ_TYPE_LEVEL_LOW' undeclared (first use in this function)
arch/arm/mach-ixp4xx/nslu2-power.c:53: error: (Each undeclared identifier is reported only once
arch/arm/mach-ixp4xx/nslu2-power.c:53: error: for each function it appears in.)
arch/arm/mach-ixp4xx/nslu2-power.c:54: error: 'IRQ_TYPE_LEVEL_HIGH' undeclared (first use in this function)
make[5]: *** [arch/arm/mach-ixp4xx/nslu2-power.o] Error 1
Signed-off-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Ben Dooks
Remove the pointless changelog comments from
arch/arm/mach-s3c2410 files, as all this can
be found from the revision control system.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Use the generic time stuff for FRV.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some buggy BIOSes do a "software any" kind of coordination without telling
about it to OS. So, when OS sets frequency on one CPU on these platforms,
it will also impact all the other logical CPUs that are in the same power
domain. Attached patch is a workaround for those buggy BIOSes.
Patch should be a noop on the normal non-buggy platforms.
Applies over previously sent acpi-cpufreq and software coordination
bug fix patch
Signed-off-by: Denis Sadykov <denis.m.sadykov@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Alexey Starikovskiy <alexey.y.starikovskiy@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Rename option "dont_scale_voltage" to "scale_voltage" because
don't will be default.
Use "pos" for calculating voltage. In this way driver don't need
to know mV value or low level value. Simply min U is one pos and
max U is second pos. All pos between these two are used.
Assume that min U is for min f and max U for max f. For frequency
between min and max calculate pos based on difference between
current frequency and min f.
Values in mobile VRM table changed to values from
C3-M datasheet.
Signed-off-by: Rafa Bilski <rafalbilski@interia.pl>
Signed-off-by: Dave Jones <davej@redhat.com>
New sparse caught that typo which could have caused erratic hardware
behaviour on some machines if the platform functions are used by the
firmware to change bits in some FCR registers.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from George G. Davis
Resolve ARM1136 VIPT non-aliasing cache coherency issues observed when
using ptrace to set breakpoints and cleanup copy_{to,from}_user_page()
while we're here as requested by Russell King because "it's also far
too heavy on non-v6 CPUs".
NOTES:
1. Only access_process_vm() calls copy_{to,from}_user_page().
2. access_process_vm() calls get_user_pages() to pin down the "page".
3. get_user_pages() calls flush_dcache_page(page) which ensures cache
coherency between kernel and userspace mappings of "page". However
flush_dcache_page(page) may not invalidate I-Cache over this range
for all cases, specifically, I-Cache is not invalidated for the VIPT
non-aliasing case. So memory is consistent between kernel and user
space mappings of "page" but I-Cache may still be hot over this
range. IOW, we don't have to worry about flush_cache_page() before
memcpy().
4. Now, for the copy_to_user_page() case, after memcpy(), we must flush
the caches so memory is consistent with kernel cache entries and
invalidate the I-Cache if this mm region is executable. We don't
need to do anything after memcpy() for the copy_from_user_page()
case since kernel cache entries will be invalidated via the same
process above if we access "page" again. The flush_ptrace_access()
function (borrowed from SPARC64 implementation) is added to handle
cache flushing after memcpy() for the copy_to_user_page() case.
Signed-off-by: George G. Davis <gdavis@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The port to genirq & the new powerpc interrupt model in 2.6.18 introduced a
bug in the legacy PowerMac PIC code (used on older machines) because of a
typo potentially causing hangs due to interrupt storms. This fixes it,
along with a performance issue causing us to do spurrious retriggers after
masking an interrupt.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Apparently some systems export valid HPET addresses, but hpet_enable()
fails. Then when the HPET clocksource starts up, it only checks for a
valid HPET address, and the result is a system where time does not advance.
See http://bugme.osdl.org/show_bug.cgi?id=7062 for details.
This patch just makes sure we better check that the HPET is functional
before registering the HPET clocksource.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Ben Dooks
The type naming in the s3c24xx dma code is riddled with
typedefs creating _t types, from the code import from 2.4
which is contrary to the current Kernel coding style.
This patch cleans this up, removing the typedefs and
and fixing up the resultant code changes.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from David Brownell
This adds RTC support to the csb337 default config. Both the AT91
and the ds1307 RTCs are enabled (rtc0 and rtc1 respectively).
The ds1307 is used to initialize the system time, since it's battery-backed.
From then on the AT91 RTC is used, since it's more capable (with both
alarm and update irqs, and system wakeup capability) even though it
needs manual initialization (symlink /dev/rtc to /dev/rtc0 for older
versions of hwclock, then "hwclock --systohc") in an rc script or
from inittab.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix return value from memcpy
[POWERPC] iseries: Define insw et al. so libata/ide will compile
[POWERPC] Fix irq enable/disable in smp_generic_take_timebase
[POWERPC] Fix problem with time not advancing on 32-bit platforms
[POWERPC] Restore copyright notice in arch/powerpc/kernel/fpu.S
[POWERPC] Fix up ibm_architecture_vec definition
[POWERPC] Make OF irq map code detect more error cases
[POWERPC] Support for "weird" MPICs and fixup mpc7448_hpc2
[POWERPC] Fix MPIC sense codes in documentation
[POWERPC] Fix performance regression in IRQ radix tree locking
[POWERPC] Add mpc7448hpc2 device tree source file
[POWERPC] Add MPC8349E MDS device tree source file to arch/powerpc/boot/dts
[POWERPC] modify mpc83xx platforms to use new IRQ layer
[POWERPC] Adapt ipic driver to new host_ops interface, add set_irq_type to set IRQ sense
[POWERPC] back up old school ipic.[hc] to arch/ppc
[POWERPC] Use mpc8641hpcn PIC base address from dev tree.
[POWERPC] Allow MPC8641 HPCN to build with CONFIG_PCI disabled too.
[POWERPC] Fix powerpc 44x_mmu build
[POWERPC] Remove flush_dcache_all export
This fixes a hang on ppc32.
The problem was that I was comparing a 32-bit quantity with a 64-bit
quantity, and consequently time wasn't advancing. This makes us use a
64-bit quantity on all platforms, which ends up simplifying the code
since we can now get rid of the tb_last_stamp variable (which actually
fixes another bug that Ben H and I noticed while going carefully through
the code).
This works fine on my G4 tibook. Let me know how it goes on your
machines.
Acked-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As pointed out by Herbert Xu <herbert@gondor.apana.org.au>, our
memcpy implementation didn't return the destination pointer as its
return value, and there is code in the kernel that expects that.
This fixes it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
[IA64] Increase default nodes shift to 10, nr_cpus to 1024
[IA64] remove redundant local_irq_save() calls from sn_sal.h
[IA64] panic if topology_init kzalloc fails
[IA64-SGI] Silent data corruption caused by XPC V2.
It's possible to get an invalid page fault in kernel mode when we try to
write out segments from vsyscall32 when dumping core for a 32bit process if
the vsyscall32 DSO is not mapped in its address space (which can happen if,
for example, ulimit -v 100 is run).
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The values in init_tss.ist[] can change when an IST event occurs. Save
the original IST values for checking stack addresses when debugging or
doing stack traces.
Signed-off-by: Keith Owens <kaos@ocs.com.au>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There was a bogus hunk from the genirq merge that essentially
broke stack switching for hard interrupts. Remove it since it isn't
needed.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
As a replacement for the earlier removal of the e820 MCFG check
we blacklist the Intel SDV with the original BIOS bug that
motivated that check. On those machines don't use MMCONFIG.
This also adds a new pci=mmconf parameter to override the blacklist.
Cc: Greg KH <gregkh@suse.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Noticed by Jan Beulich.
When the kernel was moved from 1MB to 2MB in 2.6.17 the kernel reservation
code wasn't adjusted and it still reserved starting with 1MB. This means 1MB always
were lost.
This patch fixes this by reserving only starting with _text.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The unwinder fallback logic still had potential for falling through to
the legacy stack trace code without printing an indication (at once
serving as a separator) of this.
Further, the stack pointer retrieval for the fallback should be as
restrictive as possible (in order to avoid having the legacy stack
tracer try to access invalid memory). The patch tightens that, but
this could certainly be further improved.
Also making the call_trace command line option now conditional upon
CONFIG_STACK_UNWIND (as it's meaningless otherwise).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One open question: Should this added push perhaps be made conditional
upon CONFIG_STACK_UNWIND or CONFIG_UNWIND_INFO?
[AK: not needed, these are all very slow paths]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
One open question: Should these added pushes perhaps be made
conditional upon CONFIG_STACK_UNWIND or CONFIG_UNWIND_INFO?
[AK: Not needed -- these are all very slow paths]
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The check for the MCFG table being reserved in the e820 map was originally
added to detect a broken BIOS in a preproduction Intel SDV. However it also
breaks the Apple x86 Macs, which can't supply this properly, but need
a working MCFG. With this patch they wouldn't use the MCFG and not work.
After some discussion I think it's best to remove the heuristic again.
It also failed on some other boxes (although it didn't cause much
problems there because old style port access for PCI config space
still works as fallback), but the preproduction SDVs can just use
pci=nommcfg. Supporting production machines properly is more
important.
Edgar Hucek did all the debugging work.
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Edgar Hucek <hostmaster@ed-soft.at>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Daniel Jacobowitz
vfp_put_double didn't work in a CONFIG_AEABI kernel. By swapping
the arguments, we arrange for them to be in the same place regardless
of ABI. I made the same change to vfp_put_float for consistency.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The copy_in_user primitive does not work as advertised. If the source
and target area are available copy_in_user copies one byte too much.
If one of the memory areas is not available it does not copy as much
data as it can, but up to 257 bytes less.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Eran Ben-Avi <eranpublic@yahoo.com> pointed out that the arch/ppc version
of smp_generic_take_timebase disables interrupts on entry but exits without
restoring them. However, both it and the arch/powerpc version have another
problem, which is that they use local_irq_disable/enable rather than
local_irq_save/restore, and they are called with interrupts disabled.
This fixes both problems; it changes a return to a break in the arch/ppc
version, and changes both versions to use local_irq_save/restore.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes a problem introduced in 5db9fa9593.
The last_jiffy per-cpu variable is only 32 bits on 32-bit machines, but it
was being compared with a 64-bit quantity (tb_next_jiffy), which resulted in
time not advancing.
This fixes it by changing last_jiffy to be 64 bits on all platforms. With
this, we no longer need tb_last_stamp as a 32-bit version of tb_last_jiffy,
so this gets rid of tb_last_stamp and we just use tb_last_jiffy instead.
This also fixes a bug when the boot cpu is not online, because using
tb_last_stamp could have caused the wrong timebase origin value to be used
when calculating the time of day.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This code got moved from head.S but the copyright notice on head.S didn't
get transferred with it. Noticed by Cort Dougan <cort@fsmlabs.com>.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This problem was noticed by one of the Phyp firmware folks.
Our ibm,client-architecture-support call was failing.
This corrects the vector length parameters being passed in.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Device-tree bugs on js20 with some versions of SLOF were causing the
interrupt for IDE to not be parsed correctly and fail to boot. This
patch adds a bit more sanity checking to the parser to detect some of
those errors and fail instead of returning bogus information. The
powerpc PCI code can then trigger a fallback that works on those
machines.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds a new hardware information table for mpic. This enables
the mpic code to deal with mpic controllers with different register
layouts and hardware behaviours.
This introduces CONFIG_MPIC_WEIRD. For boards with non standard mpic
controllers, select CONFIG_MPIC_WEIRD and add its hardware information
in the mpic_infos[] array.
TSI108/109 PIC takes the first index of weird hardware information
table. :) The table can be extended. The Tsi108/109 PIC looks like
standard OpenPIC but, in fact, is different in register mapping and
behavior.
The patch does not affect the behavior of standard mpic. If
CONFIG_MPIC_WEIRD is not defined, the code is essentially identical to
the current code.
[benh@kernel.crashing.org:
This patch is a slightly cleaned up version of Zang Roy's support for
the TSI108 MPIC variant. It also fixes up MPC7448_hpc2 to use the new
version of the type macros and changes the way MPIC is selected in
Kconfig to better match what is done for other system devices.
]
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This problem was introduced by changeset
14778d9072
Unlike the hugetlb code paths, the normal fault code is not setup to
propagate PTE changes for large page sizes correctly like the ones we
make for I/O mappings in io_remap_pfn_range().
It is absolutely necessary to update all sub-ptes of a largepage
mapping on a fault. Adding special handling for this would add
considerably complexity to tlb_batch_add(). So let's just side-step
the issue and forcefully dirty any writable PTEs created by
io_remap_pfn_range().
The only other real option would be to disable to large PTE code of
io_remap_pfn_range() and we really don't want to do that.
Much thanks to Mikael Pettersson for tracking down this problem and
testing debug patches.
Signed-off-by: David S. Miller <davem@davemloft.net>
When reworking the powerpc irq code, I figured out that we were using
the radix tree in a racy way. As a temporary fix, I put a spinlock in
there. However, this can have a significant impact on performances. This
patch reworks that to use a smarter technique based on the fact that
what we need is in fact a rwlock with extremely rare writers (thus
optimized for the read path).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch adds the mpc7448hpc2 device tree source file.
Signed-off-by: Roy Zang <tie-fei.zang@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add MPC8349E MDS device tree source file to arch/powerpc/boot/dts
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes MPC834x MDS (formerly SYS) and ITX platform code to get IRQ data (including PCI) from the device tree, and to use the new IPIC code.
renamed defconfig (sys -> mds), left one redundant NULL assignment in mpc83xx_pcibios_fixup to keep the compiler happy.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This converts ipic code to Benh's IRQ mods. For the IPIC, IRQ sense values in the device tree equal those in include/linux/irq.h; that's 8 for low assertion (most internal IRQs on mpc83xx), and 2 for high-to-low change.
spinlocks added to [un]mask, ack operations; default handler and type now set in host_map; and redundant condition check eliminated.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Li Yang <leoli@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Keep from breaking 83xx arch/ppc build. Back up old school arch/powerpc/sysdev/ipic.[hc] to arch/ppc/syslib.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Change both the NODES_SHIFT and the NR_CPUS so that even big machines
can boot all nodes and processors with a generic kernel.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 3761/1: fix armv4t breakage after adding thumb interworking to userspace helpers
[ARM] Add Integrator support for glibc outb() and friends
[ARM] Move prototype for register_isa_ports to asm/io.h
[ARM] Arrange for isa.c to use named initialisers
[ARM] 3741/1: remove sa1111.c build warning on non-sa1100 systems
[ARM] 3760/1: This patch adds timeouts while working with SSP registers. Such timeouts were en
[ARM] 3758/1: Preserve signalling NaNs in conversion
[ARM] 3749/3: Correct VFP single/double conversion emulation
[ARM] 3748/3: Correct error check in vfp_raise_exceptions
Patch from Lennert Buytenhek
On armv4t systems, we have always compiled the kernel with -march=armv4
instead of -march=armv4t, which means that any use of bx will bomb out.
Commit ba9b5d7637 introduced the use of
bx in the kernel, which means we need to compile with -march=armv4t on
armv4t systems now.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Add the necessary call to register_isa_ports() so that glibc knows
where these are found on Integrator platforms.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When we select NUMA with i386, the system is only X86_NUMAQ or using ACPI.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ignore the return value of early_init_acpi(), as it can give false error
messages. If there is something really wrong, then register_driver will
fail cleanly with EINVAL later.
[ background: modprobe acpi-cpufreq on systems not capable of speed-scaling
started failing with 'invalid argument', where previously it would only
ever -ENODEV
I'm not 100% happy with the solution. It'd be better to handle
failure properly, but this is a low-impact change for 2.6.18
We can always revisit doing this better in .19 --davej.]
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from David Brownell
Remove buld warning when building sa1111 on non-sa1100 platforms (e.g. PXA).
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Paul Sokolovsky
This patch adds timeouts while working with SSP registers. Such
timeouts were envisioned by docstrings in ssp.c, but were not
implemented. There were actual lockups while accessing
touchscreen for iPaqs h1910, h4000 due to lack of the timeouts.
This is updated version of previously submitted patch: 3738/1.
Signed-off-by: Paul Sokolovsky <pmiscml@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Daniel Jacobowitz
The fcvtds and fcvtsd instructions were generating a qnan bit pattern
for both quiet and signalling NaNs.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Daniel Jacobowitz
The fcvtsd/fcvtds emulation was left behind when the numbering of double
precision registers was changed from 0-30 to 0-15. Both conversion
instructions were writing their results to the wrong register. Also,
the conversion instructions should stop after the first element even
if a vector length is specified.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Patch from Daniel Jacobowitz
The recent fix to hide VFP_NAN_FLAG broke the check in vfp_raise_exceptions;
it would attempt to deliver an exception mask of 0xfffffeff instead of reporting
a serious error condition using printk. Define a safe constant to use for
an invalid exception maskm, and use it at both ends.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
It moves the smp_procesors_ready variable to sun4d_smp.c only.
Signed-off-by: Krzysztof Helt (krzysztof.h1@wp.pl)
Signed-off-by: David S. Miller <davem@davemloft.net>