In 2.6.17, there was a problem with cpu_notifiers and XFS. I provided a
band-aid solution to solve that problem. In the process, i undid all the
changes you both were making to ensure that these notifiers were available
only at init time (unless CONFIG_HOTPLUG_CPU is defined).
We deferred the real fix to 2.6.18. Here is a set of patches that fixes the
XFS problem cleanly and makes the cpu notifiers available only at init time
(unless CONFIG_HOTPLUG_CPU is defined).
If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run
time.
This patch reverts the notifier_call changes made in 2.6.17
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With Goto-san's patch, we can add new pgdat/node at runtime. I'm now
considering node-hot-add with cpu + memory on ACPI.
I found acpi container, which describes node, could evaluate cpu before
memory. This means cpu-hot-add occurs before memory hot add.
In most part, cpu-hot-add doesn't depend on node hot add. But register_cpu(),
which creates symbolic link from node to cpu, requires that node should be
onlined before register_cpu(). When a node is onlined, its pgdat should be
there.
This patch-set holds off creating symbolic link from node to cpu
until node is onlined.
This removes node arguments from register_cpu().
Now, register_cpu() requires 'struct node' as its argument. But the array of
struct node is now unified in driver/base/node.c now (By Goto's node hotplug
patch). We can get struct node in generic way. So, this argument is not
necessary now.
This patch also guarantees add cpu under node only when node is onlined. It
is necessary for node-hot-add vs. cpu-hot-add patch following this.
Moreover, register_cpu calculates cpu->node_id by cpu_to_node() without regard
to its 'struct node *root' argument. This patch removes it.
Also modify callers of register_cpu()/unregister_cpu, whose args are changed
by register-cpu-remove-node-struct patch.
[Brice.Goglin@ens-lyon.org: fix it]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When new node becomes enable by hot-add, new sysfs file must be created for
new node. So, if new node is enabled by add_memory(), register_one_node() is
called to create it. In addition, I386's arch_register_node() and a part of
register_nodes() of powerpc are consolidated to register_one_node() as a
generic_code().
This is tested by Tiger4(IPF) with node hot-plug emulation.
Signed-off-by: Keiichiro Tokunaga <tokuanga.keiich@jp.fujitsu.com>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
typo fixes
Clean up 'inline is not at beginning' warnings for usb storage
Storage class should be first
i386: Trivial typo fixes
ixj: make ixj_set_tone_off() static
spelling fixes
fix paniced->panicked typos
Spelling fixes for Documentation/atomic_ops.txt
move acknowledgment for Mark Adler to CREDITS
remove the bouncing email address of David Campbell
This fixes the clock source updates in update_wall_time() to correctly
track the time coming in via current_tick_length(). Optimize the fast
paths to be as short as possible to keep the overhead low.
Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Change the current_tick_length() function so it takes an argument which
specifies how much precision to return in shifted nanoseconds. This provides
a simple way to convert between NTPs internal nanoseconds shifted by
(SHIFT_SCALE - 10) to other shifted nanosecond units that are used by the
clocksource abstraction.
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In a testament to the utter simplicity and logic of the English
language ;-), I found a single correct use - in kernel/panic.c - and
10-15 incorrect ones.
Signed-Off-By: Lee Revell <rlrevell@joe-job.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
This patch contains a total rewrite of the backlight infrastructure for
portable Apple computers. Backward compatibility is retained. A sysfs
interface allows userland to control the brightness with more steps than
before. Userland is allowed to upload a brightness curve for different
monitors, similar to Mac OS X.
[akpm@osdl.org: add needed exports]
Signed-off-by: Michael Hanselmann <linux-kernel@hansmi.ch>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: "Antonino A. Daplas" <adaplas@pol.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove VM_LOCKED before remap_pfn range from device drivers and get rid of
VM_SHM.
remap_pfn_range() already sets VM_IO. There is no need to set VM_SHM since
it does nothing. VM_LOCKED is of no use since the remap_pfn_range does not
place pages on the LRU. The pages are therefore never subject to swap
anyways. Remove all the vm_flags settings before calling remap_pfn_range.
After removing all the vm_flag settings no use of VM_SHM is left. Drop it.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The hardirq_ctx and softirq_ctx variables are written to on init only,
Signed-off-by: Andreas Mohr <andi@lisas.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Support the ibm,extended-*-frequency properties found in recent POWER5
firmware:
cpus/PowerPC,POWER5@0/clock-frequency
59aa5880 (1504336000)
cpus/PowerPC,POWER5@0/ibm,extended-clock-frequency
00000000 59aa5880
cpus/PowerPC,POWER5@0/timebase-frequency
0b354b10 (188042000)
cpus/PowerPC,POWER5@0/ibm,extended-timebase-frequency
00000000 0b354b10
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Avoid duplication of the syscall table for the cell platform. Based on an
idea from David Woodhouse.
Signed-off-by: Andreas Schwab <schwab@suse.de>
Acked-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This is a first version of support for the Cell BE "Reliability,
Availability and Serviceability" features.
It doesn't yet handle some of the RAS interrupts (the ones described in
iic_is/iic_irr), I'm still working on a proper way to expose these. They
are essentially a cascaded controller by themselves (sic !) though I may
just handle them locally to the iic driver. I need also to sync with
David Erb on the way he hooked in the performance monitor interrupt.
So that's all for 2.6.17 and I'll do more work on that with my rework of
the powerpc interrupt layer that I'm hacking on at the moment.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Clear the high BATS during load_up_mmu if FTR_HAS_HIGH_BATS.
Allow just a bit more time for secondary CPUs to phone home.
Signed-off-by: Wei Zhang <Wei.Zhang@freescale.com>
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com>
Signed-off-by: Jon Loeliger <jdl@freescale.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Export both news RTAS delay functions, and change the scanlog module to
use the new delay functions.
Signed-off-by: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Currently the kernel blindly halts all the processors and calls the
ibm,suspend-me rtas call. If the firmware is not in the correct
state, we then re-start all the processors and return. It is much
smarter to first check the firmware state, and only if it is waiting,
call the ibm,suspend-me call.
Signed-off-by: Paul Mackerras <paulus@samba.org>
It seems that prom_init's early_cmdline_parse is broken on at least
Apple 970 xserves and IBM JS20 blades with SLOF. The firmware of these
machines returns -1 and 1 respectively when getprop is called for the
bootargs property of /chosen, causing Linux to ignore its builtin
command line in favor of a null string. This patch makes Linux use its
builtin command line if getprop returns an error or a null string.
Signed-off-by: Amos Waterland <apw@us.ibm.com>
Acked-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
In the syscall path we currently have:
crclr so
mfcr r9
If we shift the crclr up we can avoid a stall on some CPUs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For pseries IOMMU bypass I want to be able to fall back to the regular
IOMMU ops. Do this by creating a dma_mapping_ops struct, and convert
the others while at it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Allocate IOMMU tables local to the relevant node.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Mackerras <paulus@samba.org>
of_node_to_nid returns -1 if the associativity cannot be found. This
means pcibus_to_cpumask has to be careful not to pass a negative index into
node_to_cpumask.
Since pcibus_to_node could be used a lot, and of_node_to_nid is slow (it
walks a list doing strcmps), lets also cache the node in the
pci_controller struct.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Remove some stale POWER3/POWER4/970 on 32bit kernel support.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Forthcoming machines will extend the FPSCR to 64 bits. We already
had a 64-bit save area for the FPSCR, but we need to use a new form
of the mtfsf instruction. Fortunately this new form is decoded as
an ordinary mtfsf by existing 64-bit processors.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Instead of trying to make PPC64 MSI fit in a Intel-centric MSI layer, a
simple short-term solution is to hook the pci_{en/dis}able_msi() calls
and make a machdep call.
The rest of the MSI functions are superfluous for what is needed at this
time. Many of which can have machdep calls added as needed.
Ben and Michael Ellerman are looking into rewrite the MSI layer to be
more generic. However, in the meantime this works as a interim
solution.
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds support to recognize the PCIe device_type "pciex" and made
the portdrv buildable.
Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The push_end macro in arch/powerpc/kernel/pci_32.c uses integer
division and multiplication to achieve the effect of rounding a
resource end address up and then advancing it to the end of a
power-of-2 sized region. This changes it to an equivalent computation
that only needs an integer add and OR. This is partly based on an
earlier patch by Mel Gorman.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Some POWER5+ machines can do 64k hardware pages for normal memory but
not for cache-inhibited pages. This patch lets us use 64k hardware
pages for most user processes on such machines (assuming the kernel
has been configured with CONFIG_PPC_64K_PAGES=y). User processes
start out using 64k pages and get switched to 4k pages if they use any
non-cacheable mappings.
With this, we use 64k pages for the vmalloc region and 4k pages for
the imalloc region. If anything creates a non-cacheable mapping in
the vmalloc region, the vmalloc region will get switched to 4k pages.
I don't know of any driver other than the DRM that would do this,
though, and these machines don't have AGP.
When a region gets switched from 64k pages to 4k pages, we do not have
to clear out all the 64k HPTEs from the hash table immediately. We
use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page
was hashed in as a 64k page or a set of 4k pages. If hash_page is
trying to insert a 4k page for a Linux PTE and it sees that it has
already been inserted as a 64k page, it first invalidates the 64k HPTE
before inserting the 4k HPTE. The hash invalidation routines also use
the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a
set of 4k HPTEs to remove. With those two changes, we can tolerate a
mix of 4k and 64k HPTEs in the hash table, and they will all get
removed when the address space is torn down.
Signed-off-by: Paul Mackerras <paulus@samba.org>
The pgdir field in the paca was a leftover from the dynamic VSIDs
patch, and is not used in the current kernel code. This removes it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
In commit 8eb6c6e3b9, Christoph Hellwig
made iommu_alloc_coherent able to do node-local allocations, but
unfortunately got the order of the arguments to alloc_pages_node
wrong. This fixes it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This gives the ability to control whether alignment exceptions get
fixed up or reported to the process as a SIGBUS, using the existing
PR_SET_UNALIGN and PR_GET_UNALIGN prctls. We do not implement the
option of logging a message on alignment exceptions.
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds the PowerPC part of the code to allow processes to change
their endian mode via prctl.
This also extends the alignment exception handler to be able to fix up
alignment exceptions that occur in little-endian mode, both for
"PowerPC" little-endian and true little-endian.
We always enter signal handlers in big-endian mode -- the support for
little-endian mode does not amount to the creation of a little-endian
user/kernel ABI. If the signal handler returns, the endian mode is
restored to what it was when the signal was delivered.
We have two new kernel CPU feature bits, one for PPC little-endian and
one for true little-endian. Most of the classic 32-bit processors
support PPC little-endian, and this is reflected in the CPU feature
table. There are two corresponding feature bits reported to userland
in the AT_HWCAP aux vector entry.
This is based on an earlier patch by Anton Blanchard.
Signed-off-by: Paul Mackerras <paulus@samba.org>
When debugging early kernel crashes that happen after console_init() and
before a proper console driver takes over, we often have to go hack into
udbg.c to prevent it from unregistering so we can "see" what is
happening. This patch adds a kernel command line option "udbg-immortal"
instead to avoid having to modify the kernel.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
POWER6 moves some of the MMCRA bits and also requires some bits to be
cleared each PMU interrupt.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Make sure dma_alloc_coherent allocates memory from the local node. This
is important on Cell where we avoid going through the slow cpu
interconnect.
Note: I could only test this patch on Cell, it should be verified on
some pseries machine by those that have the hardware.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch attempts to handle RTAS "busy" return codes in a more simple
and consistent manner. Typical callers of RTAS shouldn't have to
manage wait times and delay calls.
This patch also changes the kernel to use msleep() rather than udelay()
when a runtime delay is necessary. This will avoid CPU soft lockups
for extended delay conditions.
Signed-off-by: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The 970MP cputable entry needs a num_pmcs entry for oprofile to work.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
My js20 appears to lack the ibm,#dma- properties, and boot fails with a
"Kernel panic - not syncing: iommu_init_table: Can't allocate 0 bytes"
message.
This adds a fallback to the "#address-cells" property in case the
"#ibm,dma-address-cells" property is missing. Tested on js20 and
power5 lpar.
Unless there is a more elegant solution... :-)
Signed-off-by: Will Schmidt <willschm@us.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch cleans up some locking & error handling in the ppc vdso and
moves the vdso base pointer from the thread struct to the mm context
where it more logically belongs. It brings the powerpc implementation
closer to Ingo's new x86 one and also adds an arch_vma_name() function
allowing to print [vsdo] in /proc/<pid>/maps if Ingo's x86 vdso patch is
also applied.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
I have tested PPC_PTRACE_GETREGS and PPC_PTRACE_SETREGS on umview.
I do not understand why historically these tags has been defined as
PPC_PTRACE_GETREGS and PPC_PTRACE_SETREGS instead of simply
PTRACE_[GS]ETREGS. The other "originality" is that the address must be
put into the "addr" field instead of the "data" field as stated in the
manual.
Signed-off-by: renzo davoli <renzo@cs.unibo.it>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The IBM Cell blade firmware might confuse the kernel to think it's a
pSeries machine. This fixes it for now. With a bit of luck, the firmware
will be updated to avoid that in the future but currently that patch is
needed.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The code in prom_init.c calling the firmware
ibm,client-architecture-support method on pSeries has a bug where it
fails to properly pass the instance handle of the firmware object when
trying to call a method. Result ranges from the call doing nothing to
the firmware crashing. (Found by Segher, thanks !)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This fixes a bug found by Dave Jones that means that it is possible
for userspace to provoke a machine check on 32-bit kernels. This
also fixes a couple of other places where I found similar problems
by inspection.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Due to a firmware device tree bug, RTC and NVRAM accesses (including
halt/reboot) on Maple have been broken since January, when an untested
build fix went in. This code patches the device tree in Linux.
Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
First we capture all the strings from dt.c statically by noting that gcc
puts them in a special section of their own. Idea from Michael Ellerman.
Then we move the flattened device tree to klimit.
Still to come, making the values blob grow as needed.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>