The LL/SC and IRQ versions were using generic stubs while the GRB version
was just reimplementing what it already had for the standard cmpxchg()
code. As we have optimized cmpxchg() implementations that are decoupled
from the atomic code, simply falling back on the generic wrapper does the
right thing. With this in place the GRB case is unaffected while the
LL/SC case gets to use its optimized cmpxchg().
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
We previously had 2 quicklists, one for the PGD case and one for PTEs.
Now that the PGD/PMD cases are handled through slab caches due to the
multi-level configurability, only the PTE quicklist remains. As such,
reduce NR_QUICK to its appropriate size and bump down the PTE quicklist
index.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
We also switched away from quicklists and instead moved to slab
caches. After benchmarking both implementations the difference is
negligible. The slab caches suit us better though because the size of a
pgd table is just 4 entries when we're using a 3-level page table layout
and quicklists always deal with pages.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
The previous expressions were wrong which made free_pmd_range() explode
when using anything other than 4KB pages (which is why 8KB and 64KB
pages were disabled with the 3-level page table layout).
The problem was that pmd_offset() was returning an index of non-zero
when it should have been returning 0. This non-zero offset was used to
calculate the address of the pmd table to free in free_pmd_range(),
which ended up trying to free an object that was not aligned on a page
boundary.
Now 3-level page tables should work with 4KB, 8KB and 64KB pages.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
As CPUs are migrated over to more fully-featured clock frameworks of
their own and off of the legacy CPG code, they no longer have any real
need for defining the PCLK value. The PCLK define in itself is already
fairly misleading, as many boards get their input clocks from different
sources, making this value fairly arbitrary anyways.
Outside of the legacy CPG clock framework, the only place where this
value is used is for deriving CLOCK_TICK_RATE, which we set back to the
legacy PIT value that it was before the PCLK definitions were added in
the first place.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* 'for-33' of git://repo.or.cz/linux-kbuild: (29 commits)
net: fix for utsrelease.h moving to generated
gen_init_cpio: fixed fwrite warning
kbuild: fix make clean after mismerge
kbuild: generate modules.builtin
genksyms: properly consider EXPORT_UNUSED_SYMBOL{,_GPL}()
score: add asm/asm-offsets.h wrapper
unifdef: update to upstream revision 1.190
kbuild: specify absolute paths for cscope
kbuild: create include/generated in silentoldconfig
scripts/package: deb-pkg: use fakeroot if available
scripts/package: add KBUILD_PKG_ROOTCMD variable
scripts/package: tar-pkg: use tar --owner=root
Kbuild: clean up marker
net: add net_tstamp.h to headers_install
kbuild: move utsrelease.h to include/generated
kbuild: move autoconf.h to include/generated
drop explicit include of autoconf.h
kbuild: move compile.h to include/generated
kbuild: drop include/asm
kbuild: do not check for include/asm-$ARCH
...
Fixed non-conflicting clean merge of modpost.c as per comments from
Stephen Rothwell (modpost.c had grown an include of linux/autoconf.h
that needed to be changed to generated/autoconf.h)
If using 64-bit PTEs and 4K pages then each page table has 512 entries
(as opposed to 1024 entries with 32-bit PTEs). Unlike MIPS, SH follows
the convention that all structures in the page table (pgd_t, pmd_t,
pgprot_t, etc) must be the same size. Therefore, 64-bit PTEs require
64-bit PGD entries, etc. Using 2-levels of page tables and 64-bit PTEs
it is only possible to map 1GB of virtual address space.
In order to map all 4GB of virtual address space we need to adopt a
3-level page table layout. This actually works out better for
CONFIG_SUPERH32 because we only waste 2 PGD entries on the P1 and P2
areas (which are untranslated) instead of 256.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Keep the dimensions of the page tables in a separate header file in
preparation for allowing a three level page table structure.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
MAX_DMA_CHANNELS is tested for the total number of channels in order to
populate an IRQ map. Stub this out completely when no DMA support is
enabled -- as used to be the default behaviour before this was
generalized for use by the dmaengine code.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (33 commits)
sh: Fix test of unsigned in se7722_irq_demux()
sh: mach-ecovec24: Add FSI sound support
sh: mach-ecovec24: Add mt9t112 camera support
sh: mach-ecovec24: Add tw9910 support
sh: MSIOF/mmc_spi platform data for the Ecovec24 board
sh: ms7724se: Add ak4642 support
sh: Fix up FPU build for SH5
sh: Remove old early serial console code V2
sh: sh5 scif pdata (sh5-101/sh5-103)
sh: sh4a scif pdata (sh7757/sh7763/sh7770/sh7780/sh7785/sh7786/x3)
sh: sh4a scif pdata (sh7343/sh7366/sh7722/sh7723/sh7724)
sh: sh4 scif pdata (sh7750/sh7760/sh4-202)
sh: sh3 scif pdata (sh7705/sh770x/sh7710/sh7720)
sh: sh2a scif pdata (sh7201/sh7203/sh7206/mxg)
sh: sh2 scif pdata (sh7616)
sh-sci: Extend sh-sci driver with early console V2
sh: Stub in P3 ioremap support for nommu parts.
sh: wire up vmallocinfo support in ioremap() implementations.
sh: Make the unaligned trap handler always obey notification levels.
sh: Couple kernel and user write page perm bits for CONFIG_X2TLB
...
Currently all architectures but microblaze unconditionally define
USE_ELF_CORE_DUMP. The microblaze omission seems like an error to me, so
let's kill this ifdef and make sure we are the same everywhere.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: <linux-arch@vger.kernel.org>
Cc: Michal Simek <michal.simek@petalogix.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Name space cleanup for rwlock functions. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
Not strictly necessary for -rt as -rt does not have non sleeping
rwlocks, but it's odd to not have a consistent naming convention.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
Name space cleanup. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
Further name space cleanup. No functional change
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
The raw_spin* namespace was taken by lockdep for the architecture
specific implementations. raw_spin_* would be the ideal name space for
the spinlocks which are not converted to sleeping locks in preempt-rt.
Linus suggested to convert the raw_ to arch_ locks and cleanup the
name space instead of using an artifical name like core_spin,
atomic_spin or whatever
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: linux-arch@vger.kernel.org
p3_ioremap() references __ioremap() which is presently undefined on
nommu. This provides a trivial stub to fix the build up.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This wires up the caller information for the ioremap VMA, which allows
for more helpful caller tracking via /proc/vmallocinfo. Follows the x86
and powerpc changes of the same nature.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
pte_write() should check whether the permissions include either the user
or kernel write permission bits. Likewise, pte_wrprotect() needs to
remove both the kernel and user write bits.
Without this patch handle_tlbmiss() doesn't handle faulting in pages
from the P3 area (our vmalloc space) because of a write. Mappings of the
P3 space have the _PAGE_EXT_KERN_WRITE bit but not _PAGE_EXT_USER_WRITE.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michal Marek <mmarek@suse.cz>
The simplest method was to add an extra asm-offsets.h
file in arch/$ARCH/include/asm that references the generated file.
We can now migrate the architectures one-by-one to reference
the generated file direct - and when done we can delete the
temporary arch/$ARCH/include/asm/asm-offsets.h file.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michal Marek <mmarek@suse.cz>
The stub already existed in the _64 syscall table, but was lacking a
__NR_recvmmsg definition, while it was absent entirely for _32 variants.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (137 commits)
sh: include empty zero page in romImage
sh: Make associative cache writes fatal on all SH-4A parts.
sh: Drop associative writes for SH-4 cache flushes.
sh: Partial revert of copy/clear_user_highpage() optimizations.
sh: Add default uImage rule for se7724, ap325rxa, and migor.
sh: allow runtime pm without suspend/resume callbacks
sh: mach-ecovec24: Remove un-defined settings for VPU
sh: mach-ecovec24: LCDC drive ability become high
sh: fix sh7724 VEU3F resource size
serial: sh-sci: Fix too early port disabling.
sh: pfc: pr_info() -> pr_debug() cleanups.
sh: pfc: Convert from ctrl_xxx() to __raw_xxx() I/O routines.
sh: Improve kfr2r09 serial port setup code
sh: Break out SuperH PFC code
sh: Move KEYSC header file
sh: convert /proc/cpu/aligmnent, /proc/cpu/kernel_alignment to seq_file
sh: Add CPG save/restore code for sh7724 R-standby
sh: Add SDHI power control support to Ecovec
mfd: Add power control platform data to SDHI driver
sh: mach-ecovec24: modify address map
...
This patch adds a ->start_transfer() callback to the
KFR2R09 lcd handling code. The callback is used to
notify the lcd controller that a new frame of data
is about to be transferred. The callback is only used
in combination with deferred io, but the code has
been tested both with and without deferred io enabled.
Without this patch the display data on the KFR2R09
lcd panel becomes corrupted over time.
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* 'for-2.6.33' of git://git.kernel.dk/linux-2.6-block: (113 commits)
cfq-iosched: Do not access cfqq after freeing it
block: include linux/err.h to use ERR_PTR
cfq-iosched: use call_rcu() instead of doing grace period stall on queue exit
blkio: Allow CFQ group IO scheduling even when CFQ is a module
blkio: Implement dynamic io controlling policy registration
blkio: Export some symbols from blkio as its user CFQ can be a module
block: Fix io_context leak after failure of clone with CLONE_IO
block: Fix io_context leak after clone with CLONE_IO
cfq-iosched: make nonrot check logic consistent
io controller: quick fix for blk-cgroup and modular CFQ
cfq-iosched: move IO controller declerations to a header file
cfq-iosched: fix compile problem with !CONFIG_CGROUP
blkio: Documentation
blkio: Wait on sync-noidle queue even if rq_noidle = 1
blkio: Implement group_isolation tunable
blkio: Determine async workload length based on total number of queues
blkio: Wait for cfq queue to get backlogged if group is empty
blkio: Propagate cgroup weight updation to cfq groups
blkio: Drop the reference to queue once the task changes cgroup
blkio: Provide some isolation between groups
...
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1815 commits)
mac80211: fix reorder buffer release
iwmc3200wifi: Enable wimax core through module parameter
iwmc3200wifi: Add wifi-wimax coexistence mode as a module parameter
iwmc3200wifi: Coex table command does not expect a response
iwmc3200wifi: Update wiwi priority table
iwlwifi: driver version track kernel version
iwlwifi: indicate uCode type when fail dump error/event log
iwl3945: remove duplicated event logging code
b43: fix two warnings
ipw2100: fix rebooting hang with driver loaded
cfg80211: indent regulatory messages with spaces
iwmc3200wifi: fix NULL pointer dereference in pmkid update
mac80211: Fix TX status reporting for injected data frames
ath9k: enable 2GHz band only if the device supports it
airo: Fix integer overflow warning
rt2x00: Fix padding bug on L2PAD devices.
WE: Fix set events not propagated
b43legacy: avoid PPC fault during resume
b43: avoid PPC fault during resume
tcp: fix a timewait refcnt race
...
Fix up conflicts due to sysctl cleanups (dead sysctl_check code and
CTL_UNNUMBERED removed) in
kernel/sysctl_check.c
net/ipv4/sysctl_net_ipv4.c
net/ipv6/addrconf.c
net/sctp/sysctl.c
"Definition" is misspelled "defintion" in several comments; this
patch fixes them. No code changes.
Signed-off-by: Adam Buchbinder <adam.buchbinder@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The setting of VPU need not be changed from default.
And current setting value is not defined on SH7724
Reported-by: Goda Yusuke <goda.yusuke@renesas.com>
Signed-off-by: Kuninori Morimoto <morimoto.kuninori@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This file breaks out the SuperH PFC code from
arch/sh/kernel/gpio.c + arch/sh/include/asm/gpio.h
to drivers/sh/pfc.c + include/linux/sh_pfc.h.
Similar to the INTC stuff. The non-SuperH specific
file location makes it possible to share the code
between multiple architectures.
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
This patch moves the KEYSC header file from the
SuperH specific asm directory to a place where
it can be shared by multiple architectures.
Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Mtdblock driver doesn't call flush_dcache_page for pages in request. So,
this causes problems on architectures where the icache doesn't fill from
the dcache or with dcache aliases. The patch fixes this.
The ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE symbol was introduced to avoid
pointless empty cache-thrashing loops on architectures for which
flush_dcache_page() is a no-op. Every architecture was provided with this
flush pages on architectires where ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE is
equal 1 or do nothing otherwise.
See "fix mtd_blkdevs problem with caches on some architectures" discussion
on LKML for more information.
Signed-off-by: Ilya Loginov <isloginov@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Peter Horton <phorton@bitbox.co.uk>
Cc: "Ed L. Cashin" <ecashin@coraid.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
A number of small optimisations to FPU handling, in particular:
- move the task USEDFPU flag from the thread_info flags field (which
is accessed asynchronously to the thread) to a new status field,
which is only accessed by the thread itself. This allows locking to
be removed in most cases, or can be reduced to a preempt_lock().
This mimics the i386 behaviour.
- move the modification of regs->sr and thread_info->status flags out
of save_fpu() to __unlazy_fpu(). This gives the compiler a better
chance to optimise things, as well as making save_fpu() symmetrical
with restore_fpu() and init_fpu().
- implement prepare_to_copy(), so that when creating a thread, we can
unlazy the FPU prior to copying the thread data structures.
Also make sure that the FPU is disabled while in the kernel, in
particular while booting, and for newly created kernel threads,
In a very artificial benchmark, the execution time for 2500000
context switches was reduced from 50 to 45 seconds.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
The previous implementation of clear_user_highpage and copy_user_highpage
checked to see if there was a D-cache aliasing issue between the user
and kernel mappings of a page, but if there was they always did a
flush with writeback on the dirtied kernel alias.
However as we now have the ability to map a page into kernel space
with the same cache colour as the user mapping, there is no need to
write back this data.
Currently we also invalidate the kernel alias as a precaution, however
I'm not sure if this is actually required.
Also correct the definition of FIX_CMAP_END so that the mappings created
by kmap_coherent() are actually at the correct colour.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
sh port of the sLeAZY-fpu feature currently implemented for some architectures
such us i386.
Right now the SH kernel has a 100% lazy fpu behaviour.
This is of course great for applications that have very sporadic or no FPU use.
However for very frequent FPU users... you take an extra trap every context
switch.
The patch below adds a simple heuristic to this code: after 5 consecutive
context switches of FPU use, the lazy behavior is disabled and the context
gets restored every context switch.
After 256 switches, this is reset and the 100% lazy behavior is returned.
Tests with LMbench showed no regression.
I saw a little improvement due to the prefetching (~2%).
The tests below also show that, with this sLeazy patch, indeed,
the number of FPU exceptions is reduced.
To test this. I hacked the lat_ctx LMBench to use the FPU a little more.
sLeasy implementation
===========================================
switch_to calls | 79326
sleasy calls | 42577
do_fpu_state_restore calls| 59232
restore_fpu calls | 59032
Exceptions: 0x800 (FPU disabled ): 16604
100% Leazy (default implementation)
===========================================
switch_to calls | 79690
do_fpu_state_restore calls | 53299
restore_fpu calls | 53101
Exceptions: 0x800 (FPU disabled ): 53273
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Stuart Menefy <stuart.menefy@st.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
sh64 doesn't use GENERIC_BUG, which presently causes the handle_BUG()
code to blow up. Fix up the dependencies and get it all building again.
Signed-off-by: Paul Mundt <lethal@linux-sh.org>