linux

Author	SHA1	Message	Date
Martin Schwidefsky	2f569afd9c	CONFIG_HIGHPTE vs. sub-page page tables. Background: I've implemented 1K/2K page tables for s390. These sub-page page tables are required to properly support the s390 virtualization instruction with KVM. The SIE instruction requires that the page tables have 256 page table entries (pte) followed by 256 page status table entries (pgste). The pgstes are only required if the process is using the SIE instruction. The pgstes are updated by the hardware and by the hypervisor for a number of reasons, one of them is dirty and reference bit tracking. To avoid wasting memory the standard pte table allocation should return 1K/2K (31/64 bit) and 2K/4K if the process is using SIE. Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means the s390 version for pte_alloc_one cannot return a pointer to a struct page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one cannot return a pointer to a pte either, since that would require more than 32 bit for the return value of pte_alloc_one (and the pte * would not be accessible since its not kmapped). Solution: The only solution I found to this dilemma is a new typedef: a pgtable_t. For s390 pgtable_t will be a (pte ) - to be introduced with a later patch. For everybody else it will be a (struct page ). The additional problem with the initialization of the ptl lock and the NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and a destructor pgtable_page_dtor. The page table allocation and free functions need to call these two whenever a page table page is allocated or freed. pmd_populate will get a pgtable_t instead of a struct page pointer. To get the pgtable_t back from a pmd entry that has been installed with pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page call in free_pte_range and apply_to_pte_range. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:42 -08:00
David Howells	922a70d327	aout: move STACK_TOP[_MAX] to asm/processor.h Move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're required whether or not A.OUT format is available. Signed-off-by: David Howells <dhowells@redhat.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:29 -08:00
Heiko Carstens	596f56018d	tty: s390 support for termios2. Backend for s390. Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:24 -08:00
Eric W. Biederman	df5f8314ca	proc: seqfile convert proc_pid_status to properly handle pid namespaces Currently we possibly lookup the pid in the wrong pid namespace. So seq_file convert proc_pid_status which ensures the proper pid namespaces is passed in. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: another build fix] [akpm@linux-foundation.org: s390 build fix] [akpm@linux-foundation.org: fix task_name() output] [akpm@linux-foundation.org: fix nommu build] Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Cc: Andrew Morgan <morgan@kernel.org> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Menage <menage@google.com> Cc: Paul Jackson <pj@sgi.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-08 09:22:24 -08:00
Mathieu Desnoyers	fe4130131e	Add cmpxchg_local to s390 Use the standard __cmpxchg for every type that can be updated atomically. Use the new generic cmpxchg_local (disables interrupt) for other types. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:32 -08:00
H. Peter Anvin	6e16d89bcd	Sanitize the type of struct user.u_ar0 struct user.u_ar0 is defined to contain a pointer offset on all architectures in which it is defined (all architectures which define an a.out format except SPARC.) However, it has a pointer type in the headers, which is pointless -- <asm/user.h> is not exported to userspace, and it just makes the code messy. Redefine the field as "unsigned long" (which is the same size as a pointer on all Linux architectures) and change the setting code to user offsetof() instead of hand-coded arithmetic. Cc: Linux Arch Mailing List <linux-arch@vger.kernel.org> Cc: Bryan Wu <bryan.wu@analog.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Lennert Buytenhek <kernel@wantstofly.org> Cc: Håvard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Tony Luck <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:30 -08:00
Kirill A. Shutemov	ed7b1889da	Unexport asm/page.h Do not export asm/page.h during make headers_install. This removes PAGE_SIZE from userspace headers. Signed-off-by: Kirill A. Shutemov <k.shutemov@gmail.com> Reviewed-by: David Woodhouse <dwmw2@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:30 -08:00
Kirill A. Shutemov	516c25a86f	Cleanup asm/{elf,page,user}.h: #ifdef __KERNEL__ is no longer needed asm/elf.h, asm/page.h and asm/user.h don't export to userspace now, so we can drop #ifdef __KERNEL__ for them. [k.shutemov@gmail.com: remove #ifdef __KERNEL_] Signed-off-by: Kirill A. Shutemov <k.shutemov@gmail.com> Reviewed-by: David Woodhouse <dwmw2@infradead.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Kirill A. Shutemov <k.shutemov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 08:42:30 -08:00
Michael Neuling	06b8e878a9	taskstats scaled time cleanup This moves the ability to scale cputime into generic code. This allows us to fix the issue in kernel/timer.c (noticed by Balbir) where we could only add an unscaled value to the scaled utime/stime. This adds a cputime_to_scaled function. As before, the POWERPC version does the scaling based on the last SPURR/PURR ratio calculated. The generic and s390 (only other arch to implement asm/cputime.h) versions are both NOPs. Also moves the SPURR and PURR snapshots closer. Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-06 10:41:00 -08:00
Linus Torvalds	39ce941ec1	Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] dcss: Initialize workqueue before using it. [S390] Remove BUILD_BUG_ON() in vmem code. [S390] sclp_tty/sclp_vt220: Fix scheduling while atomic [S390] dasd: fix panic caused by alias device offline [S390] dasd: add ifcc handling [S390] latencytop s390 support. [S390] Implement ext2_find_next_bit. [S390] Cleanup & optimize bitops. [S390] Define GENERIC_LOCKBREAK. [S390] console: allow vt220 console to be the only console [S390] Fix couple of section mismatches. [S390] Fix smp_call_function_mask semantics. [S390] Fix linker script. [S390] DEBUG_PAGEALLOC support for s390. [S390] cio: Add shutdown callback for ccwgroup. [S390] cio: Update documentation. [S390] cio: Clean up chsc response code handling. [S390] cio: make sense id procedure work with partial hardware response	2008-02-05 10:11:02 -08:00
Benjamin Herrenschmidt	5e5419734c	add mm argument to pte/pmd/pud/pgd_free (with Martin Schwidefsky <schwidefsky@de.ibm.com>) The pgd/pud/pmd/pte page table allocation functions get a mm_struct pointer as first argument. The free functions do not get the mm_struct argument. This is 1) asymmetrical and 2) to do mm related page table allocations the mm argument is needed on the free function as well. [kamalesh@linux.vnet.ibm.com: i386 fix] [akpm@linux-foundation.org: coding-syle fixes] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-05 09:44:18 -08:00
Dave Hansen	8245525741	maps4: rework TASK_SIZE macros The following replaces the earlier patches sent. It should address David Rientjes's comments, and has been compile tested on all the architectures that it touches, save for parisc. For the /proc/<pid>/pagemap code[1], we need to able to query how much virtual address space a particular task has. The trick is that we do it through /proc and can't use TASK_SIZE since it references "current" on some arches. The process opening the /proc file might be a 32-bit process opening a 64-bit process's pagemap file. x86_64 already has a TASK_SIZE_OF() macro: #define TASK_SIZE_OF(child) ((test_tsk_thread_flag(child, TIF_IA32)) ? IA32_PAGE_OFFSET : TASK_SIZE64) I'd like to have that for other architectures. So, add it for all the architectures that actually use "current" in their TASK_SIZE. For the others, just add a quick #define in sched.h to use plain old TASK_SIZE. 1. http://www.linuxworld.com/news/2007/042407-kernel.html - MIPS portion from Ralf Baechle <ralf@linux-mips.org> [akpm@linux-foundation.org: fix mips build] Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Matt Mackall <mpm@selenic.com> Acked-by: David Rientjes <rientjes@google.com> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-05 09:44:16 -08:00
Heiko Carstens	0189103c69	[S390] Remove BUILD_BUG_ON() in vmem code. Remove BUILD_BUG_ON() in vmem code since it causes build failures if the size of struct page increases. Instead calculate at compile time the address of the highest physical address that can be added to the 1:1 mapping. This supposed to fix a build failure with the page owner tracking leak detector patches as reported by akpm. page-owner-tracking-leak-detector-broken-on-s390.patch can be removed from -mm again when this is merged. Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:51:01 +01:00
Heiko Carstens	67fe9251bb	[S390] Implement ext2_find_next_bit. Fixes this compile error: fs/ext4/mballoc.c: In function 'ext4_mb_generate_buddy': fs/ext4/mballoc.c:954: error: implicit declaration of function 'generic_find_next_le_bit' Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:50:58 +01:00
Martin Schwidefsky	0abbf05cdd	[S390] Cleanup & optimize bitops. The bitops header is now a bit shorter and easier to understand since it uses less inline assembly. It requires some tricks to persuade the compiler to generate decent code. The ffz/ffs functions now use the _zb_findmap/_sb_findmap table as well. With this cleanup the new bitops for ext4 can be implemented with a few lines, instead of another large inline assembly. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:50:58 +01:00
Heiko Carstens	2485579bf5	[S390] DEBUG_PAGEALLOC support for s390. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:50:54 +01:00
Cornelia Huck	01bc8ad165	[S390] cio: Add shutdown callback for ccwgroup. This intendeds to make proper shutdown of qeth devices easier. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-05 16:50:54 +01:00
Marcin Ślusarz	8b3de0df4e	asm-*/compat.h: fix typo in comment Signed-off-by: Marcin Ślusarz <marcin.slusarz@gmail.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>	2008-02-03 16:32:51 +02:00
Laszlo Attila Toth	4a19ec5800	[NET]: Introducing socket mark socket option. A userspace program may wish to set the mark for each packets its send without using the netfilter MARK target. Changing the mark can be used for mark based routing without netfilter or for packet filtering. It requires CAP_NET_ADMIN capability. Signed-off-by: Laszlo Attila Toth <panther@balabit.hu> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2008-01-31 19:27:19 -08:00
travis@sgi.com	f034347470	s390: use generic percpu linux-2.6.git Change s390 percpu.h to use asm-generic/percpu.h Signed-off-by: Mike Travis <travis@sgi.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 23:27:58 +01:00
travis@sgi.com	6ddfee0e79	modules: fold percpu_modcopy into module.c percpu_modcopy() is defined multiple times in arch files. However, the only user is module.c. Put a static definition into module.c and remove the definitions from the arch files. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 23:27:58 +01:00
travis@sgi.com	5280e004fc	percpu: move arch XX_PER_CPU_XX definitions into linux/percpu.h - Special consideration for IA64: Add the ability to specify arch specific per cpu flags - remove .data.percpu attribute from DEFINE_PER_CPU for non-smp case. The arch definitions are all the same. So move them into linux/percpu.h. We cannot move DECLARE_PER_CPU since some include files just include asm/percpu.h to avoid include recursion problems. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:32:52 +01:00
travis@sgi.com	b32ef636a5	percpu: use a kconfig variable to signal arch specific percpu setup The use of the __GENERIC_PERCPU is a bit problematic since arches may want to run their own percpu setup while using the generic percpu definitions. Replace it through a kconfig variable. Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Mike Travis <travis@sgi.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-01-30 13:32:51 +01:00
Aneesh Kumar K.V	aa02ad67d9	ext4: Add ext4_find_next_bit() This function is used by the ext4 multi block allocator patches. Also add generic_find_next_le_bit Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2008-01-28 23:58:27 -05:00
Hisashi Hifumi	894cdde26b	[S390] do local_irq_restore while spinning in spin_lock_irqsave. In s390's spin_lock_irqsave, interrupts remain disabled while spinning. In other architectures like x86 and powerpc, interrupts are re-enabled while spinning if IRQ is not masked before spin_lock_irqsave is called. The following patch re-enables interrupts through local_irq_restore while spinning for a lock acquisition. This can improve system response. [heiko.carstens@de.ibm.com: removed saving of pc] Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:31 +01:00
Carsten Otte	dab5209cd8	[S390] add smp_call_function_mask This patch adds the s390 variant for smp_call_function_mask(). The implementation is pretty straight forward using the wrapper __smp_call_function_map() which already takes a cpumask_t argument. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:31 +01:00
Martin Schwidefsky	028fed8233	[S390] Unused field / extern declaration in processor.h Remove extern declaration of non-existent last_task_used_math and remove unused field error_code from the thread_struct. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:29 +01:00
Roland McGrath	0ac30be461	[S390] single-step cleanup Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:27 +01:00
Joe Perches	5800266a78	[S390] include/asm-s390/: Spelling fixes Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:25 +01:00
Michael Holzheu	48657d223d	[S390] Use diag308 subcodes 3 and 6 for reboot and dump when possible. This patch fixes a problem with the following scenario: 1. Linux booted from DASD "A" 2. Reboot from DASD "B" using "/sys/firmware/reipl/ccw/device" 3. Reboot DASD "B" Without this patch in step 3 on newer s390 systems under LPAR instead of DASD "B", DASD "A" will be booted. The reason is that in step 2 we use CCW reipl and in step 3 we use DIAG308 (subcode 3) reipl. DIAG308 does not notice the CCW reipl and still thinks that it has to reboot DASD "A". Before applying this fix, ensure to have MCF RJ9967101E or z9 GA3 base driver installed. Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:24 +01:00
Michael Holzheu	99ca4e582d	[S390] kernel: Shutdown Actions Interface In case of a kernel panic it is currently possible to specify that a dump should be created, the system should be rebooted or stopped. Virtual sysfs files under the directory /sys/firmware/ are used for that configuration. In addition to that, there are kernel parameters 'vmhalt', 'vmpoff' and 'vmpanic', which can be used to specify z/VM commands, which are automatically executed in case of halt, power off or a kernel panic. This patch combines both functionalities and allows to specify the z/VM CP commands also via sysfs attributes. In addition to that, it enhances the existing handling of shutdown triggers (e.g. halt or panic) and associated shutdown actions (e.g. dump or reipl) and makes it more flexible. Signed-off-by: Michael Holzheu <holzheu@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:19 +01:00
Heiko Carstens	48483b3290	[S390] Get rid of additional_cpus kernel parameter. It caused only a lot of confusion. From now on cpu hotplug of up to NR_CPUS will work by default. If somebody wants to limit that then the possible_cpus parameter can be used. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:15 +01:00
Heiko Carstens	519580fc17	[S390] Use new style spinlock initializer in __RWSEM_INITIALIZER. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:15 +01:00
Heiko Carstens	3b4beb3175	[S390] Remove owner_pc member from raw_spinlock_t. Used to contain the address of the holder of the lock. But since the spinlock code is not inlined anymore all locks contain the same address anyway. And since in addtition nobody complained about that for ages its obviously unused. So remove it. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:14 +01:00
Christian Borntraeger	5fd9c6e214	[S390] Change vmalloc defintions Currently the vmalloc area starts at a dynamic address depending on the memory size. There was also an 8MB security hole after the physical memory to catch out-of-bounds accesses. We can simplify the code by putting the vmalloc area explicitely at the top of the kernel mapping and setting the vmalloc size to a fixed value of 128MB/128GB for 31bit/64bit systems. Part of the vmalloc area will be used for the vmem_map. This leaves an area of 96MB/1GB for normal vmalloc allocations. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:12 +01:00
Martin Schwidefsky	8ffd74a092	[S390] Avoid warnings in tlblush.h Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:11 +01:00
Martin Schwidefsky	6f457e1a14	[S390] Fix tlb flushing with idte. The clear-by-asce operation of the idte instruction gets an asce (address-space-control-element) as argument to specify which TLBs need to get flushed. The current code passes a plain pointer to the start of the pgd without the additional bits which would make the pointer an asce. The current machines don't mind the difference but a future model might want to use the designation type control bits in the asce as a filter for the TLBs to flush. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:10 +01:00
Heiko Carstens	08d0796827	[S390] Standby cpu activation/deactivation. Add a new interface so that cpus can be put into standby state and configured state. Only offline cpus can be put into standby state or configured state. For that the new percpu sysfs attribute "configure" must be used. To put a cpu in standby state a "0" must be written to the attribute. In order to switch it into configured state a "1" must be written to the attribute. Only cpus in configured state can be brought online. In addition this patch introduces a static mapping of physical to logical cpus. As a result only the sysfs directories of present cpus will be created. To scan for new cpus the new sysfs attribute "rescan" must be used. Writing to /sys/devices/system/cpu/rescan will trigger a rescan of cpus and will create directories for new cpus. On IPL only configured cpus will be used. And on reboot/shutdown all cpus will remain in their current state (configured/standby). Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:09 +01:00
Peter Oberparleiter	4e8e56c671	[S390] cio: Extend adapter interrupt interface. From: Cornelia Huck <cornelia.huck@de.ibm.com> Change the adapter interrupt interface in order to allow multiple adapter interrupt handlers to be registered. Indicators are now allocated by cio instead of the device driver. The qdio parts have been Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-01-26 14:11:00 +01:00
Martin Schwidefsky	0d01792300	[S390] pud_present/pmd_present bug. Git commit `3610cce87a` (yeah my own :-/) introduced a bug in regard to pud/pmd table entries. If the address of the page table refered to by a pud/pmd value happens to have zeroes in the lower 32 bits, pud_present and pmd_present return false. The obvious effect is that this triggers the BUG_ON in exit_mmap because some ptes will not get released on process end. Worse is that the next fault for memory covered by that pud/pmd will allocate another pmd/pte table and populate the pud/pmd entry. The old page table entries hanging below this entry are lost! The fix is simple, properly check against 0. The check is added for pud_none/pmd_none as well even if these two functions work because the invalid bit is in the lower 32 bits. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-12-17 16:25:56 +01:00
Heiko Carstens	411788ea7f	[S390] Fix irq tracing and lockdep_sys_exit calls. Current support for TRACE_IRQFLAGS and lockdep_sys_exit is broken. IRQ flag tracing is broken for program checks. Even worse is that the newly introduced calls to lockdep_sys_exit are in the critical section code which is not supposed to call any C functions. In addition the checks if locks are still held are also done when returning to kernel code which is broken as well. Fix all this by disabling interrupts and machine checks at the exit paths and then do the appropriate checks and calls. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-11-20 11:13:45 +01:00
Linus Torvalds	56d61a0e26	Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6 * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: [S390] 4level-fixup cleanup [S390] Cleanup page table definitions. [S390] Introduce follow_table in uaccess_pt.c [S390] Remove unused user_seg from thread structure. [S390] tlb flush fix. [S390] kernel: Fix dump on panic for DASDs under LPAR. [S390] struct class_device -> struct device conversion. [S390] cio: Fix incomplete commit for uevent suppression. [S390] cio: Use to_channelpath() for device to channel path conversion. [S390] Add per-cpu idle time / idle count sysfs attributes. [S390] Update default configuration.	2007-10-22 19:23:34 -07:00
Jens Axboe	d6ec084200	Add CONFIG_DEBUG_SG sg validation Add a Kconfig entry which will toggle some sanity checks on the sg entry and tables. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-22 21:20:03 +02:00
Jens Axboe	18dabf473e	Change table chaining layout Change the page member of the scatterlist structure to be an unsigned long, and encode more stuff in the lower bits: - Bits 0 and 1 zero: this is a normal sg entry. Next sg entry is located at sg + 1. - Bit 0 set: this is a chain entry, the next real entry is at ->page_link with the two low bits masked off. - Bit 1 set: this is the final entry in the sg entry. sg_next() will return NULL when passed such an entry. It's thus important that sg table users use the proper accessors to get and set the page member. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-22 21:20:01 +02:00
Martin Schwidefsky	190a1d722a	[S390] 4level-fixup cleanup Get independent from asm-generic/4level-fixup.h Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-22 12:52:49 +02:00
Martin Schwidefsky	3610cce87a	[S390] Cleanup page table definitions. - De-confuse the defines for the address-space-control-elements and the segment/region table entries. - Create out of line functions for page table allocation / freeing. - Simplify get_shadow_xxx functions. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-22 12:52:49 +02:00
Martin Schwidefsky	6f3fa3f0eb	[S390] Remove unused user_seg from thread structure. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-22 12:52:48 +02:00
Martin Schwidefsky	ba8a9229ab	[S390] tlb flush fix. The current tlb flushing code for page table entries violates the s390 architecture in a small detail. The relevant section from the principles of operation (SA22-7832-02 page 3-47): "A valid table entry must not be changed while it is attached to any CPU and may be used for translation by that CPU except to (1) invalidate the entry by using INVALIDATE PAGE TABLE ENTRY or INVALIDATE DAT TABLE ENTRY, (2) alter bits 56-63 of a page-table entry, or (3) make a change by means of a COMPARE AND SWAP AND PURGE instruction that purges the TLB." That means if one thread of a multithreaded applciation uses a vma while another thread does an unmap on it, the page table entries of that vma needs to get removed with IPTE, IDTE or CSP. In some strange and rare situations a cpu could check-stop (die) because a entry has been pushed out of the TLB that is still needed to complete a (milli-coded) instruction. I've never seen it happen with the current code on any of the supported machines, so right now this is a theoretical problem. But I want to fix it nevertheless, to avoid headaches in the futures. To get this implemented correctly without changing common code the primitives ptep_get_and_clear, ptep_get_and_clear_full and ptep_set_wrprotect need to use the IPTE instruction to invalidate the pte before the new pte value gets stored. If IPTE is always used for the three primitives three important operations will have a performace hit: fork, mprotect and exit_mmap. Time for some workarounds: * 1: ptep_get_and_clear_full is used in unmap_vmas to remove page tables entries in a batched tlb gather operation. If the mmu_gather context passed to unmap_vmas has been started with full_mm_flush==1 or if only one cpu is online or if the only user of a mm_struct is the current process then the fullmm indication in the mmu_gather context is set to one. All TLBs for mm_struct are flushed by the tlb_gather_mmu call. No new TLBs can be created while the unmap is in progress. In this case ptep_get_and_clear_full clears the ptes with a simple store. * 2: ptep_get_and_clear is used in change_protection to clear the ptes from the page tables before they are reentered with the new access flags. At the end of the update flush_tlb_range clears the remaining TLBs. In general the ptep_get_and_clear has to issue IPTE for each pte and flush_tlb_range is a nop. But if there is only one user of the mm_struct then ptep_get_and_clear uses simple stores to do the update and flush_tlb_range will flush the TLBs. * 3: Similar to 2, ptep_set_wrprotect is used in copy_page_range for a fork to make all ptes of a cow mapping read-only. At the end of of copy_page_range dup_mmap will flush the TLBs with a call to flush_tlb_mm. Check for mm->mm_users and if there is only one user avoid using IPTE in ptep_set_wrprotect and let flush_tlb_mm clear the TLBs. Overall for single threaded programs the tlb flush code now performs better, for multi threaded programs it is slightly worse. In particular exit_mmap() now does a single IDTE for the mm and then just frees every page cache reference and every page table page directly without a delay over the mmu_gather structure. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-22 12:52:48 +02:00
Heiko Carstens	fae8b22d3e	[S390] Add per-cpu idle time / idle count sysfs attributes. Add two new sysfs entries per cpu: idle_count and idle_time. idle_count contains the number of times a cpu went into idle state. idle_time contains the time a cpu spent in idle state in microseconds. This can be used e.g. by powertop to tell how often idle state is entered and left. # cat /sys/devices/system/cpu/cpu0/idle_count 504 # cat /sys/devices/system/cpu/cpu0/idle_time 469734037 us Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-22 12:52:47 +02:00
Jiri Slaby	0624517d80	forbid asm/bitops.h direct inclusion forbid asm/bitops.h direct inclusion Because of compile errors that may occur after bit changes if asm/bitops.h is included directly without e.g. linux/kernel.h which includes linux/bitops.h, forbid direct inclusion of asm/bitops.h. Thanks to Adrian Bunk. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Adrian Bunk <bunk@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:41 -07:00

1 2 3 4 5 ...

366 Commits