linux

Author	SHA1	Message	Date
Heiko Carstens	048cd4e51d	compat: fix compile breakage on s390 The new is_compat_task() define for the !COMPAT case in include/linux/compat.h conflicts with a similar define in arch/s390/include/asm/compat.h. This is the minimal patch which fixes the build issues. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-02-27 07:54:27 -08:00
Martin Schwidefsky	aa33c8cbba	[S390] cleanup trap handling Move the program interruption code and the translation exception identifier to the pt_regs structure as 'int_code' and 'int_parm_long' and make the first level interrupt handler in entry[64].S store the two values. That makes it possible to drop 'prot_addr' and 'trap_no' from the thread_struct and to reduce the number of arguments to a lot of functions. Finally un-inline do_trap. Overall this saves 5812 bytes in the .text section of the 64 bit kernel. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-12-27 11:27:12 +01:00
Carsten Otte	f32269a0d0	[S390] disable MACHINE_IS_VM check for pfault This patch disables the check for MACHINE_IS_VM when initializing the pfault infrastructure. The code checks for successful completion of diag 258 anyway, thus it's safe to try initialization on LPAR anyway. This is needed to use pfault on kvm Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-12-27 11:27:10 +01:00
Heiko Carstens	fa2fb2f4a5	[S390] pfault: ignore leftover completion interrupts Ignore completion interrupts if the initial interrupt hasn't been received and the addressed task is not running. This case can only happen if leftover (pending) completion interrupt gets delivered which wasn't removed with the PFAULT CANCEL operation during cpu hotplug. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-11-14 11:19:08 +01:00
Carsten Otte	499069e1a4	[S390] take mmap_sem when walking guest page table gmap_fault needs to walk the guest page table. However, parts of that may change if some other thread does munmap. In that case gmap_unmap_notifier will also unmap the corresponding parts from the guest page table. We need to take mmap_sem in order to serialize these operations. do_exception now calls __gmap_fault with mmap_sem held which does not get exported to modules. The exported function, which is called from KVM, now takes mmap_sem. Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-10-30 15:16:45 +01:00
Martin Schwidefsky	b50511e41a	[S390] cleanup psw related bits and pieces Split out addressing mode bits from PSW_BASE_BITS, rename PSW_BASE_BITS to PSW_MASK_BASE, get rid of psw_user32_bits, remove unused function enabled_wait(), introduce PSW_MASK_USER, and drop PSW_MASK_MERGE macros. Change psw_kernel_bits / psw_user_bits to contain only the bits that are always set in the respective mode. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-10-30 15:16:43 +01:00
Martin Schwidefsky	ccf45cafb0	[S390] addressing mode limits and psw address wrapping An instruction with an address right below the adress limit for the current addressing mode will wrap. The instruction restart logic in the protection fault handler and the signal code need to follow the wrapping rules to find the correct instruction address. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-10-30 15:16:43 +01:00
Martin Schwidefsky	e5992f2e6c	[S390] kvm guest address space mapping Add code that allows KVM to control the virtual memory layout that is seen by a guest. The guest address space uses a second page table that shares the last level pte-tables with the process page table. If a page is unmapped from the process page table it is automatically unmapped from the guest page table as well. The guest address space mapping starts out empty, KVM can map any individual 1MB segments from the process virtual memory to any 1MB aligned location in the guest virtual memory. If a target segment in the process virtual memory does not exist or is unmapped while a guest mapping exists the desired target address is stored as an invalid segment table entry in the guest page table. The population of the guest page table is fault driven. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-07-24 10:48:21 +02:00
Peter Zijlstra	a8b0ca17b8	perf: Remove the nmi parameter from the swevent and overflow interface The nmi parameter indicated if we could do wakeups from the current context, if not, we would set some state and self-IPI and let the resulting interrupt do the wakeup. For the various event classes: - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from the PMI-tail (ARM etc.) - tracepoint: nmi=0; since tracepoint could be from NMI context. - software: nmi=[0,1]; some, like the schedule thing cannot perform wakeups, and hence need 0. As one can see, there is very little nmi=1 usage, and the down-side of not using it is that on some platforms some software events can have a jiffy delay in wakeup (when arch_irq_work_raise isn't implemented). The up-side however is that we can remove the nmi parameter and save a bunch of conditionals in fast paths. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Will Deacon <will.deacon@arm.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: Anton Blanchard <anton@samba.org> Cc: Eric B Munson <emunson@mgebm.net> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jason Wessel <jason.wessel@windriver.com> Cc: Don Zickus <dzickus@redhat.com> Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-07-01 11:06:35 +02:00
Heiko Carstens	33ce614029	[S390] mm: add page fault retry handling s390 arch backend for `d065bd81` "mm: retry page fault when blocking on disk transfer". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2011-05-26 09:48:25 +02:00
Heiko Carstens	99583181cb	[S390] mm: handle kernel caused page fault oom situations If e.g. copy_from_user() generates a page fault and the kernel runs into an OOM situation the system might lock up. If the OOM killer sends a SIG_KILL to the current process it can't handle it since it is stuck in a copy_from_user() - page fault loop. Fix this by adding the same fix as other architectures have. E.g. the x86 variant f86268 "x86/mm: Handle mm_fault_error() in kernel space" Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2011-05-26 09:48:25 +02:00
Heiko Carstens	d7b250e2a2	[S390] irq: merge irq.c and s390_ext.c Merge irq.c and s390_ext.c into irq.c. That way all external interrupt related functions are together. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-26 09:48:24 +02:00
Heiko Carstens	df7997ab1c	[S390] irq: fix service signal external interrupt handling Interrupt sources like pfault, sclp, dasd_diag and virtio all use the service signal external interrupt subclass mask in control register 0 to enable and disable the corresponding interrupt. Because no reference counting is implemented each subsystem thinks it is the only user of subclass and sets and clears the bit like it wants. This leads to case that unloading the dasd diag module under z/VM causes both sclp and pfault interrupts to be masked. The result will be locked up system sooner or later. Fix this by introducing a new way to set (register) and clear (unregister) the service signal subclass mask bit in cr0. Also convert all drivers. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-26 09:48:24 +02:00
Heiko Carstens	902050bcde	[S390] pfault: always enable service signal interrupt Always enable the service signal subclass mask bit in cr0, if pfault is available. That way we use the normal cpu hotplug way to propagate the subclass mask bit in cr0 instead of open coding it. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-26 09:48:24 +02:00
Heiko Carstens	7dd8fe1f91	[S390] pfault: cleanup code Small code cleanup. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-23 10:24:30 +02:00
Heiko Carstens	f2db2e6cb3	[S390] pfault: cpu hotplug vs missing completion interrupts On cpu hot remove a PFAULT CANCEL command is sent to the hypervisor which in turn will cancel all outstanding pfault requests that have been issued on that cpu (the same happens with a SIGP cpu reset). The result is that we end up with uninterruptible processes where the interrupt that would wake up these processes never arrives. In order to solve this all processes which wait for a pfault completion interrupt get woken up after a cpu hot remove. The worst case that could happen is that they fault again and in turn need to wait again. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-23 10:24:29 +02:00
Martin Schwidefsky	043d07084b	[S390] Remove data execution protection The noexec support on s390 does not rely on a bit in the page table entry but utilizes the secondary space mode to distinguish between memory accesses for instructions vs. data. The noexec code relies on the assumption that the cpu will always use the secondary space page table for data accesses while it is running in the secondary space mode. Up to the z9-109 class machines this has been the case. Unfortunately this is not true anymore with z10 and later machines. The load-relative-long instructions lrl, lgrl and lgfrl access the memory operand using the same addressing-space mode that has been used to fetch the instruction. This breaks the noexec mode for all user space binaries compiled with march=z10 or later. The only option is to remove the current noexec support. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-05-23 10:24:28 +02:00
Heiko Carstens	a985183285	[S390] irqstats: fix counting of pfault, dasd diag and virtio irqs pfault, dasd diag and virtio all use the same external interrupt number. The respective interrupt handlers decide by the subcode if they are meant to handle the interrupt. Counting is currently done before looking at the subcode which means each handler counts an interrupt even if it is not handling it. Fix this by moving the kstat code after the code which looks at the subcode. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-04-29 10:42:25 +02:00
Heiko Carstens	e35c76cd47	[S390] pfault: fix token handling `f6649a7e` "[S390] cleanup lowcore access from external interrupts" changed handling of external interrupts. Instead of letting the external interrupt handlers accessing the per cpu lowcore the entry code of the kernel reads already all fields that are necessary and passes them to the handlers. The pfault interrupt handler was incorrectly converted. It tries to dereference a value which used to be a pointer to a lowcore field. After the conversion however it is not anymore the pointer to the field but its content. So instead of a dereference only a cast is needed to get the task pointer that caused the pfault. Fixes a NULL pointer dereference and a subsequent kernel crash: Unable to handle kernel pointer dereference at virtual kernel address (null) Oops: 0004 [#1] SMP Modules linked in: nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc loop qeth_l3 qeth vmur ccwgroup ext3 jbd mbcache dm_mod dasd_eckd_mod dasd_diag_mod dasd_mod CPU: 0 Not tainted 2.6.38-2-s390x #1 Process cron (pid: 1106, task: 000000001f962f78, ksp: 000000001fa0f9d0) Krnl PSW : 0404200180000000 000000000002c03e (pfault_interrupt+0xa2/0x138) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:0 CC:2 PM:0 EA:3 Krnl GPRS: 0000000000000000 0000000000000001 0000000000000000 0000000000000001 000000001f962f78 0000000000518968 0000000090000002 000000001ff03280 0000000000000000 000000000064f000 000000001f962f78 0000000000002603 0000000006002603 0000000000000000 000000001ff7fe68 000000001ff7fe48 Krnl Code: 000000000002c036: 5820d010 l %r2,16(%r13) 000000000002c03a: 1832 lr %r3,%r2 000000000002c03c: 1a31 ar %r3,%r1 >000000000002c03e: ba23d010 cs %r2,%r3,16(%r13) 000000000002c042: a744fffc brc 4,2c03a 000000000002c046: a7290002 lghi %r2,2 000000000002c04a: e320d0000024 stg %r2,0(%r13) 000000000002c050: 07f0 bcr 15,%r0 Call Trace: ([<000000001f962f78>] 0x1f962f78) [<000000000001acda>] do_extint+0xf6/0x138 [<000000000039b6ca>] ext_no_vtime+0x30/0x34 [<000000007d706e04>] 0x7d706e04 Last Breaking-Event-Address: [<0000000000000000>] 0x0 For stable maintainers: the first kernel which contains this bug is 2.6.37. Reported-by: Stephen Powell <zlinuxman@wowway.com> Cc: Jonathan Nieder <jrnieder@gmail.com> Cc: stable@kernel.org Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-04-20 10:15:44 +02:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Martin Schwidefsky	5e9a26928f	[S390] ptrace cleanup Overhaul program event recording and the code dealing with the ptrace user space interface. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-05 12:47:31 +01:00
Heiko Carstens	fb0a9d7e86	[S390] pfault: delay register of pfault interrupt Use an early init call to initialize pfault. That way it is possible to use the register_external_interrupt() instead of the early variant. No need to enable pfault any earlier since it has only effect if user space processes are running. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-05 12:47:26 +01:00
Heiko Carstens	052ff461c8	[S390] irq: have detailed statistics for interrupt types Up to now /proc/interrupts only has statistics for external and i/o interrupts but doesn't split up them any further. This patch adds a line for every single interrupt source so that it is possible to easier tell what the machine is/was doing. Part of the output now looks like this; CPU0 CPU2 CPU4 EXT: 3898 4232 2305 I/O: 782 315 245 CLK: 1029 1964 727 [EXT] Clock Comparator IPI: 2868 2267 1577 [EXT] Signal Processor TMR: 0 0 0 [EXT] CPU Timer TAL: 0 0 0 [EXT] Timing Alert PFL: 0 0 0 [EXT] Pseudo Page Fault [...] NMI: 0 1 1 [NMI] Machine Checks Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2011-01-05 12:47:25 +01:00
Martin Schwidefsky	14375bc4eb	[S390] cleanup facility list handling Store the facility list once at system startup with stfl/stfle and reuse the result for all facility tests. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:21 +02:00
Martin Schwidefsky	f6649a7e5a	[S390] cleanup lowcore access from external interrupts Read external interrupts parameters from the lowcore in the first level interrupt handler in entry[64].S. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:19 +02:00
Martin Schwidefsky	1e54622e04	[S390] cleanup lowcore access from program checks Read all required fields for program checks from the lowcore in the first level interrupt handler in entry[64].S. If the context that caused the fault was enabled for interrupts we can now re-enable the irqs in entry[64].S. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:19 +02:00
Martin Schwidefsky	36bf96801e	[S390] fix SIGBUS handling Raise SIGBUS with a siginfo structure. Deliver BUS_ADRERR as si_code and the address of the fault in the si_addr field. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:19 +02:00
Martin Schwidefsky	92f842eac7	[S390] store indication fault optimization Use the store indication bit in the translation exception code on page faults to avoid the protection faults that immediatly follow the page fault if the access has been a write. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-10-25 16:10:15 +02:00
Heiko Carstens	ab3c68ee5f	[S390] debug: enable exception-trace debug facility The exception-trace facility on x86 and other architectures prints traces to dmesg whenever a user space application crashes. s390 has such a feature since ages however it is called userprocess_debug and is enabled differently. This patch makes sure that whenever one of the two procfs files /proc/sys/kernel/userprocess_debug /proc/sys/debug/exception-trace is modified the contents of the second one changes as well. That way we keep backwards compatibilty but also support the same interface like other architectures do. Besides that the output of the traces is improved since it will now also contain the corresponding filename of the vma (when available) where the process caused a fault or trap. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-05-17 10:00:17 +02:00
Heiko Carstens	22e0a04672	[S390] use kprobes_built_in() in mm/fault code Use kprobes_built_in() to avoid ifdefs like most other architectures do. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-02-26 22:37:32 +01:00
Heiko Carstens	cbb870c822	[S390] Cleanup struct _lowcore usage and defines. Use asm offsets to make sure the offset defines to struct _lowcore and its layout don't get out of sync. Also add a BUILD_BUG_ON() which checks that the size of the structure is sane. And while being at it change those sites which use odd casts to access the current lowcore. These should use S390_lowcore instead. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2010-02-26 22:37:31 +01:00
Gerald Schaefer	6c1e3e7943	[S390] Use do_exception() in pagetable walk usercopy functions. The pagetable walk usercopy functions have used a modified copy of the do_exception() function for fault handling. This lead to inconsistencies with recent changes to do_exception(), e.g. performance counters. This patch changes the pagetable walk usercopy code to call do_exception() directly, eliminating the redundancy. A new parameter is added to do_exception() to specify the fault address. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:34 +01:00
Martin Schwidefsky	1ab947de29	[S390] fault handler access flags check. Simplify the check of the vma->flags in do_exception for the different fault types. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:34 +01:00
Martin Schwidefsky	50d7280d43	[S390] fault handler performance optimization. Slim down the do_exception function to handle only the fast path of a fault and move the exceptional cases into a new function. That slightly increases the performance of the fault handling. Build fix for !CONFIG_COMPAT by Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:33 +01:00
Martin Schwidefsky	7ecb344ae8	[S390] Improve notify_page_fault implementation. notify_page_fault does a preempt_disable/preempt_enable for each fault generated by a kernel access to user space. If kprobes is not active that is unnecessary since the interrupts are not reenabled yet. To play safe repeat the kprobe_running check after preempt_disable(). Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:33 +01:00
Martin Schwidefsky	b11b533427	[S390] Improve address space mode selection. Introduce user_mode to replace the two variables switch_amode and s390_noexec. There are three valid combinations of the old values: 1) switch_amode == 0 && s390_noexec == 0 2) switch_amode == 1 && s390_noexec == 0 3) switch_amode == 1 && s390_noexec == 1 They get replaced by 1) user_mode == HOME_SPACE_MODE 2) user_mode == PRIMARY_SPACE_MODE 3) user_mode == SECONDARY_SPACE_MODE The new kernel parameter user_mode=[primary,secondary,home] lets you choose the address space mode the user space processes should use. In addition the CONFIG_S390_SWITCH_AMODE config option is removed. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:33 +01:00
Martin Schwidefsky	61365e132e	[S390] Improve address space check. A data access in access-register mode always is a user mode access, the code to inspect the access-registers can be removed. The second change is to use a different test to check for no-execute fault. The third change is to pass the translation exception identification as parameter, in theory the trans_exc_code in the lowcore could have been overwritten by the time the call to check_space from do_no_context is done. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-12-07 12:51:33 +01:00
Ingo Molnar	cdd6c482c9	perf: Do the big rename: Performance Counters -> Performance Events Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f \| grep -vE 'oprofile\|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N \| sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <eranian@google.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: <linux-arch@vger.kernel.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-21 14:28:04 +02:00
Heiko Carstens	bde69af2ab	[S390] Wire up page fault events for software perf counters. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-09-11 10:29:56 +02:00
Alexey Dobriyan	405f55712d	headers: smp_lock.h redux * Remove smp_lock.h from files which don't need it (including some headers!) * Add smp_lock.h to files which do need it * Make smp_lock.h include conditional in hardirq.h It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT This will make hardirq.h inclusion cheaper for every PREEMPT=n config (which includes allmodconfig/allyesconfig, BTW) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-07-12 12:22:34 -07:00
Linus Torvalds	d06063cc22	Move FAULT_FLAG_xyz into handle_mm_fault() callers This allows the callers to now pass down the full set of FAULT_FLAG_xyz flags to handle_mm_fault(). All callers have been (mechanically) converted to the new calling convention, there's almost certainly room for architectures to clean up their code and then add FAULT_FLAG_RETRY when that support is added. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-06-21 13:08:22 -07:00
Heiko Carstens	7757591ab4	[S390] implement is_compat_task Implement is_compat_task and use it all over the place. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-06-12 10:27:30 +02:00
Heiko Carstens	59fa4392dd	[S390] page fault: invoke oom-killer s390 arch backend for `1c0fe6e3bd` "mm: invoke oom-killer from page fault". Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2009-03-26 15:24:03 +01:00
Gerald Schaefer	53492b1de4	[S390] System z large page support. This adds hugetlbfs support on System z, using both hardware large page support if available and software large page emulation on older hardware. Shared (large) page tables are implemented in software emulation mode, by using page->index of the first tail page from a compound large page to store page table information. Signed-off-by: Gerald Schaefer <geraldsc@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-04-30 13:38:47 +02:00
Heiko Carstens	a806170e29	[S390] Fix a lot of sparse warnings. Most noteable part of this commit is the new local header file entry.h which contains all the function declarations of functions that get only called from asm code or are arch internal. That way we can avoid extern declarations in C files. This is more or less the same that was done for sparc64. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-04-17 07:47:06 +02:00
Heiko Carstens	c2e8b8531b	[S390] exec_protect: Fix incorrect extern declarations. sys_sigreturn and sys_rt_sigreturn don't take any arguments. So luckily this resulted only in unneeded instead of incorrect code. But still this clearly shows why one should not put extern declarations in C files (will be fixed with a larger sparse patch). Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>	2008-04-17 07:46:59 +02:00
Martin Schwidefsky	6252d702c5	[S390] dynamic page tables. Add support for different number of page table levels dependent on the highest address used for a process. This will cause a 31 bit process to use a two level page table instead of the four level page table that is the default after the pud has been introduced. Likewise a normal 64 bit process will use three levels instead of four. Only if a process runs out of the 4 tera bytes which can be addressed with a three level page table the fourth level is dynamically added. Then the process can use up to 8 peta byte. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2008-02-09 18:24:41 +01:00
Serge E. Hallyn	b460cbc581	pid namespaces: define is_global_init() and is_container_init() is_init() is an ambiguous name for the pid==1 check. Split it into is_global_init() and is_container_init(). A cgroup init has it's tsk->pid == 1. A global init also has it's tsk->pid == 1 and it's active pid namespace is the init_pid_ns. But rather than check the active pid namespace, compare the task structure with 'init_pid_ns.child_reaper', which is initialized during boot to the /sbin/init process and never changes. Changelog: 2.6.22-rc4-mm2-pidns1: - Use 'init_pid_ns.child_reaper' to determine if a given task is the global init (/sbin/init) process. This would improve performance and remove dependence on the task_pid(). 2.6.21-mm2-pidns2: - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc, ppc,avr32}/traps.c for the _exception() call to is_global_init(). This way, we kill only the cgroup if the cgroup's init has a bug rather than force a kernel panic. [akpm@linux-foundation.org: fix comment] [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c] [bunk@stusta.de: kernel/pid.c: remove unused exports] [sukadev@us.ibm.com: Fix capability.c to work with threaded init] Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Acked-by: Pavel Emelianov <xemul@openvz.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cedric Le Goater <clg@fr.ibm.com> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Herbert Poetzel <herbert@13thfloor.at> Cc: Kirill Korotaev <dev@sw.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-19 11:53:37 -07:00
Will Schmidt	dcca2bde4f	During VM oom condition, kill all threads in process group We have had complaints where a threaded application is left in a bad state after one of it's threads is killed when we hit a VM: out_of_memory condition. Killing just one of the process threads can leave the application in a bad state, whereas killing the entire process group would allow for the application to restart, or be otherwise handled, and makes it very obvious that something has gone wrong. This change allows the entire process group to be taken down, rather than just the one thread. Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Andi Kleen <ak@suse.de> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Matthew Wilcox <willy@debian.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-16 09:42:52 -07:00
Heiko Carstens	c41fbc6965	[S390] pfault: Fix alignment of parameter list. Make sure parameter list of the pfault token function is eight byte aligned. Otherwise we can get specification exceptions. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2007-10-12 16:13:11 +02:00

1 2

74 Commits