1
linux/arch/x86/kernel/signal.c

440 lines
12 KiB
C
Raw Normal View History

License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 07:07:57 -07:00
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (C) 1991, 1992 Linus Torvalds
* Copyright (C) 2000, 2001, 2002 Andi Kleen SuSE Labs
*
* 1997-11-28 Modified for POSIX.1b signals by Richard Henderson
* 2000-06-20 Pentium III FXSR, SSE support by Gareth Hughes
* 2000-2002 x86-64 support by Andi Kleen
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/sched.h>
#include <linux/sched/task_stack.h>
#include <linux/mm.h>
#include <linux/smp.h>
#include <linux/kernel.h>
#include <linux/kstrtox.h>
#include <linux/errno.h>
#include <linux/wait.h>
#include <linux/unistd.h>
#include <linux/stddef.h>
#include <linux/personality.h>
#include <linux/uaccess.h>
#include <linux/user-return-notifier.h>
uprobes/core: Handle breakpoint and singlestep exceptions Uprobes uses exception notifiers to get to know if a thread hit a breakpoint or a singlestep exception. When a thread hits a uprobe or is singlestepping post a uprobe hit, the uprobe exception notifier sets its TIF_UPROBE bit, which will then be checked on its return to userspace path (do_notify_resume() ->uprobe_notify_resume()), where the consumers handlers are run (in task context) based on the defined filters. Uprobe hits are thread specific and hence we need to maintain information about if a task hit a uprobe, what uprobe was hit, the slot where the original instruction was copied for xol so that it can be singlestepped with appropriate fixups. In some cases, special care is needed for instructions that are executed out of line (xol). These are architecture specific artefacts, such as handling RIP relative instructions on x86_64. Since the instruction at which the uprobe was inserted is executed out of line, architecture specific fixups are added so that the thread continues normal execution in the presence of a uprobe. Postpone the signals until we execute the probed insn. post_xol() path does a recalc_sigpending() before return to user-mode, this ensures the signal can't be lost. Uprobes relies on DIE_DEBUG notification to notify if a singlestep is complete. Adds x86 specific uprobe exception notifiers and appropriate hooks needed to determine a uprobe hit and subsequent post processing. Add requisite x86 fixups for xol for uprobes. Specific cases needing fixups include relative jumps (x86_64), calls, etc. Where possible, we check and skip singlestepping the breakpointed instructions. For now we skip single byte as well as few multibyte nop instructions. However this can be extended to other instructions too. Credits to Oleg Nesterov for suggestions/patches related to signal, breakpoint, singlestep handling code. Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com> Cc: Linux-mm <linux-mm@kvack.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20120313180011.29771.89027.sendpatchset@srdronam.in.ibm.com [ Performed various cleanliness edits ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-13 11:00:11 -07:00
#include <linux/uprobes.h>
#include <linux/context_tracking.h>
#include <linux/entry-common.h>
#include <linux/syscalls.h>
#include <linux/rseq.h>
#include <asm/processor.h>
#include <asm/ucontext.h>
#include <asm/fpu/signal.h>
#include <asm/fpu/xstate.h>
#include <asm/vdso.h>
x86, mce: use 64bit machine check code on 32bit The 64bit machine check code is in many ways much better than the 32bit machine check code: it is more specification compliant, is cleaner, only has a single code base versus one per CPU, has better infrastructure for recovery, has a cleaner way to communicate with user space etc. etc. Use the 64bit code for 32bit too. This is the second attempt to do this. There was one a couple of years ago to unify this code for 32bit and 64bit. Back then this ran into some trouble with K7s and was reverted. I believe this time the K7 problems (and some others) are addressed. I went over the old handlers and was very careful to retain all quirks. But of course this needs a lot of testing on old systems. On newer 64bit capable systems I don't expect much problems because they have been already tested with the 64bit kernel. I made this a CONFIG for now that still allows to select the old machine check code. This is mostly to make testing easier, if someone runs into a problem we can ask them to try with the CONFIG switched. The new code is default y for more coverage. Once there is confidence the 64bit code works well on older hardware too the CONFIG_X86_OLD_MCE and the associated code can be easily removed. This causes a behaviour change for 32bit installations. They now have to install the mcelog package to be able to log corrected machine checks. The 64bit machine check code only handles CPUs which support the standard Intel machine check architecture described in the IA32 SDM. The 32bit code has special support for some older CPUs which have non standard machine check architectures, in particular WinChip C3 and Intel P5. I made those a separate CONFIG option and kept them for now. The WinChip variant could be probably removed without too much pain, it doesn't really do anything interesting. P5 is also disabled by default (like it was before) because many motherboards have it miswired, but according to Alan Cox a few embedded setups use that one. Forward ported/heavily changed version of old patch, original patch included review/fixes from Thomas Gleixner, Bert Wesarg. Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-04-28 10:07:31 -07:00
#include <asm/mce.h>
#include <asm/sighandling.h>
#include <asm/vm86.h>
#include <asm/syscall.h>
#include <asm/sigframe.h>
#include <asm/signal.h>
x86/shstk: Handle signals for shadow stack When a signal is handled, the context is pushed to the stack before handling it. For shadow stacks, since the shadow stack only tracks return addresses, there isn't any state that needs to be pushed. However, there are still a few things that need to be done. These things are visible to userspace and which will be kernel ABI for shadow stacks. One is to make sure the restorer address is written to shadow stack, since the signal handler (if not changing ucontext) returns to the restorer, and the restorer calls sigreturn. So add the restorer on the shadow stack before handling the signal, so there is not a conflict when the signal handler returns to the restorer. The other thing to do is to place some type of checkable token on the thread's shadow stack before handling the signal and check it during sigreturn. This is an extra layer of protection to hamper attackers calling sigreturn manually as in SROP-like attacks. For this token the shadow stack data format defined earlier can be used. Have the data pushed be the previous SSP. In the future the sigreturn might want to return back to a different stack. Storing the SSP (instead of a restore offset or something) allows for future functionality that may want to restore to a different stack. So, when handling a signal push - the SSP pointing in the shadow stack data format - the restorer address below the restore token. In sigreturn, verify SSP is stored in the data format and pop the shadow stack. Co-developed-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Tested-by: John Allen <john.allen@amd.com> Tested-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/all/20230613001108.3040476-32-rick.p.edgecombe%40intel.com
2023-06-12 17:10:57 -07:00
#include <asm/shstk.h>
static inline int is_ia32_compat_frame(struct ksignal *ksig)
{
return IS_ENABLED(CONFIG_IA32_EMULATION) &&
ksig->ka.sa.sa_flags & SA_IA32_ABI;
}
static inline int is_ia32_frame(struct ksignal *ksig)
{
return IS_ENABLED(CONFIG_X86_32) || is_ia32_compat_frame(ksig);
}
static inline int is_x32_frame(struct ksignal *ksig)
{
return IS_ENABLED(CONFIG_X86_X32_ABI) &&
ksig->ka.sa.sa_flags & SA_X32_ABI;
}
/*
* Enable all pkeys temporarily, so as to ensure that both the current
* execution stack as well as the alternate signal stack are writeable.
* The application can use any of the available pkeys to protect the
* alternate signal stack, and we don't know which one it is, so enable
* all. The PKRU register will be reset to init_pkru later in the flow,
* in fpu__clear_user_states(), and it is the application's responsibility
* to enable the appropriate pkey as the first step in the signal handler
* so that the handler does not segfault.
*/
static inline u32 sig_prepare_pkru(void)
{
u32 orig_pkru = read_pkru();
write_pkru(0);
return orig_pkru;
}
/*
* Set up a signal frame.
*/
/* x86 ABI requires 16-byte alignment */
#define FRAME_ALIGNMENT 16UL
#define MAX_FRAME_PADDING (FRAME_ALIGNMENT - 1)
/*
* Determine which stack to use..
*/
void __user *
get_sigframe(struct ksignal *ksig, struct pt_regs *regs, size_t frame_size,
void __user **fpstate)
{
struct k_sigaction *ka = &ksig->ka;
int ia32_frame = is_ia32_frame(ksig);
/* Default to using normal stack */
bool nested_altstack = on_sig_stack(regs->sp);
bool entering_altstack = false;
x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied to/from the fpstate in the task struct. And in the case of signal delivery for x86_64 binaries, if the fpstate is live in the CPU registers, then the live state is copied directly to the user sigframe. Otherwise fpstate in the task struct is copied to the user sigframe. During restore, fpstate in the user sigframe is restored directly to the live CPU registers. Historically, different code paths led to different bugs. For example, x86_64 code path was not preemption safe till recently. Also there is lot of code duplication for support of new features like xsave etc. Unify signal handling code paths for x86 and x86_64 kernels. New strategy is as follows: Signal delivery: Both for 32/64-bit frames, align the core math frame area to 64bytes as needed by xsave (this where the main fpu/extended state gets copied to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave frames). If the state is live, copy the register state directly to the user frame. If not live, copy the state in the thread struct to the user frame. And for 32-bit [f]xsave frames, construct the fsave header separately before the actual [f]xsave area. Signal return: As the 32-bit frames with [f]xstate has an additional 'fsave' header, copy everything back from the user sigframe to the fpstate in the task structure and reconstruct the fxstate from the 'fsave' header (Also user passed pointers may not be correctly aligned for any attempt to directly restore any partial state). At the next fpstate usage, everything will be restored to the live CPU registers. For all the 64-bit frames and the 32-bit fsave frame, restore the state from the user sigframe directly to the live CPU registers. 64-bit signals always restored the math frame directly, so we can expect the math frame pointer to be correctly aligned. For 32-bit fsave frames, there are no alignment requirements, so we can restore the state directly. "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are with in the noise range with this change. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com [ Merged in compilation fix ] Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-07-24 16:05:29 -07:00
unsigned long math_size = 0;
unsigned long sp = regs->sp;
x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied to/from the fpstate in the task struct. And in the case of signal delivery for x86_64 binaries, if the fpstate is live in the CPU registers, then the live state is copied directly to the user sigframe. Otherwise fpstate in the task struct is copied to the user sigframe. During restore, fpstate in the user sigframe is restored directly to the live CPU registers. Historically, different code paths led to different bugs. For example, x86_64 code path was not preemption safe till recently. Also there is lot of code duplication for support of new features like xsave etc. Unify signal handling code paths for x86 and x86_64 kernels. New strategy is as follows: Signal delivery: Both for 32/64-bit frames, align the core math frame area to 64bytes as needed by xsave (this where the main fpu/extended state gets copied to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave frames). If the state is live, copy the register state directly to the user frame. If not live, copy the state in the thread struct to the user frame. And for 32-bit [f]xsave frames, construct the fsave header separately before the actual [f]xsave area. Signal return: As the 32-bit frames with [f]xstate has an additional 'fsave' header, copy everything back from the user sigframe to the fpstate in the task structure and reconstruct the fxstate from the 'fsave' header (Also user passed pointers may not be correctly aligned for any attempt to directly restore any partial state). At the next fpstate usage, everything will be restored to the live CPU registers. For all the 64-bit frames and the 32-bit fsave frame, restore the state from the user sigframe directly to the live CPU registers. 64-bit signals always restored the math frame directly, so we can expect the math frame pointer to be correctly aligned. For 32-bit fsave frames, there are no alignment requirements, so we can restore the state directly. "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are with in the noise range with this change. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com [ Merged in compilation fix ] Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-07-24 16:05:29 -07:00
unsigned long buf_fx = 0;
u32 pkru;
/* redzone */
if (!ia32_frame)
sp -= 128;
2016-04-14 13:20:02 -07:00
/* This is the X/Open sanctioned signal stack switching. */
if (ka->sa.sa_flags & SA_ONSTACK) {
/*
* This checks nested_altstack via sas_ss_flags(). Sensible
* programs use SS_AUTODISARM, which disables that check, and
* programs that don't use SS_AUTODISARM get compatible.
*/
if (sas_ss_flags(sp) == 0) {
2016-04-14 13:20:02 -07:00
sp = current->sas_ss_sp + current->sas_ss_size;
entering_altstack = true;
}
} else if (ia32_frame &&
!nested_altstack &&
regs->ss != __USER_DS &&
2016-04-14 13:20:02 -07:00
!(ka->sa.sa_flags & SA_RESTORER) &&
ka->sa.sa_restorer) {
/* This is the legacy signal stack switching. */
sp = (unsigned long) ka->sa.sa_restorer;
entering_altstack = true;
}
sp = fpu__alloc_mathframe(sp, ia32_frame, &buf_fx, &math_size);
x86/fpu: Remove fpu->initialized The struct fpu.initialized member is always set to one for user tasks and zero for kernel tasks. This avoids saving/restoring the FPU registers for kernel threads. The ->initialized = 0 case for user tasks has been removed in previous changes, for instance, by doing an explicit unconditional init at fork() time for FPU-less systems which was otherwise delayed until the emulated opcode. The context switch code (switch_fpu_prepare() + switch_fpu_finish()) can't unconditionally save/restore registers for kernel threads. Not only would it slow down the switch but also load a zeroed xcomp_bv for XSAVES. For kernel_fpu_begin() (+end) the situation is similar: EFI with runtime services uses this before alternatives_patched is true. Which means that this function is used too early and it wasn't the case before. For those two cases, use current->mm to distinguish between user and kernel thread. For kernel_fpu_begin() skip save/restore of the FPU registers. During the context switch into a kernel thread don't do anything. There is no reason to save the FPU state of a kernel thread. The reordering in __switch_to() is important because the current() pointer needs to be valid before switch_fpu_finish() is invoked so ->mm is seen of the new task instead the old one. N.B.: fpu__save() doesn't need to check ->mm because it is called by user tasks only. [ bp: Massage. ] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Dave Hansen <dave.hansen@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Aubrey Li <aubrey.li@intel.com> Cc: Babu Moger <Babu.Moger@amd.com> Cc: "Chang S. Bae" <chang.seok.bae@intel.com> Cc: Dmitry Safonov <dima@arista.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com> Cc: Joerg Roedel <jroedel@suse.de> Cc: kvm ML <kvm@vger.kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nicolai Stange <nstange@suse.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Will Deacon <will.deacon@arm.com> Cc: x86-ml <x86@kernel.org> Link: https://lkml.kernel.org/r/20190403164156.19645-8-bigeasy@linutronix.de
2019-04-03 09:41:36 -07:00
*fpstate = (void __user *)sp;
sp -= frame_size;
if (ia32_frame)
/*
* Align the stack pointer according to the i386 ABI,
* i.e. so that on function entry ((sp + 4) & 15) == 0.
*/
sp = ((sp + 4) & -FRAME_ALIGNMENT) - 4;
else
sp = round_down(sp, FRAME_ALIGNMENT) - 8;
/*
* If we are on the alternate signal stack and would overflow it, don't.
* Return an always-bogus address instead so we will die with SIGSEGV.
*/
if (unlikely((nested_altstack || entering_altstack) &&
!__on_sig_stack(sp))) {
if (show_unhandled_signals && printk_ratelimit())
pr_info("%s[%d] overflowed sigaltstack\n",
current->comm, task_pid_nr(current));
return (void __user *)-1L;
}
/* Update PKRU to enable access to the alternate signal stack. */
pkru = sig_prepare_pkru();
x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied to/from the fpstate in the task struct. And in the case of signal delivery for x86_64 binaries, if the fpstate is live in the CPU registers, then the live state is copied directly to the user sigframe. Otherwise fpstate in the task struct is copied to the user sigframe. During restore, fpstate in the user sigframe is restored directly to the live CPU registers. Historically, different code paths led to different bugs. For example, x86_64 code path was not preemption safe till recently. Also there is lot of code duplication for support of new features like xsave etc. Unify signal handling code paths for x86 and x86_64 kernels. New strategy is as follows: Signal delivery: Both for 32/64-bit frames, align the core math frame area to 64bytes as needed by xsave (this where the main fpu/extended state gets copied to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave frames). If the state is live, copy the register state directly to the user frame. If not live, copy the state in the thread struct to the user frame. And for 32-bit [f]xsave frames, construct the fsave header separately before the actual [f]xsave area. Signal return: As the 32-bit frames with [f]xstate has an additional 'fsave' header, copy everything back from the user sigframe to the fpstate in the task structure and reconstruct the fxstate from the 'fsave' header (Also user passed pointers may not be correctly aligned for any attempt to directly restore any partial state). At the next fpstate usage, everything will be restored to the live CPU registers. For all the 64-bit frames and the 32-bit fsave frame, restore the state from the user sigframe directly to the live CPU registers. 64-bit signals always restored the math frame directly, so we can expect the math frame pointer to be correctly aligned. For 32-bit fsave frames, there are no alignment requirements, so we can restore the state directly. "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are with in the noise range with this change. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com [ Merged in compilation fix ] Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-07-24 16:05:29 -07:00
/* save i387 and extended state */
if (!copy_fpstate_to_sigframe(*fpstate, (void __user *)buf_fx, math_size, pkru)) {
/*
* Restore PKRU to the original, user-defined value; disable
* extra pkeys enabled for the alternate signal stack, if any.
*/
write_pkru(pkru);
return (void __user *)-1L;
}
return (void __user *)sp;
}
/*
* There are four different struct types for signal frame: sigframe_ia32,
* rt_sigframe_ia32, rt_sigframe_x32, and rt_sigframe. Use the worst case
* -- the largest size. It means the size for 64-bit apps is a bit more
* than needed, but this keeps the code simple.
*/
#if defined(CONFIG_X86_32) || defined(CONFIG_IA32_EMULATION)
# define MAX_FRAME_SIGINFO_UCTXT_SIZE sizeof(struct sigframe_ia32)
#else
# define MAX_FRAME_SIGINFO_UCTXT_SIZE sizeof(struct rt_sigframe)
#endif
/*
* The FP state frame contains an XSAVE buffer which must be 64-byte aligned.
* If a signal frame starts at an unaligned address, extra space is required.
* This is the max alignment padding, conservatively.
*/
#define MAX_XSAVE_PADDING 63UL
/*
* The frame data is composed of the following areas and laid out as:
*
* -------------------------
* | alignment padding |
* -------------------------
* | (f)xsave frame |
* -------------------------
* | fsave header |
* -------------------------
* | alignment padding |
* -------------------------
* | siginfo + ucontext |
* -------------------------
*/
/* max_frame_size tells userspace the worst case signal stack size. */
static unsigned long __ro_after_init max_frame_size;
static unsigned int __ro_after_init fpu_default_state_size;
static int __init init_sigframe_size(void)
{
fpu_default_state_size = fpu__get_fpstate_size();
max_frame_size = MAX_FRAME_SIGINFO_UCTXT_SIZE + MAX_FRAME_PADDING;
max_frame_size += fpu_default_state_size + MAX_XSAVE_PADDING;
/* Userspace expects an aligned size. */
max_frame_size = round_up(max_frame_size, FRAME_ALIGNMENT);
pr_info("max sigframe size: %lu\n", max_frame_size);
return 0;
}
early_initcall(init_sigframe_size);
unsigned long get_sigframe_size(void)
{
return max_frame_size;
}
static int
setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
{
/* Perform fixup for the pre-signal frame. */
rseq_signal_deliver(ksig, regs);
x86: Add support for restartable sequences Call the rseq_handle_notify_resume() function on return to userspace if TIF_NOTIFY_RESUME thread flag is set. Perform fixup on the pre-signal frame when a signal is delivered on top of a restartable sequence critical section. Check that system calls are not invoked from within rseq critical sections by invoking rseq_signal() from syscall_return_slowpath(). With CONFIG_DEBUG_RSEQ, such behavior results in termination of the process with SIGSEGV. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Joel Fernandes <joelaf@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Dave Watson <davejwatson@fb.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Chris Lameter <cl@linux.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Andrew Hunter <ahh@google.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com> Cc: Paul Turner <pjt@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Maurer <bmaurer@fb.com> Cc: linux-api@vger.kernel.org Cc: Andy Lutomirski <luto@amacapital.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20180602124408.8430-7-mathieu.desnoyers@efficios.com
2018-06-02 05:43:58 -07:00
/* Set up the stack frame */
if (is_ia32_frame(ksig)) {
if (ksig->ka.sa.sa_flags & SA_SIGINFO)
return ia32_setup_rt_frame(ksig, regs);
else
return ia32_setup_frame(ksig, regs);
} else if (is_x32_frame(ksig)) {
return x32_setup_rt_frame(ksig, regs);
} else {
return x64_setup_rt_frame(ksig, regs);
}
}
static void
handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal() When the TIF_SINGLESTEP tracee dequeues a signal, handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but leaves TIF_SINGLESTEP set. If the tracer does PTRACE_SINGLESTEP again, enable_single_step() sets X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the subsequent PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the tracee gets the wrong SIGTRAP. Test-case (needs -O2 to avoid prologue insns in signal handler): #include <unistd.h> #include <stdio.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <sys/user.h> #include <assert.h> #include <stddef.h> void handler(int n) { asm("nop"); } int child(void) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); signal(SIGALRM, handler); kill(getpid(), SIGALRM); return 0x23; } void *getip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } int main(void) { int pid, status; pid = fork(); if (!pid) return child(); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 0); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 1); assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23); return 0; } The last assert() fails because PTRACE_CONT wrongly triggers another single-step and X86_EFLAGS_TF can't be cleared by debugger until the tracee does sys_rt_sigreturn(). Change handle_signal() to do user_disable_single_step() if stepping, we do not need to preserve TIF_SINGLESTEP because we are going to do ptrace_notify(), and it is simply wrong to leak this bit. While at it, change the comment to explain why we also need to clear TF unconditionally after setup_rt_frame(). Note: in the longer term we should probably change setup_sigcontext() to use get_flags() and then just remove this user_disable_single_step(). And, the state of TIF_FORCED_TF can be wrong after restore_sigcontext() which can set/clear TF, this needs another fix. This fix fixes the 'single_step_syscall_32' testcase in the x86 testsuite: Before: ~/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [WARN] Hit 10000 SIGTRAPs with si_addr 0xf7789cc0, ip 0xf7789cc0 Trace/breakpoint trap (core dumped) After: ~/linux/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [OK] Survived with TF set and 39 traps [RUN] Fast syscall with TF cleared [OK] Nothing unexpected happened Reported-by: Evan Teran <eteran@alum.rit.edu> Reported-by: Pedro Alves <palves@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> [ Added x86 self-test info. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-16 00:40:25 -07:00
bool stepping, failed;
struct fpu *fpu = &current->thread.fpu;
x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal() When the TIF_SINGLESTEP tracee dequeues a signal, handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but leaves TIF_SINGLESTEP set. If the tracer does PTRACE_SINGLESTEP again, enable_single_step() sets X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the subsequent PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the tracee gets the wrong SIGTRAP. Test-case (needs -O2 to avoid prologue insns in signal handler): #include <unistd.h> #include <stdio.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <sys/user.h> #include <assert.h> #include <stddef.h> void handler(int n) { asm("nop"); } int child(void) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); signal(SIGALRM, handler); kill(getpid(), SIGALRM); return 0x23; } void *getip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } int main(void) { int pid, status; pid = fork(); if (!pid) return child(); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 0); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 1); assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23); return 0; } The last assert() fails because PTRACE_CONT wrongly triggers another single-step and X86_EFLAGS_TF can't be cleared by debugger until the tracee does sys_rt_sigreturn(). Change handle_signal() to do user_disable_single_step() if stepping, we do not need to preserve TIF_SINGLESTEP because we are going to do ptrace_notify(), and it is simply wrong to leak this bit. While at it, change the comment to explain why we also need to clear TF unconditionally after setup_rt_frame(). Note: in the longer term we should probably change setup_sigcontext() to use get_flags() and then just remove this user_disable_single_step(). And, the state of TIF_FORCED_TF can be wrong after restore_sigcontext() which can set/clear TF, this needs another fix. This fix fixes the 'single_step_syscall_32' testcase in the x86 testsuite: Before: ~/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [WARN] Hit 10000 SIGTRAPs with si_addr 0xf7789cc0, ip 0xf7789cc0 Trace/breakpoint trap (core dumped) After: ~/linux/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [OK] Survived with TF set and 39 traps [RUN] Fast syscall with TF cleared [OK] Nothing unexpected happened Reported-by: Evan Teran <eteran@alum.rit.edu> Reported-by: Pedro Alves <palves@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> [ Added x86 self-test info. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-16 00:40:25 -07:00
if (v8086_mode(regs))
save_v86_state((struct kernel_vm86_regs *) regs, VM86_SIGNAL);
/* Are we from a system call? */
if (syscall_get_nr(current, regs) != -1) {
/* If so, check system call restarting.. */
switch (syscall_get_error(current, regs)) {
case -ERESTART_RESTARTBLOCK:
case -ERESTARTNOHAND:
regs->ax = -EINTR;
break;
case -ERESTARTSYS:
if (!(ksig->ka.sa.sa_flags & SA_RESTART)) {
regs->ax = -EINTR;
break;
}
fallthrough;
case -ERESTARTNOINTR:
regs->ax = regs->orig_ax;
regs->ip -= 2;
break;
}
}
/*
x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal() When the TIF_SINGLESTEP tracee dequeues a signal, handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but leaves TIF_SINGLESTEP set. If the tracer does PTRACE_SINGLESTEP again, enable_single_step() sets X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the subsequent PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the tracee gets the wrong SIGTRAP. Test-case (needs -O2 to avoid prologue insns in signal handler): #include <unistd.h> #include <stdio.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <sys/user.h> #include <assert.h> #include <stddef.h> void handler(int n) { asm("nop"); } int child(void) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); signal(SIGALRM, handler); kill(getpid(), SIGALRM); return 0x23; } void *getip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } int main(void) { int pid, status; pid = fork(); if (!pid) return child(); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 0); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 1); assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23); return 0; } The last assert() fails because PTRACE_CONT wrongly triggers another single-step and X86_EFLAGS_TF can't be cleared by debugger until the tracee does sys_rt_sigreturn(). Change handle_signal() to do user_disable_single_step() if stepping, we do not need to preserve TIF_SINGLESTEP because we are going to do ptrace_notify(), and it is simply wrong to leak this bit. While at it, change the comment to explain why we also need to clear TF unconditionally after setup_rt_frame(). Note: in the longer term we should probably change setup_sigcontext() to use get_flags() and then just remove this user_disable_single_step(). And, the state of TIF_FORCED_TF can be wrong after restore_sigcontext() which can set/clear TF, this needs another fix. This fix fixes the 'single_step_syscall_32' testcase in the x86 testsuite: Before: ~/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [WARN] Hit 10000 SIGTRAPs with si_addr 0xf7789cc0, ip 0xf7789cc0 Trace/breakpoint trap (core dumped) After: ~/linux/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [OK] Survived with TF set and 39 traps [RUN] Fast syscall with TF cleared [OK] Nothing unexpected happened Reported-by: Evan Teran <eteran@alum.rit.edu> Reported-by: Pedro Alves <palves@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> [ Added x86 self-test info. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-16 00:40:25 -07:00
* If TF is set due to a debugger (TIF_FORCED_TF), clear TF now
* so that register information in the sigcontext is correct and
* then notify the tracer before entering the signal handler.
*/
x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal() When the TIF_SINGLESTEP tracee dequeues a signal, handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but leaves TIF_SINGLESTEP set. If the tracer does PTRACE_SINGLESTEP again, enable_single_step() sets X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the subsequent PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the tracee gets the wrong SIGTRAP. Test-case (needs -O2 to avoid prologue insns in signal handler): #include <unistd.h> #include <stdio.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <sys/user.h> #include <assert.h> #include <stddef.h> void handler(int n) { asm("nop"); } int child(void) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); signal(SIGALRM, handler); kill(getpid(), SIGALRM); return 0x23; } void *getip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } int main(void) { int pid, status; pid = fork(); if (!pid) return child(); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 0); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 1); assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23); return 0; } The last assert() fails because PTRACE_CONT wrongly triggers another single-step and X86_EFLAGS_TF can't be cleared by debugger until the tracee does sys_rt_sigreturn(). Change handle_signal() to do user_disable_single_step() if stepping, we do not need to preserve TIF_SINGLESTEP because we are going to do ptrace_notify(), and it is simply wrong to leak this bit. While at it, change the comment to explain why we also need to clear TF unconditionally after setup_rt_frame(). Note: in the longer term we should probably change setup_sigcontext() to use get_flags() and then just remove this user_disable_single_step(). And, the state of TIF_FORCED_TF can be wrong after restore_sigcontext() which can set/clear TF, this needs another fix. This fix fixes the 'single_step_syscall_32' testcase in the x86 testsuite: Before: ~/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [WARN] Hit 10000 SIGTRAPs with si_addr 0xf7789cc0, ip 0xf7789cc0 Trace/breakpoint trap (core dumped) After: ~/linux/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [OK] Survived with TF set and 39 traps [RUN] Fast syscall with TF cleared [OK] Nothing unexpected happened Reported-by: Evan Teran <eteran@alum.rit.edu> Reported-by: Pedro Alves <palves@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> [ Added x86 self-test info. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-16 00:40:25 -07:00
stepping = test_thread_flag(TIF_SINGLESTEP);
if (stepping)
user_disable_single_step(current);
failed = (setup_rt_frame(ksig, regs) < 0);
if (!failed) {
/*
* Clear the direction flag as per the ABI for function entry.
*
* Clear RF when entering the signal handler, because
* it might disable possible debug exception from the
* signal handler.
*
x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal() When the TIF_SINGLESTEP tracee dequeues a signal, handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but leaves TIF_SINGLESTEP set. If the tracer does PTRACE_SINGLESTEP again, enable_single_step() sets X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the subsequent PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the tracee gets the wrong SIGTRAP. Test-case (needs -O2 to avoid prologue insns in signal handler): #include <unistd.h> #include <stdio.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <sys/user.h> #include <assert.h> #include <stddef.h> void handler(int n) { asm("nop"); } int child(void) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); signal(SIGALRM, handler); kill(getpid(), SIGALRM); return 0x23; } void *getip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } int main(void) { int pid, status; pid = fork(); if (!pid) return child(); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 0); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 1); assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23); return 0; } The last assert() fails because PTRACE_CONT wrongly triggers another single-step and X86_EFLAGS_TF can't be cleared by debugger until the tracee does sys_rt_sigreturn(). Change handle_signal() to do user_disable_single_step() if stepping, we do not need to preserve TIF_SINGLESTEP because we are going to do ptrace_notify(), and it is simply wrong to leak this bit. While at it, change the comment to explain why we also need to clear TF unconditionally after setup_rt_frame(). Note: in the longer term we should probably change setup_sigcontext() to use get_flags() and then just remove this user_disable_single_step(). And, the state of TIF_FORCED_TF can be wrong after restore_sigcontext() which can set/clear TF, this needs another fix. This fix fixes the 'single_step_syscall_32' testcase in the x86 testsuite: Before: ~/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [WARN] Hit 10000 SIGTRAPs with si_addr 0xf7789cc0, ip 0xf7789cc0 Trace/breakpoint trap (core dumped) After: ~/linux/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [OK] Survived with TF set and 39 traps [RUN] Fast syscall with TF cleared [OK] Nothing unexpected happened Reported-by: Evan Teran <eteran@alum.rit.edu> Reported-by: Pedro Alves <palves@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> [ Added x86 self-test info. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-16 00:40:25 -07:00
* Clear TF for the case when it wasn't set by debugger to
* avoid the recursive send_sigtrap() in SIGTRAP handler.
*/
regs->flags &= ~(X86_EFLAGS_DF|X86_EFLAGS_RF|X86_EFLAGS_TF);
x86, fpu: shift drop_init_fpu() from save_xstate_sig() to handle_signal() save_xstate_sig()->drop_init_fpu() doesn't look right. setup_rt_frame() can fail after that, in this case the next setup_rt_frame() triggered by SIGSEGV won't save fpu simply because the old state was lost. This obviously mean that fpu won't be restored after sys_rt_sigreturn() from SIGSEGV handler. Shift drop_init_fpu() into !failed branch in handle_signal(). Test-case (needs -O2): #include <stdio.h> #include <signal.h> #include <unistd.h> #include <sys/syscall.h> #include <sys/mman.h> #include <pthread.h> #include <assert.h> volatile double D; void test(double d) { int pid = getpid(); for (D = d; D == d; ) { /* sys_tkill(pid, SIGHUP); asm to avoid save/reload * fp regs around "C" call */ asm ("" : : "a"(200), "D"(pid), "S"(1)); asm ("syscall" : : : "ax"); } printf("ERR!!\n"); } void sigh(int sig) { } char altstack[4096 * 10] __attribute__((aligned(4096))); void *tfunc(void *arg) { for (;;) { mprotect(altstack, sizeof(altstack), PROT_READ); mprotect(altstack, sizeof(altstack), PROT_READ|PROT_WRITE); } } int main(void) { stack_t st = { .ss_sp = altstack, .ss_size = sizeof(altstack), .ss_flags = SS_ONSTACK, }; struct sigaction sa = { .sa_handler = sigh, }; pthread_t pt; sigaction(SIGSEGV, &sa, NULL); sigaltstack(&st, NULL); sa.sa_flags = SA_ONSTACK; sigaction(SIGHUP, &sa, NULL); pthread_create(&pt, NULL, tfunc, NULL); test(123.456); return 0; } Reported-by: Bean Anderson <bean@azulsystems.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: http://lkml.kernel.org/r/20140902175713.GA21646@redhat.com Cc: <stable@kernel.org> # v3.7+ Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-09-02 10:57:13 -07:00
/*
* Ensure the signal handler starts with the new fpu state.
*/
fpu__clear_user_states(fpu);
}
x86/ptrace: Fix the TIF_FORCED_TF logic in handle_signal() When the TIF_SINGLESTEP tracee dequeues a signal, handle_signal() clears TIF_FORCED_TF and X86_EFLAGS_TF but leaves TIF_SINGLESTEP set. If the tracer does PTRACE_SINGLESTEP again, enable_single_step() sets X86_EFLAGS_TF but not TIF_FORCED_TF. This means that the subsequent PTRACE_CONT doesn't not clear X86_EFLAGS_TF, and the tracee gets the wrong SIGTRAP. Test-case (needs -O2 to avoid prologue insns in signal handler): #include <unistd.h> #include <stdio.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <sys/user.h> #include <assert.h> #include <stddef.h> void handler(int n) { asm("nop"); } int child(void) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); signal(SIGALRM, handler); kill(getpid(), SIGALRM); return 0x23; } void *getip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } int main(void) { int pid, status; pid = fork(); if (!pid) return child(); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGALRM); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 0); assert(ptrace(PTRACE_SINGLESTEP, pid, 0, SIGALRM) == 0); assert(wait(&status) == pid); assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP); assert((getip(pid) - (void*)handler) == 1); assert(ptrace(PTRACE_CONT, pid, 0,0) == 0); assert(wait(&status) == pid); assert(WIFEXITED(status) && WEXITSTATUS(status) == 0x23); return 0; } The last assert() fails because PTRACE_CONT wrongly triggers another single-step and X86_EFLAGS_TF can't be cleared by debugger until the tracee does sys_rt_sigreturn(). Change handle_signal() to do user_disable_single_step() if stepping, we do not need to preserve TIF_SINGLESTEP because we are going to do ptrace_notify(), and it is simply wrong to leak this bit. While at it, change the comment to explain why we also need to clear TF unconditionally after setup_rt_frame(). Note: in the longer term we should probably change setup_sigcontext() to use get_flags() and then just remove this user_disable_single_step(). And, the state of TIF_FORCED_TF can be wrong after restore_sigcontext() which can set/clear TF, this needs another fix. This fix fixes the 'single_step_syscall_32' testcase in the x86 testsuite: Before: ~/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [WARN] Hit 10000 SIGTRAPs with si_addr 0xf7789cc0, ip 0xf7789cc0 Trace/breakpoint trap (core dumped) After: ~/linux/linux/tools/testing/selftests/x86> ./single_step_syscall_32 [RUN] Set TF and check nop [OK] Survived with TF set and 9 traps [RUN] Set TF and check int80 [OK] Survived with TF set and 9 traps [RUN] Set TF and check a fast syscall [OK] Survived with TF set and 39 traps [RUN] Fast syscall with TF cleared [OK] Nothing unexpected happened Reported-by: Evan Teran <eteran@alum.rit.edu> Reported-by: Pedro Alves <palves@redhat.com> Tested-by: Andres Freund <andres@anarazel.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> [ Added x86 self-test info. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-16 00:40:25 -07:00
signal_setup_done(failed, ksig, stepping);
}
static inline unsigned long get_nr_restart_syscall(const struct pt_regs *regs)
{
#ifdef CONFIG_IA32_EMULATION
if (current->restart_block.arch_data & TS_COMPAT)
return __NR_ia32_restart_syscall;
#endif
#ifdef CONFIG_X86_X32_ABI
return __NR_restart_syscall | (regs->orig_ax & __X32_SYSCALL_BIT);
#else
return __NR_restart_syscall;
#endif
}
/*
* Note that 'init' is a special process: it doesn't get signals it doesn't
* want to handle. Thus you cannot kill init even with a SIGKILL even by
* mistake.
*/
void arch_do_signal_or_restart(struct pt_regs *regs)
{
struct ksignal ksig;
if (get_signal(&ksig)) {
/* Whee! Actually deliver the signal. */
handle_signal(&ksig, regs);
[PATCH] Handle TIF_RESTORE_SIGMASK for i386 Handle TIF_RESTORE_SIGMASK as added by David Woodhouse's patch entitled: [PATCH] 2/3 Add TIF_RESTORE_SIGMASK support for arch/powerpc [PATCH] 3/3 Generic sys_rt_sigsuspend It does the following: (1) Declares TIF_RESTORE_SIGMASK for i386. (2) Invokes it over to do_signal() when TIF_RESTORE_SIGMASK is set. (3) Makes do_signal() support TIF_RESTORE_SIGMASK, using the signal mask saved in current->saved_sigmask. (4) Discards sys_rt_sigsuspend() from the arch, using the generic one instead. (5) Makes sys_sigsuspend() save the signal mask and set TIF_RESTORE_SIGMASK rather than attempting to fudge the return registers. (6) Makes sys_sigsuspend() return -ERESTARTNOHAND rather than looping intrinsically. (7) Makes setup_frame(), setup_rt_frame() and handle_signal() return 0 or -EFAULT rather than true/false to be consistent with the rest of the kernel. Due to the fact do_signal() is then only called from one place: (8) Makes do_signal() no longer have a return value is it was just being ignored; force_sig() takes care of this. (9) Discards the old sigmask argument to do_signal() as it's no longer necessary. (10) Makes do_signal() static. (11) Marks the second argument to do_notify_resume() as unused. The unused argument should remain in the middle as the arguments are passed in as registers, and the ordering is specific in entry.S Given the way do_signal() is now no longer called from sys_{,rt_}sigsuspend(), they no longer need access to the exception frame, and so can just take arguments normally. This patch depends on sys_rt_sigsuspend patch. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 18:44:00 -07:00
return;
}
/* Did we come from a system call? */
if (syscall_get_nr(current, regs) != -1) {
/* Restart the system call - no handlers present */
switch (syscall_get_error(current, regs)) {
[PATCH] Handle TIF_RESTORE_SIGMASK for i386 Handle TIF_RESTORE_SIGMASK as added by David Woodhouse's patch entitled: [PATCH] 2/3 Add TIF_RESTORE_SIGMASK support for arch/powerpc [PATCH] 3/3 Generic sys_rt_sigsuspend It does the following: (1) Declares TIF_RESTORE_SIGMASK for i386. (2) Invokes it over to do_signal() when TIF_RESTORE_SIGMASK is set. (3) Makes do_signal() support TIF_RESTORE_SIGMASK, using the signal mask saved in current->saved_sigmask. (4) Discards sys_rt_sigsuspend() from the arch, using the generic one instead. (5) Makes sys_sigsuspend() save the signal mask and set TIF_RESTORE_SIGMASK rather than attempting to fudge the return registers. (6) Makes sys_sigsuspend() return -ERESTARTNOHAND rather than looping intrinsically. (7) Makes setup_frame(), setup_rt_frame() and handle_signal() return 0 or -EFAULT rather than true/false to be consistent with the rest of the kernel. Due to the fact do_signal() is then only called from one place: (8) Makes do_signal() no longer have a return value is it was just being ignored; force_sig() takes care of this. (9) Discards the old sigmask argument to do_signal() as it's no longer necessary. (10) Makes do_signal() static. (11) Marks the second argument to do_notify_resume() as unused. The unused argument should remain in the middle as the arguments are passed in as registers, and the ordering is specific in entry.S Given the way do_signal() is now no longer called from sys_{,rt_}sigsuspend(), they no longer need access to the exception frame, and so can just take arguments normally. This patch depends on sys_rt_sigsuspend patch. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 18:44:00 -07:00
case -ERESTARTNOHAND:
case -ERESTARTSYS:
case -ERESTARTNOINTR:
regs->ax = regs->orig_ax;
regs->ip -= 2;
[PATCH] Handle TIF_RESTORE_SIGMASK for i386 Handle TIF_RESTORE_SIGMASK as added by David Woodhouse's patch entitled: [PATCH] 2/3 Add TIF_RESTORE_SIGMASK support for arch/powerpc [PATCH] 3/3 Generic sys_rt_sigsuspend It does the following: (1) Declares TIF_RESTORE_SIGMASK for i386. (2) Invokes it over to do_signal() when TIF_RESTORE_SIGMASK is set. (3) Makes do_signal() support TIF_RESTORE_SIGMASK, using the signal mask saved in current->saved_sigmask. (4) Discards sys_rt_sigsuspend() from the arch, using the generic one instead. (5) Makes sys_sigsuspend() save the signal mask and set TIF_RESTORE_SIGMASK rather than attempting to fudge the return registers. (6) Makes sys_sigsuspend() return -ERESTARTNOHAND rather than looping intrinsically. (7) Makes setup_frame(), setup_rt_frame() and handle_signal() return 0 or -EFAULT rather than true/false to be consistent with the rest of the kernel. Due to the fact do_signal() is then only called from one place: (8) Makes do_signal() no longer have a return value is it was just being ignored; force_sig() takes care of this. (9) Discards the old sigmask argument to do_signal() as it's no longer necessary. (10) Makes do_signal() static. (11) Marks the second argument to do_notify_resume() as unused. The unused argument should remain in the middle as the arguments are passed in as registers, and the ordering is specific in entry.S Given the way do_signal() is now no longer called from sys_{,rt_}sigsuspend(), they no longer need access to the exception frame, and so can just take arguments normally. This patch depends on sys_rt_sigsuspend patch. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 18:44:00 -07:00
break;
case -ERESTART_RESTARTBLOCK:
regs->ax = get_nr_restart_syscall(regs);
regs->ip -= 2;
[PATCH] Handle TIF_RESTORE_SIGMASK for i386 Handle TIF_RESTORE_SIGMASK as added by David Woodhouse's patch entitled: [PATCH] 2/3 Add TIF_RESTORE_SIGMASK support for arch/powerpc [PATCH] 3/3 Generic sys_rt_sigsuspend It does the following: (1) Declares TIF_RESTORE_SIGMASK for i386. (2) Invokes it over to do_signal() when TIF_RESTORE_SIGMASK is set. (3) Makes do_signal() support TIF_RESTORE_SIGMASK, using the signal mask saved in current->saved_sigmask. (4) Discards sys_rt_sigsuspend() from the arch, using the generic one instead. (5) Makes sys_sigsuspend() save the signal mask and set TIF_RESTORE_SIGMASK rather than attempting to fudge the return registers. (6) Makes sys_sigsuspend() return -ERESTARTNOHAND rather than looping intrinsically. (7) Makes setup_frame(), setup_rt_frame() and handle_signal() return 0 or -EFAULT rather than true/false to be consistent with the rest of the kernel. Due to the fact do_signal() is then only called from one place: (8) Makes do_signal() no longer have a return value is it was just being ignored; force_sig() takes care of this. (9) Discards the old sigmask argument to do_signal() as it's no longer necessary. (10) Makes do_signal() static. (11) Marks the second argument to do_notify_resume() as unused. The unused argument should remain in the middle as the arguments are passed in as registers, and the ordering is specific in entry.S Given the way do_signal() is now no longer called from sys_{,rt_}sigsuspend(), they no longer need access to the exception frame, and so can just take arguments normally. This patch depends on sys_rt_sigsuspend patch. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 18:44:00 -07:00
break;
}
}
[PATCH] Handle TIF_RESTORE_SIGMASK for i386 Handle TIF_RESTORE_SIGMASK as added by David Woodhouse's patch entitled: [PATCH] 2/3 Add TIF_RESTORE_SIGMASK support for arch/powerpc [PATCH] 3/3 Generic sys_rt_sigsuspend It does the following: (1) Declares TIF_RESTORE_SIGMASK for i386. (2) Invokes it over to do_signal() when TIF_RESTORE_SIGMASK is set. (3) Makes do_signal() support TIF_RESTORE_SIGMASK, using the signal mask saved in current->saved_sigmask. (4) Discards sys_rt_sigsuspend() from the arch, using the generic one instead. (5) Makes sys_sigsuspend() save the signal mask and set TIF_RESTORE_SIGMASK rather than attempting to fudge the return registers. (6) Makes sys_sigsuspend() return -ERESTARTNOHAND rather than looping intrinsically. (7) Makes setup_frame(), setup_rt_frame() and handle_signal() return 0 or -EFAULT rather than true/false to be consistent with the rest of the kernel. Due to the fact do_signal() is then only called from one place: (8) Makes do_signal() no longer have a return value is it was just being ignored; force_sig() takes care of this. (9) Discards the old sigmask argument to do_signal() as it's no longer necessary. (10) Makes do_signal() static. (11) Marks the second argument to do_notify_resume() as unused. The unused argument should remain in the middle as the arguments are passed in as registers, and the ordering is specific in entry.S Given the way do_signal() is now no longer called from sys_{,rt_}sigsuspend(), they no longer need access to the exception frame, and so can just take arguments normally. This patch depends on sys_rt_sigsuspend patch. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 18:44:00 -07:00
/*
* If there's no signal to deliver, we just put the saved sigmask
* back.
*/
restore_saved_sigmask();
}
void signal_fault(struct pt_regs *regs, void __user *frame, char *where)
{
struct task_struct *me = current;
if (show_unhandled_signals && printk_ratelimit()) {
printk("%s"
"%s[%d] bad frame in %s frame:%p ip:%lx sp:%lx orax:%lx",
task_pid_nr(current) > 1 ? KERN_INFO : KERN_EMERG,
me->comm, me->pid, where, frame,
regs->ip, regs->sp, regs->orig_ax);
print_vma_addr(KERN_CONT " in ", regs->ip);
pr_cont("\n");
}
force_sig(SIGSEGV);
}
#ifdef CONFIG_DYNAMIC_SIGFRAME
#ifdef CONFIG_STRICT_SIGALTSTACK_SIZE
static bool strict_sigaltstack_size __ro_after_init = true;
#else
static bool strict_sigaltstack_size __ro_after_init = false;
#endif
static int __init strict_sas_size(char *arg)
{
return kstrtobool(arg, &strict_sigaltstack_size) == 0;
}
__setup("strict_sas_size", strict_sas_size);
/*
* MINSIGSTKSZ is 2048 and can't be changed despite the fact that AVX512
* exceeds that size already. As such programs might never use the
* sigaltstack they just continued to work. While always checking against
* the real size would be correct, this might be considered a regression.
*
* Therefore avoid the sanity check, unless enforced by kernel
* configuration or command line option.
*
* When dynamic FPU features are supported, the check is also enforced when
* the task has permissions to use dynamic features. Tasks which have no
* permission are checked against the size of the non-dynamic feature set
* if strict checking is enabled. This avoids forcing all tasks on the
* system to allocate large sigaltstacks even if they are never going
* to use a dynamic feature. As this is serialized via sighand::siglock
* any permission request for a dynamic feature either happened already
* or will see the newly install sigaltstack size in the permission checks.
*/
bool sigaltstack_size_valid(size_t ss_size)
{
unsigned long fsize = max_frame_size - fpu_default_state_size;
u64 mask;
lockdep_assert_held(&current->sighand->siglock);
if (!fpu_state_size_dynamic() && !strict_sigaltstack_size)
return true;
fsize += current->group_leader->thread.fpu.perm.__user_state_size;
if (likely(ss_size > fsize))
return true;
if (strict_sigaltstack_size)
return ss_size > fsize;
mask = current->group_leader->thread.fpu.perm.__state_perm;
if (mask & XFEATURE_MASK_USER_DYNAMIC)
return ss_size > fsize;
return true;
}
#endif /* CONFIG_DYNAMIC_SIGFRAME */