1

x86/stackprotector: Work around strict Clang TLS symbol requirements

GCC and Clang both implement stack protector support based on Thread Local
Storage (TLS) variables, and this is used in the kernel to implement per-task
stack cookies, by copying a task's stack cookie into a per-CPU variable every
time it is scheduled in.

Both now also implement -mstack-protector-guard-symbol=, which permits the TLS
variable to be specified directly. This is useful because it will allow to
move away from using a fixed offset of 40 bytes into the per-CPU area on
x86_64, which requires a lot of special handling in the per-CPU code and the
runtime relocation code.

However, while GCC is rather lax in its implementation of this command line
option, Clang actually requires that the provided symbol name refers to a TLS
variable (i.e., one declared with __thread), although it also permits the
variable to be undeclared entirely, in which case it will use an implicit
declaration of the right type.

The upshot of this is that Clang will emit the correct references to the stack
cookie variable in most cases, e.g.,

  10d:       64 a1 00 00 00 00       mov    %fs:0x0,%eax
                     10f: R_386_32   __stack_chk_guard

However, if a non-TLS definition of the symbol in question is visible in the
same compilation unit (which amounts to the whole of vmlinux if LTO is
enabled), it will drop the per-CPU prefix and emit a load from a bogus
address.

Work around this by using a symbol name that never occurs in C code, and emit
it as an alias in the linker script.

Fixes: 3fb0fdb3bb ("x86/stackprotector/32: Make the canary into a regular percpu variable")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: stable@vger.kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1854
Link: https://lore.kernel.org/r/20241105155801.1779119-2-brgerst@gmail.com
This commit is contained in:
Ard Biesheuvel 2024-11-05 10:57:46 -05:00 committed by Borislav Petkov (AMD)
parent a5ca1dc46a
commit 577c134d31
5 changed files with 27 additions and 2 deletions

View File

@ -142,7 +142,8 @@ ifeq ($(CONFIG_X86_32),y)
ifeq ($(CONFIG_STACKPROTECTOR),y)
ifeq ($(CONFIG_SMP),y)
KBUILD_CFLAGS += -mstack-protector-guard-reg=fs -mstack-protector-guard-symbol=__stack_chk_guard
KBUILD_CFLAGS += -mstack-protector-guard-reg=fs \
-mstack-protector-guard-symbol=__ref_stack_chk_guard
else
KBUILD_CFLAGS += -mstack-protector-guard=global
endif

View File

@ -51,3 +51,19 @@ EXPORT_SYMBOL_GPL(mds_verw_sel);
.popsection
THUNK warn_thunk_thunk, __warn_thunk
#ifndef CONFIG_X86_64
/*
* Clang's implementation of TLS stack cookies requires the variable in
* question to be a TLS variable. If the variable happens to be defined as an
* ordinary variable with external linkage in the same compilation unit (which
* amounts to the whole of vmlinux with LTO enabled), Clang will drop the
* segment register prefix from the references, resulting in broken code. Work
* around this by avoiding the symbol used in -mstack-protector-guard-symbol=
* entirely in the C code, and use an alias emitted by the linker script
* instead.
*/
#ifdef CONFIG_STACKPROTECTOR
EXPORT_SYMBOL(__ref_stack_chk_guard);
#endif
#endif

View File

@ -20,3 +20,6 @@
extern void cmpxchg8b_emu(void);
#endif
#if defined(__GENKSYMS__) && defined(CONFIG_STACKPROTECTOR)
extern unsigned long __ref_stack_chk_guard;
#endif

View File

@ -2089,8 +2089,10 @@ void syscall_init(void)
#ifdef CONFIG_STACKPROTECTOR
DEFINE_PER_CPU(unsigned long, __stack_chk_guard);
#ifndef CONFIG_SMP
EXPORT_PER_CPU_SYMBOL(__stack_chk_guard);
#endif
#endif
#endif /* CONFIG_X86_64 */

View File

@ -491,6 +491,9 @@ SECTIONS
. = ASSERT((_end - LOAD_OFFSET <= KERNEL_IMAGE_SIZE),
"kernel image bigger than KERNEL_IMAGE_SIZE");
/* needed for Clang - see arch/x86/entry/entry.S */
PROVIDE(__ref_stack_chk_guard = __stack_chk_guard);
#ifdef CONFIG_X86_64
/*
* Per-cpu symbols which need to be offset from __per_cpu_load