License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner, Kate Stewart, and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                              | # files
-----------------------------------------------------|--------
GPL-2.0                                              |   11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". The results were:
SPDX license identifier                              | # files
-----------------------------------------------------|--------
GPL-2.0 WITH Linux-syscall-note                      |     930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                              | # files
-----------------------------------------------------|--------
GPL-2.0 WITH Linux-syscall-note                      |     270
GPL-2.0+ WITH Linux-syscall-note                     |     169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)  |      21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)  |      17
LGPL-2.1+ WITH Linux-syscall-note                    |      15
GPL-1.0+ WITH Linux-syscall-note                     |      14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) |       5
LGPL-2.0+ WITH Linux-syscall-note                    |       4
LGPL-2.1 WITH Linux-syscall-note                     |       3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)           |       3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)          |       1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
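The scanner heuristics listed above can be sketched as a small decision
function. This is only an illustration of the three main outcomes (no
license found, scanners agree, scanners disagree); the function name
`conclude_license` and the use of None for "no license traces" are
assumptions made for the example, and the extra */uapi/* rule for files
that already carry a GPL family license is deliberately left out.

```python
from typing import Optional, Tuple

GPL2 = "GPL-2.0"
GPL2_UAPI = "GPL-2.0 WITH Linux-syscall-note"


def conclude_license(scancode: Optional[str], windriver: Optional[str],
                     is_uapi: bool) -> Tuple[str, Optional[str]]:
    """Return (outcome, license) for one file.

    Each scanner result is an SPDX expression string, or None when that
    scanner found no license traces (a convention assumed for this sketch).
    """
    if scancode is None and windriver is None:
        # No license information at all: the top level COPYING license applies.
        return ("default", GPL2_UAPI if is_uapi else GPL2)
    if scancode == windriver:
        # Both scanners agree: that becomes the concluded license.
        return ("agreed", scancode)
    # One-sided or conflicting detection: the file needs manual inspection.
    return ("manual-review", None)
```

Files that come back as "manual-review" correspond to the cases that were
inspected by hand or confirmed with lawyers.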
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were any new insights.
The Windriver scanner is based on an older version of FOSSology in part,
so they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to
have copy/paste license identifier errors, and have been fixed to reflect
the correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
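The tagging step described above can be sketched roughly as follows. This
is not the script Thomas and Greg used; it is a minimal illustration that
assumes a two-column CSV (path, license) and works on in-memory text. The
names `spdx_line`, `tag_source` and `tag_from_csv` are hypothetical, and
treating every file that is not .c or .h as a '#'-comment file is a
simplification.

```python
import csv
import io
import os


def spdx_line(path: str, license_id: str) -> str:
    """Build the SPDX tag line in the comment style the file type expects."""
    tag = f"SPDX-License-Identifier: {license_id}"
    ext = os.path.splitext(path)[1]
    if ext == ".c":
        return f"// {tag}"        # C source files use a C++ style comment
    if ext == ".h":
        return f"/* {tag} */"     # headers use a C style comment
    return f"# {tag}"             # assumed default: Makefiles, scripts


def tag_source(path: str, license_id: str, text: str) -> str:
    """Put the tag on the first line, or on the second if line 1 is a shebang."""
    line = spdx_line(path, license_id) + "\n"
    lines = text.splitlines(keepends=True)
    if lines and lines[0].startswith("#!"):
        return lines[0] + line + "".join(lines[1:])
    return line + text


def tag_from_csv(csv_text: str, files: dict) -> dict:
    """Apply every (path, license) row of the CSV to the given file contents."""
    tagged = {}
    for path, license_id in csv.reader(io.StringIO(csv_text)):
        tagged[path] = tag_source(path, license_id, files[path])
    return tagged
```

For this Makefile the sketch would produce the
"# SPDX-License-Identifier: GPL-2.0" line that the commit added.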
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 07:07:57 -07:00
|
|
|
# SPDX-License-Identifier: GPL-2.0
|
2008-01-30 05:32:31 -07:00
|
|
|
#
|
|
|
|
# Makefile for x86 specific library files.
|
|
|
|
#
|
|
|
|
|
kernel: add kcov code coverage
kcov provides code coverage collection for coverage-guided fuzzing
(randomized testing). Coverage-guided fuzzing is a testing technique
that uses coverage feedback to determine new interesting inputs to a
system. A notable user-space example is AFL
(http://lcamtuf.coredump.cx/afl/). However, this technique is not
widely used for kernel testing due to missing compiler and kernel
support.
kcov does not aim to collect as much coverage as possible. It aims to
collect more or less stable coverage that is a function of syscall inputs.
To achieve this goal it does not collect coverage in soft/hard
interrupts, and instrumentation of some inherently non-deterministic or
non-interesting parts of the kernel is disabled (e.g. scheduler, locking).
Currently there is a single coverage collection mode (tracing), but the
API anticipates additional collection modes. Initially I also
implemented a second mode which exposes coverage in a fixed-size hash
table of counters (what Quentin used in his original patch). I've
dropped the second mode for simplicity.
This patch adds the necessary support on the kernel side. The
complementary compiler support was added in gcc revision 231296.
We've used this support to build syzkaller system call fuzzer, which has
found 90 kernel bugs in just 2 months:
https://github.com/google/syzkaller/wiki/Found-Bugs
We've also found 30+ bugs in our internal systems with syzkaller.
Another (yet unexplored) direction where kcov coverage would greatly
help is more traditional "blob mutation". For example, mounting a
random blob as a filesystem, or receiving a random blob over wire.
Why not gcov? A typical fuzzing loop looks as follows: (1) reset
coverage, (2) execute a bit of code, (3) collect coverage, repeat. A
typical coverage can be just a dozen basic blocks (e.g. an invalid
input). In such a context gcov becomes prohibitively expensive as
reset/collect coverage steps depend on the total number of basic
blocks/edges in the program (in the case of the kernel it is about 2M).
The cost of kcov depends only on the number of executed basic
blocks/edges. On top of that, the kernel requires per-thread coverage
because there are always background threads and unrelated processes that
also produce coverage. With inlined gcov instrumentation per-thread
coverage is not possible.
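The cost argument above can be made concrete with a toy model. The code
below is not kcov or gcov; it only counts bookkeeping steps for one round
of the reset/execute/collect loop under two assumed designs: one counter
per basic block in the program (gcov-like) versus an append-only trace of
executed blocks (kcov-like). Both function names are hypothetical.

```python
def counter_style_round(total_blocks, executed):
    """One fuzzing round with a counter per basic block in the program."""
    counters = [0] * total_blocks
    steps = 0
    for i in range(total_blocks):      # (1) reset walks every counter
        counters[i] = 0
        steps += 1
    for pc in executed:                # (2) execute: bump the hit counters
        counters[pc] += 1
    covered = []
    for i in range(total_blocks):      # (3) collect walks every counter again
        steps += 1
        if counters[i]:
            covered.append(i)
    return covered, steps


def trace_style_round(executed):
    """One fuzzing round with an append-only trace of executed blocks."""
    trace = []                         # (1) reset: start from an empty trace
    steps = 0
    for pc in executed:                # (2) execute: record each block as it runs
        trace.append(pc)
        steps += 1
    return trace, steps                # (3) collect: the trace is the result
```

In this model a run that executes a dozen blocks costs a dozen steps in
the trace design, while the counter design pays twice the total number of
blocks no matter how little was executed.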
kcov exposes kernel PCs and control flow to user-space which is
insecure. But debugfs should not be mapped as user accessible.
Based on a patch by Quentin Casasnovas.
[akpm@linux-foundation.org: make task_struct.kcov_mode have type `enum kcov_mode']
[akpm@linux-foundation.org: unbreak allmodconfig]
[akpm@linux-foundation.org: follow x86 Makefile layout standards]
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tavis Ormandy <taviso@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@google.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: David Drysdale <drysdale@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-22 14:27:30 -07:00
|
|
|
# Produces uninteresting flaky coverage.
|
|
|
|
KCOV_INSTRUMENT_delay.o := n
|
|
|
|
|
2019-11-14 11:03:03 -07:00
|
|
|
# KCSAN uses udelay for introducing watchpoint delay; avoid recursion.
|
|
|
|
KCSAN_SANITIZE_delay.o := n
|
2020-02-14 14:10:35 -07:00
|
|
|
ifdef CONFIG_KCSAN
|
|
|
|
# In case KCSAN+lockdep+ftrace are enabled, disable ftrace for delay.o to avoid
|
|
|
|
# lockdep -> [other libs] -> KCSAN -> udelay -> ftrace -> lockdep recursion.
|
|
|
|
CFLAGS_REMOVE_delay.o = $(CC_FLAGS_FTRACE)
|
|
|
|
endif
|
2019-11-14 11:03:03 -07:00
|
|
|
|
x86: Instruction decoder API
Add x86 instruction decoder to arch-specific libraries. This decoder
can decode x86 instructions used in kernel into prefix, opcode, modrm,
sib, displacement and immediates. This can also show the length of
instructions.
This version introduces instruction attributes for decoding
instructions.
The instruction attribute tables are generated from the opcode map file
(x86-opcode-map.txt) by the generator script (gen-insn-attr-x86.awk).
Currently, the opcode maps are based on opcode maps in Intel(R) 64 and
IA-32 Architectures Software Developers Manual Vol.2: Appendix.A,
and consist of the following two types of opcode tables.
1-byte/2-bytes/3-bytes opcodes, which have 256 elements, are
written as below:
  Table: table-name
  Referrer: escaped-name
  opcode: mnemonic|GrpXXX [operand1[,operand2...]] [(extra1)[,(extra2)...]] [| 2nd-mnemonic ...]
  (or)
  opcode: escape # escaped-name
  EndTable
Group opcodes, which have 8 elements, are written as below:
  GrpTable: GrpXXX
  reg: mnemonic [operand1[,operand2...]] [(extra1)[,(extra2)...]] [| 2nd-mnemonic ...]
  EndTable
These opcode maps include a few SSE and FP opcodes (for setup), because
those opcodes are used in the kernel.
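To make the table format above more concrete, here is a minimal parsing
sketch. It is not the awk generator; it only splits Table and GrpTable
blocks into entries, and it does not interpret operands or extras. The
function name `parse_opcode_map` and the SAMPLE text are made up for
illustration and are not an excerpt of x86-opcode-map.txt.

```python
def parse_opcode_map(text):
    """Parse Table/GrpTable blocks into {name: {"referrer", "entries"}}."""
    tables = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(("Table:", "GrpTable:")):
            name = line.split(":", 1)[1].strip()
            current = {"referrer": None, "entries": {}}
            tables[name] = current
        elif line.startswith("Referrer:"):
            current["referrer"] = line.split(":", 1)[1].strip() or None
        elif line == "EndTable":
            current = None
        elif current is not None:
            # "opcode: ..." in a Table, "reg: ..." in a GrpTable
            key, rest = line.split(":", 1)
            body, _, note = rest.partition("#")
            if body.strip() == "escape":
                entry = {"escape": note.strip()}
            else:
                # alternatives are separated by "|"
                entry = {"alternatives": [alt.strip() for alt in body.split("|")]}
            current["entries"][int(key, 16)] = entry
    return tables


# Hypothetical sample in the format described above.
SAMPLE = """\
Table: one byte opcode
Referrer:
0f: escape # 2-byte escape
90: NOP | PAUSE (F3)
EndTable

GrpTable: Grp1
0: ADD
1: OR
EndTable
"""
```

Calling parse_opcode_map(SAMPLE) yields one 256-slot style table keyed by
opcode byte and one group table keyed by the reg field, which is the
distinction the two block types encode.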
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Jim Keniston <jkenisto@us.ibm.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jason Baron <jbaron@redhat.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Przemysław Pawełczyk <przemyslaw@pawelczyk.it>
Cc: Roland McGrath <roland@redhat.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
LKML-Reference: <20090813203413.31965.49709.stgit@localhost.localdomain>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-08-13 13:34:13 -07:00
|
|
|
inat_tables_script = $(srctree)/arch/x86/tools/gen-insn-attr-x86.awk
|
|
|
|
inat_tables_maps = $(srctree)/arch/x86/lib/x86-opcode-map.txt
|
|
|
|
quiet_cmd_inat_tables = GEN $@
|
2018-12-31 01:24:08 -07:00
|
|
|
cmd_inat_tables = $(AWK) -f $(inat_tables_script) $(inat_tables_maps) > $@
|
2009-08-13 13:34:13 -07:00
|
|
|
|
|
|
|
$(obj)/inat-tables.c: $(inat_tables_script) $(inat_tables_maps)
|
|
|
|
$(call cmd,inat_tables)
|
|
|
|
|
|
|
|
$(obj)/inat.o: $(obj)/inat-tables.c
|
|
|
|
|
|
|
|
clean-files := inat-tables.c
|
|
|
|
|
2010-01-22 08:01:03 -07:00
|
|
|
obj-$(CONFIG_SMP) += msr-smp.o cache-smp.o
|
2008-01-30 05:32:31 -07:00
|
|
|
|
2015-11-23 03:12:21 -07:00
|
|
|
lib-y := delay.o misc.o cmdline.o cpu.o
|
2011-06-07 02:49:55 -07:00
|
|
|
lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o
|
2008-01-30 05:32:31 -07:00
|
|
|
lib-y += memcpy_$(BITS).o
|
x86: Add support for 0x22/0x23 port I/O configuration space
Define macros and accessors for the configuration space addressed
indirectly with an index register and a data register at the port I/O
locations of 0x22 and 0x23 respectively.
This space is defined by the Intel MultiProcessor Specification for the
IMCR register used to switch between the PIC and the APIC mode[1], by
Cyrix processors for their configuration[2][3], and also some chipsets.
Given the lack of atomicity with the indirect addressing a spinlock is
required to protect accesses, although for Cyrix processors it is enough
if accesses are executed with interrupts locally disabled, because the
registers are local to the accessing CPU, and IMCR is only ever poked at
by the BSP and early enough for interrupts not to have been configured
yet. Therefore existing code does not have to change or use the new
spinlock, and it does not.
Put the spinlock in a library file then, so that it does not get pulled
unnecessarily for configurations that do not refer to it.
Convert Cyrix accessors to wrappers so as to retain the brevity and
clarity of the `getCx86' and `setCx86' calls.
References:
[1] "MultiProcessor Specification", Version 1.4, Intel Corporation,
Order Number: 242016-006, May 1997, Section 3.6.2.1 "PIC Mode", pp.
3-7, 3-8
[2] "5x86 Microprocessor", Cyrix Corporation, Order Number: 94192-00,
July 1995, Section 2.3.2.4 "Configuration Registers", p. 2-23
[3] "6x86 Processor", Cyrix Corporation, Order Number: 94175-01, March
1996, Section 2.4.4 "6x86 Configuration Registers", p. 2-23
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2107182353140.9461@angie.orcam.me.uk
2021-07-19 20:27:49 -07:00
|
|
|
lib-y += pc-conf-reg.o
|
x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
In reaction to a proposal to introduce a memcpy_mcsafe_fast()
implementation Linus points out that memcpy_mcsafe() is poorly named
relative to communicating the scope of the interface. Specifically what
addresses are valid to pass as source, destination, and what faults /
exceptions are handled.
Of particular concern is that even though x86 might be able to handle
the semantics of copy_mc_to_user() with its common copy_user_generic()
implementation other archs likely need / want an explicit path for this
case:
On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote:
> >
> > However now I see that copy_user_generic() works for the wrong reason.
> > It works because the exception on the source address due to poison
> > looks no different than a write fault on the user address to the
> > caller, it's still just a short copy. So it makes copy_to_user() work
> > for the wrong reason relative to the name.
>
> Right.
>
> And it won't work that way on other architectures. On x86, we have a
> generic function that can take faults on either side, and we use it
> for both cases (and for the "in_user" case too), but that's an
> artifact of the architecture oddity.
>
> In fact, it's probably wrong even on x86 - because it can hide bugs -
> but writing those things is painful enough that everybody prefers
> having just one function.
Replace a single top-level memcpy_mcsafe() with either
copy_mc_to_user(), or copy_mc_to_kernel().
Introduce an x86 copy_mc_fragile() name as the rename for the
low-level x86 implementation formerly named memcpy_mcsafe(). It is used
as the slow / careful backend that is supplanted by a fast
copy_mc_generic() in a follow-on patch.
One side-effect of this reorganization is that separating copy_mc_64.S
to its own file means that perf no longer needs to track dependencies
for its memcpy_64.S benchmarks.
[ bp: Massage a bit. ]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: <stable@vger.kernel.org>
Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com
Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com
2020-10-05 20:40:16 -07:00
|
|
|
lib-$(CONFIG_ARCH_HAS_COPY_MC) += copy_mc.o copy_mc_64.o
|
2017-10-27 13:25:36 -07:00
|
|
|
lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
|
2016-06-21 17:46:58 -07:00
|
|
|
lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
|
2018-01-12 10:55:03 -07:00
|
|
|
lib-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
|
2023-11-21 09:07:32 -07:00
|
|
|
lib-$(CONFIG_MITIGATION_RETPOLINE) += retpoline.o
|
2008-01-30 05:32:31 -07:00
|
|
|
|
2016-05-30 03:56:27 -07:00
|
|
|
obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o
|
x86: re-introduce non-generic memcpy_{to,from}io
This has been broken forever, and nobody ever really noticed because
it's purely a performance issue.
Long long ago, in commit 6175ddf06b61 ("x86: Clean up mem*io functions")
Brian Gerst simplified the memory copies to and from iomem, since on
x86, the instructions to access iomem are exactly the same as the
regular instructions.
That is technically true, and things worked, and nobody said anything.
Besides, back then the regular memcpy was pretty simple and worked fine.
Nobody noticed except for David Laight, that is. David has been testing
a TLP monitor he was writing for an FPGA, and has been occasionally
complaining about how memcpy_toio() writes things one byte at a time.
Which is completely unacceptable from a performance standpoint, even if
it happens to technically work.
The reason it's writing one byte at a time is because while it's
technically true that accesses to iomem are the same as accesses to
regular memory on x86, the _granularity_ (and ordering) of accesses
matter to iomem in ways that they don't matter to regular cached memory.
In particular, when ERMS is set, we default to using "rep movsb" for
larger memory copies. That is indeed perfectly fine for real memory,
since the whole point is that the CPU is going to do cacheline
optimizations and executes the memory copy efficiently for cached
memory.
With iomem? Not so much. With iomem, "rep movsb" will indeed work, but
it will copy things one byte at a time. Slowly and ponderously.
Now, originally, back in 2010 when commit 6175ddf06b61 was done, we
didn't use ERMS, and this was much less noticeable.
Our normal memcpy() was simpler in other ways too.
Because in fact, it's not just about using the string instructions. Our
memcpy() these days does things like "read and write overlapping values"
to handle the last bytes of the copy. Again, for normal memory,
overlapping accesses aren't an issue. For iomem? It can be.
So this re-introduces the specialized memcpy_toio(), memcpy_fromio() and
memset_io() functions. It doesn't particularly optimize them, but it
tries to at least not be horrid, or do overlapping accesses. In fact,
this uses the existing __inline_memcpy() function that we still had
lying around that uses our very traditional "rep movsl" loop followed by
movsw/movsb for the final bytes.
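The access pattern of the traditional loop described above ("rep movsl"
followed by movsw/movsb for the final bytes) can be sketched as a plan of
(offset, width) pairs. This is only a model of which accesses are issued,
not the kernel's implementation; the name `access_plan` is hypothetical.

```python
def access_plan(n):
    """Return (offset, width) pairs for copying n bytes without overlap."""
    plan = []
    offset = 0
    for _ in range(n // 4):      # "rep movsl": as many 4-byte units as fit
        plan.append((offset, 4))
        offset += 4
    if n & 2:                    # movsw: one 2-byte access if bit 1 is set
        plan.append((offset, 2))
        offset += 2
    if n & 1:                    # movsb: one final byte if bit 0 is set
        plan.append((offset, 1))
        offset += 1
    return plan
```

For an 11-byte copy this gives two 4-byte accesses, one 2-byte access and
one 1-byte access, each touching a distinct range, which is the property
that matters for iomem.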
Somebody may decide to try to improve on it, but if we've gone almost a
decade with only one person really ever noticing and complaining, maybe
it's not worth worrying about further, once it's not _completely_ broken?
Reported-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04 18:52:49 -07:00
|
|
|
obj-y += iomem.o
|
2008-01-30 05:32:31 -07:00
|
|
|
|
2007-10-11 02:13:35 -07:00
|
|
|
ifeq ($(CONFIG_X86_32),y)
|
2009-07-03 08:28:57 -07:00
|
|
|
obj-y += atomic64_32.o
|
2010-02-24 02:54:25 -07:00
|
|
|
lib-y += atomic64_cx8_32.o
|
2008-01-30 05:32:31 -07:00
|
|
|
lib-y += checksum_32.o
|
|
|
|
lib-y += strstr_32.o
|
2011-07-19 04:59:51 -07:00
|
|
|
lib-y += string_32.o
|
x86/mem: Move memmove to out of line assembler
When building ARCH=i386 with CONFIG_LTO_CLANG_FULL=y, it's possible
(depending on additional configs which I have not been able to isolate)
to observe a failure during register allocation:
error: inline assembly requires more registers than available
when memmove is inlined into tcp_v4_fill_cb() or tcp_v6_fill_cb().
memmove is quite large and probably shouldn't be inlined due to size
alone. A noinline function attribute would be the simplest fix, but
there's a few things that stand out with the current definition:
In addition to having complex constraints that can't always be resolved,
the clobber list seems to be missing %bx. By using numbered operands
rather than symbolic operands, the constraints are quite obnoxious to
refactor.
Having a large function be 99% inline asm is a code smell that this
function should simply be written in stand-alone out-of-line assembler.
Moving this to out of line assembler guarantees that the
compiler cannot inline calls to memmove.
This has been done previously for 64b:
commit 9599ec0471de ("x86-64, mem: Convert memmove() to assembly file
and fix return value bug")
That gives the opportunity for other cleanups like fixing the
inconsistent use of tabs vs spaces and instruction suffixes, and the
label 3 appearing twice. Symbolic operands, local labels, and
additional comments would provide this code with a fresh coat of paint.
Finally, add a test that tickles the `rep movsl` implementation to test
it for correctness, since it has implicit operands.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Suggested-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/all/20221018172155.287409-1-ndesaulniers%40google.com
2022-10-18 10:21:55 -07:00
|
|
|
lib-y += memmove_32.o
|
2023-05-31 06:08:39 -07:00
|
|
|
lib-y += cmpxchg8b_emu.o
|
2009-09-30 22:30:38 -07:00
|
|
|
ifneq ($(CONFIG_X86_CMPXCHG64),y)
|
2023-05-31 06:08:39 -07:00
|
|
|
lib-y += atomic64_386_32.o
|
2009-09-30 22:30:38 -07:00
|
|
|
endif
|
2007-10-11 02:13:35 -07:00
|
|
|
else
|
2022-09-15 08:04:11 -07:00
|
|
|
ifneq ($(CONFIG_GENERIC_CSUM),y)
|
2008-01-30 05:32:31 -07:00
|
|
|
lib-y += csum-partial_64.o csum-copy_64.o csum-wrappers_64.o
|
2022-09-15 08:04:11 -07:00
|
|
|
endif
|
2014-09-21 11:42:32 -07:00
|
|
|
lib-y += clear_page_64.o copy_page_64.o
|
2008-01-30 05:32:31 -07:00
|
|
|
lib-y += memmove_64.o memset_64.o
|
x86: rewrite '__copy_user_nocache' function
I didn't really want to do this, but as part of all the other changes to
the user copy loops, I've been looking at this horror.
I tried to clean it up multiple times, but every time I just found more
problems, and the way it's written, it's just too hard to fix them.
For example, the code is written to do quad-word alignment, and will use
regular byte accesses to get to that point. That's fairly simple, but
it means that any initial 8-byte alignment will be done with cached
copies.
However, the code then is very careful to do any 4-byte _tail_ accesses
using an uncached 4-byte write, and that was claimed to be relevant in
commit a82eee742452 ("x86/uaccess/64: Handle the caching of 4-byte
nocache copies properly in __copy_user_nocache()").
So if you do a 4-byte copy using that function, it carefully uses a
4-byte 'movnti' for the destination. But if you were to do a 12-byte
copy that is 4-byte aligned, it would _not_ do a 4-byte 'movnti'
followed by an 8-byte 'movnti' to keep it all uncached.
Instead, it would align the destination to 8 bytes using a
byte-at-a-time loop, and then do an 8-byte 'movnti' for the final 8
bytes.
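The two cases above can be traced with a small model. This follows only
the behaviour described in this message, not the actual assembly, and it
assumes that the alignment loop is skipped for copies shorter than 8
bytes (otherwise the 4-byte case could not end up as a single 'movnti').
The names `old_style_plan` and `all_uncached_plan` are hypothetical.

```python
def old_style_plan(dst, size):
    """Model of the described old behaviour: (kind, address, width) steps."""
    steps = []
    if size >= 8:                          # assumption: short copies skip alignment
        while dst % 8 and size:
            steps.append(("cached", dst, 1))   # byte-at-a-time alignment loop
            dst += 1
            size -= 1
    while size >= 8:
        steps.append(("movnti", dst, 8))
        dst += 8
        size -= 8
    if size >= 4 and dst % 4 == 0:
        steps.append(("movnti", dst, 4))       # the carefully uncached 4-byte tail
        dst += 4
        size -= 4
    while size:
        steps.append(("cached", dst, 1))
        dst += 1
        size -= 1
    return steps


def all_uncached_plan(dst, size):
    """The sequence the message says did NOT happen, for 4-byte aligned input."""
    if dst % 4 or size % 4:
        raise ValueError("sketch only covers 4-byte aligned copies")
    steps = []
    if dst % 8 and size >= 4:
        steps.append(("movnti", dst, 4))
        dst += 4
        size -= 4
    while size >= 8:
        steps.append(("movnti", dst, 8))
        dst += 8
        size -= 8
    if size >= 4:
        steps.append(("movnti", dst, 4))
    return steps
```

With dst = 0x1004 and size = 12, old_style_plan() issues four cached byte
writes and then one 8-byte 'movnti', while all_uncached_plan() issues a
4-byte 'movnti' followed by an 8-byte 'movnti'.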
The main caller that cares is __copy_user_flushcache(), which knows
about this insanity, and has odd cases for it all. But I just can't
deal with looking at this kind of "it does one case right, and another
related case entirely wrong".
And the code really wasn't fixable without hard drugs, which I try to
avoid.
So instead, rewrite it in a form that hopefully not only gets this
right, but is a bit more maintainable. Knock wood.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-04-20 15:13:50 -07:00
|
|
|
lib-y += copy_user_64.o copy_user_uncached_64.o
|
2011-02-28 03:02:24 -07:00
|
|
|
lib-y += cmpxchg16b_emu.o
|
2007-10-11 02:13:35 -07:00
|
|
|
endif
|