Merge tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "28 hotfixes. 13 are cc:stable. 23 are MM.

  It is the usual shower of unrelated singletons - please see the
  individual changelogs for details"

* tag 'mm-hotfixes-stable-2024-10-17-16-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (28 commits)
  maple_tree: add regression test for spanning store bug
  maple_tree: correct tree corruption on spanning store
  mm/mglru: only clear kswapd_failures if reclaimable
  mm/swapfile: skip HugeTLB pages for unuse_vma
  selftests: mm: fix the incorrect usage() info of khugepaged
  MAINTAINERS: add Jann as memory mapping/VMA reviewer
  mm: swap: prevent possible data-race in __try_to_reclaim_swap
  mm: khugepaged: fix the incorrect statistics when collapsing large file folios
  MAINTAINERS: kasan, kcov: add bugzilla links
  mm: don't install PMD mappings when THPs are disabled by the hw/process/vma
  mm: huge_memory: add vma_thp_disabled() and thp_disabled_by_hw()
  Docs/damon/maintainer-profile: update deprecated awslabs GitHub URLs
  Docs/damon/maintainer-profile: add missing '_' suffixes for external web links
  maple_tree: check for MA_STATE_BULK on setting wr_rebalance
  mm: khugepaged: fix the arguments order in khugepaged_collapse_file trace point
  mm/damon/tests/sysfs-kunit.h: fix memory leak in damon_sysfs_test_add_targets()
  mm: remove unused stub for can_swapin_thp()
  mailmap: add an entry for Andy Chiu
  MAINTAINERS: add memory mapping/VMA co-maintainers
  fs/proc: fix build with GCC 15 due to -Werror=unterminated-string-initialization
  ...
Linus Torvalds 2024-10-17 16:33:06 -07:00
commit 4d939780b7
27 changed files with 309 additions and 133 deletions


@ -73,6 +73,8 @@ Andrey Ryabinin <ryabinin.a.a@gmail.com> <aryabinin@virtuozzo.com>
Andrzej Hajda <andrzej.hajda@intel.com> <a.hajda@samsung.com>
André Almeida <andrealmeid@igalia.com> <andrealmeid@collabora.com>
Andy Adamson <andros@citi.umich.edu>
Andy Chiu <andybnac@gmail.com> <andy.chiu@sifive.com>
Andy Chiu <andybnac@gmail.com> <taochiu@synology.com>
Andy Shevchenko <andy@kernel.org> <andy@smile.org.ua>
Andy Shevchenko <andy@kernel.org> <ext-andriy.shevchenko@nokia.com>
Anilkumar Kolli <quic_akolli@quicinc.com> <akolli@codeaurora.org>


@ -7,26 +7,26 @@ The DAMON subsystem covers the files that are listed in 'DATA ACCESS MONITOR'
section of 'MAINTAINERS' file.
The mailing lists for the subsystem are damon@lists.linux.dev and
linux-mm@kvack.org. Patches should be made against the mm-unstable `tree
<https://git.kernel.org/akpm/mm/h/mm-unstable>` whenever possible and posted to
the mailing lists.
linux-mm@kvack.org. Patches should be made against the `mm-unstable tree
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ whenever possible and posted
to the mailing lists.
SCM Trees
---------
There are multiple Linux trees for DAMON development. Patches under
development or testing are queued in `damon/next
<https://git.kernel.org/sj/h/damon/next>` by the DAMON maintainer.
<https://git.kernel.org/sj/h/damon/next>`_ by the DAMON maintainer.
Sufficiently reviewed patches will be queued in `mm-unstable
<https://git.kernel.org/akpm/mm/h/mm-unstable>` by the memory management
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ by the memory management
subsystem maintainer. After more sufficient tests, the patches will be queued
in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>` , and finally
in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>`_, and finally
pull-requested to the mainline by the memory management subsystem maintainer.
Note again the patches for mm-unstable `tree
<https://git.kernel.org/akpm/mm/h/mm-unstable>` are queued by the memory
Note again the patches for `mm-unstable tree
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ are queued by the memory
management subsystem maintainer. If the patches requires some patches in
damon/next `tree <https://git.kernel.org/sj/h/damon/next>` which not yet merged
`damon/next tree <https://git.kernel.org/sj/h/damon/next>`_ which not yet merged
in mm-unstable, please make sure the requirement is clearly specified.
Submit checklist addendum
@ -37,25 +37,25 @@ When making DAMON changes, you should do below.
- Build changes related outputs including kernel and documents.
- Ensure the builds introduce no new errors or warnings.
- Run and ensure no new failures for DAMON `selftests
<https://github.com/awslabs/damon-tests/blob/master/corr/run.sh#L49>` and
<https://github.com/damonitor/damon-tests/blob/master/corr/run.sh#L49>`_ and
`kunittests
<https://github.com/awslabs/damon-tests/blob/master/corr/tests/kunit.sh>`.
<https://github.com/damonitor/damon-tests/blob/master/corr/tests/kunit.sh>`_.
Further doing below and putting the results will be helpful.
- Run `damon-tests/corr
<https://github.com/awslabs/damon-tests/tree/master/corr>` for normal
<https://github.com/damonitor/damon-tests/tree/master/corr>`_ for normal
changes.
- Run `damon-tests/perf
<https://github.com/awslabs/damon-tests/tree/master/perf>` for performance
<https://github.com/damonitor/damon-tests/tree/master/perf>`_ for performance
changes.
Key cycle dates
---------------
Patches can be sent anytime. Key cycle dates of the `mm-unstable
<https://git.kernel.org/akpm/mm/h/mm-unstable>` and `mm-stable
<https://git.kernel.org/akpm/mm/h/mm-stable>` trees depend on the memory
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ and `mm-stable
<https://git.kernel.org/akpm/mm/h/mm-stable>`_ trees depend on the memory
management subsystem maintainer.
Review cadence
@ -72,13 +72,13 @@ Mailing tool
Like many other Linux kernel subsystems, DAMON uses the mailing lists
(damon@lists.linux.dev and linux-mm@kvack.org) as the major communication
channel. There is a simple tool called `HacKerMaiL
<https://github.com/damonitor/hackermail>` (``hkml``), which is for people who
<https://github.com/damonitor/hackermail>`_ (``hkml``), which is for people who
are not very familiar with the mailing lists based communication. The tool
could be particularly helpful for DAMON community members since it is developed
and maintained by DAMON maintainer. The tool is also officially announced to
support DAMON and general Linux kernel development workflow.
In other words, `hkml <https://github.com/damonitor/hackermail>` is a mailing
In other words, `hkml <https://github.com/damonitor/hackermail>`_ is a mailing
tool for DAMON community, which DAMON maintainer is committed to support.
Please feel free to try and report issues or feature requests for the tool to
the maintainer.
@ -98,8 +98,8 @@ slots, and attendees should reserve one of those at least 24 hours before the
time slot, by reaching out to the maintainer.
Schedules and available reservation time slots are available at the Google `doc
<https://docs.google.com/document/d/1v43Kcj3ly4CYqmAkMaZzLiM2GEnWfgdGbZAH3mi2vpM/edit?usp=sharing>`.
<https://docs.google.com/document/d/1v43Kcj3ly4CYqmAkMaZzLiM2GEnWfgdGbZAH3mi2vpM/edit?usp=sharing>`_.
There is also a public Google `calendar
<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`
<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_
that has the events. Anyone can subscribe it. DAMON maintainer will also
provide periodic reminder to the mailing list (damon@lists.linux.dev).


@ -12242,6 +12242,7 @@ R: Dmitry Vyukov <dvyukov@google.com>
R: Vincenzo Frascino <vincenzo.frascino@arm.com>
L: kasan-dev@googlegroups.com
S: Maintained
B: https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management
F: Documentation/dev-tools/kasan.rst
F: arch/*/include/asm/*kasan.h
F: arch/*/mm/kasan_init*
@ -12265,6 +12266,7 @@ R: Dmitry Vyukov <dvyukov@google.com>
R: Andrey Konovalov <andreyknvl@gmail.com>
L: kasan-dev@googlegroups.com
S: Maintained
B: https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management
F: Documentation/dev-tools/kcov.rst
F: include/linux/kcov.h
F: include/uapi/linux/kcov.h
@ -14907,9 +14909,10 @@ N: include/linux/page[-_]*
MEMORY MAPPING
M: Andrew Morton <akpm@linux-foundation.org>
R: Liam R. Howlett <Liam.Howlett@oracle.com>
M: Liam R. Howlett <Liam.Howlett@oracle.com>
M: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Vlastimil Babka <vbabka@suse.cz>
R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Jann Horn <jannh@google.com>
L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
@ -24738,9 +24741,10 @@ F: tools/testing/vsock/
VMA
M: Andrew Morton <akpm@linux-foundation.org>
R: Liam R. Howlett <Liam.Howlett@oracle.com>
M: Liam R. Howlett <Liam.Howlett@oracle.com>
M: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Vlastimil Babka <vbabka@suse.cz>
R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Jann Horn <jannh@google.com>
L: linux-mm@kvack.org
S: Maintained
W: https://www.linux-mm.org


@ -1037,7 +1037,7 @@ error_inode:
if (corrupt < 0) {
fat_fs_error(new_dir->i_sb,
"%s: Filesystem corrupted (i_pos %lld)",
__func__, sinfo.i_pos);
__func__, new_i_pos);
}
goto out;
}


@ -289,7 +289,7 @@ static int nilfs_readdir(struct file *file, struct dir_context *ctx)
* The folio is mapped and unlocked. When the caller is finished with
* the entry, it should call folio_release_kmap().
*
* On failure, returns NULL and the caller should ignore foliop.
* On failure, returns an error pointer and the caller should ignore foliop.
*/
struct nilfs_dir_entry *nilfs_find_entry(struct inode *dir,
const struct qstr *qstr, struct folio **foliop)
@ -312,22 +312,24 @@ struct nilfs_dir_entry *nilfs_find_entry(struct inode *dir,
do {
char *kaddr = nilfs_get_folio(dir, n, foliop);
if (!IS_ERR(kaddr)) {
de = (struct nilfs_dir_entry *)kaddr;
kaddr += nilfs_last_byte(dir, n) - reclen;
while ((char *) de <= kaddr) {
if (de->rec_len == 0) {
nilfs_error(dir->i_sb,
"zero-length directory entry");
folio_release_kmap(*foliop, kaddr);
goto out;
}
if (nilfs_match(namelen, name, de))
goto found;
de = nilfs_next_entry(de);
if (IS_ERR(kaddr))
return ERR_CAST(kaddr);
de = (struct nilfs_dir_entry *)kaddr;
kaddr += nilfs_last_byte(dir, n) - reclen;
while ((char *)de <= kaddr) {
if (de->rec_len == 0) {
nilfs_error(dir->i_sb,
"zero-length directory entry");
folio_release_kmap(*foliop, kaddr);
goto out;
}
folio_release_kmap(*foliop, kaddr);
if (nilfs_match(namelen, name, de))
goto found;
de = nilfs_next_entry(de);
}
folio_release_kmap(*foliop, kaddr);
if (++n >= npages)
n = 0;
/* next folio is past the blocks we've got */
@ -340,7 +342,7 @@ struct nilfs_dir_entry *nilfs_find_entry(struct inode *dir,
}
} while (n != start);
out:
return NULL;
return ERR_PTR(-ENOENT);
found:
ei->i_dir_start_lookup = n;
@ -384,18 +386,18 @@ fail:
return NULL;
}
ino_t nilfs_inode_by_name(struct inode *dir, const struct qstr *qstr)
int nilfs_inode_by_name(struct inode *dir, const struct qstr *qstr, ino_t *ino)
{
ino_t res = 0;
struct nilfs_dir_entry *de;
struct folio *folio;
de = nilfs_find_entry(dir, qstr, &folio);
if (de) {
res = le64_to_cpu(de->inode);
folio_release_kmap(folio, de);
}
return res;
if (IS_ERR(de))
return PTR_ERR(de);
*ino = le64_to_cpu(de->inode);
folio_release_kmap(folio, de);
return 0;
}
void nilfs_set_link(struct inode *dir, struct nilfs_dir_entry *de,
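
The nilfs2 hunks above convert nilfs_find_entry() to return an error pointer instead of NULL, and nilfs_inode_by_name() to return an int plus an *ino output parameter, so callers can tell a clean "not found" (-ENOENT) apart from I/O or corruption errors; the caller-side updates follow in the next hunks. As a hedged, userspace-only sketch of that error-pointer idiom (these ERR_PTR()/IS_ERR()/PTR_ERR() helpers are simplified stand-ins for the kernel's <linux/err.h>, and find_entry() is a made-up lookup, not nilfs code):

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-ins for the kernel's <linux/err.h>: an errno value is
 * encoded directly in the pointer, using the top 4095 addresses. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
    return (unsigned long)p >= (unsigned long)-MAX_ERRNO;
}

/* Hypothetical lookup: a valid pointer on success, or an encoded errno
 * (-EIO for a corrupted slot, -ENOENT for a clean "not found"). */
static const char *find_entry(const char *const *table, int n, const char *name)
{
    for (int i = 0; i < n; i++) {
        if (!table[i])
            return ERR_PTR(-EIO);
        if (strcmp(table[i], name) == 0)
            return table[i];
    }
    return ERR_PTR(-ENOENT);
}

int main(void)
{
    const char *names[] = { "alpha", "beta", "gamma" };
    const char *hit = find_entry(names, 3, "delta");

    if (IS_ERR(hit))        /* same caller pattern the hunks above adopt */
        printf("lookup failed: %s\n", strerror((int)-PTR_ERR(hit)));
    else
        printf("found %s\n", hit);
    return 0;
}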


@ -55,12 +55,20 @@ nilfs_lookup(struct inode *dir, struct dentry *dentry, unsigned int flags)
{
struct inode *inode;
ino_t ino;
int res;
if (dentry->d_name.len > NILFS_NAME_LEN)
return ERR_PTR(-ENAMETOOLONG);
ino = nilfs_inode_by_name(dir, &dentry->d_name);
inode = ino ? nilfs_iget(dir->i_sb, NILFS_I(dir)->i_root, ino) : NULL;
res = nilfs_inode_by_name(dir, &dentry->d_name, &ino);
if (res) {
if (res != -ENOENT)
return ERR_PTR(res);
inode = NULL;
} else {
inode = nilfs_iget(dir->i_sb, NILFS_I(dir)->i_root, ino);
}
return d_splice_alias(inode, dentry);
}
@ -263,10 +271,11 @@ static int nilfs_do_unlink(struct inode *dir, struct dentry *dentry)
struct folio *folio;
int err;
err = -ENOENT;
de = nilfs_find_entry(dir, &dentry->d_name, &folio);
if (!de)
if (IS_ERR(de)) {
err = PTR_ERR(de);
goto out;
}
inode = d_inode(dentry);
err = -EIO;
@ -362,10 +371,11 @@ static int nilfs_rename(struct mnt_idmap *idmap,
if (unlikely(err))
return err;
err = -ENOENT;
old_de = nilfs_find_entry(old_dir, &old_dentry->d_name, &old_folio);
if (!old_de)
if (IS_ERR(old_de)) {
err = PTR_ERR(old_de);
goto out;
}
if (S_ISDIR(old_inode->i_mode)) {
err = -EIO;
@ -382,10 +392,12 @@ static int nilfs_rename(struct mnt_idmap *idmap,
if (dir_de && !nilfs_empty_dir(new_inode))
goto out_dir;
err = -ENOENT;
new_de = nilfs_find_entry(new_dir, &new_dentry->d_name, &new_folio);
if (!new_de)
new_de = nilfs_find_entry(new_dir, &new_dentry->d_name,
&new_folio);
if (IS_ERR(new_de)) {
err = PTR_ERR(new_de);
goto out_dir;
}
nilfs_set_link(new_dir, new_de, new_folio, old_inode);
folio_release_kmap(new_folio, new_de);
nilfs_mark_inode_dirty(new_dir);
@ -440,12 +452,13 @@ out:
*/
static struct dentry *nilfs_get_parent(struct dentry *child)
{
unsigned long ino;
ino_t ino;
int res;
struct nilfs_root *root;
ino = nilfs_inode_by_name(d_inode(child), &dotdot_name);
if (!ino)
return ERR_PTR(-ENOENT);
res = nilfs_inode_by_name(d_inode(child), &dotdot_name, &ino);
if (res)
return ERR_PTR(res);
root = NILFS_I(d_inode(child))->i_root;


@ -254,7 +254,7 @@ static inline __u32 nilfs_mask_flags(umode_t mode, __u32 flags)
/* dir.c */
int nilfs_add_link(struct dentry *, struct inode *);
ino_t nilfs_inode_by_name(struct inode *, const struct qstr *);
int nilfs_inode_by_name(struct inode *dir, const struct qstr *qstr, ino_t *ino);
int nilfs_make_empty(struct inode *, struct inode *);
struct nilfs_dir_entry *nilfs_find_entry(struct inode *, const struct qstr *,
struct folio **);


@ -909,8 +909,15 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
{
/*
* Don't forget to update Documentation/ on changes.
*
* The length of the second argument of mnemonics[]
* needs to be 3 instead of previously set 2
* (i.e. from [BITS_PER_LONG][2] to [BITS_PER_LONG][3])
* to avoid spurious
* -Werror=unterminated-string-initialization warning
* with GCC 15
*/
static const char mnemonics[BITS_PER_LONG][2] = {
static const char mnemonics[BITS_PER_LONG][3] = {
/*
* In case if we meet a flag we don't know about.
*/
@ -987,11 +994,8 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
for (i = 0; i < BITS_PER_LONG; i++) {
if (!mnemonics[i][0])
continue;
if (vma->vm_flags & (1UL << i)) {
seq_putc(m, mnemonics[i][0]);
seq_putc(m, mnemonics[i][1]);
seq_putc(m, ' ');
}
if (vma->vm_flags & (1UL << i))
seq_printf(m, "%s ", mnemonics[i]);
}
seq_putc(m, '\n');
}
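
The fs/proc hunk above widens mnemonics[] from two to three bytes per entry so each two-letter flag name keeps its terminating NUL; GCC 15's -Werror=unterminated-string-initialization (named in the commit subject) rejects a string literal that exactly fills a char array, and the spare byte also lets the print loop use a single seq_printf("%s ", ...) instead of three seq_putc() calls. A minimal standalone reproduction, assuming only what the commit subject says about the GCC 15 warning (the array names here are illustrative):

/* Build with GCC 15, e.g. gcc -Wunterminated-string-initialization demo.c
 * (or -Werror=... as in the commit subject); older compilers accept both
 * arrays without complaint. */
#include <stdio.h>

/* Flagged by GCC 15: "rd" needs 3 bytes ('r', 'd', '\0') but only 2 fit,
 * so the stored string is silently left without a terminator. */
static const char bad[2][2]  = { "rd", "wr" };

/* The fix mirrored above: one spare byte per entry keeps the NUL,
 * which is what makes plain "%s" printing safe. */
static const char good[2][3] = { "rd", "wr" };

int main(void)
{
    (void)bad;    /* referenced so only its initializer draws a diagnostic */

    for (int i = 0; i < 2; i++)
        printf("%s ", good[i]);    /* mirrors the seq_printf("%s ", ...) change */
    putchar('\n');
    return 0;
}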


@ -322,6 +322,24 @@ struct thpsize {
(transparent_hugepage_flags & \
(1<<TRANSPARENT_HUGEPAGE_USE_ZERO_PAGE_FLAG))
static inline bool vma_thp_disabled(struct vm_area_struct *vma,
unsigned long vm_flags)
{
/*
* Explicitly disabled through madvise or prctl, or some
* architectures may disable THP for some mappings, for
* example, s390 kvm.
*/
return (vm_flags & VM_NOHUGEPAGE) ||
test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags);
}
static inline bool thp_disabled_by_hw(void)
{
/* If the hardware/firmware marked hugepage support disabled. */
return transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED);
}
unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags);
unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,


@ -41,7 +41,11 @@
PCPU_MIN_ALLOC_SHIFT)
#ifdef CONFIG_RANDOM_KMALLOC_CACHES
#define PERCPU_DYNAMIC_SIZE_SHIFT 12
# if defined(CONFIG_LOCKDEP) && !defined(CONFIG_PAGE_SIZE_4KB)
# define PERCPU_DYNAMIC_SIZE_SHIFT 13
# else
# define PERCPU_DYNAMIC_SIZE_SHIFT 12
#endif /* LOCKDEP and PAGE_SIZE > 4KiB */
#else
#define PERCPU_DYNAMIC_SIZE_SHIFT 10
#endif


@ -208,7 +208,7 @@ TRACE_EVENT(mm_khugepaged_scan_file,
TRACE_EVENT(mm_khugepaged_collapse_file,
TP_PROTO(struct mm_struct *mm, struct folio *new_folio, pgoff_t index,
bool is_shmem, unsigned long addr, struct file *file,
unsigned long addr, bool is_shmem, struct file *file,
int nr, int result),
TP_ARGS(mm, new_folio, index, addr, is_shmem, file, nr, result),
TP_STRUCT__entry(
@ -233,7 +233,7 @@ TRACE_EVENT(mm_khugepaged_collapse_file,
__entry->result = result;
),
TP_printk("mm=%p, hpage_pfn=0x%lx, index=%ld, addr=%ld, is_shmem=%d, filename=%s, nr=%d, result=%s",
TP_printk("mm=%p, hpage_pfn=0x%lx, index=%ld, addr=%lx, is_shmem=%d, filename=%s, nr=%d, result=%s",
__entry->mm,
__entry->hpfn,
__entry->index,


@ -228,6 +228,9 @@ bool codetag_unload_module(struct module *mod)
if (!mod)
return true;
/* await any module's kfree_rcu() operations to complete */
kvfree_rcu_barrier();
mutex_lock(&codetag_lock);
list_for_each_entry(cttype, &codetag_types, link) {
struct codetag_module *found = NULL;


@ -2196,6 +2196,8 @@ static inline void mas_node_or_none(struct ma_state *mas,
/*
* mas_wr_node_walk() - Find the correct offset for the index in the @mas.
* If @mas->index cannot be found within the containing
* node, we traverse to the last entry in the node.
* @wr_mas: The maple write state
*
* Uses mas_slot_locked() and does not need to worry about dead nodes.
@ -3532,7 +3534,7 @@ static bool mas_wr_walk(struct ma_wr_state *wr_mas)
return true;
}
static bool mas_wr_walk_index(struct ma_wr_state *wr_mas)
static void mas_wr_walk_index(struct ma_wr_state *wr_mas)
{
struct ma_state *mas = wr_mas->mas;
@ -3541,11 +3543,9 @@ static bool mas_wr_walk_index(struct ma_wr_state *wr_mas)
wr_mas->content = mas_slot_locked(mas, wr_mas->slots,
mas->offset);
if (ma_is_leaf(wr_mas->type))
return true;
return;
mas_wr_walk_traverse(wr_mas);
}
return true;
}
/*
* mas_extend_spanning_null() - Extend a store of a %NULL to include surrounding %NULLs.
@ -3765,8 +3765,8 @@ static noinline void mas_wr_spanning_store(struct ma_wr_state *wr_mas)
memset(&b_node, 0, sizeof(struct maple_big_node));
/* Copy l_mas and store the value in b_node. */
mas_store_b_node(&l_wr_mas, &b_node, l_mas.end);
/* Copy r_mas into b_node. */
if (r_mas.offset <= r_mas.end)
/* Copy r_mas into b_node if there is anything to copy. */
if (r_mas.max > r_mas.last)
mas_mab_cp(&r_mas, r_mas.offset, r_mas.end,
&b_node, b_node.b_end + 1);
else
@ -4218,7 +4218,7 @@ static inline void mas_wr_store_type(struct ma_wr_state *wr_mas)
/* Potential spanning rebalance collapsing a node */
if (new_end < mt_min_slots[wr_mas->type]) {
if (!mte_is_root(mas->node)) {
if (!mte_is_root(mas->node) && !(mas->mas_flags & MA_STATE_BULK)) {
mas->store_type = wr_rebalance;
return;
}


@ -67,6 +67,7 @@ static void damon_sysfs_test_add_targets(struct kunit *test)
damon_destroy_ctx(ctx);
kfree(sysfs_targets->targets_arr);
kfree(sysfs_targets);
kfree(sysfs_target->regions);
kfree(sysfs_target);
}


@ -109,18 +109,7 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
if (!vma->vm_mm) /* vdso */
return 0;
/*
* Explicitly disabled through madvise or prctl, or some
* architectures may disable THP for some mappings, for
* example, s390 kvm.
* */
if ((vm_flags & VM_NOHUGEPAGE) ||
test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags))
return 0;
/*
* If the hardware/firmware marked hugepage support disabled.
*/
if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
if (thp_disabled_by_hw() || vma_thp_disabled(vma, vm_flags))
return 0;
/* khugepaged doesn't collapse DAX vma, but page fault is fine. */


@ -2227,7 +2227,7 @@ rollback:
folio_put(new_folio);
out:
VM_BUG_ON(!list_empty(&pagelist));
trace_mm_khugepaged_collapse_file(mm, new_folio, index, is_shmem, addr, file, HPAGE_PMD_NR, result);
trace_mm_khugepaged_collapse_file(mm, new_folio, index, addr, is_shmem, file, HPAGE_PMD_NR, result);
return result;
}
@ -2252,7 +2252,7 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
continue;
if (xa_is_value(folio)) {
++swap;
swap += 1 << xas_get_order(&xas);
if (cc->is_khugepaged &&
swap > khugepaged_max_ptes_swap) {
result = SCAN_EXCEED_SWAP_PTE;
@ -2299,7 +2299,7 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
* is just too costly...
*/
present++;
present += folio_nr_pages(folio);
if (need_resched()) {
xas_pause(&xas);


@ -4181,11 +4181,6 @@ fallback:
return __alloc_swap_folio(vmf);
}
#else /* !CONFIG_TRANSPARENT_HUGEPAGE */
static inline bool can_swapin_thp(struct vm_fault *vmf, pte_t *ptep, int nr_pages)
{
return false;
}
static struct folio *alloc_swap_folio(struct vm_fault *vmf)
{
return __alloc_swap_folio(vmf);
@ -4925,6 +4920,15 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
pmd_t entry;
vm_fault_t ret = VM_FAULT_FALLBACK;
/*
* It is too late to allocate a small folio, we already have a large
* folio in the pagecache: especially s390 KVM cannot tolerate any
* PMD mappings, but PTE-mapped THP are fine. So let's simply refuse any
* PMD mappings if THPs are disabled.
*/
if (thp_disabled_by_hw() || vma_thp_disabled(vma, vma->vm_flags))
return ret;
if (!thp_vma_suitable_order(vma, haddr, PMD_ORDER))
return ret;


@ -1371,7 +1371,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
struct maple_tree mt_detach;
unsigned long end = addr + len;
bool writable_file_mapping = false;
int error = -ENOMEM;
int error;
VMA_ITERATOR(vmi, mm, addr);
VMG_STATE(vmg, mm, &vmi, addr, end, vm_flags, pgoff);
@ -1396,8 +1396,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
}
/* Check against address space limit. */
if (!may_expand_vm(mm, vm_flags, pglen - vms.nr_pages))
if (!may_expand_vm(mm, vm_flags, pglen - vms.nr_pages)) {
error = -ENOMEM;
goto abort_munmap;
}
/*
* Private writable mapping: check memory availability
@ -1405,8 +1407,11 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
if (accountable_mapping(file, vm_flags)) {
charged = pglen;
charged -= vms.nr_accounted;
if (charged && security_vm_enough_memory_mm(mm, charged))
goto abort_munmap;
if (charged) {
error = security_vm_enough_memory_mm(mm, charged);
if (error)
goto abort_munmap;
}
vms.nr_accounted = 0;
vm_flags |= VM_ACCOUNT;
@ -1422,8 +1427,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
* not unmapped, but the maps are removed from the list.
*/
vma = vm_area_alloc(mm);
if (!vma)
if (!vma) {
error = -ENOMEM;
goto unacct_error;
}
vma_iter_config(&vmi, addr, end);
vma_set_range(vma, addr, end, pgoff);
@ -1453,9 +1460,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
* Expansion is handled above, merging is handled below.
* Drivers should not alter the address of the VMA.
*/
error = -EINVAL;
if (WARN_ON((addr != vma->vm_start)))
if (WARN_ON((addr != vma->vm_start))) {
error = -EINVAL;
goto close_and_free_vma;
}
vma_iter_config(&vmi, addr, end);
/*
@ -1500,13 +1508,15 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
}
/* Allow architectures to sanity-check the vm_flags */
error = -EINVAL;
if (!arch_validate_flags(vma->vm_flags))
if (!arch_validate_flags(vma->vm_flags)) {
error = -EINVAL;
goto close_and_free_vma;
}
error = -ENOMEM;
if (vma_iter_prealloc(&vmi, vma))
if (vma_iter_prealloc(&vmi, vma)) {
error = -ENOMEM;
goto close_and_free_vma;
}
/* Lock the VMA since it is modified after insertion into VMA tree */
vma_start_write(vma);
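
The mmap_region() hunks above drop the preassigned `int error = -ENOMEM;` and the free-floating `error = -EINVAL;` lines, and instead set the error code inside the branch that detects the failure, right before the goto. A hedged, generic illustration of that cleanup-label style in plain C (nothing here is mm code; setup() and its buffers are invented):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Each failure site assigns its own error code immediately before jumping
 * to the unwind labels, instead of relying on a value preassigned far
 * above the check -- the style the hunks above move toward. */
static int setup(size_t n)
{
    int err;
    char *buf = NULL, *scratch = NULL;

    buf = malloc(n);
    if (!buf) {
        err = -ENOMEM;        /* set at the point of failure */
        goto out;
    }

    scratch = malloc(n);
    if (!scratch) {
        err = -ENOMEM;
        goto out_free_buf;
    }

    /* ... use buf and scratch ... */
    err = 0;
    free(scratch);
out_free_buf:
    free(buf);
out:
    return err;
}

int main(void)
{
    printf("setup: %d\n", setup(64));
    return 0;
}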


@ -238,6 +238,7 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
{
spinlock_t *old_ptl, *new_ptl;
struct mm_struct *mm = vma->vm_mm;
bool res = false;
pmd_t pmd;
if (!arch_supports_page_table_move())
@ -277,19 +278,25 @@ static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
if (new_ptl != old_ptl)
spin_lock_nested(new_ptl, SINGLE_DEPTH_NESTING);
/* Clear the pmd */
pmd = *old_pmd;
/* Racing with collapse? */
if (unlikely(!pmd_present(pmd) || pmd_leaf(pmd)))
goto out_unlock;
/* Clear the pmd */
pmd_clear(old_pmd);
res = true;
VM_BUG_ON(!pmd_none(*new_pmd));
pmd_populate(mm, new_pmd, pmd_pgtable(pmd));
flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
out_unlock:
if (new_ptl != old_ptl)
spin_unlock(new_ptl);
spin_unlock(old_ptl);
return true;
return res;
}
#else
static inline bool move_normal_pmd(struct vm_area_struct *vma,


@ -1664,12 +1664,7 @@ unsigned long shmem_allowable_huge_orders(struct inode *inode,
loff_t i_size;
int order;
if (vma && ((vm_flags & VM_NOHUGEPAGE) ||
test_bit(MMF_DISABLE_THP, &vma->vm_mm->flags)))
return 0;
/* If the hardware/firmware marked hugepage support disabled. */
if (transparent_hugepage_flags & (1 << TRANSPARENT_HUGEPAGE_UNSUPPORTED))
if (thp_disabled_by_hw() || (vma && vma_thp_disabled(vma, vm_flags)))
return 0;
global_huge = shmem_huge_global_enabled(inode, index, write_end,


@ -194,9 +194,6 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
if (IS_ERR(folio))
return 0;
/* offset could point to the middle of a large folio */
entry = folio->swap;
offset = swp_offset(entry);
nr_pages = folio_nr_pages(folio);
ret = -nr_pages;
@ -210,6 +207,10 @@ static int __try_to_reclaim_swap(struct swap_info_struct *si,
if (!folio_trylock(folio))
goto out;
/* offset could point to the middle of a large folio */
entry = folio->swap;
offset = swp_offset(entry);
need_reclaim = ((flags & TTRS_ANYWAY) ||
((flags & TTRS_UNMAPPED) && !folio_mapped(folio)) ||
((flags & TTRS_FULL) && mem_cgroup_swap_full(folio)));
@ -2312,7 +2313,7 @@ static int unuse_mm(struct mm_struct *mm, unsigned int type)
mmap_read_lock(mm);
for_each_vma(vmi, vma) {
if (vma->anon_vma) {
if (vma->anon_vma && !is_vm_hugetlb_page(vma)) {
ret = unuse_vma(vma, type);
if (ret)
break;
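
The __try_to_reclaim_swap() hunk above defers reading folio->swap (and the offset derived from it) until after folio_trylock() has succeeded, so the field is only sampled while the folio is locked and cannot change underneath the reader. A generic, hedged sketch of that "take the lock first, then read the shared field" rule using plain pthreads (struct slot and its field are invented for illustration, not kernel API):

#include <pthread.h>
#include <stdio.h>

struct slot {
    pthread_mutex_t lock;
    long entry;                    /* stands in for the field read under the lock */
};

/* Racy shape (what the fix removes): the field is sampled before the lock,
 * so a concurrent writer can change it between the read and the locked work. */
static long peek_racy(struct slot *s)
{
    long snapshot = s->entry;      /* data race */

    pthread_mutex_lock(&s->lock);
    /* ... work that assumes `snapshot` still matches s->entry ... */
    pthread_mutex_unlock(&s->lock);
    return snapshot;
}

/* Fixed shape, mirroring the hunk above: lock first, then read, so the value
 * is stable for as long as it is used. */
static long peek_locked(struct slot *s)
{
    long snapshot;

    pthread_mutex_lock(&s->lock);
    snapshot = s->entry;           /* read only while holding the lock */
    pthread_mutex_unlock(&s->lock);
    return snapshot;
}

int main(void)
{
    struct slot s = { PTHREAD_MUTEX_INITIALIZER, 42 };

    printf("%ld %ld\n", peek_racy(&s), peek_locked(&s));
    return 0;
}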


@ -4963,8 +4963,8 @@ static void lru_gen_shrink_node(struct pglist_data *pgdat, struct scan_control *
blk_finish_plug(&plug);
done:
/* kswapd should never fail */
pgdat->kswapd_failures = 0;
if (sc->nr_reclaimed > reclaimed)
pgdat->kswapd_failures = 0;
}
/******************************************************************************


@ -36317,6 +36317,28 @@ static inline int check_vma_modification(struct maple_tree *mt)
return 0;
}
/*
* test to check that bulk stores do not use wr_rebalance as the store
* type.
*/
static inline void check_bulk_rebalance(struct maple_tree *mt)
{
MA_STATE(mas, mt, ULONG_MAX, ULONG_MAX);
int max = 10;
build_full_tree(mt, 0, 2);
/* erase every entry in the tree */
do {
/* set up bulk store mode */
mas_expected_entries(&mas, max);
mas_erase(&mas);
MT_BUG_ON(mt, mas.store_type == wr_rebalance);
} while (mas_prev(&mas, 0) != NULL);
mas_destroy(&mas);
}
void farmer_tests(void)
{
struct maple_node *node;
@ -36328,6 +36350,10 @@ void farmer_tests(void)
check_vma_modification(&tree);
mtree_destroy(&tree);
mt_init(&tree);
check_bulk_rebalance(&tree);
mtree_destroy(&tree);
tree.ma_root = xa_mk_value(0);
mt_dump(&tree, mt_dump_dec);
@ -36406,9 +36432,93 @@ void farmer_tests(void)
check_nomem(&tree);
}
static unsigned long get_last_index(struct ma_state *mas)
{
struct maple_node *node = mas_mn(mas);
enum maple_type mt = mte_node_type(mas->node);
unsigned long *pivots = ma_pivots(node, mt);
unsigned long last_index = mas_data_end(mas);
BUG_ON(last_index == 0);
return pivots[last_index - 1] + 1;
}
/*
* Assert that we handle spanning stores that consume the entirety of the right
* leaf node correctly.
*/
static void test_spanning_store_regression(void)
{
unsigned long from = 0, to = 0;
DEFINE_MTREE(tree);
MA_STATE(mas, &tree, 0, 0);
/*
* Build a 3-level tree. We require a parent node below the root node
* and 2 leaf nodes under it, so we can span the entirety of the right
* hand node.
*/
build_full_tree(&tree, 0, 3);
/* Descend into position at depth 2. */
mas_reset(&mas);
mas_start(&mas);
mas_descend(&mas);
mas_descend(&mas);
/*
* We need to establish a tree like the below.
*
* Then we can try a store in [from, to] which results in a spanned
* store across nodes B and C, with the maple state at the time of the
* write being such that only the subtree at A and below is considered.
*
* Height
* 0 Root Node
* / \
* pivot = to / \ pivot = ULONG_MAX
* / \
* 1 A [-----] ...
* / \
* pivot = from / \ pivot = to
* / \
* 2 (LEAVES) B [-----] [-----] C
* ^--- Last pivot to.
*/
while (true) {
unsigned long tmp = get_last_index(&mas);
if (mas_next_sibling(&mas)) {
from = tmp;
to = mas.max;
} else {
break;
}
}
BUG_ON(from == 0 && to == 0);
/* Perform the store. */
mas_set_range(&mas, from, to);
mas_store_gfp(&mas, xa_mk_value(0xdead), GFP_KERNEL);
/* If the regression occurs, the validation will fail. */
mt_validate(&tree);
/* Cleanup. */
__mt_destroy(&tree);
}
static void regression_tests(void)
{
test_spanning_store_regression();
}
void maple_tree_tests(void)
{
#if !defined(BENCH)
regression_tests();
farmer_tests();
#endif
maple_tree_seed();


@ -1091,7 +1091,7 @@ static void usage(void)
fprintf(stderr, "\n\t\"file,all\" mem_type requires kernel built with\n");
fprintf(stderr, "\tCONFIG_READ_ONLY_THP_FOR_FS=y\n");
fprintf(stderr, "\n\tif [dir] is a (sub)directory of a tmpfs mount, tmpfs must be\n");
fprintf(stderr, "\tmounted with huge=madvise option for khugepaged tests to work\n");
fprintf(stderr, "\tmounted with huge=advise option for khugepaged tests to work\n");
fprintf(stderr, "\n\tSupported Options:\n");
fprintf(stderr, "\t\t-h: This help message.\n");
fprintf(stderr, "\t\t-s: mTHP size, expressed as page order.\n");


@ -18,7 +18,7 @@ bool test_uffdio_wp = true;
unsigned long long *count_verify;
uffd_test_ops_t *uffd_test_ops;
uffd_test_case_ops_t *uffd_test_case_ops;
atomic_bool ready_for_fork;
pthread_barrier_t ready_for_fork;
static int uffd_mem_fd_create(off_t mem_size, bool hugetlb)
{
@ -519,7 +519,8 @@ void *uffd_poll_thread(void *arg)
pollfd[1].fd = pipefd[cpu*2];
pollfd[1].events = POLLIN;
ready_for_fork = true;
/* Ready for parent thread to fork */
pthread_barrier_wait(&ready_for_fork);
for (;;) {
ret = poll(pollfd, 2, -1);


@ -33,7 +33,6 @@
#include <inttypes.h>
#include <stdint.h>
#include <sys/random.h>
#include <stdatomic.h>
#include "../kselftest.h"
#include "vm_util.h"
@ -105,7 +104,7 @@ extern bool map_shared;
extern bool test_uffdio_wp;
extern unsigned long long *count_verify;
extern volatile bool test_uffdio_copy_eexist;
extern atomic_bool ready_for_fork;
extern pthread_barrier_t ready_for_fork;
extern uffd_test_ops_t anon_uffd_test_ops;
extern uffd_test_ops_t shmem_uffd_test_ops;


@ -241,6 +241,9 @@ static void *fork_event_consumer(void *data)
fork_event_args *args = data;
struct uffd_msg msg = { 0 };
/* Ready for parent thread to fork */
pthread_barrier_wait(&ready_for_fork);
/* Read until a full msg received */
while (uffd_read_msg(args->parent_uffd, &msg));
@ -308,8 +311,12 @@ static int pagemap_test_fork(int uffd, bool with_event, bool test_pin)
/* Prepare a thread to resolve EVENT_FORK */
if (with_event) {
pthread_barrier_init(&ready_for_fork, NULL, 2);
if (pthread_create(&thread, NULL, fork_event_consumer, &args))
err("pthread_create()");
/* Wait for child thread to start before forking */
pthread_barrier_wait(&ready_for_fork);
pthread_barrier_destroy(&ready_for_fork);
}
child = fork();
@ -774,7 +781,7 @@ static void uffd_sigbus_test_common(bool wp)
char c;
struct uffd_args args = { 0 };
ready_for_fork = false;
pthread_barrier_init(&ready_for_fork, NULL, 2);
fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
@ -791,8 +798,9 @@ static void uffd_sigbus_test_common(bool wp)
if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args))
err("uffd_poll_thread create");
while (!ready_for_fork)
; /* Wait for the poll_thread to start executing before forking */
/* Wait for child thread to start before forking */
pthread_barrier_wait(&ready_for_fork);
pthread_barrier_destroy(&ready_for_fork);
pid = fork();
if (pid < 0)
@ -833,7 +841,7 @@ static void uffd_events_test_common(bool wp)
char c;
struct uffd_args args = { 0 };
ready_for_fork = false;
pthread_barrier_init(&ready_for_fork, NULL, 2);
fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK);
if (uffd_register(uffd, area_dst, nr_pages * page_size,
@ -844,8 +852,9 @@ static void uffd_events_test_common(bool wp)
if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args))
err("uffd_poll_thread create");
while (!ready_for_fork)
; /* Wait for the poll_thread to start executing before forking */
/* Wait for child thread to start before forking */
pthread_barrier_wait(&ready_for_fork);
pthread_barrier_destroy(&ready_for_fork);
pid = fork();
if (pid < 0)
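
The userfaultfd selftest hunks above replace the ready_for_fork atomic flag and its `while (!ready_for_fork) ;` spin with a two-party pthread_barrier_t: the poll/fork-event thread waits on the barrier once its setup is done, the parent waits before calling fork(), then destroys the barrier. A minimal standalone sketch of that handshake (the worker body is a placeholder, not the selftest's poll loop):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_barrier_t ready_for_fork;

static void *worker(void *arg)
{
    (void)arg;
    /* ... set up poll fds, register userfaultfd handlers, etc. ... */
    pthread_barrier_wait(&ready_for_fork);    /* signal: setup finished */
    /* ... the real thread would enter its poll loop here ... */
    return NULL;
}

int main(void)
{
    pthread_t tid;
    pid_t pid;

    /* Two parties: the worker and the thread that wants to fork(). */
    pthread_barrier_init(&ready_for_fork, NULL, 2);

    if (pthread_create(&tid, NULL, worker, NULL))
        return 1;

    /* Blocks until the worker has reached the barrier; this replaces the
     * old busy-wait on an atomic_bool. */
    pthread_barrier_wait(&ready_for_fork);
    pthread_barrier_destroy(&ready_for_fork);

    pid = fork();
    if (pid < 0)
        return 1;
    if (pid == 0)
        _exit(0);                             /* child: nothing to do here */
    printf("forked child %d after the worker was ready\n", (int)pid);

    pthread_join(tid, NULL);
    return 0;
}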