1
Commit Graph

1310905 Commits

Author SHA1 Message Date
Stefan Wahren
1f26339b2e net: vertexcom: mse102x: Fix possible double free of TX skb
The scope of the TX skb is wider than just mse102x_tx_frame_spi(),
so in case the TX skb room needs to be expanded, we should free the
the temporary skb instead of the original skb. Otherwise the original
TX skb pointer would be freed again in mse102x_tx_work(), which leads
to crashes:

  Internal error: Oops: 0000000096000004 [#2] PREEMPT SMP
  CPU: 0 PID: 712 Comm: kworker/0:1 Tainted: G      D            6.6.23
  Hardware name: chargebyte Charge SOM DC-ONE (DT)
  Workqueue: events mse102x_tx_work [mse102x]
  pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : skb_release_data+0xb8/0x1d8
  lr : skb_release_data+0x1ac/0x1d8
  sp : ffff8000819a3cc0
  x29: ffff8000819a3cc0 x28: ffff0000046daa60 x27: ffff0000057f2dc0
  x26: ffff000005386c00 x25: 0000000000000002 x24: 00000000ffffffff
  x23: 0000000000000000 x22: 0000000000000001 x21: ffff0000057f2e50
  x20: 0000000000000006 x19: 0000000000000000 x18: ffff00003fdacfcc
  x17: e69ad452d0c49def x16: 84a005feff870102 x15: 0000000000000000
  x14: 000000000000024a x13: 0000000000000002 x12: 0000000000000000
  x11: 0000000000000400 x10: 0000000000000930 x9 : ffff00003fd913e8
  x8 : fffffc00001bc008
  x7 : 0000000000000000 x6 : 0000000000000008
  x5 : ffff00003fd91340 x4 : 0000000000000000 x3 : 0000000000000009
  x2 : 00000000fffffffe x1 : 0000000000000000 x0 : 0000000000000000
  Call trace:
   skb_release_data+0xb8/0x1d8
   kfree_skb_reason+0x48/0xb0
   mse102x_tx_work+0x164/0x35c [mse102x]
   process_one_work+0x138/0x260
   worker_thread+0x32c/0x438
   kthread+0x118/0x11c
   ret_from_fork+0x10/0x20
  Code: aa1303e0 97fffab6 72001c1f 54000141 (f9400660)

Cc: stable@vger.kernel.org
Fixes: 2f207cbf0d ("net: vertexcom: Add MSE102x SPI support")
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://patch.msgid.link/20241105163101.33216-1-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-06 17:56:50 -08:00
Linus Torvalds
ff7afaeca1 More NFS Client Bugfixes for Linux 6.12-rc
Stable Fixes:
 * Fix KMSAN warning in decode_getfattr_attrs()
 
 Other Bugfixes:
 * Handle -ENOTCONN in xs_tcp_setup_socked()
 * NFSv3: only use NFS timeout for MOUNT when protocols are compatible
 * Fix attribute delegation behavior on exclusive create and a/mtime changes
 * Fix localio to cope with racing nfs_local_probe()
 * Avoid i_lock contention in fs_clear_invalid_mapping()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAmcr1HUACgkQ18tUv7Cl
 QOtdYBAA0YohWDHflcHPbltJu0UyCyDDtowvpVacSDJwZwVEXnLQRTTqrdUnWVxx
 Bc2Ae8tGsfcwo10yZ6LUIPjcyEqLQeYvKoKv2Awf0j7eubjRYZrQVypIKtmy8aC2
 H5ETCyrbIubE06jX8EPO8LFxQ+T6nGD7kC8qJZL8z/aNVXGA2nRRCi7AzdE4o6Ht
 0t6fC+W5vxJ4hQHYKb59nGvREMwpKSLg2U4wo1lyFvkDxEJ06DobGOKEtD333cI8
 Mou/1UlSZ6RzgfwJNIPMMpCepIp2spaDeet0XVN+zqzxg55Jmk7LqpxP5pswTjLb
 WsxErV9ZRXtwutCCf+IDoMCv/YS4g4ZG7CLKXQ4felKJVYIuiS4z0n659xRqLyyi
 nW71vrRUdOBE3rCXUW6crZYwX/fHDvl6bsq9/h7cy2ZPnbGkVvXx+LIm0dJRenfb
 MaxVM3CyrMnzL3UUk/caK/rVCOHrDD5q/dAtSNfizMWnqoX+gXby3ho6Zwn0Wj89
 NiUZJIRI/s4V1WzMw4g+Daz7LUUwGblODTtphH2nnKRDfTiYXeT/r/waU6zUOVcS
 7Jd285DF/tkQp2SJ3nvsM/ni7TD2UuG2BsKA3Urlht9i32lwyENeS3nNcx6aHo3i
 blNpD+9mp3vZfWWZNVvLM/JldcIqEvd30+P6GWwS/Td8Zz4PYIM=
 =9mwu
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-6.12-3' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:
 "These are mostly fixes that came up during the nfs bakeathon the other
  week.

  Stable Fixes:
   - Fix KMSAN warning in decode_getfattr_attrs()

  Other Bugfixes:
   - Handle -ENOTCONN in xs_tcp_setup_socked()
   - NFSv3: only use NFS timeout for MOUNT when protocols are compatible
   - Fix attribute delegation behavior on exclusive create and a/mtime
     changes
   - Fix localio to cope with racing nfs_local_probe()
   - Avoid i_lock contention in fs_clear_invalid_mapping()"

* tag 'nfs-for-6.12-3' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  nfs: avoid i_lock contention in nfs_clear_invalid_mapping
  nfs_common: fix localio to cope with racing nfs_local_probe()
  NFS: Further fixes to attribute delegation a/mtime changes
  NFS: Fix attribute delegation behaviour on exclusive create
  nfs: Fix KMSAN warning in decode_getfattr_attrs()
  NFSv3: only use NFS timeout for MOUNT when protocols are compatible
  sunrpc: handle -ENOTCONN in xs_tcp_setup_socket()
2024-11-06 13:09:22 -10:00
Linus Torvalds
f43b156921 Hi,
Couple of fixes for keys and trusted keys. For me it id not make
 (common) sense to separate them into separate them into separate keys
 and trusted keys PR's.
 
 BR, Jarkko
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRE6pSOnaBC00OEHEIaerohdGur0gUCZyu2GQAKCRAaerohdGur
 0nIUAP4xhqSTeJb1oYX5wvHtDhXGH4EEI/PQtq/zS5yEoAYmqQD/TuW6UU9I4vmd
 U+v6gEL0kLojM43PumHBuzQdF8RHkgA=
 =4C6c
 -----END PGP SIGNATURE-----

Merge tag 'keys-next-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull keys fixes from Jarkko Sakkinen:
 "A couple of fixes for keys and trusted keys"

* tag 'keys-next-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  KEYS: trusted: dcp: fix NULL dereference in AEAD crypto operation
  security/keys: fix slab-out-of-bounds in key_task_permission
2024-11-06 09:29:15 -10:00
Linus Torvalds
7758b20611 Fix tracefs mount options:
The commit 78ff640819 ("vfs: Convert tracefs to use the new mount API")
 broke the gid setting when set by fstab or other mount utility.
 It is ignored when it is set. Fix the code so that it recognises the
 option again and will honor the settings on mount at boot up.
 
 Update the internal documentation and create a selftest to make sure
 it doesn't break again in the future.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZyuidRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qsgQAQDuV0x4RLpCrrowDS/ITQw/eb/WjhR7
 lhkXVROLN6RK6wD+JWmbaCP82q2S4A2Vx0Rjc72gUMmTzDb1HQflhQiLhwU=
 =0dZF
 -----END PGP SIGNATURE-----

Merge tag 'tracefs-v6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracefs fixes from Steven Rostedt:
 "Fix tracefs mount options.

  Commit 78ff640819 ("vfs: Convert tracefs to use the new mount API")
  broke the gid setting when set by fstab or other mount utility. It is
  ignored when it is set. Fix the code so that it recognises the option
  again and will honor the settings on mount at boot up.

  Update the internal documentation and create a selftest to make sure
  it doesn't break again in the future"

* tag 'tracefs-v6.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/selftests: Add tracefs mount options test
  tracing: Document tracefs gid mount option
  tracing: Fix tracefs mount options
2024-11-06 08:08:39 -10:00
Linus Torvalds
b226d01983 platform-drivers-x86 for v6.12-4
Highlights:
  - AMD PMF: Add new hardware id
  - AMD PMC: Fix crash when loaded with enable_stb=1 on devices without STB
  - Dell: Add Alienware hwid for Alienware systems with Dell WMI interface
  - thinkpad_acpi: Quirk to fix wrong fan speed readings on L480
  - New hotkey mappings for Dell and Lenovo laptops
 
 The following is an automated git shortlog grouped by driver:
 
 dell-smbios-base:
  -  Extends support to Alienware products
 
 dell-wmi-base:
  -  Handle META key Lock/Unlock events
 
 ideapad-laptop:
  -  add missing Ideapad Pro 5 fn keys
 
 platform/x86/amd/pmc:
  -  Detect when STB is not available
 
 platform/x86/amd/pmf:
  -  Add SMU metrics table support for 1Ah family 60h model
 
 thinkpad_acpi:
  -  Fix for ThinkPad's with ECFW showing incorrect fan speed
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmcrWl8UHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9yMAAf/S1jCHG5TzIUN4ChncXGEC6vkVJ9t
 C9eqErGqx0Dm3VUa5e6geZzpDoD37AaLP7K/C6lAxxYSpYd/VYYIcqMihtIcYsTp
 SRuWWT5qDq3xhGXd1bbe5PU92BQ7vbkoOxvWbRCpHNjaI5xKaRUnxycCAqJaqo5E
 835tx3urhiMRucT0YUtmJoiclfGkIvBfjHFemC7dCAOmpqWAJxe4MFb0HXTyw6ja
 GIcTmzxZpLNGmiiB4DWRs0EyjCfcFE/xJeny14/j36XfMu82vv8dAHIwA0FiaA3e
 Jq0eoWqRfZfaTmkcYY64C2QX1WYEc+ANHmUhOGNyY+Zyt1EuoaYWYsLgXg==
 =1cZL
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v6.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:

 - AMD PMF: Add new hardware id

 - AMD PMC: Fix crash when loaded with enable_stb=1 on devices without STB

 - Dell: Add Alienware hwid for Alienware systems with Dell WMI interface

 - thinkpad_acpi: Quirk to fix wrong fan speed readings on L480

 - New hotkey mappings for Dell and Lenovo laptops

* tag 'platform-drivers-x86-v6.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: thinkpad_acpi: Fix for ThinkPad's with ECFW showing incorrect fan speed
  platform/x86: ideapad-laptop: add missing Ideapad Pro 5 fn keys
  platform/x86: dell-wmi-base: Handle META key Lock/Unlock events
  platform/x86: dell-smbios-base: Extends support to Alienware products
  platform/x86/amd/pmc: Detect when STB is not available
  platform/x86/amd/pmf: Add SMU metrics table support for 1Ah family 60h model
2024-11-06 08:03:19 -10:00
Linus Torvalds
9e23acf024 - fix memory safety bugs in dm-cache
- fix restart/panic logic in dm-verity
 
 - fix 32-bit unsigned integer overflow in dm-unstriped
 
 - fix a device mapper crash if blk_alloc_disk fails
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRnH8MwLyZDhyYfesYTAyx9YGnhbQUCZyj5MhQcbXBhdG9ja2FA
 cmVkaGF0LmNvbQAKCRATAyx9YGnhbSpeAQCcyhjrFxvFQuTJm/nv65Txwqw3+nvu
 i45pJ1DbK1awEQD/W0xUhhWrHfXwnb2dHV/mowLSnlou7uUh/JUx9q24OQg=
 =5qjV
 -----END PGP SIGNATURE-----

Merge tag 'for-6.12/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mikulas Patocka:

 - fix memory safety bugs in dm-cache

 - fix restart/panic logic in dm-verity

 - fix 32-bit unsigned integer overflow in dm-unstriped

 - fix a device mapper crash if blk_alloc_disk fails

* tag 'for-6.12/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache: fix potential out-of-bounds access on the first resume
  dm cache: optimize dirty bit checking with find_next_bit when resizing
  dm cache: fix out-of-bounds access to the dirty bitset when resizing
  dm cache: fix flushing uninitialized delayed_work on cache_ctr error
  dm cache: correct the number of origin blocks to match the target length
  dm-verity: don't crash if panic_on_corruption is not selected
  dm-unstriped: cast an operand to sector_t to prevent potential uint32_t overflow
  dm: fix a crash if blk_alloc_disk fails
2024-11-06 07:56:47 -10:00
Linus Torvalds
0951fede4e hid-for-linus-20241105
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIUAwUAZyoK7KZi849r7WBJAQLkfQ/4j0xsITMeQyb7/rLVuYXcYwY6AANuxF26
 fn8At/VQoBWv4SQamyo1iWvc79mhG0I8hk4CojegpodsxurFYpjgSsDQbZGJk/ug
 bbBkIOchI1j2R564QFunDY4EtxlqJI4FaTKvIEmDKGlQ09saGieKskrPh1Gl/VXF
 M0t3EHgzV8snO5i7rD5/QlYp8fDpeqej5JQcaG68uP6L2PYxIw5mNqADQmXWT1v8
 D3D65rbABBxkSe0Z+nqgzle9tQgiK3HuAWJmGcXPOk5q+KlvZK4wSws7XgJiIJEB
 E6O7LVBRvd6izZdq0aNBmde4ggqN78ORNzfTrzVCVbDmEZvLnpBi1It3RStkZjKq
 59rVbo5WUoGUzQc/hr9coYkW0eBPVLGHHWzx6DhlK3stxzPi+YRoJdZpzqIpQMXN
 4wuykQcPRYDE94uTFhVL3w7WGzG6IGLdufZJW49C/PknXaxue1fakd0mc1VrHfhE
 g6Oy4u4CuACCowUGruRPCUV88czsIS+apRwNYDvjxZBUmVvbXUjuDhlnzIkvO1wb
 ZCk7ZpeedIu6rTa38OI7wKHyhunJtVjd520bW07bUchGqWp+MGcCexM0+oQA5owF
 vTCOx9vOnRPnyng9+9LNPvKdNzfrO4q/5sDxKaJkrg0mpQlFOssq7AUJ2BwmRHBF
 /zC+TjdCnw==
 =q7n/
 -----END PGP SIGNATURE-----

Merge tag 'hid-for-linus-20241105' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID fix from Jiri Kosina:

 - report buffer sanitization fix for HID core (Jiri Kosina)

* tag 'hid-for-linus-20241105' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: core: zero-initialize the report buffer
2024-11-06 07:49:54 -10:00
Vishnu Sankar
1be765b292 platform/x86: thinkpad_acpi: Fix for ThinkPad's with ECFW showing incorrect fan speed
Fix for Thinkpad's with ECFW showing incorrect fan speed. Some models use
decimal instead of hexadecimal for the speed stored in the EC registers.
For example the rpm register will have 0x4200 instead of 0x1068, here
the actual RPM is "4200" in decimal.

Add a quirk to handle this.

Signed-off-by: Vishnu Sankar <vishnuocv@gmail.com>
Suggested-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20241105235505.8493-1-vishnuocv@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-11-06 12:48:42 +01:00
Jakub Kicinski
26a2bebd2c Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-11-04 (ice, idpf, i40e, e1000e)

For ice:

Marcin adjusts ordering of calls in ice_eswitch_detach() to resolve a
use after free issue.

Mateusz corrects variable type for Flow Director queue to fix issues
related to drop actions.

For idpf:

Pavan resolves issues related to reset on idpf; avoiding use of freed
vport and correctly unrolling the mailbox task.

For i40e:

Aleksandr fixes a race condition involving addition and deletion of VF
MAC filters.

For e1000e:

Vitaly reverts workaround for Meteor Lake causing regressions in power
management flows.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  e1000e: Remove Meteor Lake SMBUS workarounds
  i40e: fix race condition by adding filter's intermediate sync state
  idpf: fix idpf_vc_core_init error path
  idpf: avoid vport access in idpf_get_link_ksettings
  ice: change q_index variable type to s16 to store -1 value
  ice: Fix use after free during unload with ports in bridge
====================

Link: https://patch.msgid.link/20241104223639.2801097-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 18:05:51 -08:00
Jakub Kicinski
3f2f406a35 Merge branch 'mptcp-pm-fix-wrong-perm-and-sock-kfree'
Matthieu Baerts says:

====================
mptcp: pm: fix wrong perm and sock kfree

Two small fixes related to the MPTCP path-manager:

- Patch 1: remove an accidental restriction to admin users to list MPTCP
  endpoints. A regression from v6.7.

- Patch 2: correctly use sock_kfree_s() instead of kfree() in the
  userspace PM. A fix for another fix introduced in v6.4 and
  backportable up to v5.19.
====================

Link: https://patch.msgid.link/20241104-net-mptcp-misc-6-12-v1-0-c13f2ff1656f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:51:11 -08:00
Geliang Tang
99635c91fb mptcp: use sock_kfree_s instead of kfree
The local address entries on userspace_pm_local_addr_list are allocated
by sock_kmalloc().

It's then required to use sock_kfree_s() instead of kfree() to free
these entries in order to adjust the allocated size on the sk side.

Fixes: 24430f8bf5 ("mptcp: add address into userspace pm list")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241104-net-mptcp-misc-6-12-v1-2-c13f2ff1656f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:51:09 -08:00
Matthieu Baerts (NGI0)
cfbbd48598 mptcp: no admin perm to list endpoints
During the switch to YNL, the command to list all endpoints has been
accidentally restricted to users with admin permissions.

It looks like there are no reasons to have this restriction which makes
it harder for a user to quickly check if the endpoint list has been
correctly populated by an automated tool. Best to go back to the
previous behaviour then.

mptcp_pm_gen.c has been modified using ynl-gen-c.py:

   $ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
     --spec Documentation/netlink/specs/mptcp_pm.yaml --source \
     -o net/mptcp/mptcp_pm_gen.c

The header file doesn't need to be regenerated.

Fixes: 1d0507f468 ("net: mptcp: convert netlink from small_ops to ops")
Cc: stable@vger.kernel.org
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241104-net-mptcp-misc-6-12-v1-1-c13f2ff1656f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:51:08 -08:00
Diogo Silva
256748d548 net: phy: ti: add PHY_RST_AFTER_CLK_EN flag
DP83848	datasheet (section 4.7.2) indicates that the reset pin should be
toggled after the clocks are running. Add the PHY_RST_AFTER_CLK_EN to
make sure that this indication is respected.

In my experience not having this flag enabled would lead to, on some
boots, the wrong MII mode being selected if the PHY was initialized on
the bootloader and was receiving data during Linux boot.

Signed-off-by: Diogo Silva <diogompaissilva@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 34e45ad937 ("net: phy: dp83848: Add TI DP83848 Ethernet PHY")
Link: https://patch.msgid.link/20241102151504.811306-1-paissilva@ld-100007.ds1.internal
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:46:03 -08:00
Paolo Abeni
9eaff63bfb Merge branch 'net-ethernet-ti-am65-cpsw-fixes-to-multi-queue-rx-feature'
Roger Quadros says:

====================
net: ethernet: ti: am65-cpsw: Fixes to multi queue RX feature

On J7 platforms, setting up multiple RX flows was failing
as the RX free descriptor ring 0 is shared among all flows
and we did not allocate enough elements in the RX free descriptor
ring 0 to accommodate for all RX flows. Patch 1 fixes this.

The second patch fixes a warning if there was any error in
am65_cpsw_nuss_init_rx_chns() and am65_cpsw_nuss_cleanup_rx_chns()
was called after that.

Signed-off-by: Roger Quadros <rogerq@kernel.org>
====================

Link: https://patch.msgid.link/20241101-am65-cpsw-multi-rx-j7-fix-v3-0-338fdd6a55da@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 15:56:48 +01:00
Roger Quadros
ba3b7ac4f7 net: ethernet: ti: am65-cpsw: fix warning in am65_cpsw_nuss_remove_rx_chns()
flow->irq is initialized to 0 which is a valid IRQ. Set it to -EINVAL
in error path of am65_cpsw_nuss_init_rx_chns() so we do not try
to free an unallocated IRQ in am65_cpsw_nuss_remove_rx_chns().

If user tried to change number of RX queues and am65_cpsw_nuss_init_rx_chns()
failed due to any reason, the warning will happen if user tries to change
the number of RX queues after the error condition.

root@am62xx-evm:~# ethtool -L eth0 rx 3
[   40.385293] am65-cpsw-nuss 8000000.ethernet: set new flow-id-base 19
[   40.393211] am65-cpsw-nuss 8000000.ethernet: Failed to init rx flow2
netlink error: Invalid argument
root@am62xx-evm:~# ethtool -L eth0 rx 2
[   82.306427] ------------[ cut here ]------------
[   82.311075] WARNING: CPU: 0 PID: 378 at kernel/irq/devres.c:144 devm_free_irq+0x84/0x90
[   82.469770] Call trace:
[   82.472208]  devm_free_irq+0x84/0x90
[   82.475777]  am65_cpsw_nuss_remove_rx_chns+0x6c/0xac [ti_am65_cpsw_nuss]
[   82.482487]  am65_cpsw_nuss_update_tx_rx_chns+0x2c/0x9c [ti_am65_cpsw_nuss]
[   82.489442]  am65_cpsw_set_channels+0x30/0x4c [ti_am65_cpsw_nuss]
[   82.495531]  ethnl_set_channels+0x224/0x2dc
[   82.499713]  ethnl_default_set_doit+0xb8/0x1b8
[   82.504149]  genl_family_rcv_msg_doit+0xc0/0x124
[   82.508757]  genl_rcv_msg+0x1f0/0x284
[   82.512409]  netlink_rcv_skb+0x58/0x130
[   82.516239]  genl_rcv+0x38/0x50
[   82.519374]  netlink_unicast+0x1d0/0x2b0
[   82.523289]  netlink_sendmsg+0x180/0x3c4
[   82.527205]  __sys_sendto+0xe4/0x158
[   82.530779]  __arm64_sys_sendto+0x28/0x38
[   82.534782]  invoke_syscall+0x44/0x100
[   82.538526]  el0_svc_common.constprop.0+0xc0/0xe0
[   82.543221]  do_el0_svc+0x1c/0x28
[   82.546528]  el0_svc+0x28/0x98
[   82.549578]  el0t_64_sync_handler+0xc0/0xc4
[   82.553752]  el0t_64_sync+0x190/0x194
[   82.557407] ---[ end trace 0000000000000000 ]---

Fixes: da70d184a8 ("net: ethernet: ti: am65-cpsw: Introduce multi queue Rx")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 15:56:48 +01:00
Roger Quadros
de794169cf net: ethernet: ti: am65-cpsw: Fix multi queue Rx on J7
On J7 platforms, setting up multiple RX flows was failing
as the RX free descriptor ring 0 is shared among all flows
and we did not allocate enough elements in the RX free descriptor
ring 0 to accommodate for all RX flows.

This issue is not present on AM62 as separate pair of
rings are used for free and completion rings for each flow.

Fix this by allocating enough elements for RX free descriptor
ring 0.

However, we can no longer rely on desc_idx (descriptor based
offsets) to identify the pages in the respective flows as
free descriptor ring includes elements for all flows.
To solve this, introduce a new swdata data structure to store
flow_id and page. This can be used to identify which flow (page_pool)
and page the descriptor belonged to when popped out of the
RX rings.

Fixes: da70d184a8 ("net: ethernet: ti: am65-cpsw: Introduce multi queue Rx")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 15:56:48 +01:00
Peiyang Wang
df3dff8ab6 net: hns3: fix kernel crash when uninstalling driver
When the driver is uninstalled and the VF is disabled concurrently, a
kernel crash occurs. The reason is that the two actions call function
pci_disable_sriov(). The num_VFs is checked to determine whether to
release the corresponding resources. During the second calling, num_VFs
is not 0 and the resource release function is called. However, the
corresponding resource has been released during the first invoking.
Therefore, the problem occurs:

[15277.839633][T50670] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
[15278.131557][T50670] Call trace:
[15278.134686][T50670]  klist_put+0x28/0x12c
[15278.138682][T50670]  klist_del+0x14/0x20
[15278.142592][T50670]  device_del+0xbc/0x3c0
[15278.146676][T50670]  pci_remove_bus_device+0x84/0x120
[15278.151714][T50670]  pci_stop_and_remove_bus_device+0x6c/0x80
[15278.157447][T50670]  pci_iov_remove_virtfn+0xb4/0x12c
[15278.162485][T50670]  sriov_disable+0x50/0x11c
[15278.166829][T50670]  pci_disable_sriov+0x24/0x30
[15278.171433][T50670]  hnae3_unregister_ae_algo_prepare+0x60/0x90 [hnae3]
[15278.178039][T50670]  hclge_exit+0x28/0xd0 [hclge]
[15278.182730][T50670]  __se_sys_delete_module.isra.0+0x164/0x230
[15278.188550][T50670]  __arm64_sys_delete_module+0x1c/0x30
[15278.193848][T50670]  invoke_syscall+0x50/0x11c
[15278.198278][T50670]  el0_svc_common.constprop.0+0x158/0x164
[15278.203837][T50670]  do_el0_svc+0x34/0xcc
[15278.207834][T50670]  el0_svc+0x20/0x30

For details, see the following figure.

     rmmod hclge              disable VFs
----------------------------------------------------
hclge_exit()            sriov_numvfs_store()
  ...                     device_lock()
  pci_disable_sriov()     hns3_pci_sriov_configure()
                            pci_disable_sriov()
                              sriov_disable()
    sriov_disable()             if !num_VFs :
      if !num_VFs :               return;
        return;                 sriov_del_vfs()
      sriov_del_vfs()             ...
        ...                       klist_put()
        klist_put()               ...
        ...                     num_VFs = 0;
      num_VFs = 0;        device_unlock();

In this patch, when driver is removing, we get the device_lock()
to protect num_VFs, just like sriov_numvfs_store().

Fixes: 0dd8a25f35 ("net: hns3: disable sriov before unload hclge layer")
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241101091507.3644584-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 13:45:39 +01:00
Jakub Kicinski
249cfa318f Revert "Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'"
This reverts commit d80a309130, reversing
changes made to 637f414763:

2cf2461435 ("net: hns3: fix kernel crash when 1588 is sent on HIP08 devices")
3e22b7de34 ("net: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issue")
d1c2e2961a ("net: hns3: initialize reset_timer before hclgevf_misc_irq_init()")
5f62009ff1 ("net: hns3: don't auto enable misc vector")
2758f18a83 ("net: hns3: Resolved the issue that the debugfs query result is inconsistent.")
662ecfc466 ("net: hns3: fix missing features due to dev->features configuration too early")
3e0f7cc887 ("net: hns3: fixed reset failure issues caused by the incorrect reset type")
f2c14899ca ("net: hns3: add sync command to sync io-pgtable")
e6ab19443b ("net: hns3: default enable tx bounce buffer when smmu enabled")

The series is making the driver poke into IOMMU internals instead of
implementing appropriate IOMMU workarounds.

Link: https://lore.kernel.org/069c9838-b781-4012-934a-d2626fa78212@arm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:07:54 -08:00
Jakub Kicinski
08d05cea02 linux-can-fixes-for-6.12-20241104
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEUEC6huC2BN0pvD5fKDiiPnotvG8FAmcpJbATHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAoOKI+ei28b7NTCACdt8guPlyrootWDkKzkzIVGNgphauG
 NcL5Ok45hCwz7NfDc5rRN12HsR0qveAabRHIReCNhDcdSU8H4MK+wUa/FONFh548
 mEcH9iDJs6ALbPC4uWIwTcehznFl/tGgD4oLlP34U8aCfLaihMaFsbywUQnI0LCb
 LejkKD0Q+8uac3RPj63DO5FmWz300fRqo8az+bXDHG2Oa270Dtzz0gYKV9UyG/xs
 4rLWVM5iAPETq9wkWetIdR2RXQzkJwI6OuuxLHV0YQlNQITLDSMBPfomrUjeee5b
 NPYtLy8mO6rddCqDFAMTP+CES7Dh66OGssD6NEGJQTh+OIlPAxKVWFcm
 =hX2j
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-6.12-20241104' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2024-11-04

Alexander Hölzl contributes a patch to fix an error in the CAN j1939
documentation.

Thomas Mühlbacher's patch allows building of the {cc770,sja1000}_isa
drivers on x86_64 again.

A patch by me targets the m_can driver and limits the call to
free_irq() to devices with IRQs.

Dario Binacchi's patch fixes the RX and TX error counters in the c_can
driver.

The next 2 patches target the rockchip_canfd driver. Geert
Uytterhoeven's patch lets the driver depend on ARCH_ROCKCHIP. Jean
Delvare's patch drops the obsolete dependency on COMPILE_TEST.

The last 2 patches are by me and fix 2 regressions in the mcp251xfd
driver: fix broken coalescing configuration when switching CAN modes
and fix the length calculation of the Transmit Event FIFO (TEF) on
full TEF.

* tag 'linux-can-fixes-for-6.12-20241104' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: mcp251xfd: mcp251xfd_get_tef_len(): fix length calculation
  can: mcp251xfd: mcp251xfd_ring_alloc(): fix coalescing configuration when switching CAN modes
  can: rockchip_canfd: Drop obsolete dependency on COMPILE_TEST
  can: rockchip_canfd: CAN_ROCKCHIP_CANFD should depend on ARCH_ROCKCHIP
  can: c_can: fix {rx,tx}_errors statistics
  can: m_can: m_can_close(): don't call free_irq() for IRQ-less devices
  can: {cc770,sja1000}_isa: allow building on x86_64
  can: j1939: fix error in J1939 documentation.
====================

Link: https://patch.msgid.link/20241104200120.393312-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 17:48:52 -08:00
Linus Torvalds
2e1b3cc9d7 soc: fixes for 6.12, part 2
Where the last set of fixes was mostly drivers, this time the devicetree
 changes all come at once, targeting mostly the Rockchips, Qualcomm and
 NXP platforms.
 
 The Qualcomm bugfixes target the Snapdragon X Elite laptops, specifically
 problems with PCIe and NVMe support to improve reliability, and a boot
 regresion on msm8939. Also for Snapdragon platforms, there are a number
 of correctness changes in the several platform specific device drivers,
 but none of these are as impactful.
 
 On the NXP i.MX platform, the fixes are all for 64-bit i.MX8 variants,
 correcting individual entries in the devicetree that were incorrect and
 causing the media, video, mmc and spi drivers to misbehave in minor
 ways.
 
 The Arm SCMI firmware driver gets fixes for a use-after-free bug and
 for correctly parsing firmware information.
 
 On the RISC-V side, there are three minor devicetree fixes for starfive
 and sophgo, again addressing only minor mistakes. One device driver
 patch fixes a problem with spurious interrupt handling.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmcpR7oACgkQYKtH/8kJ
 Uic9dA/+NzeNald+ap++pYYmIx9FcV3q3aieN4j5NRgAwUS84bNXpARkDQAK8lx8
 GFzH7VUe0YwKlh+70P73eaLqsRyYE37eUyvabwnxgO0QnI798MZfM5diDvAFm0LK
 kQKnNLx0qojt6HQTN3PVfOn+rcp2J5dgUV+ZAG9TEM3YKA0MY/bABRqaZtIZzSxg
 WwM+jqFh5+EugiioSlL6HCdAVk12wwEJkAInEEY1y/KaeQS9K8vzEOwk4FDeIf2a
 3V+G7IVIBI4rbvrAs2ukW3OrRiGT+svINS5k7qkap13YZUUjENnJYGYNm6woTTyb
 INpBowLKr1gJ1Aluk8AELvpa+aX1BY9rkxP+vTG0HrZVdrPcPToFUBQA4qZgpu/i
 k0uWb20uN1b2Uor2XP2hcTxvzfKPlvnYHxLTMG89PxoVrMX+oEQoCrAP24YcSkPd
 srV0aU54MUHANXEnNco0txdElK+8L7ZCyB19KpVnqqshohDpQ+zkX15Hw9uQLwEc
 RbF99SrR9KFg3sY62Q7LgJOhQ3x6LjGhqPCjejIZ9DvEzndArKYnkkiJYdxxhqkB
 WN1LB/PmnKPAAh1lJdej6SPhiwCCpEFG3IyS/H1/lv7CR94LVode1J4dYTJmcZaq
 aownmb4W1zECgMGQS6YjhuAG1h7D0QtSmWBAuiC+H+T2QTRPtKI=
 =9YYG
 -----END PGP SIGNATURE-----

Merge tag 'arm-fixes-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
 "Where the last set of fixes was mostly drivers, this time the
  devicetree changes all come at once, targeting mostly the Rockchips,
  Qualcomm and NXP platforms.

  The Qualcomm bugfixes target the Snapdragon X Elite laptops,
  specifically problems with PCIe and NVMe support to improve
  reliability, and a boot regresion on msm8939.

  Also for Snapdragon platforms, there are a number of correctness
  changes in the several platform specific device drivers, but none of
  these are as impactful.

  On the NXP i.MX platform, the fixes are all for 64-bit i.MX8 variants,
  correcting individual entries in the devicetree that were incorrect
  and causing the media, video, mmc and spi drivers to misbehave in
  minor ways.

  The Arm SCMI firmware driver gets fixes for a use-after-free bug and
  for correctly parsing firmware information.

  On the RISC-V side, there are three minor devicetree fixes for
  starfive and sophgo, again addressing only minor mistakes. One device
  driver patch fixes a problem with spurious interrupt handling"

* tag 'arm-fixes-6.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (63 commits)
  firmware: arm_scmi: Use vendor string in max-rx-timeout-ms
  dt-bindings: firmware: arm,scmi: Add missing vendor string
  riscv: dts: Replace deprecated snps,nr-gpios property for snps,dw-apb-gpio-port devices
  arm64: dts: rockchip: Correct GPIO polarity on brcm BT nodes
  arm64: dts: rockchip: Drop invalid clock-names from es8388 codec nodes
  ARM: dts: rockchip: Fix the realtek audio codec on rk3036-kylin
  ARM: dts: rockchip: Fix the spi controller on rk3036
  ARM: dts: rockchip: drop grf reference from rk3036 hdmi
  ARM: dts: rockchip: fix rk3036 acodec node
  arm64: dts: rockchip: remove orphaned pinctrl-names from pinephone pro
  soc: qcom: pmic_glink: Handle GLINK intent allocation rejections
  rpmsg: glink: Handle rejected intent request better
  arm64: dts: qcom: x1e80100: fix PCIe5 interconnect
  arm64: dts: qcom: x1e80100: fix PCIe4 interconnect
  arm64: dts: qcom: x1e80100: Fix up BAR spaces
  MAINTAINERS: invert Misc RISC-V SoC Support's pattern
  soc: qcom: socinfo: fix revision check in qcom_socinfo_probe()
  arm64: dts: qcom: x1e80100-qcp: fix nvme regulator boot glitch
  arm64: dts: qcom: x1e80100-microsoft-romulus: fix nvme regulator boot glitch
  arm64: dts: qcom: x1e80100-yoga-slim7x: fix nvme regulator boot glitch
  ...
2024-11-04 15:23:26 -10:00
Vitaly Lifshits
b847372327 e1000e: Remove Meteor Lake SMBUS workarounds
This is a partial revert to commit 76a0a3f9cc ("e1000e: fix force smbus
during suspend flow"). That commit fixed a sporadic PHY access issue but
introduced a regression in runtime suspend flows.
The original issue on Meteor Lake systems was rare in terms of the
reproduction rate and the number of the systems affected.

After the integration of commit 0a6ad4d9e1 ("e1000e: avoid failing the
system during pm_suspend"), PHY access loss can no longer cause a
system-level suspend failure. As it only occurs when the LAN cable is
disconnected, and is recovered during system resume flow. Therefore, its
functional impact is low, and the priority is given to stabilizing
runtime suspend.

Fixes: 76a0a3f9cc ("e1000e: fix force smbus during suspend flow")
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Tested-by: Avigail Dahan <avigailx.dahan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-04 13:09:34 -08:00
Aleksandr Loktionov
f30490e969 i40e: fix race condition by adding filter's intermediate sync state
Fix a race condition in the i40e driver that leads to MAC/VLAN filters
becoming corrupted and leaking. Address the issue that occurs under
heavy load when multiple threads are concurrently modifying MAC/VLAN
filters by setting mac and port VLAN.

1. Thread T0 allocates a filter in i40e_add_filter() within
        i40e_ndo_set_vf_port_vlan().
2. Thread T1 concurrently frees the filter in __i40e_del_filter() within
        i40e_ndo_set_vf_mac().
3. Subsequently, i40e_service_task() calls i40e_sync_vsi_filters(), which
        refers to the already freed filter memory, causing corruption.

Reproduction steps:
1. Spawn multiple VFs.
2. Apply a concurrent heavy load by running parallel operations to change
        MAC addresses on the VFs and change port VLANs on the host.
3. Observe errors in dmesg:
"Error I40E_AQ_RC_ENOSPC adding RX filters on VF XX,
	please set promiscuous on manually for VF XX".

Exact code for stable reproduction Intel can't open-source now.

The fix involves implementing a new intermediate filter state,
I40E_FILTER_NEW_SYNC, for the time when a filter is on a tmp_add_list.
These filters cannot be deleted from the hash list directly but
must be removed using the full process.

Fixes: 278e7d0b9d ("i40e: store MAC/VLAN filters in a hash with the MAC Address as key")
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
Tested-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-04 13:09:34 -08:00
Pavan Kumar Linga
9b58031ff9 idpf: fix idpf_vc_core_init error path
In an event where the platform running the device control plane
is rebooted, reset is detected on the driver. It releases
all the resources and waits for the reset to complete. Once the
reset is done, it tries to build the resources back. At this
time if the device control plane is not yet started, then
the driver timeouts on the virtchnl message and retries to
establish the mailbox again.

In the retry flow, mailbox is deinitialized but the mailbox
workqueue is still alive and polling for the mailbox message.
This results in accessing the released control queue leading to
null-ptr-deref. Fix it by unrolling the work queue cancellation
and mailbox deinitialization in the reverse order which they got
initialized.

Fixes: 4930fbf419 ("idpf: add core init and interrupt request")
Fixes: 34c21fa894 ("idpf: implement virtchnl transaction manager")
Cc: stable@vger.kernel.org # 6.9+
Reviewed-by: Tarun K Singh <tarun.k.singh@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-04 13:09:34 -08:00
Pavan Kumar Linga
81d2fb4c7c idpf: avoid vport access in idpf_get_link_ksettings
When the device control plane is removed or the platform
running device control plane is rebooted, a reset is detected
on the driver. On driver reset, it releases the resources and
waits for the reset to complete. If the reset fails, it takes
the error path and releases the vport lock. At this time if the
monitoring tools tries to access link settings, it call traces
for accessing released vport pointer.

To avoid it, move link_speed_mbps to netdev_priv structure
which removes the dependency on vport pointer and the vport lock
in idpf_get_link_ksettings. Also use netif_carrier_ok()
to check the link status and adjust the offsetof to use link_up
instead of link_speed_mbps.

Fixes: 02cbfba1ad ("idpf: add ethtool callbacks")
Cc: stable@vger.kernel.org # 6.7+
Reviewed-by: Tarun K Singh <tarun.k.singh@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-04 13:09:33 -08:00
Mateusz Polchlopek
64502dac97 ice: change q_index variable type to s16 to store -1 value
Fix Flow Director not allowing to re-map traffic to 0th queue when action
is configured to drop (and vice versa).

The current implementation of ethtool callback in the ice driver forbids
change Flow Director action from 0 to -1 and from -1 to 0 with an error,
e.g:

 # ethtool -U eth2 flow-type tcp4 src-ip 1.1.1.1 loc 1 action 0
 # ethtool -U eth2 flow-type tcp4 src-ip 1.1.1.1 loc 1 action -1
 rmgr: Cannot insert RX class rule: Invalid argument

We set the value of `u16 q_index = 0` at the beginning of the function
ice_set_fdir_input_set(). In case of "drop traffic" action (which is
equal to -1 in ethtool) we store the 0 value. Later, when want to change
traffic rule to redirect to queue with index 0 it returns an error
caused by duplicate found.

Fix this behaviour by change of the type of field `q_index` from u16 to s16
in `struct ice_fdir_fltr`. This allows to store -1 in the field in case
of "drop traffic" action. What is more, change the variable type in the
function ice_set_fdir_input_set() and assign at the beginning the new
`#define ICE_FDIR_NO_QUEUE_IDX` which is -1. Later, if the action is set
to another value (point specific queue index) the variable value is
overwritten in the function.

Fixes: cac2a27cd9 ("ice: Support IPv4 Flow Director filters")
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-04 13:09:33 -08:00
Marcin Szycik
e9942bfe49 ice: Fix use after free during unload with ports in bridge
Unloading the ice driver while switchdev port representors are added to
a bridge can lead to kernel panic. Reproducer:

  modprobe ice

  devlink dev eswitch set $PF1_PCI mode switchdev

  ip link add $BR type bridge
  ip link set $BR up

  echo 2 > /sys/class/net/$PF1/device/sriov_numvfs
  sleep 2

  ip link set $PF1 master $BR
  ip link set $VF1_PR master $BR
  ip link set $VF2_PR master $BR
  ip link set $PF1 up
  ip link set $VF1_PR up
  ip link set $VF2_PR up
  ip link set $VF1 up

  rmmod irdma ice

When unloading the driver, ice_eswitch_detach() is eventually called as
part of VF freeing. First, it removes a port representor from xarray,
then unregister_netdev() is called (via repr->ops.rem()), finally
representor is deallocated. The problem comes from the bridge doing its
own deinit at the same time. unregister_netdev() triggers a notifier
chain, resulting in ice_eswitch_br_port_deinit() being called. It should
set repr->br_port = NULL, but this does not happen since repr has
already been removed from xarray and is not found. Regardless, it
finishes up deallocating br_port. At this point, repr is still not freed
and an fdb event can happen, in which ice_eswitch_br_fdb_event_work()
takes repr->br_port and tries to use it, which causes a panic (use after
free).

Note that this only happens with 2 or more port representors added to
the bridge, since with only one representor port, the bridge deinit is
slightly different (ice_eswitch_br_port_deinit() is called via
ice_eswitch_br_ports_flush(), not ice_eswitch_br_port_unlink()).

Trace:
  Oops: general protection fault, probably for non-canonical address 0xf129010fd1a93284: 0000 [#1] PREEMPT SMP KASAN NOPTI
  KASAN: maybe wild-memory-access in range [0x8948287e8d499420-0x8948287e8d499427]
  (...)
  Workqueue: ice_bridge_wq ice_eswitch_br_fdb_event_work [ice]
  RIP: 0010:__rht_bucket_nested+0xb4/0x180
  (...)
  Call Trace:
   (...)
   ice_eswitch_br_fdb_find+0x3fa/0x550 [ice]
   ? __pfx_ice_eswitch_br_fdb_find+0x10/0x10 [ice]
   ice_eswitch_br_fdb_event_work+0x2de/0x1e60 [ice]
   ? __schedule+0xf60/0x5210
   ? mutex_lock+0x91/0xe0
   ? __pfx_ice_eswitch_br_fdb_event_work+0x10/0x10 [ice]
   ? ice_eswitch_br_update_work+0x1f4/0x310 [ice]
   (...)

A workaround is available: brctl setageing $BR 0, which stops the bridge
from adding fdb entries altogether.

Change the order of operations in ice_eswitch_detach(): move the call to
unregister_netdev() before removing repr from xarray. This way
repr->br_port will be correctly set to NULL in
ice_eswitch_br_port_deinit(), preventing a panic.

Fixes: fff292b47a ("ice: add VF representors one by one")
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-11-04 13:09:33 -08:00
David Gstir
04de7589e0 KEYS: trusted: dcp: fix NULL dereference in AEAD crypto operation
When sealing or unsealing a key blob we currently do not wait for
the AEAD cipher operation to finish and simply return after submitting
the request. If there is some load on the system we can exit before
the cipher operation is done and the buffer we read from/write to
is already removed from the stack. This will e.g. result in NULL
pointer dereference errors in the DCP driver during blob creation.

Fix this by waiting for the AEAD cipher operation to finish before
resuming the seal and unseal calls.

Cc: stable@vger.kernel.org # v6.10+
Fixes: 0e28bf61a5 ("KEYS: trusted: dcp: fix leak of blob encryption key")
Reported-by: Parthiban N <parthiban@linumiz.com>
Closes: https://lore.kernel.org/keyrings/254d3bb1-6dbc-48b4-9c08-77df04baee2f@linumiz.com/
Signed-off-by: David Gstir <david@sigma-star.at>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-11-04 21:24:24 +02:00
Chen Ridong
4a74da044e security/keys: fix slab-out-of-bounds in key_task_permission
KASAN reports an out of bounds read:
BUG: KASAN: slab-out-of-bounds in __kuid_val include/linux/uidgid.h:36
BUG: KASAN: slab-out-of-bounds in uid_eq include/linux/uidgid.h:63 [inline]
BUG: KASAN: slab-out-of-bounds in key_task_permission+0x394/0x410
security/keys/permission.c:54
Read of size 4 at addr ffff88813c3ab618 by task stress-ng/4362

CPU: 2 PID: 4362 Comm: stress-ng Not tainted 5.10.0-14930-gafbffd6c3ede #15
Call Trace:
 __dump_stack lib/dump_stack.c:82 [inline]
 dump_stack+0x107/0x167 lib/dump_stack.c:123
 print_address_description.constprop.0+0x19/0x170 mm/kasan/report.c:400
 __kasan_report.cold+0x6c/0x84 mm/kasan/report.c:560
 kasan_report+0x3a/0x50 mm/kasan/report.c:585
 __kuid_val include/linux/uidgid.h:36 [inline]
 uid_eq include/linux/uidgid.h:63 [inline]
 key_task_permission+0x394/0x410 security/keys/permission.c:54
 search_nested_keyrings+0x90e/0xe90 security/keys/keyring.c:793

This issue was also reported by syzbot.

It can be reproduced by following these steps(more details [1]):
1. Obtain more than 32 inputs that have similar hashes, which ends with the
   pattern '0xxxxxxxe6'.
2. Reboot and add the keys obtained in step 1.

The reproducer demonstrates how this issue happened:
1. In the search_nested_keyrings function, when it iterates through the
   slots in a node(below tag ascend_to_node), if the slot pointer is meta
   and node->back_pointer != NULL(it means a root), it will proceed to
   descend_to_node. However, there is an exception. If node is the root,
   and one of the slots points to a shortcut, it will be treated as a
   keyring.
2. Whether the ptr is keyring decided by keyring_ptr_is_keyring function.
   However, KEYRING_PTR_SUBTYPE is 0x2UL, the same as
   ASSOC_ARRAY_PTR_SUBTYPE_MASK.
3. When 32 keys with the similar hashes are added to the tree, the ROOT
   has keys with hashes that are not similar (e.g. slot 0) and it splits
   NODE A without using a shortcut. When NODE A is filled with keys that
   all hashes are xxe6, the keys are similar, NODE A will split with a
   shortcut. Finally, it forms the tree as shown below, where slot 6 points
   to a shortcut.

                      NODE A
              +------>+---+
      ROOT    |       | 0 | xxe6
      +---+   |       +---+
 xxxx | 0 | shortcut  :   : xxe6
      +---+   |       +---+
 xxe6 :   :   |       |   | xxe6
      +---+   |       +---+
      | 6 |---+       :   : xxe6
      +---+           +---+
 xxe6 :   :           | f | xxe6
      +---+           +---+
 xxe6 | f |
      +---+

4. As mentioned above, If a slot(slot 6) of the root points to a shortcut,
   it may be mistakenly transferred to a key*, leading to a read
   out-of-bounds read.

To fix this issue, one should jump to descend_to_node if the ptr is a
shortcut, regardless of whether the node is root or not.

[1] https://lore.kernel.org/linux-kernel/1cfa878e-8c7b-4570-8606-21daf5e13ce7@huaweicloud.com/

[jarkko: tweaked the commit message a bit to have an appropriate closes
 tag.]
Fixes: b2a4df200d ("KEYS: Expand the capacity of a keyring")
Reported-by: syzbot+5b415c07907a2990d1a3@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000cbb7860611f61147@google.com/T/
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2024-11-04 21:24:24 +02:00
Linus Torvalds
557329bcec MMC host:
- sdhci-pci-gli: A couple of fixes for low power mode on GL9767
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAmcorkYXHHVsZi5oYW5z
 c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCmqfBAAtjSYh3rErLLkdJjyYj0jQZWD
 RtiMFO98Yq7tcw9YmcUxVonpHa9W9sLH5XBeeWMyIwLhEohulWoUuZtxds1cRUYY
 Y62Hv9VbH4Y8LvnzEWDEWrb32jiI9I4H+ksphU/wUTX3dunsAOVuR9VgFFYWcJND
 qXOISQ8Ft/6Qsv+6F7oEQ/t5l7l6g370z09P1TNUGmmQJPfLSkYW74KBHoqy/2K7
 OFdfKEJaNWkl05xegUFY/srTfgbxSfmG/dGdj/6Khazp4777GbtlU/ZncRErkOMA
 nsmxNKVvcmxdni3L99wEJCoTwX9smoPpqPQJeZesH57eVCwh9Xb5bRMezgmDyb88
 +2Ldhc6PAgvPlymhxVagm8Hw8Z+YmZt0U3/GZmjejdfWGImc6LfL+96xObHm2iT/
 GsVx4K7/bD8Vndzf+PlQJJhOkXAXbR5AxWP2TheXPkmfTrefl3i6mkH1wMcehWCI
 cEhZFP7d6mUEbhsGXf+fMK1wOtnMUxbtJ+jp9S1Vuxifr3i3wlKrcZg+KTC8C3IZ
 r+h86l/vWDnahbSDAKIBSl2VlqUUX84qAvMF7oFSXwxanFjS/Fp12Imc+TTWbbT8
 e7YqrSc/1GD1Dl/SerkBIDovmLaX6V8rMWZeBaNlvyScZhjLVPaQAj3N39pxU+FZ
 fPr8CzMMj4qlO6JnUGw=
 =zGYd
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull mmc fixes from Ulf Hansson:

 - sdhci-pci-gli: A couple of fixes for low power mode on GL9767

* tag 'mmc-v6.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-pci-gli: GL9767: Fix low power mode in the SD Express process
  mmc: sdhci-pci-gli: GL9767: Fix low power mode on the set clock function
2024-11-04 08:07:22 -10:00
Linus Torvalds
a0339404fd Hi,
This pull request fixes a race condition between tpm_pm_suspend() and
 tpm_hwrng_read() (I think for good now):
 
 https://bugzilla.kernel.org/show_bug.cgi?id=219383
 
 BR, Jarkko
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRE6pSOnaBC00OEHEIaerohdGur0gUCZygPzAAKCRAaerohdGur
 0pVIAQCiPAk03Cz7cHmeDfWr19VJfazDdqYSfm2muGkrK1HamQEAhbruhFlAWChZ
 G6PGBppQkGmGcwLk4LneGtr4FX7yPwY=
 =6DaV
 -----END PGP SIGNATURE-----

Merge tag 'tpmdd-next-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm fix from Jarkko Sakkinen:
 "Fix a race condition between tpm_pm_suspend() and tpm_hwrng_read() (I
  think for good now)"

* tag 'tpmdd-next-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm: Lock TPM chip in tpm_pm_suspend() first
2024-11-04 08:00:14 -10:00
Marc Kleine-Budde
3c1c18551e can: mcp251xfd: mcp251xfd_get_tef_len(): fix length calculation
Commit b8e0ddd36c ("can: mcp251xfd: tef: prepare to workaround
broken TEF FIFO tail index erratum") introduced
mcp251xfd_get_tef_len() to get the number of unhandled transmit events
from the Transmit Event FIFO (TEF).

As the TEF has no head pointer, the driver uses the TX FIFO's tail
pointer instead, assuming that send frames are completed. However the
check for the TEF being full was not correct. This leads to the driver
stop working if the TEF is full.

Fix the TEF full check by assuming that if, from the driver's point of
view, there are no free TX buffers in the chip and the TX FIFO is
empty, all messages must have been sent and the TEF must therefore be
full.

Reported-by: Sven Schuchmann <schuchmann@schleissheimer.de>
Closes: https://patch.msgid.link/FR3P281MB155216711EFF900AD9791B7ED9692@FR3P281MB1552.DEUP281.PROD.OUTLOOK.COM
Fixes: b8e0ddd36c ("can: mcp251xfd: tef: prepare to workaround broken TEF FIFO tail index erratum")
Tested-by: Sven Schuchmann <schuchmann@schleissheimer.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20241104-mcp251xfd-fix-length-calculation-v3-1-608b6e7e2197@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04 18:01:07 +01:00
Marc Kleine-Budde
eb9a839b3d can: mcp251xfd: mcp251xfd_ring_alloc(): fix coalescing configuration when switching CAN modes
Since commit 50ea5449c5 ("can: mcp251xfd: fix ring configuration
when switching from CAN-CC to CAN-FD mode"), the current ring and
coalescing configuration is passed to can_ram_get_layout(). That fixed
the issue when switching between CAN-CC and CAN-FD mode with
configured ring (rx, tx) and/or coalescing parameters (rx-frames-irq,
tx-frames-irq).

However 50ea5449c5 ("can: mcp251xfd: fix ring configuration when
switching from CAN-CC to CAN-FD mode"), introduced a regression when
switching CAN modes with disabled coalescing configuration: Even if
the previous CAN mode has no coalescing configured, the new mode is
configured with active coalescing. This leads to delayed receiving of
CAN-FD frames.

This comes from the fact, that ethtool uses usecs = 0 and max_frames =
1 to disable coalescing, however the driver uses internally
priv->{rx,tx}_obj_num_coalesce_irq = 0 to indicate disabled
coalescing.

Fix the regression by assigning struct ethtool_coalesce
ec->{rx,tx}_max_coalesced_frames_irq = 1 if coalescing is disabled in
the driver as can_ram_get_layout() expects this.

Reported-by: https://github.com/vdh-robothania
Closes: https://github.com/raspberrypi/linux/issues/6407
Fixes: 50ea5449c5 ("can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode")
Cc: stable@vger.kernel.org
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241025-mcp251xfd-fix-coalesing-v1-1-9d11416de1df@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04 18:01:07 +01:00
Jean Delvare
51e102ec23 can: rockchip_canfd: Drop obsolete dependency on COMPILE_TEST
Since commit 0166dc11be ("of: make CONFIG_OF user selectable"), OF
can be enabled on all architectures. Therefore depending on
COMPILE_TEST as an alternative is no longer needed.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://patch.msgid.link/20241022130439.70d016e9@endymion.delvare
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04 18:01:06 +01:00
Geert Uytterhoeven
4384b8b6ec can: rockchip_canfd: CAN_ROCKCHIP_CANFD should depend on ARCH_ROCKCHIP
The Rockchip CAN-FD controller is only present on Rockchip SoCs. Hence
add a dependency on ARCH_ROCKCHIP, to prevent asking the user about
this driver when configuring a kernel without Rockchip platform
support.

Fixes: ff60bfbaf6 ("can: rockchip_canfd: add driver for Rockchip CAN-FD controller")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Heiko Stuebner <heiko@sntech.de>
Link: https://patch.msgid.link/a4b3c8c1cca9515e67adac83af5ba1b1fab2fcbc.1727169288.git.geert+renesas@glider.be
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04 18:01:06 +01:00
Dario Binacchi
4d6d265379 can: c_can: fix {rx,tx}_errors statistics
The c_can_handle_bus_err() function was incorrectly incrementing only the
receive error counter, even in cases of bit or acknowledgment errors that
occur during transmission. The patch fixes the issue by incrementing the
appropriate counter based on the type of error.

Fixes: 881ff67ad4 ("can: c_can: Added support for Bosch C_CAN controller")
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Link: https://patch.msgid.link/20241014135319.2009782-1-dario.binacchi@amarulasolutions.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04 18:01:06 +01:00
Marc Kleine-Budde
e4de81f9e1 can: m_can: m_can_close(): don't call free_irq() for IRQ-less devices
In commit b382380c0d ("can: m_can: Add hrtimer to generate software
interrupt") support for IRQ-less devices was added. Instead of an
interrupt, the interrupt routine is called by a hrtimer-based polling
loop.

That patch forgot to change free_irq() to be only called for devices
with IRQs. Fix this, by calling free_irq() conditionally only if an
IRQ is available for the device (and thus has been requested
previously).

Fixes: b382380c0d ("can: m_can: Add hrtimer to generate software interrupt")
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://patch.msgid.link/20240930-m_can-cleanups-v1-1-001c579cdee4@pengutronix.de
Cc: <stable@vger.kernel.org> # v6.6+
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04 18:01:06 +01:00
Thomas Mühlbacher
7b22846f8a can: {cc770,sja1000}_isa: allow building on x86_64
The ISA variable is only defined if X86_32 is also defined. However,
these drivers are still useful and in use on at least some modern 64-bit
x86 industrial systems as well. With the correct module parameters, they
work as long as IO port communication is possible, despite their name
having ISA in them.

Fixes: a29689e60e ("net: handle HAS_IOPORT dependencies")
Signed-off-by: Thomas Mühlbacher <tmuehlbacher@posteo.net>
Link: https://patch.msgid.link/20240919174151.15473-2-tmuehlbacher@posteo.net
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04 17:46:06 +01:00
Alexander Hölzl
b6ec62e01a can: j1939: fix error in J1939 documentation.
The description of PDU1 format usage mistakenly referred to PDU2 format.

Signed-off-by: Alexander Hölzl <alexander.hoelzl@gmx.net>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://patch.msgid.link/20241023145257.82709-1-alexander.hoelzl@gmx.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04 17:45:55 +01:00
Ming-Hung Tsai
c0ade5d989 dm cache: fix potential out-of-bounds access on the first resume
Out-of-bounds access occurs if the fast device is expanded unexpectedly
before the first-time resume of the cache table. This happens because
expanding the fast device requires reloading the cache table for
cache_create to allocate new in-core data structures that fit the new
size, and the check in cache_preresume is not performed during the
first resume, leading to the issue.

Reproduce steps:

1. prepare component devices:

dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 65536 linear /dev/sdc 8192"
dmsetup create corig --table "0 524288 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct

2. load a cache table of 512 cache blocks, and deliberately expand the
   fast device before resuming the cache, making the in-core data
   structures inadequate.

dmsetup create cache --notable
dmsetup reload cache --table "0 524288 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
dmsetup reload cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup resume cdata
dmsetup resume cache

3. suspend the cache to write out the in-core dirty bitset and hint
   array, leading to out-of-bounds access to the dirty bitset at offset
   0x40:

dmsetup suspend cache

KASAN reports:

  BUG: KASAN: vmalloc-out-of-bounds in is_dirty_callback+0x2b/0x80
  Read of size 8 at addr ffffc90000085040 by task dmsetup/90

  (...snip...)
  The buggy address belongs to the virtual mapping at
   [ffffc90000085000, ffffc90000087000) created by:
   cache_ctr+0x176a/0x35f0

  (...snip...)
  Memory state around the buggy address:
   ffffc90000084f00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
   ffffc90000084f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
  >ffffc90000085000: 00 00 00 00 00 00 00 00 f8 f8 f8 f8 f8 f8 f8 f8
                                             ^
   ffffc90000085080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
   ffffc90000085100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8

Fix by checking the size change on the first resume.

Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Fixes: f494a9c6b1 ("dm cache: cache shrinking support")
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Joe Thornber <thornber@redhat.com>
2024-11-04 17:39:31 +01:00
Ming-Hung Tsai
f484697e61 dm cache: optimize dirty bit checking with find_next_bit when resizing
When shrinking the fast device, dm-cache iteratively searches for a
dirty bit among the cache blocks to be dropped, which is less efficient.
Use find_next_bit instead, as it is twice as fast as the iterative
approach with test_bit.

Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Fixes: f494a9c6b1 ("dm cache: cache shrinking support")
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Joe Thornber <thornber@redhat.com>
2024-11-04 17:39:31 +01:00
Ming-Hung Tsai
7922277197 dm cache: fix out-of-bounds access to the dirty bitset when resizing
dm-cache checks the dirty bits of the cache blocks to be dropped when
shrinking the fast device, but an index bug in bitset iteration causes
out-of-bounds access.

Reproduce steps:

1. create a cache device of 1024 cache blocks (128 bytes dirty bitset)

dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 131072 linear /dev/sdc 8192"
dmsetup create corig --table "0 524288 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"

2. shrink the fast device to 512 cache blocks, triggering out-of-bounds
   access to the dirty bitset (offset 0x80)

dmsetup suspend cache
dmsetup reload cdata --table "0 65536 linear /dev/sdc 8192"
dmsetup resume cdata
dmsetup resume cache

KASAN reports:

  BUG: KASAN: vmalloc-out-of-bounds in cache_preresume+0x269/0x7b0
  Read of size 8 at addr ffffc900000f3080 by task dmsetup/131

  (...snip...)
  The buggy address belongs to the virtual mapping at
   [ffffc900000f3000, ffffc900000f5000) created by:
   cache_ctr+0x176a/0x35f0

  (...snip...)
  Memory state around the buggy address:
   ffffc900000f2f80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
   ffffc900000f3000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  >ffffc900000f3080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
                     ^
   ffffc900000f3100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
   ffffc900000f3180: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8

Fix by making the index post-incremented.

Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Fixes: f494a9c6b1 ("dm cache: cache shrinking support")
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Joe Thornber <thornber@redhat.com>
2024-11-04 17:39:31 +01:00
Ming-Hung Tsai
135496c208 dm cache: fix flushing uninitialized delayed_work on cache_ctr error
An unexpected WARN_ON from flush_work() may occur when cache creation
fails, caused by destroying the uninitialized delayed_work waker in the
error path of cache_create(). For example, the warning appears on the
superblock checksum error.

Reproduce steps:

dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 65536 linear /dev/sdc 8192"
dmsetup create corig --table "0 524288 linear /dev/sdc 262144"
dd if=/dev/urandom of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"

Kernel logs:

(snip)
WARNING: CPU: 0 PID: 84 at kernel/workqueue.c:4178 __flush_work+0x5d4/0x890

Fix by pulling out the cancel_delayed_work_sync() from the constructor's
error path. This patch doesn't affect the use-after-free fix for
concurrent dm_resume and dm_destroy (commit 6a459d8edb ("dm cache: Fix
UAF in destroy()")) as cache_dtr is not changed.

Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Fixes: 6a459d8edb ("dm cache: Fix UAF in destroy()")
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Joe Thornber <thornber@redhat.com>
2024-11-04 17:39:31 +01:00
Ming-Hung Tsai
235d2e739f dm cache: correct the number of origin blocks to match the target length
When creating a cache device, the actual size of the cache origin might
be greater than the specified cache target length. In such case, the
number of origin blocks should match the cache target length, not the
full size of the origin device, since access beyond the cache target is
not possible. This issue occurs when reducing the origin device size
using lvm, as lvreduce preloads the new cache table before resuming the
cache origin, which can result in incorrect sizes for the discard bitset
and smq hotspot blocks.

Reproduce steps:

1. create a cache device consists of 4096 origin blocks

dmsetup create cmeta --table "0 8192 linear /dev/sdc 0"
dmsetup create cdata --table "0 65536 linear /dev/sdc 8192"
dmsetup create corig --table "0 524288 linear /dev/sdc 262144"
dd if=/dev/zero of=/dev/mapper/cmeta bs=4k count=1 oflag=direct
dmsetup create cache --table "0 524288 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"

2. reduce the cache origin to 2048 oblocks, in lvreduce's approach

dmsetup reload corig --table "0 262144 linear /dev/sdc 262144"
dmsetup reload cache --table "0 262144 cache /dev/mapper/cmeta \
/dev/mapper/cdata /dev/mapper/corig 128 2 metadata2 writethrough smq 0"
dmsetup suspend cache
dmsetup suspend corig
dmsetup suspend cdata
dmsetup suspend cmeta
dmsetup resume corig
dmsetup resume cdata
dmsetup resume cmeta
dmsetup resume cache

3. shutdown the cache, and check the number of discard blocks in
   superblock. The value is expected to be 2048, but actually is 4096.

dmsetup remove cache corig cdata cmeta
dd if=/dev/sdc bs=1c count=8 skip=224 2>/dev/null | hexdump -e '1/8 "%u\n"'

Fix by correcting the origin_blocks initialization in cache_create and
removing the unused origin_sectors from struct cache_args accordingly.

Signed-off-by: Ming-Hung Tsai <mtsai@redhat.com>
Fixes: c6b4fcbad0 ("dm: add cache target")
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Joe Thornber <thornber@redhat.com>
2024-11-04 17:39:31 +01:00
Mikulas Patocka
a674d0cd56 dm-verity: don't crash if panic_on_corruption is not selected
If the user sets panic_on_error and doesn't set panic_on_corruption,
dm-verity should not panic on data mismatch. But, currently it panics,
because it treats data mismatch as I/O error.

This commit fixes the logic so that if there is data mismatch and
panic_on_corruption or restart_on_corruption is not selected, the system
won't restart or panic.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Fixes: f811b83879 ("dm-verity: introduce the options restart_on_error and panic_on_error")
2024-11-04 17:39:23 +01:00
Zichen Xie
5a4510c762 dm-unstriped: cast an operand to sector_t to prevent potential uint32_t overflow
This was found by a static analyzer.
There may be a potential integer overflow issue in
unstripe_ctr(). uc->unstripe_offset and uc->unstripe_width are
defined as "sector_t"(uint64_t), while uc->unstripe,
uc->chunk_size and uc->stripes are all defined as "uint32_t".
The result of the calculation will be limited to "uint32_t"
without correct casting.
So, we recommend adding an extra cast to prevent potential
integer overflow.

Fixes: 18a5bf2705 ("dm: add unstriped target")
Signed-off-by: Zichen Xie <zichenxie0106@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
2024-11-04 17:34:56 +01:00
Mike Snitzer
867da60d46 nfs: avoid i_lock contention in nfs_clear_invalid_mapping
Multi-threaded buffered reads to the same file exposed significant
inode spinlock contention in nfs_clear_invalid_mapping().

Eliminate this spinlock contention by checking flags without locking,
instead using smp_rmb and smp_load_acquire accordingly, but then take
spinlock and double-check these inode flags.

Also refactor nfs_set_cache_invalid() slightly to use
smp_store_release() to pair with nfs_clear_invalid_mapping()'s
smp_load_acquire().

While this fix is beneficial for all multi-threaded buffered reads
issued by an NFS client, this issue was identified in the context of
surprisingly low LOCALIO performance with 4K multi-threaded buffered
read IO.  This fix dramatically speeds up LOCALIO performance:

before: read: IOPS=1583k, BW=6182MiB/s (6482MB/s)(121GiB/20002msec)
after:  read: IOPS=3046k, BW=11.6GiB/s (12.5GB/s)(232GiB/20001msec)

Fixes: 17dfeb9113 ("NFS: Fix races in nfs_revalidate_mapping")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-11-04 10:24:19 -05:00
Mike Snitzer
bc29408695 nfs_common: fix localio to cope with racing nfs_local_probe()
Fix the possibility of racing nfs_local_probe() resulting in:
  list_add double add: new=ffff8b99707f9f58, prev=ffff8b99707f9f58, next=ffffffffc0f30000.
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:35!

Add nfs_uuid_init() to properly initialize all nfs_uuid_t members
(particularly its list_head).

Switch to returning bool from nfs_uuid_begin(), returns false if
nfs_uuid_t is already in-use (its list_head is on a list). Update
nfs_local_probe() to return early if the nfs_client's cl_uuid
(nfs_uuid_t) is in-use.

Also, switch nfs_uuid_begin() from using list_add_tail_rcu() to
list_add_tail() -- rculist was used in an earlier version of the
localio code that had a lockless nfs_uuid_lookup interface.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-11-04 10:24:19 -05:00
Trond Myklebust
40f45ab381 NFS: Further fixes to attribute delegation a/mtime changes
When asked to set both an atime and an mtime to the current system time,
ensure that the setting is atomic by calling inode_update_timestamps()
only once with the appropriate flags.

Fixes: e12912d941 ("NFSv4: Add support for delegated atime and mtime attributes")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-11-04 10:24:19 -05:00
Trond Myklebust
d054c5eb28 NFS: Fix attribute delegation behaviour on exclusive create
When the client does an exclusive create and the server decides to store
the verifier in the timestamps, a SETATTR is subsequently sent to fix up
those timestamps. When that is the case, suppress the exceptions for
attribute delegations in nfs4_bitmap_copy_adjust().

Fixes: 32215c1f89 ("NFSv4: Don't request atime/mtime/size if they are delegated to us")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-11-04 10:24:19 -05:00
Roberto Sassu
dc270d7159 nfs: Fix KMSAN warning in decode_getfattr_attrs()
Fix the following KMSAN warning:

CPU: 1 UID: 0 PID: 7651 Comm: cp Tainted: G    B
Tainted: [B]=BAD_PAGE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)
=====================================================
=====================================================
BUG: KMSAN: uninit-value in decode_getfattr_attrs+0x2d6d/0x2f90
 decode_getfattr_attrs+0x2d6d/0x2f90
 decode_getfattr_generic+0x806/0xb00
 nfs4_xdr_dec_getattr+0x1de/0x240
 rpcauth_unwrap_resp_decode+0xab/0x100
 rpcauth_unwrap_resp+0x95/0xc0
 call_decode+0x4ff/0xb50
 __rpc_execute+0x57b/0x19d0
 rpc_execute+0x368/0x5e0
 rpc_run_task+0xcfe/0xee0
 nfs4_proc_getattr+0x5b5/0x990
 __nfs_revalidate_inode+0x477/0xd00
 nfs_access_get_cached+0x1021/0x1cc0
 nfs_do_access+0x9f/0xae0
 nfs_permission+0x1e4/0x8c0
 inode_permission+0x356/0x6c0
 link_path_walk+0x958/0x1330
 path_lookupat+0xce/0x6b0
 filename_lookup+0x23e/0x770
 vfs_statx+0xe7/0x970
 vfs_fstatat+0x1f2/0x2c0
 __se_sys_newfstatat+0x67/0x880
 __x64_sys_newfstatat+0xbd/0x120
 x64_sys_call+0x1826/0x3cf0
 do_syscall_64+0xd0/0x1b0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The KMSAN warning is triggered in decode_getfattr_attrs(), when calling
decode_attr_mdsthreshold(). It appears that fattr->mdsthreshold is not
initialized.

Fix the issue by initializing fattr->mdsthreshold to NULL in
nfs_fattr_init().

Cc: stable@vger.kernel.org # v3.5.x
Fixes: 88034c3d88 ("NFSv4.1 mdsthreshold attribute xdr")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-11-04 10:24:18 -05:00