linux

Xiao Wang a43fe27d65 riscv: Optimize crc32 with Zbc extension		2024-07-10 13:19:50 -07:00

As suggested by the B-ext spec, the Zbc (carry-less multiplication) instructions can be used to accelerate CRC calculations. Currently, the crc32 is the most widely used crc function inside kernel, so this patch focuses on the optimization of just the crc32 APIs.

Compared with the current table-lookup based optimization, Zbc based optimization can also achieve large stride during CRC calculation loop, meantime, it avoids the memory access latency of the table-lookup based implementation and it reduces memory footprint.

If Zbc feature is not supported in a runtime environment, then the table-lookup based implementation would serve as fallback via alternative mechanism.

By inspecting the vmlinux built by gcc v12.2.0 with default optimization level (-O2), we can see below instruction count change for each 8-byte stride in the CRC32 loop:

rv64: crc32_be (54->31), crc32_le (54->13), __crc32c_le (54->13)
rv32: crc32_be (50->32), crc32_le (50->16), __crc32c_le (50->16)

The compile target CPU is little endian, extra effort is needed for byte swapping for the crc32_be API, thus, the instruction count change is not as significant as that in the *_le cases.

This patch is tested on QEMU VM with the kernel CRC32 selftest for both rv64 and rv32. Running the CRC32 selftest on a real hardware (SpacemiT K1) with Zbc extension shows 65% and 125% performance improvement respectively on crc32_test() and crc32c_test().

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240621054707.1847548-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
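
For readers unfamiliar with the terms in the commit message, the following is a minimal, portable C sketch of the two underlying ideas: carry-less multiplication (the operation the Zbc `clmul` instruction performs in hardware) and a bit-at-a-time little-endian CRC32 that can serve as a reference. It is not the code from this patch. The function names `clmul64_sw` and `crc32_le_bitwise` are hypothetical, and the sketch makes no attempt to reproduce the large-stride folding loop the commit describes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Software carry-less multiply (hypothetical helper): XOR together shifted
 * copies of a, one per set bit of b. This yields the low 64 bits of the
 * product of two polynomials over GF(2). Zbc provides this as a single
 * instruction; this loop only illustrates the arithmetic.
 */
static uint64_t clmul64_sw(uint64_t a, uint64_t b)
{
	uint64_t r = 0;
	int i;

	for (i = 0; i < 64; i++)
		if ((b >> i) & 1)
			r ^= a << i;
	return r;
}

/*
 * Bit-at-a-time reflected CRC32 (polynomial 0xEDB88320), processing one
 * byte per outer iteration. The seed is passed in and no final inversion
 * is applied, so the caller chooses the initial and final XOR values.
 */
static uint32_t crc32_le_bitwise(uint32_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	int i;

	while (len--) {
		crc ^= *p++;
		for (i = 0; i < 8; i++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return crc;
}
```

With a seed of `0xFFFFFFFF` and a final XOR with `0xFFFFFFFF`, `crc32_le_bitwise` over the ASCII string `123456789` gives the well-known CRC-32 check value `0xCBF43926`. A CRC is the remainder of a polynomial division over GF(2), which is why a hardware carry-less multiply can replace per-byte table lookups.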
acpi	The usual shower of singleton fixes and minor series all over MM,	2024-05-19 09:21:03 -07:00
asm-generic	asm-generic cleanups for 6.10	2024-05-20 15:18:34 -07:00
clocksource
crypto	This push fixes a bug in the new ecc P521 code as well as a buggy	2024-05-20 08:47:54 -07:00
drm	drm fixes for 6.10-rc1	2024-05-24 17:28:02 -07:00
dt-bindings	- Core Frameworks	2024-05-22 10:49:54 -07:00
keys
kunit
kvm
linux	riscv: Optimize crc32 with Zbc extension	2024-07-10 13:19:50 -07:00
math-emu
media
memory
misc
net	more s390 updates for 6.10 merge window	2024-05-21 12:09:36 -07:00
pcmcia
ras	tracing/treewide: Remove second parameter of __assign_str()	2024-05-22 20:14:47 -04:00
rdma	The usual shower of singleton fixes and minor series all over MM,	2024-05-19 09:21:03 -07:00
rv
scsi
soc	I'm actually surprised this time. There aren't any new Qualcomm SoC clk	2024-05-18 12:48:37 -07:00
sound	ASoC: Fixes for v6.10	2024-05-23 13:29:27 +02:00
target
trace	block-6.10-20240523	2024-05-23 13:44:47 -07:00
uapi	drm fixes for 6.10-rc1	2024-05-24 17:28:02 -07:00
ufs
vdso
video
xen